161 Commits (a34f8375925222c1af4e296bf9d457b21a019910)

Author SHA1 Message Date
Michael Peter Christen 70f03f7c8e do not cache search requests to Solr if the result is used for
10 years ago
Michael Peter Christen c67c5c0709 added new solr schema fields which record the occurrences of vocabulary
10 years ago
Michael Peter Christen 0550b54d56 added fix to postprocessing: avoid caching of postprocessing collection
10 years ago
Michael Peter Christen 0a879c98e7 added new 'firstSeen' database table and necessary data structures which
10 years ago
Michael Peter Christen 95d87f00b3 fix for bad query generation in doublecheck in postprocessing
10 years ago
Michael Peter Christen 92007e5d2d more enhancements to postprocessing speed
10 years ago
Michael Peter Christen 9a7fe9e0d1 fix for bad timing computation in postprocessing
10 years ago
Michael Peter Christen bd16119a00 another fix for postprocessing (the query for "" on numeric field did
10 years ago
Michael Peter Christen 327e83bfe7 more fixes in postprocessing: partitioning of the complete queue to
10 years ago
orbiter 71758f0d62 enhanced postprocessing by usage of a field-list generation to prevent
10 years ago
Michael Peter Christen fe537679de fix for exact_signature_unique_b, exact_signature_copycount_i,
10 years ago
Michael Peter Christen 2e5214eb21 added field postprocessing.partialUpdate to settings which can be used
10 years ago
Michael Peter Christen 2e09da9832 npe fix
10 years ago
Michael Peter Christen d80418f1b1 added partial updates to solr during postprocessing: during
10 years ago
Michael Peter Christen b1cfbc4a04 added new solr field url_paths_count_i which can be used to enhance the
10 years ago
Michael Peter Christen 30d4402cd1 fixed location search
10 years ago
orbiter f3a12801f0 Merge branch 'master' of git@gitorious.org:yacy/rc1.git
10 years ago
orbiter d93325a578 lazy handling of process_sxt field (part of postprocessing)
10 years ago
reger b5ca20de15 preserve content_type (mime) if supplied in preference of construct in from file type.
10 years ago
reger fb1fcc2b03 handle noarchive tag, skip writing page to cache
10 years ago
Michael Peter Christen 2645dc816a added warning for not well-formed postprocessing queries
11 years ago
Michael Peter Christen 6d3d4c4ea6 changed the concurrent enumeration of query results in such a way that
11 years ago
Michael Peter Christen e87dc08c0d set the correct fail time in error docs
11 years ago
Michael Peter Christen a7dd89c4de changed method to write the citation index: do not catch up references
11 years ago
orbiter d68438c3d9 make sure that the postprocessing background thread never dies by any
11 years ago
orbiter 927aaa95a6 concurrency bugfix
11 years ago
reger f9db5dd6c5 reduce doublecontent check document (prevent out of memory)
11 years ago
reger a8508417d1 catch NPE during crawl (OAI import)
11 years ago
Michael Peter Christen 6344718f8b reducing the concurrent query stack size and reduced concurrency of
11 years ago
Michael Peter Christen 191ec8c82a added concurrency to postprocess rewrite process
11 years ago
Michael Peter Christen a1e8bdd5e9 log ppm instead of docs/second
11 years ago
Michael Peter Christen 338f574bdc no sorting if http/www unique fields are not demanded (makes query
11 years ago
Michael Peter Christen 0ceeceb35e more logic on Solr queries; usage of the query terms in postprocessing,
11 years ago
orbiter 4099296b45 added new classes which shall reduce call overhead to Solr (stub)
11 years ago
orbiter 3491ab4c38 removed unused images from webgraph edge computation
11 years ago
orbiter 1027f3d04a fix for the usage of ready-prepared solr queries, some queries are
11 years ago
Michael Peter Christen 504327b15c fix for condition for writing the webgraph
11 years ago
Michael Peter Christen 4eec1a7452 refactoring (change Metadata name of load time data structure to avoid
11 years ago
reger f96cfdc84d prevent array out of bound exception on getRankingProfile(x)
11 years ago
Michael Peter Christen 2de159719b added an option to set 'obey nofollow' for links with rel="nofollow"
11 years ago
Michael Peter Christen bf1b6b93e7 do not write CR values to webgraph if no CR values are computed
11 years ago
Michael Peter Christen 8514bffc22 enhanced postprocessing status report
11 years ago
Michael Peter Christen b5d78ba156 reduced number of solr queries during crawling
11 years ago
Michael Peter Christen fb3dd56b02 fix for processing of noindex flag in http header
11 years ago
Michael Peter Christen b0d941626f fixed bugs in canonical, robots and title/description unique calculation
11 years ago
Michael Peter Christen 1092e798a5 fixed double content postprocessing
11 years ago
Michael Peter Christen 36e623d8bf enhanced metadata enrichment for media file type search:
11 years ago
Michael Peter Christen 922979aae1 added option to prefer http over https in unique-protocol ranking
11 years ago
Michael Peter Christen b3b174e2b8 fixed webgraph postprocessing and status display in Crawler_p servlet
11 years ago
Michael Peter Christen 8ad41a882c fixed several problems with postprocessing:
11 years ago