357 Commits (1027f3d04a267c72aebf6d0fd1504bde3055e3f9)

Author SHA1 Message Date
Michael Peter Christen f94c91315b if the webgraph is used, then use it also for reference computation to
10 years ago
Michael Peter Christen 4eec1a7452 refactoring (change Metadata name of load time data structure to avoid
10 years ago
Michael Peter Christen 2de159719b added an option to set 'obey nofollow' for links with rel="nofollow"
10 years ago
Michael Peter Christen d07cdd8c3b added SolrCloud access mode and configuration
10 years ago
Michael Peter Christen b5fc2b63ea removed exist() retrieval functions from error cache and replaced it
11 years ago
Michael Peter Christen b5d78ba156 reduced number of solr queries during crawling
11 years ago
Michael Peter Christen fd87fa1613 removed more unnecessary exist-checks in ErrorCache
11 years ago
Michael Peter Christen f2b476e08b don't do a double check to solr for failed documents if they are not
11 years ago
Michael Peter Christen 09dcdb9b19 update to solr 4.9.0
11 years ago
Michael Peter Christen 5b94a257ce no timeout for large reference collections
11 years ago
Michael Peter Christen 8ad41a882c fixed several problems with postprocessing:
11 years ago
Michael Peter Christen f0db501630 better handling of ranking parameters and new default values for date
11 years ago
Michael Peter Christen 53948da7d0 tried to make last_modified recognition smarter
11 years ago
Michael Peter Christen 6634b5b737 debug code for index distribution testing
11 years ago
orbiter 97983ba89f fixed generics warnings for generic array instantiation that appeared
11 years ago
Michael Peter Christen 10cf8215bd added crawl depth for failed documents
11 years ago
Michael Peter Christen 9a5ab4e2c1 removed clickdepth_i field and related postprocessing. This information
11 years ago
Michael Peter Christen da86f150ab - added a new Crawler Balancer: HostBalancer and HostQueues:
11 years ago
orbiter 95780eed32 Merge branch 'master' of git@gitorious.org:yacy/rc1.git
11 years ago
Michael Peter Christen 6bd8c6f195 fix for wrong status codes of error pages
11 years ago
orbiter c250fac9f4 linkstructure refactoring to get more options for clickdepth analysis
11 years ago
Michael Peter Christen bd886054cb new structure and enhancements for link graph computation:
11 years ago
Michael Peter Christen ebd44a7080 replaced solr 4.6.1 with solr 4.7.1 and added index migration to
11 years ago
Michael Peter Christen 926d28dd3f fixed a bug which prevented crawl starts after a network switch
11 years ago
reger 227c42bc96 eleminate obsolete URIMetaDataRow class
11 years ago
Michael Peter Christen 63c9fcf3e0 free configuration of postprocessing clickdepth maximum depth and time
11 years ago
Michael Peter Christen 51800007c4 - added concurrency to postprocessing of webgraph document
11 years ago
Michael Peter Christen fdaeac374a - enhanced postprocessing speed and memory footprint (by using HashMaps
11 years ago
Michael Peter Christen 7c1b968378 another fix for the shutdown exceptions
11 years ago
Michael Peter Christen 7640834b37 removed double concurrency to put Solr documents into the index. The
11 years ago
Michael Peter Christen 0f6b72f24b do not use luke requests for remote solr servers if the result is
11 years ago
orbiter ced1a96f9c fixed error cache
11 years ago
orbiter cfb647db6e - introduced a miss cache in ConcurrentUpdateSolrConnector
11 years ago
orbiter a87d8e4a8e changed caching of ConcurrentUpdateSolrConnector: it caches now also the
11 years ago
orbiter f6e441dd77 refactoring
11 years ago
orbiter 76c53faeb2 removed unused code (HostStat)
11 years ago
Michael Peter Christen 254a7ac66c fixed cleaning of index
11 years ago
Michael Peter Christen 69391e5d9e changed strategy to test existence of documents in Solr: using the
11 years ago
Michael Peter Christen 9eb668e951 enhanced the resource observer
11 years ago
Michael Peter Christen bf97e38b83 removed clearURLIndex, which is a stub remaining from the old metadata
11 years ago
Michael Peter Christen 195e5868d3 catch solr close exceptions
11 years ago
Michael Peter Christen 0cabcbbe83 more efficient wordcount
11 years ago
Michael Peter Christen 3d474a843e added memory protection for postprocessing
11 years ago
Michael Peter Christen 9228214f9b enrichment of PerformanceMemory display of SolrInfoMBean table
11 years ago
Michael Peter Christen e8bdf16ea7 added statistic information for solr resources in PerformanceMemory
11 years ago
Michael Peter Christen 456e52e0d5 enhanced strategy to clear solr caches
11 years ago
orbiter c40ba51ca6 added new suggest method which replaces more-than-one suggestions:
11 years ago
reger 9b24dae2b7 add language navigation filter clause to rwi results
11 years ago
Michael Peter Christen c84bcc878a first try to add a generic solr servlet as luke request servlet
11 years ago
Michael Peter Christen 1ea17bd9f3 - removed old metadata database and all migration code
11 years ago