344 Commits (0d29b972ccee9f5bffbb72dd3b0954ac57958443)

Author SHA1 Message Date
Michael Peter Christen 1687737771 Abstraction of HandleMap and HandleSet
13 years ago
Michael Peter Christen 6f1ddb2519 Moved solr index-add method to the same method where the YaCy index is
13 years ago
Michael Peter Christen 315d83cfa0 cleanup
13 years ago
Michael Peter Christen 76202f068e extended abstraction of local and remote solr index using one front-end
13 years ago
Michael Peter Christen 826967513b changed options in IndexFederated_p to switch on/off parts of the index
13 years ago
orbiter 69e743d9e3 - more abstraction for the RWI index as preparation for solr integration
13 years ago
orbiter 05a3ffd03a patches to ensure that solr connectors are active only if they have a
13 years ago
orbiter 5a3c829872 embedded solr is only initiated if it is activated with
13 years ago
Michael Peter Christen 58e7d1952f reduction of logging to prevent too much IO caused by logging
13 years ago
orbiter 0cbda0b2b8 - replaced all length() == 0 and size() == 0 with isEmpty()
13 years ago
orbiter c7afa8bc48 using SwitchboardConstants for solr attributes
13 years ago
Michael Peter Christen d09d9f2364 filter old peers from bootstrap (now stronger: 60 minutes instead of
13 years ago
Michael Peter Christen b0c408788b made class methods static where possible
13 years ago
Michael Peter Christen 0301aba1e9 removed unused method parameters
13 years ago
Michael Peter Christen ea10766bfd cleaned unnecessary nested code
13 years ago
Michael Peter Christen 7249d9c9de bugfix for concurrent seed loader
13 years ago
Michael Peter Christen c72d3b12cd concurrently initialize the seed list during p2p network bootstrap
13 years ago
Michael Peter Christen 1825f165b8 better integration of blacklist according to use case
13 years ago
Michael Peter Christen c18fa9fa75 Merge branch 'master' of git://gitorious.org/~reger/yacy/bbyacy-rc1
13 years ago
Michael Peter Christen ce8d4b87d9 fixes for new eclipse 'Juno' warning 'Resource leak'.
13 years ago
Michael Peter Christen 0c345d1559 giving threads names so it's easier to see what's happening during
13 years ago
reger 067728bccc add search result heuristic. adding a crawl job with depth-1 for every displayed search result (crawling every external linked page of displayed search result pages)
13 years ago
Michael Peter Christen 03280fb161 removed segments-concept and the Segments class:
13 years ago
Michael Peter Christen 9116013c64 - allow lazy initialization of solr value (if using 'lazy', then no
13 years ago
Michael Peter Christen 96aeb127e3 generalized localhost naming.
13 years ago
Michael Peter Christen 77f795756c fixing redirects and status codes: storing of status code in
13 years ago
Michael Peter Christen 8dd469b9dd added option to configure the autocommit delay time of solr on-the-fly
13 years ago
Michael Peter Christen b9dfca4b0a - fixed IndexFederated Servlet / an embedded Solr can now be selected
13 years ago
Michael Peter Christen b9d42fd9c8 using com.google.common.io.Files instead of homebrew methods
13 years ago
Michael Peter Christen a5eb91fa60 refactoring
13 years ago
Michael Peter Christen 0752983fbd - automatic periodic saving of triplestore
13 years ago
cominch a95127c9af Triplestore: initialize per-user triplestores
13 years ago
Michael Peter Christen 4ee6fb1de9 added missing blacklist dht cache storage (maybe due to mistakes in
13 years ago
Roland 'Quix0r' Haeder edaa09b9b1 Rewrote all String blacklist types to enum 'BlacklistType', closes bug
13 years ago
Roland 'Quix0r' Haeder af5a597e47 Scroogle is not coming back, remove dead code
13 years ago
Michael Peter Christen cde20911bb saved a bit more ram using UTF8 String compression for OpenGeoDB and
13 years ago
Michael Peter Christen 2280a7b276 - changed initialization order to prefer allocation of memory for table
13 years ago
Michael Peter Christen 0746308bc2 only the metadata tables shall be able to use the tail cache
13 years ago
Michael Peter Christen 41c02cb10e - less restrictions for usage of Table RAM copy
13 years ago
Michael Peter Christen b0095c8d3c flush the compressor cache when a cleanup is done
13 years ago
Michael Peter Christen d0ec8018f5 fixes for bad long computation
13 years ago
Michael Peter Christen a1fe65b115 performance hacks
13 years ago
Michael Peter Christen e0d8643226 - performance hacks
13 years ago
Michael Peter Christen c846e9ca14 redesign of the crawler monitor page: show crawled pages instead of
13 years ago
Michael Peter Christen 7e0ddbd275 added a "fromCache" flag in Response object to omit one cache.has()
13 years ago
Michael Peter Christen fb94b47b1a changed queue sizes to have less memory occupied during indexing
13 years ago
Michael Peter Christen f294f2e295 bugfix to http://bugs.yacy.net/view.php?id=181
13 years ago
Michael Peter Christen acf8d521a2 fix for http://bugs.yacy.net/view.php?id=126
13 years ago
Michael Peter Christen bb88878b4d the last commit was incomplete..
13 years ago
Michael Peter Christen d320a31ae1 bugfix for http://bugs.yacy.net/view.php?id=186
13 years ago
reger b2175ea4ef Add possibility to set custom Solr field names for the YaCy default Solr attributes.
13 years ago
Michael Peter Christen cb54c1737b solrj connector bugfix
13 years ago
Roland 'Quix0r' Haeder a093ccf5eb Now used synchronization in all close() methods to make sure all objects
13 years ago
Michael Peter Christen 0d58fea210 made multiple connector default
13 years ago
Michael Peter Christen 8864141872 more abstraction in solr connection classes
13 years ago
Michael Peter Christen c00efc2717 made the solr connection more generic
13 years ago
Michael Peter Christen ea2bd43b28 patch for broken configurations
13 years ago
Michael Peter Christen ba6aaabc51 refactoring + parser bugfixes
13 years ago
Michael Peter Christen 19efbf1b0f - apply directDocByURL to NOLOAD Queue
13 years ago
Michael Peter Christen 659178942f - Redesigned crawler and parser to accept embedded links from the NOLOAD
13 years ago
Michael Peter Christen f8cd57c92f new indexing strategy: ALL links that appear anywhere are indexed, not
13 years ago
Michael Peter Christen 14f67f217c refactoring of ContentDomain: now subclass of Classification
13 years ago
Michael Peter Christen 33d1062c79 refactoring: the cache belongs to the crawler
13 years ago
Michael Christen 8fc86fe397 added storage of full anchor link structure:
13 years ago
Lotus 0b3f39136e allow custom ppm lower than minimum button on /Crawler_p.html
13 years ago
Michael Peter Christen 9ad1d8dde2 complete redesign of crawl queue monitoring: do not look at a
13 years ago
Michael Peter Christen 2e5cd6a1b2 fixed parser extension deny list generation and usage
13 years ago
Michael Peter Christen 3cd6dcd352 do not add new solr fields as activated fields
13 years ago
Lotus c73af39e54 refactoring of tray icon class,
13 years ago
Michael Peter Christen 254adea51c small fixes
13 years ago
Marek Otahal 72adbeae90 !Important: move from Hashtable to HashMap
13 years ago
Michael Peter Christen 2ee8cbeb2c Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 992dbdf4bb added noload statistic to servlets
13 years ago
stbrumm d18095dc48 Patch for Issue 0000102
13 years ago
Michael Christen 0797b0de99 new handling of remote search processes: looking for seeds will now not
13 years ago
Michael Christen 9e5894c784 Removed handling of components objects for URIMetadataRows.
13 years ago
Michael Christen c715d19c09 fixes for dependency on svn
13 years ago
Michael Christen 044f83feed added some pauses into the search process which shall produce
13 years ago
orbiter f9216e388c - faster ping to clean up old peers faster
13 years ago
orbiter e22f8497c9 - tested the ARC methods
13 years ago
orbiter bc5df0eef5 updated ranking tables (fresh computation)
13 years ago
orbiter 5a55397f99 some last-minute performance hacks
13 years ago
orbiter 06352b8d6b more logging
13 years ago
orbiter 017a01714d - enhanced logging in robots.txt parser for remote debugging
13 years ago
orbiter 3a15e58e28 - increased stability when opening the robots table
13 years ago
orbiter 78ce3b13be typo
13 years ago
orbiter 85d6bf4ac4 fixed urls to media content during indexing
13 years ago
orbiter 3a807e10cf - added a cache for active crawl profiles to the crawl switchboard
13 years ago
orbiter e58438c01c - added a new retry connector for solr (for cases where solr responses are slow)
13 years ago
orbiter 5af9598bd1 enhanced exported row parsing during row import
14 years ago
orbiter a7df70221e refactoring
14 years ago
orbiter cf4fd525ee added directDocByURL attribute in crawl profile
14 years ago
orbiter b250e6466d implemented crawl restrictions for IP pattern and country lists
14 years ago
orbiter d2ea250d99 refactoring:
14 years ago