Commit Graph

514 Commits (83701a1b4cc7739f2a2fbf52797ce6eeac6857ec)

Author SHA1 Message Date
Michael Peter Christen 0301aba1e9 removed unused method parameters
13 years ago
Michael Peter Christen ea10766bfd cleaned unnecessary nested code
13 years ago
Michael Peter Christen 1825f165b8 better integration of blacklist according to use case
13 years ago
Michael Peter Christen 0c345d1559 giving threads name so it's easier to see what's happening during
13 years ago
Michael Peter Christen 03280fb161 removed segments-concept and the Segments class:
13 years ago
Michael Peter Christen 3fd4a01286 added option to record urls that are forwarded to the solr index
13 years ago
Michael Peter Christen 96aeb127e3 generalized localhost naming.
13 years ago
Michael Peter Christen 77f795756c fixing redirects and status codes: storing of status code in
13 years ago
Michael Peter Christen b9dfca4b0a - fixed IndexFederated Servlet / an embedded Solr can now be selected
13 years ago
Michael Peter Christen a5eb91fa60 refactoring
13 years ago
Michael Peter Christen de3ef8ad73 removed unimportant warnings
13 years ago
Michael Peter Christen 96f6a5869f more robust OAI-PMH client (large time-out, three re-tries). OAI-PMH
13 years ago
Roland 'Quix0r' Haeder edaa09b9b1 Rewrote all String blacklist types to enum 'BlacklistType', closes bug
13 years ago
Michael Peter Christen b0095c8d3c flush the compressor cache when a cleanup is done
13 years ago
Michael Peter Christen 96e9d77270 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 3dd8376825 added automatic cleaning of cache if metadata and file database size is
13 years ago
Michael Peter Christen 461a0ce052 removed warnings
13 years ago
Michael Peter Christen 9b4c699526 enhanced location search:
13 years ago
Michael Peter Christen 16b21f7a5b Added more steering in Crawler_p.html interface
13 years ago
Michael Peter Christen acc19e190d hack against 100% cpu during crawl delete
13 years ago
Michael Peter Christen c15fcde1c8 add-on to latest commit
13 years ago
Michael Peter Christen cf47d94888 performance hack to parse numbers inside of substrings without actually
13 years ago
Michael Peter Christen 7e0ddbd275 added a "fromCache" flag in Response object to omit one cache.has()
13 years ago
Michael Peter Christen f294f2e295 bugfix to http://bugs.yacy.net/view.php?id=181
13 years ago
Michael Peter Christen e7e381d110 added configuration to switch off redirection following in crawler
13 years ago
Michael Peter Christen 70505107ca enhanced crawler/balancer: better remaining waiting-time guessing
13 years ago
Michael Peter Christen f150bc218b fixed bug in solr error document
13 years ago
Roland 'Quix0r' Haeder a093ccf5eb Now used synchronization in all close() methods to make sure all objects
13 years ago
Michael Peter Christen ba6aaabc51 refactoring + parser bugfixes
13 years ago
Michael Peter Christen 659178942f - Redesigned crawler and parser to accept embedded links from the NOLOAD
13 years ago
Michael Peter Christen f5efdb21fd refactoring
13 years ago
Michael Peter Christen f8cd57c92f new indexing strategy: ALL links that appear anywhere are indexed, not
13 years ago
Michael Peter Christen a1a5b015d8 refactoring: moved document Classification to cora package
13 years ago
Michael Peter Christen a5d7da68a0 refactoring: removed dependency from switchboard in Balancer/CrawlQueues
13 years ago
Michael Peter Christen 33d1062c79 refactoring: the cache belongs to the crawler
13 years ago
Michael Christen 22f05c83ff fixed default must-match filter for full domain crawls - the old filter
13 years ago
Michael Peter Christen 0cc0290978 bugfix for a must-not-match pattern check. This bug did not make the
13 years ago
Michael Peter Christen 2fc8ecee36 ConcurrentLinkedQueue has a VERY long return time on the .size() method.
13 years ago
Michael Peter Christen c6c61be3f0 fix for http://bugs.yacy.net/view.php?id=148
13 years ago
Michael Peter Christen 0d148c3353 more logging in resource observer
13 years ago
Michael Peter Christen 2fa037ae1d enhanced crawler
13 years ago
Lotus ee89cf5ae5 fix must match filter for full domain crawl
13 years ago
Michael Peter Christen 9ad1d8dde2 complete redesign of crawl queue monitoring: do not look at a
13 years ago
Michael Peter Christen 1f4f60654a Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 2ee8cbeb2c Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 992dbdf4bb added noload statistic to servlets
13 years ago
Michael Christen c21966bb43 fix
13 years ago
Michael Christen 13b05f9c08 fix
13 years ago
Michael Christen e5d878c59e Merge branch 'master' of ssh://gitorious.org/yacy/rc1
13 years ago
Michael Christen ec26b2bea4 Merge commit 'fa08ed5ae5d72bddc3cc6a662b23103579e86109' into quix0r
13 years ago