Commit Graph

138 Commits (8d2e6355f81a9c79b8133c549bf30011d9c940bd)

Author SHA1 Message Date
Michael Peter Christen f150bc218b fixed bug in solr error document
13 years ago
Roland 'Quix0r' Haeder a093ccf5eb Now used synchronization in all close() methods to make sure all objects
13 years ago
Michael Peter Christen 659178942f - Redesigned crawler and parser to accept embedded links from the NOLOAD
13 years ago
Michael Peter Christen a5d7da68a0 refactoring: removed dependency from switchboard in Balancer/CrawlQueues
13 years ago
Michael Peter Christen 9ad1d8dde2 complete redesign of crawl queue monitoring: do not look at a
13 years ago
Michael Peter Christen 2ee8cbeb2c Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 992dbdf4bb added noload statistic to servlets
13 years ago
Michael Christen c21966bb43 fix
13 years ago
Michael Christen 13b05f9c08 fix
13 years ago
Michael Christen e5d878c59e Merge branch 'master' of ssh://gitorious.org/yacy/rc1
13 years ago
Michael Christen ec26b2bea4 Merge commit 'fa08ed5ae5d72bddc3cc6a662b23103579e86109' into quix0r
13 years ago
Michael Christen 216a287a85 Merge commit '6d4e08ed06c5cd28c45981b2ebe31c7f7ec6fd83' into quix0r
13 years ago
stbrumm d18095dc48 Patch fuer Issue 0000102
13 years ago
Roland 'Quix0r' Haeder 901f37d608 Also this ... :( #2
13 years ago
Roland 'Quix0r' Haeder a985717ed2 Also this ... :(
13 years ago
Roland 'Quix0r' Haeder 5f490de554 Fix for ported fix from my old days ...
13 years ago
Roland 'Quix0r' Haeder fa08ed5ae5 Fixed a lot CHMOD rights (no need for execute flag on *.java/*.html) and introduced local/remote crawl size ratio based check
13 years ago
orbiter 5a55397f99 some last-minute performance hacks
13 years ago
orbiter a7df70221e refactoring
13 years ago
orbiter b250e6466d implemented crawl restrictions for IP pattern and country lists
13 years ago
orbiter d2ea250d99 refactoring:
13 years ago
orbiter 85a5487d6d YaCy can now use the solr index to compute text snippets. This makes search result preparation MUCH faster because no document fetching and parsing is necessary any more.
13 years ago
sixcooler 5f8a5ca32d - not doing merge-jobs while short on Memory
13 years ago
sixcooler 59b767eebd stop loading via http at defined maximum of bytes - even size is unknown before loading
13 years ago
orbiter 115abc8917 - more attributes for search progress bar
14 years ago
orbiter 4bea3f9714 hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources:
14 years ago
orbiter 10e2f588f8 - enhanced ybr ranking computation
14 years ago
orbiter 6fa439c82b - refactoring of robots
14 years ago
orbiter 4c013d9088 more UTF8 getBytes() performance hacks
14 years ago
orbiter b2fe4b7b1a added a handling of appearances of yacy bot entries in robots.txt if this entry addresses the yacy peer
14 years ago
orbiter 1214615185 fix for 'invisible entry', see http://forum.yacy-websuche.de/viewtopic.php?p=22133#p22133
14 years ago
orbiter cb1f49d0f2 replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'.
14 years ago
orbiter 4588b5a291 - fixed document number limitation for crawls that restrict the number of documents per domain
14 years ago
orbiter 0769f4caa6 added search suggestions for interactive search: is only shown if there are no search results
14 years ago
f1ori 9d2159582f * fix system update if urls are in blacklist (for example for very general blacklists like *.de)
14 years ago
orbiter a563b05b60 enhanced crawler:
14 years ago
orbiter 09badc697b - low-memory patch for crawler
14 years ago
f1ori 7d8de34778 * add a bit documentation to DigestURI, use DigestURI(string) instead of DigestURI(string, null)
14 years ago
orbiter 461a2a6ec7 enhanced remote crawling:
14 years ago
orbiter 65eaf30f77 redesign of crawl profiles data structure. target will be:
14 years ago
sixcooler 661867923a ... migrating to HttpComponents-Client-4.x ...
14 years ago
orbiter a82a93f2fc - better url double check in crawler
14 years ago
sixcooler 15e8c13526 ... migrating to HttpComponents-Client-4.x ...
15 years ago
orbiter 5d00888c95 - added animated visualization for DHT-in and DHT-out in network graphic
15 years ago
orbiter 7bcfa033c9 more abstraction of the htcache when using the LoaderDispatcher:
15 years ago
orbiter 87087f12fe - scanned remote search process and enhanced some data structure and synchronizations here and there
15 years ago
orbiter 11639aef35 - added new protocol loader for 'file'-type URLs
15 years ago
orbiter 2126c03a62 - removed download-limit that can be given for the crawler for non-crawler download tasks. This was necessary because the same procedure was used for other downloads like for the download of dictionary files where a limit is not useful. The limit still stays for the indexer
15 years ago
orbiter c45117f81f fixed dates in metadata
15 years ago
orbiter 90c3e5d6f6 - cleanup, removed unused imports
15 years ago