120 Commits (42425c800362e36bb59e501adbc214dd405a8b67)

Author SHA1 Message Date
orbiter a7df70221e refactoring
13 years ago
orbiter b250e6466d implemented crawl restrictions for IP pattern and country lists
13 years ago
orbiter d2ea250d99 refactoring:
13 years ago
orbiter 85a5487d6d YaCy can now use the solr index to compute text snippets. This makes search result preparation MUCH faster because no document fetching and parsing is necessary any more.
13 years ago
sixcooler 5f8a5ca32d - not doing merge-jobs while short on Memory
13 years ago
sixcooler 59b767eebd stop loading via http at defined maximum of bytes - even size is unknown before loading
13 years ago
orbiter 115abc8917 - more attributes for search progress bar
14 years ago
orbiter 4bea3f9714 hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources:
14 years ago
orbiter 10e2f588f8 - enhanced ybr ranking computation
14 years ago
orbiter 6fa439c82b - refactoring of robots
14 years ago
orbiter 4c013d9088 more UTF8 getBytes() performance hacks
14 years ago
orbiter b2fe4b7b1a added a handling of appearances of yacy bot entries in robots.txt if this entry addresses the yacy peer
14 years ago
orbiter 1214615185 fix for 'invisible entry', see http://forum.yacy-websuche.de/viewtopic.php?p=22133#p22133
14 years ago
orbiter cb1f49d0f2 replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'.
14 years ago
orbiter 4588b5a291 - fixed document number limitation for crawls that restrict the number of documents per domain
14 years ago
orbiter 0769f4caa6 added search suggestions for interactive search: is only shown if there are no search results
14 years ago
f1ori 9d2159582f * fix system update if urls are in blacklist (for example for very general blacklists like *.de)
14 years ago
orbiter a563b05b60 enhanced crawler:
14 years ago
orbiter 09badc697b - low-memory patch for crawler
14 years ago
f1ori 7d8de34778 * add a bit of documentation to DigestURI, use DigestURI(string) instead of DigestURI(string, null)
14 years ago
orbiter 461a2a6ec7 enhanced remote crawling:
14 years ago
orbiter 65eaf30f77 redesign of crawl profiles data structure. target will be:
14 years ago
sixcooler 661867923a ... migrating to HttpComponents-Client-4.x ...
14 years ago
orbiter a82a93f2fc - better url double check in crawler
14 years ago
sixcooler 15e8c13526 ... migrating to HttpComponents-Client-4.x ...
15 years ago
orbiter 5d00888c95 - added animated visualization for DHT-in and DHT-out in network graphic
15 years ago
orbiter 7bcfa033c9 more abstraction of the htcache when using the LoaderDispatcher:
15 years ago
orbiter 87087f12fe - scanned remote search process and enhanced some data structure and synchronizations here and there
15 years ago
orbiter 11639aef35 - added new protocol loader for 'file'-type URLs
15 years ago
orbiter 2126c03a62 - removed download-limit that can be given for the crawler for non-crawler download tasks. This was necessary because the same procedure was used for other downloads like for the download of dictionary files where a limit is not useful. The limit still stays for the indexer
15 years ago
orbiter c45117f81f fixed dates in metadata
15 years ago
orbiter 90c3e5d6f6 - cleanup, removed unused imports
15 years ago
orbiter 55d8e686ea performance hacks
15 years ago
orbiter 25aef069a6 continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775
15 years ago
low012 b97ad0f380 *) some minor changes for better code readability
15 years ago
orbiter 1e8e79b9ef redesign of reference hash (URL-hash) parameter hand-over:
15 years ago
orbiter b88f5fbb4b slightly changed crawling policy
15 years ago
orbiter 7684a575c4 fix for deletion of error database each time when YaCy starts up
15 years ago
orbiter e80e060ca6 - increased thread priority for server threads
15 years ago
orbiter 66c0a8e849 more PMD recommendations
15 years ago
orbiter dd459281c8 applied code changes that are recommended by PMD
15 years ago
orbiter 4a5100789f replaced _all_ size() == 0 with isEmpty() and all size() > 0 with !isEmpty(). The isEmpty() method is much faster in some cases, especially when used to access badly balanced hashtables where a size() operation becomes a large iteration.
15 years ago
orbiter 4c99d4683d possible fix for lost crawl profile handles: clean-up job did wrong measurement to see if crawl is still running.
15 years ago
orbiter 4431b9767e added about 450 replacements for printStackTrace() methods to pipe such traces into the log at DATA/LOG/
15 years ago
orbiter a0e891c63d - some redesign in UI menu structure to make room for new 'Content Integration' main menu containing import servlets for Wikimedia Dumps, phpbb3 forum imports and OAI-PMH imports
15 years ago
orbiter b79f4f062f refactoring of yacy documents and parsers: they depend now only on the kelondro classes
15 years ago
orbiter ce8dc575ca refactoring
15 years ago
orbiter bea3b99aff moved table and util classes
15 years ago
orbiter f677d534b1 start of a really extensive refactoring which will produce a hierarchical package structure with the domain yacy.net as package root
15 years ago
orbiter 6e0dc39a7d - some fixes to prevent blocking situations
15 years ago