yacy_search_server

Commit Graph

Author	SHA1	Message	Date
orbiter	138422990a	- removed useCell option: the indexCell data structure is now the default index structure; old collection data is still migrated - added some debugging output to balancer to find a bug - removed unused classes for index collection handling - changed some default values for the process handling: more memory needed to prevent OOM git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5856 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	ab0030d7a7	allow dht-out for remote-crawl processing peers on default settings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5834 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	37f892b988	added new concurrent merger class for IndexCell RWI data git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5735 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	67aaffc0a2	- added Latency control to the crawler: because of the strongly enhanced indexing speed when using the new IndexCell RWI data structures (> 2000PPM on my notebook), it is now necessary to control the crawling speed depending on the response time of the target server (which is also YaCy in case of some intranet indexing use cases). The latency factor in crawl delay times is derived from the time that a target host takes to answer HTTP requests. For internet domains, the crawl delay is a minimum of twice the response time; in intranet cases the delay time is now half of the response time. - added API to monitor the latency times of the crawler: a new API at /api/latency_p.xml returns the current response times of domains, the time when the domain was last accessed by the crawler and many more attributes. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5733 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	024da2916b	refactoring of logging git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5544 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d39d420b39	performance hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5376 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	2d65887723	- fix for bug in new profile handling - added a new feature in ymageChart (cannot be seen yet, just wait... will be used in profiling chart) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5261 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	ff68f394dd	fix for problem with balancer and lost crawl profiles: if crawl profile is lost, no robots.txt is loaded any more git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5258 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	1bbf362cef	update to the crawl balancer: better organization and better crawl delay prediction git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5176 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
danielr	621b473b18	* removed some warnings of findbugs (http://findbugs.sf.net ) - removed unnecessary code (unused variables, String.toString) - corrected some calculations (cast int to double or long ;) - improved little performance (using Integer.valueOf() instead of new Integer) - log if some File-actions fail (mkdir(), delete(), ...) and some ignored exceptions - finalized some (more) fields - finally close some streams - made inner classes static if not using environment - generalized some equals (from specificClass to Object) - fixed some potential nullpointer accesses git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5039 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	3bb870bfcd	added final where possible git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5030 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	c3d461d191	- removed superfluous copyright statement - updated my email address git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5011 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	3ca98fee42	removed superfluous copyright statement git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5010 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	474659a71f	- modified and enhanced the crawl balancer: better list export, fixing of damaged crawl queue at start-up, re-sorting at start-up to enhance domain order - added option to set minimum crawl delta for domains in balancer - added default values to crawl deltas in yacy.init - added configuration for these deltas in performance queues - enhanced performance setting computation (more time for indexing queue for a faster flush) - remote crawling is now enabled during local crawling if indexer has space and time for more links - added database stub for new distributed file system - refactoring of time computation to get an abstraction level that will be used by a TTL rule in new distributed file system git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4966 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	7feae906aa	- organize imports - removed potential null pointer accesses - removed unnecessary casts git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4893 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	2f381b8d7a	- fixed at least two causes for a NPE after a use case switch. A large refactoring was necessary - added another crawl start option: automatic restriction to sub-path - removed crawlStartSimple and renamed crawl start expert to crawl start (without expert) - some changes to texts in crawl start - added some more deletions when a web index is deleted: delete also queues and robots cache git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4881 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	dd75b3cabc	- patch for bad profiles - time-out when deleting profiles git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4793 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	1689030ee8	refactoring: moved all crawler classes into their own package git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4768 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago

18 Commits (06ed4ef7b3bf986ab973e858677cd01f6f042810)
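
The latency rule described in commit 67aaffc0a2 (twice the response time for internet domains, half the response time for intranet hosts) can be illustrated with a minimal sketch. This is not the YaCy implementation: the class name CrawlDelaySketch, the method delayMillis and its parameters are hypothetical, and applying a configured minimum crawl delta as a lower bound in both cases is an assumption loosely based on commit 474659a71f.

```java
/**
 * Hypothetical sketch of a latency-based crawl delay.
 * Names and signature are illustrative and not taken from the YaCy source.
 */
public final class CrawlDelaySketch {

    private CrawlDelaySketch() {}

    /**
     * Computes how long to wait before the next request to the same host.
     *
     * @param responseTimeMillis last measured response time of the target host
     * @param intranet           true if the target is an intranet host
     * @param minimumDeltaMillis configured lower bound for the delay (assumption)
     * @return the delay in milliseconds
     */
    public static long delayMillis(long responseTimeMillis, boolean intranet, long minimumDeltaMillis) {
        if (responseTimeMillis < 0 || minimumDeltaMillis < 0) {
            throw new IllegalArgumentException("times must not be negative");
        }
        // Internet: twice the response time. Intranet: half the response time.
        long latencyDelay = intranet ? responseTimeMillis / 2 : responseTimeMillis * 2;
        // Assumption: the configured minimum delta acts as a floor in both cases.
        return Math.max(minimumDeltaMillis, latencyDelay);
    }

    public static void main(String[] args) {
        System.out.println(delayMillis(300, false, 500)); // slow internet host
        System.out.println(delayMillis(100, false, 500)); // fast internet host, floor applies
        System.out.println(delayMillis(300, true, 0));    // intranet host, no floor configured
    }
}
```

Passing 0 as minimumDeltaMillis disables the floor, so the result depends only on the measured response time.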