and replaced the old first hash computation by a new method that tries to find a gap in the current DHT
to do this, it is necessary that the network bootstrapping is done before the own hash is computed
this made further redesigns of the peer initialization order necessary (a sketch of the gap search follows below)
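A minimal illustrative sketch of such a gap search, assuming the DHT positions of the known peers are available as numbers on a ring; class and method names are hypothetical, not the actual YaCy code:

    import java.util.Collections;
    import java.util.List;

    public class DHTGapFinder {

        /**
         * @param peerPositions non-empty list of DHT positions of known peers, each in [0, ringSize)
         * @param ringSize      size of the hash ring
         * @return a position in the middle of the largest gap between known peers
         */
        public static long findGapPosition(List<Long> peerPositions, long ringSize) {
            List<Long> sorted = new java.util.ArrayList<Long>(peerPositions);
            Collections.sort(sorted);

            long bestGap = -1;
            long bestStart = 0;
            for (int i = 0; i < sorted.size(); i++) {
                long current = sorted.get(i);
                long next = sorted.get((i + 1) % sorted.size());
                // wrap around the ring for the last interval
                long gap = (next - current + ringSize) % ringSize;
                if (gap > bestGap) {
                    bestGap = gap;
                    bestStart = current;
                }
            }
            // place the own position in the middle of the widest interval
            return (bestStart + bestGap / 2) % ringSize;
        }
    }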
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4117 6c8d7289-2bf4-0310-a012-ef5d649a1542
the result page is shown without waiting for the local search to terminate.
local search results appear like all other results from remote peers, produced by a separate thread (see the sketch below).
This has an especially strong effect if the local index for a specific word is large.
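A rough sketch of the idea, not the actual implementation: the local index lookup runs in its own thread and feeds the same result queue as the remote peers (all names are illustrative):

    import java.util.concurrent.BlockingQueue;

    public class LocalSearchThread extends Thread {

        private final String query;
        private final BlockingQueue<String> resultQueue; // shared with remote result collectors

        public LocalSearchThread(String query, BlockingQueue<String> resultQueue) {
            this.query = query;
            this.resultQueue = resultQueue;
            this.setName("localSearch-" + query);
        }

        @Override
        public void run() {
            // hypothetical local index lookup; this is the (possibly large)
            // lookup that used to block the result page
            for (String url : searchLocalIndex(this.query)) {
                this.resultQueue.offer(url);
            }
        }

        private java.util.List<String> searchLocalIndex(String query) {
            return java.util.Collections.emptyList(); // placeholder
        }
    }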
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4114 6c8d7289-2bf4-0310-a012-ef5d649a1542
- removed web structure picture from indexing menu and grouped it together with htcache monitor
- added a database for terminated crawls; when a crawl is finished, it is automatically moved to the new database
- extended crawl profile edit servlet, shows now also terminated crawls
- the option that was used to delete profiles has been redesigned into a function that moves the current crawl to the terminated crawls and removes all its urls from the current queues!
- fixed various problems with the indexing queues here and there
- enhanced indexing speed by changing cache flush sizes.
- changed behaviour of the crawl result servlet: the list of crawled urls is shown if there is one, otherwise the overview window is shown
attention: the new profile databases are not compatible with the old one. current crawls will be lost! the web index is not touched.
next steps: the database of terminated crawls can be used to start a new crawl from them. This is useful if one wants to re-crawl specific pages using an old crawl profile.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4113 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added a file size limitation that disallows parsing of large documents during (offline-) remote search (see the sketch below)
- added profiling information to the search result computation, visible in the search access tracker. This info shows the time used for URL fetch and snippet computation
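A sketch of the size check only; the limit value and names are assumptions, not the actual configuration:

    public final class SnippetFetchPolicy {

        // assumed default: do not parse documents larger than 1 MB for remote searches
        private static final long MAX_REMOTE_SNIPPET_FILESIZE = 1024L * 1024L;

        public static boolean mayParseForRemoteSearch(long documentSize) {
            return documentSize >= 0 && documentSize <= MAX_REMOTE_SNIPPET_FILESIZE;
        }
    }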
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4112 6c8d7289-2bf4-0310-a012-ef5d649a1542
1. avoid adding duplicate file name entries in the config properties for lists,
2. correctly merge all path masks from all list files for the same host masks (see the sketch after this list),
3. rewrite helper methods using standard Java methods for Collection transformations,
4. merged various methods with identical functionality for different Collection implementations into one,
5. minor refactoring to improve code readability.
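Illustrative sketch of the merge step (not the actual YaCy blacklist code); entries of the form "hostMask/pathMask" are assumed:

    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    public class ListMerger {

        /** merges entries so that each host mask appears once with the union of its path masks */
        public static Map<String, Set<String>> mergePathMasks(Iterable<String> entries) {
            Map<String, Set<String>> merged = new HashMap<String, Set<String>>();
            for (String entry : entries) {
                int slash = entry.indexOf('/');
                String host = (slash < 0) ? entry : entry.substring(0, slash);
                String path = (slash < 0) ? ".*" : entry.substring(slash + 1);
                Set<String> paths = merged.get(host);
                if (paths == null) {
                    paths = new HashSet<String>();
                    merged.put(host, paths);
                }
                paths.add(path); // the Set also avoids duplicate path mask entries
            }
            return merged;
        }
    }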
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4087 6c8d7289-2bf4-0310-a012-ef5d649a1542
- fixed search tips (topwords, now appearing at the bottom of the page)
- added execution of search consequences (deletion of badly referenced entries some time after the search happened)
- added some formatting at network table
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4078 6c8d7289-2bf4-0310-a012-ef5d649a1542
search profiling showed that a major amount of time is wasted by computing url hashes. The computation does an intranet-check, which needs a DNS lookup. As a result, each urlhash computation needed 100-200 milliseconds, which caused remote searches to be delayed by at least 1 second more than necessary. The solution to this problem is to attach a URL hash to the URL data structure, so that the url hash value can be filled in after retrieval of the URL from the database (a minimal sketch of the idea follows below). The redesign of the url/urlhash management caused a major redesign of many parts of the software. Some parts that had already been slated for removal were removed during this change to avoid unnecessary maintenance of unused code.
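A minimal sketch of the cached-hash idea; class and method names are illustrative, not the actual data structure:

    public class HashedURL {

        private final String url;
        private String hash; // null until known

        public HashedURL(String url) {
            this(url, null);
        }

        /** used when the URL is read from the database together with its stored hash */
        public HashedURL(String url, String knownHash) {
            this.url = url;
            this.hash = knownHash;
        }

        public String hash() {
            if (this.hash == null) {
                // expensive path: includes the intranet check / DNS lookup
                this.hash = computeURLHash(this.url);
            }
            return this.hash;
        }

        private static String computeURLHash(String url) {
            // placeholder for the real hash computation
            return Integer.toHexString(url.hashCode());
        }
    }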
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4074 6c8d7289-2bf4-0310-a012-ef5d649a1542
- snippets are no longer fetched by the browser using AJAX; they are now fetched internally
- YaCy-internal threads check the existence of snippets and sort out bad results
- search results are prepared using SSI includes
- the search result page is visible right after the search request; results drop in as they are detected
- no more time-out strategy during search processes; results are shifted within queues as they arrive from remote peers (see the sketch after this list)
- added result page switching! after the first 10 results, the next page can be retrieved
- number of remote results is updated online on the result page as they drop in
- removed the old snippet servlet (which was also a security leak, btw)
- media search is broken now, will be redesigned and fixed in another step
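A rough sketch of the queue-based drop-in handling; class and method names are illustrative, not the actual search event code:

    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;

    public class ResultInbox<E> {

        private final LinkedBlockingQueue<E> queue = new LinkedBlockingQueue<E>();

        /** called by the threads that receive results from remote peers */
        public void offerResult(E result) {
            this.queue.offer(result);
        }

        /** called by the result page; returns null if nothing new arrived within maxWaitMillis */
        public E nextResult(long maxWaitMillis) throws InterruptedException {
            return this.queue.poll(maxWaitMillis, TimeUnit.MILLISECONDS);
        }
    }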
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4071 6c8d7289-2bf4-0310-a012-ef5d649a1542
* added new column to network table (remote crawl urls):
the new value for provided URLs will be used for the new remote crawl method
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4061 6c8d7289-2bf4-0310-a012-ef5d649a1542
some urls are fetched so the url cache can be filled with these urls
- the url-prefetch is used to sort out some unresolved urls
- the snippet-fetcher is triggered with the search event id. This is used
to remove missing snippets from the search cache so they will not be displayed again
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4060 6c8d7289-2bf4-0310-a012-ef5d649a1542
it is necessary for the new search process that will do automatic re-searches
a positive effect is that when a re-search is done, it can be monitored how many
results have been contributed by other peers. The message for this contribution
was moved from the end of the result page to the top.
* enhanced re-search time when a global search was done and the local index
already has a large number of results for this word
* re-organised presearch computation; must be further enhanced
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4059 6c8d7289-2bf4-0310-a012-ef5d649a1542
collection objects has been reduced to 50%:
- set new memory calculation functions for indexing process
- adjusted guessed memory amount
-> Testing needed:
try new recommended value (see performanceQueues) and see if OOMs occur.
-> report maximum recommended value, so we can set new default values.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4053 6c8d7289-2bf4-0310-a012-ef5d649a1542
- re-designed remote request result processing
- re-designed local result accumulation, will be further enhanced with snippet fetcher
- removed search process handling in switchboard
- made snippet class static (there is no need for multiple snippet objects)
- removed some redundant tasks in server-side search process, should be a little bit faster now
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4043 6c8d7289-2bf4-0310-a012-ef5d649a1542
this saves about 1 millisecond for each URL reference, which has some good effect
on the search result computation if a word is searched that appears very often
(speed-up of 1 second and more)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4033 6c8d7289-2bf4-0310-a012-ef5d649a1542
- images may be requested by localhost and authorized users only if the request is done using a clear-text URL
- the image may also be requested using a code that acts as a license to retrieve the URL, usable by everyone (see the sketch after this list)
- some servlets produce URL licenses for ViewImage, e.g. image search results
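A hypothetical sketch of the license mechanism, not the actual servlet code: a random code is mapped to an image URL when a result page is rendered, and presenting the code later authorizes one fetch of that URL.

    import java.util.Map;
    import java.util.Random;
    import java.util.concurrent.ConcurrentHashMap;

    public class URLLicenseStore {

        private final Map<String, String> licenses = new ConcurrentHashMap<String, String>();
        private final Random random = new Random();

        /** called e.g. when image search results are rendered */
        public String acquireLicense(String imageURL) {
            String code = Long.toHexString(this.random.nextLong());
            this.licenses.put(code, imageURL);
            return code;
        }

        /** called by the image servlet; returns null for unknown codes */
        public String releaseLicense(String code) {
            return this.licenses.remove(code);
        }
    }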
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4027 6c8d7289-2bf4-0310-a012-ef5d649a1542
to avoid triggering the synchronization during the frequently used size() operation,
a notEmpty method was added that avoids the synchronization in many cases
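A minimal sketch of the idea, assuming a collection whose size() must synchronize on an underlying structure; the unsynchronized check is a heuristic that may return a slightly stale answer, which is acceptable for "is there anything to do" questions:

    public class SynchronizedBuffer {

        private final java.util.List<Object> buffer = new java.util.ArrayList<Object>();

        public synchronized int size() {
            // expensive: every caller competes for the lock
            return this.buffer.size();
        }

        public boolean notEmpty() {
            // cheap, unsynchronized check; callers that only want to know
            // whether work exists do not need the exact count
            return !this.buffer.isEmpty();
        }
    }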
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4025 6c8d7289-2bf4-0310-a012-ef5d649a1542
- refactoring of kelondroRecords; this class is now divided into
kelondroAbstractRecords, kelondroRecords, kelondroCachedRecords, kelondroHandle and kelondroNode
- better abstraction of kelondroNodes; such nodes may now be created by different classes
- a new Node-defining class, kelondroEcoRecords, defines Nodes that need less allocation and System.arraycopy
- there is less memory transfer on the bus, especially for the collection index
- now only half the memory is needed for web index access
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4024 6c8d7289-2bf4-0310-a012-ef5d649a1542
- different handling of link quotation
- different handling of link normalization
- enhanced html/unicode en/de-coding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3993 6c8d7289-2bf4-0310-a012-ef5d649a1542
- within this context: generalized date format handling
- extended Update interface:
* a version lookup can be triggered manually
* a complete lookup + download + re-boot process can be triggered with one click
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3986 6c8d7289-2bf4-0310-a012-ef5d649a1542
- removed old rss parser
- removed unused rss parser libraries
- added new rss reader
- added the previously removed FeedReader_p.java and adapted it to the new rss parser
- adapted the parser interface for rss indexing to the new rss parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3970 6c8d7289-2bf4-0310-a012-ef5d649a1542
target servers had been able to see search words from the referrer of the favicon fetch.
This leak has been removed by using the getImage servlet for the favicon fetch.
Since Java does not support loading of bmp and ico images, such parsers have been added.
The image parsers were coded from the original Microsoft documentation.
This also influences the image-search functionality: there can now be a preview
of found bmp images. Another benefit: favicons for search results are now cached in the HTCACHE.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3965 6c8d7289-2bf4-0310-a012-ef5d649a1542
after downloading a release using the download button on the status page
the user can choose any of the downloaded versions for an update.
this also enables a downgrade to an older version.
when the update button is pushed, yacy terminates, installs the chosen version
and restarts
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3948 6c8d7289-2bf4-0310-a012-ef5d649a1542
- fix for missing restart script in ant built target
- removed some more synchronization for size() operations
- removed blocking statement on search page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3935 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added new download-option for releases on the status page
still missing:
- thomas-style restart for linux/mac
- untar/gunzip on shell basis
(comes next)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3931 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added chunked file transfer for non-yacy clients
- SSIs are streamed using chunked transfer; partly delivered pages can be seen in the browser before transmission is finished (see the sketch after this list)
- added client-side network unit identification
- cleaned up code
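For reference, a sketch of how HTTP/1.1 chunked transfer encoding works on the wire: each chunk is its length in hex, CRLF, the data, CRLF; a zero-length chunk terminates the body. Names are illustrative, not the YaCy httpd classes.

    import java.io.IOException;
    import java.io.OutputStream;

    public class ChunkedWriter {

        private static final byte[] CRLF = {13, 10};
        private final OutputStream out;

        public ChunkedWriter(OutputStream out) {
            this.out = out;
        }

        public void writeChunk(byte[] data, int offset, int length) throws IOException {
            if (length == 0) return; // a zero-length chunk would end the body
            this.out.write(Integer.toHexString(length).getBytes("US-ASCII"));
            this.out.write(CRLF);
            this.out.write(data, offset, length);
            this.out.write(CRLF);
            this.out.flush(); // lets the browser render partly delivered pages
        }

        public void finish() throws IOException {
            this.out.write('0');
            this.out.write(CRLF);
            this.out.write(CRLF);
            this.out.flush();
        }
    }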
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3926 6c8d7289-2bf4-0310-a012-ef5d649a1542
we will provide two different releases in the future, one standard release and one 'pro'-release.
the 'pro'-release contains all additional parsers AND has different default performance values.
The pro-version therefore differs from the previous 'all'-version in these default values.
The pro-configuration is automatically chosen if the libx-folder exists. Once a version is initialized, its configuration stays independent of an existing libx folder.
The ant targets had been changed. There are now 3 different targets to create standard and pro-releases, and one target to upgrade:
- dist: creates a standard release (only, no libx target any more)
- distPro: creates a pro-release (includes the libx)
- distExt: creates a libx-release which includes the libx-folder only. It may be used to upgrade from standard to pro
Furthermore, the naming of 'dev'-releases had been removed.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3902 6c8d7289-2bf4-0310-a012-ef5d649a1542
- caught the case where the web structure cannot be painted because of too little data
- better logging when balance fails
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3892 6c8d7289-2bf4-0310-a012-ef5d649a1542
- caught possible NPE in CacheAdmin_p and added more error-cases
- sped up deletion of entries in the local crawl queue by crawl profile (it has often been noted that this deletion is slow)
- added a bit of javadoc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3868 6c8d7289-2bf4-0310-a012-ef5d649a1542
hosts that have many of their own connections are painted farther away (this is not yet cato's idea; that will be implemented in another step)
- doc update
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3796 6c8d7289-2bf4-0310-a012-ef5d649a1542
- all web page parsing operations will now feed a web structure file
- the structure is computed in memory and dumped at shutdown time to PLASMASB/webStructure.map in readable form (not a database); see the sketch after this list
- the file can be used externally to analyse the link structure of the crawled pages
- the web structure can also be retrieved using an xml-interface at http://localhost:8080/xml/webstructure.xml
- the short-term purpose is the computation of a link-graph image (before linuxtag!)
- a long-term purpose could be a decentralized computation of the citation rank
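An illustrative sketch of such an in-memory structure and its shutdown dump; the details of the real map format are assumptions:

    import java.io.PrintWriter;
    import java.util.HashMap;
    import java.util.Map;

    public class WebStructure {

        // source host -> (target host -> number of links)
        private final Map<String, Map<String, Integer>> structure =
                new HashMap<String, Map<String, Integer>>();

        /** called for every outgoing link found during parsing */
        public synchronized void countLink(String fromHost, String toHost) {
            Map<String, Integer> targets = this.structure.get(fromHost);
            if (targets == null) {
                targets = new HashMap<String, Integer>();
                this.structure.put(fromHost, targets);
            }
            Integer count = targets.get(toHost);
            targets.put(toHost, (count == null) ? 1 : count + 1);
        }

        /** called at shutdown time; writes one readable line per source host */
        public synchronized void dump(PrintWriter out) {
            for (Map.Entry<String, Map<String, Integer>> entry : this.structure.entrySet()) {
                out.println(entry.getKey() + "=" + entry.getValue());
            }
        }
    }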
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3746 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added 7zip parser
- added 'text/sgml' to realtime parseable mimetypes (sometimes returned by the mime type parser)
- added a new cached output stream class, well suited for parsers because of limited memory (see the sketch below)
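A sketch of the idea behind such a cached output stream (buffer small documents in memory, spill to a temporary file once a size limit is exceeded); this is not the actual class:

    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;

    public class CachedOutputStream extends OutputStream {

        private final int memoryLimit;
        private ByteArrayOutputStream memory = new ByteArrayOutputStream();
        private OutputStream file = null;
        private File tempFile = null;

        public CachedOutputStream(int memoryLimit) {
            this.memoryLimit = memoryLimit;
        }

        @Override
        public void write(int b) throws IOException {
            // spill to disk once the in-memory buffer would exceed the limit
            if (this.file == null && this.memory.size() + 1 > this.memoryLimit) spillToFile();
            (this.file != null ? this.file : this.memory).write(b);
        }

        private void spillToFile() throws IOException {
            this.tempFile = File.createTempFile("parser", ".tmp");
            this.file = new FileOutputStream(this.tempFile);
            this.memory.writeTo(this.file); // copy what was buffered so far
            this.memory = null;
        }

        @Override
        public void close() throws IOException {
            if (this.file != null) this.file.close();
        }
    }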
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3740 6c8d7289-2bf4-0310-a012-ef5d649a1542
*) Changed "Lost Handle" error to warning (masses of it if deleting crawl-profile)
*) Removed unnecessary code from Windows script
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3708 6c8d7289-2bf4-0310-a012-ef5d649a1542
*) First version of a sitemap parser added
- currently only autodetection of sitemap files is supported (see the sketch after this list)
*) DB-Import restructured
- pause/resume should work again now
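For illustration, a sketch of one common autodetection approach (reading "Sitemap:" lines from robots.txt); this is not necessarily the detection implemented here:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.util.ArrayList;
    import java.util.List;

    public class SitemapDetector {

        /** returns all sitemap URLs announced in the host's robots.txt */
        public static List<String> detectSitemaps(String host) throws IOException {
            List<String> sitemaps = new ArrayList<String>();
            URL robots = new URL("http://" + host + "/robots.txt");
            BufferedReader reader = new BufferedReader(new InputStreamReader(robots.openStream()));
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (line.toLowerCase().startsWith("sitemap:")) {
                        sitemaps.add(line.substring(8).trim());
                    }
                }
            } finally {
                reader.close();
            }
            return sitemaps;
        }
    }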
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3666 6c8d7289-2bf4-0310-a012-ef5d649a1542
- cluster definitions can now contain an addition for local ip addresses
- cluster-cluster communication uses the local ip address instead of the global address, if one is given
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3624 6c8d7289-2bf4-0310-a012-ef5d649a1542
automatically acquire release information from download archives
web pages from latest.yacy-forum.net and yacy.net are retrieved, parsed,
links within are analysed and sorted, and the most recent developer and main
releases are provided as direct download links on the status page, if a
more recent version than the current one has been discovered.
This process is done only once during run-time of a peer, to protect our
download archives from DoS by YaCy peers.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3606 6c8d7289-2bf4-0310-a012-ef5d649a1542
- the network configuration page shows a new option: robinson clusters
- when a global search is made, all robinson peers are excluded, but:
- robinson peers/clusters that provide peer tags, where the search words match
such tags, are included in the global search. Therefore, robinson peers/clusters
support the global yacy network with their indexes without doing DHT exchange
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3598 6c8d7289-2bf4-0310-a012-ef5d649a1542
*) Marked two deprecated source-points
*) Added possibility to dump words from indexing to file. Should not affect performance in the current form.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3592 6c8d7289-2bf4-0310-a012-ef5d649a1542
- new cluster functions will be available in this menu, but currently not enabled,
because corresponding interface methods are not ready yet
- shifted remote crawl settings to new network configuration menu
- shifted DHT distribution/receive to the new network configuration menu
- adapted some string constants
- added cluster configuration settings to yacy.init
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3589 6c8d7289-2bf4-0310-a012-ef5d649a1542
http://www.yacy-forum.de/viewtopic.php?t=3854
This is a serious problem that is caused by the database bug between 0.511 - 0.513
which produced a large number of double-entries in the RWI index. The uniq()-method
tries to fix this, and it does not terminate when the index is large and the number
of double-occurrences is also large. This patch simply implements a time-controlled
termination, which does not heal the inconsistency problem. The uniq()-method itself
is correct and does not need a bugfix; the non-termination is simply caused by the large
amount of data that is shifted during the process. It was possible to reproduce this behaviour
in a test environment.
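A minimal sketch of such a time-controlled termination (illustrative only, not the actual uniq() code):

    import java.util.List;

    public class UniqHelper {

        /**
         * removes adjacent duplicates from a sorted list, but stops after timeoutMillis
         * @return true if the whole list was processed, false if the time limit was hit
         */
        public static <E> boolean uniq(List<E> sorted, long timeoutMillis) {
            long deadline = System.currentTimeMillis() + timeoutMillis;
            int i = sorted.size() - 1;
            while (i > 0) {
                // bounded run time: gives up cleanly instead of running for hours,
                // but does not heal the underlying inconsistency
                if (System.currentTimeMillis() > deadline) return false;
                if (sorted.get(i).equals(sorted.get(i - 1))) sorted.remove(i);
                i--;
            }
            return true;
        }
    }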
A real fix would need to:
- enhance the uniq()-method by using a recursive, binary segmentation of the array to be fixed
- uniq() must report the entries that are double
- the double-entries must be deleted from the collection index (from the index and the collections) to heal the problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3583 6c8d7289-2bf4-0310-a012-ef5d649a1542
- some bugs with wrong removal operations may have been fixed
- removed temporary storage of remove-positions and replaced it by direct deletions
- changed synchronization
- added many assertions
- modified dbtest to also test remove during threaded stresstest
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3576 6c8d7289-2bf4-0310-a012-ef5d649a1542