yacy_search_server

Commit Graph

Author	SHA1	Message	Date
orbiter	64b3b79e44	- fix for termination problem with uniq() - addition to seed dna interpretation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4208 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	0abf33ed03	- tried to remove deadlock - enhanced searchtime in kelondroRowSets - enhanced uniq() - reverse enumeration causes less time in case of mass removal of doubles git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4207 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
low012	a4010f7dc8	*) fixed bug where dots were added after numbers < 1000: "123" was transformed to "123." which is undesirable git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4206 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	2421127612	fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=513&hilit= git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4204 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	d0d2771883	disabled multiprocessoring of rowCollection.sort for testing purpose git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4202 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	edc4da5317	fix for division by zero in test routine git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4201 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	df38aaf7bd	update to RowCollection sort speed-enhancements: - better handling of small collections (less overhead) - usage of pre-sorted limits - different re-sort limit - more testing procedures git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4200 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	0eb60cfe6f	better handling of seed properties git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4199 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	ecba35de72	enhanced computing speed of kelondro core function: sorting the enhancement was made by using better organized data structures and multi-threading during the sort. A sort can be divided into two separate processes when the first partition of the quicksort algorithm was done. Generating a separate thread and starting the thread takes only 10 milliseconds, so using a separate thread makes only sense if the data amount is large. statistics about the speed-up: without enhancement: 250 milliseconds for 100000 entries with data structure enhancement: 170 milliseconds for 100000 entries with additional second thread (if second processor is present): 130 milliseconds. For dual-processor systems, this means about 100% speed-up a test can be made with the following command: java -classpath classes de.anomic.kelondro.kelondroRowCollection git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4198 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	6eaa5a0e64	enhanced local search speed. The ranking process is now 6 times faster than before. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4197 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	425e4ead66	Allow absolute paths in configuration settings. - before absolute paths would be expanded incorrectly, e.g.: fooPath=/a/b/c would become /path/to/yacy/root/a/b/c. Now you can put nearly every dynamically generated data with a configurable path to a location outside of yacys root dir without having to use symlinks (probably good for third party distribution packaging). - abstractServerSwitch.getConfigPath(setting, default) returns a File instance, either with an absolute path or relative to the applications root path. - exceptions (hardcoded): DATA/LOG/yacy.logging DATA/SETTINGS/httpProxy.conf DATA/SETTINGS/user.db TODO: all of these are the global configuration files and they should probably be put into _one_ command line configurable settings path, so it would be possible to package them in /etc/ for example. - add missing workPath to yacy.init (it was used in code, but there was no default in the file) - fix broken skinPath (was skinsPath in yacy.init but skinsPath in the code) + a few other broken config reading caused by typos. - replaced path setting names and their default values with the related static fields in plasmaSwitchboard where not already done/existing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4196 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
borg-0300	e8d32d9f62	other loglevel git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4195 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
borg-0300	a5d28785b1	less OOM (works for me) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4194 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	ccbfb15b6b	enhancement to crawl stacker enqueue order git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4192 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
hermens	5c5344ae97	Beautify log git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4190 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
hermens	35cf196204	transferRanking(): Do not flush more ranking files than requested by caller. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4189 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
hermens	d0aa8cf25d	Only update handshaked peer's last seed date if it has not been updated recently. Until now the newer data was overwritten by old data from before the handshake. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4188 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
hermens	8f9d65da67	Small corrections to dhtFlushControl() - Test wCacheMaxChunk against maxURLinCache(), not getMaxWordCount(). This triggered a flush everytime dhtFlushControl() was called. - If triggered, flush at least 1 entry. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4187 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	55c87b3b12	changed behavior of crawl stacker - final flush only when tabletype = RAM - prestacker (dns prefetch) only if tabletype = RAM and busytime <= 100 - number of maximum entries in stacker is configurable in yacy.init (stacker.slots) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4186 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
hermens	18144043e6	Correct UTC Offset at beginning/end of daylight savings time git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4185 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	4fefa53135	removed parser object pool, see also svn 4106 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4184 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	a31b9097a4	preparations for mass remote crawls: two main changes must be implemented to enable mass remote crawls: - shift control of robots.txt to crawl queue (away from stacker). This is necessary since remote crawls can contain unchecked urls. Each peer must check the robots to prevent that it is misused as crawl agent for unwanted file retrieval - implement new index files that control double-check of remotely crawled urls After removal of robots.txt checking from stacker threads, the multi-threading of this process is void. Multithreading has been removed. Also the thread pools for the crawl threads had been removed, since creation of these threads is not resource-consuming, for a detailed explanation see svn 4106 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4181 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	a718858e8b	seed.CCOUNT is interpreted as a double value not int git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4180 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	0e1738899f	* Complete number localization and provide a more reasonable interface to serverObjects: - put(key, value) methods are now used if a value added to the map should be kept as it is. Numbers are transformed (but not formatted) to an equivalent String representation. - putASIS(...) have been removed, now done with simple put(...) (see above). - putNum(...) can be used for number values which should be stored in a formatted way, either depending on the current locale setting for yacy (default) or in a "none" locale (see javadocs and setLocalize()). - putHTML(...) escapes special characters into corresponding HTML entities ('<' => '&lt;') which was done with put(...) before and so was called too often, because it is necessary only for very few cases. Additionally there is a "forXML" mode which only replaces < > & ". In short: Use put(...) for almost everything, use putXY(...) if you need some special transformation of the value. A few bugs have been fixed as well, and there should be a small performance improvement for complex pages with a lot of values. * added additional Sum/Avg rows to access tracker pages, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=456 * removed duplicate code (mostly related to the big changes above). TODO: - make sure, number formats work as expected _everywhere_, report overseen stuff http://forum.yacy-websuche.de/viewtopic.php?f=5&t=437 - probably a good idea to add special putDate() methods as they are used in many pages and create duplicated formatting code + maybe some centralized handling for memory value formatting. - further improve the speed of page creation for the WatchCrawler. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4178 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	f8318436a1	fix for last commit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4177 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	7d57b80598	distinct keepOrder strategy, more discrete implementation of enhancement introduced in SVN 4158 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4176 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	9a7b093eed	tried to avoid endless loop, see also: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=467&hilit= git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4175 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	b856e377a9	some additions and a small bugfix to SVN 4158 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4173 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
hermens	501a7aae90	Small correction git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4172 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
hermens	caff520988	Removed unnecessary and unused code. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4171 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
hermens	d732840f8a	Avoid ConcurrentModificationException when accessing the PerformanceQueues page while yacy is indexing. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4170 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	35303f9504	add real size values (KBytes) of the DHT-In/Out-RAM-Caches to the PerformanceQueues page. A lot of users seem to tweak this value and it might help in finding the best size in relation to the peer's memory resources. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4169 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	38bbd4a4b3	no code changes. just touched yacyClient.java to trigger a rebuild of the file in an uncleaned tree. NOTE: run "ant clean" before building SVN 4166/4167 in a tree that includes class files from a previous build to make sure, that every class file is rebuilt! git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4167 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	f717beecb1	- Changed yFormatter handling to be more flexible and produce more readable code for server pages. There are serverObject.putNum() methods to allow adding of number type values in a formatted form, and put() methods for number types that add them without formatting. This reduces the need to transform them into Strings in server pages and removes the HTML encoding step which is unnecessary for numbers. - some minor code cleanups (mostly unnecessary casts, null checks) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4166 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	ca83f5a8d9	Add external lib FontBox which is part of the PDFBox (they extracted the font handling code into this package in 0.7.3). Add the packages to the eclipse .classpath. Closes: http://forum.yacy-websuche.de/viewtopic.php?f=5&t=453 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4165 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	3352474dd8	Remove grouping separator in Network.xml (yacystats will work without it) and format a few more numbers. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4163 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	06e6a1ff62	Add a generalized Formatter class yFormatter inspired by http://forum.yacy-websuche.de/viewtopic.php?f=5&t=437 At the current state it allows formatting of numbers (integer + decimal types) for output according to the Locale derived from the language setting in yacy. Network.(html\|xml) and Status.html have been changed to use it for now (TODO: should be integrated into other servlets as well to reduce duplicate formatting code). NOTE: For now the output format for Network.xml simulates the old behaviour which is wrong (it uses '.' as decimal and grouping separator), to make sure external scripts like the yacystats.de one won't break with this update. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4162 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	e77aec8c9d	fix handling of encrypted PDF-Documents (with default user password "") - update PDFBox package to current version 0.7.3 - use new security model in PDFBox to "guess" whether we can decrypt a document or not NOTE: When upgrading to this version make sure the old PDFBox-0.7.2.jar is removed from libx/ git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4161 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	b5f7df8d0a	Speed up remove operations in rowCollections. - Array element shifting during remove is only done when it is necessary to keep the order of a row collection. - This will speed up the most expensive operation "common word shrinking" by a factor of 500-1000 (in the worst cases we shifted > 60 GB of data during this operation) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4158 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
low012	fdb0b861f8	*) fixed wrong calculation of network words, network links, network PPM if peer is senior or principal peer *) added network QPH *) banner is cached for 1 second to avoid DOS *) still no logo git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4154 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	508de558f7	sbStackCrawlThread is null during first cleanProfiles() run at startup. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4152 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	70614385ef	Attempt to fix the "lost profile handle" bug. It seems improbable, but it might happen, that during a crawl all queues (indexing, crawling, ...) except the crawl URL stacker ran empty. This commit adds an additional check for an empty crawl stacker queue before executing the profile cleaner. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4151 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
low012	507ecd8afa	*) added banner that can be displayed like this: http://localhost:8080/Banner.png possible arguments: textcolor, bgcolor, bordercolor example: http://localhost:8000/Banner.png?textcolor=ffffff&bgcolor=121212&bordercolor=ffffff take care: YaCy uses CMY color model! *) there are still some known bugs, but I can't continue coding right now git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4149 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	9b0948cb4c	gnarf. mixed up the positions. finally fixed... git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4143 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	c0f5fc51ef	bugfix for last commit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4142 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	33fb2f756d	added emergency fail case in remote crawls in extreme situations this will cause that no remote crawls are send out any more this is bad, but it protects the case where failing remote crawls fill up the local queue too much, which is even worse git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4141 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	c5a8585ac6	fix more encoding problems in yacysearch.rss. - URL encoding for search terms where required - removed "ugly" CDATA escaping - UTF-8 encoding for the XML - no HTML style escaping for XML/RSS element values Note: some unicode characters might still be encoded in a wrong way. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4140 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	6b00fe0c4e	fix ArrayIndexOutOfBoundsException git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4139 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	3e60ae93b9	modified remote search snippet fetch behavior: do not fetch snippets for more than 300 milliseconds, even if the snippets can be found locally without online fetch git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4137 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	97f1ca52bd	fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=390 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4136 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	143fa40d77	fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=394&p=2382#p2382 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4135 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	711641f167	extended client connection clean-up: there are now two time-outs, one for the complete connection time, and one for an idle time connections that are idle for more than 2 minutes are closed, and connections that are alive since more than one hour are also closed if the complete number of connections exceeds 64, all connections more than 64 and have most idle time are also closed During normal operation of peers these forced closings should never appear, but the existence of the idle connection check ensures the availability of the peer and the usability of the host. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4134 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	b19bb6e5b1	- reverted svn 4132; this did not solve the problem and removed the emergency method which caused production failure for sure within some hours - removed and added some debugging lines git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4133 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	1eba408d2f	Make sure that sockets which couldn't be opened aren't handled as active connections, in which case they wouldn't be closed. Please test this and report any problems (connections that stay open for a very long time according to http://<your_yacy_peer>/Connections_p.html) to http://forum.yacy-websuche.de/viewtopic.php?f=5&t=386 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4132 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	03c5b4ad68	more fixes to the yacysearch.rss, it's now 100% valid according to http://feedvalidator.org - RFC-822 date time had to include the time instead of date only - <opensearch:link> doesn't exist -> <atom:link>, see http://www.opensearch.org/Specifications/OpenSearch/1.1 - <link> elements are mandatory for <channel> and <item> git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4131 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	d69d386f7d	added additional forced client connection closing if a specific number of simultaneous connections is reached the limit is currently set to 64 connections git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4129 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	dea7bee049	- increased minimum time before an active connection is interrupted from 1 minute to 10 minutes - added sorting by connection time in client connection table of connectionTimeComparatorInstance git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4128 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	c1440d2241	fixed problem with redirection: redirected URLs had not been tested with the double-check see also: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=348 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4126 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	7404f2c35c	Fix some of the issues with the RSS search interface, see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=392 Note: the new DateFormatter822 in the plasmaSwitchboard is just a copy of the DateFormatter that always uses the US locale to allow formatting of a locale independent date String. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4124 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	98abe0804d	another enhancement to crawl starts with link files git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4123 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	1b42152a76	fixed and enhanced some details in crawl start with file git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4120 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	4465db7399	removed debug information from network graphic git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4118 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	01e0669264	re-designed some parts of DHT position calculation (effect is the same as before) and replaced old first hash computation by new method that tries to find a gap in the current dht to do this, it is necessary that the network bootstrapping is done before the own hash is computed this made further redesigns in peer initialization order necessary git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4117 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
hermens	d547c3b4bd	Avoid NullPointerException in yacySeedDB.lookupByIP git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4116 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	5b1a937ed8	fix for crawl stack database format change, introduced in SVN 4113 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4115 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	af25c98306	enhanced local search performance in case of a remote search: there is no waiting until the local search terminates to show the result page. the local search appear like all other results from remote peers using a separated thread. This has especially a strong effect, if the local index for a specific word is large. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4114 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	842308ea97	- redesigned crawl start menu, integrated monitoring pages - removed web structure picture from indexing menu and grouped it together with htcache monitor - added a database for terminated crawls, when a crawl is finished it is automatically moved to the new database - extended crawl profile edit servlet, shows now also terminated crawls - option that was used to delete profiles is now redesigned to a function that moves the current crawl to the terminated crawls and removes all urls from the current queues! - fixed here and there problems with indexing queues - enhances indexing speed by changing cache flush sizes. - changed behaviour of crawl result servlet: the list of crawled urls is shown if there is one, otherwise the overview window is shown attention: the new profile databases are not compatible with the old one. current crawls will be lost! the web index is not touched. next steps: the database of terminated crawls can be used to start with them a new crawl. This is useful if one wants to re-crawl specific pages and wants to use a old crawl profile. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4113 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	341f7cb327	steps to enhance remote search performance: - added a file size limitation, that disallows parsing of large documents during (offline-) remote search - added profiling information to search result computation, visible at search access tracker. this info shows used time for URL fetch and snippet computation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4112 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	2f1ff048ba	some fixes to socket connection time-out git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4111 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	3c74014004	automatic deletion of dead client connections git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4110 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	11b4f80bde	- fixed non-closing client connections - added client connection tracker in connections servlet git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4108 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	d352853f2d	fix for non-closing client sessions git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4107 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	1488769e1f	cleanup of unmaintained and outdated performance methods: removed object pools in httpc. Object pooling is not recommended, if the creation of the object is not time-intensive. Object pools are only useful, if there is much computation necessary to create some basic data that is stored in the object pool and can be re-used. This does not apply to object pools in YaCy. Object pooling of client sessions would make sense if they would allow re-use of living connections to other yacy clients. But every connection is closed after usage of an object in the client pool, therefore the YaCy server client objects are not such that hold hardware/network-allocated entities. See: http://www.javaperformancetuning.com/news/qotm033.shtml http://java.sun.com/docs/hotspot/HotSpotFAQ.html#gc_pooling http://docs.sun.com/source/816-7159-10/pt_chap5.html http://www.microjava.com/articles/techtalk/recylcle2 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4106 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	3cb9cdc9be	try to fix connection problem, possible cause for wrong junior status and non-passive passive peers: the YaCy client treats disconnections during data transmissions as error and discards all data transmitted so far this did not happen so far until I removed a delay time at the end of the daemon session which prevented this case. To fix this problem, disconnections during transmissions are not treated as error now, which means that end-of-transmissions with sudden disconnections are not a cause for peer disconnections any more. To be nice to non-updated peers, the sleep time at the end of server sessions is also re-enabled. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4105 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
borg-0300	ba59de773f	again and again junior - test git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4097 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	4275727d69	fix for peer ping problem (implemented a 3-time re-ping); cause for 'Connection reset' still unknown git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4095 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	07d1e98909	fixed round-robin method of peer-ping order (the successfully pinged peer was not updated to current last-seed date) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4093 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	76e4c2d69e	fix for peer-ping in case that remote peer does not respond with valid values git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4091 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	e192f99134	fix small bug introduced in r4089 that appeared when we tried to remove "gzip" encoding from Accept-Encodings header closes http://forum.yacy-websuche.de/viewtopic.php?f=6&t=336 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4090 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	ae4b9308ef	Fix problems with some web servers which couldn't handle the way yacy was sending requests. Thx to celle for the patch. http://forum.yacy-websuche.de/viewtopic.php?f=5&t=320 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4089 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	6601e37512	clear caches after changing blacklists, closes http://forum.yacy-websuche.de/viewtopic.php?f=6&t=241&p=1964#p1964 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4088 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	5b0c1449e1	various fixes and cleanups for blacklist handling: 1. avoid adding duplicate file name entries in config properties for lists, 2. correctly merge all path masks from all list files for the same host masks, 3. rewrite helper methods standard java methods for Collection transformations, 4. merged various methods with identical functionality for different Collection implementations into one, 5. minor refactoring to improve code readability. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4087 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	841cf71022	fix for NPE in DHT transfer selection, see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=327 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4085 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	dbd1eeead5	fix for missing object miss-cache flush value: the value is always zero because there is no miss-cache flush see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=288 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4083 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	f2a3434407	fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=238&p=1341#p1341 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4082 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	f4a5c287fe	re-implemented post-ranking of search results (should enhance search result quality) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4080 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	8ff5e2c283	- fixed/re-implemented media search - fixed search tips (topwords, now appearing at the bottom of the page) - added search consequences execution (deletion of bad references some time after the search happened) - added some formatting at network table git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4078 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	6c819a6fd9	added cache to favicon display added better synchronization for simultaneous search requests git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4076 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
borg-0300	d69013f66a	added patch from Fuchs - http://forum.yacy-websuche.de/viewtopic.php?f=6&t=241 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4075 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	daf0f74361	joined anomic.net.URL, plasmaURL and url hash computation: search profiling showed, that a major amount of time is wasted by computing url hashes. The computation does an intranet-check, which needs a DNS lookup. This caused that each urlhash computation needed 100-200 milliseconds, which caused remote searches to delay at least 1 second more than necessary. The solution to this problem is to attach a URL hash to the URL data structure, because that means that the url hash value can be filled after retrieval of the URL from the database. The redesign of the url/urlhash management caused a major redesign of many parts of the software. Since some parts had been decided to be given up they had been removed during this change to avoid unnecessary maintenance of unused code. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4074 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	e90afa9483	fixed search access tracker git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4072 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	4779f314fe	first version of next-generation search interface: - snippets are not fetched by browser using ajax, they are now fetched internally - YaCy-internal threads control existence of snippets and sort out bad results - search results are prepared using SSI includes - the search result page is visible right after the search request, the results drop in when they are detected - no more time-out strategy during search processes, results are shifted within queues when they arrive from remote peers - added result page switching! after the first 10 results, the next page can be retrieved - number of remote results is updated online on the result page as they drop in - removed old snippet servlet (which had been also a security leak btw) - media search is broken now, will be redesigned and fixed in another step git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4071 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	6d759ad0a7	- new bot address - removed unused skins git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4065 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	f9e6cf6a3d	more refactoring of search: integrated first version of ssi-using search interface, but the function is currently disabled git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4063 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	f81ef40cc4	no dht activity for small networks; this is not needed if the network is small git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4062 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	d9472b6a3a	* fixed problem with watch crawler * added new column to network table (remote crawl urls): the new value for provided URLs will be used for new remote crawl method git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4061 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	e332b844b2	- enhanced remote search: during waiting time for remote crawls some urls are fetched so the url cache can be filled with these urls - the url-prefetch is used to sort out some unresolved urls - the snippet-fetcher is triggered with the search event id. This is used to remove missing snippets from the search cache so they will not be displayed again git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4060 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	a34d9b8609	* added a search history cache that maintains search results for 10 minutes it is necessary for the new search process that will do automatic re-searches a positive effect is, that when a re-search is done it can be monitored how many results had been contributed from other peers. The message for this contribution was moved from the end of the result page to the top. * enhanced re-search time when a global search was done and the local index has already a great number of results for this word * re-organised presearch computation; must be further enhanced git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4059 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	ae86d010bb	more refactoring of search processes; also some small speed enhancements git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4058 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	bb426565f0	added new yacy protocol for mass url-pull for better remote crawling distribution git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4056 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
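
Note on b5f7df8d0a (SVN 4158): the message says array elements are shifted during a remove only when the order of the row collection has to be kept. The following is a minimal sketch of that idea, not the actual kelondroRowCollection code. The class name, the method signature and the use of a plain int array are assumptions made for illustration only.

```java
import java.util.Arrays;

public class RemoveSketch {

    // Hypothetical helper: removes the element at 'index' from the first
    // 'size' slots of 'rows' and returns the new logical size.
    static int remove(int[] rows, int size, int index, boolean keepOrder) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        if (keepOrder) {
            // Shift the tail one slot to the left: O(n) per removal,
            // but the relative order of the remaining elements is kept.
            System.arraycopy(rows, index + 1, rows, index, size - index - 1);
        } else {
            // Overwrite the removed slot with the last element: O(1),
            // but the order of the remaining elements changes.
            rows[index] = rows[size - 1];
        }
        return size - 1;
    }

    public static void main(String[] args) {
        int[] a = {10, 20, 30, 40, 50};
        int n = remove(a, 5, 1, true);
        System.out.println(Arrays.toString(Arrays.copyOf(a, n))); // [10, 30, 40, 50]

        int[] b = {10, 20, 30, 40, 50};
        int m = remove(b, 5, 1, false);
        System.out.println(Arrays.toString(Arrays.copyOf(b, m))); // [10, 50, 30, 40]
    }
}
```

With many removals in a row, the unordered variant avoids the repeated tail copy. A collection that needs sorted order afterwards would have to be re-sorted once at the end.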

2692 Commits (6fbda9ef4fbdef064c4ffd723a62cd8498e6fe6e)