yacy_search_server

Commit Graph

Author	SHA1	Message	Date
orbiter	3e38035389	fix for interrupted thread during has() property check git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6370 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	a995b95367	tried a fix for the httpd access bug (too many unclosed sessions) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6362 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e1fba41cad	better logging git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6361 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	2275f885a8	possible fix for concurrency problem git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6360 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	5a93807781	improved web cache speed: - removed one computation out of a synchronization - removed one not necessary has() call git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6358 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	2e8b2867ff	double performance of store method because it avoids one 'has' git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6357 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	afda5b1adc	new join method for indexes (not yet used) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6356 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	65b66c2c18	better handling of array files of length 0 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6355 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	432154f725	new strategy for concurrent database index key retrieval git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6353 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	1171a72006	fix for deadlock as seen in http://forum.yacy-websuche.de/viewtopic.php?p=17521#p17521 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6343 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	031e6eefbd	some updates to dublin core, metadata browsing, file indexing and parser stability git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6342 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e4797ebcde	fix for http://forum.yacy-websuche.de/viewtopic.php?p=17509#p17509 corrupted files are ignored git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6339 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	efa7fb34f0	better oom-awareness of miss-cache in cache git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6338 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c3a4aee255	some redesign with a possible fix for the ReferenceContainerCache. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6336 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	aca8a78eb8	fix for shutdown of DocumentIndex objects git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6333 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	4db34eea73	fix for OOM problem in kelondro Cache git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6331 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	fbd77bd77c	git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6328 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	28d4b921b6	different approach for file search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6325 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	f99f86c5c5	added concurrency to file indexing class git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6324 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	902d16cf6c	fixes to parser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6323 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	4a1c852435	fix in usage of RAM copy for Table objects and some cosmetics in asserts. This bug affected Tables in case that a removeOne() was called and a RAM copy of the table was active. It may happen for peer owners with a lot of RAM assigned to YaCy. The bug appeared especially during crawling when the balancer tried to get new entries from the crawl queue. This bug may help to solve report at http://forum.yacy-websuche.de/viewtopic.php?p=17417#p17417 and will be tracked there git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6322 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	68465c37af	added a convenience class to add files into a YaCy index to make this possible, the yacyURL must be able to process file:// urls, which has also been implemented testing of the new class resulted in some bugfixes in other classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6313 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	27d00285aa	- added a new file reader cache that may serve as full-file-copy of blob database files. This is not yet used - removed class FileWriter and replaced all usage of that class with CachedFileWriter git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6309 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	fd6b9cb7dc	refactoring of IO access classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6308 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	604c37927f	used comparator for did-you-mean that uses index sizes for comparison, but: - limit comparison to only the first 10 elements that had been sorted before without IO - added a size cache to index computation because the size is computed at least twice in set comparator git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6306 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	573d03c7d7	added configuration to enable ram table copy git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6304 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	3be54e1891	fix to rule when to use a ram table copy git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6302 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	700218846c	disabled or removed sleep calls git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6301 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	a10a6cce45	patch for http://forum.yacy-websuche.de/viewtopic.php?p=17289#p17289 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6298 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
hermens	4b83875abd	Small fixes for the heapCacheIterator in ReferenceContainerCache: - Start the iteration at startWordHash - When used with rotation, let the iteration stop when the cache is empty git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6293 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	af3a696fc4	added a fast-fail concept in search processes. The search now has better control if all the remote searches may bring any result. If all processes are finished, then all search tasks fail fast. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6290 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	ce972ff4ef	update to default ranking profile which has now some settings to deny some phpbb3 pages which are redundant in the index when crawling phpbb3. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6288 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	3b9aaf9e9f	- inserted new library tests inside DidYouMean - some redesign of DidYouMean that was necessary to follow a special rule how a library should be used: - the library provides words that start or end with a test word which may be possibly also an empty set of words - all words that the DidYouMean produced with the four production rules are used to generate a set of library-completed words - if this process results in any words from the library, only library-generated words are taken - if there is no library-generated word at all, take the artificially generated word - all words that result from these rules are tested against the index - the result is ordered using a lightweight comparator that prefers short words - a not-so-much-io test against the index is being prepared next - inserted the library initialization into the switchboard git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6284 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	39a311d608	better care not to lose the merge/dump thread git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6278 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	10d3e856b5	better concurrency, less blocking & performance hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6277 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	1a9cfd8718	some performance hacks (CPU only, not IO) this will cause better computation speed for single- and multi-core; there are enhancements that will speed up old and slow machines as well as multi-core CPUs. Indexing of surrogates has been sped up from 4000 PPM to over 20000 PPM on a simple dual core office computer. Since the enhancements are mostly in core routines, the hack should also speed up search performance. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6276 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	92407009b2	cleanup git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6275 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	0ba1beaf56	separated rwi constraint evaluation from rwi ranking and added concurrency git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6274 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	ce7924d712	better concurrency for rwi entry parsing during search processing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6273 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	0e471ba33b	- fixed a bug in fast digest computation - added a open-on-demand hack to heap files: when a heap file is opened the first time, it is first scanned to get a key index and then it is closed again. This will free up file pointers in cases where a really large number of blob files are opened upon initialization of ArrayStack objects. This should solve also a problem reported in http://forum.yacy-websuche.de/viewtopic.php?p=17191#p17191 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6267 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
hermens	c4d0e22a77	Further speed-up of concurrent DHT-receive git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6259 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
hermens	2fbc0696bf	Fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2334 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6258 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	8e56c2ace6	fix for fixes from this afternoon git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6253 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	6354b5e447	removed possible deadlock, see http://forum.yacy-websuche.de/viewtopic.php?p=17017#p17017 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6251 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	5cc17ccf8a	a better caching with less overhead and more appropriate synchronisation use in more than 10 different data objects git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6250 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	0575f12838	fix for deadlock git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6246 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	fbfdaf063d	- patch to omit IndexOutOfBoundsException when a b64-encoded key appears not to be well-formed. In that case the key is still accepted but rated higher than other regular keys to create a virtual ordering between well-formed and ill-formed keys - check routine at the beginning of the import of table keys that checks that all imported keys are well-formed. All records that have an ill-formed key are deleted. This is a hack and is not tested since I don't have bad data here to test with. If the effect is seen in the wild, please report in the forum. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6245 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c4ae2cd03f	fixed bug that caused deletion of crawl profiles at every application startup git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6240 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	161d2fd2ef	redesign of access to the HTCache (now http.client.Cache): - better control to the cache by using combined request-header and content access methods - refactoring of many classes to comply to this new access method - make sure that the cache is always written if something was loaded - some redesign of the process how http response results are fed into the new indexing queue - introduction of a cache read policy: * never use the cache * use the cache if entry exists * use the cache if the proxy freshness rule confirms * use only the cache and go never online - added configuration options for the crawl profiles to use the new cache policies. There is not yet an input during crawl start to set the policy but this will be added in another step. - set the default policies for the existing crawl profiles. If you want them to appear in your default profiles you must delete the crawl profiles database; otherwise the policy is 'proxy freshness rule' - enhanced some cache access methods in such a way that unnecessary retrievals are omitted (i.e. for size computation). That should reduce some IO but also a lot of CPU computation because sizes were computed after decompression of content after retrieval of the content from the disc. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6239 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	51534df0cb	fix for possible synchronization problem see also: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2292&hilit=&p=16787#p16787 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6234 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago


922 Commits (f1bde59c508d4b952baca3cfe178726b0276d77c)