yacy_search_server

Commit Graph

Author	SHA1	Message	Date
orbiter	a5d481eab1	enhanced navigation - fixed too early computation of navigation - moved navigation rendering to yacysearchtrailer - added more asserts git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6006 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	1c69d9b8b6	more refactoring of the index classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5995 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	88426912ad	more refactoring to make the segment object easier to use and to be prepared to integrate author navigation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5992 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	d813fd26ed	reset sent/received counters on index delete git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5991 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	99bf0b8e41	refactoring of plasmaWordIndex: divided that class into three parts: - the peers object is now hosted by the plasmaSwitchboard - the crawler elements are now in a new class, crawler.CrawlerSwitchboard - the index elements are core of the new segment data structure, which is a bundle of different indexes for the full text and (in the future) navigation indexes and the metadata store. The new class is now in kelondro.text.Segment The refactoring is inspired by the roadmap to create index segments, the option to host different indexes on one peer. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5990 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	876746602d	catch problems of file hash computation, see also: http://forum.yacy-websuche.de/viewtopic.php?p=15245#p15245 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5989 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	fec6f9054f	some refactoring of search methods git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5988 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	3d4b826ca5	migration of all databases that use the deprecated BLOBTree format into the BLOBHeap format. Old databases are migrated automatically. This removes the last very IO-intensive data structures which were still used for Wiki, Blog and Bookmarks. Old database files will still remain in the DATA subdirectory but can be deleted manually if no major bugs appear during migration. There is no need for any user action, all migration is done automatically. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5986 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	26a46b5521	increased default maximum file size for database files to 2GB Other file sizes can now be configured with the attributes filesize.max.win and filesize.max.other the default maximum file size for non-windows OS is now 32GB git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5974 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	addecdb18c	simplified code, removed one unused method in all implementing classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5972 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
borg-0300	47fce9020c	small change (Orbiter's wish) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5971 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
borg-0300	e07b14e5d7	finally a working fix for 5960 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5970 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
borg-0300	3ebb904d2c	fix for 5960, http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2119 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5969 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e005cfea37	fix for bug in -incell option of URLAnalysis git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5967 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	a7e392f31b	The collection index will not be supported any more. Existing indexes based on the old index collections must be migrated with YaCy 0.8 - removed index collection classes and all migration tools - added a 'incell' reference collection feature in URL analysis git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5966 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	a2f48863fc	- added prototype for navigation index - refactoring of word index prototype (no functional changes so far) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5965 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	b7457d3807	patch for http://forum.yacy-websuche.de/viewtopic.php?p=14720#p14720 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5960 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	f133d6065c	fix for http://forum.yacy-websuche.de/viewtopic.php?p=14955#p14955 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5958 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	ad9762746d	no exception in case of uniq() time-out, see also http://forum.yacy-websuche.de/viewtopic.php?p=13177#p13177 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5955 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	f2e4d156e8	removed debug messages git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5950 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c1e5fad9a7	fix for http://forum.yacy-websuche.de/viewtopic.php?p=14767#p14767 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5944 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	8ee3a94e82	fix for non-caching of sitehash, see http://forum.yacy-websuche.de/viewtopic.php?p=14440#p14440 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5942 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
borg-0300	21930d05ed	fix for [B@... git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5941 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	b6ba387e01	fix for http://forum.yacy-websuche.de/viewtopic.php?p=14751#p14751 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5940 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	d164b42604	*) cosmetics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5934 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	5fb77116c6	added a submenu to index administration to import a wikimedia dump (i.e. a dump from wikipedia) into the YaCy index: see http://localhost:8080/IndexImportWikimedia_p.html git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5930 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
hermens	df733af4fa	Try not to lose content from ram during IndexCell.delete by moving ram.delete after the dangerous operations on the array (array.get and array.delete) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5929 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
hermens	ac72005f2f	Let IndexCell.remove remove entries from the ram portion of the DB as well. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5928 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	8ba7ff5353	a fix and another speed enhancement for the RWI cache git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5927 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	05f077e85f	added stack trace output to solve problem in http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2076&hilit=&p=14612#p14612 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5926 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	71a4cadf31	better and more performant synchronization in SimpleARC, the caching object for word hashes. Speeds up indexing. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5925 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e6773cbb33	better handling of RWI cache for concurrency and less overhead when writing new entries -> even more indexing speed git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5924 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c097531e3d	added a catch Exception to all threads to check if any of them silently dies without any other notification git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5922 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	083533e5ec	fix for bugs in IODispatcher git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5921 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	21fbca0410	better scaling of HEAP dump writer for small memory configurations; should prevent OOMs during cache dumps git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5920 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	6e0b57284d	better care for states of the IODispatcher git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5919 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	1db9cdd4e4	fixed bug in writing of robots.txt entries in case that host names exceeded 64 characters and some other problems git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5918 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	057ce14c8e	more fixes (character encoding, parser exceptions, http client failure, blob writing) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5914 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d2ac0aa682	- fixed possible bugs in Stack (may affect Crawler reset) and RandomAccess handling - increased default memory size to 180MB - fixed possible bug in http client reset (there was a deadlock) - bug in BLOBHeap marked, but not solved, cause is still unknown. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5912 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	8d6212233b	fix for IODispatcher git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5896 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	07f09742bb	set of small fixes and comments git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5893 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	9e4db75aac	reduced internal logging and reduced memory that internal logging can use git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5867 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c10c257255	attempt to fix a deadlock situation where the IODispatcher did not work. I suspect the dispatcher thread has crashed and queues filled so no indexing process was able to write data. This fix tries to heal the problem, but I am unsure if it helps. To get a better view of the problem, some more log output has been inserted. Also added a new attribute indexer.threads to control the number of default threads for the indexer (default is 1) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5866 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	fe51f4d668	less synchronization may help to prevent deadlocks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5863 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	138422990a	- removed useCell option: the indexCell data structure is now the default index structure; old collection data is still migrated - added some debugging output to balancer to find a bug - removed unused classes for index collection handling - changed some default values for the process handling: more memory needed to prevent OOM git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5856 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	16baa7ad24	To translate a mediawiki dump into the YaCy surrogate format do the following: - download a wikipedia dump, e.g. dewiki-20090311-pages-articles.xml.bz2 from http://download.wikimedia.org/dewiki/20090311/ - move dewiki-20090311-pages-articles.xml.bz2 to DATA/HTCACHE/ - start the conversion; open a command shell, move to the yacy home directory and execute java -Xmx2000m -cp classes:lib/bzip2.jar de.anomic.tools.mediawikiIndex -convert DATA/HTCACHE/dewiki-20090311-pages-articles.xml.bz2 DATA/SURROGATES/in/ http://de.wikipedia.org/wiki/ this generates a series of files in DATA/SURROGATES/in if YaCy is running (it may run concurrently), it fetches all new dumps in the surrogate-in directory. The export process is transaction-safe, that means YaCy will not start reading a dump while the dump is not completely finished. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5851 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	5195c94838	two patches for performance enhancements of the index handover process from documents to the index cache: - one word prototype is generated for each document, that is re-used when a specific word is stored. - the index cache now uses ByteArray objects to reference the RWI instead of byte[]. This enhances access to the map that stores the cache. To dump the cache to the FS, the content must be sorted, but sorting takes less time than maintenance of a sorted map during caching. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5849 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	9416f5c26f	more speed test cases: kelondro provides map functions that are more than 20% faster than standard java classes and use less than half of the memory of java classes: just start IndexTest (here with 1000000 test objects) Performance test: comparing HashMap, TreeMap and kelondroRow generated 1000000 test data entries STANDARD JAVA CLASS MAPS sorted map time for TreeMap<byte[]> generation: 2110 time for TreeMap<byte[]> test: 2516, 0 bugs memory for TreeMap<byte[]>: 29 MB unsorted map time for HashMap<String> generation: 1157 time for HashMap<String> test: 1516, 0 bugs memory for HashMap<String>: 61 MB KELONDRO-ENHANCED MAPS sorted map time for kelondroMap<byte[]> generation: 1781 time for kelondroMap<byte[]> test: 2452, 0 bugs memory for kelondroMap<byte[]>: 15 MB unsorted map time for HashMap<ByteArray> generation: 828 time for HashMap<ByteArray> test: 953, 0 bugs memory for HashMap<ByteArray>: 9 MB git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5847 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	b53790abb1	more performance hacks: 10% more speed for Base64.compare() which is really often used in YaCy code git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5846 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	8ffb9889e1	some fixes and performance hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5845 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago

827 Commits (55ff919b5d4bfe29846ad1058a6d609e40a8ebc7)
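
Commits 5195c94838 and 9416f5c26f above refer to keying a HashMap by a ByteArray object rather than a raw byte[]. As general Java background (not taken from the YaCy sources): a plain byte[] inherits identity-based equals() and hashCode() from Object, so two arrays with the same content are different HashMap keys. The sketch below illustrates the idea with a hypothetical wrapper class named ByteKey; it is an assumption made for illustration and is not YaCy's actual ByteArray implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper for illustration only; NOT YaCy's ByteArray class.
final class ByteKey {
    private final byte[] bytes;
    private final int hash;

    ByteKey(byte[] bytes) {
        this.bytes = bytes.clone();               // defensive copy keeps the key immutable
        this.hash = Arrays.hashCode(this.bytes);  // content-based hash, computed once
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ByteKey
            && Arrays.equals(bytes, ((ByteKey) other).bytes);
    }

    @Override
    public int hashCode() {
        return hash;
    }
}

public class ByteKeyDemo {
    public static void main(String[] args) {
        // Raw arrays compare by identity, so an equal-content array misses.
        Map<byte[], String> raw = new HashMap<>();
        raw.put(new byte[] {1, 2, 3}, "entry");
        System.out.println(raw.get(new byte[] {1, 2, 3}));  // prints "null"

        // The wrapper compares by content, so the lookup succeeds.
        Map<ByteKey, String> wrapped = new HashMap<>();
        wrapped.put(new ByteKey(new byte[] {1, 2, 3}), "entry");
        System.out.println(wrapped.get(new ByteKey(new byte[] {1, 2, 3})));  // prints "entry"
    }
}
```

Caching the hash in the constructor is one possible design choice for keys that are looked up often; whether YaCy's class does the same is not stated in the commit messages.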