yacy_search_server

Commit Graph

Author	SHA1	Message	Date
orbiter	275a226cc5	refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4524 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	2327451653	- changed order of database initialisation (index first) - removed mainly unused init-time for databases (was only used for tree tables, which are not used any more) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4496 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	6dc679785f	- fixed bad sort behavior of kelondroRowSet, in this case: no sort at all! see http://forum.yacy-websuche.de/viewtopic.php?p=4841#p4841 - some memory calculation enhancements in kelondroFlex and a little bit more logging git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4378 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	2f3b2f3481	- extended dbtest for comparison tests - added initial space option for eco tables - used initial space value in initialization of collectionIndex, this should avoid OOM failures /Volumes/Magneto/dev/workspace/trunk/source/dbtest.java /Volumes/Magneto/dev/workspace/trunk/source/de/anomic/kelondro/kelondroCollectionIndex.java /Volumes/Magneto/dev/workspace/trunk/source/de/anomic/kelondro/kelondroDyn.java /Volumes/Magneto/dev/workspace/trunk/source/de/anomic/kelondro/kelondroEcoTable.java /Volumes/Magneto/dev/workspace/trunk/source/de/anomic/kelondro/kelondroRow.java /Volumes/Magneto/dev/workspace/trunk/source/de/anomic/kelondro/kelondroSplitTable.java /Volumes/Magneto/dev/workspace/trunk/source/de/anomic/plasma/plasmaCrawlBalancer.java /Volumes/Magneto/dev/workspace/trunk/source/de/anomic/plasma/plasmaCrawlStacker.java /Volumes/Magneto/dev/workspace/trunk/source/de/anomic/plasma/plasmaCrawlZURL.java - added index consistency check (checks for double-occurrences of primary keys in file) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4349 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	9abc927645	to fix inconsistencies in collection index, a double reference reporting mechanism has been implemented git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4347 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	dc26d6262b	- removed write buffer from kelondroCache (was never used because buggy; will now be replaced by new EcoBuffer) - added new data structure 'eco' for an index file that should use only 50% of write-IO compared to kelondroFlex. The new eco index is not used yet, but already successfully tested with the collectionIndex. The main purpose is to replace the kelondroFlex at every point when enough RAM is available. Otherwise, the kelondroFlex stays as an option in case of low memory (which then can even use a file-index) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4337 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	b3636f5ba8	re-implemented file index in kelondroFlex git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4323 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	a5054c038d	- added large number of generics - redesign of ordering structures in kelondro (old did not work with strict generics) - 50% IO reduction during read access on kelondroFlex (omitting the read on the index table) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4320 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	71bcf02d3a	- removed pro-version (is the same as standard version, use the standard instead) - changed yacy logo - removed crawlOrder protocol (unused) - removed file index in kelondroFlex (will not work, it takes too long to maintain) - fixed remote crawl for clusters (now denies remote crawls from peers outside cluster) - 0.562 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4317 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	016fc594af	more generics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4311 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	3e3d2e39a4	- some refactoring and redesign of kelondroBytesIntMap (created new class kelondroRAMIndex) - more generics - preparation to extend the balancer for flexible forced delay times - set different random-access type, should now omit update of metadata in file and could be a bit faster (let's see) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4309 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	03e7782269	more generics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4305 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	4dc438f7e7	moved to Java 1.5: - changed build script to use java 1.5 compiler - first step to resolve missing generics definitions (about 400 from over 4100 'missing'-warnings) - added key-iterator to kelondro databases (for rapid from-memory enumerations, will be used for domain name collection, not used yet) please set your development environment to use java 1.5! git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4292 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	48138952ff	added memory measurement for index recreation to avoid OOM during index RAM space extension git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4267 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	ecba35de72	enhanced computing speed of kelondro core function: sorting. The enhancement was made by using better organized data structures and multi-threading during the sort. A sort can be divided into two separate processes when the first partition of the quicksort algorithm is done. Generating a separate thread and starting the thread takes only 10 milliseconds, so using a separate thread only makes sense if the data amount is large. Statistics about the speed-up: without enhancement: 250 milliseconds for 100000 entries; with data structure enhancement: 170 milliseconds for 100000 entries; with additional second thread (if second processor is present): 130 milliseconds. For dual-processor systems, this means about 100% speed-up. A test can be made with the following command: java -classpath classes de.anomic.kelondro.kelondroRowCollection git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4198 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	6eaa5a0e64	enhanced local search speed. The ranking process is now 6 times faster than before. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4197 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	7d57b80598	distinct keepOrder strategy, more discrete implementation of enhancement introduced in SVN 4158 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4176 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	4779f314fe	first version of next-generation search interface: - snippets are not fetched by browser using ajax, they are now fetched internally - YaCy-internal threads control existence of snippets and sort out bad results - search results are prepared using SSI includes - the search result page is visible right after the search request, the results drop in when they are detected - no more time-out strategy during search processes, results are shifted within queues when they arrive from remote peers - added result page switching! after the first 10 results, the next page can be retrieved - number of remote results is updated online on the result page as they drop in - removed old snippet servlet (which had been also a security leak btw) - media search is broken now, will be redesigned and fixed in another step git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4071 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	9628db6cdc	enhanced memory allocation during database access: - refactoring of kelondroRecords; this class is now divided into kelondroAbstractRecords, kelondroRecords, kelondroCachedRecords, kelondroHandle and kelondroNode - better abstraction of kelondroNodes, such nodes may now be created by different classes - a new Node defining class kelondroEcoRecords defines Nodes that do not need so much allocation and System.arraycopy - there is less memory transfer on the bus, especially for collection index - now half of memory needed for web index access git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4024 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	1782ef57e5	- added SSI parser and include directive for <!--# include virtual="<file>" --> - added chunked file transfer for non-yacy clients - SSIs are streamed using chunked transfer, partly delivered pages can be seen in browser before transmission is finished - added client-side network unit identification - cleaned up code git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3926 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	5dd9acc2a7	removed calls to deprecated methods git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3865 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	5551ff5306	enhanced index storage data structure kelondroBytesIntMap: this now stores two index structures, one for data that is acquired during start-up and one for data that is acquired during run-time. This reduces the grow factor, and should reduce the memory amount in case an index reorganisation happens. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3733 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	872eb46cb9	some redesign of the handling of the index for kelondroFlexTable git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3732 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	2f3b518169	temporary patch for startup-problem: http://www.yacy-forum.de/viewtopic.php?t=3854 This is a serious problem that is caused by the database bug between 0.511 - 0.513 which produced a large number of double-entries in the RWI index. The uniq()-method tries to fix this, and it does not terminate when the index is large and the number of double-occurrences is also large. This patch does simply implement a time-controlled termination, which does not heal the inconsistency problem. The uniq-method itself is correct and does not need a bugfix, the non-termination is simply caused by the large number of data that is shifted during the process. It was possible to reproduce this behaviour in a test environment. A real fix would need to: - enhance the uniq()-method by using a recursive, binary segmentation of the array to be fixed - uniq() must report the entries that are double - the double-entries must be deleted from the collection index (from the index and the collections) to heal the problem git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3583 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	ba525ebf52	- re-enabled path optimization that was disabled during testing - re-implemented index load/extend optimization that was removed from kelondroFlexTable, this is now part of kelondroIntBytesIndex git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3580 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	595ee10468	fixed database inconsistency bugs; inserted many debug lines; added a huge number of asserts; extended database test methods git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3579 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	7a7a1c7c29	fight against problems with remove-methods and synchronization - some bugs may have been fixed with wrong removal operations - removed temporary storage of remove-positions and replaced by direct deletions - changed synchronization - added many asserts - modified dbtest to also test remove during threaded stresstest git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3576 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	25070822a5	fix for http://www.yacy-forum.de/viewtopic.php?p=33925#33925 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3551 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	159bd0cab5	miscellaneous; b.o. fix for http://www.yacy-forum.de/viewtopic.php?p=33914#33914 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3549 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	40c14a4f0e	- better implementation of search query properties - basic protection against start-up problems when database files are corrupted - auto-delete of not-critical databases during startup when load error occurs - on-the-fly reset option for all database tables - automatic on-the-fly reset for seed tables during enumeration exceptions git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3547 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	fcdf000fbc	bugfix for http://www.yacy-forum.de/viewtopic.php?p=33838#33838 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3543 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	ba2c307ab3	optimized memory allocation in kelondroRow.Entry such an entry cannot be instantiated without allocation of new byte[]; instead it can re-use memory from other kelondroRow.Entry objects. during bugfixing also other bugs may have been solved, maybe the INCONSISTENCY problem could have been solved. One cause can be missing synchronization during bulk storage when a R/W-path optimization is done. To test this case, the optimization is currently switched off. More memory enhancements can be done after this initial change to the allocation scheme. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3536 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	96b79bf86d	redesigned remove method in kelondroRowSet. This should also fix numerous bugs like http://www.yacy-forum.de/viewtopic.php?p=31077#31077 (java.lang.ArrayIndexOutOfBoundsException in kelondroRowCollection.removeShift) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3476 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	6ad39bae1e	fixed shutdown problem; this fixes the 'inconsistency' messages during start-up git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3457 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	38b93f8cb8	bugfix for my last commit: iterator did not consider secondary start point in case of rotation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3456 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	d755a8026d	- better OOM protection - better memory allocation for FlexTable indexes - splitting between static index and dynamic index (only the dynamic part must grow) - to enable a merge-iteration of the new split index, a huge number of classes needed to be adapted for new iterator classes - added new iterator classes that support cloneable iterators - adapted all iterator classes to implement cloneable iterators git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3453 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	23338d2070	small fix for RAM computation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3447 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	1cba31de43	redesigned ram organization for database caches - each cache can now allocate as much memory as is available - no more fixed limits - replaced old performance memory monitor by new one - added supervision methods as static functions into the classes that provide cache functionality - steering of ram allocation is done with two simple limits that are ram availability-relative git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3434 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	51e12049fa	third generation of R/W head path optimization - data from collection arrays are read in order - merged data is written in order git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3419 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	10a3c20b8d	some more enhancements to R/W Head path optimization git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3415 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	f4cfd19835	second generation of collection R/W head path optimization: - permanent cache flush is switched off. The optimized cache flush works better if it is a large number of collections that is flushed together - the flush size can be configured instead of the flush divisor. There is only one size for all flushes - collection records that shall be removed during collection transition (jump from one collection file to another) are now not really removed but only marked in RAM. add-operations to the collection use these marked collection spaces - index bulk write operations are now separated for each file of a kelondroFlex git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3414 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	1fda50fd3c	correct R/W head positioning in kelondroFlex and some enhancements git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3409 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	8668ac5d91	preparations for collection index cache flush optimization (hand-over commit, no functional change to current code) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3399 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	dc0c06e43d	PLEASE MAKE A BACK-UP OF YOUR COMPLETE DATA DIRECTORY BEFORE USING THIS. Redesign for better IO performance: enhanced database seek-time by avoiding write operations at distant positions of a database file. Until now, a USEDC counter was written at the head-section of a kelondroRecords database file (which is the basic data structure of all kelondro database files) to store the actual number of records that are contained in the database. Now, this value is computed from the database file size. This is either done only once at start-time, or continuously when run with asserts enabled. The counter is then updated only in RAM, and written at close of the file. If the close fails, the correct number can be computed from the file size, and if this is not equal to the stored number it is strong evidence that YaCy was not shut down properly. To preserve consistency, the complete storage-routine had to be re-written. Another change enhances read of nodes in some cases, where the data-tail can be read together with the data-head. This saves another IO lookup during each DB node fetch. Includes also many small bugfixes. IF ANYTHING GOES WRONG, ALL YOUR DATA IS LOST: PLEASE MAKE A BACK-UP git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3375 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	daf2e15f59	some storage process enhancements (write without preceding read) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3348 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	d03cd41266	fix for http://www.yacy-forum.de/viewtopic.php?t=3411 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3331 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	773ba1e91a	- generalized object order handling - controlled object order for all database tables - migrated DHT position computation to correct base64-decoded values; this also closed the 'gaps' in the dht positions git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3049 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	bdc9216366	- more asserts - some bugfixes - some patches for bugs that are already in the database git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2935 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	114a76a86e	- added flag to urlhash that shows that domain is a local domain - enhanced local domain detection - bugfixing for memory assignment in kelondroFlexSplit - automatic memory assignment to caches according to available RAM - bugfixes for details during search process git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2924 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	eafb5ecd22	- better usage of memory resources for kelondroFlexSplit - kelondroFlexTables always loads a RAM cache if it has enough RAM assigned. Otherwise it creates a kelondroTree file-index. If more memory is re-assigned, the file-index is deleted again, and RAM is used. Beware that assigning too little RAM forces creation of file indexes, and start-up time may last for hours. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2923 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago

86 Commits (9d0af17c5b4b1f280e6563f03ec72b550e3c5ee3)