yacy_search_server

Commit Graph

Author	SHA1	Message	Date
orbiter	56ff9d5fd4	- extended news size from 512 to 1024 characters - a new news db will be created (news1024.db), the old one (news.db) can be deleted - peers with too large news payload are not ignored any more (they may have been invisible because they had a too large news payload!) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6917 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	1defd580bc	- added option to localization search to distinguish between a search for a location according to the search word only or for the relation between a web search results and locations found in the metadata fields - used that to display two layers on map: cities and search result locations - added many marker grafics for the display of the markers on the map - some refactoring of the yacy news code plus bugfixes for latest move from Tree to Table data structure git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6889 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	118d589eff	replaced the very very old data structure 'Records' with a simple table to fix the problem from http://forum.yacy-websuche.de/viewtopic.php?p=20066#p20066 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6876 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	362b7a929b	added extensive memory protection logic to avoid out of memory errors that may be caused by the RowCollection memory allocation function git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6521 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	4a5100789f	replaced _all_ size() == 0 with isEmpty() and all size() > 0 with !isEmpty(). The isEmpty() method is much faster in some cases, especially when used to access badly balanced hashtables where an size() operation becomes a large iteration. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6510 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	5e8038ac4d	- refactoring of blacklists - refactoring of event origin encoding git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6434 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	ce8dc575ca	refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6398 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	573d03c7d7	added configuration to enable ram table copy git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6304 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	1d8d51075c	refactoring: - removed the plasma package. The name of that package came from a very early pre-version of YaCy, even before YaCy was named AnomicHTTPProxy. The Proxy project introduced search for cache contents using class files that had been developed during the plasma project. Information from 2002 about plasma can be found here: http://web.archive.org/web/20020802110827/http://anomic.de/AnomicPlasma/index.html We stil have one class that comes mostly unchanged from the plasma project, the Condenser class. But this is now part of the document package and all other classes in the plasma package can be assigned to other packages. - cleaned up the http package: better structure of that class and clean isolation of server and client classes. The old HTCache becomes part of the client sub-package of http. - because the plasmaSwitchboard is now part of the search package all servlets had to be touched to declare a different package source. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6232 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c2359f20dd	refactoring: better abstraction of reference and metadata prototypes. This is a preparation to introduce other index tables as used now only for reverse text indexes. Next application of the reverse index is a citation index. Moved to version 0.74 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5777 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	aa44d9bad9	more refactoring of kelondro.text / deleted de.anomic.index git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5664 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
danielr	3bb870bfcd	added final where possible git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5030 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	c3d461d191	- removed superfluous copyright statement - updated my email address git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5011 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	7feae906aa	- organize imports - removed potential null pointer accesses - removed unnecessary casts git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4893 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	78087da287	- changed seed file storage to clear text - fixed kill script - fixed saving of seed file (had been corrupted by latest changes) - some refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4799 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	d2ba1fd2ab	major step forward to network switching (target is easy switch to intranet or other networks .. and back) This change is inspired by the need to see a network connected to the index it creates in a indexing team. It is not possible to divide the network and the index. Therefore all control files for the network was moved to the network within the INDEX/<network-name> subfolder. The remaining YACYDB is superfluous and can be deleted. The yacyDB and yacyNews data structures are now part of plasmaWordIndex. Therefore all methods, using static access to yacySeedDB had to be rewritten. A special problem had been all the port forwarding methods which had been tightly mixed with seed construction. It was not possible to move the port forwarding functions to the place, meaning and usage of plasmaWordIndex. Therefore the port forwarding had been deleted (I guess nobody used it and it can be simulated by methods outside of YaCy). The mySeed.txt is automatically moved to the current network position. A new effect causes that every network will create a different local seed file, which is ok, since the seed identifies the peer only against the network (it is the purpose of the seed hash to give a peer a location within the DHT). No other functional change has been made. The next steps to enable network switcing are: - shift of crawler tables from PLASMADB into the network (crawls are also network-specific) - possibly shift of plasmaWordIndex code into yacy package (index management is network-specific) - servlet to switch networks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4765 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	d4bce6affd	refactoring (initialized static fields, removed empty if/else, serialized some fields in serializable classes) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4755 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	d0b893523e	- protection against RAM overflow caused by new peer rss news - more XSS protection git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4742 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	9935e83c86	added new news window into the status page. At this moment it is just a test. The news inside the window are about peer arrivals and departures, remote search accesses and crawls git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4739 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	7f9f639d20	- refactoring and abstraction of index reference (urls) handling: blacklisting is part of reference filtering - refactoring of word/phrase handling: word abstraction from condenser becomes part of index element handling - removed unused code parts from condenser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4603 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	15397298dc	- refactoring of indexControlRWIs: moved statics to own class; better Dublin Core naming - fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=759&hilit=&p=4866#p4866 - some bugfixes in EcoTable according remove method - switched more tables to Eco: crawl Profiles, htcache, seeddb, newsdb git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4397 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	db25425893	more generics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4382 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	daf0f74361	joined anomic.net.URL, plasmaURL and url hash computation: search profiling showed, that a major amount of time is wasted by computing url hashes. The computation does an intranet-check, which needs a DNS lookup. This caused that each urlhash computation needed 100-200 milliseconds, which caused remote searches to delay at least 1 second more that necessary. The solution to this problem is to attach a URL hash to the URL data structure, because that means that the url hash value can be filled after retrieval of the URL from the database. The redesign of the url/urlhash management caused a major redesign of many parts of the software. Since some parts had been decided to be given up they had been removed during this change to avoid unnecessary maintenance of unused code. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4074 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	e7a99d765e	fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=171 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3977 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	7c5c814a47	- simplified code (removed exception handling where not necessary) - added confirmation dialog for shutdown and restart git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3962 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	a4e8ad95ab	enhancements to news and switchboard queue processing removed direct access and replaced by iteration git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3961 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	a45216b479	fix to prevent bad-formed news messages git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3960 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	4968556668	- fix for broken news queue during iteration - enhancement for searching special news (usage of new iterator) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3957 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	208b5297f1	enhanced handling of news records: result is a speedup of Surftips, Supporter, and Network page git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3954 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	87afdfc2a7	fix for long waiting time during deletion of processed news see http://forum.yacy.de/viewtopic.php?f=6&t=6 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3932 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	1782ef57e5	- added SSI parser and include directive for <!--# include virtual="<file>" --> - added chunked file transfer for non-yacy clients - SSIs are streamed using chunked transfer, partly delivered pages can be seen in browser before transmission is finished - added client-side network unit identification - cleaned up code git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3926 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	684ded0e09	added new news types git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3876 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
hydrox	4a1bc4743a	*)News-entries with blacklisted URLs are now ignored git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3849 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	40ce33e664	*) adding RSS feed for yacy news git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3496 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	589cbd8cbf	*) replacing all yacy-news-category strings with corresponding constants Note: please use these constants from now on git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3495 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	3bb3df3fc0	fix for http://www.yacy-forum.de/viewtopic.php?p=32298#32298 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3460 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	6ad39bae1e	fixed shutdown problem this fixes the 'inconsistency' messages during start-up git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3457 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	1cba31de43	redesigned ram organization for database caches - each cache can now allocate as much memory as is available - no more fixed limits - replaced old performance memory monitor by new one - added supervision methods as static functions into the classes that provide cache functionality - steering of ram allocation is done with two simple limits that are ram availability-relative git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3434 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
hydrox	cb89c74d52	) added blog-comments ) removed debug-output when deleting news git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3405 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	dc0c06e43d	PLEASE MAKE A BACK-UP OF YOUR COMPLETE DATA DIRECTORY BEFORE USING THIS redesign for better IO performance enhanced database seek-time by avoiding write operations at distant positions of a database file. until now, a USEDC counter was written at the head-section of a kelondroRecords database file (which is the basic data structure of all kelondro database files) to store the actual number of records that are contained in the database. Now, this value is computed from the database file size. This is either done only once at start-time, or continuously when run in asserts enabled. The counter is then updated only in RAM, and written at close of the file. If the close fails, the correct number can be computed from the file size, and if this is not equal to the stored number it is a strong evidence that YaCY was not shut down properly. To preserve consistency, the complete storage-routine had to be re-written. Another change enhances read of nodes in some cases, where the data-tail can be read together with the data-head. This saves another IO lookup during each DB node fetch. Includes also many small bugfixes. IF ANYTHING GOES WRONG, ALL YOUR DATA IS LOST: PLEASE MAKE A BACK-UP git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3375 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
hydrox	9184113284	*) fixed News deletion. News are now removed if they are no longer in a news-stack. This does not effect News-entires in the news-db that have no stack-entries. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3336 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	2a9d868f6d	- removed object cache from kelondroTree - generalized object caching and added new object caching class - added object caching wherever kelondroTree was used - added object caching also to usage of kelondroFlex - added object buffering (a write cache) to NURLs - added many assert statements; fixed bugs here and there - added missing close methods to latest added classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2858 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	ec031eb993	first version of surftipps see http://localhost:8080/index.html git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2627 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	3aac5b26da	- added automatic tag generation when a web page from the search results is added - added new image 'B' in front of search results for bookmark generation - added news generation when a public bookmark is added - the '+' in front of search results has new meaning: positive rating for that result - added news generation when a '+' is hit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2613 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	b4acbdaa97	*) better handling of server shutdown See: e.g. http://www.yacy-forum.de/viewtopic.php?p=25234 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2470 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	23dd972608	fixed memory calculation in performanceMemory web page fixed also maximum cache size computation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2429 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	e357599f92	* fixed problem with indexContainer iteration from RAM: indexContainers from RAM must be cloned explicitely to prevent side-effects on stored indexContainer objects in Cache * changed behaviour of urlReference deletion from indexContainers: deletion does not user retrieval of all Elements from the assortments * added textual configuration of kelondroRow and kelondroColumn definition * update of kelondroRow usage in yacyNews * modified kelondroAttrSeq to use modified kelondroColumn parser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2339 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	92f4cb4d73	added option to configure the start-up delay time for kelondro database files. the start-up delay is used to pre-load the database node cache git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2276 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	66964dc015	removed high/med/low from kelondroRecords cache control. this was done because testing showed that cache-delete operations slowed down record access most, even more that actual IO operations. Cache-delete operations appeared when entries were shifted from low-priority positions to high-priority positions. During a fill of x entries to a database, x/2 delete situation happen which caused two or more delete operations. removing the cache control means that these delete operations are not necessary any more, but it is more difficult to decide which cache elements shall be removed in case that the cache is full. There is not yet a stable solution for this case, but the advantage of a faster cache is more important that the flush problem. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2244 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	4d8f8ba384	added cache-performance analysis for node caches git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2140 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago

1 2

77 Commits (2e75879504dd010b961df4b727f97d231d40fa2c)