yacy_search_server

Commit Graph

Author	SHA1	Message	Date
orbiter	398e210fef	removed synchronization in logging that causes deadlocks in high-performance environments git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6044 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	db3a06dd81	removed cookie handling in httpc: - no need to do cookie handling in proxy, this was switched off so far - no need for cookies in crawler, this was switched on (by mistake) This fix was needed for a case where a web server flooded the crawler with cookies and caused a complete blocking of the httpc. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6043 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	1c54ae4a63	some small changes in HandleMap Testing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6042 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	2c5554c912	small enhancements in search result computation speed git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6039 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e0b3984805	added navigation keys for site and author facets to remote search interface git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6038 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	27fa6a66ad	- completed the author navigation - removed some unused variables git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6037 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	a9a8b8d161	- added display of author navigation (usage of that navigator not yet implemented) - added a synchronization in pdf parser which should help to avoid deadlocks that occur when displaying several search results pointing to pdf sources - fixed smaller bugs in navigation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6036 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c879783008	added steering of navigator computation: - by default the navigator computation is off for servlet yacysearch.html, but: - the servlet is called by default with an option to switch navigator results on. This prevents metasearch users from getting slow results that are caused by unnecessary computations git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6035 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c079b18ee7	- refactoring of IntegerHandleIndex and LongHandleIndex: both classes had been merged into the new HandleMap class, which handles (key<byte[]>,n-byte-long) pairs with arbitrary key and value length. This will be useful to get a memory-enhanced/minimized database table indexing. - added an analysis method that counts bytes that could be saved in case the new HandleMap can be applied in the most efficient way. Look for the log messages beginning with "HeapReader saturation": in most cases we could save about 30% RAM! - removed the old FlexTable database structure. It was not used any more. - removed memory statistics in PerformanceMemory about flex tables and node caches (node caches were used by Tree Tables, which are also not used any more) - add a stub for a steering of navigation functions. That should help to switch off navigation computation in cases where it is not demanded by a client git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6034 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	bead0006da	replaced tmp file extensions by prt git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6033 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	3189f9cd39	fixed problem with DCEntry initialization git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6032 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	a704d82280	patch for problem with digest git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6031 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	3029ef6eb3	fixed a bug that was recently inserted which caused that no idx and gap files were written. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6030 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	b6e274f211	omit most of forced crawl delays by using a separate delay table which flushes delayed URLs at the correct time git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6029 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d50be59088	- added an automatic re-construction of the domain stack after 10 minutes. this includes then urls to the domain stack that were left over in case of stack size limitations when the domain stack was created the last time - changed the busy sleep time for the crawl thread to 30 milliseconds. This is sufficient to crawl with 2000 PPM. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6028 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	5fdba0fa51	- fixed a not working selection rule in balancer - more security about crawl-delay, be more fail-safe - better logging in case of long forced crawl-delays git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6027 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	f5602404d5	another speed boost for the balancer git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6026 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	95e8cbd1c3	new fully redesigned balancer and bugfixes regarding lost profile handles and killed crawls git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6025 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c062385552	fix for http://forum.yacy-websuche.de/viewtopic.php?p=15555#p15555 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6024 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	42ae40b9f6	some bugfixes to database close() methods git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6023 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	a0c53abbe1	- wait until local results are computed during search, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2167&hilit=&p=15521#p15521 - show only x+1 pages in page navigator git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6022 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	9bfd22f65d	fix for http://forum.yacy-websuche.de/viewtopic.php?p=15523#p15523 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6020 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	1c77db670f	re-designed response format for navigation: - changed json and rss response templates git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6019 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	15fad767c0	some refactoring of topic generation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6018 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	cc49aedf12	- fixed problem with remote search NPE - more abstraction for search requests git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6015 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	9e18abc2ac	* fix charset detection, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2137 * why has this been uncommented??? git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6014 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c38c852090	modified access method to get index entries out of an array of BLOBs: iterate them, then merge; not collect them and merge then. This should use less memory and may behave better in an environment with many queries. To ensure that too many queries will not cause total blocking, a time-out of one second was also added. After the time-out the index data that was collected so far is returned. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6013 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	ab06a6edd2	renamed topwords to topics and enhanced computation methods of topics topics will now only be computed using the document title, not the document url, because the host navigator is now responsible for statistical effects of urls. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6011 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	a5d481eab1	enhanced navigation - fixed too early computation of navigation - moved navigation rendering to yacysearchtrailer - added more asserts git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6006 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	7639ec2f38	- fixed letter case bug for dc record creation - dc parser is now lazy against letter cases git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5998 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	4522c13ee7	added option for a table prefix when importing phpbb3 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5996 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	1c69d9b8b6	more refactoring of the index classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5995 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	3d5f2ff544	- added new servlets to support search portal administrators for the integration of yacy search fields in their web pages - moved some servlets from here to there.. - changed menu structure - removed yacyui-portaltest.html which contained an example for the live search which is now integrated on all pages in yacy. The code snippet example from that page is integrated into the ConfigLiveSearch.html servlet git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5994 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	4d4315687f	fix for problem with concurrency in host navigator, bug reported by wsb git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5993 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	88426912ad	more refactoring to make the segment object easier to use and to be prepared to integrate author navigation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5992 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	d813fd26ed	reset sent/received counters on index delete git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5991 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	99bf0b8e41	refactoring of plasmaWordIndex: divided that class into three parts: - the peers object is now hosted by the plasmaSwitchboard - the crawler elements are now in a new class, crawler.CrawlerSwitchboard - the index elements are core of the new segment data structure, which is a bundle of different indexes for the full text and (in the future) navigation indexes and the metadata store. The new class is now in kelondro.text.Segment The refactoring is inspired by the roadmap to create index segments, the option to host different indexes on one peer. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5990 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	876746602d	catch problems of file hash computation, see also: http://forum.yacy-websuche.de/viewtopic.php?p=15245#p15245 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5989 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	fec6f9054f	some refactoring of search methods git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5988 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	3d4b826ca5	migration of all databases that use the deprecated BLOBTree format into the BLOBHeap format. Old databases are migrated automatically. This removes the last very IO-intensive data structures which were still used for Wiki, Blog and Bookmarks. Old database files will still remain in the DATA subdirectory but can be deleted manually if no major bugs appear during migration. There is no need for any user action, all migration is done automatically. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5986 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	4b4bddca00	added new submenu to crawler menu: import of phpbb3 forum postings from mysql - yacy can import phpbb3 posts without crawling - all data is written as surrogate - indexed surrogate files can be re-used git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5985 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d8284046b0	enhanced speed of site navigation computation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5980 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c72a5cf326	added stub for PHPBB3 extraction code using direct access to mySQL git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5979 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e735d3a69f	fix for http://forum.yacy-websuche.de/viewtopic.php?p=15175#p15175 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5978 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	63a0255166	- refactoring: added new content package, which will contain connector classes for different types of data sources to import texts into the YaCy index - refactoring: migrated data objects for the new connector classes - added a DAO interface class to specify an abstract interface for database retrieval connector methods git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5977 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	f246928c20	first attempt to add 'real' Navigation to yacy search results: host navigation - after a search is started, it is analysed how many hits are in each site - this can be done really efficient, because the navigation information is hidden in the url hash and can be computed very fast - the search result shows a column on the right with the hosts and the hits per host - after a click on a host the search is modified using the efficient site: - operator git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5976 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	54b9e99c01	- more information about peer tags - peer tag is by default '*' git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5975 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	26a46b5521	increased default maximum file size for database files to 2GB Other file sizes can now be configured with the attributes filesize.max.win and filesize.max.other the default maximum file size for non-windows OS is now 32GB git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5974 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	addecdb18c	simplified code, removed one unused method in all implementing classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5972 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
borg-0300	47fce9020c	small change (Orbiter's wish) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5971 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
borg-0300	e07b14e5d7	finally a working fix for 5960 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5970 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
borg-0300	3ebb904d2c	fix for 5960, http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2119 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5969 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	734680dc70	initialize the ResourceObsever in own thread git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5968 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e005cfea37	fix for bug in -incell option of URLAnalysis git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5967 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	a7e392f31b	The collection index will not be supported any more. Existing indexes based on the old index collections must be migrated with YaCy 0.8 - removed index collection classes and all migration tools - added a 'incell' reference collection feature in URL analysis git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5966 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	a2f48863fc	- added prototype for navigation index - refactoring of word index prototype (no functional changes so far) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5965 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	47fd226bdb	proper parsing of sentences does not affect tokens/words git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5964 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	27eb8d62cb	- new development cycle - removed temporary configuration with safe setting for indexer threads (=1) and replaced it with best value computed during performance tests (1/2 of number of processors) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5963 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	b7457d3807	patch for http://forum.yacy-websuche.de/viewtopic.php?p=14720#p14720 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5960 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	bffbe43e09	fix for http://forum.yacy-websuche.de/viewtopic.php?p=14522#p14522 fix for http://forum.yacy-websuche.de/viewtopic.php?p=14955#p14955 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5959 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	f133d6065c	fix for http://forum.yacy-websuche.de/viewtopic.php?p=14955#p14955 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5958 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	82af994041	added missing loglevel git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5956 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	ad9762746d	no exception in case of uniq() time-out, see also http://forum.yacy-websuche.de/viewtopic.php?p=13177#p13177 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5955 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	1efe686e3f	fix for http://forum.yacy-websuche.de/viewtopic.php?p=13960#p13960 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5954 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	13fb84ab81	you can define your default number of search results displayed by search.items this applies only to requests through the classic-style page git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5953 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	f2e4d156e8	removed debug messages git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5950 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	709bfc2cd4	added a memory check in http post protocol git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5949 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c01d6f43e1	- fixed problem with thread dump if no arguments are given - rejecting peers that are older than 6 hours (not-seen during 6 hours) - 0.78, targeting 0.8 at the end of the week git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5948 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	a49edd9415	fix for bug in search with site: constraint git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5947 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c1e5fad9a7	fix for http://forum.yacy-websuche.de/viewtopic.php?p=14767#p14767 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5944 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	8ee3a94e82	fix for non-caching of sitehash, see http://forum.yacy-websuche.de/viewtopic.php?p=14440#p14440 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5942 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
borg-0300	21930d05ed	fix for [B@... git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5941 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	b6ba387e01	fix for http://forum.yacy-websuche.de/viewtopic.php?p=14751#p14751 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5940 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	4338dcf936	fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2093&hilit= git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5937 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	bad7ce9286	experimental option trayIcon.force for unsupported platforms. java 1.6 needed git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5936 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	ea27853c59	*) some refactoring *) added one assertion *) no functional changes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5935 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	d164b42604	*) cosmetics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5934 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	17150b2950	fixed bug in snippet computation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5932 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	89aeb318d3	enhanced the wikimedia dump import process enhanced the wiki parser and condenser speed git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5931 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	5fb77116c6	added a submenu to index administration to import a wikimedia dump (i.e. a dump from wikipedia) into the YaCy index: see http://localhost:8080/IndexImportWikimedia_p.html git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5930 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
hermens	df733af4fa	Try not to lose content from ram during IndexCell.delete by moving ram.delete after the dangerous operations on the array (array.get and array.delete) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5929 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
hermens	ac72005f2f	Let IndexCell.remove remove entries from the ram portion of the DB as well. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5928 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	8ba7ff5353	a fix and another speed enhancement for the RWI cache git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5927 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	05f077e85f	added stack trace output to solve problem in http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2076&hilit=&p=14612#p14612 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5926 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	71a4cadf31	better and more performant synchronization in SimpleARC, the caching object for word hashes. Speeds up indexing. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5925 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e6773cbb33	better handling of RWI cache for concurrency and less overhead when writing new entries -> even more indexing speed git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5924 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c097531e3d	added a catch Exception to all thread to check if any of them silently dies without any other notification git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5922 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	083533e5ec	fix for bugs in IODispatcher git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5921 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	21fbca0410	better scaling of HEAP dump writer for small memory configurations; should prevent OOMs during cache dumps git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5920 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	6e0b57284d	better care for states of the IODispatcher git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5919 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	1db9cdd4e4	fixed bug in writing of robots.txt entries in case that host names exceeded 64 characters and some other problems git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5918 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	bde88b684a	* split off yacyRelease from yacyVersion * added some gui infos about signatures git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5916 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	057ce14c8e	more fixes (character encoding, parser exceptions, http client failure, blob writing) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5914 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d2ac0aa682	- fixed possible bugs in Stack (may affect Crawler reset) and RandomAccess handling - increased default memory size to 180MB - fixed possible bug in http client reset (there was a deadlock) - bug in BLOBHeap marked, but not solved, cause is still unknown. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5912 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	1351d903a1	don't follow links like mailto: git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5909 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e88a66bcae	temporary disabling computation of all sublinks (check needed) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5908 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	ff5f82d780	*) removed description of removed commands from wikiHelp ([= =]) *) used format function of Netbeans for wikiCode to make it more readable, no functional changes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5907 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	eacf95213a	fix for crawling of mailto-links git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5906 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	9c6ac43f66	fixes for wiki parser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5905 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	3a64c9d02f	- fix for problem with concurrency when computing word hashes - fix for search in case that a urlfilter was used and zero results were returned git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5904 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d3f8aa5a2a	set of small fixes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5903 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	78ffb61297	*) got rid of unnecessary variable which might also fix IndexOutOfBoundsException git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5902 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d31e6f9c14	fix for http://forum.yacy-websuche.de/viewtopic.php?p=14457#p14457 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5899 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	8d6212233b	fix for IODispatcher git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5896 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	f678472f46	fix for quote problem in json output git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5895 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d079d6dfdb	small changes in surrogate reader, wiki code and portal test git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5894 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	07f09742bb	set of small fixes and comments git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5893 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
borg-0300	06ed4ef7b3	* better picture handling git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5891 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	5a634cab23	removed generation of anchor link sets in document types that describe container formats. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5890 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	f1244264b8	*) hopefully fixed bug reported in http://forum.yacy-websuche.de/viewtopic.php?t=2057 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5882 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	2e3186189b	fix for mediawikiIndex surrogate producer + added concurrency git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5880 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
apfelmaennchen	6f5ea7b1a8	small fix for previous post git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5879 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
apfelmaennchen	138a0747e3	added serverObjects.putJSON as JSON has very particular encoding requirements git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5877 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d977dd9a96	fix for surrogate loader git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5870 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	9cb68353da	fix for bug in ProfilingGraph for ppm >> 10000 ppm (!) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5868 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	9e4db75aac	reduced internal logging and reduced memory that internal logging can use git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5867 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c10c257255	attempt to fix a deadlock situation where the IODispatcher did not work. I suspect the dispatcher thread has crashed and queues filled so no indexing process was able to write data. This fix tries to heal the problem, but I am unsure if it helps. To get a better view of the problem, some more log outputs had been inserted. Added also a new attribute indexer.threads to get a control over the number of default threads for the indexer (default is 1) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5866 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	09987e93fd	fixed some more bad handling of byte[] git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5865 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	1bcc1450cb	more explaining error message in case of IOExceptions during html parsing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5864 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	fe51f4d668	less synchronization may help to prevent deadlocks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5863 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	58802e4201	added missing success test in storeDocumentIndex, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1922&hilit= git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5862 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	171e62bee5	addition to the fix from last commit (which did not work) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5860 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	059949a0d1	tried to fix problem with snippet fetch for second search page when verify=false git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5859 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	b08991e278	moved some constants, rename of Tray class git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5858 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	138422990a	- removed useCell option: the indexCell data structure is now the default index structure; old collection data is still migrated - added some debugging output to balancer to find a bug - removed unused classes for index collection handling - changed some default values for the process handling: more memory needed to prevent OOM git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5856 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	1b9e532c87	some concurrency for wikipedia dump reader git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5855 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	25d2160288	small fix git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5853 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	16baa7ad24	To translate a mediawiki dump into the YaCy surrogate format do the following: - download a wikipedia dump, i.e. dewiki-20090311-pages-articles.xml.bz2 from http://download.wikimedia.org/dewiki/20090311/ - move dewiki-20090311-pages-articles.xml.bz2 to DATA/HTCACHE/ - start the conversion; open a command shell, move to the yacy home directory and execute java -Xmx2000m -cp classes:lib/bzip2.jar de.anomic.tools.mediawikiIndex -convert DATA/HTCACHE/dewiki-20090311-pages-articles.xml.bz2 DATA/SURROGATES/in/ http://de.wikipedia.org/wiki/ this generates a series of files to DATA/SURROGATES/in if YaCy is running (it may run concurrently), it fetches all new dumps in the surrogate-in directory. The export process is transaction-safe, that means YaCy will not start reading a dump while the dump is not completely finished. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5851 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	0b2c98edc9	some more work on the wikipedia-dump exporter (not finished yet) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5850 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	5195c94838	two patches for performance enhancements of the index handover process from documents to the index cache: - one word prototype is generated for each document, that is re-used when a specific word is stored. - the index cache now uses ByteArray objects to reference the RWI instead of byte[]. This enhances access to the map that stores the cache. To dump the cache to the FS, the content must be sorted, but sorting takes less time than maintenance of a sorted map during caching. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5849 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	9416f5c26f	more speed test cases: kelondro provides map functions that are more than 20% faster than standard java classes and use less than half of the memory of java classes: just start IndexTest (here with 1000000 test objects) Performance test: comparing HashMap, TreeMap and kelondroRow generated 1000000 test data entries STANDARD JAVA CLASS MAPS sorted map time for TreeMap<byte[]> generation: 2110 time for TreeMap<byte[]> test: 2516, 0 bugs memory for TreeMap<byte[]>: 29 MB unsorted map time for HashMap<String> generation: 1157 time for HashMap<String> test: 1516, 0 bugs memory for HashMap<String>: 61 MB KELONDRO-ENHANCED MAPS sorted map time for kelondroMap<byte[]> generation: 1781 time for kelondroMap<byte[]> test: 2452, 0 bugs memory for kelondroMap<byte[]>: 15 MB unsorted map time for HashMap<ByteArray> generation: 828 time for HashMap<ByteArray> test: 953, 0 bugs memory for HashMap<ByteArray>: 9 MB git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5847 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	b53790abb1	more performance hacks: 10% more speed for Base64.compare() which is really often used in YaCy code git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5846 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	8ffb9889e1	some fixes and performance hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5845 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	dfb96ecb72	more fixes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5844 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	1b8d346b4c	fixes in connection with transition to byte[] hashes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5843 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	0b0a46d35a	* fix transferRWI as suggested by celle (thanks!) see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2000#p14023 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5842 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	996572de95	quickfix git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5841 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	380ed2dac0	performance and debugging additions git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5840 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	635b0a9da7	code-split allow cgi indexing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5839 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	fa3adbbfc6	added domain checks to surrogate reader and RWI transfer receiver to prevent spamming using surrogates git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5837 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	76af84d732	* add custom comparator to ScoreCluster for byte[] * fixes http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2010 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5836 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	ab0030d7a7	allow dht-out for remote-crawl processing peers on default settings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5834 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	d1116c049f	*) added new method "contains()" to Blacklist interface *) implemented contains() in class AbstractBlacklist *) used new method in Blacklist_p to prevent double entries in blacklists git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5832 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	08445e42f0	* don't throw exception, in case of bad charset in http-header git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5831 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	2f860a2564	* convert byte[] hashes to string for log output git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5830 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	d93a2a6552	* ignore whitespaces so you can copy&paste signatures better git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5828 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	fbcbcc5bdb	export of yacy document objects as dublin core record in xml git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5826 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d7cbf4cdd4	more performance hacks: less overhead in word hash computation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5825 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	29e96c1a60	bugfixes and performance hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5824 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	4e97a31009	corrections in dublin core syntax git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5823 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago

3761 Commits (aee35bff6f75d41f05419b8af44577ac005406a8)