yacy_search_server

Commit Graph

Author	SHA1	Message	Date
f1ori	4907697cfa	* make fileuploads through proxy bigger than 65500 bytes possible * remove gzip-encoding for files from cache git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5407 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	fc8189f3fb	better self-healing of corrupted databases git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5406 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	963da8c3f9	* updated tm-extractors to new version 1.0 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5405 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e34ac22fbd	- added new monitoring servlet at http://localhost:8080/PerformanceConcurrency_p.html - used the new monitoring to do some fine-tuning of the indexing queue git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5402 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	449e697436	fix for null-seed in seedfile http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1653 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5401 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d376d81fc4	replaced busy thread control of crawl stacker by blocking threads git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5400 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	f29b48d9ff	patch for IndexOutOfBoundsException git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5399 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	0881190b19	* Robots.txt: don't interpret Crawl-Delays for other robots fixes: http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1647 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5398 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	243e73f53b	removed unnecessary usage of kelondroBLOBTree git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5397 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	8cb7170b75	- set status of kelondroTree, kelondroBLOBTree and kelondroFlexTable to deprecated - removed initialization and/or usage of kelondroFlexTable (should meanwhile not be used any more) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5396 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	7535fd7447	- refactoring of CrawlEntry and CrawlStacker - introduced blocking queues in CrawlStacker to make it ready for concurrency - added a second busy thread for the CrawlStacker The CrawlStacker is multithreaded. It shall be transformed into a BlockingThread in another step. The concurrency of the stacker will hopefully solve some problems with cases where DNS blocks. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5395 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	18513e2ee2	npe fix: http://forum.yacy-websuche.de/viewtopic.php?t=1646 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5393 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	2802138787	- refactoring of CrawlStacker (to prepare it for new multi-Threading to remove DNS lookup bottleneck) - fix of shallBeOwnWord target computation heuristic git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5392 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	db6b3bf5a3	speed enhancement for integrated http server: - tuning hacks in template engine - bypassing the template engine if no servlet present git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5389 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	7cd08bd5fb	fix for NPE in BLOBCompressor git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5388 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	5b94498643	fine-tuning of cache usage from SVN 5386 and a bug fix for overflow in available() method git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5387 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	1779c3c507	- added a read cache to the RAFile interface to RandomAccessFile - added a write buffer to BLOBHeap - modified the BLOBBuffer (is now only to buffer non-compressed content) - added content compression to the HTCache The new read cache will decrease the start/initialization time of BLOB files, like the HTCache, RobotsTxt and other BLOBHeap structures. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5386 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e1acdb952c	fix for problem with userDB and bookmarksDB which was caused by changes in kelondroRA in SVN 5376 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5385 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	4a2dac659e	more speed hacks: - modified and activated write buffer - increased cache flush factor - fixed a problem with deadlocking of indexing process git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5382 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	47292e696a	more performance hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5379 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	759cef23dd	fix for bug in kelondroAbstractRA.readFully git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5378 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d39d420b39	performance hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5376 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	0b4808ba3d	added new interactive search feature: - while the user types search queries, the local database is searched - results are presented interactively This was implemented using a new JSON result format for search results in YaCy - added JSON as file format for servlets - refactoring of current search servlets (xml and html) - added JSON output format for search results - added AJAX-based search page, that uses the yacysearch.json servlet to print results as a query is typed git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5373 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	74a3d86114	fixed an error response that might present classified information git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5372 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c6525ab75f	fix for NPE in seed handling git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5371 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	1951d30a62	addendum to last commit handle words with length < 3 correctly git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5369 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	325ba7bfb8	only query words with length > 2 this is not complete, yet git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5368 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	e423fa9846	*) added method to only get file names in directory listing which match a filter *) only files which end with .black will be listed as blacklists *) added a little bit of Javadoc git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5366 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	513179f404	changed interface to collectionIndex and adapted all implementing classes: do not return a result of a double-check when adding entries with addUnique git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5363 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	9d64693cfb	reverting again the changes to new concurrent chunkIterator git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5362 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	45ad1c3dd5	- re-activated concurrent iterator for EcoFiles - added javadoc for new concurrent initialization in kelondroBytesLongMap - switched default value for commons storage to false - version step git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5361 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	2e2120046f	speed enhancement for BLOBHeap opening process using concurrency of FileIO and content processing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5360 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	fa26a8f25a	fix for deadlock-like behavior in balancer git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5358 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	1918a0173e	added more exception handling during crawling git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5357 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	10f5ec1040	reverted last commit (more testing needed) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5356 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	5af8923f37	* distribute forgotten jar-file in parser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5355 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	b0f2003792	fast database initialization and fast start-up of yacy: - applied knowledge about concurrent files stream reading and index processing from the wikimedia reader to the EcoTable initialization process: the file reader is now concurrent to the index generation - changed also some initialization processes to avoid some pauses during initialization git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5354 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	0ca4bc7b79	- added reader and visualization for mediawiki-export files: files exported from mediawiki using the xml schema according to http://www.mediawiki.org/xml/export-0.3/ can be processed to be viewed in a YaCy servlet. To access such a file, place it into DATA/HTCACHE/mediawiki/ i.e. the export from German Wikipedia would be: DATA/HTCACHE/mediawiki/wikipedia.de.xml This file can then be accessed using the URL http://localhost:8080/mediawiki_p.html?dump=wikipedia.de.xml&title=YaCy if this is done the first time, an index file is created (for this case: more than 4 million lines must be written, this takes about 15 minutes) Then try the same url again. - enhanced also the md5 computation speed git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5352 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
danielr	2e63f03ca5	forgot copy&paste :/ git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5351 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
danielr	cd8082b4e3	fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1111#p11166 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5350 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	4f996a7651	fix for logparser pattern git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5349 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	d18c18971e	* dirlisting in UTF-8 encoding * fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1550&hilit=#p11108 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5348 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	867d0f2f56	removed some unnecessary pause delays git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5346 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	d49ffcd818	* files distributed by yacy are utf-8, files from repository use the system default charset * fixes http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1564#p11092 and http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1550 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5345 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	8c96bc2ac1	do not use proxy caching rules for crawling git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5344 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	dba7ef5144	extended crawling constraints: - removed never-used secondary crawl depth - added a must-not-match filter that can be used to exclude urls from a crawl - added stub for crawl tags which will be used to identify search results that had been produced from specific crawls please update the yacybar: replace property name 'crawlFilter' with 'mustmatch'. Additionally, a new parameter named 'mustnotmatch' can be used, which should be by default the empty string (match-never) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5342 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	96174b2b56	more debugging / better result status logging for parser/caching errors git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5341 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	90e78b2cf6	* improve encoding detection of http service git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5337 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	ef66438662	- more space in error db to store larger error messages - added hash to HTCACHE storage files which will make it possible to join separate caches by just copying files git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5329 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	674ad2d55b	different handling of error cases that occur during loading files with http or ftp: methods throw exception instead of returning an error string git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5328 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago

3357 Commits (4907697cfaea16f91e125279f8a7b4a74fbbf87b)