3175 Commits (0f9c0bd0d52e9ec10aef70b53ea05037bd6396f9)

Author SHA1 Message Date
orbiter 8e0de7f180 update to language statistic evaluation:
16 years ago
orbiter 1198eeecc7 added language selection to search query:
16 years ago
orbiter 00c1535f84 added ranking and evaluation of language type in a search
16 years ago
lotus a81cb78211 finally some putHTML on htroot/xml/
16 years ago
orbiter bfcf9b7aa3 - added language detection using metadata from documents: html and odt documents provide this information
16 years ago
apfelmaennchen 5e8bd0f29c small fixes to getpageinfo_p.xml and htmlFilterContentScraper.java with respect to keyword extraction
16 years ago
apfelmaennchen 5b2a57bfd0 - /xml/util/getpageinfo_p.xml added <desc> and <lang> tags
16 years ago
orbiter e1f67262f7 - added and removed some debugging output
16 years ago
orbiter ce2a7ed116 integrated language detection classes into condenser environment
16 years ago
orbiter 2b13705839 fixed a mistake in indexing queue processing: documents had been parsed before it was checked if they should be indexed or not. parsing was not necessary for this check, so the check was moved in the queue in front of the document parsing
16 years ago
orbiter 21dbb39afa switched two balancer cases
16 years ago
orbiter 1bbf362cef update to the crawl balancer: better organization and better crawl delay prediction
16 years ago
orbiter ddcf285499 - fixed a bug in performance setting (did not work with german translation)
16 years ago
orbiter 0cd0fee546 fixed bug with wrong proxy result enqueueing. See:
16 years ago
orbiter 670244849d fix for http://forum.yacy-websuche.de/viewtopic.php?p=9835#p9835
16 years ago
lotus fd9233244e configurable free disk space via disk.free
16 years ago
orbiter 25a62cdc3f small fixes
16 years ago
lotus 73f233bb11 * set resource observer to 1000MB
16 years ago
orbiter 5fbccfd75e fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1366&p=9348#p9348
16 years ago
orbiter a28faabfd2 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1351&p=9242#p9242
16 years ago
apfelmaennchen 7b63c66a08 - bugfix in bookmarksDB.Tag.hasPublicItems()
16 years ago
orbiter 1fb1665e71 increased dht interval to avoid peer selection failure
16 years ago
orbiter 1eb813bd43 shifted index deletion-on-exit rule to the class where the errors are produced
16 years ago
f1ori ba76995d2c * fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1415
16 years ago
f1ori bea6c13139 * with r5137 robotParser didn't work at all -> fix
16 years ago
lotus 3ded1efe84 kelondroExceptionCounter didn't work
16 years ago
f1ori ae677e1738 * fix problem in robotparser, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1421&p=9742
16 years ago
lotus 383d89481e count errors before deleting collection.index
16 years ago
lotus 0bb4fbc403 delete corrupted collection.index on exit for rebuild on next start
16 years ago
lotus b68d06a6e8 performance settings based on network's remote crawl speed
16 years ago
danielr d60b2b198d proxy fixed 'not modified' http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1419
16 years ago
f1ori bd0318ba81 * YaCy only supports gzip-encoding, so remove any other encoding from request
16 years ago
orbiter bb5c898441 enhancements to localsearch behavior
16 years ago
orbiter 42e2d195ac added hint from http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1294
16 years ago
orbiter 39964e88fa fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1329#p9121
16 years ago
orbiter 3f3673b6e5 extended balancer:
16 years ago
orbiter 3c6e8d2015 set default ppm when network is switched
16 years ago
orbiter 3288c19c1a reduce remote crawl PPM for fresh peers in freeworld to 6 PPM
16 years ago
lotus 5ce9a100bb fix(2) for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1416
16 years ago
danielr cf29ca19d4 possible fix for POST character encoding http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1374
16 years ago
danielr a2eeb6138c fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1416
16 years ago
orbiter d09ddabd09 corrected a design mistake (5-byte hashes not necessary)
16 years ago
orbiter c97d0fcee7 modified the domain list export function:
16 years ago
orbiter 77ee0765a4 - added domain statistic generation to IndexControlURLs_p.html servlet
16 years ago
orbiter 80a7bc93d6 - added statistical evaluation about domains that appear during crawling
16 years ago
orbiter 4fbee21cea - added fetch-ahead again (had been removed in last commit)
16 years ago
lotus 423a89ebe8 * fix if yacy was installed to a path with whitespace
16 years ago
orbiter fc03b0437a fixed an error case where a second search after a first search with a different search word failed
16 years ago
orbiter eca171ba2e fix for case where javascript was not filtered by the html parser
16 years ago
lotus e645bae29f display table in log
16 years ago
orbiter ead39064c5 fixed problem with wrong result number calculation
16 years ago
hermens 2437beb96c fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=1360&p=9321#p9321
16 years ago
orbiter 7b12e77a63 fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1393&hilit=&p=9655#p9655
16 years ago
orbiter 05dbba4bab added logging conditions to all fine and finest log line calls
16 years ago
orbiter d3d41e2ee4 - fixed problem with searching with quotes (still not complete, but not as bad as before)
16 years ago
lotus 3fbfd5a78b * fix for non-changing offset on new search term
16 years ago
danielr 219b93df6a - fixed internal error after receiving chunked POST
16 years ago
lotus c245c7a45e delete index.dhtin/out.heap if restore fails
16 years ago
danielr cd19d0aee6 - added warnings for failed transferRWI (dht-in)
16 years ago
orbiter df4ff423c4 added additional properties to query id's to distinguish search events better
16 years ago
danielr d6d9b0f14a fixed transferRWI.html 'Read timed out'
16 years ago
danielr e503158527 Proxy: fix for never ending loading after POST
16 years ago
danielr 1a1d57e449 Proxy: added binary passthrough for POST
16 years ago
apfelmaennchen aa6ae77e5e - autoReCrawl: fix for filter settings
16 years ago
apfelmaennchen 8ae29bad57 - fix to previous change of Crawl Profile Names
16 years ago
apfelmaennchen 434104e4a0 - change Crawl profile name for autoreCrawl
16 years ago
danielr 9ff4fc11da partial fix (images,audio,video) for proxy and content-type problem http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1374
16 years ago
lotus 0df2e47012 changed auto recrawl to comply with new date format
16 years ago
lotus d9d9c522a1 addendum to last commit
16 years ago
lotus 480497f7c9 changed recrawl
16 years ago
orbiter da1b0b2fc6 added two new classes that will be used for the new htcache
16 years ago
orbiter 536e77e8b7 modifications towards a single database operation to read/write http header and cached file at once:
16 years ago
borg-0300 08cdf6db8a fix for wrong "VegaYacyB" peers
16 years ago
danielr 4d937f6b21 fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1396
16 years ago
apfelmaennchen bd931a82f7 - added dynamic filters to autoReCrawl.conf
16 years ago
apfelmaennchen b3fc5e96a3 - removed unused import from bookmarksDB
16 years ago
apfelmaennchen bc048db7b6 - bugfix for bookmarksDB's rebuildDates()
16 years ago
danielr 3c68905540 remove redundant null checks
16 years ago
danielr 753a1ae430 - changed default browser from netscape to firefox
16 years ago
orbiter 7989335ed6 Preparations to replace the HTCache with a new storage data structure:
16 years ago
danielr be28af50f5 - fixed "yacy2yacy no proxy"-problem
16 years ago
f1ori f99c307eff * correct debian build dependencies
16 years ago
orbiter bdae051d9a - extended new performance graph (better timing)
16 years ago
danielr d9cea5ff23 removed annotations which broke the build with java 1.5
16 years ago
danielr a087090bbb fixed starting crawl results in "No parser available to parse mimetype 'application/octet-stream'"
17 years ago
danielr 7e7e6a099a undo 5044
17 years ago
danielr f2d0bd7790 fix for NPE in JakartaHttpClient.setProxy
17 years ago
danielr bb6a6fc233 fixed 'FileUploadException Stream ended unexpectedly'
17 years ago
danielr 8422ee5ec4 - fixed UnsupportedEncoding (in proxy) using defaultCharset if no characterEncoding can be determined
17 years ago
hermens 3ac1988059 Add some sanity checks for invalid seeds
17 years ago
hermens cff4393f0c Fix HTCache so oldest Files get deleted first
17 years ago
danielr 31d97f2b9f replaced httpd.parseMultipart() by a 'right' implementation
17 years ago
danielr 621b473b18 * removed some warnings of findbugs (http://findbugs.sf.net)
17 years ago
apfelmaennchen 0500b1179e added a 2 min start up delay to serverBusyThread autoReCrawl to avoid a Null Pointer Exception...
17 years ago
apfelmaennchen e1574fe02e - added autoReCrawl folders to bookmarks (DATA/SETTINGS/autoReCrawl.conf)
17 years ago
orbiter ebb40d324b enhanced memory chart: shows now also the size of the word cache as third vector.
17 years ago
danielr 17b7845eb5 * refactoring
17 years ago
danielr 3bb870bfcd added final where possible
17 years ago
lotus 7e92484400 fix for open browser on windows 2000
17 years ago
f1ori b0724e5ec0 * add config option to disable cookie monitoring (disabled by default)
17 years ago