Commit Graph

3748 Commits (c0379c3cd359cdf7371ad793719104d8efb871f6)

Author SHA1 Message Date
luccioman db3b9db9c2 Crawl from local file : faster task end when manually terminating crawl.
8 years ago
reger 4c67ed3f8d catch rwi ranking div by zero exception
8 years ago
luccioman 47af33a04c Advanced Crawl from local file : better processing of large files.
8 years ago
luccioman ee92082a3b Updated javadocs : warning about closing stream responsibility.
8 years ago
luccioman 6f49ece22f Fixed redirected URLs processing as crawl start point.
8 years ago
reger 68217465fe div by null in word distance calculation
8 years ago
luccioman 7263d17436 Removed mentions of deprecated LURL-db.
8 years ago
reger 8b74a6bf57 fix min/max calculation of WordReferenceVars.distance()
8 years ago
luccioman da362628fb Added fine log level for too long blacklist matching processing.
8 years ago
reger aaae7c6462 adjust ConcurrentScoreMap internal value map to interface and use parameter
8 years ago
reger 31d2a5645e remove obsolete query variable
8 years ago
luccioman a588ed7628 Applied image headers customization to the new ViewFavicon servlet.
8 years ago
luccioman 7717a3d43d Fixed license headers on files created to improve favicon management.
8 years ago
luccioman 6e1959f469 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
8 years ago
reger 685d8e86bf Avoid frequent data type casting (float/long) for rwi score
8 years ago
luccioman 3ccd89e274 Fixed MultiProtocolURL.resolveBackpath to handle remaining '..' segments
8 years ago
luccioman 4b699c469a Blacklist refactoring : extracted a function for easier unit testing
8 years ago
luccioman 54cfcc3f56 CrawlCheck_p.html : also display info about disallowed URLs.
8 years ago
luccioman 8b341e9818 Robots : properly handle URLs including non ASCII characters
8 years ago
reger e68b00678e prevent negative score on URIMetadataNode - in the special case where no
8 years ago
luccioman 242707f9b4 Fixed loadFromCache with strategy IFFRESH.
8 years ago
reger b752bcfecb adjust date in text detection to ignore some program version strings
8 years ago
reger b017e97421 optimize condenser language detection a little.
8 years ago
reger ae3717d087 adjust Tokenizer sentence count to ignore repeated punctuation (like !!!! )
8 years ago
reger 474f0476c6 adjust Tokenizer sentence count on trailing text after last recognized sentence
8 years ago
reger 3861ac9293 upd maven dependency-check plugin to reflect changes of https://nvd.nist.gov
8 years ago
reger 681a61dafb adjust rwi index result word position handling used for rwi ranking
8 years ago
reger 14f7577231 add support for older Word versions (Word6/Word95) to docParser
8 years ago
reger 1a79c64495 generalize DateDetection with holiday date rules readily available in icu
8 years ago
reger 6f68f08354 correct DateDetection Silvester date
8 years ago
reger 32a2e3a22a have RSSFeed.getChannel return empty message on missing channel element,
8 years ago
luccioman 8d57b5b970 Added some javadocs.
8 years ago
luccioman 60df09fff9 Fixed some HTML validation errors : Illegal character in query
8 years ago
reger 862f28eaa6 display number of documents/rss-items for label "docs" in load_rss_p servlet
8 years ago
luccioman dcdea2d02f Fixed shutdown for crawler.MaxActiveThreads value greater than 200
8 years ago
luccioman d286ba2c3e Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
8 years ago
luccioman b8f6458152 Prevent yacy main thread from hanging on browser opening process.
8 years ago
reger 70e1eb30a5 prevent StringIndexOutOfBounds in getLocalFile()
8 years ago
luccioman 1bb0b135ac Avoid duplication of various MS Windows file URLs flavors
8 years ago
luccioman b9a8476f02 Removed unused import
8 years ago
reger e73c1eea8c remove unused rootpattern, leftover from commit
8 years ago
reger 6f8c3ccea4 improve url hash computation for file path with mixed java & windows
8 years ago
reger efcb6a1e74 fix supported mime XML -> xml for rssParser (mime normalized to lower case for comparison)
8 years ago
luccioman b3b75b0498 Accessibility : add a customizable alternative text to YaCy log
8 years ago
luccioman f2bc1b268d Updated URL fragment validation rules according to current standards
8 years ago
luccioman b1b8e69da8 Fixed NullPointerException cases
8 years ago
luccioman 3ee4f56c39 Improved ErrorCache behavior when switching networks
8 years ago
luccioman 7d5ba2afa4 Added some JavaDoc and moved crawlStacker close at the right place.
8 years ago
luccioman 8edbcd8ad4 Log eventual Solr instances close errors.
8 years ago
reger 330768c8a2 fix for solr write.lock after mode change http://mantis.tokeek.de/view.php?id=686
8 years ago