Commit Graph

29 Commits (a3ecfe0a45f59536ceb97a0ff3782d158f5e7e31)

Author | SHA1 | Message | Date
orbiter | 861f41e67e | redesigned NURL-handling: | 18 years ago
theli | d157201e08 | *) IfesL for "Unexpected end of ZLIB" error message | 18 years ago
orbiter | 109ed0a0bb | - cleaned up code; removed methods to write the old data structures | 18 years ago
orbiter | 30888e7a2f | implementation of search constraints | 18 years ago
orbiter | 497428c8ec | refactoring | 18 years ago
orbiter | 76fceb9997 | refactoring | 18 years ago
orbiter | bb7d4b5d5e | refactoring to prepare new RWI entry object | 18 years ago
theli | a5b9b514c1 | *) retry crawling without content-encoding if the content-encoding header was not correct | 18 years ago
theli | 1d4fb680ce | *) CrawlWorker.java: only keep content in memory if size is equal or less than 5MB | 18 years ago
theli | f17ce28b6d | *) plasmaHTCache: | 18 years ago
orbiter | 310f1c41cd | added option to see ranking scores in surftipps | 18 years ago
orbiter | df1629b05a | - code cleanup | 18 years ago
theli | b6c7b91582 | *) Parser now throws an ParserException instead of returning null on parsing errors (e.g. needed by snippet fetcher) | 18 years ago
theli | a0ddf2ec11 | *) AbstractCrawlWorker.java: delete already downloaded data on crawling error | 18 years ago
theli | fded1f4a5d | *) better handling of maximum file size limit in crawler | 18 years ago
theli | 63893003be | *) Adding settings page for the crawler which allows to specify a file size limit and the timeout to use. | 18 years ago
theli | b44514242a | *) crawler/ftp/CrawlWorker.java: better errorhandling | 18 years ago
theli | 7d7f30139c | *) crawler/ftp/CrawlWorker.java: delete old cache file | 18 years ago
theli | 043edfa4d8 | *) ftp/ResourceInfo.java ResourceInfo object for ftp resources added | 18 years ago
theli | dae763d8e3 | git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2495 6c8d7289-2bf4-0310-a012-ef5d649a1542 | 18 years ago
theli | 4825bfaaf3 | *) Bugfix for PrintWriter Problem | 18 years ago
theli | 7930839594 | *) URL.java: userinfo was not taken over when generating a new url from a base url and a rel. path | 18 years ago
theli | 393a7d10be | *) setting htCache.Entry fields to private | 18 years ago
theli | ab5a9bee66 | *) adding some copyright headers | 18 years ago
theli | fce9e7741b | *) next step of restructuring for new crawlers | 18 years ago
theli | 4e2a950ac9 | *) next step of restructuring for new crawlers | 18 years ago
theli | 09b106eb04 | *) next step of restructuring for new crawlers | 18 years ago
theli | eb9b138986 | *) next step of restructuring for new crawlers | 18 years ago
theli | 1395aae742 | *) starting restructuring which is needed to add crawlers for additional protocols | 18 years ago