Commit Graph

19 Commits (86a92102640828f1b254b1a6c050f095e534906a)

Author SHA1 Message Date
theli 9649d08171 *) More tolerant robots parser
19 years ago
theli 93cadb47b9 *) More tolerant robots parser for robots files that are missing empty lines between rule blocks
19 years ago
theli f9fb284fb7 *) Better handling of robots.txt files with incorrect keywords
19 years ago
theli b8ceb1ffde *) Adding better https support for crawler
19 years ago
theli 3b5d0eb053 *) Synchronizing robots.txt downloads to avoid parallel downloads of the same file by separate threads
19 years ago
theli 6c48c3ce39 *) Bugfix for ArithmeticException during IndexTransfer
19 years ago
theli 02d9af1a70 *) Restructuring and extending Remote Proxy Support
19 years ago
theli 40777556c5 *) Connection Tracking
19 years ago
theli 959eefbc4f *) Robots.txt parser/ppt
19 years ago
theli a2fa75e688 *) Asynchronous queuing of crawl job URLs (stackCrawl)
19 years ago
theli 023be89586 *) Bugfix for "Robots.txt wird immer wieder geladen" ("robots.txt is loaded again and again")
19 years ago
orbiter dc474aa22f various bug-fixes
19 years ago
rramthun 9dfbd93c7b Updated German language file
19 years ago
theli 2cd695f376 *) Bugfix: path entries of robots.txt were not decoded correctly
19 years ago
theli f8ad65eae1 *) First trial implementation of robots.txt support
19 years ago
allo 9300689dde bugfix *gr*
19 years ago
allo ebc39a7b9a minor fixes
19 years ago
allo f90f699ab1 missing package line.
19 years ago
allo 06a451768f a simple robotsParser.
19 years ago