28 Commits (488a0ed580838690c5c119a334cf24ca17a26cee)

Author SHA1 Message Date
theli 34c075c1c7 testcommit with subversive
19 years ago
theli d3da7c9a08 *) Adding support for robots Allow directive
19 years ago
theli 734d18f283 *) more correct robots.txt validation
19 years ago
theli f0ad0d2b2b *) better robots.txt support
19 years ago
theli 915812f597 *) Undoing robots parser policy changes from svn rev. 1421
19 years ago
theli eeba8b055e *) guessing, testing and suggesting alternative hostnames on "unknown host" error
19 years ago
theli 5c56b9ed59 *) catch exceptions that could occur during url decoding
19 years ago
theli 754a35877f *) Changing robots parser exclusion policy
19 years ago
orbiter 7920e1547d code cleanup
19 years ago
theli 9649d08171 *) More tolerant robots parser
20 years ago
theli 93cadb47b9 *) More tolerant robots parser for robots files that are missing empty lines between rule blocks
20 years ago
theli f9fb284fb7 *) Better handling of robots.txt files with incorrect keywords
20 years ago
theli b8ceb1ffde *) Adding better https support for crawler
20 years ago
theli 3b5d0eb053 *) Synchronizing robots.txt downloads to avoid parallel downloads of the same file by separate threads
20 years ago
theli 6c48c3ce39 *) Bugfix for ArithmeticException during IndexTransfer
20 years ago
theli 02d9af1a70 *) Restructuring and extending of Remote Proxy Support
20 years ago
theli 40777556c5 *) Connection Tracking
20 years ago
theli 959eefbc4f *) Robots.txt parser/ppt
20 years ago
theli a2fa75e688 *) Asynchronous queuing of crawl job URLs (stackCrawl)
20 years ago
theli 023be89586 *) Bugfix for "Robots.txt is loaded again and again"
20 years ago
orbiter dc474aa22f various bug-fixes
20 years ago
rramthun 9dfbd93c7b Updated German language file
20 years ago
theli 2cd695f376 *) Bugfix: path entries of robots.txt were not decoded correctly
20 years ago
theli f8ad65eae1 *) First trial implementation of robots.txt support
20 years ago
allo 9300689dde bugfix *gr*
20 years ago
allo ebc39a7b9a minor fixes
20 years ago
allo f90f699ab1 missing package line.
20 years ago
allo 06a451768f a simple robotsParser.
20 years ago