28 Commits (c5087710a42213737867484598fbcb1d4ee1ccbf)

Author SHA1 Message Date
theli 34c075c1c7 testcommit with subversive
19 years ago
theli d3da7c9a08 *) Adding support for robots Allow directive
19 years ago
theli 734d18f283 *) more correct robots.txt validation
19 years ago
theli f0ad0d2b2b *) better robots.txt support
19 years ago
theli 915812f597 *) Undoing robots parser policy changes from svn rev. 1421
19 years ago
theli eeba8b055e *) guessing, testing and suggesting alternative hostnames on "unknown host" error
19 years ago
theli 5c56b9ed59 *) catch exceptions that could occur during url decoding
19 years ago
theli 754a35877f *) Changing robots parser exclusion policy
19 years ago
orbiter 7920e1547d code cleanup
19 years ago
theli 9649d08171 *) More tolerant robots parser
19 years ago
theli 93cadb47b9 *) More tolerant robots parser for robots files that are missing empty lines between rule blocks
19 years ago
theli f9fb284fb7 *) Better handling of robots.txt files with incorrect keywords
19 years ago
theli b8ceb1ffde *) Adding better https support for crawler
19 years ago
theli 3b5d0eb053 *) Synchronizing robots.txt downloads to avoid parallel downloads of the same file by separate threads
19 years ago
theli 6c48c3ce39 *) Bugfix for ArithmeticException during IndexTransfer
19 years ago
theli 02d9af1a70 *) Restructuring and extending of Remote Proxy Support
19 years ago
theli 40777556c5 *) Connection Tracking
19 years ago
theli 959eefbc4f *) Robots.txt parser/ppt
19 years ago
theli a2fa75e688 *) Asynchronous queuing of crawl job URLs (stackCrawl)
19 years ago
theli 023be89586 *) Bugfix for "Robots.txt is loaded again and again"
19 years ago
orbiter dc474aa22f various bug-fixes
19 years ago
rramthun 9dfbd93c7b Updated German language file
19 years ago
theli 2cd695f376 *) Bugfix: path entries of robots.txt were not decoded correctly
19 years ago
theli f8ad65eae1 *) First trial implementation of robots.txt support
19 years ago
allo 9300689dde bugfix *gr*
19 years ago
allo ebc39a7b9a minor fixes
19 years ago
allo f90f699ab1 missing package line.
19 years ago
allo 06a451768f a simple robotsParser.
19 years ago