9 Commits (ea2bd43b28aaf8d9264ab8fc3a454c88ff210465)

Author | SHA1 | Message | Date
Michael Peter Christen | ef5192f8c9 | using the generic document parser for crawl starts instead of the html | 13 years ago
Michael Peter Christen | ce620be783 | for for crawl start with smb url | 13 years ago
Roland 'Quix0r' Haeder | fa08ed5ae5 | Fixed a lot CHMOD rights (no need for execute flag on *.java/*.html) and introduced local/remote crawl size ratio based check | 13 years ago
orbiter | 5a55397f99 | some last-minute performance hacks | 13 years ago
orbiter | c93f10417a | add a bookmark automatically each time a new crawl is started | 13 years ago
orbiter | 017a01714d | - enhanced logging in robots.txt parser for remote debugging | 13 years ago
cominch | cef8ebc41d | getpageinfo: Checks if there is a OAI repository behind the URL. | 13 years ago
orbiter | eb1c7c041d | write info about robots.txt evaluation into getpageinfo_p.xml | 13 years ago
orbiter | f8b8c82421 | - refactoring of getpageinfo_p.xml (moved out of util) | 13 years ago