8 Commits (60c9986a0e4a4f4f75e1ba93e6cf1ac0f9f1fed6)

Author | SHA1 | Message | Date
Michael Peter Christen | e6a87e0426 | enhanced crawler | 3 years ago
Lina Ceballos | a96752f5ab | adding SPDX license and copyright headers | 4 years ago
Michael Peter Christen | 85a427ec54 | support for multiple sitemaps in robots.txt | 11 years ago
Michael Peter Christen | 5e31bad711 | - the webgraph shall store all links which appear on a web page and not | 11 years ago
Michael Peter Christen | 765943a4b7 | Redesign of crawler identification and robots steering. A non-p2p user | 11 years ago
Michael Peter Christen | 038f956821 | fix for sitemap detection: the sitemap url was not visible if it | 12 years ago
Michael Peter Christen | 2d9e577ad0 | replaced the custom robots.txt loader by the standard http loader | 12 years ago
Michael Peter Christen | 00c1c777fa | refactoring | 12 years ago