Commit Graph

10 Commits (3c3cb7855555d5b2712bb1bc05cddb5c3175ec45)

Author SHA1 Message Date
Michael Peter Christen 5e31bad711 - the webgraph shall store all links which appear on a web page and not
11 years ago
Michael Peter Christen 765943a4b7 Redesign of crawler identification and robots steering. A non-p2p user
11 years ago
Michael Peter Christen 57ffdfad4c added a crawl option to obey html-meta-robots-noindex. This is on by
12 years ago
Michael Peter Christen 25499eead5 - added a new field for the regular expression in crawl start
12 years ago
Michael Peter Christen 0fe8be7981 enhaced data structures for balancer and latency computation which
12 years ago
Michael Peter Christen ac9540dfb6 removed options for stopwords which are not used
12 years ago
Michael Peter Christen 5f0ab25382 removed the option to prevent removal of & parts inside of the
12 years ago
Michael Peter Christen 2f536cb54d code cleanup: removed unised methods and made more methods and objects
12 years ago
Michael Peter Christen 1533bfd63b refactoring
12 years ago
Michael Peter Christen 00c1c777fa refactoring
12 years ago