13 Commits (692aab1318ff1ece96b5143459c1a02db54e6e62)

Author SHA1 Message Date
Michael Peter Christen 97930a6aad added must-not-match filter to snapshot generation.
10 years ago
Michael Peter Christen fed26f33a8 enhanced timezone management for indexed data:
10 years ago
Michael Peter Christen 1309619a71 remove remote indexing option in crawl start if not in p2p mode
10 years ago
Michael Peter Christen b5ac29c9a5 added a html field scraper which reads text from html entities of a
10 years ago
Michael Peter Christen 8df8ffbb6d enhanced the snapshot functionality:
10 years ago
Michael Peter Christen 6f0167fac1 get cloned crawl start parameter for snapshots
10 years ago
Michael Peter Christen 97f6089a41 YaCy can now create web page snapshots as pdf documents which can later
10 years ago
orbiter f642cfbe30 added hint to the regular expression tester
10 years ago
Michael Peter Christen 2de159719b added an option to set 'obey nofollow' for links with rel="nofollow"
10 years ago
Michael Peter Christen 1b279d7a7e fixed external link
11 years ago
reger 89e2c5e884 fix: allow enable of CrawlStartExpert.html #file
11 years ago
Michael Peter Christen a2fba6584f use submitted default userAgent if cloning a crawl
11 years ago
orbiter d29b6db270 made crawl start pages public since they do not reveal individual
11 years ago