From 94f3d90af2467609e42d8cb4981a012547cbe2a5 Mon Sep 17 00:00:00 2001
From: orbiter
Date: Thu, 4 Jun 2009 20:03:26 +0000
Subject: [PATCH] added a hint about regular expressions in crawl start

git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6021 6c8d7289-2bf4-0310-a012-ef5d649a1542
---
 htroot/CrawlStart_p.html | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/htroot/CrawlStart_p.html b/htroot/CrawlStart_p.html
index 98257c0e2..63c84919a 100644
--- a/htroot/CrawlStart_p.html
+++ b/htroot/CrawlStart_p.html
@@ -114,8 +114,9 @@
  Restrict to sub-path
- The filter is an emacs-like regular expression that must match with the URLs which are used to be crawled;
- default is 'catch all'.
+ The filter is a regular expression
+ that must match with the URLs which are used to be crawled; default is 'catch all'.
+ Example: to allow only urls that contain the word 'science', set the filter to '.*science.*'.
  You can also use an automatic domain-restriction to fully crawl a single domain.
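
The hint added in this patch implies that the filter has to match the whole URL, which is why the example is '.*science.*' rather than just 'science'. The following is a minimal sketch of that behaviour, written in Python for illustration rather than in YaCy's own code. The helper `url_passes_filter` and the example.org URLs are hypothetical, and the whole-string matching via `re.fullmatch` is an assumption drawn from the wording of the hint, not from the crawler source.

```python
import re


def url_passes_filter(url: str, crawl_filter: str = ".*") -> bool:
    """Hypothetical helper: True if the filter matches the entire URL.

    Assumption: the crawl filter is applied as a whole-string match,
    so a bare word such as 'science' would not match a full URL.
    The default '.*' plays the role of the 'catch all' filter.
    """
    return re.fullmatch(crawl_filter, url) is not None


if __name__ == "__main__":
    # Made-up URLs, used only to show the effect of the filter.
    urls = [
        "http://example.org/science/index.html",
        "http://example.org/sports/index.html",
    ]
    for url in urls:
        print(url, url_passes_filter(url, ".*science.*"))
```

Under this assumption, a filter of 'science' alone would reject every URL, because no complete URL consists of just that word. The leading and trailing '.*' let the word appear anywhere in the URL.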