@@ -114,8 +114,9 @@
<input type="radio" name="range" value="subpath" />Restrict to sub-path
</td>
<td>
-The filter is an emacs-like regular expression that must match with the URLs which are used to be crawled;
-default is 'catch all'.
+The filter is a <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html">regular expression</a>
+that must match with the URLs which are used to be crawled; default is 'catch all'.
+Example: to allow only urls that contain the word 'science', set the filter to '.*science.*'.
You can also use an automatic domain-restriction to fully crawl a single domain.
</td>
</tr>