- indexUrlMustMatch and indexUrlMustNotMatch, which can be used to select
loaded pages for indexing. The default patterns are set such that all
loaded pages are also indexed (as before), but when starting an expert
crawl the user may select only specific URLs to be indexed (see the
sketch after this list).
- crawlerNoDepthLimitMatch, a new pattern that can be used to remove
the crawl depth limitation. This filter is a never-match by default
(which causes the depth limit to be applied), but the user can select
paths that will be loaded completely even if the crawl depth is reached
(a sketch of this check follows the diff fragment below).
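
A minimal sketch of how such a must-match/must-not-match pair of regular
expressions typically decides whether a loaded page is also indexed. The
class name, method name, default patterns and example URLs are illustrative
assumptions, not YaCy's actual API; with a match-all must-match and a
never-match must-not-match, every loaded page is indexed, as before.

```java
import java.util.regex.Pattern;

/** Illustrative sketch only: decides whether a loaded URL is also indexed. */
public class IndexFilterSketch {

    private final Pattern indexUrlMustMatch;
    private final Pattern indexUrlMustNotMatch;

    public IndexFilterSketch(String mustMatch, String mustNotMatch) {
        this.indexUrlMustMatch = Pattern.compile(mustMatch);
        this.indexUrlMustNotMatch = Pattern.compile(mustNotMatch);
    }

    /** Index a URL only if it matches the must-match pattern
     *  and does not match the must-not-match pattern. */
    public boolean shallIndex(String url) {
        return indexUrlMustMatch.matcher(url).matches()
                && !indexUrlMustNotMatch.matcher(url).matches();
    }

    public static void main(String[] args) {
        // Assumed defaults: ".*" matches everything, "(?!.*)" never matches,
        // so every loaded page is also indexed.
        IndexFilterSketch defaults = new IndexFilterSketch(".*", "(?!.*)");
        System.out.println(defaults.shallIndex("http://example.org/any/page.html")); // true

        // Expert crawl start: index only URLs below /docs/.
        IndexFilterSketch docsOnly = new IndexFilterSketch(".*/docs/.*", "(?!.*)");
        System.out.println(docsOnly.shallIndex("http://example.org/blog/post.html")); // false
    }
}
```
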
-<input type="checkbox" name="directDocByURL" id="directDocByURL" #(directDocByURLChecked)#::checked="checked"#(/directDocByURLChecked)# />also all linked non-parsable documents
+<input type="checkbox" name="directDocByURL" id="directDocByURL" #(directDocByURLChecked)#::checked="checked"#(/directDocByURLChecked)# />also all linked non-parsable documents<br/>
+Unlimited crawl depth for URLs matching with: <input name="crawlingDepthExtension" id="crawlingDepthExtension" type="text" size="30" maxlength="100" value="#[crawlingDepthExtension]#" />
</td>
<td>
This defines how often the Crawler will follow links (of links..) embedded in websites.
@@ -150,7 +151,7 @@
</td>
</tr>
<tr valign="top" class="TableCellLight">
-<td><label for="mustmatch">Must-Match Filter for URLs</label>:</td>
+<td><label for="mustmatch">Must-Match Filter for URLs for crawling</label>:</td>
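
For the new unlimited-crawl-depth option, the check could look roughly like
the following. This is only a sketch under assumed names (class, method and
example pattern are not the crawler's real code): the never-match default
keeps the depth limit in force, while a user-supplied pattern lifts the limit
for matching URLs.

```java
import java.util.regex.Pattern;

/** Illustrative sketch only: decides whether a link may still be followed. */
public class DepthLimitSketch {

    private final int maxDepth;                      // configured crawl depth
    private final Pattern crawlerNoDepthLimitMatch;  // never-match by default

    public DepthLimitSketch(int maxDepth, String noDepthLimitPattern) {
        this.maxDepth = maxDepth;
        this.crawlerNoDepthLimitMatch = Pattern.compile(noDepthLimitPattern);
    }

    /** Follow a link if the URL matches the no-depth-limit pattern,
     *  otherwise only while the current depth is below the limit. */
    public boolean shallFollow(String url, int currentDepth) {
        if (crawlerNoDepthLimitMatch.matcher(url).matches()) {
            return true; // depth limit removed for matching paths
        }
        return currentDepth < maxDepth;
    }

    public static void main(String[] args) {
        // Depth limit 2, but everything under /manual/ is crawled completely.
        DepthLimitSketch sketch = new DepthLimitSketch(2, ".*/manual/.*");
        System.out.println(sketch.shallFollow("http://example.org/news/item.html", 2));  // false
        System.out.println(sketch.shallFollow("http://example.org/manual/chapter1", 5)); // true
    }
}
```
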