<tr><td colspan="2"><input type="radio" name="range" id="rangeDomain" value="domain" #(range_domain)#::checked="checked"#(/range_domain)# onclick="document.getElementById('mustmatch').disabled=true;document.getElementById('deleteoldon').disabled=false;document.getElementById('deleteoldage').disabled=false;document.getElementById('deleteoldon').checked=true;"/>Restrict to start domain(s)</td></tr>
<tr><td colspan="2"><input type="radio" name="range" id="rangeSubpath" value="subpath" #(range_subpath)#::checked="checked"#(/range_subpath)# onclick="document.getElementById('mustmatch').disabled=true;document.getElementById('deleteoldon').disabled=false;document.getElementById('deleteoldage').disabled=false;document.getElementById('deleteoldon').checked=true;"/>Restrict to sub-path(s)</td></tr>
Crawls can be restricted to specific countries. This uses the country code that can be computed from
the IP of the server that hosts the page. The filter is not a regular expression but a list of country codes, separated by commas.
</span></span>
<inputtype="radio"name="countryMustMatchSwitch"id="countryMustMatchSwitch" value="false"#(countryMustMatchSwitchChecked)#::checked="checked"#(/countryMustMatchSwitchChecked)#/>no country code restriction<br/>
<inputtype="radio"name="countryMustMatchSwitch"id="noCountryMustMatchSwitch" value="false"#(countryMustMatchSwitchChecked)#::checked="checked"#(/countryMustMatchSwitchChecked)#/>no country code restriction<br/>
to delete them because they simply do not exist any more. Use this in combination with re-crawl; the time given there should be longer.
</span></span><inputtype="radio"name="deleteold"id="deleteoldoff"value="off"#(deleteold_off)#::checked="checked"#(/deleteold_off)#/>Do not delete any document before the crawl is started.</dd>
<dt>Delete sub-path</dt>
<dd><inputtype="radio"name="deleteold"id="deleteoldon"value="on"#(deleteold_on)#::checked="checked"#(/deleteold_on)##(range_wide)#::disabled="disabled"#(/range_wide)#/>For each host in the start url list, delete all documents (in the given subpath) from that host.</dd>
<dd><inputtype="radio"name="deleteold"id="deleteoldon"value="on"#(deleteold_on)#::checked="checked"#(/deleteold_on)#/>For each host in the start url list, delete all documents (in the given subpath) from that host.</dd>
<dt>Delete only old</dt>
<dd><inputtype="radio"name="deleteold"id="deleteoldage"value="age"#(deleteold_age)#::checked="checked"#(/deleteold_age)##(range_wide)#::disabled="disabled"#(/range_wide)#/>Treat documents that are loaded
<dd><inputtype="radio"name="deleteold"id="deleteoldage"value="age"#(deleteold_age)#::checked="checked"#(/deleteold_age)#/>Treat documents that are loaded
A web crawl performs a double-check on all links found on the internet against the internal database. If the same url is found again,
then the url is treated as a double when you check the 'no doubles' option. A url may be loaded again when it has reached a specific age;
to use that, check the 're-load' option.
</span></span><inputtype="radio"name="recrawl" value="nodoubles"#(recrawl_nodoubles)#checked="checked"#(/recrawl_nodoubles)#/>Never load any page that is already known. Only the start-url may be loaded again.</dd>
</span></span><inputtype="radio"name="recrawl"id="reloadoldoff"value="nodoubles"#(recrawl_nodoubles)#checked="checked"#(/recrawl_nodoubles)#/>Never load any page that is already known. Only the start-url may be loaded again.</dd>
<dt>Re-load</dt>
<dd><inputtype="radio"name="recrawl" value="reload"#(recrawl_reload)#checked="checked"#(/recrawl_reload)#/>Treat documents that are loaded
<dd><inputtype="radio"name="recrawl"id="reloadoldage"value="reload"#(recrawl_reload)#checked="checked"#(/recrawl_reload)#/>Treat documents that are loaded