The filter is a <b><a href="http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html" target="_blank">regular expression</a></b>.
Example: to allow only urls that contain the word 'science', set the must-match filter to '.*science.*'.
You can also use an automatic domain-restriction to fully crawl a single domain.
Attention: you can test the functionality of your regular expressions using the <a href="/RegexTest.html">Regular Expression Tester</a> within YaCy</a>.
Attention: you can test the functionality of your regular expressions using the <a href="RegexTest.html">Regular Expression Tester</a> within YaCy</a>.
The filter is a <b><a href="http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html" target="_blank">regular expression</a></b>
that <b>must not match</b> with the URLs to allow that the content of the url is indexed.
Attention: you can test the functionality of your regular expressions using the <a href="/RegexTest.html">Regular Expression Tester</a> within YaCy</a>.
Attention: you can test the functionality of your regular expressions using the <a href="RegexTest.html">Regular Expression Tester</a> within YaCy</a>.
</span></span>
<table border="0">
<tr><td width="110"><img src="env/grafics/plus.gif"> must-match</td><td><input name="indexmustmatch" id="indexmustmatch" type="text" size="55" maxlength="100000" value="#[indexmustmatch]#" onblur="if (this.value=='') this.value='.*';"/></td><td>(must not be empty)</td></tr>
<source>Attention: you can test the functionality of your regular expressions using the <a href="/RegexTest.html">Regular Expression Tester</a> within YaCy</a>.</source>
<source>Attention: you can test the functionality of your regular expressions using the <a href="RegexTest.html">Regular Expression Tester</a> within YaCy</a>.</source>
@@ -1301,7 +1301,7 @@ Restrict to sub-path(s)==Запретить часть пути
#that must match with the URLs which are used to be crawled; default is 'catch all'.==которые должны совпадать с ссылками для индексации; по-умолчанию 'берутся все'.
Example: to allow only urls that contain the word 'science', set the must-match filter to '.*science.*'.==Например, для разрешения только ссылок, содержащих слово 'science' , нужно установить фильтр '.*science.*'.
You can also use an automatic domain-restriction to fully crawl a single domain.==Вы можете также использовать автоматическое ограничение домена при полном индексировании простого домена.
Attention: you can test the functionality of your regular expressions using the <a href="/RegexTest.html">Regular Expression Tester</a> within YaCy</a>.==Внимание! Вы можете проверить правильность ваших регулярных выражений используя <a href="/RegexTest.html">тестер</a> в YaCy</a>.
Attention: you can test the functionality of your regular expressions using the <a href="RegexTest.html">Regular Expression Tester</a> within YaCy</a>.==Внимание! Вы можете проверить правильность ваших регулярных выражений используя <a href="/RegexTest.html">тестер</a> в YaCy</a>.
#Must-Not-Match Filter==Фильтр "не должно совпадать"
#This filter must not match to allow that the page is accepted for crawling.==Dieser Filter muss nicht passen, um zu erlauben, dass die Seite zum crawlen akzeptiert wird.
#The empty string is a never-match filter which should do well for most cases.==Пустая строка означает фильтр "никогда не совпадать", который хорошо подходит в большинстве случаев.