Converted one more set of URLs to pure relative ones.

Easier YaCy peer configuration behind a reverse proxy subfolder: no
need for the reverse proxy to rewrite HTML links or URLs in CSS files.
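
As a rough illustration of why this helps, the sketch below resolves the old and new link forms the way a browser would, using Python's standard `urljoin`. The host `example.org`, the `/yacy/` proxy subfolder, and the page names `Page.html` and `api/doc.html` are hypothetical, chosen only for the example; the link targets are the ones changed in this commit.

```python
from urllib.parse import urljoin

# Hypothetical public URLs of two YaCy pages served under a reverse proxy
# subfolder "/yacy/". Host, subfolder and page names are assumptions.
TOP_LEVEL_PAGE = "https://example.org/yacy/Page.html"
NESTED_PAGE = "https://example.org/yacy/api/doc.html"


def resolve(href: str, page_url: str) -> str:
    """Resolve a link found in a page against that page's own URL."""
    return urljoin(page_url, href)


# Old form: an absolute path escapes the proxy subfolder.
print(resolve("/RegexTest.html", TOP_LEVEL_PAGE))
# New form: a pure relative link stays inside the subfolder.
print(resolve("RegexTest.html", TOP_LEVEL_PAGE))
# A page one directory deeper needs "../" to reach a top-level resource.
print(resolve("../osm.png?zoom=14", NESTED_PAGE))
```

With the absolute-path form the resolved URL drops the `/yacy/` prefix, so the proxy would have to rewrite the link; with the relative forms the prefix is preserved without any rewriting.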

Tested on Debian Jessie with an apache2 reverse proxy.

See related Mantis issues http://mantis.tokeek.de/view.php?id=106 and
http://mantis.tokeek.de/view.php?id=701
pull/93/head
luccioman 8 years ago
parent 74fec066f4
commit 812abfc868

@ -308,7 +308,7 @@
The filter is a <b><a href="http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html" target="_blank">regular expression</a></b>.
Example: to allow only urls that contain the word 'science', set the must-match filter to '.*science.*'.
You can also use an automatic domain-restriction to fully crawl a single domain.
-Attention: you can test the functionality of your regular expressions using the <a href="/RegexTest.html">Regular Expression Tester</a> within YaCy</a>.
+Attention: you can test the functionality of your regular expressions using the <a href="RegexTest.html">Regular Expression Tester</a> within YaCy</a>.
</span></span>
<table border="0">
<tr><td width="110"><img src="env/grafics/plus.gif"> must-match</td><td></td></tr>
@ -347,7 +347,7 @@
<span class="info" style="float:right"><img src="env/grafics/i16.gif" width="16" height="16" alt="info"/><span style="right:0px;">
The filter is a <b><a href="http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html" target="_blank">regular expression</a></b>
that <b>must not match</b> with the URLs to allow that the content of the url is indexed.
-Attention: you can test the functionality of your regular expressions using the <a href="/RegexTest.html">Regular Expression Tester</a> within YaCy</a>.
+Attention: you can test the functionality of your regular expressions using the <a href="RegexTest.html">Regular Expression Tester</a> within YaCy</a>.
</span></span>
<table border="0">
<tr><td width="110"><img src="env/grafics/plus.gif"> must-match</td><td><input name="indexmustmatch" id="indexmustmatch" type="text" size="55" maxlength="100000" value="#[indexmustmatch]#" onblur="if (this.value=='') this.value='.*';"/></td><td>(must not be empty)</td></tr>

@ -51,7 +51,7 @@
});
</script>
-<div id="api"><a href="/api/webstructure.xml?about=#[besthost]#"><img src="env/grafics/api.png" width="60" height="40" alt="API" /></a>
+<div id="api"><a href="api/webstructure.xml?about=#[besthost]#"><img src="env/grafics/api.png" width="60" height="40" alt="API" /></a>
<span>
The data that is visualized here can also be retrieved in a XML file, which lists the reference relation between the domains.
With a GET-property 'about' you get only reference relations about the host that you give in the argument field for 'about'.

@ -51,7 +51,7 @@ To see a list of all APIs, please visit the <a href="http://www.yacy-websuche.de
<dt>Outbound Links (anchors)</dt><dd property="yacy:outbound">#[yacy_outbound]#</dd>
<dt>Incoming Links (citation)</dt><dd property="yacy:citations">#[yacy_citations]#</dd>
-<dt>Location</dt><dd><a href="/osm.png?lon=#[geo_long]#&lat=#[geo_lat]#&zoom=14" onclick="return hs.expand(this)">lat=#[geo_lat]#, lon=#[geo_long]#</a></dd>
+<dt>Location</dt><dd><a href="../osm.png?lon=#[geo_long]#&lat=#[geo_lat]#&zoom=14" onclick="return hs.expand(this)">lat=#[geo_lat]#, lon=#[geo_long]#</a></dd>
</dl>
</fieldset>

@ -58,7 +58,7 @@
</ul>
</li>
<li id="header_administration">
-<form action="/Status.html" method="get">
+<form action="Status.html" method="get">
<button accesskey="s" type="submit" class="btn btn-inverse navbar-btn">Administration &raquo;</button>
</form>
</li>
@ -95,8 +95,8 @@
<div class="col-sm-9 col-sm-offset-3 col-md-10 col-md-offset-2 main">
<h1 class="page-header">Dashboard</h1>
-<script src="/js/d3.v3.min.js"></script>
-<script src="/js/hypertree.js"></script>
+<script src="js/d3.v3.min.js"></script>
+<script src="js/hypertree.js"></script>
<div id="linkstructure"></div>
<script>$(document).ready(linkstructure("yacy.net", "#linkstructure", 1280, 720, 1000, 1000));</script>

@ -3052,7 +3052,7 @@
<source>You can also use an automatic domain-restriction to fully crawl a single domain.</source>
</trans-unit>
<trans-unit id="a200eeed" xml:space="preserve" approved="no" translate="yes">
-<source>Attention: you can test the functionality of your regular expressions using the &lt;a href="/RegexTest.html"&gt;Regular Expression Tester&lt;/a&gt; within YaCy&lt;/a&gt;.</source>
+<source>Attention: you can test the functionality of your regular expressions using the &lt;a href="RegexTest.html"&gt;Regular Expression Tester&lt;/a&gt; within YaCy&lt;/a&gt;.</source>
</trans-unit>
<trans-unit id="80c17409" xml:space="preserve" approved="no" translate="yes">
<source>You can limit the maximum number of pages that are fetched and indexed from a single domain with this option.</source>

@ -1301,7 +1301,7 @@ Restrict to sub-path(s)==Запретить часть пути
#that must match with the URLs which are used to be crawled; default is 'catch all'.==которые должны совпадать с ссылками для индексации; по-умолчанию 'берутся все'.
Example: to allow only urls that contain the word 'science', set the must-match filter to '.*science.*'.==Например, для разрешения только ссылок, содержащих слово 'science' , нужно установить фильтр '.*science.*'.
You can also use an automatic domain-restriction to fully crawl a single domain.==Вы можете также использовать автоматическое ограничение домена при полном индексировании простого домена.
-Attention: you can test the functionality of your regular expressions using the <a href="/RegexTest.html">Regular Expression Tester</a> within YaCy</a>.==Внимание! Вы можете проверить правильность ваших регулярных выражений используя <a href="/RegexTest.html">тестер</a> в YaCy</a>.
+Attention: you can test the functionality of your regular expressions using the <a href="RegexTest.html">Regular Expression Tester</a> within YaCy</a>.==Внимание! Вы можете проверить правильность ваших регулярных выражений используя <a href="/RegexTest.html">тестер</a> в YaCy</a>.
#Must-Not-Match Filter==Фильтр "не должно совпадать"
#This filter must not match to allow that the page is accepted for crawling.==Dieser Filter muss nicht passen, um zu erlauben, dass die Seite zum crawlen akzeptiert wird.
#The empty string is a never-match filter which should do well for most cases.==Пустая строка означает фильтр "никогда не совпадать", который хорошо подходит в большинстве случаев.
