Merge remote-tracking branch 'origin/master' into docker

pull/55/head
luccioman 9 years ago
commit 85f7d32087

@@ -1231,7 +1231,7 @@ This cache is very important for a fast search process.==Tato cache je velmi dol
 Increasing the cache size will result in more search results and less IO during DHT transfer.==Dosledok zvacsenie tejto cache je viac vysledkov vyhladavania a menej vstupno/vystupnej zataze pocas DHT prenosu.
 'noticed' URLs=='zname' URL adresy
 A noticed URL is one that was discovered during crawling but was not loaded yet.==Znama URL adresa je taka ktora bola objavena pocas crawlingu avsak nebola este nahrata.
-Increasing the cache size will result in faster double-check during URL recognition when doing crawls.==Erhöhen der Cachegröße resultiert in schnellerer Rücküberprüfung beim Durchführen von Crawls.
+#Increasing the cache size will result in faster double-check during URL recognition when doing crawls.==Erhöhen der Cachegröße resultiert in schnellerer Rücküberprüfung beim Durchführen von Crawls.
 'error' URLs=='chybne' URL adresy
 URLs that cannot be loaded are stored in this database. It is also used for double-checked during crawling.==URL adresy ktore nemozu byt nahrante su ulozene v tejto databaze. Takisto sa pouziva pri dvojnasobnej kontrole pocas crawlingu.
 Increasing the cache size will most probably speed up crawling slightly, but not significantly.==Zvacsenie tejto cache pravdepodobne jemne zvysi rychlost crawlingu, nie vsak o vela.
@@ -1343,7 +1343,6 @@ Page.==stranke.
 #File: QuickCrawlLink_p.html
 #---------------------------
-YaCy '#[clientname]#': Quick Crawl Link==YaCy '#[clientname]#''#[clientname]#': Rychly Crawl Link
 Quick Crawl Link==Rychly Crawl Link
 Quickly adding Bookmarks:==Rychly Crawl - Zalozky:
 Simply drag and drop the link shown below to your Browsers Toolbar/Link-Bar.==Kliknite na tahajte (drag and drop) odkaz nizsie do toolbar/linkbaru Vaseho browsera.
@@ -1418,7 +1417,7 @@ Server Access Restrictions==Obmedzenia pristupu k serveru
 You can restrict the access to this proxy/server using a two-stage security barrier:==Pristup k tomuto proxy resp. HTTP serveru mozete obmedzit pouzitym 2-stupnovej bezpecnostnej bariery:
 define an <i>access domain</i> with a list of granted client IP-numbers or with wildcards==zadajte <i>priestor sietovych domen</i> so zoznamom IP adries povolenych klientov alebo pomocou wildcard znakov
 define an <i>user account</i> with an user:password - pair==vytvorte <i>uzivatelsky ucet</i> pomocou paru 'uzivatel:heslo'
-This is the account that restricts access to the proxy function.==Dies sind die Nutzer denen der Zugriff auf die Proxyfunktion gew&auml;hrt wird.
+#This is the account that restricts access to the proxy function.==Dies sind die Nutzer denen der Zugriff auf die Proxyfunktion gew&auml;hrt wird.
 You probably don't want to share the proxy to the internet, so you should set the IP-Number Access Domain to a pattern that corresponds to you local intranet.==Pravdepodobne nechcete na internete zielat Vase proxy, takze by ste mali zvolit IP adresovy priestor tak aby zodpovedal adresam Vaseho intranetu.
 The default setting should be right in most cases.==Predvolene nastavenia by mali byt vo vacsine pripadov spravne.
 If you want, you can also set a proxy account so that every proxy user must authenticate first, but this is rather unusual.==Ak chcete mozete tiez vytvorit proxy ucet, takze kazdy uzivatel proxy sa musi najprv prihlasit, co je vsak neobvykle riesenie.
@@ -1697,6 +1696,7 @@ from 'late' peers to enrich this search result.==z pomalych peerov na zlepsenie
 #---------------------------
 System-, Index- and Peer-Status==Stav systemu, indexu a peera
 Welcome to YaCy!==Vitajte v YaCy!
+Your settings are _not_ protected!==Vase nastavenia _nie_su_ chranene heslom!
 "Restart"=="Restartuj"
 "Shutdown"=="Vypni"
 Public System Properties==Vseobecne systemove vlastnosti
@@ -1776,7 +1776,7 @@ Private System Properties==Sukromne systemove vlastnosti
 System Resources==Systemove zdroje
 Processors:==Procesory:
 Protection==Ochrana
-<b>Your settings are _not_ protected!</b> Please go to the==<b>Vase nastavenia _nie_su_ chranene heslom</b> Chodte prosim na
+#<b>Your settings are _not_ protected!</b> Please go to the==<b>Vase nastavenia _nie_su_ chranene heslom</b> Chodte prosim na
 settings</a> page <b>immediately</b> and set an administration password.==stranku nastaveni</a> a <b>ihned</b> si zvolte heslo.
 Your settings are protected by a password.==Vase nastavenia su chranene heslom.
 Peer host==Peer Host

@@ -49,6 +49,7 @@ import java.util.Set;
 import net.yacy.cora.util.CommonPattern;
 import net.yacy.cora.util.ConcurrentLog;
+import net.yacy.document.SentenceReader;
 import net.yacy.kelondro.util.FileUtils;
 import net.yacy.kelondro.util.Formatter;
 import net.yacy.peers.Seed;
@@ -71,34 +72,55 @@ public class Translator {
      * @param translationTable translation entries : text to translate -> translation
      * @return source translated
      */
     public String translate(final StringBuilder source,
             final Map<String, String> translationTable) {
         final Set<Map.Entry<String, String>> entries = translationTable.entrySet();
         StringBuilder builder = new StringBuilder(source);
-        for (final Entry<String, String> entry: entries) {
+        for (final Entry<String, String> entry : entries) {
             String key = entry.getKey();
             /* We have to check key is not empty or indexOf would always return a positive value */
             if (key != null && !key.isEmpty()) {
                 String translation = entry.getValue();
                 int index = builder.indexOf(key);
                 if (index < 0) {
                     // Filename not available, but it will be printed in Log
                     // after all untranslated Strings as "Translated file: "
                     if (ConcurrentLog.isFine("TRANSLATOR"))
-                        ConcurrentLog.fine("TRANSLATOR", "Unused String: "
-                                + key);
-                } else {
-                    while (index >= 0) {
-                        builder.replace(index, index + key.length(),
-                                translation);
-                        index = builder.indexOf(key,
-                                index + translation.length());
-                    }
-                }
-            }
-        }
-        return builder.toString();
-    }
+                        ConcurrentLog.fine("TRANSLATOR", "Unused String: " + key);
+                } else {
+                    while (index >= 0) {
+                        // check for word boundary before and after translation key
+                        // to avoid translation just on char sequence e.g. as in key="bug" source="mybugfix"
+                        boolean boundary = index + key.length() >= builder.length(); // eof text = end-boundary
+                        if (!boundary) {
+                            char c = builder.charAt(index + key.length() - 1);
+                            char lc = builder.charAt(index + key.length());
+                            boundary |= (SentenceReader.punctuation(c) || SentenceReader.invisible(c)); // special case, basically last char of key
+                            boundary |= (SentenceReader.punctuation(lc) || SentenceReader.invisible(lc)); // char after key = end-boundary
+                        }
+                        // if end-boundary ok check begin-boundary
+                        if (boundary && index > 0) {
+                            char c = builder.charAt(index - 1); // char before key = begin-boundary
+                            boundary = (SentenceReader.punctuation(c) || SentenceReader.invisible(c));
+                            char fc = builder.charAt(index); // special case for key >name<, currently to allow <label>name</label> (basically first char of key)
+                            boundary |= (SentenceReader.punctuation(fc) || SentenceReader.invisible(fc));
+                        }
+                        if (boundary) { // boundary check ok -> translate
+                            builder.replace(index, index + key.length(), translation);
+                            index = builder.indexOf(key, index + translation.length());
+                        } else { // otherwise just skip to next occurrence
+                            index = builder.indexOf(key, index + key.length());
+                        }
+                    }
+                }
+            }
+        }
+        return builder.toString();
+    }
     /**
      * Load multiple translationLists from one File. Each List starts with #File: relative/path/to/file

@@ -225,8 +225,6 @@ public final class Switchboard extends serverSwitch {
 public final static String SOLR_COLLECTION_CONFIGURATION_NAME = "solr.collection.schema";
 public final static String SOLR_WEBGRAPH_CONFIGURATION_NAME = "solr.webgraph.schema";
-// load slots
-public static int xstackCrawlSlots = 2000;
 public static long lastPPMUpdate = System.currentTimeMillis() - 30000;
 private static final int dhtMaxContainerCount = 500;
 private int dhtMaxReferenceCount = 1000;
@@ -235,8 +233,6 @@ public final class Switchboard extends serverSwitch {
 public static SortedSet<String> badwords = new TreeSet<String>(NaturalOrder.naturalComparator);
 public static SortedSet<String> stopwords = new TreeSet<String>(NaturalOrder.naturalComparator);
 public static SortedSet<String> blueList = null;
-// public static HandleSet badwordHashes = null; // not used 2013-06-06
-// public static HandleSet blueListHashes = null; // not used 2013-06-06
 public static SortedSet<byte[]> stopwordHashes = null;
 public static Blacklist urlBlacklist = null;
@@ -271,7 +267,6 @@ public final class Switchboard extends serverSwitch {
 public BookmarksDB bookmarksDB;
 public WebStructureGraph webStructure;
 public ConcurrentHashMap<String, TreeSet<Long>> localSearchTracker, remoteSearchTracker; // mappings from requesting host to a TreeSet of Long(access time)
-public long indexedPages = 0;
 public int searchQueriesRobinsonFromLocal = 0; // absolute counter of all local queries submitted on this peer from a local or authenticated user
 public int searchQueriesRobinsonFromRemote = 0; // absolute counter of all local queries submitted on this peer from a remote IP without authentication
 public float searchQueriesGlobal = 0f; // partial counter of remote queries (1/number-of-requested-peers)
@@ -655,7 +650,6 @@ public final class Switchboard extends serverSwitch {
 } else {
 blueList = new TreeSet<String>();
 }
-// blueListHashes = Word.words2hashesHandles(blueList);
 this.log.config("loaded blue-list from file "
 + plasmaBlueListFile.getName()
 + ", "
@@ -680,7 +674,6 @@ public final class Switchboard extends serverSwitch {
 badwordsFile = new File(appPath, "defaults/" + SwitchboardConstants.LIST_BADWORDS_DEFAULT);
 }
 badwords = SetTools.loadList(badwordsFile, NaturalOrder.naturalComparator);
-// badwordHashes = Word.words2hashesHandles(badwords);
 this.log.config("loaded badwords from file "
 + badwordsFile.getName()
 + ", "
@@ -3044,9 +3037,6 @@ public final class Switchboard extends serverSwitch {
 processCase // process case
 );
-// increment number of indexed urls
-this.indexedPages++;
 // update profiling info
 if ( System.currentTimeMillis() - lastPPMUpdate > 20000 ) {
 // we don't want to do this too often

@@ -0,0 +1,63 @@
+package net.yacy.data;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Set;
+import org.junit.Test;
+import static org.junit.Assert.*;
+public class TranslatorTest {
+    /**
+     * Test of translate method, of class Translator.
+     */
+    @Test
+    public void testTranslate() {
+        // test that translator respects word boundaries (e.g. key="bug" must not translate "mybugfix")
+        Translator t = new Translator();
+        final Map<String, String> translationTable = new HashMap<String, String>();
+        translationTable.put("MIST", "Nebel"); // key upper case just to easily identify it in test strings
+        translationTable.put(">MIST", ">Nebel");
+        translationTable.put("BY", "bei");
+        translationTable.put(">BY", ">bei");
+        translationTable.put("BY<", "bei<");
+        translationTable.put(">BY<", ">bei<");
+        // source test text, expected not to be translated
+        Set<String> noChange = new HashSet<String>();
+        noChange.add("MISTer wong ");
+        noChange.add("make no MISTake");
+        noChange.add("value=\"MISTake\" ");
+        noChange.add("<b>MISTral</b>");
+        noChange.add("value=\"#[MISTake]#\" ");
+        noChange.add(" optiMIST ");
+        noChange.add("goodBY.");
+        noChange.add(" BYte");
+        noChange.add("<label>BYte</label>");
+        //noChange.add(" BY_BY "); // this translates
+        // source test text, to be translated
+        Set<String> doChange = new HashSet<String>();
+        doChange.add("Queen of the MIST ");
+        doChange.add("value=\"#[MIST]#\" ");
+        doChange.add("text#[MIST]#text ");
+        doChange.add("MIST in the forrest");
+        doChange.add("MIST\nin the forrest");
+        doChange.add("<label>BY</label>");
+        String result;
+        for (String stringToExamine : noChange) {
+            StringBuilder source = new StringBuilder(stringToExamine);
+            result = t.translate(source, translationTable);
+            assertEquals(result, stringToExamine);
+        }
+        for (String stringToExamine : doChange) {
+            StringBuilder source = new StringBuilder(stringToExamine);
+            result = t.translate(source, translationTable);
+            assertNotEquals(result, stringToExamine);
+        }
+    }
+}
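
Note on the word-boundary check exercised by TranslatorTest: the following is a minimal, standalone sketch of the same idea, not the YaCy implementation. The class name BoundaryTranslateSketch and the helper isBoundaryChar are hypothetical; isBoundaryChar is only an assumed stand-in for SentenceReader.punctuation(c) || SentenceReader.invisible(c), whose exact character sets are not shown in this diff. It treats any character that is not a letter or digit as a boundary, which may differ from the real helpers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundaryTranslateSketch {

    /** Hypothetical stand-in for SentenceReader.punctuation(c) || SentenceReader.invisible(c). */
    static boolean isBoundaryChar(char c) {
        return !Character.isLetterOrDigit(c);
    }

    /** Replaces each key only where it is delimited by boundaries on both sides. */
    static String translate(String source, Map<String, String> table) {
        StringBuilder builder = new StringBuilder(source);
        for (Map.Entry<String, String> entry : table.entrySet()) {
            String key = entry.getKey();
            String translation = entry.getValue();
            // an empty key would match at every position, so skip it
            if (key == null || key.isEmpty() || translation == null) continue;
            int index = builder.indexOf(key);
            while (index >= 0) {
                int end = index + key.length();
                // end boundary: end of text, boundary char after the key,
                // or the key itself ends with a boundary char (e.g. "BY<")
                boolean endOk = end >= builder.length()
                        || isBoundaryChar(builder.charAt(end))
                        || isBoundaryChar(key.charAt(key.length() - 1));
                // begin boundary: start of text, boundary char before the key,
                // or the key itself starts with a boundary char (e.g. ">BY")
                boolean beginOk = index == 0
                        || isBoundaryChar(builder.charAt(index - 1))
                        || isBoundaryChar(key.charAt(0));
                if (endOk && beginOk) {
                    builder.replace(index, end, translation);
                    index = builder.indexOf(key, index + translation.length());
                } else {
                    // not a whole word: skip to the next occurrence
                    index = builder.indexOf(key, end);
                }
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        Map<String, String> table = new LinkedHashMap<>();
        table.put("MIST", "Nebel");
        System.out.println(translate("make no MISTake", table));    // prints "make no MISTake"
        System.out.println(translate("Queen of the MIST ", table)); // prints "Queen of the Nebel "
    }
}
```

With several overlapping keys (for example "BY" and ">BY"), the result can depend on the iteration order of the map, so the sketch uses a LinkedHashMap with a single key to keep the output deterministic.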