sixcooler 13 years ago
commit 1f4fcc0f30

@ -955,7 +955,7 @@ routing.deleteOldSeeds.time = 30
# options to remember the default search engines when using the search compare features # options to remember the default search engines when using the search compare features
compare_yacy.left = YaCy compare_yacy.left = YaCy
compare_yacy.right = google.com compare_yacy.right = scroogle.org
# minimum free disk space for crawling (MiB) # minimum free disk space for crawling (MiB)
disk.free = 3000 disk.free = 3000

@ -138,8 +138,11 @@
#(indexReceiveBlockBlacklistChecked.off)#::checked="checked" #(/indexReceiveBlockBlacklistChecked.off)#/> #(indexReceiveBlockBlacklistChecked.off)#::checked="checked" #(/indexReceiveBlockBlacklistChecked.off)#/>
<label for="indexReceiveBlockBlacklistOff">accept transmitted URLs that match your blacklist</label>. <label for="indexReceiveBlockBlacklistOff">accept transmitted URLs that match your blacklist</label>.
</dd> </dd>
<dt>
<input type="submit" name="save" value="Save" />
</dt>
<dd></dd>
</dl> </dl>
<input type="submit" name="save" value="Save" />
</fieldset> </fieldset>
<fieldset> <fieldset>
@ -203,8 +206,11 @@
If you leave the field empty, no peer asks your peer. If you fill in a '*', your peer is always asked. If you leave the field empty, no peer asks your peer. If you fill in a '*', your peer is always asked.
<input type="text" id="peertags" name="peertags" value="#[peertags]#" size="40" maxlength="80" /> <input type="text" id="peertags" name="peertags" value="#[peertags]#" size="40" maxlength="80" />
</dd> </dd>
<dt>
<input type="submit" name="save" value="Save" />
</dt>
<dd></dd>
</dl> </dl>
<input type="submit" name="save" value="Save" />
</fieldset> </fieldset>
</form> </form>
</fieldset> </fieldset>

@ -1,138 +1,36 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"> <html xmlns="http://www.w3.org/1999/xhtml">
<head> <head>
<title>YaCy: Help</title> <title>YaCy: Tutorial</title>
#%env/templates/metas.template%# #%env/templates/metas.template%#
</head> </head>
<body id="Help"> <body id="Help">
#%env/templates/header.template%# #%env/templates/header.template%#
<h2>Help</h2> <h2>Tutorial</h2>
<p>
This is a distributed web crawler and also a caching HTTP proxy. You are using the <em>online-interface</em> of the application. You can use this interface to configure your personal settings, proxy settings, access control and crawling properties. You can also use this interface to start crawls, send messages to other peers and monitor your index, cache status and crawling processes. Most important, you can use the search page to search either your own or the <em>global</em> index. <p>
</p> You are using the administration interface of your own search engine. You can create your own search index with YaCy.
<p> To learn how to do that, watch one of the demonstration videos below:
For more detailed information, visit the <a href="http://www.yacy.net/">YaCy homepage</a>. </p>
</p>
<h3>Local and Global Search: Options and Functions</h3>
<p>
The proxy provides a search interface that accesses your local index, created from web pages that passed the proxy. The search can also be applied globally, by searching other peers. You can use the following options to enhance your search results:
</p>
<dl class="optionsAndFunctions">
<dt>Search Word List</dt>
<dd>
You can search for several words simultanous. Words must be separated by a single space.
The words are treated conjunctive, that means every must occur in the result, not any.
If you do a global search (see below) you may get different results each time you do a search.
</dd>
<dt>Maximum Number of Results</dt>
<dd>
You can select the number of wanted maximum links. We do not yet support multiple result pages for virtually any possible link.
Instead we encourage you to enhance the search result by submitting more search words.
</dd>
<dt>Result Order Options</dt>
<dd>
The search engine provides an experimental 'Quality' ranking. In contrast to other known search engines we provide also
a result order by date. If you change the order to 'Date-Quality' the most recently updated page from the search results is listed first.
For pages that have the same date the second order, 'Quality' is applied.
</dd>
<dt>Resource Domain</dt>
<dd>
This search engine is constructed to search the web pages that pass the proxy. But the search index is distributed to other peers as well,
so you can search also globally: this function is currently only rudimentary, but can be choosen for test cases. Future releases will
automatically distribute index information <em>before</em> a search happends to form a performant distributed hash table -- a very fast global search.
</dd>
<dt>Maximum Search Time</dt>
<dd>
Searching the local index is extremely fast, it happends within milliseconds, even for a large number (millions) of pages. But searching the
global index needs more time to find the correct remote peer that contains best search results. This is especially the case while the
distributed index is in test mode. Search results get more stable (repeated global search produce more similar results) the longer
the search time is.
</dd>
</dl>
<h4>Accesskeys</h4> <h3><img src="/env/grafics/flag_english_28x17.gif"> Demo from <a href="http://fscons.org/">FSCONS 2010</a>: Web Search By The People, For The People</a></h3>
<p> <iframe src="http://player.vimeo.com/video/32562148?portrait=0" width="720" height="540" frameborder="0"></iframe>
You may want to use accesskeys to navigate through the YaCy webinterface: <p>
</p> <a href="http://twitter.com/?status=FSCONS%202010:%20YaCy%20Demo%20http://vimeo.com/32562148%20%23YaCy">twitter this video</a>
<ul> Download from Vimeo: <a href="http://vimeo.com/32562148">FSCONS 2010: YaCy Demo</a>
<li>Windows and Internet Explorer: Alt + Accesskey + Enter</li> </p>
<li>Windows and Mozilla/Firefox/Netscape: Alt + Accesskey</li>
<li>Windows and Opera: Shift + Esc + Accesskey</li> <h3><img src="/env/grafics/flag_deutsch_28x17.gif"> Demo von <a href="http://dgd.de/ProgrammOberhof2011.aspx">26. Oberhofer Kolloquium 2011</a>: Freier Wissenszugang mit der Suchmaschine YaCy</a></h3>
<li>Macintosh and Internet Explorer: Strg + Accesskey + Enter</li> <iframe src="http://player.vimeo.com/video/32200946?portrait=0" width="720" height="540" frameborder="0"></iframe>
<li>Macintosh and Safari: Strg + Accesskey</li> <p>
<li>Macintosh and Mozilla/Firefox/Netscape: Strg + Accesskey</li> <a href="http://twitter.com/?status=Freie%20Suchmaschine%20Demo%20http://vimeo.com/32200946%20%23YaCy">twitter this video</a>
<li>Macintosh and Opera: Shift + Esc + Accesskey</li> Download from Vimeo: <a href="http://vimeo.com/32200946">26. Oberhofer Kolloquium 2011: YaCy Demo</a>
<li>Linux Mandrake and Galeon/Mozilla: Alt + Accesskey</li> </p>
<li>All OS and Amaya: Strg + Accesskey</li>
</ul> <h3>More Tutorials</h3>
<dl class="accesskeys"> <p>Please see the tutorials on <a href="http://yacy.net">http://yacy.net</a></p>
<dt>s</dt>
<dd>Search Page</dd>
<dt>n</dt>
<dd>News</dd>
<dt>w</dt>
<dd>Network</dd>
<dt>t</dt>
<dd>Status</dd>
</dl>
<h4>Regular Expressions</h4>
<p>YaCy uses Regular Expressions for some functions, for example in the blacklist.</p>
<p>There are some standards for these regexps, YaCy uses the syntax used by Perl 5.</p>
<p>Here ist a short overview about the functions, which should fir for most cases:</p>
<dl class="regexp">
<dt>.</dt>
<dd>arbitrary character</dd>
<dt>x</dt>
<dd>character x</dd>
<dt>[^x]</dt>
<dd>not x</dd>
<dt>x*</dt>
<dd>0 or more times x</dd>
<dt>x?</dt>
<dd>0 or 1 time x</dd>
<dt>x+</dt>
<dd>1 or more times x</dd>
<dt>xy</dt>
<dd>concatenation of x and y</dd>
<dt>x|y</dt>
<dd>x or y</dd>
<dt>(foo|bar)</dt>
<dd>String "foo" or string "bar"</dd>
<dt>[abc]</dt>
<dd>a or b or c (same as a|b|c)</dd>
<dt>[a-c]</dt>
<dd>a or b or c (same as above)</dd>
<dt>x{n}</dt>
<dd>exactly n appearances of x</dd>
<dt>x{n,}</dt>
<dd>at least n appearances of x</dd>
<dt>x{n,m}</dt>
<dd>at least n, maximum m appearanches of x</dd>
<dt>( )</dt>
<dd>Modify priority of instructions</dd>
<dt>\</dt>
<dd>Escape-Character, used to escape special characters (for example "[" or "*"), so that they loose their special meaning</dd>
</dl>
<p>
Regex follow a special priority (descending): concatenation, unary operators (*,+,^,{}), binary operators (|). This can be overridden with brackets.
</p>
<p><strong>Example:</strong></p>
<code>
.*heise.de/.*/[0-9]+
</code>
<p>
This matches heise.de/ with a string in front of it, for example "http://www.", followed by any string, then a slash and a number. The dot in "heise.de" is not escaped with "\", because it represents any character, thus the "." itself, too.
</p>
<p>
A possible URL which would match this regexp is: http://www.heise.de/newsticker/meldung/59421
</p>
<p>
An URL which would not match is: http://www.heise.de/tp/r4/artikel/20/20701/1.html
</p>
<p>
There is ".html" at the end, which is not included with the Regular Expression.
</p>
#%env/templates/footer.template%# #%env/templates/footer.template%#
</body> </body>
</html> </html>

@ -36,9 +36,9 @@ import de.anomic.server.servletProperties;
public class compare_yacy { public class compare_yacy {
private static final String defaultsearchL = "YaCy"; private static final String defaultsearchL = "YaCy";
private static final String defaultsearchR = "google.com"; private static final String defaultsearchR = "scroogle.org";
private static final String[] order = {defaultsearchL, "YaCy (local)", "bing.com", private static final String[] order = {defaultsearchL, "YaCy (local)", "bing.com",
"google.de", defaultsearchR, "scroogle.org", /*"google.de",*/ defaultsearchR, "scroogle.org",
"metager.de", "metager2.de (web)", "metager2.de (international)", "metager.de", "metager2.de (web)", "metager2.de (international)",
"yahoo.com", "romso.de", "search.live.com", "Wikipedia English", "Wikipedia Deutsch", "yahoo.com", "romso.de", "search.live.com", "Wikipedia English", "Wikipedia Deutsch",
"Sciencenet", "dbpedia", "wolfram alpha", "OAIster@OCLC", "oai.yacy.net"}; "Sciencenet", "dbpedia", "wolfram alpha", "OAIster@OCLC", "oai.yacy.net"};
@ -47,8 +47,8 @@ public class compare_yacy {
searchengines.put(defaultsearchL, "yacysearch.html?display=2&resource=global&query="); searchengines.put(defaultsearchL, "yacysearch.html?display=2&resource=global&query=");
searchengines.put("YaCy (local)", "yacysearch.html?display=2&resource=local&query="); searchengines.put("YaCy (local)", "yacysearch.html?display=2&resource=local&query=");
searchengines.put("bing.com", "http://www.bing.com/search?q="); searchengines.put("bing.com", "http://www.bing.com/search?q=");
searchengines.put("google.de", "http://www.google.de/#fp=1&q="); //searchengines.put("google.de", "http://www.google.de/#fp=1&q=");
searchengines.put("google.com", "http://www.google.com/#fp=1&q="); //searchengines.put("google.com", "http://www.google.com/#fp=1&q=");
searchengines.put("scroogle.org", "http://www.scroogle.org/cgi-bin/nbbw.cgi?Gw="); searchengines.put("scroogle.org", "http://www.scroogle.org/cgi-bin/nbbw.cgi?Gw=");
searchengines.put("metager.de", "http://www.metager.de/meta/cgi-bin/meta.ger1?eingabe="); searchengines.put("metager.de", "http://www.metager.de/meta/cgi-bin/meta.ger1?eingabe=");
searchengines.put("metager2.de (web)", "http://www.metager2.de/search.php?ses=web&q="); searchengines.put("metager2.de (web)", "http://www.metager2.de/search.php?ses=web&q=");
@ -73,7 +73,9 @@ public class compare_yacy {
prop.put("display", display); prop.put("display", display);
String default_left = sb.getConfig("compare_yacy.left", defaultsearchL); String default_left = sb.getConfig("compare_yacy.left", defaultsearchL);
if (!searchengines.containsKey(default_left)) default_left = defaultsearchL;
String default_right = sb.getConfig("compare_yacy.right", defaultsearchR); String default_right = sb.getConfig("compare_yacy.right", defaultsearchR);
if (!searchengines.containsKey(default_right)) default_right = defaultsearchR;
if (post != null) { if (post != null) {
if (searchengines.get(post.get("left", default_left)) != null) { if (searchengines.get(post.get("left", default_left)) != null) {
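
Review note: the two added `containsKey` checks guard against a stored `compare_yacy.left` / `compare_yacy.right` value that names an engine no longer registered (for example a saved `google.com` after its `searchengines.put` line was commented out). A minimal sketch of that fallback, assuming a plain `Map` in place of the servlet's engine table; the class `EngineDefaultSketch` and the helpers `engines` and `resolve` are hypothetical names for illustration and are not part of the commit:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the engine table and config lookup in compare_yacy.
final class EngineDefaultSketch {
    static final String DEFAULT_RIGHT = "scroogle.org";

    // A reduced engine table; the three entries are copied from the diff above.
    static Map<String, String> engines() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("YaCy", "yacysearch.html?display=2&resource=global&query=");
        m.put("bing.com", "http://www.bing.com/search?q=");
        m.put("scroogle.org", "http://www.scroogle.org/cgi-bin/nbbw.cgi?Gw=");
        return m;
    }

    // Return the configured engine name only if it is still registered,
    // otherwise fall back to the compiled-in default.
    static String resolve(Map<String, String> engines, String configured, String fallback) {
        return (configured != null && engines.containsKey(configured)) ? configured : fallback;
    }

    public static void main(String[] args) {
        Map<String, String> engines = engines();
        System.out.println(resolve(engines, "google.com", DEFAULT_RIGHT)); // stale value, falls back
        System.out.println(resolve(engines, "bing.com", DEFAULT_RIGHT));   // still registered, kept
    }
}
```

Validating at read time, rather than rewriting the stored setting, leaves existing configuration files untouched while still keeping the page from pointing at a missing engine.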

Binary file not shown (new file, 879 B).

Binary file not shown (new file, 1.2 KiB).

@ -12,7 +12,6 @@
<!--<li><a href="/yacy/ui/" accesskey="s" class="MenuItemLink">Rich Client Search</a></li>--> <!--<li><a href="/yacy/ui/" accesskey="s" class="MenuItemLink">Rich Client Search</a></li>-->
<li><a href="/compare_yacy.html?display=1" class="MenuItemLink">Compare Search</a></li> <li><a href="/compare_yacy.html?display=1" class="MenuItemLink">Compare Search</a></li>
<li><a href="/ConfigPortal.html" class="MenuItemLink">Search Integration</a></li> <li><a href="/ConfigPortal.html" class="MenuItemLink">Search Integration</a></li>
<li><a href="/Help.html" class="MenuItemLink">Help</a></li>
</ul> </ul>
</li> </li>
<li class="menugroup" id="menugroupGlobalIndex"> <li class="menugroup" id="menugroupGlobalIndex">
@ -47,6 +46,7 @@
<ul class="menu"> <ul class="menu">
<li><a href="/Status.html?noforward=" class="MenuItemLink">Admin Console</a></li> <li><a href="/Status.html?noforward=" class="MenuItemLink">Admin Console</a></li>
<li><a href="/Table_API_p.html" class="MenuItemLink lock">API Action Steering</a></li> <li><a href="/Table_API_p.html" class="MenuItemLink lock">API Action Steering</a></li>
<li><a href="/Help.html" class="MenuItemLink">Tutorial</a></li>
<li><a href="/Steering.html?restart=" class="MenuItemLink lock" onclick="return confirm('Confirm Restart')">Re-Start</a></li> <li><a href="/Steering.html?restart=" class="MenuItemLink lock" onclick="return confirm('Confirm Restart')">Re-Start</a></li>
<li><a href="/Steering.html?shutdown=" class="MenuItemLink lock" onclick="return confirm('Confirm Shutdown')">Shutdown</a></li> <li><a href="/Steering.html?shutdown=" class="MenuItemLink lock" onclick="return confirm('Confirm Shutdown')">Shutdown</a></li>
</ul> </ul>

@ -85,7 +85,16 @@ public final class search {
sb.remoteSearchLastAccess = System.currentTimeMillis(); sb.remoteSearchLastAccess = System.currentTimeMillis();
final serverObjects prop = new serverObjects(); final serverObjects prop = new serverObjects();
if ((post == null) || (env == null)) return prop; // set nice default values for error cases
prop.put("searchtime", "0");
prop.put("references", "");
prop.put("joincount", "0");
prop.put("linkcount", "0");
prop.put("links", "");
prop.put("indexcount", "");
prop.put("indexabstract", "");
if (post == null || env == null) return prop;
if (!Protocol.authentifyRequest(post, env)) return prop; if (!Protocol.authentifyRequest(post, env)) return prop;
final String client = header.get(HeaderFramework.CONNECTION_PROP_CLIENTIP); final String client = header.get(HeaderFramework.CONNECTION_PROP_CLIENTIP);
@ -101,7 +110,7 @@ public final class search {
// final String fwdep = post.get("fwdep", ""); // forward depth. if "0" then peer may NOT ask another peer for more results // final String fwdep = post.get("fwdep", ""); // forward depth. if "0" then peer may NOT ask another peer for more results
// final String fwden = post.get("fwden", ""); // forward deny, a list of seed hashes. They may NOT be target of forward hopping // final String fwden = post.get("fwden", ""); // forward deny, a list of seed hashes. They may NOT be target of forward hopping
final int count = Math.min((int) sb.getConfigLong(SwitchboardConstants.REMOTESEARCH_MAXCOUNT_DEFAULT, 100), post.getInt("count", 10)); // maximum number of wanted results final int count = Math.min((int) sb.getConfigLong(SwitchboardConstants.REMOTESEARCH_MAXCOUNT_DEFAULT, 100), post.getInt("count", 10)); // maximum number of wanted results
final long maxtime = Math.min((int) sb.getConfigLong(SwitchboardConstants.REMOTESEARCH_MAXTIME_DEFAULT, 3000), post.getLong("time", 3000)); // maximum number of wanted results final long maxtime = Math.min((int) sb.getConfigLong(SwitchboardConstants.REMOTESEARCH_MAXTIME_DEFAULT, 3000), post.getLong("time", 3000)); // maximum waiting time
final int maxdist= post.getInt("maxdist", Integer.MAX_VALUE); final int maxdist= post.getInt("maxdist", Integer.MAX_VALUE);
final String prefer = post.get("prefer", ""); final String prefer = post.get("prefer", "");
final String contentdom = post.get("contentdom", "text"); final String contentdom = post.get("contentdom", "text");
@ -117,8 +126,8 @@ public final class search {
language = (agent == null) ? "en" : ISO639.userAgentLanguageDetection(agent); language = (agent == null) ? "en" : ISO639.userAgentLanguageDetection(agent);
if (language == null) language = "en"; if (language == null) language = "en";
} }
final int partitions = post.getInt("partitions", 30); final int partitions = post.getInt("partitions", 30);
String profile = post.get("profile", ""); // remote profile hand-over String profile = post.get("profile", ""); // remote profile hand-over
if (profile.length() > 0) profile = crypt.simpleDecode(profile, null); if (profile.length() > 0) profile = crypt.simpleDecode(profile, null);
//final boolean includesnippet = post.get("includesnippet", "false").equals("true"); //final boolean includesnippet = post.get("includesnippet", "false").equals("true");
Bitfield constraint = ((post.containsKey("constraint")) && (post.get("constraint", "").length() > 0)) ? new Bitfield(4, post.get("constraint", "______")) : null; Bitfield constraint = ((post.containsKey("constraint")) && (post.get("constraint", "").length() > 0)) ? new Bitfield(4, post.get("constraint", "______")) : null;
@ -142,13 +151,8 @@ public final class search {
// http://localhost:8090/yacy/search.html?query=4galTpdpDM5Qgh8DKIhGKXws&abstracts=auto (search for linux and book, generate abstract automatically) // http://localhost:8090/yacy/search.html?query=4galTpdpDM5Qgh8DKIhGKXws&abstracts=auto (search for linux and book, generate abstract automatically)
// http://localhost:8090/yacy/search.html?query=&abstracts=4galTpdpDM5Q (only abstracts for linux) // http://localhost:8090/yacy/search.html?query=&abstracts=4galTpdpDM5Q (only abstracts for linux)
if ((sb.isRobinsonMode()) && if (sb.isRobinsonMode() && !sb.isPublicRobinson()) {
(!((sb.isPublicRobinson()) || // if we are a robinson cluster, answer only if this client is known by our network definition
(sb.isInMyCluster(header.get(HeaderFramework.CONNECTION_PROP_CLIENTIP)))))) {
// if we are a robinson cluster, answer only if this client is known by our network definition
prop.put("links", "");
prop.put("linkcount", "0");
prop.put("references", "");
return prop; return prop;
} }
@ -160,19 +164,19 @@ public final class search {
if (trackerHandles.tailSet(Long.valueOf(System.currentTimeMillis() - 3000)).size() > 1) { if (trackerHandles.tailSet(Long.valueOf(System.currentTimeMillis() - 3000)).size() > 1) {
block = true; block = true;
} }
}
if (!block) synchronized (trackerHandles) {
if (trackerHandles.tailSet(Long.valueOf(System.currentTimeMillis() - 60000)).size() > 12) { if (trackerHandles.tailSet(Long.valueOf(System.currentTimeMillis() - 60000)).size() > 12) {
block = true; block = true;
} }
}
if (!block) synchronized (trackerHandles) {
if (trackerHandles.tailSet(Long.valueOf(System.currentTimeMillis() - 600000)).size() > 36) { if (trackerHandles.tailSet(Long.valueOf(System.currentTimeMillis() - 600000)).size() > 36) {
block = true; block = true;
} }
} }
if (block && Domains.isLocal(client, null)) block = false; if (block && Domains.isLocal(client, null)) block = false; // check isLocal here to prevent dns lookup for client
if (block) { if (block) {
prop.put("links", "");
prop.put("linkcount", "0");
prop.put("references", "");
prop.put("searchtime", "0");
return prop; return prop;
} }
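
Review note: the blocking logic in this hunk counts recent accesses from one client in three time windows (more than 1 in 3 seconds, more than 12 in 60 seconds, more than 36 in 600 seconds) using `tailSet` on a sorted set of timestamps. The sketch below shows that check in isolation. It is a simplification: `AccessThrottleSketch` and `shouldBlock` are hypothetical names, the timestamps are held in a plain `TreeSet<Long>`, a single lock covers all three checks, and the local-client exemption (`Domains.isLocal`) is left out.

```java
import java.util.TreeSet;

// Hypothetical, simplified version of the three-window access check.
final class AccessThrottleSketch {

    // Returns true if the client exceeded any of the three limits.
    // accessTimes holds the millisecond timestamps of earlier requests.
    static boolean shouldBlock(TreeSet<Long> accessTimes, long now) {
        synchronized (accessTimes) {
            // tailSet(x) is the view of all timestamps >= x
            if (accessTimes.tailSet(now - 3000L).size() > 1) return true;    // 3 s window
            if (accessTimes.tailSet(now - 60000L).size() > 12) return true;  // 60 s window
            return accessTimes.tailSet(now - 600000L).size() > 36;           // 600 s window
        }
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        TreeSet<Long> times = new TreeSet<>();
        times.add(now - 500L);
        times.add(now - 1000L);
        // Two requests inside the last three seconds exceed the first limit.
        System.out.println(shouldBlock(times, now));
    }
}
```

Returning as soon as one window is exceeded mirrors the `if (!block)` guards added in the hunk: once a short window has tripped, the longer windows do not need to be examined.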

@ -207,16 +207,9 @@ public class yacysearch {
} }
// SEARCH // SEARCH
final boolean indexReceiveGranted = sb.getConfigBool(SwitchboardConstants.INDEX_RECEIVE_ALLOW, true) || final boolean clustersearch = sb.isRobinsonMode() && (sb.getConfig("cluster.mode", "").equals("privatecluster") || sb.getConfig("cluster.mode", "").equals("publiccluster"));
sb.getConfigBool(SwitchboardConstants.INDEX_RECEIVE_AUTODISABLED, true); final boolean indexReceiveGranted = sb.getConfigBool(SwitchboardConstants.INDEX_RECEIVE_ALLOW, true) || sb.getConfigBool(SwitchboardConstants.INDEX_RECEIVE_AUTODISABLED, true) || clustersearch;
global = global && indexReceiveGranted; // if the user does not want indexes from remote peers, it cannot be a global search global = global && indexReceiveGranted; // if the user does not want indexes from remote peers, it cannot be a global searchnn
final boolean clustersearch = sb.isRobinsonMode() &&
(sb.getConfig("cluster.mode", "").equals("privatecluster") ||
sb.getConfig("cluster.mode", "").equals("publiccluster"));
if (clustersearch) {
global = true;
} // switches search on, but search target is limited to cluster nodes
// increase search statistic counter // increase search statistic counter
if (!global) { if (!global) {
@ -542,9 +535,6 @@ public class yacysearch {
} }
} }
// prepare search properties
final boolean globalsearch = (global) && indexReceiveGranted;
// do the search // do the search
final HandleSet queryHashes = Word.words2hashesHandles(query[0]); final HandleSet queryHashes = Word.words2hashesHandles(query[0]);
final Pattern snippetPattern = QueryParams.stringSearchPattern(originalquerystring); final Pattern snippetPattern = QueryParams.stringSearchPattern(originalquerystring);
@ -584,8 +574,8 @@ public class yacysearch {
itemsPerPage, itemsPerPage,
offset, offset,
urlmask, urlmask,
(clustersearch && globalsearch) ? QueryParams.Searchdom.CLUSTER : clustersearch && global ? QueryParams.Searchdom.CLUSTER :
((globalsearch) ? QueryParams.Searchdom.GLOBAL : QueryParams.Searchdom.LOCAL), (global && indexReceiveGranted ? QueryParams.Searchdom.GLOBAL : QueryParams.Searchdom.LOCAL),
20, 20,
constraint, constraint,
true, true,
@ -715,7 +705,7 @@ public class yacysearch {
prop.put("num-results_itemscount", Formatter.number(0, true)); prop.put("num-results_itemscount", Formatter.number(0, true));
prop.put("num-results_itemsPerPage", itemsPerPage); prop.put("num-results_itemsPerPage", itemsPerPage);
prop.put("num-results_totalcount", Formatter.number(indexcount, true)); prop.put("num-results_totalcount", Formatter.number(indexcount, true));
prop.put("num-results_globalresults", (globalsearch) ? "1" : "0"); prop.put("num-results_globalresults", global && (indexReceiveGranted || clustersearch) ? "1" : "0");
prop.put("num-results_globalresults_localResourceSize", Formatter.number(theSearch.getRankingResult().getLocalIndexCount(), true)); prop.put("num-results_globalresults_localResourceSize", Formatter.number(theSearch.getRankingResult().getLocalIndexCount(), true));
prop.put("num-results_globalresults_localMissCount", Formatter.number(theSearch.getRankingResult().getMissCount(), true)); prop.put("num-results_globalresults_localMissCount", Formatter.number(theSearch.getRankingResult().getMissCount(), true));
prop.put("num-results_globalresults_remoteResourceSize", Formatter.number(theSearch.getRankingResult().getRemoteResourceSize(), true)); prop.put("num-results_globalresults_remoteResourceSize", Formatter.number(theSearch.getRankingResult().getRemoteResourceSize(), true));
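
Review note: after this change the search domain is derived from `global`, `indexReceiveGranted` and `clustersearch` directly, and the separate `globalsearch` flag is gone. The sketch below restates that decision as a pure function so the cases can be checked one by one. `SearchdomSketch`, `choose` and the boolean parameters are hypothetical; the enum only mirrors the three `QueryParams.Searchdom` values visible in the diff, and the sketch is based on the hunks shown here rather than the full file.

```java
// Hypothetical restatement of the search-domain decision in yacysearch.
final class SearchdomSketch {
    enum Searchdom { LOCAL, GLOBAL, CLUSTER }

    static Searchdom choose(boolean globalRequested,
                            boolean receiveAllow,
                            boolean receiveAutodisabled,
                            boolean clustersearch) {
        // Cluster mode now counts as permission to query remote peers.
        boolean indexReceiveGranted = receiveAllow || receiveAutodisabled || clustersearch;
        // A global search needs both the user's request and that permission.
        boolean global = globalRequested && indexReceiveGranted;
        if (clustersearch && global) return Searchdom.CLUSTER;
        return global ? Searchdom.GLOBAL : Searchdom.LOCAL;
    }

    public static void main(String[] args) {
        System.out.println(choose(true, false, false, true));   // cluster peer, global requested
        System.out.println(choose(true, true, false, false));   // ordinary global search
        System.out.println(choose(false, true, false, true));   // cluster peer, local requested
    }
}
```

One consequence visible in the sketch: the removed lines forced `global = true` whenever `clustersearch` was set, while the new expression leaves a locally requested search local even on a cluster peer. Whether that is intended is worth confirming with the author.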

@ -1168,78 +1168,11 @@ The scheduler on crawls can be changed or removed using the <a href=\"Table_API_
#File: Help.html #File: Help.html
#--------------------------- #---------------------------
YaCy: Help==YaCy: Hilfe YaCy: Help==YaCy: Hilfe
>Help==>Hilfe Tutorial==Anleitung
This is a distributed web crawler and also a caching HTTP proxy. You are using the <em>online-interface</em> of the application. You can use this interface to configure your personal settings, proxy settings, access control and crawling properties. You can also use this interface to start crawls, send messages to other peers and monitor your index, cache status and crawling processes. Most important, you can use the search page to search either your own or the <em>global</em> index.==YaCy ist eine Suchmaschine, die ähnlich dem Prinzip des verteilten Rechnens (distributed computing wie z.B. SETI@home) funktioniert. Das ganze heisst hier eher "verteiltes Durchsuchen und Indexieren" des Internets. Ausserdem bringt sie einen HTTP Proxy Server mit. Sie benutzen gerade das <em>Online-Interface</em> von YaCy. Sie können dieses Interface auch zum Konfigurieren Ihrer persönlichen Einstellungen, Proxy-Einstellungen, der Zugriffs-Kontrolle und den Crawl-Einstellungen benutzen. Sie können das Interface auch zum Starten von Crawls, Nachrichtenversand an andere YaCy-Nutzer und zur Überwachung Ihres Index, Cache-Status und des Crawl-Prozesses benutzen. Besonders wichtig ist, dass Sie die Suchseite benutzen können, um entweder den eigenen oder den <em>globalen</em> Index zu durchsuchen. You are using the administration interface of your own search engine==Sie benutzen gerade das Administrationsinterface ihrer eigenen Suchmaschine
For more detailed information, visit the==Für weitergehende Informationen besuchen Sie bitte die You can create your own search index with YaCy==Sie können mit YaCy Ihren eigenen Suchindex erstellen
YaCy homepage==YaCy-Homepage To learn how to do that, watch one of the demonstration videos below==Bitte sehen Sie als Anleitung eine Demonstration (2. Video unten in deutscher Sprache)
Local and Global Search: Options and Functions==Lokale und globale Suche: Optionen und Funktionen
The proxy provides a search interface that accesses your local index, created from web pages that passed the proxy.==Der Proxy bringt eine Suchmaschine mit, die auf Ihren lokalen Index zugreift. Dieser setzt sich aus den Webseiten zusammen, die Sie mit dem Proxy besuchen.
The search can also be applied globally, by searching other peers. You can use the following options to enhance your search results==Die Suche kann aber auch global ausgeweitet werden. Dazu werden die anderen Peers herangezogen und von ihnen Informationen abgefragt. Sie können allerdings auch folgende Optionen benutzen, um mehr Suchergebnisse zu erhalten
Search Word List==Suchwort Liste
You can search for several words simultanous. Words must be separated by a single space.==Sie können eine Suche mit mehreren Wörtern gleichzeitig starten. Die Wörter müssen durch ein Leerzeichen getrennt sein.
The words are treated conjunctive, that means every must occur in the result, not any.==
If you do a global search \(see below\) you may get different results each time you do a search.==Wenn Sie eine globale Suche (siehe unten) durchführen, kann es sein, dass Sie bei wiederholten Suchanfragen jedesmal andere Ergebnisse erhalten.
Maximum Number of Results==Maximale Anzahl an Ergebnissen
You can select the number of wanted maximum links. We do not yet support multiple result pages for virtually any possible link.==Sie können auswählen, wieviele Links Ihnen angezeigt werden sollen. Wir unterstützen im Moment keine mehrfachen Ergebnisseiten für jeden möglichen Link.
Instead we encourage you to enhance the search result by submitting more search words.==Stattdessen möchten wir Sie dazu auffordern, Ihre Suchergebnisse durch das Eingeben von mehreren Suchwörtern, zu vergrößern.
Result Order Options==Reihenfolge der Ergebnisse
The search engine provides an experimental 'Quality' ranking. In contrast to other known search engines we provide also==Die Suchmaschine bietet ein experimentelles 'Qualitäts' Ranking an. Im Gegensatz zu anderen üblichen Suchmaschinen bieten wir
a result order by date. If you change the order to 'Date-Quality' the most recently updated page from the search results is listed first.==auch die Sortierung nach Datum an. Wenn Sie die Reihenfolge in 'Datum-Qualität' ändern, werden die als letztes überarbeiteten Webseiten aus den Suchergebnissen ganz oben aufgelistet.
For pages that have the same date the second order, 'Quality' is applied.==Für Seiten die das gleiche Datum haben, wird dann nach der 'Qualität' geordnet.
Resource Domain==Quelle der Suche
This search engine is constructed to search the web pages that pass the proxy. But the search index is distributed to other peers as well,==Diese Suchmaschine ist so aufgebaut, dass sie die Seiten durchsucht, die mit dem Proxy besucht wurden. Aber der Suchindex ist auch auf andere Peers verteilt,
so you can search also globally: this function is currently only rudimentary, but can be choosen for test cases. Future releases will==so dass Sie auch global suchen können: diese Funktion ist veraltet, aber Sie können sie zum testen benutzen. Neue Versionen werden
automatically distribute index information <em>before</em> a search happends to form a performant distributed hash table -- a very fast global search.==automatisch einen verteilten Index erstellen <em>bevor</em> es geschieht, dass eine Suche, eine verteilte Hash-Tabelle erstellt -- eine sehr schnelle globale Suche.
Maximum Search Time==Maximale Zeit zum Suchen
Searching the local index is extremely fast, it happends within milliseconds, even for a large number \(millions\) of pages. But searching the==Den lokalen Index zu durchsuchen, geht extrem schnell. Es geschieht innerhalb von Millisekunden auch bei einer großen Anzahl (ein paar Millionen) von Seiten.
global index needs more time to find the correct remote peer that contains best search results. This is especially the case while the==Im Gegensatz dazu braucht es mehr Zeit den globalen Index zu durchsuchen, da erst der richtige Peer, der die besten Ergbnisse liefern kann, gefunden werden muss. Dies ist vermutlich auch der Grund warum
distributed index is in test mode. Search results get more stable \(repeated global search produce more similar results\) the longer==diese Art von Suchmaschine noch in der Testphase ist. Die Suchergebnisse werden immer besser (eine wiederholte globale Suche bringt mehr gleiche Ergebnisse), je länger
the search time is.==die Zeit zum Suchen ist.
Accesskeys<==Schnell-Zugriffstasten<
You may want to use accesskeys to navigate through the YaCy webinterface:==Sie können durch das YaCy Web Interface auch mit Hilfe von Tastenkombinationen navigieren. Drücken Sie dazu eine der folgenden Kombinationen ('Taste' steht für einen der drei unten genannten Buchstaben):
Windows and Internet Explorer: Alt \+ Accesskey \+ Enter==Windows und Internet Explorer: Alt + Taste + Enter
Windows and Mozilla/Firefox/Netscape: Alt \+ Accesskey==Windows und Mozilla/Firefox/Netscape: Alt + Taste
Windows and Opera: Shift \+ Esc \+ Accesskey==Windows und Opera: Shift + Esc + Taste
Macintosh and Internet Explorer: Strg \+ Accesskey \+ Enter==Macintosh und Internet Explorer: Str + Taste + Enter
Macintosh and Safari: Strg \+ Accesskey==Macintosh und Safari: Strg + Taste
Macintosh and Mozilla/Firefox/Netscape: Strg \+ Accesskey==Macintosh und Mozilla/Firefox/Netscape: Strg + Taste
Macintosh and Opera: Shift \+ Esc \+ Accesskey==Macintosh und Opera: Shift + Esc + Taste
Linux Mandrake and Galeon/Mozilla: Alt \+ Accesskey==Linux Mandrake und Galeon/Mozilla: Alt + Taste
All OS and Amaya: Strg \+ Accesskey==Alle OS Versionen und Amaya: Strg + Taste
Search Page==Suchseite
Network==Netzwerk
Status==Status
Regular Expressions<==Reguläre Ausdrücke<
YaCy uses Regular Expressions for some functions, for example in the blacklist.==YaCy benutzt reguläre Ausdrücke für einige Funktionen, z.B. die Blacklist.
There are some standards for these regexps, YaCy uses the syntax used by Perl 5.==Es gibt einige Standards für diese Ausdrücke, YaCy orientiert sich an der Syntax von Perl 5.
Here ist a short overview about the functions, which should fir for most cases\:<==Hier finden Sie einen kurzen Überblick über die Funktionen, die die meisten Anwendungsfälle abdecken sollten:<
arbitrary character==beliebiges Zeichen
character x==Zeichen x
not x==nicht x
0 or more times x==0 oder mehr Vorkommen von x
0 or 1 time x==0 oder 1 Vorkommen von x
1 or more times x==1 oder mehr Vorkommen von x
concatenation of x and y==Verkettung von x und y
x or y==x oder y
String "foo" or string "bar"==String "foo" oder String "bar"
a or b or c \(same as a\|b\|c\)==a oder b oder c (dasselbe wie a|b|c)
a or b or c \(same as above\)==a oder b oder c (dasselbe wie oben)
exactly n appearances of x==genau n Vorkommen von x
at least n appearances of x==mindestens n Vorkommen von x
at least n, maximum m appearanches of x==mindestens n, maximal m Vorkommen von m
Modify priority of instructions==Priorität der Befehle ändern
Escape-Character, used to escape special characters \(for example "\[" or "\*"\), so that they loose their special meaning==Escape-Zeichen, das benutzt wird, um Zeichen mit Sonderbedeutung (z.B. "[" oder "*"), zu "escapen", d.h. ihre Sonderbedeutung verlieren.
Regex follow a special priority \(descending\)\: concatenation, unary operators \(\*,\+,\^,\{\}\), binary operators \(\|\). This can be overridden with brackets.==Regex folgen Prioritäten (absteigend): Verkettung, unäre Operatore (*,+,^,{}), binäre Operatoren (|). Die Reihenfolge kann durch das Setzen von Klammern geändert werden.
Example:==Beispiel:
This matches heise.de/ with a string in front of it, for example "http://www.", followed by any string, then a slash and a number. The dot in "heise.de" is not escaped with "\\", because it represents any character, thus the "." itself, too.==Dies erfüllt heise.de/ mit einem String davor, z.B. "http://www.". Es folgt ein beliebiger String, dann ein Slash und eine Zahl. Der Punkt in "heise.de" ist nicht escaped, weil der Punkt für ein beliebiges Zeichen steht, folglich auch für "." selbst.
A possible URL which would match this regexp is\:==Eine URL, die den regulären Ausdruck erfüllen würde, ist:
An URL which would not match is:==Eine URL die diese Regex nicht erfüllen würde ist:
There is ".html" at the end, which is not included with the Regular Expression.==Hier ist ".html" am Ende angehängt, das nicht in der Regex enthalten ist.
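The regexp that the example text describes is not part of these locale lines. As a minimal sketch, assuming a pattern of the form `.*heise.de/.*/[0-9]+` (hypothetical, reconstructed from the description only) and assuming the whole URL must match, the behaviour can be tried with `java.util.regex`. The class and method names are illustrative and do not come from YaCy.

```java
import java.util.regex.Pattern;

public class BlacklistRegexExample {
    // Hypothetical pattern built from the description: any prefix, "heise.de/",
    // any string, a slash, then a number. The dot in "heise.de" is deliberately
    // left unescaped, so it matches any single character, including ".".
    static final Pattern PATTERN = Pattern.compile(".*heise.de/.*/[0-9]+");

    // Assumption: the entire URL has to match, not just a substring.
    static boolean matches(String url) {
        return PATTERN.matcher(url).matches();
    }

    public static void main(String[] args) {
        // Ends in a number after a slash: matches.
        System.out.println(matches("http://www.heise.de/newsticker/meldung/12345"));
        // Trailing ".html" is not covered by the pattern: no match.
        System.out.println(matches("http://www.heise.de/newsticker/meldung/12345.html"));
        // The unescaped dot also accepts another character in place of ".".
        System.out.println(matches("http://www.heiseXde/a/1"));
    }
}
```

The third call shows the side effect of the unescaped dot: writing `heise\.de` would restrict the match to a literal dot.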
#----------------------------- #-----------------------------
#File: index.html #File: index.html

@ -327,7 +327,7 @@ public class URIMetadataRow implements URIMetadata {
 assert (s.toString().indexOf(0) < 0);
 s.append(",flags=").append(flags().exportB64());
 assert (s.toString().indexOf(0) < 0);
-s.append(",lang=").append(language());
+s.append(",lang=").append(UTF8.String(language()));
 assert (s.toString().indexOf(0) < 0);
 s.append(",llocal=").append(llocal());
 assert (s.toString().indexOf(0) < 0);
@ -344,7 +344,8 @@ public class URIMetadataRow implements URIMetadata {
 if (this.word != null) {
 // append also word properties
-s.append(",wi=").append(Base64Order.enhancedCoder.encodeString(this.word.toPropertyForm()));
+final String wprop = this.word.toPropertyForm();
+s.append(",wi=").append(Base64Order.enhancedCoder.encodeString(wprop));
 }
 assert (s.toString().indexOf(0) < 0);
 return s;

@ -651,12 +651,14 @@ public final class Protocol {
 final boolean global,
 final int partitions,
 final String hostname,
-final String hostaddress,
+String hostaddress,
 final SearchEvent.SecondarySearchSuperviser secondarySearchSuperviser,
 final RankingProfile rankingProfile,
 final Bitfield constraint) throws IOException {
 // send a search request to peer with remote Hash
+//if (hostaddress.equals(mySeed.getClusterAddress())) hostaddress = "127.0.0.1:" + mySeed.getPort(); // for debugging
 // INPUT:
 // iam : complete seed of the requesting peer
 // youare : seed hash of the target peer, used for testing network stability

@ -115,7 +115,7 @@ public final class SearchEvent {
 this.IAneardhthash = null;
 this.localSearchThread = null;
 this.order = new ReferenceOrder(this.query.ranking, UTF8.getBytes(this.query.targetlang));
-final boolean remote = (this.query.domType == QueryParams.Searchdom.GLOBAL || this.query.domType == QueryParams.Searchdom.CLUSTER) && peers.sizeConnected() > 0 && peers.mySeed().getFlagAcceptRemoteIndex();
+final boolean remote = peers.sizeConnected() > 0 && (this.query.domType == QueryParams.Searchdom.CLUSTER || (this.query.domType == QueryParams.Searchdom.GLOBAL && peers.mySeed().getFlagAcceptRemoteIndex()));
 final long start = System.currentTimeMillis();
 if (remote) {
 // initialize a ranking process that is the target for data

@ -1,11 +0,0 @@
-cd `dirname $0`
-./startYACY.sh --gui &
-echo "****************** YaCy Web Crawler/Indexer & Search Engine *******************"
-echo "**** (C) by Michael Peter Christen, usage granted under the GPL Version 2 ****"
-echo "**** USE AT YOUR OWN RISK! Project home and releases: http://yacy.net/ ****"
-echo "** LOG of YaCy: DATA/LOG/yacy00.log (and yacy<xx>.log) **"
-echo "** STOP YaCy: execute stopYACY.sh and wait some seconds **"
-echo "** GET HELP for YaCy: see http://wiki.yacy.net and http://forum.yacy.de **"
-echo "*******************************************************************************"
-echo " >> YaCy started as daemon process. Administration at http://localhost:8090 <<"
-echo " You can close this window now, this will NOT shut down your YaCy peer."