- added new Network Configuration menu; it can be found under the basic settings

- new cluster functions will be available in this menu, but are currently not enabled
  because the corresponding interface methods are not ready yet
- moved remote crawl settings to the new Network Configuration menu
- moved DHT distribution/receive settings to the new Network Configuration menu
- adopted some string constants
- added cluster configuration settings to yacy.init


git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3589 6c8d7289-2bf4-0310-a012-ef5d649a1542
orbiter 18 years ago
parent c5c3ecc67e
commit 89c1511738

@ -0,0 +1,98 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>YaCy '#[clientname]#': Network Configuration</title>
#%env/templates/metas.template%#
</head>
<body id="ConfigNetwork">
#%env/templates/header.template%#
#%env/templates/submenuConfig.template%#
<h2>Network Configuration</h2>
#(commit)#
::<div class="commit">Accepted Changes.</div>
::<div class="error">Inapplicable Setting Combination:</div>
#(/commit)#
#(commitCrawlPlea)#::<div class="error">P2P operation can run without remote indexing, but runs better with remote indexing switched on. Please switch 'Accept Remote Crawl Requests' on.</div>#(/commitCrawlPlea)#
#(commitDHTIsRobinson)#::<div class="error">For P2P operation, at least DHT distribution or DHT receive (or both) must be set. You have thus defined a Robinson configuration.</div>#(/commitDHTIsRobinson)#
#(commitDHTNoGlobalSearch)#::<div class="error">Global search in a P2P configuration is only allowed if both index receive and index distribution are switched on. You have a P2P configuration, but are not allowed to search other peers.</div>#(/commitDHTNoGlobalSearch)#
#(commitRobinson)#::<div class="commit">For Robinson Mode, index distribution and index receive are switched off.</div>#(/commitRobinson)#
#(commitRobinsonWithRemoteIndexing)#::<div class="commit">This Robinson Mode switches remote indexing on, but limits targets to peers within the same cluster. Remote indexing requests from peers within the same cluster are accepted.</div>#(/commitRobinsonWithRemoteIndexing)#
#(commitRobinsonWithoutRemoteIndexing)#::<div class="commit">This Robinson Mode does not allow any remote indexing (neither requests nor acceptance of other requests).</div>#(/commitRobinsonWithoutRemoteIndexing)#
<p>
You can configure whether you want to participate in the global YaCy network or whether you want to have your
own separate search cluster with or without connection to the global network. You may also define
a completely independent search engine instance, without any data exchange between your peer and other
peers, which we call a 'Robinson' peer.
</p>
<form method="post" action="ConfigNetwork_p.html" enctype="multipart/form-data" accept-charset="UTF-8">
<fieldset>
<legend><input type="radio" name="network" id="p2p" value="p2p"#(p2p.checked)#:: checked="checked"#(/p2p.checked)# />Peer-to-Peer Mode</legend>
<dl>
<dt>Index Distribution <input type="checkbox" name="indexDistribute" #(indexDistributeChecked)#::checked="checked" #(/indexDistributeChecked)#/></dt>
<dd>
This enables automated, DHT-ruled Index Transmission to other peers.<br />
<input type="radio" value="on" name="indexDistributeWhileCrawling" #(indexDistributeWhileCrawling.on)#::checked="checked" #(/indexDistributeWhileCrawling.on)#/> enabled
/
<input type="radio" value="off" name="indexDistributeWhileCrawling" #(indexDistributeWhileCrawling.off)#::checked="checked" #(/indexDistributeWhileCrawling.off)#/> disabled
during crawling.
</dd>
<dt>Index Receive <input type="checkbox" name="indexReceive" #(indexReceiveChecked)#::checked="checked" #(/indexReceiveChecked)#/></dt>
<dd>
Accept remote Index Transmissions.<br />This works only if you are a senior peer. The DHT-rules do not work without this function.<br />
<input type="radio" value="on" name="indexReceiveBlockBlacklist" #(indexReceiveBlockBlacklistChecked.on)#::checked="checked" #(/indexReceiveBlockBlacklistChecked.on)#/> reject
/
<input type="radio" value="off" name="indexReceiveBlockBlacklist" #(indexReceiveBlockBlacklistChecked.off)#::checked="checked" #(/indexReceiveBlockBlacklistChecked.off)#/> accept
transmitted URLs that match your blacklist
</dd>
<dt>Accept Remote Crawl Requests <input type="checkbox" name="crawlResponse" #(crawlResponse)#::checked="checked" #(/crawlResponse)#/></dt>
<dd>
Perform web indexing upon request of another peer.<br />This works only if you are a senior peer.<br />
Load with a maximum of <input name="acceptCrawlLimit" type="text" size="3" maxlength="3" value="#[acceptCrawlLimit]#" /> pages per minute
</dd>
</dl>
</fieldset>
<fieldset>
<legend><input type="radio" name="network" id="robinson" value="robinson"#(robinson.checked)#:: checked="checked"#(/robinson.checked)# />Robinson Mode</legend>
<p class="help">
If your peer runs in 'Robinson Mode', you run YaCy as a search engine for your own search portal without data exchange with other peers.
There is no index receive and no index distribution between your peer and any other peer.
In case of Robinson-clustering there can be acceptance of remote crawl requests from peers of that cluster.
</p>
<dl>
<dt>Private Peer<input type="radio" value="privatepeer" name="cluster.mode" #(privatepeerChecked)#::checked="checked" #(/privatepeerChecked)#/></dt>
<dd>Your search engine will not contact any other peer, and will reject every request.
</dd>
<!-- not yet implemented
<dt>Private Cluster<input type="radio" value="privatecluster" name="cluster.mode" #(privateclusterChecked)#::checked="checked" #(/privateclusterChecked)#/></dt>
<dd>Your peer is part of a private cluster without public visibility.<br>
Index data is not distributed, but remote crawl requests are distributed and accepted from your cluster.<br>
Search requests are spread over all peers of the cluster, and answered from all peers of the cluster.
</dd>
<dt>Public Cluster<input type="radio" value="publiccluster" name="cluster.mode" #(publicclusterChecked)#::checked="checked" #(/publicclusterChecked)#/></dt>
<dd>Your peer is part of a public cluster within the YaCy network.<br>
Index data is not distributed, but remote crawl requests are distributed and accepted<br>
Search requests are spread over all peers of the cluster, and answered from all peers of the cluster.<br>
List of ip:port - addresses of the cluster: (comma-separated)<br>
<input type="text" name="cluster.peers.ipport" value="#[cluster.peers.ipport]#" size="80" maxlength="800" />
</dd>
-->
<dt>Public Peer<input type="radio" value="publicpeer" name="cluster.mode" #(publicpeerChecked)#::checked="checked" #(/publicpeerChecked)#/></dt>
<dd>You are visible to other peers and contact them to announce your presence.<br />
Your peer does not accept any outside index data, but responds to all remote search requests.<br />
List of .yacy or .yacyh domains of the cluster (comma-separated):<br />
<input type="text" name="cluster.peers.yacydomain" value="#[cluster.peers.yacydomain]#" size="80" maxlength="800" />
</dd>
<dt>Peer Tags</dt>
<dd>When you allow access from the YaCy network, your data is recognized using keywords.<br />
Please describe your search portal with some keywords (comma-separated).<br />
<input type="text" name="peertags" value="#[peertags]#" size="40" maxlength="80" />
</dd>
</dl>
</fieldset>
<input type="submit" name="save" value="Save" />
</form>
#%env/templates/footer.template%#
</body>
</html>

@ -0,0 +1,191 @@
// ConfigNetwork_p.java
// --------------------
// (C) 2007 by Michael Peter Christen; mc@yacy.net, Frankfurt a. M., Germany
// first published 20.04.2007 on http://yacy.net
//
// This is a part of YaCy, a peer-to-peer based web search engine
//
// $LastChangedDate: 2006-04-02 22:40:07 +0200 (So, 02 Apr 2006) $
// $LastChangedRevision: 1986 $
// $LastChangedBy: orbiter $
//
// LICENSE
//
// This program is free software; you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation; either version 2 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with this program; if not, write to the Free Software
// Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
import de.anomic.http.httpHeader;
import de.anomic.plasma.plasmaSwitchboard;
import de.anomic.server.serverCodings;
import de.anomic.server.serverObjects;
import de.anomic.server.serverSwitch;
import de.anomic.server.serverThread;
import de.anomic.yacy.yacyCore;
public class ConfigNetwork_p {
public static serverObjects respond(httpHeader header, serverObjects post, serverSwitch env) {
plasmaSwitchboard sb = (plasmaSwitchboard) env;
serverObjects prop = new serverObjects();
int commit = 0;
if (post != null) {
boolean crawlResponse = post.get("crawlResponse", "off").equals("on");
// DHT control
boolean indexDistribute = post.get("indexDistribute", "").equals("on");
boolean indexReceive = post.get("indexReceive", "").equals("on");
boolean robinsonmode = post.get("network", "").equals("robinson");
String clustermode = post.get("cluster.mode", "publicpeer");
if (robinsonmode) {
indexDistribute = false;
indexReceive = false;
if ((clustermode.equals("privatepeer")) || (clustermode.equals("publicpeer"))) {
prop.put("commitRobinsonWithoutRemoteIndexing", 1);
crawlResponse = false;
}
if ((clustermode.equals("privatecluster")) || (clustermode.equals("publiccluster"))) {
prop.put("commitRobinsonWithRemoteIndexing", 1);
crawlResponse = true;
}
commit = 1;
} else {
if (!indexDistribute && !indexReceive) {
prop.put("commitDHTIsRobinson", 1);
commit = 2;
} else if (indexDistribute && indexReceive) {
commit = 1;
} else {
prop.put("commitDHTNoGlobalSearch", 1);
commit = 1;
}
if (!crawlResponse) {
prop.put("commitCrawlPlea", 1);
}
}
if (indexDistribute) {
sb.setConfig(plasmaSwitchboard.INDEX_DIST_ALLOW, "true");
} else {
sb.setConfig(plasmaSwitchboard.INDEX_DIST_ALLOW, "false");
}
if (post.get("indexDistributeWhileCrawling","").equals("on")) {
sb.setConfig(plasmaSwitchboard.INDEX_DIST_ALLOW_WHILE_CRAWLING, "true");
} else {
sb.setConfig(plasmaSwitchboard.INDEX_DIST_ALLOW_WHILE_CRAWLING, "false");
}
if (indexReceive) {
sb.setConfig("allowReceiveIndex", "true");
yacyCore.seedDB.mySeed.setFlagAcceptRemoteIndex(true);
} else {
sb.setConfig("allowReceiveIndex", "false");
yacyCore.seedDB.mySeed.setFlagAcceptRemoteIndex(false);
}
if (post.get("indexReceiveBlockBlacklist", "").equals("on")) {
sb.setConfig("indexReceiveBlockBlacklist", "true");
} else {
sb.setConfig("indexReceiveBlockBlacklist", "false");
}
if (post.containsKey("peertags")) {
yacyCore.seedDB.mySeed.setPeerTags(serverCodings.string2set(normalizedList((String) post.get("peertags")), ","));
}
sb.setConfig("cluster.mode", post.get("cluster.mode", "publicpeer"));
// read remote crawl request settings
sb.setConfig("crawlResponse", (crawlResponse) ? "true" : "false");
int newppm = Math.max(1, Integer.parseInt(post.get("acceptCrawlLimit", "1")));
long newBusySleep = Math.max(100, 60000 / newppm);
serverThread rct = sb.getThread(plasmaSwitchboard.CRAWLJOB_REMOTE_TRIGGERED_CRAWL);
rct.setBusySleep(newBusySleep);
sb.setConfig(plasmaSwitchboard.CRAWLJOB_REMOTE_TRIGGERED_CRAWL_BUSYSLEEP, Long.toString(newBusySleep));
sb.setConfig("cluster.peers.ipport", checkIPPortList(post.get("cluster.peers.ipport", "")));
sb.setConfig("cluster.peers.yacydomain", checkYaCyDomainList(post.get("cluster.peers.yacydomain", "")));
}
// write answer code
prop.put("commit", commit);
// write remote crawl request settings
prop.put("crawlResponse", sb.getConfigBool("crawlResponse", false) ? 1 : 0);
long RTCbusySleep = Math.max(100, Integer.parseInt(env.getConfig(plasmaSwitchboard.CRAWLJOB_REMOTE_TRIGGERED_CRAWL_BUSYSLEEP, "100"))); // clamp to the 100 ms minimum so the division below cannot divide by zero
int RTCppm = (int) (60000L / RTCbusySleep);
prop.put("acceptCrawlLimit", RTCppm);
boolean indexDistribute = sb.getConfig(plasmaSwitchboard.INDEX_DIST_ALLOW, "true").equals("true");
boolean indexReceive = sb.getConfig("allowReceiveIndex", "true").equals("true");
prop.put("indexDistributeChecked", (indexDistribute) ? 1 : 0);
prop.put("indexDistributeWhileCrawling.on", (sb.getConfig(plasmaSwitchboard.INDEX_DIST_ALLOW_WHILE_CRAWLING, "true").equals("true")) ? 1 : 0);
prop.put("indexDistributeWhileCrawling.off", (sb.getConfig(plasmaSwitchboard.INDEX_DIST_ALLOW_WHILE_CRAWLING, "true").equals("true")) ? 0 : 1);
prop.put("indexReceiveChecked", (indexReceive) ? 1 : 0);
prop.put("indexReceiveBlockBlacklistChecked.on", (sb.getConfig("indexReceiveBlockBlacklist", "true").equals("true")) ? 1 : 0);
prop.put("indexReceiveBlockBlacklistChecked.off", (sb.getConfig("indexReceiveBlockBlacklist", "true").equals("true")) ? 0 : 1);
prop.put("peertags", serverCodings.set2string(yacyCore.seedDB.mySeed.getPeerTags(), ",", false));
// set seed information directly
yacyCore.seedDB.mySeed.setFlagAcceptRemoteCrawl(sb.getConfigBool("crawlResponse", false));
yacyCore.seedDB.mySeed.setFlagAcceptRemoteIndex(indexReceive);
// set p2p/robinson mode flags and values
prop.put("p2p.checked", (indexDistribute || indexReceive) ? 1 : 0);
prop.put("robinson.checked", (indexDistribute || indexReceive) ? 0 : 1);
prop.put("cluster.peers.ipport", sb.getConfig("cluster.peers.ipport", ""));
prop.put("cluster.peers.yacydomain", sb.getConfig("cluster.peers.yacydomain", ""));
// set p2p mode flags
prop.put("privatepeerChecked", (sb.getConfig("cluster.mode", "").equals("privatepeer")) ? 1 : 0);
prop.put("privateclusterChecked", (sb.getConfig("cluster.mode", "").equals("privatecluster")) ? 1 : 0);
prop.put("publicclusterChecked", (sb.getConfig("cluster.mode", "").equals("publiccluster")) ? 1 : 0);
prop.put("publicpeerChecked", (sb.getConfig("cluster.mode", "").equals("publicpeer")) ? 1 : 0);
return prop;
}
public static String normalizedList(String input) {
// treat blanks and semicolons as separators, then collapse runs of commas
input = input.replace(' ', ',');
input = input.replace(';', ',');
input = input.replaceAll(",+", ",");
// strip a leading and a trailing separator
if (input.startsWith(",")) input = input.substring(1);
if (input.endsWith(",")) input = input.substring(0, input.length() - 1);
return input;
}
public static String checkYaCyDomainList(String input) {
input = normalizedList(input);
String[] s = input.split(",");
input = "";
for (int i = 0; i < s.length; i++) {
if ((s[i].endsWith(".yacyh")) || (s[i].endsWith(".yacy"))) input += "," + s[i];
}
if (input.length() == 0) return input; else return input.substring(1);
}
public static String checkIPPortList(String input) {
input = normalizedList(input);
String[] s = input.split(",");
input = "";
for (int i = 0; i < s.length; i++) {
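// the shortest dotted-quad address ("1.1.1.1") has 7 characters, so the colon cannot appear before index 7
if (s[i].indexOf(':') >= 7) input += "," + s[i];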
}
if (input.length() == 0) return input; else return input.substring(1);
}
}
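
The remote crawl limit in ConfigNetwork_p.java is stored as a busy-sleep time in milliseconds and converted back to pages per minute for display. The following is a minimal, self-contained sketch of that arithmetic only; the class `RemoteCrawlRate` and its method names are hypothetical and not part of YaCy, and the zero guard in `toPagesPerMinute` is an assumption added for safety.

```java
// Hypothetical helper (not part of YaCy): mirrors the ppm <-> busy-sleep
// arithmetic used in ConfigNetwork_p.respond().
final class RemoteCrawlRate {

    // Lower bound for the busy sleep, as in the diff (100 ms).
    static final long MIN_BUSY_SLEEP_MS = 100L;

    // Pages per minute -> milliseconds to sleep between remote crawl steps.
    static long toBusySleep(int pagesPerMinute) {
        int ppm = Math.max(1, pagesPerMinute);            // at least 1 page per minute
        return Math.max(MIN_BUSY_SLEEP_MS, 60000L / ppm); // never below 100 ms
    }

    // Stored busy sleep -> pages per minute shown in the form.
    static int toPagesPerMinute(long busySleepMs) {
        long sleep = Math.max(1L, busySleepMs);           // assumption: guard against 0
        return (int) (60000L / sleep);
    }

    public static void main(String[] args) {
        System.out.println(toBusySleep(30));                    // 2000
        System.out.println(toBusySleep(999));                   // 100 (clamped)
        System.out.println(toPagesPerMinute(toBusySleep(999))); // 600
    }
}
```

Because of the 100 ms floor, any value above 600 entered in the three-digit `acceptCrawlLimit` field is shown as 600 the next time the page is rendered.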

@ -4,7 +4,7 @@
<title>YaCy '#[clientname]#': Local robots.txt</title> <title>YaCy '#[clientname]#': Local robots.txt</title>
#%env/templates/metas.template%# #%env/templates/metas.template%#
</head> </head>
<body id="ConfigProfile"> <body id="ConfigRobotsTxt">
#%env/templates/header.template%# #%env/templates/header.template%#
#%env/templates/submenuConfig.template%# #%env/templates/submenuConfig.template%#
<h2>Exclude Web-Spiders</h2> <h2>Exclude Web-Spiders</h2>

@ -195,7 +195,7 @@ public class DetailedSearch {
boolean global = post.get("global", "").equals("on"); boolean global = post.get("global", "").equals("on");
boolean postsort = post.get("postsort", "").equals("on"); boolean postsort = post.get("postsort", "").equals("on");
final boolean indexDistributeGranted = sb.getConfig("allowDistributeIndex", "true").equals("true"); final boolean indexDistributeGranted = sb.getConfig(plasmaSwitchboard.INDEX_DIST_ALLOW, "true").equals("true");
final boolean indexReceiveGranted = sb.getConfig("allowReceiveIndex", "true").equals("true"); final boolean indexReceiveGranted = sb.getConfig("allowReceiveIndex", "true").equals("true");
if (!indexDistributeGranted || !indexReceiveGranted) { global = false; } if (!indexDistributeGranted || !indexReceiveGranted) { global = false; }

@ -62,59 +62,6 @@
</td> </td>
</tr> </tr>
</table> </table>
<p>
<strong>DHT Transmission control:</strong><br />
The transmission is necessary for the functionality of global search on other peers.
If you switch off distribution or receipt of RWIs you will be banned from global search.
</p>
<table border="0" cellpadding="5" cellspacing="0">
<colgroup>
<col width="100" />
<col span="2"/>
</colgroup>
<tr valign="top" class="TableCellDark">
<td>Index&nbsp;Distribution:</td>
<td><input type="checkbox" name="indexDistribute" #(indexDistributeChecked)#::checked="checked" #(/indexDistributeChecked)#/></td>
<td></td>
<td>This enables automated, DHT-ruled Index Transmission to other peers.
</td>
</tr>
<tr valign="top" class="TableCellDark">
<td></td>
<td>&nbsp;&nbsp;</td>
<td><input type="checkbox" name="indexDistributeWhileCrawling" #(indexDistributeWhileCrawling)#::checked="checked" #(/indexDistributeWhileCrawling)#/></td>
<td>If checked, DHT-Transmission is enabled even during crawling.</td>
</tr>
<tr valign="top" class="TableCellDark">
<td>Index Receive:</td>
<td><input type="checkbox" name="indexReceive" #(indexReceiveChecked)#::checked="checked" #(/indexReceiveChecked)#::/></td>
<td></td>
<td>Accept remote Index Transmissions. This works only if you are a senior peer.
The DHT-rules do not work without this function.</td>
</tr>
<tr valign="top" class="TableCellDark">
<td></td>
<td>&nbsp;&nbsp;</td>
<td><input type="checkbox" name="indexReceiveBlockBlacklist" #(indexReceiveBlockBlacklistChecked)#::checked="checked" #(/indexReceiveBlockBlacklistChecked)#/></td>
<td>If checked, your peer silently ignores transmitted URLs that match your blacklist</td>
</tr>
<tr valign="top" class="TableCellDark">
<td>Peer Tags:</td>
<td colspan="2"><input type="text" name="peertags" value="#[peertags]#" size="40" maxlength="80" /></td>
<td>If your peer runs in 'Robinson Mode' (Distribution and Receive off), you probably run YaCy as a search engine
for your own search portal. Please describe your search portal with some keywords (comma-separated).
This will help to use your peer as search target even if you do not distribute your web index by
DHT distribution.</td>
</tr>
<tr valign="top" class="TableCellLight">
<td></td>
<td><input type="submit" name="setIndexTransmission" value="set" /></td>
<td></td>
<td>Changes will take effect immediately</td>
</tr>
</table>
</form> </form>
#(keyhashsimilar)#::Sequential List of Word-Hashes:<br /> #(keyhashsimilar)#::Sequential List of Word-Hashes:<br />
#{rows}# #{rows}#

@ -70,7 +70,6 @@ import de.anomic.net.URL;
import de.anomic.plasma.plasmaCondenser; import de.anomic.plasma.plasmaCondenser;
import de.anomic.plasma.plasmaSwitchboard; import de.anomic.plasma.plasmaSwitchboard;
import de.anomic.plasma.urlPattern.plasmaURLPattern; import de.anomic.plasma.urlPattern.plasmaURLPattern;
import de.anomic.server.serverCodings;
import de.anomic.server.serverObjects; import de.anomic.server.serverObjects;
import de.anomic.server.serverSwitch; import de.anomic.server.serverSwitch;
import de.anomic.yacy.yacyClient; import de.anomic.yacy.yacyClient;
@ -94,11 +93,6 @@ public class IndexControl_p {
prop.put("wcount", Integer.toString(switchboard.wordIndex.size())); prop.put("wcount", Integer.toString(switchboard.wordIndex.size()));
prop.put("ucount", Integer.toString(switchboard.wordIndex.loadedURL.size())); prop.put("ucount", Integer.toString(switchboard.wordIndex.loadedURL.size()));
prop.put("otherHosts", ""); prop.put("otherHosts", "");
prop.put("indexDistributeChecked", (switchboard.getConfig("allowDistributeIndex", "true").equals("true")) ? 1 : 0);
prop.put("indexDistributeWhileCrawling", (switchboard.getConfig("allowDistributeIndexWhileCrawling", "true").equals("true")) ? 1 : 0);
prop.put("indexReceiveChecked", (switchboard.getConfig("allowReceiveIndex", "true").equals("true")) ? 1 : 0);
prop.put("indexReceiveBlockBlacklistChecked", (switchboard.getConfig("indexReceiveBlockBlacklist", "true").equals("true")) ? 1 : 0);
prop.put("peertags", serverCodings.set2string(yacyCore.seedDB.mySeed.getPeerTags(), ",", false));
listHosts(prop, ""); listHosts(prop, "");
return prop; // be safe return prop; // be safe
} }
@ -122,40 +116,6 @@ public class IndexControl_p {
String[] urlx = post.getAll("urlhx.*"); String[] urlx = post.getAll("urlhx.*");
boolean delurl = post.containsKey("delurl"); boolean delurl = post.containsKey("delurl");
boolean delurlref = post.containsKey("delurlref"); boolean delurlref = post.containsKey("delurlref");
// System.out.println("DEBUG CHECK: " + ((delurl) ? "delurl" : "") + " " + ((delurlref) ? "delurlref" : ""));
// DHT control
if (post.containsKey("setIndexTransmission")) {
if (post.get("indexDistribute", "").equals("on")) {
switchboard.setConfig("allowDistributeIndex", "true");
} else {
switchboard.setConfig("allowDistributeIndex", "false");
}
if (post.get("indexDistributeWhileCrawling","").equals("on")) {
switchboard.setConfig("allowDistributeIndexWhileCrawling", "true");
} else {
switchboard.setConfig("allowDistributeIndexWhileCrawling", "false");
}
if (post.get("indexReceive", "").equals("on")) {
switchboard.setConfig("allowReceiveIndex", "true");
yacyCore.seedDB.mySeed.setFlagAcceptRemoteIndex(true);
} else {
switchboard.setConfig("allowReceiveIndex", "false");
yacyCore.seedDB.mySeed.setFlagAcceptRemoteIndex(false);
}
if (post.get("indexReceiveBlockBlacklist", "").equals("on")) {
switchboard.setConfig("indexReceiveBlockBlacklist", "true");
} else {
switchboard.setConfig("indexReceiveBlockBlacklist", "false");
}
if (post.containsKey("peertags")) {
yacyCore.seedDB.mySeed.setPeerTags(serverCodings.string2set((String) post.get("peertags"), ","));
}
}
// delete word // delete word
if (post.containsKey("keyhashdeleteall")) { if (post.containsKey("keyhashdeleteall")) {
@ -455,11 +415,6 @@ public class IndexControl_p {
// insert constants // insert constants
prop.put("wcount", Integer.toString(switchboard.wordIndex.size())); prop.put("wcount", Integer.toString(switchboard.wordIndex.size()));
prop.put("ucount", Integer.toString(switchboard.wordIndex.loadedURL.size())); prop.put("ucount", Integer.toString(switchboard.wordIndex.loadedURL.size()));
prop.put("indexDistributeChecked", (switchboard.getConfig("allowDistributeIndex", "true").equals("true")) ? 1 : 0);
prop.put("indexDistributeWhileCrawling", (switchboard.getConfig("allowDistributeIndexWhileCrawling", "true").equals("true")) ? 1 : 0);
prop.put("indexReceiveChecked", (switchboard.getConfig("allowReceiveIndex", "true").equals("true")) ? 1 : 0);
prop.put("indexReceiveBlockBlacklistChecked", (switchboard.getConfig("indexReceiveBlockBlacklist", "true").equals("true")) ? 1 : 0);
prop.put("peertags", serverCodings.set2string(yacyCore.seedDB.mySeed.getPeerTags(), ",", false));
// return rewrite properties // return rewrite properties
return prop; return prop;
} }

@ -219,52 +219,6 @@
</table> </table>
</form> </form>
<form action="IndexCreate_p.html" method="post" enctype="multipart/form-data">
<p id="distributedIndexing">
<strong>Distributed Indexing: </strong>
Crawling and indexing can be done by remote peers.
Your peer can search and index for other peers and they can search for you.
</p>
<table border="0" cellpadding="5" cellspacing="1">
<colgroup>
<col width="10%" />
<col />
</colgroup>
<tr valign="top" class="TableCellDark">
<td>
<input type="radio" name="dcr" id="acceptCrawlMax" value="acceptCrawlMax" #(acceptCrawlMaxChecked)#::checked="checked"#(/acceptCrawlMaxChecked)# />
</td>
<td><label for="acceptCrawlMax">Accept remote crawling requests and perform crawl at maximum load</label></td>
</tr>
<tr valign="top" class="TableCelllight">
<td>
<input type="radio" name="dcr" id="acceptCrawlLimited" value="acceptCrawlLimited" #(acceptCrawlLimitedChecked)#::checked="checked"#(/acceptCrawlLimitedChecked)# />
</td>
<td>
<label for="acceptCrawlLimited">Accept remote crawling requests and perform crawl at maximum of</label>
<input name="acceptCrawlLimit" type="text" size="4" maxlength="4" value="#[PPM]#" /> Pages Per Minute (minimum is 1, low system load usually at PPM &ge; 30)
</td>
</tr>
<tr valign="top" class="TableCellDark">
<td>
<input type="radio" name="dcr" id="acceptCrawlDenied" value="acceptCrawlDenied" #(acceptCrawlDeniedChecked)#::checked="checked"#(/acceptCrawlDeniedChecked)# />
</td>
<td>
<label for="acceptCrawlDenied">Do not accept remote crawling requests (please set this only if
you cannot accept to crawl only one page per minute; see option above)</label>
</td>
</tr>
<tr valign="top" class="TableCellLight">
<td>
<input type="submit" name="distributedcrawling" value="set" />
</td>
<td>
</td>
</tr>
</table>
</form>
<p> <p>
#(info)# #(info)#
:: ::

@ -52,7 +52,6 @@ import de.anomic.plasma.plasmaURL;
import de.anomic.plasma.plasmaSwitchboard; import de.anomic.plasma.plasmaSwitchboard;
import de.anomic.server.serverObjects; import de.anomic.server.serverObjects;
import de.anomic.server.serverSwitch; import de.anomic.server.serverSwitch;
import de.anomic.server.serverThread;
import de.anomic.yacy.yacyCore; import de.anomic.yacy.yacyCore;
import de.anomic.yacy.yacyNewsPool; import de.anomic.yacy.yacyNewsPool;
import de.anomic.yacy.yacyNewsRecord; import de.anomic.yacy.yacyNewsRecord;
@ -69,26 +68,6 @@ public class IndexCreate_p {
prop.put("refreshbutton", 0); prop.put("refreshbutton", 0);
if (post != null) { if (post != null) {
if (post.containsKey("distributedcrawling")) {
long newBusySleep = Integer.parseInt(env.getConfig(plasmaSwitchboard.CRAWLJOB_REMOTE_TRIGGERED_CRAWL_BUSYSLEEP, "100"));
if (post.get("dcr", "").equals("acceptCrawlMax")) {
env.setConfig("crawlResponse", "true");
newBusySleep = 100;
} else if (post.get("dcr", "").equals("acceptCrawlLimited")) {
env.setConfig("crawlResponse", "true");
int newppm = Integer.parseInt(post.get("acceptCrawlLimit", "1"));
if (newppm < 1) newppm = 1;
newBusySleep = 60000 / newppm;
if (newBusySleep < 100) newBusySleep = 100;
} else if (post.get("dcr", "").equals("acceptCrawlDenied")) {
env.setConfig("crawlResponse", "false");
}
serverThread rct = switchboard.getThread(plasmaSwitchboard.CRAWLJOB_REMOTE_TRIGGERED_CRAWL);
rct.setBusySleep(newBusySleep);
env.setConfig(plasmaSwitchboard.CRAWLJOB_REMOTE_TRIGGERED_CRAWL_BUSYSLEEP, Long.toString(newBusySleep));
//boolean crawlResponse = ((String) post.get("acceptCrawlMax", "")).equals("on");
//env.setConfig("crawlResponse", (crawlResponse) ? "true" : "false");
}
if (post.containsKey("pausecrawlqueue")) { if (post.containsKey("pausecrawlqueue")) {
switchboard.pauseCrawlJob(plasmaSwitchboard.CRAWLJOB_LOCAL_CRAWL); switchboard.pauseCrawlJob(plasmaSwitchboard.CRAWLJOB_LOCAL_CRAWL);
@ -152,30 +131,6 @@ public class IndexCreate_p {
prop.put("crawlingSpeedMinChecked", (LCppm <= 10) ? 1 : 0); prop.put("crawlingSpeedMinChecked", (LCppm <= 10) ? 1 : 0);
prop.put("customPPMdefault", ((LCppm > 10) && (LCppm < 1000)) ? Integer.toString(LCppm) : ""); prop.put("customPPMdefault", ((LCppm > 10) && (LCppm < 1000)) ? Integer.toString(LCppm) : "");
long RTCbusySleep = Integer.parseInt(env.getConfig(plasmaSwitchboard.CRAWLJOB_REMOTE_TRIGGERED_CRAWL_BUSYSLEEP, "100"));
if (RTCbusySleep < 100) {
RTCbusySleep = 100;
env.setConfig(plasmaSwitchboard.CRAWLJOB_REMOTE_TRIGGERED_CRAWL_BUSYSLEEP, Long.toString(RTCbusySleep));
}
if (env.getConfig("crawlResponse", "").equals("true")) {
if (RTCbusySleep <= 100) {
prop.put("acceptCrawlMaxChecked", 1);
prop.put("acceptCrawlLimitedChecked", 0);
prop.put("acceptCrawlDeniedChecked", 0);
} else {
prop.put("acceptCrawlMaxChecked", 0);
prop.put("acceptCrawlLimitedChecked", 1);
prop.put("acceptCrawlDeniedChecked", 0);
}
} else {
prop.put("acceptCrawlMaxChecked", 0);
prop.put("acceptCrawlLimitedChecked", 0);
prop.put("acceptCrawlDeniedChecked", 1);
}
int RTCppm = (RTCbusySleep == 0) ? 60 : (int) (60000L / RTCbusySleep);
if (RTCppm > 60) RTCppm = 60;
prop.put("PPM", RTCppm);
prop.put("xsstopwChecked", env.getConfig("xsstopw", "").equals("true") ? 1 : 0); prop.put("xsstopwChecked", env.getConfig("xsstopw", "").equals("true") ? 1 : 0);
prop.put("xdstopwChecked", env.getConfig("xdstopw", "").equals("true") ? 1 : 0); prop.put("xdstopwChecked", env.getConfig("xdstopw", "").equals("true") ? 1 : 0);
prop.put("xpstopwChecked", env.getConfig("xpstopw", "").equals("true") ? 1 : 0); prop.put("xpstopwChecked", env.getConfig("xpstopw", "").equals("true") ? 1 : 0);

@ -71,7 +71,7 @@ public class IndexShare_p {
} }
if (post.containsKey("indexsharesetting")) { if (post.containsKey("indexsharesetting")) {
switchboard.setConfig("allowDistributeIndex", (post.containsKey("distribute")) ? "true" : "false"); switchboard.setConfig(plasmaSwitchboard.INDEX_DIST_ALLOW, (post.containsKey("distribute")) ? "true" : "false");
switchboard.setConfig("allowReceiveIndex", (post.containsKey("receive")) ? "true" : "false"); switchboard.setConfig("allowReceiveIndex", (post.containsKey("receive")) ? "true" : "false");
switchboard.setConfig("defaultLinkReceiveFrequency", post.get("linkfreq", "30")); switchboard.setConfig("defaultLinkReceiveFrequency", post.get("linkfreq", "30"));
switchboard.setConfig("defaultWordReceiveFrequency", post.get("wordfreq", "10")); switchboard.setConfig("defaultWordReceiveFrequency", post.get("wordfreq", "10"));

@ -3,6 +3,7 @@
<ul class="SubMenu"> <ul class="SubMenu">
<li><a href="/ConfigBasic.html" class="MenuItemLink lock">Basic Configuration</a></li> <li><a href="/ConfigBasic.html" class="MenuItemLink lock">Basic Configuration</a></li>
<li><a href="/ConfigLanguage_p.html" class="MenuItemLink lock">Language</a></li> <li><a href="/ConfigLanguage_p.html" class="MenuItemLink lock">Language</a></li>
<li><a href="/ConfigNetwork_p.html" class="MenuItemLink lock">Network</a></li>
<li><a href="/ConfigProfile_p.html" class="MenuItemLink lock">Peer Profile</a></li> <li><a href="/ConfigProfile_p.html" class="MenuItemLink lock">Peer Profile</a></li>
<li><a href="/ConfigSkins_p.html" class="MenuItemLink lock">Interface Skins</a></li> <li><a href="/ConfigSkins_p.html" class="MenuItemLink lock">Interface Skins</a></li>
<li><a href="/ConfigRobotsTxt_p.html" class ="MenuItemLink lock">Local robots.txt</a></li> <li><a href="/ConfigRobotsTxt_p.html" class ="MenuItemLink lock">Local robots.txt</a></li>

@ -62,7 +62,7 @@ public class index {
final String cat = (post == null) ? "href" : post.get("cat", "href"); final String cat = (post == null) ? "href" : post.get("cat", "href");
final int type = (post == null) ? 0 : post.getInt("type", 0); final int type = (post == null) ? 0 : post.getInt("type", 0);
final boolean indexDistributeGranted = sb.getConfig("allowDistributeIndex", "true").equals("true"); final boolean indexDistributeGranted = sb.getConfig(plasmaSwitchboard.INDEX_DIST_ALLOW, "true").equals("true");
final boolean indexReceiveGranted = sb.getConfig("allowReceiveIndex", "true").equals("true"); final boolean indexReceiveGranted = sb.getConfig("allowReceiveIndex", "true").equals("true");
if (!indexDistributeGranted || !indexReceiveGranted) { global = false; } if (!indexDistributeGranted || !indexReceiveGranted) { global = false; }

@ -180,7 +180,7 @@ public class yacysearch {
} }
// SEARCH // SEARCH
final boolean indexDistributeGranted = sb.getConfig("allowDistributeIndex", "true").equals("true"); final boolean indexDistributeGranted = sb.getConfig(plasmaSwitchboard.INDEX_DIST_ALLOW, "true").equals("true");
final boolean indexReceiveGranted = sb.getConfig("allowReceiveIndex", "true").equals("true"); final boolean indexReceiveGranted = sb.getConfig("allowReceiveIndex", "true").equals("true");
final boolean offline = yacyCore.seedDB.mySeed.isVirgin(); final boolean offline = yacyCore.seedDB.mySeed.isVirgin();
if (offline || !indexDistributeGranted || !indexReceiveGranted) { global = false; } if (offline || !indexDistributeGranted || !indexReceiveGranted) { global = false; }

@ -522,6 +522,7 @@ public final class plasmaSwitchboard extends serverAbstractSwitch implements ser
* @see plasmaSwitchboard#INDEX_DIST_ALLOW_WHILE_CRAWLING * @see plasmaSwitchboard#INDEX_DIST_ALLOW_WHILE_CRAWLING
*/ */
public static final String INDEX_DIST_ALLOW = "allowDistributeIndex"; public static final String INDEX_DIST_ALLOW = "allowDistributeIndex";
public static final String INDEX_RECEIVE_ALLOW = "allowReceiveIndex";
/** /**
* <p><code>public static final String <strong>INDEX_DIST_ALLOW_WHILE_CRAWLING</strong> = "allowDistributeIndexWhileCrawling"</code></p> * <p><code>public static final String <strong>INDEX_DIST_ALLOW_WHILE_CRAWLING</strong> = "allowDistributeIndexWhileCrawling"</code></p>
* <p>Name of the setting whether Index Distribution shall be allowed while crawling is in progress, i.e. * <p>Name of the setting whether Index Distribution shall be allowed while crawling is in progress, i.e.
@ -1324,7 +1325,69 @@ public final class plasmaSwitchboard extends serverAbstractSwitch implements ser
} }
public boolean isRobinsonMode() { public boolean isRobinsonMode() {
return (yacyCore.seedDB.sizeConnected() == 0) && (yacyCore.seedDB.mySeed.isVirgin()); // we are in robinson mode, if we do not exchange index by dht distribution
// we need to take care that search requests and remote indexing requests go only
// to the peers in the same cluster, if we run a robinson cluster.
return !getConfigBool(plasmaSwitchboard.INDEX_DIST_ALLOW, false) && !getConfigBool(plasmaSwitchboard.INDEX_RECEIVE_ALLOW, false);
}
public boolean isClosedRobinsonCluster() {
// robinson peers may be member of robinson clusters, which can be public or private
// this does not check the robinson attribute, only the specific subtype of the cluster
String clustermode = getConfig("cluster.mode", "publicpeer");
return (clustermode.equals("privatecluster")) || (clustermode.equals("privatepeer"));
}
public boolean isInMyCluster(String peer) {
// check if the given peer is in the own network, if this is a robinson cluster
// depending on the robinson cluster type, the peer String may be a peerhash (b64-hash)
// or a ip:port String or simply a ip String
if (!isRobinsonMode()) return false;
String clustermode = getConfig("cluster.mode", "publicpeer");
if (clustermode.equals("privatecluster")) {
// check if we got the request from a peer in the private cluster
String network = getConfig("cluster.peers.ipport", "");
return network.indexOf(peer) >= 0;
} else if (clustermode.equals("publiccluster")) {
// check if we got the request from a peer in the public cluster
String network = getConfig("cluster.peers.yacydomain", "");
// check for .yacyh hexhash-domain
String hexhash = yacySeed.b64Hash2hexHash(peer);
if (hexhash == null) return false;
if (network.indexOf(hexhash + ".yacyh") >= 0) return true;
// resolve seed
yacySeed seed = yacyCore.seedDB.get(peer);
if (seed == null) return false;
// check for .yacy (name) - Domain
if (network.indexOf(seed.getName() + ".yacy") >= 0) return true;
return false;
} else {
return false;
}
}
public boolean isInMyCluster(yacySeed seed) {
// check if the given peer is in the own network, if this is a robinson cluster
if (seed == null) return false;
if (!isRobinsonMode()) return false;
String clustermode = getConfig("cluster.mode", "publicpeer");
if (clustermode.equals("privatecluster")) {
// check if we got the request from a peer in the private cluster
String network = getConfig("cluster.peers.ipport", "");
return network.indexOf(seed.getAddress()) >= 0;
} else if (clustermode.equals("publiccluster")) {
// check if we got the request from a peer in the public cluster
String network = getConfig("cluster.peers.yacydomain", "");
// check for .yacyh hexhash-domain
String hexhash = yacySeed.b64Hash2hexHash(seed.hash);
if (hexhash == null) return false;
if (network.indexOf(hexhash + ".yacyh") >= 0) return true;
// check for .yacy (name) - Domain
if (network.indexOf(seed.getName() + ".yacy") >= 0) return true;
return false;
} else {
return false;
}
} }
public String urlExists(String hash) { public String urlExists(String hash) {

@ -83,8 +83,6 @@ pkcs12ImportPwd =
superseedFile=superseed.txt superseedFile=superseed.txt
superseedLocation=http://www.yacy.net/superseed.txt superseedLocation=http://www.yacy.net/superseed.txt
# network definition # network definition
# we distiguish local and global networks. Each network type can have different user groups # we distiguish local and global networks. Each network type can have different user groups
# groups can be uncontrolled, moderated or controlled # groups can be uncontrolled, moderated or controlled
@ -98,8 +96,22 @@ superseedLocation=http://www.yacy.net/superseed.txt
# network = all:world:global:uncontrolled:http://yacy.net/ # network = all:world:global:uncontrolled:http://yacy.net/
# the network-uri must have a sub-path yacy/seed.txt containing a list of urls pointing to the # the network-uri must have a sub-path yacy/seed.txt containing a list of urls pointing to the
# peer-address of peers within the group of that network # peer-address of peers within the group of that network
# several network definition strings can be listed in a single # several network definition strings can be listed
# clusters within a network:
# every network can have an unlimited number of clusters. Clusters may also be completely
# sealed and have no connection to other peers. When a cluster does not use the
# p2p protocol and the bootstrapping mechanism to contact other peers, we call them
# Robinson peers. They can appear in different 'visibilities':
# - privatepeer: no connection and no data exchange to any other peer
# - privatecluster: connections only to self-defined addresses (other peers in same mode)
# - publiccluster: like privatecluster, but visible and searchable by public p2p nodes
# - publicpeer: a single peer without cluster connection, but visible for p2p nodes
# all public robinson peers should use a peer tag string to be searchable if in the
# search request these tags appear
cluster.mode=publicpeer
cluster.peers.yacydomain=localpeer.yacy
cluster.peers.ipport=localhost:8080
# bootstrapLoadTimeout # bootstrapLoadTimeout
# this is the time-out for loading of the seedlist files during bootstraping # this is the time-out for loading of the seedlist files during bootstraping
@ -541,7 +553,7 @@ filterOutStopwordsFromTopwords=true
# ram cache for database files # ram cache for database files
# ram cache for assortment cache cluster (for all 64 files) # ram cache for collection index
ramCacheRWI_time = 30000 ramCacheRWI_time = 30000
# ram cache for responseHeader.db # ram cache for responseHeader.db
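
Note on the cluster membership checks in `isInMyCluster`: both overloads test membership with `network.indexOf(...) >= 0`, which is a substring match, so an address such as `10.0.0.2:80` would also match a configured `10.0.0.2:8080`. The following is a minimal sketch of an exact-entry comparison, not the committed implementation. The class `ClusterSketch`, its method names, and the assumption that `cluster.peers.ipport` holds a comma-separated list are all hypothetical; the commit does not state which separator the setting uses.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of YaCy: illustrates exact-entry matching
// against a peer list such as the value of cluster.peers.ipport.
final class ClusterSketch {

    // Assumption: the setting is a comma-separated list of entries.
    static Set<String> parsePeers(String setting) {
        Set<String> peers = new HashSet<String>();
        for (String entry : setting.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.length() > 0) peers.add(trimmed);
        }
        return peers;
    }

    // Returns true only for the two cluster modes and only if the candidate
    // equals one configured entry exactly (no substring matches).
    static boolean isInCluster(String clusterMode, String peerList, String candidate) {
        boolean clusterModeActive = clusterMode.equals("privatecluster")
                || clusterMode.equals("publiccluster");
        if (!clusterModeActive) return false;
        return parsePeers(peerList).contains(candidate);
    }
}
```

With this sketch, `ClusterSketch.isInCluster("privatecluster", "localhost:8080, 10.0.0.2:8080", "10.0.0.2:80")` is rejected, whereas the substring check would accept it. The `privatepeer` and `publicpeer` modes never match, mirroring the `else` branch of the committed code.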
