Allow JS resorting of search results by unauthenticated users

Access rate limits for this search mode are set low by default for
unauthenticated users to prevent server overload, but they can be
customized on the SearchAccessRate_p.html configuration page.

Fixes #291
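
The fallback order introduced by this commit can be illustrated with a minimal, self-contained sketch. It is not the YaCy code: the class `JsResortRateLimitSketch`, the `Decision` type, and the helpers `limitReached` and `decide` are hypothetical names, and the counters are passed in directly rather than read from the search tracker. Only the configuration keys and default values are taken from the diff below. The real check applies only to users with limited rights; the sketch assumes the caller is such a user.

```java
import java.util.Map;

/** Illustrative sketch only; names and structure are assumptions, not the YaCy implementation. */
final class JsResortRateLimitSketch {

    /** Default limits, copied from the configuration keys shown in this commit. */
    static final Map<String, Integer> DEFAULTS = Map.of(
            "search.public.max.p2p.access.3s", 1,
            "search.public.max.p2p.access.1mn", 6,
            "search.public.max.p2p.access.10mn", 60,
            "search.public.max.p2p.jsresort.access.3s", 1,
            "search.public.max.p2p.jsresort.access.1mn", 1,
            "search.public.max.p2p.jsresort.access.10mn", 10);

    /** Outcome of the access-rate check for one search request (hypothetical type). */
    static final class Decision {
        final boolean global;
        final boolean jsResort;

        Decision(final boolean global, final boolean jsResort) {
            this.global = global;
            this.jsResort = jsResort;
        }
    }

    /** Returns true when any of the three time-window counters has reached its configured maximum. */
    static boolean limitReached(final Map<String, Integer> config, final String prefix,
            final int in3s, final int in1mn, final int in10mn) {
        return in10mn >= config.get(prefix + ".10mn")
                || in1mn >= config.get(prefix + ".1mn")
                || in3s >= config.get(prefix + ".3s");
    }

    /**
     * Exceeding the P2P limits disables both global search and JavaScript resorting;
     * exceeding only the stricter jsresort limits disables JavaScript resorting alone.
     */
    static Decision decide(final Map<String, Integer> config, boolean global,
            final boolean jsResortEnabled, final int in3s, final int in1mn, final int in10mn) {
        boolean jsResort = global && jsResortEnabled;
        if (limitReached(config, "search.public.max.p2p.access", in3s, in1mn, in10mn)) {
            global = false;
            jsResort = false;
        } else if (limitReached(config, "search.public.max.p2p.jsresort.access", in3s, in1mn, in10mn)) {
            jsResort = false;
        }
        return new Decision(global, jsResort);
    }

    public static void main(final String[] args) {
        // One earlier access within the last minute: the jsresort 1mn limit (1) is reached.
        final Decision d = decide(DEFAULTS, true, true, 0, 1, 1);
        System.out.println("global=" + d.global + ", jsResort=" + d.jsResort);
        // prints "global=true, jsResort=false"
    }
}
```

With the default values, a second peer-to-peer search within one minute still runs globally but falls back to on-demand, server-side resorting, which matches the behavior described on the SearchAccessRate_p.html page.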
pull/292/head
luccioman 6 years ago
parent 0ab2b49c31
commit a8316c79da

@@ -940,6 +940,11 @@ search.public.max.p2p.access.3s = 1
 search.public.max.p2p.access.1mn = 6
 search.public.max.p2p.access.10mn = 60
+# Maximum numbers of accesses within a given time period to the search interface in P2P mode with browser-side JavaScript remote results resorting for unauthenticated users and authenticated users with no extended search right
+search.public.max.p2p.jsresort.access.3s = 1
+search.public.max.p2p.jsresort.access.1mn = 1
+search.public.max.p2p.jsresort.access.10mn = 10
 # Maximum number of accesses within a given time period to the search interface to support fetching remote results snippets for unauthenticated users and authenticated users with no extended search right
 search.public.max.remoteSnippet.access.3s = 1
 search.public.max.remoteSnippet.access.1mn = 4

@@ -71,8 +71,11 @@
 <input type="radio" name="search.jsresort" value="false" #(search.jsresort)#checked="checked"::#(/search.jsresort)# />On demand, server-side
 </label>
 <label title="This usually improves ranking accuracy, but doesn't work well for users who have Javascript disabled, are using screen readers, or are on slow computers.">
-<input type="radio" name="search.jsresort" value="true" #(search.jsresort)#::checked="checked"#(/search.jsresort)# />Automated, with JavaScript in the browser <strong>for authenticated users only</strong>.
+<input type="radio" name="search.jsresort" value="true" #(search.jsresort)#::checked="checked"#(/search.jsresort)# />Automated, with JavaScript in the browser.
 </label>
+<p>Automated results resorting with JavaScript makes the browser load the full result set of each search request.
+This may lead to high system loads on the server.
+Please check the 'Peer-to-peer search with JavaScript results resorting' section in the <a href="SearchAccessRate_p.html">Local Search access rate</a> configuration page to set up proper limitations on this mode by unauthenticated users.</p>
 </dd>
 <dt>Remote search encryption</dt>

@@ -9,10 +9,8 @@
 #%env/templates/submenuAccessTracker.template%#
 <h2>Local Search access rate limitations</h2>
 <p>
-You can configure here limitations on access rate to this peer search
-interface by unauthenticated users and users without extended search
-right (see the <a href="ConfigAccounts_p.html">Accounts</a>
-configuration page for details on users rights).
+You can configure here limitations on access rate to this peer search interface by unauthenticated users and users without extended search right
+(see the <a href="ConfigAccounts_p.html">Accounts</a> configuration page for details on users rights).
 </p>
 <form action="SearchAccessRate_p.html" method="post"
@@ -22,9 +20,10 @@
 <fieldset>
 <legend>YaCy search</legend>
-<p>Access rate limitations to this peer search interface. When a
-user with limited rights (unauthenticated or without extended search
-right) exceeds a limit, the search is blocked.</p>
+<p>
+Access rate limitations to this peer search interface.
+When a user with limited rights (unauthenticated or without extended search right) exceeds a limit, the search is blocked.
+</p>
 <div class="form-group">
 <label class="control-label col-sm-6 col-md-4 col-lg-3"
@@ -58,10 +57,10 @@
 <fieldset>
 <legend>Peer-to-peer search</legend>
-<p>Access rate limitations to the peer-to-peer search mode. When
-a user with limited rights (unauthenticated or without extended
-search right) exceeds a limit, the search scope falls back to only
-this local peer index.</p>
+<p>
+Access rate limitations to the peer-to-peer search mode.
+When a user with limited rights (unauthenticated or without extended search right) exceeds a limit, the search scope falls back to only this local peer index.
+</p>
 <div class="form-group">
 <label class="control-label col-sm-6 col-md-4 col-lg-3"
@@ -85,7 +84,7 @@
 </div>
 <div class="form-group">
 <label class="control-label col-sm-6 col-md-4 col-lg-3"
 for="search.public.max.p2p.access.10mn">Max searches in 10mn</label>
 <div class="col-sm-3 col-md-2">
 <input class="form-control" id="search.public.max.p2p.access.10mn"
 name="search.public.max.p2p.access.10mn" type="number"
@@ -95,18 +94,59 @@
 </div>
 </fieldset>
+<fieldset>
+<legend>Peer-to-peer search with JavaScript results resorting</legend>
+<p>
+Access rate limitations to the peer-to-peer search mode with browser-side JavaScript results resorting enabled
+(check the 'Remote results resorting' section in the <a href="ConfigPortal_p.html">Search Portal</a> configuration page).
+When a user with limited rights (unauthenticated or without extended search right) exceeds a limit, results resorting becomes only applicable on demand, server-side.
+</p>
+<div class="form-group">
+<label class="control-label col-sm-6 col-md-4 col-lg-3"
+for="search.public.max.p2p.jsresort.access.3s">Max searches in 3s</label>
+<div class="col-sm-3 col-md-2">
+<input class="form-control" id="search.public.max.p2p.jsresort.access.3s"
+name="search.public.max.p2p.jsresort.access.3s" type="number"
+value="#[search.public.max.p2p.jsresort.access.3s]#" min="0"
+max="2147483647" />
+</div>
+</div>
+<div class="form-group">
+<label class="control-label col-sm-6 col-md-4 col-lg-3"
+for="search.public.max.p2p.jsresort.access.1mn">Max searches in 1mn</label>
+<div class="col-sm-3 col-md-2">
+<input class="form-control" id="search.public.max.p2p.jsresort.access.1mn"
+name="search.public.max.p2p.jsresort.access.1mn" type="number"
+value="#[search.public.max.p2p.jsresort.access.1mn]#" min="0"
+max="2147483647" />
+</div>
+</div>
+<div class="form-group">
+<label class="control-label col-sm-6 col-md-4 col-lg-3"
+for="search.public.max.p2p.jsresort.access.10mn">Max searches in 10mn</label>
+<div class="col-sm-3 col-md-2">
+<input class="form-control" id="search.public.max.p2p.jsresort.access.10mn"
+name="search.public.max.p2p.jsresort.access.10mn" type="number"
+value="#[search.public.max.p2p.jsresort.access.10mn]#" min="0"
+max="2147483647" />
+</div>
+</div>
+</fieldset>
 <fieldset>
 <legend>Remote snippet load</legend>
-<p>Limitations on snippet loading from remote websites. When a
-user with limited rights (unauthenticated or without extended search
-right) exceeds a limit, the snippets fetch strategy falls back to
-'CACHEONLY' (check the default Snippet Fetch Strategy on the <a href="ConfigPortal_p.html">Search Portal</a> configuration page).</p>
+<p>
+Limitations on snippet loading from remote websites.
+When a user with limited rights (unauthenticated or without extended search right) exceeds a limit, the snippets fetch strategy falls back to 'CACHEONLY'
+(check the default Snippet Fetch Strategy on the <a href="ConfigPortal_p.html">Search Portal</a> configuration page).
+</p>
 <div class="form-group">
 <label class="control-label col-sm-6 col-md-4 col-lg-3"
-for="search.public.max.remoteSnippet.access.3s">Max
-searches in 3s</label>
+for="search.public.max.remoteSnippet.access.3s">Max searches in 3s</label>
 <div class="col-sm-3 col-md-2">
 <input class="form-control"
 id="search.public.max.remoteSnippet.access.3s"
@@ -117,8 +157,7 @@
 </div>
 <div class="form-group">
 <label class="control-label col-sm-6 col-md-4 col-lg-3"
-for="search.public.max.remoteSnippet.access.1mn">Max
-searches in 1mn</label>
+for="search.public.max.remoteSnippet.access.1mn">Max searches in 1mn</label>
 <div class="col-sm-3 col-md-2">
 <input class="form-control"
 id="search.public.max.remoteSnippet.access.1mn"
@@ -129,8 +168,7 @@
 </div>
 <div class="form-group">
 <label class="control-label col-sm-6 col-md-4 col-lg-3"
-for="search.public.max.remoteSnippet.access.10mn">Max
-searches in 10mn</label>
+for="search.public.max.remoteSnippet.access.10mn">Max searches in 10mn</label>
 <div class="col-sm-3 col-md-2">
 <input class="form-control"
 id="search.public.max.remoteSnippet.access.10mn"
@@ -142,13 +180,15 @@
 </fieldset>
 <div class="form-group">
-<div class="col-xs-offset-1 col-sm-offset-3 col-md-offset-3 col-lg-offset-2">
+<div
+class="col-xs-offset-1 col-sm-offset-3 col-md-offset-3 col-lg-offset-2">
 <input type="submit" class="btn btn-primary" name="set"
-value="Submit" aria-describedby="changeInfo" /> <input
+value="Submit" aria-describedby="changeInfo" />
+<input
 type="submit" class="btn btn-default" name="setDefaults"
 value="Set defaults" title="Reset to defaults settings"
-aria-describedby="changeInfo" /> <em id="changeInfo">Changes
-will take effect immediately.</em>
+aria-describedby="changeInfo" />
+<em id="changeInfo">Changes will take effect immediately.</em>
 </div>
 </div>
 </form>

@@ -300,6 +300,10 @@ public class yacysearch {
 /* Maximum number of suggestions to display in the first results page */
 final int meanMax = post.getInt("meanCount", 0);
+boolean jsResort = global
+&& (contentdom == ContentDomain.ALL || contentdom == ContentDomain.TEXT) // For now JavaScript resorting can only be applied for text search
+&& sb.getConfigBool(SwitchboardConstants.SEARCH_JS_RESORT, SwitchboardConstants.SEARCH_JS_RESORT_DEFAULT);
 // check the search tracker
 TreeSet<Long> trackerHandles = sb.localSearchTracker.get(client);
@@ -344,6 +348,7 @@ public class yacysearch {
 SearchAccessRateConstants.PUBLIC_MAX_P2P_ACCESS_3S.getKey(),
 SearchAccessRateConstants.PUBLIC_MAX_P2P_ACCESS_3S.getDefaultValue())) {
 global = false;
+jsResort = false;
 ConcurrentLog.warn("LOCAL_SEARCH", "ACCESS CONTROL: CLIENT FROM "
 + client
 + ": "
@@ -354,6 +359,25 @@ public class yacysearch {
 + accInTenMinutes
 + "/600s, "
 + " requests, disallowed global search");
+} else if (accInTenMinutes >= sb.getConfigInt(SearchAccessRateConstants.PUBLIC_MAX_P2P_JSRESORT_ACCESS_10MN.getKey(),
+SearchAccessRateConstants.PUBLIC_MAX_P2P_JSRESORT_ACCESS_10MN.getDefaultValue())
+|| accInOneMinute >= sb.getConfigInt(
+SearchAccessRateConstants.PUBLIC_MAX_P2P_JSRESORT_ACCESS_1MN.getKey(),
+SearchAccessRateConstants.PUBLIC_MAX_P2P_JSRESORT_ACCESS_1MN.getDefaultValue())
+|| accInThreeSeconds >= sb.getConfigInt(
+SearchAccessRateConstants.PUBLIC_MAX_P2P_JSRESORT_ACCESS_3S.getKey(),
+SearchAccessRateConstants.PUBLIC_MAX_P2P_JSRESORT_ACCESS_3S.getDefaultValue())) {
+jsResort = false;
+ConcurrentLog.warn("LOCAL_SEARCH", "ACCESS CONTROL: CLIENT FROM "
++ client
++ ": "
++ accInThreeSeconds
++ "/3s, "
++ accInOneMinute
++ "/60s, "
++ accInTenMinutes
++ "/600s, "
++ " requests, disallowed JavaScript resorting of global search results");
 }
 }
 // protection against too many remote server snippet loads (protects traffic on server)
@@ -870,7 +894,7 @@ public class yacysearch {
 prop.put("geoinfo_loc", i);
 prop.put("geoinfo", "1");
 }
 // update the search tracker
 try {
 synchronized ( trackerHandles ) {
@@ -902,9 +926,6 @@ public class yacysearch {
 prop.put("num-results_globalresults_remoteIndexCount", Formatter.number(theSearch.remote_rwi_available.get() + theSearch.remote_solr_available.get(), true));
 prop.put("num-results_globalresults_remotePeerCount", Formatter.number(theSearch.remote_rwi_peerCount.get() + theSearch.remote_solr_peerCount.get(), true));
-final boolean jsResort = global && extendedSearchRights // for now enable JavaScript resorting only for authenticated users as it requires too much resources per search request
-&& (contentdom == ContentDomain.ALL || contentdom == ContentDomain.TEXT) // For now JavaScript resorting can only be applied for text search
-&& sb.getConfigBool(SwitchboardConstants.SEARCH_JS_RESORT, SwitchboardConstants.SEARCH_JS_RESORT_DEFAULT);
 prop.put("jsResort", jsResort);
 prop.put("num-results_jsResort", jsResort);

@@ -729,7 +729,6 @@ or extended to pages including such medias (provide generally more results, but
 Remote results resorting==远端搜索结果排序
 >On demand, server-side==>根据需要, 服务器侧
 Automated, with JavaScript in the browser==自动化, 基于嵌入浏览器的JavaScript代码
->for authenticated users only<==>(仅限经过身份验证的用户)<
 Remote search encryption==远端搜索加密
 Prefer https for search queries on remote peers.==首选https用于远端节点上的搜索查询.
 When SSL/TLS is enabled on remote peers, https should be used to encrypt data exchanged with them when performing peer-to-peer searches.==在远端节点上启用SSL/TLS时,应使用https来加密在执行P2P搜索时与它们交换的数据.

@@ -71,6 +71,30 @@ public enum SearchAccessRateConstants {
 */
 PUBLIC_MAX_P2P_ACCESS_10MN("search.public.max.p2p.access.10mn", 60),
+/**
+* Configuration for the maximum number of accesses within three seconds to the
+* search interface in P2P mode with browser-side JavaScript results resorting
+* enabled for unauthenticated users and authenticated users with no extended
+* search right
+*/
+PUBLIC_MAX_P2P_JSRESORT_ACCESS_3S("search.public.max.p2p.jsresort.access.3s", 1),
+/**
+* Configuration for the maximum number of accesses within one minute to the
+* search interface in P2P mode with browser-side JavaScript results resorting
+* enabled for unauthenticated users and authenticated users with no extended
+* search right
+*/
+PUBLIC_MAX_P2P_JSRESORT_ACCESS_1MN("search.public.max.p2p.jsresort.access.1mn", 1),
+/**
+* Configuration for the maximum number of accesses within ten minutes to the
+* search interface in P2P mode with browser-side JavaScript results resorting
+* enabled for unauthenticated users and authenticated users with no extended
+* search right
+*/
+PUBLIC_MAX_P2P_JSRESORT_ACCESS_10MN("search.public.max.p2p.jsresort.access.10mn", 10),
 /**
 * Configuration for the maximum number of accesses within three seconds to the
 * search interface to support fetching remote results snippets for
