A port value of -1 will disable this option.
If set to a value greater 0, YaCy listens on this of on the local loopback
address (127.0.0.1) for a shutdown or restart signal.
E.g. connect to http://localhost:8005/shutdown will stop the YaCy server.
http://localhost:8005/restart will restart it.
This option allows to stop YaCy locally independant from the web web
frontend (which might be configured for password protected remote access).
This is a fix for mantis 715 (http://mantis.tokeek.de/view.php?id=715).
A possible path scenario that could leading to this case :
- YaCy is running low in memory
- a search is requested
- before the end of search results rendering, the cleanup job runs and
deletes the running search event from the cache because of short memory
- then yacysearchitem renders with "-UNRESOLVED_PATTERN-" parameter
values passed to the statistics() JavaScript function
HTTP "Referer" header sent by the browser when using YaCy can now be
controlled either with the referrer meta tag as a global policy, or only
for search result links by adding the attribute rel="noreferrer".
To improve privacy with the less possible regressions, the default is
set as meta tag with value "origin-when-cross-origin" : internal YaCy
links behavior is not affected, but when visiting external websites
referrer url is not empty but stripped from query parameters and path.
Older browsers, Safari, MS IE and Edge do not support the referrer meta
tag, so the standard but less flexible noreferrer link type can also be
enabled as an alternative.
User-friendly settings page to be implemented.
- Added a new method to check activation of mandatory fields on
Collection Configuration commit, consistently with checks previously
performed in Switchboard startup and with mandatory fields in the
default schema.
- Reorganized default schema and CollectionConfiguration enumeration :
moved no more mandatory fields in a specific section, and moved fields
enabled at startup to the mandatory section.
- Marked mandatory fields as required and with stronger font in the
IndexSchema_p.html page
As noticed by @reger24, abusive use of OpenSearch systems should be
prevented, especially if allowing to parse and reuse HTML results.
robots.txt file is now checked before requesting an external OpenSearch
system to respect the host exclusions and eventual crawl-delay value.
The check is also performed when trying to add a new OpenSearch URL
template through the /ConfigHeuristics_p.html admin page.
As discussed in PR #93 with @JeremyRand and @reger24 this new advanced
settings page includes:
- a new setting to control remote Solr responses encoding
- some existing debug settings which could not be set through the admin
user interface
This property has been deprecated four years ago by commit
d74472f562. For any active search event
id, it was then always filled with "-UNRESOLVED_PATTERN-".
This is to be more easily understandable and to reflect more accurately
the current memory strategies implementations that eventually set the
"proper" state not only because DHT reception.
As described in mantis 725 (http://mantis.tokeek.de/view.php?id=725) the
HostBrowser.xml api directory entries had incorrect count attribute
value.
This was because the HostBrowser html page and backing template servlet
evolved, but modifications were not reported on the xml api.
This will prevent mistakenly hiding a div element not designed to be an
infobox but having a ".info" parent (After having previously added the
possibility for a div - and not only a span element - to be an infobox).
As mentioned in issue #103, control settings over YaCy disk usage
already existed but lacked a user-friendly way to set them.
I added it to the Performance_p.html administration page with a little
refactoring on the "Resource Observer" fieldset for improved
accessibility and HTML standards respect.
Also added the possibility to enable/disable the autoregulation fonction
from this page.
Made use of ol, li, thead, th, tbody, h1 and h2 html tags.
Added aria-label attributes to provide alternative textual information
previously only conveyed by color cue.
Tested behavior with NVDA 2016.4 screen reader.
Also removed deprecated HTML attributes uses.
Validation performed with Nu Html Checker 17.1.0.
Cross browser tested with :
- Debian Jessie : Firefox ESR 45.6.0
- MS Windows 10 : Firefox 50.1.0, Chrome 55.0.2883.87, MS Edge
In the /HostBrowser.html page "only hosts with urls pending in the
crawler", "only with load errors" and "Administration Options" all
require administration credentials. But they were displayed even to
unauthenticated users, and clicking them did nothing and returned the
/HostBrowser.html page empty.
This new "documentStructure" parameter can be set to false to only get
hosts accumulated references on a resource and thus prevent scraping the
specified URL and getting citations references.
Also set WebStructureGraph constants as final and updated the Javadoc
with example api call URLs.
Host names should not contain XML special characters such as quotation
mark, but at this stage the WebGraph may have mistakenly recorded a host
name with such characters. What's more the DigestURL constructor does
not prevent this.
By the way using serverObjects.putXML to encode host names we ensure
here the rendered XML is well formed and can be parsed by external tools
even if an structure entry is incorrect.
As described in mantis 721 (http://mantis.tokeek.de/view.php?id=721)
WatchWebStructure_p.html failed to include in its structure view https
and other protocols and ports than default http.
As described in mantis 720 (http://mantis.tokeek.de/view.php?id=720),
when requesting this API with a domain name instead of a complete URL
only HTTP references on default port were listed.
This ensure consistent implementation of the url host hash generation
and easier usage finding in source code.
Also added a unit test for this function.
The augmented Browsing option was reduced to the web proxy functionallity.
Augmented browsing is not available and no known plan exist to reimplement
alteration of result pages with additional information.
Measurement sample : import from blacklist local file containing about
15000 entries
- before refactoring : several minutes
- after refactoring : a few seconds!
Favicon display only makes sense for http(s) websites, being public or
intranet. So I modified the favicon conditional display to verify the
result URL protocol rather than if we are in intranet mode.
Also prevented rendering an img HTML tag with empty src on other results
protocols such as ftp or file.
Fixing this thanks to priest2 report
(http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5923).
to use javax.servlet.http.Cookie parameters.
Depreciate now obsolete getHeaderCookies.
Adjust setting of MaxAge to spec if >= 0 otherwise keep default.
For this on the header of the viewed result a "add bookmark" button is
available (for authenticated users).
Currently the bookmark is added to a (virtual) bookmark folder "/proxy"
w/o any additional tags etc.