Measurement sample : import from blacklist local file containing about
15000 entries
- before refactoring : several minutes
- after refactoring : a few seconds!
Favicon display only makes sense for http(s) websites, being public or
intranet. So I modified the favicon conditional display to verify the
result URL protocol rather than if we are in intranet mode.
Also prevented rendering an img HTML tag with empty src on other results
protocols such as ftp or file.
Fixing this thanks to priest2 report
(http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5923).
host_id_s. Use Solr setField instead of addField to prevent
java.lang.ClassCastException: java.util.ArrayList cannot be cast to java.lang.String
at net.yacy.kelondro.data.meta.URIMetadataNode.hosthash(URIMetadataNode.java:247)
at net.yacy.search.query.SearchEvent.addNodes(SearchEvent.java:966)
at net.yacy.peers.Protocol.solrQuery(Protocol.java:1242)
at net.yacy.peers.RemoteSearch$2.run(RemoteSearch.java:349)
This is the latest Java 7 compatible jgit release.
Properly support GitHub tags marked as "Pre-release".
With the previous venerable jgit version 1.1.0, a YaCy repository clone
having such a tag made GitRevTask and GitRevMavenTask crash.
to use javax.servlet.http.Cookie parameters.
Depreciate now obsolete getHeaderCookies.
Adjust setting of MaxAge to spec if >= 0 otherwise keep default.
- Adjusted start script in debug mode to make sure the main java process
can receive signals such as SIGTERM
- Modified docker images main command to properly propagate SIGTERM
signal to the main java process
The main shutdown hook thread was not properly waiting for the main
thread termination which consequently could not properly close resources
and threads. After terminating a running YaCy peer this way (Ctrl+C in
console, or kill <pid> for example), you could see the still existing
DATA/yacy.running file.
Tested with :
- Debian Jessie openjdk 7 and 8 : regular shutdown, Ctrl+C, kill
command, system restart while yacy is running
- Windows 10 Oracle JDK 7 and 8 : non regression on regular shutdown
add "datetime" property of <time> tag to scrapers startdate list.
Datetime is parsed as iso8601 (xml) date, html5 allows partial as well
as duration (not handled by this)
For this on the header of the viewed result a "add bookmark" button is
available (for authenticated users).
Currently the bookmark is added to a (virtual) bookmark folder "/proxy"
w/o any additional tags etc.
Applied rules :
- when the FTP URL denotes a file resource, stack it as any start URL :
eventually embedded links can be followed applying the usual depth rules
- when the FTP URL denotes a directory, list files under this directory
and stack them for crawl, and repeat the process on sub folders until
crawl depth is reached
to not already actives.
Dht results are now included in count this might over shoot on redundant
dht and solr, while the previous solr facet based was always low.
Fixes issue #90 for local queries only: Stealth mode, Portal mode or
Intranet mode.
For P2p mode, the issue would probably be difficult to solve with
reasonable performance. This is still to dig.
Also switched some InterreputedException catch log messages to warn
level as this is normal behavior when shutting down a peer.
Fixed yacysearch buttons navbar behavior to deal correctly with total
results count or offset over 1000. Also improved the buttons navbar to
be able to navigate over 10th page for local queries.
parameters on SSI (server side includes).
Query parameters are already merged by dispatcher.include, making copy
of parameter (RequestDispatcher.INCLUDE_QUERY_STRING) obsolete.
All other parameter are not used as YaCy servlet arguments.
exclude servletPath option as resources are always relative to htroot
or htdocs, the change reflects this.
Theoretically it and the recent adjustments arcording relative urls
allows to configure the instance to be configurable in a path other as
root (/)
The default redirection strategy when using directly HTTPClient is
incorrect when redirection is cross host (the original Host header is
still sent when requesting the redirected location).
YaCy LoaderDispatcher handles redirections properly, thus release
archive files using redirected URLs (such as the URLs on a GitHub
Release page) are successfully downloaded.
When a downloaded archive release is corrupted, empty, or can not be
opened for any reason, the update script must not be launched because it
erases the existing lib/*.jar libraries.