- use JSoup parser for selective rewrite of html body <a href= links only,
instead of regex which rewrites also header href/src links
- this improves display of pages which use header <base> tag
- tags with src attribute are taken from original location (like css) improving display and are not routed trough the indexer
Disadvantage: scripting links will drop out of proxy
Setting of the servlet through web.xml exclusivly (in case one would like to quickly switch back to the YaCyProxyServlet,
leaving the existing code of YaCyProxyServlet untouched available)
- move htroot exist check from old httpdfilehandler to startup, remove from filehandler and legacy proxyhandler
- use SwitchboardConstant.htroot where appropriate
request into a separate thread and ignores the furthure result of a
request if that does not answer within the requested time-out. This is a
try to solve a problem with the peer-ping, which hangs whenever a peer
appears to be dead or blocked.
StackTrace For input string: ""
java.lang.NumberFormatException: For input string: ""
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:504)
at java.lang.Integer.parseInt(Integer.java:527)
at net.yacy.server.http.TemplateEngine.writeTemplate(TemplateEngine.java:241)
at net.yacy.server.http.TemplateEngine.writeTemplate(TemplateEngine.java:199)
at net.yacy.http.servlets.YaCyDefaultServlet.handleTemplate(YaCyDefaultServlet.java:896)
- the admin user name can be configured, in apiExec calls the default "admin" username is used.
TODO: the bin/apicall.sh script should likely take that into account.
- introduce a YaCyHttp interface to modulize/separate http server
- adjust the Jetty version specific implementation part (in package net.yacy.http)
- putting the version specific code in classes starting with Jetty8xxxx
- moved existing Jetty9xxx implementation into a test class (to keep the code)
- adjust build to the changed jars
- make use of the introduced YaCyHttpServer interface in related htroot servlets
- adjust other test cases/classes
are deleted to terminate the crawl because otherwise the crawl will go
on after the load-from-passive stack policy.
- better check if a crawl is terminated using the loader queue.
all unique links! This made it necessary, that a large portion of the
parser and link processing classes must be adopted to carry a different
type of link collection which carry a property attribute which are
attached to web anchors.
- introduction of a new URL class, AnchorURL
- the other url classes, DigestURI and MultiProtocolURI had been renamed
and refactored to fit into a new document package schema, document.id
- cleanup of net.yacy.cora.document package and refactoring