Consistently with the custom Solr http client used for https connections
to remote Solr peers or to YaCy external Solr storage.
This prevent remote Solr requests threads to wait for establishing a
connection to a remote peer longer than the configured timeout.
Initializing Thread names using the Thread constructor parameter is
faster as it already sets a thread name even if no customized one is
given, while an additional call to the Thread.setName() function
internally do synchronized access, eventually runs access check on the
security manager and performs a native call.
Profiling a running YaCy server revealed that the total processing time
spent on Thread.setName() for a typical p2p search was in the range of
seconds.
* less CPU usage using the Solr 'allowedTime' parameter
* increase chances to get some results even when a first operation step
goes in time out by letting some time for final snippets results
processing
Solr can provide partial results for example when a processing time
limit (specified with the parameter `timeAllowed`) is exceeded.
Before this fix, getting partial results from an embedded Solr index
resulted in a ClassCastException :
"org.apache.solr.common.SolrDocumentList cannot be cast to
org.apache.solr.response.ResultContext".
That was caused by concurrent modifications (with addHighlightField()
function) to the same SolrQuery instance when requesting Solr on remote
peers in p2p search.
By not generating MD5 hashes on all words of indexed texts, processing
time is reduced by 30 to 50% on indexed documents with more than 1Mbytes
of plain text.
if at least one of the image size fields is enabled in index (images_height_val,
images_width_val, images_pixel_val).
Previously all fields were required to be enabled (hint: default setting
is height + width enabled)
Ensuring http post method is used for operations with server-side
effects (in respect of http semantics), and a valid transaction token is
provided by the user-agent.
- to prevent unwanted exposure of index entries about private
local/intranet documents when switching from "Intranet Indexing" mode
while attached to remote Solr instance(s)
- to warn user about remote Solr instance(s) still attached when
switching from modes other than "Intranet Indexing"
- properly handle IPv6 loopback address replacement
- replace loopback address or host only when accessing peer remotely
- replace loopback part with the peer hostname as requested rather than
with its seed public IP as this works better for Intranet mode and when
peer is behind a reverse proxy.
Otherwise once this operation is applied, the remote Solr(s) instances
are deconnected and the embedded Solr is connected even if disabled by
setting "core.service.fulltext".
Also use constants for related default setting values.
- Use the EnhancedXMLResponseWriter only when requested output is "exml"
- Use the Standard Solr writers when possible, for example for json, xml
or javabin output formats
- Return an error when the requested format can not been rendered with
an external Solr server only
Important : this modification is necessary for peers using exclusively
an external Solr server to be reachable as robinson targets in p2p
search, as the binary format ("javabin") is the default Solr exchange
format for peers.
Before this, when a peer requested a remote one attached only to an
external Solr (no embedded one), it ended with "Invalid type" error, as
the remote peer answered with xml although binary format was requested.
This is necessary when you want to attach to a dedicated external Solr
server protected with basic http authentication and requested over https
but having only a self-signed certificate.