The https://reproducible-builds.org project invests a lot of work
to make builds reproducible. This is a security property. It allows
to compare the build of binaries from different builder machines.
If they are identical, it means that either the builds have not
been manipulated or an attacker managed to attack all builder
machines in exactly the same way.
One problem that the reproducible-builds project often sees is
that projects include the build time in their binaries. This
makes builds unreproducible for apparently no reason. The build
date should not be of interest since binaries built on different
dates but from the same source code should not be different.
Thus I decided to remove the build date instead of re-implementing
the functionality without the GitRev task. Anyways the reported
date was not the build date but the date of the last git commit
which is even less informative. The git commit ID would have
information value but should only be relevant for "nightly builds".
PKGMANAGER is always false, thus the java code wrapped in
if statements for this property is dead code and can also
be removed.
The Debian packaging removed in c4659f0fb0
did set the PKGMANAGER property to true. When we do distro
packages again, we can revisit this commit and redo it with
property files instead.
RESTARTCMD is only used inside those dead code.
DESTDIR is never used even in the build.xml
to be able to use/reuse Ant targets where task has not been implemented in Gradle build.
- use the import to include the compile of htroot as first important task
! it is possible that first build fails an compile of GitRevTask.jar !
! solution/workaround -> use "ant all" once to compile GitRevTask.jar !
- adjusted build.xml a little
- split compile-core into compile-core and compile-htroot to have a target for htroot comp. only
- set build-path to reuse Gradles build directory
- (fix javadoc failure)
- changed the filtered-copy of yacyBuildProperties.java to ! the build path :-(
as current (copy,delete,exclude) is complicated and not migration worthy,
used simple/straigt forward approach (using a yacyBuildProperties.java.template file as copy source)
instead of loading the solr document, an index only for the last loading
time was created. This prevents that solr has to fetch from its index
while the index is created. Excessive re-loading of documents while
indexing has shown to produce deadlocks, so this should now be
prevented.
Previously, when mixing results from local RWI and local Solr (Stealth
mode), total local Solr count could be ignored on last result pages,
when the page offset was higher than local Solr count but lower than
total RWI count.
Previously search navigators/facets elements were sorted only by counts.
Now from the ConfigSearchPage_p.html admin page, sort direction
(ascending/descending) and type (on counts or labels) can be customized
independently for each navigator.
To make seed upload (in /Settings_p.html?page=seed page) with SCP easier
when the user specify a remote target directory path.
See report by @vikulin in issue #227
SimpleDateFormat must not be used by concurrent threads without
synchronization for parsing or formating dates as it is not thread-safe
(internally holds a calendar instance that is not synchronized).
Prefer now DateTimeFormatter when possible as it is thread-safe without
concurrent access performance bottleneck (does not internally use
synchronization locks).
Initializing Thread names using the Thread constructor parameter is
faster as it already sets a thread name even if no customized one is
given, while an additional call to the Thread.setName() function
internally do synchronized access, eventually runs access check on the
security manager and performs a native call.
Profiling a running YaCy server revealed that the total processing time
spent on Thread.setName() for a typical p2p search was in the range of
seconds.
That was caused by concurrent modifications (with addHighlightField()
function) to the same SolrQuery instance when requesting Solr on remote
peers in p2p search.
This is necessary when you want to attach to a dedicated external Solr
server protected with basic http authentication and requested over https
but having only a self-signed certificate.
Thus allowing to choose at configuration or per search request, whether
extending or not results beyond strict content domain filter (image,
video, audio or application).
Related graphical controls to be added to user interface.
Required for proper operation when the default system locale is Turkish,
as dottless and dotted i characters have specific case conversion rules
in this language.
Introduced through the new configurable setting
network.unit.protocol.https.preferred, defaulting to false for now.
Let choose to prefer using https when available on remote peers to
perform YaCy protocol operations including notably hello or transferRWI.
Not yet implemented for every YaCy protocol operations.
Default is still http to prevent any regressions, but a new setting is
available to choose https as the preferred protocol to perform remote
searches.
New configuration setting 'remotesearch.https.preferred' is manually
editable in yacy.conf file or in Advanced Properties page
(/ConfigProperties_p.html).
Should be enabled as default in the future for improved privacy.
Https could also eventually be used for other peers communications.
Resizing JPEG snapshot images through /api/snapshot.jpg failed when
running on OpenJDK, but rendered successfully with a Oracle JDK.
Details in mantis 772 ( http://mantis.tokeek.de/view.php?id=772 ).
Removing any alpha component (useless in snapshot images) from the
rendered resized image solves the issue.
- added some missing increments from RWI results
- decrement relevant navigator counts when solr or RWI results are
evicted because duplicates detection or constraints checked belatedly
- do not compute facets when unnecessary to avoid unwanted CPU load
- do not increment from facets when already done
- do not rely on facets on remote solr peers requests, as most of the
time only a limited part of their total results if fetched (thus also
preventing unnecessary load on remote peers)
- use a concurrency friendly score map for the dates navigators to
prevent unwanted ConcurrentModificationExceptions
This improves the situation for the most obvious inconsistencies in
search navigators counts, but more has to be done for a true accuracy
(notably when query modifiers constraints are applied belatedly - after
the solr or RWI retrieval request - such as the content domain
constraint)