The resource observer is now able to recognize both the free disk space
and the space available to YaCy. The amount of space assigned to YaCy
is defined by new settings in the configuration file.
Furthermore, there is now a cleanup process which deletes files if
autodelete is activated. The autodelete is now BY DEFAULT ON if
the disk space is low, which means that YaCy starts to delete documents
when the disk is full!
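A minimal sketch of how such a check could look. The two limits (a minimum free disk space and a maximum space assigned to YaCy), the class name and the method names are assumptions made for illustration; they are not the actual setting names or the actual observer code.

```java
import java.io.File;

public class DiskSpaceCheck {

    /** Result of one observer pass. */
    public enum Space { OK, LOW }

    /**
     * Hypothetical check: space is LOW if the disk has less free space than
     * minFreeBytes, or if YaCy already occupies more than maxUsedBytes.
     */
    public static Space check(long freeBytes, long usedByYaCyBytes,
                              long minFreeBytes, long maxUsedBytes) {
        if (freeBytes < minFreeBytes) return Space.LOW;
        if (usedByYaCyBytes > maxUsedBytes) return Space.LOW;
        return Space.OK;
    }

    /** The cleanup process runs only if autodelete is on and space is low. */
    public static boolean shouldCleanup(boolean autodelete, Space space) {
        return autodelete && space == Space.LOW;
    }

    public static void main(String[] args) {
        // Free space of the partition holding the current directory.
        long free = new File(".").getUsableSpace();
        // Assumed limits: keep 1 GiB free, allow YaCy up to 100 GiB.
        // The used-by-YaCy value is a placeholder (0) in this sketch.
        Space state = check(free, 0L, 1L << 30, 100L << 30);
        System.out.println("free=" + free + " state=" + state
                + " cleanup=" + shouldCleanup(true, state));
    }
}
```

Keeping the decision in a pure function of four numbers makes it easy to test independently of the real file system.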
this hack is the occurrence of Exceptions like:
W 2014/02/11 18:51:33 ConcurrentLog GC overhead limit exceeded
java.io.IOException: GC overhead limit exceeded
at net.yacy.cora.federate.solr.connector.AbstractSolrConnector.getDocumentById(AbstractSolrConnector.java:334)
at net.yacy.cora.federate.solr.connector.MirrorSolrConnector.getDocumentById(MirrorSolrConnector.java:173)
at net.yacy.cora.federate.solr.connector.ConcurrentUpdateSolrConnector.getDocumentById(ConcurrentUpdateSolrConnector.java:415)
at net.yacy.search.index.Fulltext.getMetadata(Fulltext.java:331)
at net.yacy.search.index.Fulltext.getMetadata(Fulltext.java:317)
at net.yacy.search.query.SearchEvent.pullOneRWI(SearchEvent.java:1024)
at net.yacy.search.query.SearchEvent.pullOneFilteredFromRWI(SearchEvent.java:1047)
at net.yacy.search.query.SearchEvent$3.run(SearchEvent.java:1263)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOfRange(Arrays.java:3077)
at java.lang.StringCoding.decode(StringCoding.java:196)
at java.lang.String.<init>(String.java:491)
at java.lang.String.<init>(String.java:547)
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.readField(CompressingStoredFieldsReader.java:187)
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:351)
at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:276)
at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
at org.apache.lucene.index.IndexReader.document(IndexReader.java:436)
at org.apache.solr.search.SolrIndexSearcher.doc(SolrIndexSearcher.java:657)
at net.yacy.cora.federate.solr.connector.EmbeddedSolrConnector.SolrQueryResponse2SolrDocumentList(EmbeddedSolrConnector.java:230)
at net.yacy.cora.federate.solr.connector.EmbeddedSolrConnector.getDocumentListByParams(EmbeddedSolrConnector.java:320)
at net.yacy.cora.federate.solr.connector.AbstractSolrConnector.getDocumentById(AbstractSolrConnector.java:330)
... 7 more
This problem was analysed with the Eclipse Memory Analyzer after a heap
dump, where the following was reported as the main Problem
Suspect:
One instance of "org.apache.solr.util.ConcurrentLRUCache" loaded by
"sun.misc.Launcher$AppClassLoader @ 0x42e940a0" occupies 902.898.256
(61,80%) bytes. The memory is accumulated in one instance of
"java.util.concurrent.ConcurrentHashMap$Segment[]" loaded by "<system
class loader>".
This memory is part of the result cache of Solr. Flushing this cache
appears to be the most appropriate solution to that problem.
occupied disk space. These values are also shown on the status page.
The disk space calculation will be used to limit the disk usage of the
search index.
if load > 1 (but < 2) but only if there is enough memory (now: 0.5 GB
RAM available). The memory consumption of the postprocessing is the reason
that systems block: they run into a chain of frequent GCs which
almost locks the peer. When running with enough memory, the postprocessing
is fast and does not harm the system.
Because the required 0.5 GB of RAM is never available with the default
settings, the postprocessing will not run unless the peer is reconfigured
to use more memory.
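The memory condition can be sketched as follows. This covers only the 0.5 GB requirement, not the load condition; the class and method names are hypothetical, and reading 0.5 GB as 512 MiB is an assumption. Available memory is computed here from the standard java.lang.Runtime values as the heap the JVM may still claim plus the unused part of the heap already claimed.

```java
public class PostprocessingGate {

    /** Required available memory; 0.5 GB is read as 512 MiB (an assumption). */
    public static final long REQUIRED_BYTES = 512L * 1024L * 1024L;

    /**
     * Memory the JVM can still hand out: heap not yet allocated
     * (max - total) plus the unused part of the allocated heap (free).
     */
    public static long available(long maxMemory, long totalMemory, long freeMemory) {
        return maxMemory - totalMemory + freeMemory;
    }

    /** True if enough memory is available to run the postprocessing. */
    public static boolean enoughMemory(long availableBytes) {
        return availableBytes >= REQUIRED_BYTES;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long avail = available(rt.maxMemory(), rt.totalMemory(), rt.freeMemory());
        System.out.println("available=" + avail
                + " postprocessing allowed=" + enoughMemory(avail));
    }
}
```

With a small maximum heap, the computed value cannot reach the threshold, which matches the remark that the postprocessing does not run until the peer is given more memory.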
introduced, it was also used for search facets. The generic search
facets are now deduced from generic Solr fields, which makes Jena as a tool
for facet semantics superfluous.
- redesigned the instance mirror class (which was a mess)
- added final method to close a searcher (which otherwise keeps a cache)
- changed the cache clear method, which now iterates over resources and calls
clear on all caches in the searcher resources
- selecting more than one nav combines the selections (with AND)
- unselecting one nav clears all selections
(e.g. selecting filetype:pdf and /language/fr shows ~ French PDFs only)
works fine to restrict language for local solrSearches.
More work needs to be done to make rwi/remote searches respect the modifier.language restriction.
webgraph. These cores are now accessible at
/solr/collection1/select instead of /solr/select?core=collection1
and
/solr/webgraph/select instead of /solr/select?core=webgraph
in addition to the old behavior, which is kept for compatibility with old
peers. These new paths fully conform to the Solr standard and will allow
cross-linking between YaCy peers using their public Solr API.
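The two path forms above can be illustrated with a small helper. This is only a sketch that builds the path strings named in the text; the class and method names are hypothetical and are not part of YaCy or Solr.

```java
public class SolrSelectPaths {

    /** New, Solr-standard form: the core name is part of the path. */
    public static String newStylePath(String core) {
        return "/solr/" + core + "/select";
    }

    /** Old form, still supported: the core name is a query parameter. */
    public static String oldStylePath(String core) {
        return "/solr/select?core=" + core;
    }

    public static void main(String[] args) {
        for (String core : new String[] {"collection1", "webgraph"}) {
            System.out.println(newStylePath(core) + " instead of " + oldStylePath(core));
        }
    }
}
```

A client talking to peers of unknown version could try the new-style path first and fall back to the old one.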