structure, but is not filled yet. To make a second core possible,
multi-core functionality had to be implemented in the deep-embedded
solr:
- migrated the solr_40 directory content to a subdirectory
'collection1'; the previously used default core is now called
collection1
- added solr_40/webgraph subdirectory as second core
- added a servlet configuration for the second core 'webgraph' in
/IndexSchema_p.html
- added instance handling in addition to solr connections: all solr
connectors are now instances of a solr 'instance' object; this required
a complete re-design of the solr embedding
- migrated caching and sharding on top of the new instance handling
- migrated the search apis so that they access a specific core; the
default core is named 'collection1'
- migrated the remote solr search interface to access shards of cores;
for the yacy remote search the default core is now called 'solr'; using
the peer address as solr address
- migrated the solr backup and restore process: old backups cannot be
used after this migration!
- redesign of solr instance handling in all methods which access the
instances: they cannot hold copies of these instances any more; they
must retrieve the actual connection object every time they want to write
to it (this also solves some bugs when switching the index/network)
- added another schema 'solr.webgraph.schema', the old solr.keys.list is
replaced by solr.collection.schema
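
The rule that callers must fetch the connection object on every write
can be sketched as follows. This is a minimal illustration in Python,
not the YaCy code: the names SolrInstance and InstanceHolder are
hypothetical, and the document lists only stand in for real cores.

```python
class SolrInstance:
    """Hypothetical stand-in for one solr instance that hosts several cores."""

    def __init__(self, name, cores):
        self.name = name
        # one document list per core, standing in for a real index
        self.cores = {core: [] for core in cores}

    def add(self, core, doc):
        self.cores[core].append(doc)


class InstanceHolder:
    """Hypothetical registry that owns the currently active instance."""

    def __init__(self, instance):
        self._instance = instance

    def current(self):
        # callers ask for the instance on every write instead of caching it
        return self._instance

    def switch(self, instance):
        # e.g. when the index or the network is switched
        self._instance = instance


holder = InstanceHolder(SolrInstance("old", ["collection1", "webgraph"]))
stale = holder.current()  # a held copy, which the redesign forbids
holder.switch(SolrInstance("new", ["collection1", "webgraph"]))

# Writing through the holder reaches the new instance ...
holder.current().add("collection1", {"id": "a"})
# ... while the held copy still writes to the old one.
stale.add("collection1", {"id": "b"})
```

After the switch, the document written through the held copy never
reaches the active instance, which is the kind of bug the redesign is
meant to rule out.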
http://forum.yacy-websuche.de/viewtopic.php?p=27509#p27509
Instead, the start script checks whether the host is a 64-bit host and,
if so, sets -Dsolr.directoryFactory=solr.MMapDirectoryFactory as a java
option
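
The check can be sketched like this. It is an illustration in Python
rather than the actual start script; the function name java_options and
the set of architecture names treated as 64-bit are assumptions.

```python
import platform

MMAP_OPTION = "-Dsolr.directoryFactory=solr.MMapDirectoryFactory"


def java_options(machine=None):
    """Return extra java options; machine defaults to the local architecture."""
    if machine is None:
        machine = platform.machine()
    # assumption: these architecture names are treated as 64-bit
    is_64bit = machine.lower() in {"x86_64", "amd64", "aarch64", "arm64"}
    return [MMAP_OPTION] if is_64bit else []
```

On a 32-bit host the list stays empty, so the option is not set there.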
Reverted the ramBufferSizeMB setting (this was not enabled anyway)
because that may be too much memory for small peers and embedded
systems.
Activated the mergeFactor 4; this was commented out by mistake
The target of these performance settings is the reduction of IO in
general and during search in particular.
- reduced mergeFactor to 4. This will increase the IO during indexing,
but will reduce IO during search. It will also greatly reduce the number
of open files which should make it possible to have overall larger
indexes until the number of open files in an OS is reached.
- increased ramBufferSizeMB to 256 MB. This will reduce the number of
commits. This change may compensate for the reduction of the
mergeFactor.
- disabled updateLog. This is a real-time search feature which is
available in YaCy anyway because a commit is forced if index.html is
called. The updateLog feature causes a lot of IO during indexing and
search and produces a lot of files in SEGMENTS/solr_40/data/tlog
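
Why a lower mergeFactor means fewer segments (and so fewer open files)
can be seen in a toy model. The sketch below is not Lucene's actual
merge policy, which also weighs segment sizes; it only assumes that
mergeFactor segments on one level are merged into one segment on the
next level. The function name segment_count is hypothetical.

```python
def segment_count(flushes, merge_factor):
    """Count segments left after `flushes` buffer flushes in the toy model."""
    levels = []  # levels[i] = number of segments currently on level i
    for _ in range(flushes):
        level = 0
        while True:
            if level == len(levels):
                levels.append(0)
            levels[level] += 1
            if levels[level] < merge_factor:
                break
            # merge_factor segments on this level merge into one on the next
            levels[level] = 0
            level += 1
    return sum(levels)


print(segment_count(999, 10))
print(segment_count(999, 4))
```

In this model 999 flushes leave 27 segments with a mergeFactor of 10 but
only 12 with a mergeFactor of 4, at the price of more merges during
indexing.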
Introduced a copy-field for the author field to be copied to a string
field. This field is then used to generate facets. Without this field,
the facet would consist only of the words of the author names, not of
the full author string.
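
The difference between the two facets can be illustrated with a small
sketch. It uses plain Python counting instead of solr, and the sample
documents and function names are made up for the illustration.

```python
from collections import Counter

# made-up sample documents
docs = [
    {"author": "Jane Doe"},
    {"author": "Jane Doe"},
    {"author": "John Doe"},
]


def facet_on_tokens(docs):
    # a tokenized text field yields one facet value per word
    return Counter(word.lower() for d in docs for word in d["author"].split())


def facet_on_string_copy(docs):
    # the string copy keeps the whole author string as a single value
    return Counter(d["author"] for d in docs)
```

Faceting on the tokens gives counts for 'doe', 'jane' and 'john', while
faceting on the string copy gives counts for 'Jane Doe' and 'John Doe'.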
- fixed type definition found by the verifier
- added multivalue-string fields for solr with extension 'sxt'
- added multivalue-integer fields for solr with extension 'val'
- renamed some solr attributes from txt to sxt
- changed solr query line to an explicit AND/OR structure
- added a country code second level domain list to Domains class,
together with a parser
- added a host string parser to get domain class name, country-code
second-level domain and subdomain out of it
- removed old coordinate attributes
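
The host string parser mentioned above could work roughly as sketched
below. This is an illustration in Python, not the code in the Domains
class: the CC_SLD sample list, the function name parse_host and the
result keys are assumptions, and so is the reading that the domain class
name is the TLD, or ccSLD plus TLD where a ccSLD applies.

```python
# assumption: a tiny hand-picked sample, not the list shipped with YaCy
CC_SLD = {"co.uk", "ac.uk", "com.au", "co.jp"}


def parse_host(host):
    """Split a host name into domain class name, ccSLD, organization, subdomain."""
    labels = host.lower().strip(".").split(".")
    # a ccSLD only counts if at least one label is left in front of it
    has_ccsld = len(labels) >= 3 and ".".join(labels[-2:]) in CC_SLD
    cut = 2 if has_ccsld else 1
    rest = labels[:-cut]
    return {
        "dnc": ".".join(labels[-cut:]),        # TLD, or ccSLD plus TLD
        "ccsld": labels[-2] if has_ccsld else None,
        "organization": rest[-1] if rest else "",
        "subdomain": ".".join(rest[:-1]),
    }
```

With this sample list, 'www.bbc.co.uk' is split into the domain class
name 'co.uk', the ccSLD 'co', the organization 'bbc' and the subdomain
'www'.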