yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Peter Christen	8eb0d490aa	migrated solr to 9.0 This is a major step because solr removed support for embedded solr instances in 9.0 and we want to keep it because we want to ship YaCy with an embedded solr. It was necessary to add parts of solr code into YaCy to make this migration possible. Further on with Solr 9.1 they removed even more parts which are required for embedded operation, therefore we cannot migrate yet further without big changes. If you are running a YaCy instance with Solr 8.x, the migration should be done automatically. If not, you first need to migrate to a YaCy version 1.93 with Solr 8.x to migrate to Solr 8 data.	6 months ago
Michael Peter Christen	88cd17ea57	migrated solr from 8.9.0 to 8.11.2; activated also migration script. A YaCy index with solr 8.9.0 will automatically be migrated to 8.11.2. This is a preparation step to migrate to 9.0.0 soon.	1 year ago
Michael Peter Christen	1c0f50985c	fixed documentation and some details of handling of keywords	2 years ago
sgaebel	1cdc55a425	lets SOLR merge bigger segments (up to 50GB) + some setting to reduce caches	3 years ago
Michael Peter Christen	8b4394a6c5	fixes for solr 8.8.1 migration - replace new guava 30 with older 25 because that is the correct dependency for solr 8.8.1. The newer one did actually not work! - index will be created in a DATA/INDEX/freeworld/SEGMENTS/solr_8_8_1 subfolder. The older solr_6_6 index is not touched but also not migrated. The index starts with fresh (empty) content. - Older indexes must be migrated by hand (export/import) so far until a better solution is found. - Large schema adaptations for lucene 8.8.1	4 years ago
luccioman	5a3d5cb92c	Upgraded Solr config files with the ones provided by Solr release Fixes #292	6 years ago
luccioman	4196101379	Enable soft autocommit in default Solr config Since upgrade from Solr 5.5 to Solr 6.6 (commit `6fe7359`), hard autocommits were still enabled to regularly persist the Solr index to the file system, but new index entries were no more automatically made available for use by the application (soft autocommit). Therefore, YaCy features such as index statistics, that do not perform an explicit commit (as recommended by Solr documentation) were no more accurate. Soft autocommit is now restored as a default, with a time period expected to be sufficient for accuracy while adding only a reasonable system load overhead. Fixes issue #251	6 years ago
reger	41616de0b8	Add SolrConfig ClassicIndexSchemaFactory to prevent Solr startup warning. This overrides Solr default to use managed schema. As we don't use programmatic schema changes this directs Solr to use schema.xml, eliminating the warning.	7 years ago
reger	9220ccbec7	remove reference to velocityresponsewriter in solrconfig.xml; it is no longer part of solr-core api http://lucene.apache.org/solr/6_6_0/index.html	8 years ago
reger	4be4bfbba6	remove sample path setting in solrconfig.xml; it is not valid in YaCy and results in a startup stop exception after a fresh switch to 1.921	8 years ago
luccioman	f6e8d71718	Prevent high CPU load at startup, caused by the Solr suggester build. Reported by Collision on mantis 758 ( http://mantis.tokeek.de/view.php?id=758 ). Introduced by the new YaCy Solr configuration for Solr 6.6.0 (see commit `6fe735945d`), including now Suggester configuration.	8 years ago
Michael Peter Christen	6fe735945d	migrated Solr 5.5 -> Solr 6.6 and from Java 1.7 -> 1.8 Also: now Version 1.921	8 years ago
reger	35a7d57260	update lucenematchversion to current (5.2.0 -> 5.5.0) there should be no need for reindex by the update	8 years ago
luc	55a4d15775	Added a note on deprecated default search field and operator.	9 years ago
sixcooler	f5a9948860	do not store subfield *_coordinate	9 years ago
sixcooler	fca353e5eb	set startuptype of most solr handlers to lazy	9 years ago
reger	c720b4c249	remove override of dynamicField coordinate_p in solr schema (coordinate_p is not a mandatory field as such doesn't need to be declared as schema.field)	9 years ago
reger	5e45f1a460	enable Solr schema dynamicField _p (type=location) for YaCy coordinate_p field	9 years ago
sixcooler	87e4abe393	fight the fieldcache by using DocValues: in Solr-5.x the fieldcache has moved and was not cleared anymore. This results in a huge fieldcache. (http://lucene.apache.org/#highlights-of-the-lucene-release-include https://issues.apache.org/jira/browse/LUCENE-5666) Here I try to use DocValues where it is possible. For this I used the Api-Scheme as new basis for the Solr-Schema. This needs at least a complete optimization of the Solr-Index to get a smaller FieldCache. Everything that is indexed with these settings will not use the Fieldcache at all.	9 years ago
reger	00d2062813	Remove deprecated AdminHandlers in solrconfig.xml to avoid warning log W org.apache.solr.handler.admin.AdminHandlers <requestHandler name="/admin/" class="solr.admin.AdminHandlers" /> is deprecated . It is not required anymore	10 years ago
Michael Peter Christen	694b22f165	migration to Solr 5.2: huge benefits - this is a lot faster! This is a very complex migration: many classes had been renamed or removed, dependencies changed and the solr index type is now aligned to be a solr cloud repository. Together with the Solr 5.2 library update, one other dependent library had been updated as well: httpclient 4.4->4.4.1 Older indexes are migrated from 4_10 to 5_2. However, the new index structure is more efficient and we recommend to re-index everything. Please use the index export before you do the update to a large surrogate xml file. After the update, start with an empty index and then initialize this with your dump.	10 years ago
Michael Peter Christen	36e9cdb376	testing switching off cold searchers; maybe this brings performance enhancements when using large facets	10 years ago
sixcooler	5594c43d2e	bump to Solr-/Lucene-4.10.3	10 years ago
sixcooler	725b206fb4	update to solr-/lucene-4.10.2	10 years ago
Michael Peter Christen	09dcdb9b19	update to solr 4.9.0	11 years ago
Michael Peter Christen	d4157184ec	migration to Solr 4.8.1 This includes also an update to zookeeper 3.4.6 and a new library that Solr initializes by default: org.restlet from http://restlet.com/download/current#release=stable&edition=jse&distribution=zip which is included in version 2.2.1 from may 6th 2014	11 years ago
Michael Peter Christen	ebd44a7080	replaced solr 4.6.1 with solr 4.7.1 and added index migration to lucene_47	11 years ago
Michael Peter Christen	ee92d748b5	test using compound file format, see UseCompoundFile in https://cwiki.apache.org/confluence/display/solr/IndexConfig+in+SolrConfig This appears to be necessary as many times a java.io.FileNotFoundException: (Too many open files) appears. See also: https://issues.apache.org/jira/browse/SOLR-4 and desperate users at http://stackoverflow.com/questions/3828343/too-many-open-file-exception-while-indexin-using-solr We cannot force users to do a "ulimit -n 1000000", so this action seems to be required.	11 years ago
orbiter	f77afa9d1d	add index on _val fields; this especially affects title length. An index on a field makes search facets on that field possible	11 years ago
Michael Peter Christen	2f16770681	migrated to solr 4.6.0	11 years ago
Michael Peter Christen	a5c1249ee2	reverted autowarming setting in solrconfig	11 years ago
Michael Peter Christen	81bb50118e	found and fixed a huge memory leak in solr caching (inside Solr). The not-flushed Solr cache is now handled in this way: - it is smaller by default - a Solr-internal process is started to flush the cache periodically (this does NOT clean the cache, just removes old objects) - a Solr-external process (the standard YaCy cleanup-process) now has direct access to the solr internal cache and flushes them completely. The time frame for such a flush is defined by the cleanup-process frequency, by default 10 minutes.	11 years ago
Michael Peter Christen	21aa6a0321	migration to Solr 4.5.0	11 years ago
Michael Peter Christen	1a3e42eca4	index migration to lucene 4.4	11 years ago
sixcooler	1bc6003057	raise autoCommit maxTime to 3 minutes to reduce IO; lower mergeFactor again (5) for fewer segments	11 years ago
orbiter	1b43e02b86	Merge branch 'master' of git://gitorious.org/~quix0r/yacy/quix0rs-yacy-rc1	12 years ago
orbiter	a548354c71	replaced type of solr schema object sku of text_en_splitting_tight by string	12 years ago
Roland Haeder	ebbb3bc5c1	Fixed CHMOD on many files + added missing loggers (e.g. jena) and made some noisy loggers quiet	12 years ago
Michael Peter Christen	7754a1263b	switching back to the merge factor 10; the solr default.	12 years ago
Michael Peter Christen	959ccc4675	increased the solr merge factor because 4 was too much IO load for frequent index receiving and re-indexing after clickdepth/cr calculation.	12 years ago
reger	8a7fcb391d	enable use of solrcore.properties for property substitution of solrconfig.xml - move setting of system property solr.directoryFactory=solr.MMapDirectoryFactory to solrcore.properties - add check of os.arch for 64bit system, if it fails use default/solrcore.x86.properties (if exists) as solrcore.properties reason: on 32bit MMapDirectoryFactory may fail with..... Caused by: java.io.IOException: Map failed at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:849) at org.apache.lucene.store.MMapDirectory.map(MMapDirectory.java:283)	12 years ago
Michael Peter Christen	eb9d0ba5b1	ranking and boost function update, small bugfixes, better default search field for solr	12 years ago
Michael Peter Christen	a8dc4346e8	default configuration of MMapDirectoryFactory for solr, increased lock timeout, less documents from remote searches (too many results had easily blocked a peer)	12 years ago
Michael Peter Christen	0c1a018bbd	removed 'later' tactic because it used too much RAM, reduced number of soft commits, reduced caching size of search events, ensured that solr results are processed before connection is closed to keep that stuff not too long in RAM	12 years ago
Michael Peter Christen	9bd2aee180	migrated to solr 4.3.0	12 years ago
Michael Peter Christen	addba047e2	changes in ranking computation - an existing ranking servlet for solr was extended. It is now possible to set boost values for fields, boost functions and boost queries. - The ranking can have different instances, but currently only the first one is used - added an abstraction layer for fields which can be used for search and those fields can be edited in the solr ranking configuration - the ranking value from solr within the field score is used to combine remote search requests, which all are created using the same locally defined boost values - reduced the number of fields which are used for search (makes it faster) - replaced some text fields by string fields (makes indexing faster) - removed classes which had no use - made a large number of experiments for a better ranking and created a temporary setting which prefers hits inside titles - adjusted also the RWI-based ranking computation to 'prefer title' - made special cases like for portal search where no post-processing and post-ranking is wanted: this keeps the original ranking order as done by Solr - fixed many bugs with old settings for ranking	12 years ago
Michael Peter Christen	91a0401d59	introduced a second core named 'webgraph'. This core will hold the link structure, but is not filled yet. To have the opportunity of a second core, multi-core functionality had to be implemented to the deep-embedded solr: - migrated the solr_40 directory content to a subdirectory 'collection1'; the previously used default core is now called collection1 - added solr_40/webgraph subdirectory as second core - added a servlet configuration for the second core 'webgraph' in /IndexSchema_p.html - added instance handling as addition to solr connections: all solr connectors are now instances of a solr 'instance' object; this required a complete re-design of the solr embedding - migrated also caching and sharding on top of new instance handling - migrated the search apis to handle now the access to a specific core, the default core named 'collection1' - migrated the remote solr search interface to access shards of cores; for the yacy remote search the default core is now called 'solr'; using the peer address as solr address - migrated the solr backup and restore process: old backups cannot be used after this migration! - redesign of solr instance handling in all methods which access the instances: they cannot hold copies of these instances any more; they must retrieve the actual connection object every time they want to write to it (this solves also some bugs when switching the index/network) - added another schema 'solr.webgraph.schema', the old solr.keys.list is replaced by solr.collection.schema	12 years ago
Michael Peter Christen	8651ec35fe	turned author_s into the multi-valued field author_sxt	12 years ago
Michael Peter Christen	9b5bdae1b4	Reverted setting of MMapDirectoryFactory from solrconfig; see http://forum.yacy-websuche.de/viewtopic.php?p=27509#p27509 Instead, the start script checks whether the host is a 64-bit host and, if so, sets -Dsolr.directoryFactory=solr.MMapDirectoryFactory as a java option. Reverted the ramBufferSizeMB setting (this was not enabled anyway) because that may be too much memory for small peers and embedded systems. Activated the mergeFactor 4; this was commented out by mistake	12 years ago
orbiter	eb68a30947	solr performance settings the target of these performance settings is the reduction of IO in general and during search in particular. - reduced mergeFactor to 4. This will increase the IO during indexing, but will reduce IO during search. It will also greatly reduce the number of open files which should make it possible to have overall larger indexes until the number of open files in an OS is reached. - increased ramBufferSizeMB to 256mb. This will reduce the number of commits. This change may compensate the reduction of the mergeFactor. - disabled updateLog. This is a real-time search feature which is available in YaCy anyway because a commit is forced if index.html is called. The updateLog feature causes a lot of IO during indexing and search and produced a lot of files in SEGMENTS/solr_40/data/tlog	12 years ago

59 Commits (70454654f367f3405043937107475526af02ae46)