yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Peter Christen	910a496c9f	replaced http links with https	4 months ago
Michael Peter Christen	687820788d	this assert does not work because of the 9_0_0 solr version format. A 9_0 is expected but it does not work this way with this version.	4 months ago
Michael Peter Christen	f1c70dce33	Merge branch 'master' of github.com:yacy/yacy_search_server	6 months ago
Michael Peter Christen	8eb0d490aa	migrated solr to 9.0. This is a major step because solr removed support for embedded solr instances in 9.0 and we want to keep it because we want to ship YaCy with an embedded solr. It was necessary to add parts of solr code into YaCy to make this migration possible. Furthermore, with Solr 9.1 they removed even more parts which are required for embedded operation, therefore we cannot migrate any further yet without big changes. If you are running a YaCy instance with Solr 8.x, the migration should be done automatically. If not, you must first migrate to YaCy version 1.93 with Solr 8.x to migrate your data to Solr 8.	6 months ago
Michael Peter Christen	b295e38969	fine-tuned the import process of jsonl files which had been missing to actually be able to make searches and browse the index with the host browser	6 months ago
Michael Christen	d097a642c2	Merge pull request #615 from okybaca/logging2 Logging unclutter	12 months ago
Michael Christen	6d5e9ff53f	Merge pull request #616 from okybaca/logging3 changed the log entry REJECTED to CRAWLER * REJECTED, loglevel fine	12 months ago
pr0vieh	dfb2b79609	Add setting for DHT receive loadprereq instead of hardcoded load < 2.0	12 months ago
okybaca	5dee8dbcbd	changed the log entry REJECTED to CRAWLER * REJECTED, loglevel fine	12 months ago
Michael Christen	4c603e23f0	Merge pull request #610 from okybaca/cr-text UI: added a more descriptive message, CitationRank instead of cr	12 months ago
okybaca	553c859703	logging: moved some log-cluttering DHT messages to level 'fine'	12 months ago
sgaebel	d72cd7916c	Merge branch 'master' of https://github.com/yacy/yacy_search_server	1 year ago
sgaebel	0663ae3c99	adds synchronized dumplog	1 year ago
okybaca	cba84632ee	UI: added a more descriptive message, CitationRank instead of cr	1 year ago
Michael Peter Christen	3268a93019	added a 'minified' option to YaCy dumps	1 year ago
Michael Peter Christen	c20c4b8a21	modified export: added maximum number of docs per chunk. The export file can now be many files, called chunks. By default still only one chunk is exported. This function is required in case the exported files are to be imported into an elasticsearch/opensearch index. The bulk import function of elasticsearch/opensearch is limited to 100MB. To make it possible to import YaCy files, they must be split into chunks. Right now we cannot estimate the chunk size in bytes, only as a number of documents. The user must experiment to find the optimum maximum chunk size, e.g. 50000 docs per chunk. Try this as a first attempt.	1 year ago
Michael Peter Christen	24011dcbcc	more file name extensions for json list surrogate files	1 year ago
Michael Peter Christen	7db0534d8a	Added a zim parser to the surrogate import option. You can now import zim files into YaCy by simply moving them to the DATA/SURROGATE/IN folder. They will be fetched and after parsing moved to DATA/SURROGATE/OUT. There are exceptions where the parser is not able to identify the original URL of the documents in the zim file. In that case the file is simply ignored. This commit also carries an important fix to the pdf parser and an increase of the maximum parsing speed to 60000 PPM which should make it possible to index up to 1000 files in one second.	1 year ago
Michael Peter Christen	4308aa5415	removed concept of empty passwords as "no passwords used", because we now start YaCy with a default password (yacy). This has an impact on all functions that check the current state of password-protection and that included the empty password situation, including the warnings to set a password in case none is set (which cannot be the case any more).	1 year ago
Michael Peter Christen	4da320bebf	added a warning message in ConfigBasic in case that the default password was not changed.	1 year ago
Michael Peter Christen	ff8fe7b6a4	fix for ',' or '.' appearing within a word or number. This will not tokenize the query into parts around that character to make it possible to search for numbers or version numbers.	1 year ago
Michael Peter Christen	88cd17ea57	migrated solr from 8.9.0 to 8.11.2; activated also migration script. A YaCy index with solr 8.9.0 will automatically be migrated to 8.11.2. This is a preparation step to migrate to 9.0.0 soon.	1 year ago
Michael Peter Christen	1c0f50985c	fixed documentation and some details of handling of keywords	2 years ago
Michael Peter Christen	9fcd8f1bda	added canonical filter attention: this is on by default! (it should do the right thing)	2 years ago
Michael Christen	4304e07e6f	crawl profile adoption to new tag valency attribute	2 years ago
Michael Peter Christen	309adb814e	fixed jsonlist import from searchlab.eu using a direct URL	2 years ago
Michael Peter Christen	62d177bf59	stub for jsonlist index importer web page	2 years ago
Michael Peter Christen	efa0425f00	refactoring: moved jsonlist importer to importer class	2 years ago
Michael Peter Christen	49daa32a88	yacy can now read searchlab export dump files using the surrogate input process: - copy the searchlab export file to DATA/SURROGATE/in - the file is processed automatically and then moved to DATA/SURROGATE/OUT	2 years ago
Michael Christen	99174282d8	try to shut down in a bit more ordered way inspired by https://github.com/yacy/yacy_search_server/issues/518	2 years ago
Michael Peter Christen	482f507e65	upgraded solr from 8.8.1 to 8.9.0 should hopefully fix https://github.com/yacy/yacy_search_server/issues/496 because it includes https://issues.apache.org/jira/browse/SOLR-13034	2 years ago
Michael Peter Christen	60c9986a0e	new release file names with date and git hash ...without reference to 9000ish SVN	2 years ago
Michael Peter Christen	9c1bc533fa	removed hazelcast because it is phoning home, see also: https://github.com/yacy/yacy_search_server/issues/504	2 years ago
Michael Peter Christen	fc98ca7a9c	removed ContentControl servlet and functionality. This was not used at all (as far as I know) and was blocking a smooth integration of ivy in the context of an existing JSON parser.	2 years ago
Michael Peter Christen	3d138d3fdd	catch error when initializing hazelcast should fix https://github.com/yacy/yacy_search_server/issues/468	2 years ago
Burkhard	a6a9828181	Merge pull request #440 from lfuelling/master Add setting for public facing port	3 years ago
Daleth Darko	3ced06c731	Various javadoc fixes	3 years ago
reger24	6a1e259fd0	Fix NPE in Switchboard . getURL https://github.com/yacy/yacy_search_server/issues/441	3 years ago
Lukas Fülling	e8a00007f6	add setting for public facing port	3 years ago
Michael Peter Christen	bd3f2483a1	replaced url and date retrieval by only url retrieval. This should prevent the search index from being used for freshness of the index entry.	3 years ago
Michael Peter Christen	163ba26d90	replaced check for load time method: instead of loading the solr document, an index only for the last loading time was created. This prevents solr from having to fetch from its index while the index is created. Excessive re-loading of documents while indexing has been shown to produce deadlocks, so this should now be prevented.	3 years ago
Michael Peter Christen	be0aebad84	fixes https://github.com/yacy/yacy_search_server/issues/424	3 years ago
Michael Peter Christen	63ad8ce6b2	removed ymarks; had not been used for a long time	3 years ago
Michael Peter Christen	ef5a71a592	enhanced crawl start response time for very very large crawl start lists	3 years ago
Michael Peter Christen	e9c5e78868	replaced new Number(Number) with Number.valueOf to remove deprecation warnings for Java 9	3 years ago
Michael Peter Christen	e81b770f79	enabled crawl starts with very large sets of start urls i.e. 10MB large url list with approx 0.5 million start points	3 years ago
Michael Peter Christen	1cdb21592b	added hazelcast and some modifications to align legacy YaCy with YaCyGrid	4 years ago
Michael Peter Christen	8f876a8c72	added concurrency to enhance indexing speed during json surrogate import	4 years ago
Michael Peter Christen	f8cbaeef93	Merge branch 'master' of git@github.com:yacy/yacy_search_server.git	4 years ago
Michael Peter Christen	a857e3d3d5	fix for json importer	4 years ago


1497 Commits (910a496c9f657adaa8560d068c5a2e29d9bdd68a)