yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Peter Christen	163ba26d90	replaced check for load time method: instead of loading the solr document, an index only for the last loading time was created. This prevents solr from having to fetch from its index while the index is created. Excessive re-loading of documents while indexing has been shown to produce deadlocks, so this should now be prevented.	3 years ago
Michael Peter Christen	1ead7b85b5	remove compiler warning "warning: [try] explicit call to close() on an auto-closeable resource"	3 years ago
Michael Peter Christen	59777010dc	Merge branch 'master' of git@github.com:yacy/yacy_search_server.git	3 years ago
Michael Peter Christen	7898815c41	disabling concurrent logging (maybe temporary)	3 years ago
sgaebel	4bf6954474	uses clientBuilder not HttpClients.custom() to have these inside the Pool too	3 years ago
sgaebel	cdf901270c	always use HTTPClient by 'try with resources' pattern to free up resources	3 years ago
sgaebel	69adaa9f55	makes our HTTPClient closable	3 years ago
sgaebel	fc4275f901	handle all references for client, response, request to be able to close them	3 years ago
sgaebel	e7d3a363f2	refactor to use finish()	3 years ago
sgaebel	4fc876f4a3	revert back to use EntityUtils.consumeQuietly - as it simply closes the underlying stream	3 years ago
sgaebel	4f0392e93e	refactor use of AuthSchemeProvider	3 years ago
sgaebel	b74f337859	removes double setting of UserAgent	3 years ago
sgaebel	965748fefb	some refactoring using try with resources	3 years ago
Michael Peter Christen	552ab7051b	fix for warc importer	3 years ago
Michael Peter Christen	3c86b7b780	attempt to make a Mac Release using gradle. This is almost working with many workarounds: - run rm lib/yacycore.jar - run ./gradlew clean build bundleNative - run ant clean all - run again rm lib/yacycore.jar - run ./fixMacBuild.sh The build is then inside build/mac/YaCy.app. Right now this works so far but it does not have the correct release number inside. Target is to make this work for Windows releases and to embed the jre entirely.	3 years ago
Michael Peter Christen	999c819e3e	Merge branch 'master' of https://github.com/yacy/yacy_search_server.git	3 years ago
Michael Peter Christen	fd770e90e2	spike to identify paths for YaCy within mac application bundles	3 years ago
Michael Peter Christen	d19872fd26	making sure that crawl queues are closed correctly to prevent data loss	3 years ago
sgaebel	90507c0fdc	comments out printing query params to std.out	3 years ago
Michael Peter Christen	be0aebad84	fixes https://github.com/yacy/yacy_search_server/issues/424	3 years ago
Michael Peter Christen	63ad8ce6b2	removed ymarks, which had not been used for a long time	4 years ago
Michael Peter Christen	ef5a71a592	enhanced crawl start response time for very very large crawl start lists	4 years ago
Michael Peter Christen	4cadd557dc	removed synchronization in table creation to avoid possible deadlocks when handling OnDemandOpenFileIndex which happens quite often during wide crawling	4 years ago
admin	9b7668fa58	reduced memory footprint during indexing/crawling	4 years ago
Michael Peter Christen	e6a87e0426	enhanced crawler: a main problem when crawling is the long waiting time caused by crawl-delay values from robots.txt entries. That attribute is not supported by google and is interpreted by yandex and bing in different ways. In large crawls there is always one host which blocks the whole crawl with extremely large values. YaCy now still obeys crawl-delay but limits it to 10 seconds. Additionally, the blocking logic when loading new robots.txt was analyzed and a deadlock was removed. Furthermore, the construction of new queue lists was redesigned and it was ensured that a large list of different hosts for host-balancing is always provided for the loader.	4 years ago
Michael Peter Christen	e9c5e78868	replaced new Number(Number) with Number.instanceOf to remove deprecation warnings for Java 9	4 years ago
Michael Peter Christen	9e13d77de4	removed call to class.finalize() because of deprecation in java 9. next: removal of finalize() implementation after testing with assert false	4 years ago
Michael Peter Christen	9ef4503672	fixed some newInstance() warnings .. by adding .getDeclaredConstructor()	4 years ago
Michael Peter Christen	1d41380f0a	better support for mac-specific tray functions in java 9	4 years ago
Michael Peter Christen	e81b770f79	enabled crawl starts with very large sets of start urls i.e. 10MB large url list with approx 0.5 million start points	4 years ago
Michael Peter Christen	c623a3252e	fix for jdk 14 bug	4 years ago
Michael Peter Christen	dbd211a1ad	removed/replaced reflection in memory tool	4 years ago
Michael Peter Christen	1cdb21592b	added hazelcast and some modifications to align legacy YaCy with YaCyGrid	4 years ago
Michael Christen	42ea2a1c6f	Merge pull request #405 from jfhs/jfhs/support-all-html-entities Improve HTML entities support	4 years ago
Michael Christen	b2af745dd6	Merge pull request #404 from lnceballosz/master NGI0 - Updating licensing aspects according REUSE	4 years ago
jfhs	10bddc2c2d	Decode HTML entities in all property values by default	4 years ago
jfhs	2135d259e3	Replace hardcoded html/xml entities with a file, support decoding all defined HTML entities	4 years ago
Michael Peter Christen	8f876a8c72	added concurrency to enhance indexing speed during json surrogate import	4 years ago
Michael Peter Christen	f8cbaeef93	Merge branch 'master' of git@github.com:yacy/yacy_search_server.git	4 years ago
Michael Peter Christen	a857e3d3d5	fix for json importer	4 years ago
sgaebel	1546232c94	adds ranking for multi document queries only	4 years ago
sgaebel	93b353d22d	does not boost or add fields for zero-row-queries (exists())	4 years ago
sgaebel	f16cd154f7	removes unused imports and variables	4 years ago
sgaebel	c69c462a15	replaces an expensive getLoadTimeURL() by exists(); refactors urlExists to getHarvestProcess as that is what it does	4 years ago
sgaebel	a5488ac8f5	uses edismax queries on query counts > 1 only	4 years ago
sgaebel	26223dc25a	replaces getLoadTime() by exists() with a simpler query; since solr-8.8.1 getLoadTime() causes a high cpu usage	4 years ago
sgaebel	8e4d014c06	removes useless SolrRequestInfo.clearRequestInfo(), avoids spamming the log	4 years ago
Lina Ceballos	a96752f5ab	adding SPDX license and copyright headers	4 years ago
Michael Peter Christen	e18d0ef544	trying to set a higher priority to the process that is involved in index export	4 years ago
Michael Peter Christen	8b4394a6c5	fixes for solr 8.8.1 migration - replace new guava 30 with older 25 because that is the correct dependency for solr 8.8.1. The newer one did actually not work! - index will be created in a DATA/INDEX/freeworld/SEGMENTS/solr_8_8_1 subfolder. The older solr_6_6 index is not touched but also not migrated. The index starts with fresh (empty) content. - Older indexes must be migrated by hand (export/import) so far until a better solution is found. - Large schema adaptations for lucene 8.8.1	4 years ago

8829 Commits (163ba26d90d874ba4ec35daa59ca5a7b09e206bd)