Merge pull request #664 from okybaca/log-kelo
Michael Christen
2024-11-25 12:40:34 +0100
54fa724352reverted deprecation change since we are still using java 11, not 19
Michael Peter Christen
2024-11-25 12:34:05 +0100
6ef3a0fca5code maintenance - removed warnings and replaced deprecated functions
Michael Peter Christen
2024-11-25 12:29:11 +0100
91c753ab96renamed INDEX-TRANSFER-DISPATCHER to DHT-OUT in the log
#664
okybaca
2024-11-25 01:23:57 +0100
feca150672Automatically adjust crawling load limit to the local machine's CPU cores. The settings in the default configuration file are historic. Many machines have many more CPU cores today, so auto-scaling to the hardware is better.
Michael Peter Christen
2024-11-25 00:30:36 +0100
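Note on the commit above: a minimal Java sketch of scaling a load limit to the number of CPU cores. The class LoadLimitSketch, the method scaledLoadLimit and the per-core factor are hypothetical and not taken from the YaCy code base; only Runtime.getRuntime().availableProcessors() is a standard JDK call.

```java
class LoadLimitSketch {
    // Hypothetical helper: never go below the configured (historic) limit,
    // but raise it when the machine has enough cores.
    static float scaledLoadLimit(float configuredLimit, float perCoreLimit, int cores) {
        return Math.max(configuredLimit, perCoreLimit * cores);
    }

    public static void main(String[] args) {
        // Number of logical processors available to the JVM.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(scaledLoadLimit(4.0f, 1.0f, cores));
    }
}
```

Taking the maximum keeps the old default on small machines, which is one possible design choice and an assumption here.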
16e031caebadded hint in README to install Ivy from archived updatesite.
Michael Peter Christen
2024-11-24 23:59:27 +0100
7e8a1ef0e2log file when OOM appears
Michael Peter Christen
2024-10-18 14:25:42 +0200
3c88f87831jetty upgrade
Michael Peter Christen
2024-10-17 11:36:51 +0200
a8c64b1af2added an artificial snippet [Synonym Match] in case there is only a match in the synonyms
Michael Peter Christen
2024-09-21 21:56:12 +0200
f57df061daupgraded commons-io to 2.17.0
Michael Peter Christen
2024-09-21 21:56:12 +0200
c88c30a5c5added an option to ViewFile to see all solr fields which contain texts
Michael Peter Christen
2024-09-21 21:51:19 +0200
0fb77994aaAdd Dockerfile for YaCy deployment
#661
Agent001
2024-09-20 11:51:47 +0000
7b108dadf7Update vercel.json for correct deployment
Agent001
2024-09-20 11:40:05 +0000
1cabc452b5Add vercel.json for Vercel deployment
Agent001
2024-09-20 10:51:41 +0000
6db374fdcfupgraded metadata-extractor to 2.19.0
Michael Peter Christen
2024-08-26 23:49:27 +0200
3944984840added snippet extraction with synonym matching
Michael Peter Christen
2024-08-26 23:44:42 +0200
d181b9e89badded deleted files from commit 254f12d60b which are still needed and had been linked outside of yacy/ui
Michael Peter Christen
2024-07-24 15:57:51 +0200
910a496c9freplaced http links with https
Michael Peter Christen
2024-07-21 18:02:58 +0200
fd45ccf76eadded sponsoring images
Michael Peter Christen
2024-07-21 17:25:22 +0200
687820788dthis assert does not work because of the 9_0_0 solr version format. A 9_0 is expected, but it does not work this way with this version.
Michael Peter Christen
2024-07-21 13:33:47 +0200
Merge branch 'yacy:master' into master
#649
virginOne
2024-06-23 19:52:54 +0800
66cf7d4ca5disables autowarm of filtercache, corrects luceneMatchVersion
sgaebel
2024-06-12 12:59:34 +0200
89c07f0900Fix import of Solr index exports in JSON format, which failed because the time format in the exported JSON was inconsistent with the Solr time format.
Virgo
2024-06-19 21:14:45 +0800
70454654f3by default open the https url for a given host, not the http url (http hardly exists any more)
Michael Peter Christen
2024-05-27 00:53:18 +0200
71a6074cc5added setting of cache configuration for solr according to recommendation from https://community.searchlab.eu/t/yacy-support-gpt-chatgpt-assistant/1622. However, it is not clear whether this configuration has any effect at all or whether it solves the performance issues.
Michael Peter Christen
2024-05-26 12:59:59 +0200
b8479430b6400 is too small
Michael Peter Christen
2024-05-25 01:21:06 +0200
5f4ea9ac5dreduced memory amount for network image; also reduced the number of memory allocations for image storage
Michael Peter Christen
2024-05-25 01:10:23 +0200
fe4c0aa890refactoring of RAG reverse proxy: extracted ollama code into its own classes
Michael Peter Christen
2024-05-21 00:06:19 +0200
8b65a4d14efixed build clean
Michael Peter Christen
2024-05-19 18:02:49 +0200
f1c70dce33Merge branch 'master' of github.com:yacy/yacy_search_server
Michael Peter Christen
2024-05-19 17:35:24 +0200
8eb0d490aamigrated solr to 9.0. This is a major step because solr removed support for embedded solr instances in 9.0 and we want to keep it because we want to ship YaCy with an embedded solr. It was necessary to add parts of solr code into YaCy to make this migration possible. Furthermore, Solr 9.1 removed even more parts which are required for embedded operation, therefore we cannot migrate further yet without big changes. If you are running a YaCy instance with Solr 8.x, the migration should be done automatically. If not, you first need to migrate to YaCy version 1.93 with Solr 8.x so that your data is migrated to Solr 8.
Michael Peter Christen
2024-05-19 17:34:57 +0200
b8417e5619removed Mac specific code which is not working any more on recent Macs
Michael Peter Christen
2024-05-19 17:29:16 +0200
13fbff0bffAdded a RAG Proxy for AI Chat with YaCy
Michael Peter Christen
2024-05-19 17:19:09 +0200
59c0cb0f30fixed aarch64 dockerfile
Michael Peter Christen
2024-05-13 02:10:24 +0200
Merge pull request #642 from HeliosLHC/reduce_image_size
Michael Christen
2024-05-12 23:50:24 +0200
c2ad1950e8updated jetty to 9.4.54.v20240208
Michael Peter Christen
2024-05-10 15:41:20 +0200
b295e38969fine-tuned the import process of jsonl files, which was missing parts needed to actually make searches and browse the index with the host browser
Michael Peter Christen
2024-05-10 12:13:44 +0200
ceb07a5218fixed problem with zim importer which crashed when non-valid urls appeared
Michael Peter Christen
2023-11-13 11:12:10 +0100
656b3e3e77updated guava to latest and added missing library for failureaccess
Michael Peter Christen
2023-11-13 10:59:49 +0100
3268a93019added a 'minified' option to YaCy dumps
Michael Peter Christen
2023-11-13 10:27:50 +0100
c20c4b8a21modified export: added maximum number of docs per chunk. The export can now consist of many files, called chunks. By default still only one chunk is exported. This function is required in case the exported files shall be imported into an elasticsearch/opensearch index. The bulk import function of elasticsearch/opensearch is limited to 100MB. To make it possible to import YaCy files, they must be split into chunks. Right now we cannot estimate the chunk size in bytes, only as a number of documents. The user must experiment to find the optimum maximum chunk size, like 50000 docs per chunk. Try this as a first attempt.
Michael Peter Christen
2023-11-12 22:11:55 +0100
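Note on the commit above: a minimal Java sketch of splitting an export into chunks by document count, as the message describes. The class ChunkSketch and the method chunk are hypothetical illustrations, not the YaCy export code; real export code would presumably stream documents to files instead of holding lists in memory.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkSketch {
    // Splits docs into consecutive chunks of at most maxDocsPerChunk entries.
    static <T> List<List<T>> chunk(List<T> docs, int maxDocsPerChunk) {
        if (maxDocsPerChunk <= 0) {
            throw new IllegalArgumentException("maxDocsPerChunk must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += maxDocsPerChunk) {
            int end = Math.min(i + maxDocsPerChunk, docs.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(docs.subList(i, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<List<Integer>> chunks = chunk(List.of(1, 2, 3, 4, 5), 2);
        System.out.println(chunks.size() + " chunks, last has " + chunks.get(chunks.size() - 1).size() + " doc(s)");
    }
}
```

Because the limit counts documents and not bytes, a chunk can still exceed a byte limit when documents are large, which is why the message suggests experimenting with the chunk size.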
655d8db802detailed directions in index export to explain how the export can be imported again using elasticsearch/opensearch
Michael Peter Christen
2023-11-12 15:26:18 +0100
24011dcbccmore file name extensions for json list surrogate files
Michael Peter Christen
2023-11-06 22:44:18 +0100
34a9fc1a07bugfixes to zim reader:
Michael Peter Christen
2023-11-05 12:46:37 +0100
7db0534d8aAdded a zim parser to the surrogate import option. You can now import zim files into YaCy by simply moving them to the DATA/SURROGATE/IN folder. They will be fetched and after parsing moved to DATA/SURROGATE/OUT. There are exceptions where the parser is not able to identify the original URL of the documents in the zim file. In that case the file is simply ignored. This commit also carries an important fix to the pdf parser and an increase of the maximum parsing speed to 60000 PPM which should make it possible to index up to 1000 files in one second.
Michael Peter Christen
2023-11-05 02:16:40 +0100
70e29937efadded a check in zim importer which tests if import URLs actually exist
Michael Peter Christen
2023-11-04 19:07:50 +0100
496f768c44modified cache strategy for zim clusters
Michael Peter Christen
2023-11-03 18:20:10 +0100
fdc6311dc7added parsing rules for wikibooks and wikinews in zim reader
Michael Peter Christen
2023-11-02 00:27:24 +0100
2ea54b3503fixed blob iterator in zim cluster definition
Michael Peter Christen
2023-11-01 23:43:27 +0100
54fa5d3c2eadded a cluster cache but it requires more testing
Michael Peter Christen
2023-11-01 19:52:44 +0100
41856e9f34added an optimized zim file entry iterator
Michael Peter Christen
2023-11-01 18:50:28 +0100
1c0df28bfbadded a zim importer that can be used for surrogate imports. It cannot be used yet because it requires some security additions to verify that the given urls actually work.
Michael Peter Christen
2023-11-01 18:48:40 +0100
b9912ff50drepaired dockerfiles for aarch64 and armv7
Michael Peter Christen
2023-10-29 22:09:24 +0000
Merge pull request #607 from okybaca/wikilinks
Michael Christen
2023-10-29 14:55:26 +0100
4add1f6bc7replaced all the links to legacy legacy wiki to legacy wiki
#607
okybaca
2023-10-29 13:12:24 +0100
e2c86a8ebaadded a ZIM cluster pointer cache
Michael Peter Christen
2023-10-29 12:49:08 +0100
4a54b24703fix for "negative seek offset" error during extension of heap files. This would always have happened when a heap file exceeded 2GB. Should fix https://github.com/yacy/yacy_search_server/issues/372
Michael Peter Christen
2023-10-29 09:32:21 +0100
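Note on the commit above: the message does not say what caused the error. One common way a negative file offset appears once a file passes 2GB is 32-bit int arithmetic that overflows before the result is widened to long. The sketch below illustrates that general pitfall under this assumption; the class SeekOffsetSketch and its methods are hypothetical and are not the YaCy fix. java.io.RandomAccessFile.seek throws an IOException for positions below 0.

```java
class SeekOffsetSketch {
    // Buggy variant: the multiplication is done in int and can wrap
    // to a negative value before it is widened to long.
    static long offsetWithIntMath(int recordIndex, int recordSize) {
        return recordIndex * recordSize;
    }

    // Safe variant: widen to long first, then multiply.
    static long offsetWithLongMath(int recordIndex, int recordSize) {
        return (long) recordIndex * recordSize;
    }

    public static void main(String[] args) {
        // 3,000,000 records of 1024 bytes are about 2.86 GiB, beyond Integer.MAX_VALUE.
        System.out.println(offsetWithIntMath(3_000_000, 1024));
        System.out.println(offsetWithLongMath(3_000_000, 1024));
    }
}
```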
69db75ce45added a link to docker build guide
#605
okybaca
2023-10-29 02:35:57 +0100
9c8fb97985introduced url list and title list caching and enhanced input stream performance in ZIM reader
Michael Peter Christen
2023-10-29 00:43:12 +0200
b0ae660790added decompression of Zstandard-compressed data for ZIM files type 5. Also: more generalization and performance enhancements
Michael Peter Christen
2023-10-28 12:24:29 +0200
ad8ee3a0b6fixed typo in class name
Michael Peter Christen
2023-10-28 08:57:42 +0200
c4082c4ff2refactoring of ZIM reader, simplification, removed unnecessary code
Michael Peter Christen
2023-10-28 08:56:58 +0200
c2b6b6e7b9Fixed a large number of problems in the ZIM reader. This library was not prepared for large data because it was missing long data types for pointers. I had to modify the code-base in a fundamental way:
- Proof-Reading,
- unclustering,
- refactoring,
- naming adaptation to https://wiki.openzim.org/wiki/ZIM_file_format,
- change of Exception handling,
- extension to more attributes as defined in spec (bugfix for mime type loading)
- bugfix to long parsing (prevented reading of large files)
The code is furthermore very inefficient and requires more attention. However the format is very useful for YaCy as there are numerous data sources for ZIM-Files.
Michael Peter Christen
2023-10-27 15:49:23 +0200