yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Peter Christen	5f4ea9ac5d	reduced memory amount for network image reduced also the number of memory allocation for image storage	6 months ago
Michael Peter Christen	fe4c0aa890	refactoring of RAG reverse proxy: extracted code for ollama code to their own classes	6 months ago
Michael Peter Christen	f1c70dce33	Merge branch 'master' of github.com:yacy/yacy_search_server	6 months ago
Michael Peter Christen	8eb0d490aa	migrated solr to 9.0 This is a major step because solr removed support for embedded solr instances in 9.0 and we want to keep it because we want to ship YaCy with an embedded solr. It was necessary to add parts of solr code into YaCy to make this migration possible. Further on with Solr 9.1 they removed even more parts which are required for embedded operation, therefore we cannot migrate yet further without big changes. If you are running a YaCy instance with Solr 8.x, the migration should be done automatically. If not you require to first migrate to a YaCy version 1.93 with Solr 8.x to migrate to Solr 8 data.	6 months ago
Michael Peter Christen	b8417e5619	removed Mac specific code which is not working any more on recent Macs	6 months ago
Michael Peter Christen	13fbff0bff	Added a RAG Proxy for AI Chat with YaCy RAG (Retrieval Augmented Generation) is a method to combine a search engine with a LLM (Large Language Model). When a new prompt is submitted, a search engine injects knowledge from a search into the content. This is done using a reverse proxy between the Chat Client and the LLM. In this case, we used the following software: LLM Backend - Ollama: https://github.com/ollama/ollama Install ollama and then load two required LLM models with the following commands: ollama pull phi3:3.8b ollama pull llama3:8b Chat Client - susi_chat: https://github.com/susiai/susi_chat just clone the repository and the open the file susi_chat/chat_terminal/index.html in your browser. This displays a chat terminal. In this terminal, run the following command: host http://localhost:8090 This sets the LLM backend to your YaCy peer. Then start YaCy. It will provide the LLM endpoint to the client while using ollama in the backend. It then injects search results only from the local Solr index, not from the p2p network (so far).	6 months ago
Michael Peter Christen	b295e38969	fine-tuned the import process of jsonl files which had been missing to actually be able to make searches and browse the index with the host browser	6 months ago
Michael Christen	d097a642c2	Merge pull request #615 from okybaca/logging2 Logging unclutter	12 months ago
Michael Christen	6d5e9ff53f	Merge pull request #616 from okybaca/logging3 changed the log entry REJECTED to CRAWLER * REJECTED, loglevel fine	12 months ago
pr0vieh	dfb2b79609	Add setting for DHT receive loadprereq insted of hardcoded load < 2.0	12 months ago
okybaca	5dee8dbcbd	changed the log entry REJECTED to CRAWLER * REJECTED, loglevel fine	12 months ago
Michael Christen	4c603e23f0	Merge pull request #610 from okybaca/cr-text UI: added a more descriptive message, CitationRank instead of cr	12 months ago
okybaca	7831f294a9	changed regular peerping messages to level fine	12 months ago
okybaca	553c859703	logging: moved some log-cluttering DHT messages to level 'fine'	12 months ago
okybaca	1c5fca9a58	changed network operation log category from YACY to NETWORK	12 months ago
Michael Peter Christen	3d3bdb0f5f	added zim importer rule for mdwiki	1 year ago
Michael Peter Christen	4a611ac6a3	another possible fix for https://github.com/yacy/yacy_search_server/issues/500	1 year ago
sgaebel	d72cd7916c	Merge branch 'master' of https://github.com/yacy/yacy_search_server	1 year ago
sgaebel	0663ae3c99	adds synchornized dumplog	1 year ago
okybaca	cba84632ee	UI: added a more descriptive message, CitationRank instead of cr	1 year ago
Michael Peter Christen	cff0991d85	test if this is helpful for https://github.com/yacy/yacy_search_server/issues/500	1 year ago
Michael Peter Christen	ceb07a5218	fixed problem with zim importer which crashed when non-valid urls appeared	1 year ago
Michael Peter Christen	3268a93019	added a 'minified' option to YaCy dumps	1 year ago
Michael Peter Christen	c20c4b8a21	modified export: added maximum number of docs per chunk The export file can now be many files, called chunks. By default still only one chunk is exported. This function is required in case that the exported files shall be imported to an elasticsearch/opensearch index. The bulk import function of elasticsearch/opensearch is limited to 100MB. To make it possible to import YaCy files, those must be splitted into chunks. Right now we cannot estimate the chunk size as bytes, only as number of documents. The user must do experiments to find out the optimum chunk max size, like 50000 docs per chunk. Try this as first attempt.	1 year ago
Michael Peter Christen	24011dcbcc	more file name extensions for json list surrogate files	1 year ago
Michael Peter Christen	34a9fc1a07	bugfixes to zim reader:	1 year ago
Michael Peter Christen	7db0534d8a	Added a zim parser to the surrogate import option. You can now import zim files into YaCy by simply moving them to the DATA/SURROGATE/IN folder. They will be fetched and after parsing moved to DATA/SURROGATE/OUT. There are exceptions where the parser is not able to identify the original URL of the documents in the zim file. In that case the file is simply ignored. This commit also carries an important fix to the pdf parser and an increase of the maximum parsing speed to 60000 PPM which should make it possible to index up to 1000 files in one second.	1 year ago
Michael Peter Christen	70e29937ef	added a check in zim importer which tests if import URLs actually exist	1 year ago
Michael Peter Christen	496f768c44	modified cache strategy for zim clusters	1 year ago
Michael Peter Christen	fdc6311dc7	added parsing rules for wikibooks and wikinews in zim reader	1 year ago
Michael Peter Christen	2ea54b3503	fixed blob iterator in zim cluster definition	1 year ago
Michael Peter Christen	54fa5d3c2e	added a cluster cache but it requires more testing	1 year ago
Michael Peter Christen	53b01dbf2e	Merge branch 'master' of https://github.com/yacy/yacy_search_server.git	1 year ago
Michael Peter Christen	41856e9f34	added an optimized zim file entry iterator	1 year ago
Michael Peter Christen	1c0df28bfb	added a zim importer that can be used for surrogate imports. Can not be used yet because it requires some security additions to verify that the given urls actually work.	1 year ago
Michael Peter Christen	33b6878ded	Merge branch 'master' of https://github.com/yacy/yacy_search_server.git	1 year ago
okybaca	4add1f6bc7	replaced all the links to legacy legacy wiki to legacy wiki	1 year ago
Michael Peter Christen	e2c86a8eba	added a ZIM cluster pointer cache	1 year ago
Michael Peter Christen	4a54b24703	fix for "negative seek offset" error during extension of heap files. This would have always happend when a heap file exceeds 2GB. should fix https://github.com/yacy/yacy_search_server/issues/372	1 year ago
Michael Peter Christen	9c8fb97985	introduced url list and title list caching and enhanced input stream performance in ZIM reader	1 year ago
Michael Peter Christen	b0ae660790	added Zstandard compressed data decompression for ZIM files type 5 also: more generalization and performance enhancements	1 year ago
Michael Peter Christen	ad8ee3a0b6	fixed typo in class name	1 year ago
Michael Peter Christen	c4082c4ff2	refactoring of ZIM reader, simplification, removed unnecessary code	1 year ago
Michael Peter Christen	c2b6b6e7b9	Fixed a large number of problems in the ZIM reader. This library was not prepared for large data because it was missing long data types for pointers. I had to modify the code-base in a fundamental way: - Proof-Reading, - unclustering, - refactoring, - naming adoption to https://wiki.openzim.org/wiki/ZIM_file_format, - change of Exception handling, - extension to more attributes as defined in spec (bugfix for mime type loading) - bugfix to long parsing (prevented reading of large files) The code is furthermore very inefficient and requires more attention. However the format is very useful for YaCy as there are numerous data sources for ZIM-Files.	1 year ago
Michael Peter Christen	5ba5fb5d23	upgraded pdfbox to 3.0.0	1 year ago
Michael Peter Christen	1fefae9baf	integrated the source code of a openzim file format reader. These are the raw format reader files with no integration in YaCy yet, which will maybe follow as a next step. The zim file format is documented in https://openzim.org and the reader code was taken from the archived, non-maintained repository at https://github.com/openzim/zimreader-java	1 year ago
Michael Peter Christen	4308aa5415	removed concept of empty passwords as "no passwords used", because we now start YaCy with a default password (yacy). This has impact of all function that check the current state of password-protection that included the empty password situation, including the warnings to set a password in case that none is set (which cannot be the case any more).	1 year ago
Michael Peter Christen	2c60ff14bb	fixed default pw comparison	1 year ago
Michael Peter Christen	4da320bebf	added a warning message in ConfigBasic in case that the default password was not changed.	1 year ago
Michael Peter Christen	7830268be1	fix `756c817b5a` must be applied to all code where a transaction token is generated.	1 year ago

1 2 3 4 5 ...

8963 Commits (5f4ea9ac5dd50dadd6523909c458be5452b2ff55)