Commit Graph

  • 66c0ec4352
    Merge c6127e4f2e into 7e8a1ef0e2 #657 Cody Mikol 2024-10-20 00:59:22 +1100
  • 6d9c536726
    Merge 0fb77994aa into 7e8a1ef0e2 #661 ah9807073 2024-10-20 00:59:09 +1100
  • 7e8a1ef0e2 log file when OOM appears master Michael Peter Christen 2024-10-18 14:25:42 +0200
  • 3c88f87831 jetty upgrade Michael Peter Christen 2024-10-17 11:36:51 +0200
  • a8c64b1af2 added an artificial snippet [Synonym Match] in case there is only a match in the synonyms Michael Peter Christen 2024-09-21 21:56:12 +0200
  • f57df061da upgraded commons-io to 2.17.0 Michael Peter Christen 2024-09-21 21:56:12 +0200
  • c88c30a5c5 added an option to ViewFile to see all solr fields which contain texts Michael Peter Christen 2024-09-21 21:51:19 +0200
  • 0fb77994aa Add Dockerfile for YaCy deployment #661 Agent001 2024-09-20 11:51:47 +0000
  • 7b108dadf7 Update vercel.json for correct deployment Agent001 2024-09-20 11:40:05 +0000
  • 1cabc452b5 Add vercel.json for Vercel deployment Agent001 2024-09-20 10:51:41 +0000
  • c6127e4f2e feat(nix): add a nix flake #657 Cody Mikol 2024-08-30 20:41:09 -0400
  • 6db374fdcf upgraded metadata-extractor to 2.19.0 Michael Peter Christen 2024-08-26 23:49:27 +0200
  • 3944984840 added snippet extraction with synonym matching Michael Peter Christen 2024-08-26 23:44:42 +0200
  • d181b9e89b added deleted files from commit 254f12d60b which are still needed and had been linked outside of yacy/ui Michael Peter Christen 2024-07-24 15:57:51 +0200
  • 910a496c9f replaced http links with https Michael Peter Christen 2024-07-21 18:02:58 +0200
  • fd45ccf76e added sponsoring images Michael Peter Christen 2024-07-21 17:25:22 +0200
  • 833d720989 upgraded ppt parser by migration of org.apache.poi from 3.17 to 5.3.0. This also fixes the security warning https://github.com/yacy/yacy_search_server/security/dependabot/37 Michael Peter Christen 2024-07-21 15:28:13 +0200
  • 687820788d this assert does not work because of the 9_0_0 solr version format. A 9_0 is expected but it does not work this way with this version. Michael Peter Christen 2024-07-21 13:33:47 +0200
  • accf4e424b
    Merge pull request #649 from virginOne/master Michael Christen 2024-07-10 16:36:54 +0200
  • 2f5f3f8853
    Merge pull request #650 from zutto/master Michael Christen 2024-07-10 16:35:39 +0200
  • 326b5f6e6e
    Merge pull request #651 from okybaca/removeui Michael Christen 2024-07-10 16:34:07 +0200
  • 254f12d60b removed yacy/ui as obsolete #651 okybaca 2024-07-09 16:26:50 +0200
  • 5268ae2ce9 check the document protocol & host values before proceeding to form final url. #650 zutto 2024-06-29 10:11:58 +0300
  • 962aaec0c0 Improve the clarity of deep crawl feature UI text on AutoCrawler zutto 2024-06-29 09:37:05 +0300
  • d958d1c0c4 ensure that returned SolrDocument is not null zutto 2024-06-29 09:33:06 +0300
  • f3cc818305
    Merge branch 'yacy:master' into master #649 virginOne 2024-06-23 19:52:54 +0800
  • 66cf7d4ca5 disables autowarm of filtercache, corrects luceneMatchVersion sgaebel 2024-06-12 12:59:34 +0200
  • 89c07f0900 Fix the issue of not being able to import the JSON format export of Solr index due to the inconsistency in time format between the exported JSON format and the Solr time format. Virgo 2024-06-19 21:14:45 +0800
  • 70454654f3 by default open the https url for a given host, not the http url (http does almost not exist any more) Michael Peter Christen 2024-05-27 00:53:18 +0200
  • 71a6074cc5 added setting of cache configuration for solr according to recommendation from https://community.searchlab.eu/t/yacy-support-gpt-chatgpt-assistant/1622 However it is not clear if this configuration actually works (has an effect at all) or is the solution for performance issues. Michael Peter Christen 2024-05-26 12:59:59 +0200
  • b8479430b6 400 is too small Michael Peter Christen 2024-05-25 01:21:06 +0200
  • 5f4ea9ac5d reduced memory amount for network image; also reduced the number of memory allocations for image storage Michael Peter Christen 2024-05-25 01:10:23 +0200
  • fe4c0aa890 refactoring of RAG reverse proxy: extracted code for ollama code to their own classes Michael Peter Christen 2024-05-21 00:06:19 +0200
  • 8b65a4d14e fixed build clean Michael Peter Christen 2024-05-19 18:02:49 +0200
  • f1c70dce33 Merge branch 'master' of github.com:yacy/yacy_search_server Michael Peter Christen 2024-05-19 17:35:24 +0200
  • 8eb0d490aa migrated solr to 9.0. This is a major step because solr removed support for embedded solr instances in 9.0 and we want to keep it because we want to ship YaCy with an embedded solr. It was necessary to add parts of solr code into YaCy to make this migration possible. Furthermore, with Solr 9.1 they removed even more parts which are required for embedded operation, therefore we cannot migrate further yet without big changes. If you are running a YaCy instance with Solr 8.x, the migration should be done automatically. If not, you first need to migrate to YaCy version 1.93 with Solr 8.x to migrate to Solr 8 data. Michael Peter Christen 2024-05-19 17:34:57 +0200
  • b8417e5619 removed Mac specific code which is not working any more on recent Macs Michael Peter Christen 2024-05-19 17:29:16 +0200
  • 13fbff0bff Added a RAG Proxy for AI Chat with YaCy Michael Peter Christen 2024-05-19 17:19:09 +0200
  • 59c0cb0f30 fixed aarch64 dockerfile Michael Peter Christen 2024-05-13 02:10:24 +0200
  • 0405ec8ad2
    Merge pull request #642 from HeliosLHC/reduce_image_size Michael Christen 2024-05-12 23:50:24 +0200
  • c2ad1950e8 updated jetty to 9.4.54.v20240208 Michael Peter Christen 2024-05-10 15:41:20 +0200
  • b295e38969 fine-tuned the import process of jsonl files which had been missing to actually be able to make searches and browse the index with the host browser Michael Peter Christen 2024-05-10 12:13:44 +0200
  • 262b23532d reduce image size #642 HeliosLHC 2024-05-05 23:23:34 +0200
  • de941c6fee
    Merge pull request #633 from LillySchramm/fix/632/remove-sayat-link Michael Christen 2024-04-05 17:43:55 +0800
  • 160e346d4c fix: Remove SayAt.Me Link #633 EPS-DEV 2024-04-04 15:05:41 +0000
  • 656d47c1ac
    Merge pull request #628 from frankenstein91/vagrant Michael Christen 2024-02-27 14:18:24 +0100
  • 9a406d310c
    switch user #628 Frank Tornack 2024-02-04 19:36:48 +0100
  • 6864486196
    add Yacy vagrant Frank Tornack 2024-02-04 19:29:16 +0100
  • 331e0a24fc
    Merge pull request #621 from OFA54/patch-1 Michael Christen 2023-12-20 23:34:35 +0100
  • d825a85a01
    Merge pull request #619 from pr0vieh/initrecrawl Michael Christen 2023-12-09 14:47:13 +0100
  • 35620762ac bring defaults for recrawlindex to init config #619 pr0vieh 2023-12-09 01:32:31 +0100
  • d097a642c2
    Merge pull request #615 from okybaca/logging2 Michael Christen 2023-12-03 16:40:21 +0100
  • 6d5e9ff53f
    Merge pull request #616 from okybaca/logging3 Michael Christen 2023-12-03 16:39:29 +0100
  • d5d4e8fe3a
    Merge pull request #617 from pr0vieh/master Michael Christen 2023-12-03 16:38:46 +0100
  • dfb2b79609 Add setting for DHT receive loadprereq instead of hardcoded load < 2.0 #617 pr0vieh 2023-12-03 01:27:36 +0100
  • 5dee8dbcbd changed the log entry REJECTED to CRAWLER * REJECTED, loglevel fine #616 okybaca 2023-12-02 12:24:36 +0100
  • 4c603e23f0
    Merge pull request #610 from okybaca/cr-text Michael Christen 2023-11-27 12:17:05 +0100
  • 040cd8be6d
    Merge pull request #612 from okybaca/sitemap-fix Michael Christen 2023-11-27 12:16:43 +0100
  • 0233ecd481
    Merge pull request #614 from okybaca/logging Michael Christen 2023-11-27 12:15:34 +0100
  • 7831f294a9 changed regular peerping messages to level fine #615 okybaca 2023-11-27 08:12:03 +0100
  • 553c859703 logging: moved some log-cluttering DHT messages to level 'fine' okybaca 2023-11-27 07:51:42 +0100
  • 1c5fca9a58 changed network operation log category from YACY to NETWORK okybaca 2023-11-26 12:24:09 +0100
  • 2f44fc0257 added some logging prefixes to yacy.logging #614 okybaca 2023-11-25 18:39:08 +0100
  • 89c2a92cfb
    tr.lng #621 OFA 2023-11-18 01:03:28 +0300
  • 3d3bdb0f5f added zim importer rule for mdwiki Michael Peter Christen 2023-11-16 23:11:57 +0100
  • 4a611ac6a3 another possible fix for https://github.com/yacy/yacy_search_server/issues/500 Michael Peter Christen 2023-11-15 23:45:53 +0100
  • 9c59c6814b updated apache libs #612 okybaca 2023-11-15 10:22:00 +0100
  • d72cd7916c Merge branch 'master' of https://github.com/yacy/yacy_search_server sgaebel 2023-11-14 20:43:42 +0100
  • 0663ae3c99 adds synchronized dumplog sgaebel 2020-12-01 22:34:30 +0100
  • cba84632ee UI: added a more descriptive message, CitationRank instead of cr #610 okybaca 2023-11-14 00:05:23 +0100
  • cff0991d85 test if this is helpful for https://github.com/yacy/yacy_search_server/issues/500 Michael Peter Christen 2023-11-13 16:41:19 +0100
  • ceb07a5218 fixed problem with zim importer which crashed when non-valid urls appeared Michael Peter Christen 2023-11-13 11:12:10 +0100
  • 656b3e3e77 updated guava to latest and added missing library for failureaccess Michael Peter Christen 2023-11-13 10:59:49 +0100
  • 3268a93019 added a 'minified' option to YaCy dumps Michael Peter Christen 2023-11-13 10:27:50 +0100
  • c20c4b8a21 modified export: added maximum number of docs per chunk. The export file can now be many files, called chunks. By default still only one chunk is exported. This function is required in case the exported files shall be imported into an elasticsearch/opensearch index. The bulk import function of elasticsearch/opensearch is limited to 100MB. To make it possible to import YaCy files, those must be split into chunks. Right now we cannot estimate the chunk size in bytes, only as a number of documents. The user must experiment to find the optimum maximum chunk size, like 50000 docs per chunk. Try this as a first attempt. Michael Peter Christen 2023-11-12 22:11:55 +0100
  • 655d8db802 detailed directions in index export to explain how the export can be imported again using elasticsearch/opensearch Michael Peter Christen 2023-11-12 15:26:18 +0100
  • 24011dcbcc more file name extensions for json list surrogate files Michael Peter Christen 2023-11-06 22:44:18 +0100
  • 34a9fc1a07 bugfixes to zim reader: Michael Peter Christen 2023-11-05 12:46:37 +0100
  • 7db0534d8a Added a zim parser to the surrogate import option. You can now import zim files into YaCy by simply moving them to the DATA/SURROGATE/IN folder. They will be fetched and after parsing moved to DATA/SURROGATE/OUT. There are exceptions where the parser is not able to identify the original URL of the documents in the zim file. In that case the file is simply ignored. This commit also carries an important fix to the pdf parser and an increase of the maximum parsing speed to 60000 PPM which should make it possible to index up to 1000 files in one second. Michael Peter Christen 2023-11-05 02:16:40 +0100
  • 70e29937ef added a check in zim importer which tests if import URLs actually exist Michael Peter Christen 2023-11-04 19:07:50 +0100
  • 496f768c44 modified cache strategy for zim clusters Michael Peter Christen 2023-11-03 18:20:10 +0100
  • fdc6311dc7 added parsing rules for wikibooks and wikinews in zim reader Michael Peter Christen 2023-11-02 00:27:24 +0100
  • 2ea54b3503 fixed blob iterator in zim cluster definition Michael Peter Christen 2023-11-01 23:43:27 +0100
  • 54fa5d3c2e added a cluster cache but it requires more testing Michael Peter Christen 2023-11-01 19:52:44 +0100
  • 53b01dbf2e Merge branch 'master' of https://github.com/yacy/yacy_search_server.git Michael Peter Christen 2023-11-01 18:57:04 +0100
  • 41856e9f34 added an optimized zim file entry iterator Michael Peter Christen 2023-11-01 18:50:28 +0100
  • 1c0df28bfb added a zim importer that can be used for surrogate imports. Can not be used yet because it requires some security additions to verify that the given urls actually work. Michael Peter Christen 2023-11-01 18:48:40 +0100
  • b9912ff50d repaired dockerfiles for aarch64 and armv7 Michael Peter Christen 2023-10-29 22:09:24 +0000
  • 33b6878ded Merge branch 'master' of https://github.com/yacy/yacy_search_server.git Michael Peter Christen 2023-10-29 14:58:47 +0100
  • 68554cea07
    Merge pull request #605 from okybaca/readme-docker-link Michael Christen 2023-10-29 14:56:26 +0100
  • 06bfd5802f
    Merge pull request #603 from okybaca/dark-green-css Michael Christen 2023-10-29 14:55:58 +0100
  • 43d5cd101e
    Merge pull request #607 from okybaca/wikilinks Michael Christen 2023-10-29 14:55:26 +0100
  • 4add1f6bc7 replaced all the links to legacy legacy wiki to legacy wiki #607 okybaca 2023-10-29 13:12:24 +0100
  • e2c86a8eba added a ZIM cluster pointer cache Michael Peter Christen 2023-10-29 12:49:08 +0100
  • 4a54b24703 fix for "negative seek offset" error during extension of heap files. This would always have happened when a heap file exceeds 2GB. Should fix https://github.com/yacy/yacy_search_server/issues/372 Michael Peter Christen 2023-10-29 09:32:21 +0100
  • 69db75ce45 added a link to docker build guide #605 okybaca 2023-10-29 02:35:57 +0100
  • 9c8fb97985 introduced url list and title list caching and enhanced input stream performance in ZIM reader Michael Peter Christen 2023-10-29 00:43:12 +0200
  • b0ae660790 added Zstandard compressed data decompression for ZIM files type 5 also: more generalization and performance enhancements Michael Peter Christen 2023-10-28 12:24:29 +0200
  • ad8ee3a0b6 fixed typo in class name Michael Peter Christen 2023-10-28 08:57:42 +0200
  • c4082c4ff2 refactoring of ZIM reader, simplification, removed unnecessary code Michael Peter Christen 2023-10-28 08:56:58 +0200
  • c2b6b6e7b9 Fixed a large number of problems in the ZIM reader. This library was not prepared for large data because it was missing long data types for pointers. I had to modify the code-base in a fundamental way: - Proof-Reading, - unclustering, - refactoring, - naming adoption to https://wiki.openzim.org/wiki/ZIM_file_format, - change of Exception handling, - extension to more attributes as defined in spec (bugfix for mime type loading) - bugfix to long parsing (prevented reading of large files) The code is furthermore very inefficient and requires more attention. However the format is very useful for YaCy as there are numerous data sources for ZIM-Files. Michael Peter Christen 2023-10-27 15:49:23 +0200
  • 5ba5fb5d23 upgraded pdfbox to 3.0.0 Michael Peter Christen 2023-10-27 12:05:24 +0200
  • c10944bd4a updated bcmail-jdk15on 1.75 to bcmail-jdk18on 1.67 Michael Peter Christen 2023-10-27 11:08:19 +0200
  • 1fefae9baf integrated the source code of a openzim file format reader. These are the raw format reader files with no integration in YaCy yet, which will maybe follow as a next step. The zim file format is documented in https://openzim.org and the reader code was taken from the archived, non-maintained repository at https://github.com/openzim/zimreader-java Michael Peter Christen 2023-10-27 10:59:06 +0200
  • ec2d14e973 fine tuning the dark-green color scheme #603 okybaca 2023-10-26 12:35:22 +0200
  • 4308aa5415 removed concept of empty passwords as "no passwords used", because we now start YaCy with a default password (yacy). This has impact of all function that check the current state of password-protection that included the empty password situation, including the warnings to set a password in case that none is set (which cannot be the case any more). Michael Peter Christen 2023-10-25 22:56:06 +0200
  • 2c60ff14bb fixed default pw comparison Michael Peter Christen 2023-10-25 13:59:02 +0200
  • 4da320bebf added a warning message in ConfigBasic in case that the default password was not changed. Michael Peter Christen 2023-10-24 23:36:26 +0200
  • 7830268be1 fix 756c817b5a must be applied to all code where a transaction token is generated. Michael Peter Christen 2023-10-21 13:00:49 +0200
  • dc6f218520 set the default password for the admin account to "yacy" Michael Peter Christen 2023-10-21 12:09:19 +0200
  • 756c817b5a fix for https://github.com/yacy/yacy_search_server/issues/544 Michael Peter Christen 2023-10-21 11:45:26 +0200
  • bab1cfc7ea
    added required build tools installation Michael Christen 2023-10-20 16:09:47 +0200
  • 03bf259601 fix for https://github.com/yacy/yacy_search_server/issues/363 We still need to set the load in the process because a demand for higher crawl speed may require to increase the maximum load limit. However, following the criticism in the bug, we do never reduce the load limit again. Michael Peter Christen 2023-10-16 18:26:47 +0200
  • 5bc09af426
    Merge pull request #600 from okybaca/scheduler-sort Michael Christen 2023-10-16 13:00:24 +0200
  • 4c1eb34e85 modified link to Process Scheduler in left menu #600 okybaca 2023-10-10 08:30:04 +0200
  • aeb4c7a660 removed warnings during normal build Michael Peter Christen 2023-10-04 22:00:30 +0200
  • 095a444aa7 removed wiki links and added more shields badges Michael Peter Christen 2023-09-30 18:16:38 +0200
  • ca2a21008a added screenshots Michael Peter Christen 2023-09-30 13:07:18 +0200
  • 961d3cc8af
    Merge pull request #597 from joestr/issue/574-fix-mac-script Michael Christen 2023-09-28 21:10:49 +0200
  • a035b21f63
    Merge pull request #598 from joestr/improvement/remove-travis-yml Michael Christen 2023-09-28 21:10:04 +0200