540 Commits (43d5cd101ef7790a993f0810e20db38a559019f8)

Author SHA1 Message Date
Michael Peter Christen 4308aa5415 removed concept of empty passwords as "no passwords used",
1 year ago
Michael Peter Christen 9fcd8f1bda added canonical filter
2 years ago
Michael Christen 4304e07e6f crawl profile adoption to new tag valency attribute
2 years ago
Michael Peter Christen 309adb814e fixed import of jsonlist import from searchlab.eu using a direct URL
2 years ago
Michael Peter Christen 62d177bf59 stub for jsonlist index importer web page
2 years ago
Michael Peter Christen efa0425f00 refactoring: moved jsonlist importer to importer class
2 years ago
Michael Peter Christen 49daa32a88 yacy can now read searchlab export dump files
2 years ago
Michael Peter Christen 60c9986a0e new release file names with date and git hash
2 years ago
Michael Peter Christen 9c1bc533fa removed hazelcast because it is phoning home, see also:
2 years ago
Michael Peter Christen fc98ca7a9c removed ContentControl servlet and functionality
2 years ago
Michael Peter Christen 3d138d3fdd catch error when initializing hazelcast
2 years ago
reger24 6a1e259fd0 Fix NPE in Switchboard.getURL https://github.com/yacy/yacy_search_server/issues/441
3 years ago
Michael Peter Christen bd3f2483a1 replaced url and date retrieval by only url retrieval
3 years ago
Michael Peter Christen be0aebad84 fixes https://github.com/yacy/yacy_search_server/issues/424
3 years ago
Michael Peter Christen 63ad8ce6b2 removed ymarks
3 years ago
Michael Peter Christen ef5a71a592 enhanced crawl start response time
3 years ago
Michael Peter Christen e81b770f79 enabled crawl starts with very large sets of start urls
3 years ago
Michael Peter Christen 1cdb21592b added hazelcast and some modifications to align legacy YaCy with
4 years ago
Michael Peter Christen 8f876a8c72 added concurrency to enhance indexing speed during json surrogate import
4 years ago
Michael Peter Christen f8cbaeef93 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
4 years ago
Michael Peter Christen a857e3d3d5 fix for json importer
4 years ago
sgaebel c69c462a15 replaces an expensive getLoadTimeURL() by exists()
4 years ago
sgaebel 26223dc25a replaces getLoadTime() by exists() with a simpler query
4 years ago
Michael Peter Christen 13a2e6dc6e Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
4 years ago
Michael Peter Christen 0ae8ccf657 Make it possible to set an empty password disabling the authentication
4 years ago
Michael Peter Christen 96592a10cf added option to set yacy configuration values using environment
4 years ago
Michael Peter Christen 198826c362 added network scanner process to discover all YaCy peers in the intranet
4 years ago
Michael Peter Christen 907f121d0c do not overwrite PW with random PW
4 years ago
Michael Peter Christen 3e6a1e0a49 fixed surrogate process counter
4 years ago
Michael Peter Christen baad56d83d beautified default peer names
4 years ago
Michael Peter Christen 6271e9122c javadoc fix
4 years ago
Michael Peter Christen 52228cb6be added a gc to cleanup process (once every 10 minutes)
4 years ago
Michael Peter Christen 22841ffbf1 creating a threaddump during every cleanup process
4 years ago
sgaebel 4a495df63a removes some deprecation-warnings
4 years ago
sgaebel dd9d4b1188 replace org.junit.Assert.assertThat by
4 years ago
Michael Peter Christen e0ad8ca9da replaced json library from JSON.org with libandroid-json-java
5 years ago
luccioman 6b45cd5799 New optional crawl filter on the URL a doc must match to crawl its links
6 years ago
luccioman a5771b1f14 Made SNI extension user configurable without the need for server restart
6 years ago
luccioman e90405b6f0 Support parsing audio URLs without file extension
6 years ago
luccioman 08ea0b0397 Added a configurable timeout to wkhtmltopdf calls for pdf snapshots
6 years ago
luccioman fcf6b16db4 Added new crawler attribute for finer control over Media Type detection
6 years ago
luccioman 54fbe166ba Updated pdf cache clear steps consistently with current pdfbox version
6 years ago
luccioman bdafb14336 Removed redundant synchronization lock on network switch function
6 years ago
luccioman dcad393fe5 Fixed exceeding max size of failreason_s Solr field on large link list
6 years ago
luccioman 2bdd71de60 Added server side columns sorting on the Process Scheduler table
6 years ago
luccioman cced94298a Added a new crawler document filter type using Solr syntax
7 years ago
luccioman b5dc1f376f Made outgoing pools max total connections user configurable
7 years ago
luccioman ee6670fb8f Use a common pooled http connection manager for remote solr instances
7 years ago
luccioman fa4399d5d2 Small perf improvement : initialize threads names early when possible
7 years ago
luccioman a3ec7a7a5f Added analysis optional setting to compute statistics on text snippets
7 years ago