Commit Graph

4502 Commits (d097a642c290c484a7bf5455805f1fe3e623ae67)

Author SHA1 Message Date
jfhs 2135d259e3 Replace hardcoded html/xml entities with a file, support decoding all defined HTML entities
4 years ago
Michael Peter Christen 8f876a8c72 added concurrency to enhance indexing speed during json surrogate import
4 years ago
Michael Peter Christen f8cbaeef93 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
4 years ago
Michael Peter Christen a857e3d3d5 fix for json importer
4 years ago
sgaebel 1546232c94 adds ranking for multi document queries only
4 years ago
sgaebel 93b353d22d does not boost or add fields for zero-row-queries (exists())
4 years ago
sgaebel f16cd154f7 removes unused imports and variables
4 years ago
sgaebel c69c462a15 replaces an expensive getLoadTimeURL() by exists()
4 years ago
sgaebel a5488ac8f5 uses edismax queries on query counts > 1 only
4 years ago
sgaebel 26223dc25a replaces getLoadTime() by exists() with a simpler query
4 years ago
sgaebel 8e4d014c06 removes useless SolrRequestInfo.clearRequestInfo(), avoids spamming the
4 years ago
Lina Ceballos a96752f5ab adding SPDX license and copyright headers
4 years ago
Michael Peter Christen e18d0ef544 trying to set a higher priority to the process that is involved in index
4 years ago
Michael Peter Christen 8b4394a6c5 fixes for solr 8.8.1 migration
4 years ago
Michael Peter Christen ed9789214e fixed seed initialization problem
4 years ago
Al Sutton 8ade8b8775 Remove forced clear to match new behaviour in 2da71c2a40
4 years ago
Al Sutton 09695fc6d3 Update exceptions to match updated API
4 years ago
Al Sutton 69014a701e Update API Usage
4 years ago
Michael Peter Christen 3da7628117 use environment variables to overwrite configuration variables
4 years ago
Michael Peter Christen 13a2e6dc6e Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
4 years ago
Michael Peter Christen 0ae8ccf657 Make it possible to set an empty password disabling the authentication
4 years ago
Michael Peter Christen 96592a10cf added option to set yacy configuration values using environment
4 years ago
Michael Peter Christen 198826c362 added network scanner process to discover all YaCy peers in the intranet
4 years ago
Michael Peter Christen d9602e8325 Implemented a new syntax in the template engine to simplify json APIs
4 years ago
Michael Peter Christen 5a7f12a9c1 allow network scans for non-standard http/https ports
4 years ago
sgaebel b8d264f7ec fixes logging
4 years ago
Michael Peter Christen 4c920d05b5 removed superfluous lines
4 years ago
Michael Peter Christen 907f121d0c do not overwrite PW with random PW
4 years ago
Michael Peter Christen 3e6a1e0a49 fixed surrogate process counter
4 years ago
Michael Peter Christen d3526c52af fixed a problem in warc importer: do not fail if single WARC entries are
4 years ago
Michael Peter Christen 3078b74e1d Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
4 years ago
Michael Peter Christen 01cc32217f fixed apicall call method parameters
4 years ago
Michael Peter Christen 63f58e4785 enhanced strategy in host browser
4 years ago
Michael Peter Christen 9be36800a4 increased redirect depth by one
4 years ago
Michael Peter Christen d0abb0cedb enabling all crawl profiles in all network modes
4 years ago
Michael Peter Christen baad56d83d beautified default peer names
4 years ago
Michael Peter Christen 43a9f4f574 updated solr 6.6.6 -> 7.7.3
4 years ago
Michael Peter Christen c0d9a3e9a7 turned HostBrowser into a admin-only page, now called IndexBrowser
4 years ago
Michael Peter Christen d359d521a1 fixed warc importer
4 years ago
Michael Peter Christen e54ab39958 Going back to basic authentication for console/shell commands
4 years ago
Michael Peter Christen 6271e9122c javadoc fix
4 years ago
Michael Peter Christen e0f4e3fd9a enhanced ability to debug the code
4 years ago
Michael Peter Christen eea2d71851 prevent creation of auth schema factories every time a servlet is called
4 years ago
Michael Peter Christen fcc9386ed3 enhanced the (already fast!) png exporter
4 years ago
Michael Peter Christen 4e9b425f98 missing fix for latest commit
4 years ago
Michael Peter Christen 3213d9db37 updated jetty from 9.4.17 to 9.4.35
4 years ago
Michael Peter Christen 787fec0658 reduced complexity - removed concurrency in sort
4 years ago
Michael Peter Christen cef5fde343 adding message to UI to make port change transparent
4 years ago
Michael Peter Christen 52228cb6be added a gc to cleanup process (once every 10 minutes)
4 years ago
Michael Peter Christen 22841ffbf1 creating a threaddump during every cleanup process
4 years ago
Michael Peter Christen 36e616271b do better documentation on how to set a default password
4 years ago
Michael Peter Christen df2bf9ef28 try to fix maven build error
4 years ago
Michael Peter Christen 264bab6700 trying to fight the UI unavailability
4 years ago
Michael Peter Christen 7947baeb49 removed all remaining deprecation warnings
4 years ago
Michael Peter Christen c0f6d6e11d removed one deprecation warning for jetty library initializing ssl
4 years ago
Michael Peter Christen 133440a7a6 some debug lines
4 years ago
sgaebel 3431f91db9 removes unused 'unused' tokens
4 years ago
sgaebel fc03c4b4fe removes some warning and unused objects
4 years ago
sgaebel 4a495df63a removes some deprecation-warnings
4 years ago
sgaebel dd9d4b1188 replace org.junit.Assert.assertThat by
4 years ago
sgaebel df9ea0a42a removes some warnings: unused imports, params
4 years ago
sgaebel 9bc2297161 fixes deleting during recrawl
4 years ago
sgaebel 80785b785e adds deleting during recrawl
4 years ago
Michael Peter Christen e0ad8ca9da replaced json library from JSON.org with libandroid-json-java
5 years ago
Michael Peter Christen 053e54a2c7 grant CORS for json files
5 years ago
Michael Christen cfa27d2fd5 fixed links
5 years ago
Michael Christen cb20aa7e54 removed donation message in search result column
5 years ago
Michael Christen 25227676ae removed some warnings
5 years ago
luccioman 6b45cd5799 New optional crawl filter on the URL a doc must match to crawl its links
6 years ago
luccioman d16bc99835 Added "Show Metadata" links to the ViewFile.html links mode
6 years ago
luccioman a5771b1f14 Made SNI extension user configurable without the need for server restart
6 years ago
luccioman e90405b6f0 Support parsing audio URLs without file extension
6 years ago
luccioman a8316c79da Allow JS resorting of search results by unauthenticated users
6 years ago
luccioman 0ab2b49c31 Made /yacysearch access rate limitations user configurable
6 years ago
luccioman 5b7e41202a Added Solr GSA writer support for responses from remote instances
6 years ago
luccioman 4d8a948455 Properly close PDF snapshots loaded with pdfbox library
6 years ago
luccioman 74e6d6e984 Added Solr GrepHTML writer support for responses from remote instances
6 years ago
luccioman 5e6501974d Added Solr snapshots writer support for responses from remote instances
6 years ago
luccioman 384c37102c Improve accuracy of total results count on latest pages in Stealth mode
6 years ago
luccioman 5e9a08355a Improved logging for federated search
6 years ago
luccioman 9782a98a9c Added the possibility to customize facets sort type and direction
6 years ago
sgaebel c2398fd890 remove warnings: 'Statement unnecessarily nested within else clause'
6 years ago
sgaebel 811d40a6c4 taking care of closing inputstreams, HTTPClient
6 years ago
sgaebel 8d2e7262d9 Recrawl:
6 years ago
sgaebel 8f58c1dcfa extend the SolrServlet to be usable as remote solr (incl. update)
6 years ago
luccioman 7223a2fdb1 Removed usage of now deprecated Jetty function
6 years ago
luccioman 440d9f2fa0 Exclude peers with empty or disabled RWI from remote RWI search
6 years ago
luccioman 08ea0b0397 Added a configurable timeout to wkhtmltopdf calls for pdf snapshots
6 years ago
luccioman 3fb449b3b6 Properly resolve relative URLs against document URL in html base tags
6 years ago
luccioman 73a6e45524 Extended detection of external tools used for Snapshots generation
6 years ago
luccioman 7dc1f60619 Fixed detection of absolute data folder path on MS Windows
6 years ago
luccioman 595e144797 Trace a message on incomplete proper server finish when killing process
6 years ago
luccioman 9daeea823b Fixed concurrency issue on cache used for circles rendering
6 years ago
Michael Peter Christen c347e7d3f8 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
6 years ago
Michael Peter Christen 848e9304d9 evil bots may crawl harder
6 years ago
luccioman a997133260 Fixed gzip decompression regression on index transfer APIs
6 years ago
luccioman e85f231bdf Fixed termination of Host browser and link structure Solr query threads
6 years ago
luccioman fcf6b16db4 Added new crawler attribute for finer control over Media Type detection
6 years ago
luccioman a83a56473e Added support for PDF snapshots generation when running on MS Windows
6 years ago
luccioman 8852c97cee Added basic styling for cleaner rendering of missing image snapshots
6 years ago
luccioman 746e0e788d Render a relevant HTTP status code on snapshot image rendering error
6 years ago
luccioman 50b6edfcf5 Updated Solr snapshots writer for a cleaner html head
6 years ago
luccioman f366f43d6b Made snapshots size customizable in Solr snapshots response writer
6 years ago
luccioman 7a62fc0e66 Fixed concurrency issue in custom classloader used for template classes
6 years ago
luccioman 61c337f29a Decode blacklist entries for easier edition of non ascii chars
6 years ago
luccioman ed93221fa1 Improved normalization of blacklist path patterns having non ascii chars
6 years ago
luccioman 2a73b63d9e Use a constant default target file name for seed SCP upload method
6 years ago
luccioman b5eabb626f Removed some dead code
6 years ago
luccioman db7ad76366 Improved support for Java logs file pattern options
6 years ago
luccioman 7adbd1f87d Fixed raw IPV6 addresses snapshots read/write on FAT32 and NTFS fs
6 years ago
luccioman 9b1c87033b Fixed logs folder checking and creation
6 years ago
luccioman c29588dd6a Made possible to provide an absolute data root path for start script
6 years ago
luccioman d03c098b54 Removed deprecated warning comments about imports and Debian installer
6 years ago
luccioman 5b60b4225f Fixed encoding of '+' character on search pages links
6 years ago
luccioman 54fbe166ba Updated pdf cache clear steps consistently with current pdfbox version
6 years ago
luccioman 685122363d Added a parser for XZ compressed archives.
6 years ago
luccioman 4ee14ff3c5 Fixed NullPointerException case on malformed crawl queue folder name
6 years ago
luccioman 21ad9435ec Fixed crawl queue folder naming for IPv6 hosts on MS Windows filesystems
6 years ago
luccioman 8a29551c54 Upgraded the OpenGeoDB dump URL
6 years ago
luccioman 373edf9eac Adjusted yjson Solr writer to support responses from an external Solr
6 years ago
luccioman 87bd17b1cf Simplified a little bit the RSS OpenSearch Solr writer
6 years ago
luccioman dc49ca9c27 Fixed a NPE case on the Solr OpenSearch response writer
6 years ago
luccioman f4267ed247 Made Solr OpenSearch RSS writer compatible with external Solr index
6 years ago
luccioman b1410f593a Fixed stylesheet relative URLs rendering in Solr html writer
6 years ago
luccioman 89c59814da Improved rendering of the Solr api relative url in the html writer
6 years ago
luccioman bf4f320b16 Optionally render the response header when using the Solr html writer
6 years ago
luccioman 313204ae2c Override qf and df Solr params with defaults only when they are not set
6 years ago
luccioman bdafb14336 Removed redundant synchronization lock on network switch function
6 years ago
luccioman d5f44ea216 Removed unnecessary synchronization lock from serverSwitch constructor
6 years ago
luccioman dcad393fe5 Fixed exceeding max size of failreason_s Solr field on large link list
6 years ago
luccioman f467601561 Properly lock solrInstances for reboot and restoration of embedded Solr
6 years ago
luccioman 9630f81306 Fixed small unnecessary lines of code
6 years ago
luccioman 876bcd2f54 Fixed useless comparison between int parameter and Long.MAX_VALUE
6 years ago
luccioman c726154a59 Fixed removal of URLs from the delegatedURL remote crawl stack
6 years ago
luccioman 2bdd71de60 Added server side columns sorting on the Process Scheduler table
6 years ago
luccioman bb51555830 Removed remaining unsafe accesses to SimpleDateFormat instances.
6 years ago
luccioman f895745e1c Removed more unsafe concurrent accesses to SimpleDateFormat instances.
6 years ago
luccioman e97580dfc7 Fixed unsafe concurrent access to generic SimpleDateFormat instances
6 years ago
luccioman 8811700e2e Upgraded Jetty dependency from 9.4.9 to 9.4.11
6 years ago
luccioman d53c33e4ef Fixed potential infinite loop case (does not occur in current code base)
7 years ago
luccioman a15ac8e0ca Made CrawlProfile loading tolerant to malformed json string attribute
7 years ago
luccioman a715bb7876 Fixed rendering of solr mustNoMatch value on CrawlProfileEditor_p.xml
7 years ago
luccioman 0b302c5004 Do not block whole server startup on persisted crawl profile load error
7 years ago
luccioman 4d9aa4ed1e Fixed default crawl profile solr mustnotmatch query from previous commit
7 years ago
luccioman cced94298a Added a new crawler document filter type using Solr syntax
7 years ago
Michael Christen e0dc632020 removed transformer
7 years ago
luccioman 9bc7b6c39d Allow edition of scheduled next execution dates for finer control
7 years ago
luccioman 40e8c7b89b Use the heavy ConcurrentUpdateSolrClient only when necessary
7 years ago
luccioman bd4cfeda3f Add a max acceptable limit to the size of Solr responses on p2p search
7 years ago
luccioman de4ea95687 Consistently allow gzip compression of remote Solr responses
7 years ago
luccioman cea8187161 Reuse expired connections evictors threads provided by apache and solr
7 years ago
luccioman b5dc1f376f Made outgoing pools max total connections user configurable
7 years ago
luccioman 387d646c0e Added gzip compression of responses returned to user-agents accepting it
7 years ago
luccioman a7a4ba3287 Apply remote solr configured timeout on getting connection from pool
7 years ago
luccioman ee6670fb8f Use a common pooled http connection manager for remote solr instances
7 years ago
luccioman d28f9ba0f6 Removed use of deprecated ConcurrentUpdateSolrClient constructor
7 years ago
luccioman 8a749aa5ad Trace level log message for monitoring remote solr response times
7 years ago
luccioman 35826a3091 Added a search page customization setting to display or not favicons
7 years ago
luccioman 0082b5ab2a Added missing default Solr http client connection timeout initialization
7 years ago
luccioman fa4399d5d2 Small perf improvement : initialize threads names early when possible
7 years ago
luccioman 84d82bfdd7 Adjusted suggestions timeout management
7 years ago
luccioman 65854bcb22 Fixed NullPointerException when omitHeader=true on external Solr server
7 years ago
luccioman c4d984cec8 Fixed Solr response header duplication when requesting external Solr
7 years ago
luccioman 124cc24aa3 Properly handle embedded Solr partial results
7 years ago
luccioman 3ce44cf250 Fixed largest snippet get : don't reject ones starting with a space char
7 years ago
luccioman f511e16d50 Prevent duplication of Solr query highlight fields parameters
7 years ago
luccioman e357ade47d Reduced memory footprint of text snippet extraction
7 years ago
luccioman e115e57cc7 Reduced text snippet extraction processing time.
7 years ago
sgaebel 4b79851e12 corrected icons_sizes_sxt to SolrType.string
7 years ago
luccioman 3b89c232db Easier tracking of longest text snippets initializations
7 years ago
luccioman 3c4344cb12 Fixed text snippet max init time statistic rendering
7 years ago
reger a8234b7ea7 Make sure for image resource url enabled index image pixel size fields are filled
7 years ago
luccioman e67df103b5 Removed more remaining uses of deprecated Seed.getIP() function.
7 years ago
luccioman addd18c993 Removed some remaining uses of deprecated Seed.getIP()
7 years ago
luccioman c35d0568b6 Support for preferred https in peers communication on more operations
7 years ago
luccioman e914d17aca Updated call to function deprecated since commons-codec version 1.11
7 years ago
luccioman a3ec7a7a5f Added analysis optional setting to compute statistics on text snippets
7 years ago
luccioman 1889d484de Added Solr HTML writer support for responses from remote instances
7 years ago
luccioman 2af3bf79c7 Improve rendering of remote Solr admin URLs
7 years ago
luccioman bb74de7d59 Removed unnecessary "/admin" suffix from remote Solr instance admin URL
7 years ago
luccioman 0d34034f17 Ensure an embedded Solr is available for Solr dump/restore operations
7 years ago
luccioman d92b191942 Ensure no remote Solr is attached before "Shut Down and Re-Start Solr"
7 years ago
luccioman 26d8ad591c Adjusted Solr select servlet output when using an external Solr only
7 years ago
luccioman 69690c13a0 Optionally allow external Solr server with self-signed certificate
7 years ago
luccioman b882f85900 Fixed NPE case in Solr select servlet on external Solr only setup
7 years ago
luccioman 2fd4d05e2f Added a shared Java constant for setting key server.servlets.called
7 years ago
luccioman ba9cd14516 Removed hard-coded patch for Solr 5.0 on ranking boost function
7 years ago
luccioman fb3032c530 Added a crawl filtering possibility on documents Media Type (MIME)
7 years ago
luccioman e45afedee4 Added support for enclosures (media links) to the RSS loader
7 years ago
luccioman aaefd5219c Reduce log verbosity of RSS loader on feed items with no link
7 years ago
luccioman cf62b571bd Added RSS reader support for `enclosure` feed item sub element.
7 years ago
luccioman e5f5de0fc7 Added some JavaDoc to the RSSMessage class.
7 years ago
luccioman 0d7625ecfb Handle Solr fields restrict and alias in YaCy html and exml writers
7 years ago
luccioman 3da2739bbd Parse and index more common audio metadata text tag fields.
7 years ago
luccioman 846aba00fa Added parsing of URLs eventually present in audio metadata tags
7 years ago
Michael Peter Christen 187075b878 added nav filter
7 years ago
luccioman bcbd0ae1a4 Enabled partial parsing of audio resources.
7 years ago
luccioman fda0189613 Updated audio file extensions with ones recently added to audioTagParser
7 years ago
luccioman 978e2be95b Let a chance for other parsers on audioTagParser error
7 years ago
luccioman 9e5846a26e Small fix on svg parser error message
7 years ago