Commit Graph

290 Commits (326b5f6e6e8a1e8e526878216a4b148788802dc0)

Author SHA1 Message Date
reger24 91a2ad1457 fix test xlsx file with correct anchor
3 years ago
Michael Peter Christen 0579a9546a changed link to new forum location
3 years ago
Michael Peter Christen 3959d43a5c fixed documentation link
3 years ago
sgaebel fc03c4b4fe removes some warning and unused objects
4 years ago
sgaebel 4a495df63a removes some deprecation-warnings
4 years ago
sgaebel dd9d4b1188 replace org.junit.Assert.assertThat by
4 years ago
sgaebel df9ea0a42a removes some warnings: unused imports, params
4 years ago
Michael Peter Christen 64a17faca0 added debug code to parser test to investigate why this fails in travis
4 years ago
Michael Peter Christen f0f12f875b fix for failing parser test: new forum link
5 years ago
Michael Christen 3a46b07603 fixed many links to old forum, now https://searchlab.eu
5 years ago
luccioman e90405b6f0 Support parsing audio URLs without file extension
6 years ago
luccioman 3fb449b3b6 Properly resolve relative URLs against document URL in html base tags
6 years ago
luccioman 9daeea823b Fixed concurrency issue on cache used for circles rendering
6 years ago
luccioman 61c337f29a Decode blacklist entries for easier edition of non ascii chars
6 years ago
luccioman ed93221fa1 Improved normalization of blacklist path patterns having non ascii chars
6 years ago
luccioman 685122363d Added a parser for XZ compressed archives.
6 years ago
luccioman 26aa5f7a0f Suppress compilation warning on unit testing intentional failure
6 years ago
luccioman f895745e1c Removed more unsafe concurrent accesses to SimpleDateFormat instances.
6 years ago
luccioman e97580dfc7 Fixed unsafe concurrent access to generic SimpleDateFormat instances
6 years ago
luccioman cced94298a Added a new crawler document filter type using Solr syntax
7 years ago
luccioman 2c155ece77 Fixed JUnit test after removal of unused Transformer
7 years ago
luccioman e357ade47d Reduced memory footprint of text snippet extraction
7 years ago
luccioman e115e57cc7 Reduced text snippet extraction processing time.
7 years ago
luccioman 3b89c232db Easier tracking of longest text snippets initializations
7 years ago
luccioman 8d7099a081 Handle escaped line breaks and separators in vocabulary import from CSV
7 years ago
luccioman eb20589e29 Fixed issue #158 : completed div CSS class ignore in crawl
7 years ago
luccioman 33593c22e9 Fixed loss of other modifiers on keywords/tags search navigation links
7 years ago
luccioman 9412881230 Added basic support for autotagging microdata annotated item types.
7 years ago
luccioman 5a14d34a7d Refactoring : documented and extracted autotagging processing functions.
7 years ago
luccioman 58b9834729 Added HTML microdata typed items parsing capability.
7 years ago
luccioman fa6d030b0b Moved dbtest to the test source folder.
7 years ago
luccioman 098ee63911 Added a manual performance test for the HostBalancer.
7 years ago
luccioman 46b5249c20 Removed time condition on HostBalancer initialization in JUnit test.
7 years ago
luccioman 36e9b1c5b3 Fixed SegmentTest test case time-dependent occasional failures
7 years ago
Michael Peter Christen b907819cb4 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
7 years ago
Michael Peter Christen 25573bd5ab added a crawl filter based on <div> tag class names
7 years ago
luccioman d95b288f19 Removed use of deprecated Jetty IPAccessHandler for client filtering.
7 years ago
luccioman 0a120787e3 Improved accuracy of URLs search filters : protocol, tld, host, file ext
7 years ago
luccioman d1c7dfd852 Fixed URL parsing with fragment and empty path
7 years ago
luccioman e2f6427a63 Added a basic JUnit test for the Visio parser (vsdParser)
7 years ago
luccioman d41ad7af6f Restore initial locale at the end of a JUnit test case which modifies it.
7 years ago
luccioman 7206f1ed71 Do locale neutral case conversions on domain names.
7 years ago
luccioman 398c66f06c Do locale neutral case conversions in MultiProtocolURL
7 years ago
luccioman 9531b83598 Do locale neutral case conversions in Classification
7 years ago
luccioman ac209cac2e Updated the generic top-level known domains list.
7 years ago
luccioman fcd57e2d0f Improved some JUnit tests isolation and resources release
7 years ago
luccioman e0eda84c24 Remove old hard-coded holiday dates from DateDection class.
7 years ago
luccioman 73977ec0fe Added a html parser charset detection unit test
7 years ago
luccioman 285f0d6a39 Consistently encode snapshot image with format requested on the API.
7 years ago
luccioman 7c319c841e Fixed pdf2image conversion with imagemagick on PDFs having transparency
7 years ago