Commit Graph

18 Commits (366ceae35abb3727580faeba9f3a8bdfaa301fd6)

Author | SHA1 | Message | Date
luccioman | bf55f1d6e5 | Started support of partial parsing on large streamed resources. | 8 years ago
luccioman | 2a87b08cea | Removed temporary html parser test code | 8 years ago
luccioman | 90a7c1affa | HTML parser : removed unnecessary remaining recursive processing | 8 years ago
luccioman | 9b1bb2545e | Refactored plain-text URLs detection implementation. | 8 years ago
luccioman | 8da3174867 | Ensure lower case conversion consistency with any default locale. | 8 years ago
luccioman | 319231a458 | Added a generic XML parser, able to parse elements text and URLs. | 8 years ago
luccioman | 1acb7005d0 | Added a basic JUnit test with test gz files for the gzip parser | 8 years ago
luccioman | 1e2fb76720 | Properly close test files in htmlParser unit test | 8 years ago
luccioman | a04feac064 | Ensure file input streams proper closing in both success and failures | 8 years ago
luccioman | d98c04853d | Ensure proper closing of file input streams. | 8 years ago
reger | 41e2ee0eca | Fix call parameter for ConnectionInfo in MonitorHandler | 8 years ago
reger | f254fcfc67 | fix htmlParser <script> text extraction on code containing expression | 8 years ago
reger | cb95b7339a | include html5 <time> tag in content scraper, | 8 years ago
luccioman | 7717a3d43d | Fixed license headers on files created to improve favicon management. | 8 years ago
luccioman | 6e1959f469 | Merge branch 'master' of https://github.com/yacy/yacy_search_server.git | 8 years ago
reger | ebde21079a | refactor xlsParser to include Excel file attribute (like author) in parser result doc. | 9 years ago
luc | 3cc5619d93 | Improved HTML icons indexing and rendering in search results. | 9 years ago
reger | 84c970eaec | move test classes to test/java (subdirectory as in Maven standard subdir layout) | 9 years ago