11 Commits (ecb55c958d811cdfb1f518a6b30986653e59ea67)

Author | SHA1 | Message | Date
Michael Peter Christen | 0579a9546a | changed link to new forum location | 3 years ago
Michael Christen | 3a46b07603 | fixed many links to old forum, now https://searchlab.eu | 6 years ago
luccioman | 3fb449b3b6 | Properly resolve relative URLs against document URL in html base tags | 6 years ago
luccioman | 2c155ece77 | Fixed JUnit test after removal of unused Transformer | 7 years ago
luccioman | 58b9834729 | Added HTML microdata typed items parsing capability. | 7 years ago
Michael Peter Christen | 25573bd5ab | added a crawl filter based on <div> tag class names | 7 years ago
luccioman | bf55f1d6e5 | Started support of partial parsing on large streamed resources. | 8 years ago
luccioman | 9b1bb2545e | Refactored plain-text URLs detection implementation. | 8 years ago
reger | cb95b7339a | include html5 <time> tag in content scraper, | 8 years ago
luccioman | 7717a3d43d | Fixed license headers on files created to improve favicon management. | 9 years ago
luc | 3cc5619d93 | Improved HTML icons indexing and rendering in search results. | 9 years ago