Commit Graph

7 Commits (117a85987989210f3b3295778e12bbaf2f5cd733)

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| luccioman | 58b9834729 | Added HTML microdata typed items parsing capability. | 7 years ago |
| Michael Peter Christen | 25573bd5ab | added a crawl filter based on <div> tag class names | 7 years ago |
| luccioman | bf55f1d6e5 | Started support of partial parsing on large streamed resources. | 8 years ago |
| luccioman | 9b1bb2545e | Refactored plain-text URLs detection implementation. | 8 years ago |
| reger | cb95b7339a | include html5 <time> tag in content scraper, | 8 years ago |
| luccioman | 7717a3d43d | Fixed license headers on files created to improve favicon management. | 8 years ago |
| luc | 3cc5619d93 | Improved HTML icons indexing and rendering in search results. | 9 years ago |