
11 Commits (4f6658b1153e51c80fe3449a7656e52ecafbcb33)

| Author | SHA1 | Message | Date |
|---|---|---|---|
| orbiter | b57c9da1f8 | - fixes to doc, ppt, xls parser: better title | 16 years ago |
| orbiter | bfcf9b7aa3 | - added language detection using metadata from documents: html and odt documents provide this information | 16 years ago |
| danielr | 3bb870bfcd | added final where possible | 17 years ago |
| orbiter | c3d461d191 | - removed superfluous copyright statement | 17 years ago |
| orbiter | efd0b8371a | - added parsing of Dublin Core - compliant metadata (see RFC 5013 and ISO 15836) to html parser | 17 years ago |
| low012 | b08f877e97 | *) tried to get rid of warnings when compiling parsers (http://forum.yacy-websuche.de/viewtopic.php?t=660) | 17 years ago |
| orbiter | daf0f74361 | joined anomic.net.URL, plasmaURL and url hash computation: | 17 years ago |
| theli | 75d90834a2 | *) adding additional file extension for powerpoint | 18 years ago |
| orbiter | 6b9eea3932 | - removed differentiation between longTitle and shortTitle; this cannot be used for search results, | 18 years ago |
| orbiter | a738b57b31 | added author tag to indexing content | 18 years ago |
| octoate | 1c4076da8a | First version of the MS Powerpoint parser based on Apache POI | 18 years ago |