11 Commits (0139988c044f72c24e797cbc7cae419013b02b39)

Author | SHA1 | Message | Date
orbiter | b57c9da1f8 | - fixes to doc, ppt, xls parser: better title | 16 years ago
orbiter | bfcf9b7aa3 | - added language detection using metadata from documents: html and odt documents provide this information | 17 years ago
danielr | 3bb870bfcd | added final where possible | 17 years ago
orbiter | c3d461d191 | - removed superfluous copyright statement | 17 years ago
orbiter | efd0b8371a | - added parsing of Dublin Core - compliant metadata (see RFC 5013 and ISO 15836) to html parser | 17 years ago
low012 | b08f877e97 | *) tried to get rid of warnings when compiling parsers (http://forum.yacy-websuche.de/viewtopic.php?t=660) | 17 years ago
orbiter | daf0f74361 | joined anomic.net.URL, plasmaURL and url hash computation: | 18 years ago
theli | 75d90834a2 | *) adding additional file extension for powerpoint | 18 years ago
orbiter | 6b9eea3932 | - removed differentiation between longTitle and shortTitle; this cannot be used for search results, | 18 years ago
orbiter | a738b57b31 | added author tag to indexing content | 18 years ago
octoate | 1c4076da8a | First version of the MS Powerpoint parser based on Apache POI | 19 years ago