28 Commits (4df63626f54720c56b3b5a5584ba3040e313f86c)

Author SHA1 Message Date
orbiter bfcf9b7aa3 - added language detection using metadata from documents: html and odt documents provide this information
16 years ago
danielr 3bb870bfcd added final where possible
17 years ago
orbiter c3d461d191 - removed superfluous copyright statement
17 years ago
danielr 7feae906aa - organize imports
17 years ago
orbiter 87a8747ce3 - enhanced recognition, parsing, management and double-occurrence-handling of image tags
17 years ago
orbiter efd0b8371a - added parsing of Dublin Core - compliant metadata (see RFC 5013 and ISO 15836) to html parser
17 years ago
low012 b08f877e97 *) tried to get rid of warnings when compiling parsers (http://forum.yacy-websuche.de/viewtopic.php?t=660)
17 years ago
orbiter e22014dc83 some memory enhancements when generating and displaying ymage objects
17 years ago
orbiter daf0f74361 joined anomic.net.URL, plasmaURL and url hash computation:
17 years ago
orbiter 40b0547611 - documentation changes (removed old forum links)
18 years ago
orbiter 6b9eea3932 - removed differentiation between longTitle and shortTitle; this cannot be used for search results,
18 years ago
orbiter a738b57b31 added author tag to indexing content
18 years ago
theli f17ce28b6d *) plasmaHTCache:
18 years ago
theli cd5f349666 *) Better handling of large files during parsing
18 years ago
theli 813a8a8179 *) migration of mimeTypeParser to jmimemagic 0.1
18 years ago
theli b6c7b91582 *) Parser now throws a ParserException instead of returning null on parsing errors (e.g. needed by snippet fetcher)
18 years ago
orbiter 3aac5b26da - added automatic tag generation when a web page from the search results is added
18 years ago
theli 74c3e7cf29 *) storing document charset into plasmaParserDocument object (is needed later by the condenser)
18 years ago
theli d0a5a53789 *) changes needed for multi-language support
18 years ago
theli f3ac4dbbb9 *) better handling of server shutdown
18 years ago
orbiter 3879a0ecd0 replaced java.net.URL usage by use of new class de.anomic.net.URL
19 years ago
orbiter 83e0e765ec redesigned some parts of the html scanner & parser
19 years ago
orbiter 3d8a5ae652 code cleanup
19 years ago
theli bdf30117c1 *) Redesign of parser configuration
19 years ago
theli c2fe3a1670 *) Updating jMimeMagic Ruleset
19 years ago
theli ca26aab9b1 *) More debugging output for migrateWords
19 years ago
hydrox cb69047b91 *) cleanup access static methods and fields
19 years ago
theli 361f05978d Multiple updates regarding the yacy seedUpload facility,
20 years ago