66 Commits (6412c926bce37dca1e605d4f197685957fe89b31)

Author SHA1 Message Date
orbiter 8fdefd5c68 generalization of payload definition of index storage
18 years ago
low012 4feaa91890 *) Added additional MIME-Type.
18 years ago
low012 89af433879 *) Deleted parts of WebCat that were not needed for parsing SWFs.
18 years ago
low012 8c9bc7e341 *) extracting urls works now
18 years ago
low012 493391e42d *) new flash parser, still experimental
18 years ago
octoate e4a3574b77 StringBuffer now resets every time the parser is called
18 years ago
octoate cc24dde5e0 First version of a MS Excel parser based on Apache POI
18 years ago
octoate 1c4076da8a First version of the MS Powerpoint parser based on Apache POI
18 years ago
theli 5b75d64d7d *) bugfix for last commit
18 years ago
theli 71ed104bc7 *) adding additional rpm mimetype (used by packman)
18 years ago
theli 1586d57187 *) odtParser: better handling of large files
18 years ago
theli f17ce28b6d *) plasmaHTCache:
18 years ago
theli cd5f349666 *) Better handling of large files during parsing
18 years ago
orbiter df1629b05a - code cleanup
18 years ago
theli b73efd5565 *) missing changes needed because of last commit
18 years ago
theli 813a8a8179 *) migration of mimeTypeParser to jmimemagic 0.1
18 years ago
theli b6c7b91582 *) Parser now throws a ParserException instead of returning null on parsing errors (e.g. needed by snippet fetcher)
18 years ago
theli 97d2a08ef1 *) restructuring needed to support parsing of documents using various charsets
18 years ago
orbiter 3aac5b26da - added automatic tag generation when a web page from the search results is added
18 years ago
theli 74c3e7cf29 *) storing document charset into plasmaParserDocument object (is needed later by the condenser)
18 years ago
theli d0a5a53789 *) changes needed for multi-language support
18 years ago
theli b0e8ff6eda *) some TODO markers for UTF-8 problem
18 years ago
theli f3ac4dbbb9 *) better handling of server shutdown
18 years ago
theli 9d13aeca13 *) removing class. does not work so far
19 years ago
theli 95a84ae469 *) adding missing classes
19 years ago
orbiter 3879a0ecd0 replaced java.net.URL usage by use of new class de.anomic.net.URL
19 years ago
theli 45b39ee1be *) solving unpacking problems with too long filename by
19 years ago
orbiter 015d044c25 tried to fix some problems with latest changes to httpc
19 years ago
orbiter 83e0e765ec redesigned some parts of the html scanner & parser
19 years ago
orbiter b21b9df2d0 added section headlines generation to html parser
19 years ago
orbiter 9544c47684 added some UTF-8 handling.
19 years ago
orbiter 9086261476 refactoring of base64 encoding:
19 years ago
theli 44fa94ac52 *) Modifications for dbImport functionality
19 years ago
orbiter 3d8a5ae652 code cleanup
19 years ago
orbiter a04930f025 code cleanup
19 years ago
theli 8ed0aaae8d *) Adding content Parser for RPM Files
19 years ago
theli 818d37ce44 *) Removing getSimpleName
19 years ago
theli bdf30117c1 *) Redesign of parser configuration
19 years ago
theli 90d6c6223b *) Adding color codes to network graphic legend
19 years ago
theli c2fe3a1670 *) Updating jMimeMagic Ruleset
19 years ago
theli ca26aab9b1 *) More debugging output for migrateWords
19 years ago
hydrox cb69047b91 *) cleanup access static methods and fields
19 years ago
hydrox 56b9f34411 *) removed unused imports
19 years ago
theli b990dc1ad1 *) Replacing jsch 0.1.19 lib with newer version 0.1.21
19 years ago
orbiter 858cd94299 replaced indexing ram-queue by file-based stack-queue
20 years ago
theli db3ed75728 *) closing stream correctly
20 years ago
theli 9e47ba5ad6 *) adding missing calls for function close() to avoid "too many open files" bug
20 years ago
theli 9a98988c3c *) Bugfix for SSL/NIO Bug
20 years ago
theli 890e3f4d4a *) adding missing calls for function close() to avoid "too many open files" bug *) adding
20 years ago
theli 6dd3ec0dc4 *) Adding debug="true" debuglevel="lines,vars,source" to ant build files
20 years ago