Commit Graph

20 Commits (0797b0de99cf6a32958405e79009b8939ea30c18)

Author SHA1 Message Date
orbiter b77b8cac0c - enhanced html parser: recognized much more details in the content
14 years ago
orbiter 9b25d07295 - added geo information parsing to html parser
14 years ago
orbiter 694fa3a2a5 - replaced more direct string-based UTF-8 conversions by predefined UTF-8 conversion
14 years ago
orbiter e1b6916423 always try to guess the size of a StringBuilder to prevent too many memory re-allocations
14 years ago
orbiter cb1f49d0f2 replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'.
14 years ago
orbiter 4588b5a291 - fixed document number limitation for crawls that restrict the number of documents per domain
14 years ago
low012 2a6499364d *) minor changes
14 years ago
low012 c0274bd123 *) minor changes
14 years ago
orbiter 7bdb13bf7f more fixes to smb crawling: better file names
14 years ago
orbiter b769cce433 - added a catch-all parser for all documents that cannot be parsed: they will contributed with their document url for the search index only
14 years ago
orbiter 58e74282af added a word counter statistic in condenser which is used by the did-you-mean to calculate best matches for given search words.
14 years ago
orbiter 0010cd9db1 Support for indexing of RSS feeds!
14 years ago
orbiter b6fb239e74 redesign of parser interface:
15 years ago
orbiter 11639aef35 - added new protocol loader for 'file'-type URLs
15 years ago
orbiter cf43bdc87e This is a large bugfix and enhancement commit to support a better location detection for data
15 years ago
orbiter 54af9e6b49 - added parsing of robots meta-tag in html headers to detect a noindexing request
15 years ago
orbiter 5df628a2a4 - added BEncoder class
15 years ago
orbiter dff4f95c78 some patches to get the torrent parser working
15 years ago
orbiter fbd24c2d84 integrated the torrent parser
15 years ago
orbiter bd32f8b8cb added a torrent metadata file parser
15 years ago