Commit Graph

8 Commits (d0182e47975a17855659609a82e8fd5679aacc6b)

Author SHA1 Message Date
reger b017e97421 optimize condenser language detection a little.
9 years ago
reger ae3717d087 adjust Tokenizer sentence count to ignore repeated punctuation (like !!!!)
9 years ago
reger 474f0476c6 adjust Tokenizer sentence count on trailing text after last recognized sentence
9 years ago
reger 96467c5467 remove not needed counter in Tokenizer (completing last changes)
9 years ago
reger 272cdd496a reactivate sentence counter in WordTokenizer for phrasepos ranking,
9 years ago
reger e310ec5f70 fix posInText ranking calculation to score 0 on no position info
9 years ago
Michael Peter Christen 90f75c8c3d added enrichment of synonyms and vocabularies for imported documents
10 years ago
Michael Peter Christen 7829480b82 refactoring: separated condenser and tokenizer
10 years ago