8 Commits (87f6631a2ab768d793c78bd3cd115e3d0f4c424a)

Author SHA1 Message Date
reger b017e97421 optimize condenser language detection a little.
8 years ago
reger ae3717d087 adjust Tokenizer sentence count to ignore repeated punctuation (like !!!!)
8 years ago
reger 474f0476c6 adjust Tokenizer sentence count on trailing text after last recognized sentence
8 years ago
reger 96467c5467 remove unneeded counter in Tokenizer (completing last changes)
8 years ago
reger 272cdd496a reactivate sentence counter in WordTokenizer for phrasepos ranking,
8 years ago
reger e310ec5f70 fix posInText ranking calculation to score 0 on no position info
8 years ago
Michael Peter Christen 90f75c8c3d added enrichment of synonyms and vocabularies for imported documents
10 years ago
Michael Peter Christen 7829480b82 refactoring: separated condenser and tokenizer
10 years ago