Commit Graph

10 Commits (6d5e9ff53f4090e24a0cbe601df5665fb10b6ddf)

Author | SHA1 | Message | Date
Michael Peter Christen | 8285fe715a | tab to spaces for classes supporting the condenser. | 1 year ago
luccioman | 5a14d34a7d | Refactoring : documented and extracted autotagging processing functions. | 7 years ago
reger | b017e97421 | optimize condenser language detection a little. | 8 years ago
reger | ae3717d087 | adjust Tokenizer sentence count to ignore repeated punktuation (like !!!! ) | 8 years ago
reger | 474f0476c6 | adjust Tokenizer sentence count on trailing text after last recognized sentence | 8 years ago
reger | 96467c5467 | remove not needed counter in Tokeninzer (completing last changes) | 8 years ago
reger | 272cdd496a | reactivate sentence counter in WordTokenizer for phrasepos ranking, | 8 years ago
reger | e310ec5f70 | fix posInText ranking calculation to score 0 on no position info | 8 years ago
Michael Peter Christen | 90f75c8c3d | added enrichment of synonyms and vocabularies for imported documents | 10 years ago
Michael Peter Christen | 7829480b82 | refactoring: separated condenser and tokenizer | 10 years ago