#%env/templates/metas.template%# #%env/templates/header.template%# #%env/templates/submenuIndexControl.template%#

Content Analysis

These are document analysis attributes.

Double Content Detection

Double-content detection is done by ranking on a 'unique' field named 'fuzzy_signature_unique_b'. This field is set during parsing and is influenced by two attributes of the TextProfileSignature class.


minTokenLen is the minimum length of a word that is considered an element of the signature. It should be either 2 or 3.

The quantRate controls how many words take part in a signature computation. The higher the number, the fewer words are used for the signature. For minTokenLen = 2 the quantRate value should not be below 0.24; for minTokenLen = 3 the quantRate value must not be below 0.5.
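
To illustrate how the two attributes interact, here is a minimal Python sketch of a text-profile style signature. It is not the TextProfileSignature class itself: the function names `signature_profile` and `profile_signature`, the tokenizer, the rounding rule and the use of MD5 are all assumptions chosen for illustration. The sketch counts words of at least minTokenLen characters, derives a quantization step from quantRate and the highest word frequency, and keeps only the words whose rounded-down frequency reaches that step.

```python
import hashlib
import re
from collections import Counter


def signature_profile(text, min_token_len=2, quant_rate=0.24):
    """Return the (word, quantized frequency) pairs that would enter the signature.

    Illustrative sketch only; the real class may tokenize and round differently.
    """
    # Lowercase, split into alphanumeric words, drop words that are too short.
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(w for w in words if len(w) >= min_token_len)
    if not counts:
        return []

    # The quantization step grows with quant_rate, so a higher rate
    # eliminates more of the low-frequency words.
    max_freq = max(counts.values())
    quant = round(max_freq * quant_rate)
    if quant < 2:
        quant = 2 if max_freq > 1 else 1

    profile = []
    for word, freq in counts.items():
        quantized = (freq // quant) * quant  # round down to a multiple of quant
        if quantized >= quant:
            profile.append((word, quantized))

    # Deterministic order: most frequent first, ties broken alphabetically.
    profile.sort(key=lambda pair: (-pair[1], pair[0]))
    return profile


def profile_signature(text, min_token_len=2, quant_rate=0.24):
    """Hash the profile so that near-identical texts share one signature."""
    profile = signature_profile(text, min_token_len, quant_rate)
    payload = "\n".join(f"{word} {freq}" for word, freq in profile)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    a = "the cat sat on the mat the cat"
    b = "the cat sat on the rug the cat"  # differs only in one rare word
    print(profile_signature(a, 2, 0.5) == profile_signature(b, 2, 0.5))
```

In this sketch, two texts that differ only in rarely used words produce the same hash, which is what lets a 'unique' field flag them as double content. Raising quant_rate shrinks the profile further, so fewer words decide whether two documents count as duplicates.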
#%env/templates/footer.template%#