Marc Perkel wrote:
The words file needs a little documentation. Is it limited to single
words or phrases too? What's with the colon and the numbers after the
word?
Phrases are possible too. Spaces and numbers are stripped out of both
the wordlist and the OCR output before matching :)
The colon followed by a number sets a custom matching threshold for
that word. The default threshold is defined in FuzzyOcr.cf, but it makes
sense to override it for specific words which often trigger false
positives (FPs) with the default threshold.
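
To make the format concrete, here is a rough Python sketch of how one wordlist line could be read. It is only an illustration, not the plugin's actual code: the names normalize, parse_entry and DEFAULT_THRESHOLD are made up, the 0.25 default is a placeholder for whatever FuzzyOcr.cf sets, and the lowercasing is my assumption. The threshold has to be split off before the digits are stripped, otherwise it would be removed together with the other numbers.

```python
import re

# Placeholder value; the real default comes from FuzzyOcr.cf.
DEFAULT_THRESHOLD = 0.25


def normalize(text):
    """Drop whitespace and digits, as done for both the wordlist and the OCR output.

    Lowercasing is an assumption added for this sketch.
    """
    return re.sub(r"[\s\d]+", "", text.lower())


def parse_entry(line, default=DEFAULT_THRESHOLD):
    """Split 'word or phrase:threshold' into (normalized text, threshold).

    Entries without a colon fall back to the default threshold.
    """
    line = line.strip()
    text, sep, value = line.rpartition(":")
    if not sep:
        # No colon on this line: the whole line is the word or phrase.
        text, value = line, ""
    threshold = float(value) if value else default
    return normalize(text), threshold


if __name__ == "__main__":
    for entry in ["investor", "target price", "stock:0.2"]:
        print(parse_entry(entry))
```

With this sketch, "target price" is matched as "targetprice" with the default threshold, while "stock:0.2" keeps its own threshold of 0.2.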
Best regards,
Chris