decoder wrote:
> Hello there,
>
> I have improved the original OcrPlugin (found at
> http://wiki.apache.org/spamassassin/OcrPlugin) so that it now does
> fuzzy matching. That way, mistakes made by the OCR or intentional
> obfuscations in the text don't prevent recognition. This is done
> by calculating a relative distance between the pattern (a word
> from a given word list) and a line in the recognized input. The
> plugin also uses dynamic scoring (more matched words mean a higher
> score; this can be adjusted in the source).
>
> You can find a full description and an example in the wiki under:
>
> http://wiki.apache.org/spamassassin/FuzzyOcrPlugin
>
>
> Ideas for improvements or criticism are always welcome :)
>
>
> Best regards,
>
>
> Chris
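
For readers who have not looked at the source, the fuzzy matching and
dynamic scoring described in the quoted announcement can be sketched
roughly as follows. This is only an illustration in Python, not the
plugin's actual code. The function names (levenshtein,
relative_distance, fuzzy_score), the sliding-window comparison, the
0.25 threshold and the score values are all assumptions made for the
example.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def relative_distance(word: str, line: str) -> float:
    """Smallest edit distance between `word` and any window of `line`
    that has the same length as `word`, divided by len(word).

    Assumption: a fixed-size sliding window; the real plugin may
    compare the word against the line differently.
    """
    if not word:
        raise ValueError("word must not be empty")
    word, line = word.lower(), line.lower()
    if len(line) <= len(word):
        return levenshtein(word, line) / len(word)
    best = min(levenshtein(word, line[i:i + len(word)])
               for i in range(len(line) - len(word) + 1))
    return best / len(word)


def fuzzy_score(lines, wordlist, threshold=0.25,
                base=1.0, per_extra_word=0.5, min_words=2):
    """Return (score, matched_words).

    A word counts as matched if some line is within `threshold`
    relative distance of it. The score grows with the number of
    matched words (hypothetical values, not the plugin's defaults).
    """
    matched = [w for w in wordlist
               if any(relative_distance(w, ln) <= threshold for ln in lines)]
    if len(matched) < min_words:
        return 0.0, matched
    return base + per_extra_word * (len(matched) - min_words), matched


if __name__ == "__main__":
    ocr_lines = ["V1AGRA for y0u", "cheap rep1ica w@tches"]
    words = ["viagra", "replica", "watches", "mortgage"]
    print(fuzzy_score(ocr_lines, words))
```

Dividing the distance by the word length is what makes it "relative":
one wrong character costs a short word more than a long one, so a
single threshold can be applied to the whole word list.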

A new beta is available (2.2-beta1).

It fixes a bug with JPEG content types reported by Matthias Keller.
Other changes:

- Removed the separate debug-file handling; instead, the temp files are
  not deleted when in debug mode (verbose > 1).
- Added logfile support; all debug messages go there.
- Many more debug messages.
- Added error handling and logging (thanks to Ron Bender for pointing
  that out).
- Added the necessary priority line to the cf file (thanks to Mark
  Martinec and others for reminding me about that).

Please note that this is a beta, so you should probably try it out
in a non-production environment before blaming me ;D

Chris