-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

decoder wrote:
> Hello there,
>
> I have improved the original OcrPlugin (found at
> http://wiki.apache.org/spamassassin/OcrPlugin) so that it now does
> fuzzy matching. That way, mistakes made by the OCR step or
> intentional obfuscations in the text no longer make recognition
> impossible. This is done with a relative distance calculation
> between the pattern (a word from a given word list) and a line in
> the recognized input. The plugin also uses dynamic scoring (more
> matched words means a higher score; this can be adjusted in the
> source).
>
> You can find a full description and an example in the wiki under:
>
> http://wiki.apache.org/spamassassin/FuzzyOcrPlugin
>
>
> Ideas for improvements or criticism are always welcome :)
>
>
> Best regards,
>
>
> Chris
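For anyone who wants to picture the fuzzy matching and dynamic scoring described in the quoted message, here is a rough sketch in Python. It is not the plugin's code. The function names, the sliding-window approach, the 0.25 threshold, and the score values are all assumptions made up for illustration; the real values and logic are in the plugin source.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling DP row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # substitute
        prev = cur
    return prev[-1]


def relative_distance(word: str, line: str) -> float:
    """Best edit distance between word and any window of the line that
    has the same length as word, divided by len(word).
    (Assumed definition of "relative distance"; case is ignored.)"""
    word, line = word.lower(), line.lower()
    if not word:
        return 0.0
    if len(line) <= len(word):
        return edit_distance(word, line) / len(word)
    best = min(edit_distance(word, line[i:i + len(word)])
               for i in range(len(line) - len(word) + 1))
    return best / len(word)


def matched_words(lines, wordlist, threshold=0.25):
    """Return the words that fuzzily match at least one OCR line.
    threshold is a hypothetical cut-off, not the plugin's setting."""
    return {word for word in wordlist
            if any(relative_distance(word, line) <= threshold
                   for line in lines)}


def dynamic_score(n_hits, min_hits=2, base=4.0, per_extra=1.0):
    """More matched words means a higher score (all values hypothetical)."""
    if n_hits < min_hits:
        return 0.0
    return base + (n_hits - min_hits) * per_extra


if __name__ == "__main__":
    ocr_lines = ["Buy V1agra n0w", "ch3ap st0ck alert"]
    hits = matched_words(ocr_lines, ["viagra", "stock", "mortgage"])
    print(sorted(hits), dynamic_score(len(hits)))
```

Dividing the distance by the word length keeps the tolerance proportional: one wrong character matters more in a five-letter word than in a ten-letter one.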

A new beta is available (2.2-beta1). It includes a fix for a bug with
JPEG content types reported by Matthias Keller.

Other changes:
- The debug file handling has been removed; instead, the temp files
  are not deleted when in debug mode (verbose > 1).
- Logfile support: all debug messages go there.
- Many more debug messages.
- Error handling and logging (thanks to Ron Bender for pointing that
  out).
- Added the necessary priority line to the cf file (thanks to Mark
  Martinec and others for reminding me about that).

Please note that this is a beta, so you should probably try it out in
a non-production environment first before blaming me ;D

Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE5HWiJQIKXnJyDxURAvBCAJ9rsVctqQcMC76duSL8YP23L4mPjQCggwv+
gYGWlMO1FSkJ9jud+7tatZc=
=gcsV
-----END PGP SIGNATURE-----