-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

decoder wrote:
> Hello there,
>
> I have improved the original OcrPlugin (found at
> http://wiki.apache.org/spamassassin/OcrPlugin) so that it now does
> fuzzy matching. That way, mistakes made by the OCR recognition or
> intentional obfuscation in the text no longer make recognition
> impossible. This is done by calculating a relative distance
> between the pattern (a word from a given word list) and a line of
> the recognized input. The plugin also uses dynamic scoring (more
> matched words means a higher score; this can be adjusted in the
> source).
>
> You can find a full description and an example in the wiki under:
>
> http://wiki.apache.org/spamassassin/FuzzyOcrPlugin
>
> Ideas for improvements and criticism are always welcome :)
>
> Best regards,
>
> Chris
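For anyone who wants a feel for the relative-distance idea in the quoted text, here is a rough sketch. It is Python for illustration only and is not the plugin's actual code: the function names, the 0.25 threshold, and the scoring constants (base, per_word, min_hits) are all assumptions made up for this example. It computes the edit distance between a word and the best-matching stretch of a line, divides by the word length, and scores on the number of words that matched.

```python
def best_substring_distance(pattern: str, line: str) -> int:
    """Smallest edit distance between pattern and any substring of line."""
    # Row for the empty pattern: all zeros, so a match may start anywhere.
    prev = [0] * (len(line) + 1)
    for i, pc in enumerate(pattern, start=1):
        cur = [i] + [0] * len(line)
        for j, lc in enumerate(line, start=1):
            cost = 0 if pc == lc else 1
            cur[j] = min(
                prev[j] + 1,         # pattern character missing from the line
                cur[j - 1] + 1,      # extra character in the line
                prev[j - 1] + cost,  # match or substitution
            )
        prev = cur
    # The match may also end anywhere in the line.
    return min(prev)


def relative_distance(pattern: str, line: str) -> float:
    """Edit distance normalised by pattern length (0.0 means exact match)."""
    pattern, line = pattern.lower(), line.lower()
    if not pattern:
        return 0.0
    return best_substring_distance(pattern, line) / len(pattern)


def count_hits(words, lines, threshold=0.25):
    """Number of words that fuzzily occur in at least one line.

    The threshold is a hypothetical default, not the plugin's value.
    """
    return sum(
        1 for w in words
        if any(relative_distance(w, l) <= threshold for l in lines)
    )


def dynamic_score(hits, base=1.0, per_word=0.5, min_hits=2):
    """More matched words give a higher score (all constants are assumed)."""
    if hits < min_hits:
        return 0.0
    return base + per_word * (hits - min_hits)
```

With these assumed values, a line such as "buy v1agra now" is one edit away from "viagra", which gives a relative distance of 1/6 and counts as a hit.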

See http://wiki.apache.org/spamassassin/FuzzyOcrPlugin

Major changes:
- Replaced ImageMagick with netpbm
- Added PNG support
- Broken GIFs are now run through giffix
- The image format is detected from magic bytes rather than from
  the Content-Type
- Added various configuration options

Feedback is welcome :)

Chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE2PqdJQIKXnJyDxURAnFuAJ4vfLmW4UZUO0YH0EGcJlyNwJMUsACdGmAJ
1ZfXWyUvpaJ8ZNC1HeRMbLA=
=/Cyu
-----END PGP SIGNATURE-----
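
For illustration, detecting the image format from magic bytes instead of trusting the declared Content-Type might look roughly like the sketch below. It is Python, not the plugin's code, and the function name is hypothetical. The GIF and PNG signatures are the standard ones; the JPEG entry is an assumption added for the example, since the message above only mentions GIF and PNG.

```python
# Leading byte signatures mapped to a format name.
MAGIC_SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"\xff\xd8\xff", "jpeg"),  # assumption: not mentioned in the message
)


def sniff_image_format(data: bytes):
    """Return a format name based on the leading bytes, or None if unknown."""
    for magic, name in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return name
    return None
```

The point of checking the bytes is that a spammer can label an attachment with any Content-Type, while the decoder still has to be handed a file in the format it really is.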