Perhaps corrupted gifs should be treated as spam?

decoder wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello again,

I only wanted to add a small note: I recently saw gifs that cannot be converted using imagemagick because they are either sloppily generated or intentionally partly corrupted. Please think about using giftopnm and jpegtopnm instead. If you have a better idea, tell me.

To use giftopnm and jpegtopnm, change the code from:

    if (($ctype eq "image/gif") || ($ctype eq "image/jpeg")) {
        open OCR, "|/usr/bin/convert - pnm:-|/usr/bin/gocr -i - > /tmp/spamassassin.focr.$$";

to:

    if (($ctype eq "image/gif") || ($ctype eq "image/jpeg")) {
        if ($ctype eq "image/gif") {
            open OCR, "|/usr/bin/giftopnm - |/usr/bin/gocr -i - > /tmp/spamassassin.focr.$$";
        } else {
            open OCR, "|/usr/bin/jpegtopnm - |/usr/bin/gocr -i - > /tmp/spamassassin.focr.$$";
        }

Note that with imagemagick, things can get really bad. For a 7kb gif file, I saw the conversion take about 30 seconds and then end with an error message from imagemagick. So I really advise you to change the code to use different tools. These will also complain, for example:

    giftopnm: Extraneous data at end of image. Skipped to end of image
    giftopnm: bogus character 0x4f, ignoring
    giftopnm: bogus character 0xa7, ignoring
    giftopnm: bogus character 0xc0, ignoring
    giftopnm: bogus character 0x8a, ignoring
    giftopnm: Unable to read Color 33 from colormap

But giftopnm still continues and the text gets recognized correctly.

Best regards,

Chris

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFE2F0IJQIKXnJyDxURAtTdAJ4nx25dKbocHd7DW+ff1biW3GFmMACeO7t0
ZjYofyRHdknL5L3GcyMdgLo=
=e1ze
-----END PGP SIGNATURE-----
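
The per-type converter choice described in the message could also be expressed as a lookup table, which may make it easier to add further image types later. The following is only a minimal sketch, not the plugin's actual code: the hash %CONVERTER and the sub ocr_pipeline_for are hypothetical names, and the binary paths are simply the ones quoted in the message.

```perl
use strict;
use warnings;

# Hypothetical lookup table: MIME type => converter binary.
# The paths are the ones quoted in the message; adjust for your system.
my %CONVERTER = (
    'image/gif'  => '/usr/bin/giftopnm',
    'image/jpeg' => '/usr/bin/jpegtopnm',
);

# Hypothetical helper: build the pipeline string for a content type.
# Returns undef when no converter is known for that type.
sub ocr_pipeline_for {
    my ($ctype, $outfile) = @_;
    my $converter = $CONVERTER{ lc($ctype // '') } or return;
    return "|$converter - | /usr/bin/gocr -i - > $outfile";
}

# Possible use inside the plugin (commented out so that this sketch
# does not start any external program when run on its own):
#
#   my $cmd = ocr_pipeline_for($ctype, "/tmp/spamassassin.focr.$$");
#   if (defined $cmd) {
#       open(OCR, $cmd) or warn "cannot start OCR pipeline: $!";
#   }
```

With this shape, an unsupported content type yields undef instead of a pipeline, so the caller can skip OCR for that part without a chain of nested if/else branches.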