On Thu, Jul 7, 2011 at 5:18 PM, John Hardin <jhar...@impsec.org> wrote:
> On Thu, 7 Jul 2011, polloxx wrote:
>
>> On Wed, Jul 6, 2011 at 6:33 PM, John Hardin <jhar...@impsec.org> wrote:
>>>
>>> OK. Just to be clear, you took a jpeg-format image file and used
>>> jpegtopnm to convert it to a pnm file, and got a correct .pnm image
>>> file out? Did you do this to verify the exit code from jpegtopnm:
>>>
>>>   echo $?
>>
>> $ /usr/bin/jpegtopnm ./spam1.jpg > spam1.pnm
>> jpegtopnm: WRITING PPM FILE
>>
>> spam1.pnm is created.
>
> Please run this:
>
>   /usr/bin/jpegtopnm ./spam1.jpg > spam1.pnm ; echo $?
>
> The return code is likely zero, but let's be _sure_.
>
Yes, zero.

>>> It would be useful to see the debugging output of spamassassin where
>>> it's talking about fuzzyocr. Do you know how to run spamassassin in
>>> debug mode against a test message?
>>
>> # spamassassin --debug FuzzyOCR < ./spam1.jpg > /dev/null
>
> Your input there needs to be a complete email message with the image as
> an attachment, not the image itself:
>

The example eml from Spamassassin works fine:

# spamassassin --debug FuzzyOCR > output
# cat output
Spam detection software, running on the system "xxx.xxx.xxx", has
identified this incoming email as possible spam. The original message
has been attached to this so you can view it (if it isn't spam) or label
similar future email. If you have any questions, see
the administrator of that system for details.

Content preview: Langdon looked again at the fax an ancient myth confirmed
  in black and white. The implications were frightening. He gazed absently
  through the bay window. The first hint of dawn was sifting through the
  birch trees in his backyard, but the view looked somehow different this
  morning. As an odd combination of fear and exhilaration settled over him,
  Langdon knew he had no choice The man led Langdon the length of the
  hangar. They rounded the corner onto the runway. [...]
Content analysis details:   (24.6 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 3.6 RCVD_IN_PBL            RBL: Received via a relay in Spamhaus PBL
                            [58.186.156.15 listed in zen.spamhaus.org]
 1.6 RCVD_IN_BRBL_LASTEXT   RBL: RCVD_IN_BRBL_LASTEXT
                            [58.186.156.15 listed in bb.barracudacentral.org]
 0.0 FSL_HELO_NON_FQDN_1    FSL_HELO_NON_FQDN_1
 3.6 HELO_LOCALHOST         HELO_LOCALHOST
 4.4 KB_RATWARE_OUTLOOK_MID KB_RATWARE_OUTLOOK_MID
 0.8 DKIM_ADSP_NXDOMAIN     No valid author signature and domain not in DNS
 2.5 DATE_IN_FUTURE_12_24   Date: is 12 to 24 hours after Received: date
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.0 MIME_QP_LONG_LINE      RAW: Quoted-printable line longer than 76 chars
 0.2 SHORT_HELO_AND_INLINE_IMAGE Short HELO string, with inline image
 1.3 RDNS_NONE              Delivered to internal network by a host with no rDNS
 0.0 T_DOS_OUTLOOK_TO_MX_IMAGE Direct to MX with Outlook headers and an image
 9.0 FUZZY_OCR              BODY: Mail contains an image with common spam text
                            inside
                            [Words found:]
                            ["levitra" in 1 lines]
                            ["cialis" in 1 lines]
                            ["viagra" in 2 lines]
                            [(6 word occurrences found)]
-2.3 AWL                    AWL: From: address is in the auto white-list

The original message was not completely plain text, and may be unsafe to
open with some email clients; in particular, it may contain a virus,
or confirm that your address can receive spam. If you wish to view
it, it may be safer to save it to a file and open it with an editor.
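
As quoted above, spamassassin needs a complete email message with the image as an attachment, not the bare image. For anyone who wants to test their own image the same way, here is a rough sketch of wrapping a JPEG in a minimal MIME message. The function name make_test_message, the addresses, the subject, the boundary string and the spam1.eml file name are all made up for illustration, and it assumes a base64 utility that wraps its output lines (GNU coreutils does by default). A real spam message will carry many more headers, so scores will not match a production run.

```shell
#!/bin/sh
# Sketch only: wrap an image in a minimal multipart MIME message so it can
# be fed to spamassassin as a complete email instead of a bare image.
# Addresses, subject and boundary are placeholders, not from the thread.

make_test_message() {
    image=$1
    boundary="fuzzyocr-test-boundary"   # hypothetical boundary string
    cat <<EOF
From: sender@example.com
To: recipient@example.com
Subject: FuzzyOCR test message
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="$boundary"

--$boundary
Content-Type: text/plain; charset=us-ascii

Test message with an image attachment.

--$boundary
Content-Type: image/jpeg; name="$(basename "$image")"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="$(basename "$image")"

$(base64 < "$image")

--$boundary--
EOF
}

# Usage (assumes ./spam1.jpg exists):
#   make_test_message ./spam1.jpg > spam1.eml
#   spamassassin --debug FuzzyOCR < spam1.eml > output 2> debug.log
```

The debug lines normally go to stderr, which is why the usage comment redirects them to a separate file; the rewritten message and report end up in output, as in the run shown above.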