Stefan and guys!!! You are awesome!!! All I did was "aptitude install fuzzyocr". Nothing else. I re-ran the test, and this particular spam now hits FUZZY_OCR and gets a score of 16!!!
Here's the new score:

#############

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.0 BAYES_50               BODY: Bayesian spam probability is 40 to 60%
                            [score: 0.5085]
 3.0 RCVD_IN_XBL            RBL: Received via a relay in Spamhaus XBL
                            [88.236.102.45 listed in zen.spamhaus.org]
 0.9 RCVD_IN_PBL            RBL: Received via a relay in Spamhaus PBL
 0.8 SHORT_HELO_AND_INLINE_IMAGE Short HELO string, with inline image
 0.1 RDNS_NONE              Delivered to trusted network by a host with no rDNS
  12 FUZZY_OCR              BODY: Mail contains an image with common spam text inside
                            [Words found:]
                            ["cia***" in 3 lines]
                            ["via***" in 3 lines]
                            [(9 word occurrences found)]

On Fri, Apr 24, 2009 at 10:52:30PM +0200, Stefan Luetje wrote:
> On 24 Apr 2009 at 22:12 CEST, Igor Chudov wrote:
> > I get plenty of these also, and cannot get them to score well.
> >
> > These advertise knockoffs of bestselling Pfizer products. The text is
> > meaningless garbage text. The sales message is contained in a PNG
> > image, but it could be other image types like jpeg.
> >
> > http://igor.chudov.com/tmp/spam008.txt
> >
> > Any ideas what I can do?
>
> You can install FuzzyOcr
> <http://wiki.apache.org/spamassassin/FuzzyOcrPlugin>
>
> ,----
> | X-Spam-Status: Yes, score=19.8 required=5.0 tests=BADRELAY,BAYES_99,FUZZY_OCR,
> |   HK_IMGSPAM,HTML_MESSAGE,SAGREY autolearn=no version=3.2.5
> | X-Spam-Relay-Country: US TR
> | X-Spam-Report: =?ISO-8859-1?Q?
> |   * 3.5 BAYES_99 BODY: Spamwahrscheinlichkeit nach Bayes-Test: 99-100%
> |   *     [score: 1.0000]
> |   * 0.3 HTML_MESSAGE BODY: Nachricht enth=e4lt HTML
> |   * 2.5 BADRELAY bad Relay
> |   * 2.0 HK_IMGSPAM Inline image in message, Bayes think it's spam
> |   * 10 FUZZY_OCR BODY:
> |   * 1.0 SAGREY Adds 1.0 to spam from first-time senders
> `----
>
> ,----[ fuzzyocr.log ]
> | 2009-04-24 22:30:08 [9756] Scanset "ocrad" found word "cialis" with fuzz of 0.0000
> |    line: "ur prce viagra cialis special offer"
> | 2009-04-24 22:30:08 [9756] Scanset "ocrad" found word "cialis" with fuzz of 0.0000
> |    line: "lgg cialis special offer"
> | 2009-04-24 22:30:08 [9756] Scanset "ocrad" found word "viagra" with fuzz of 0.0000
> |    line: "ur prce viagra cialis special offer"
> | 2009-04-24 22:30:08 [9756] Scanset "ocrad" found word "viagra" with fuzz of 0.1667
> |    line: "l ls lo x vagra loo mg lo x cals omg"
> | 2009-04-24 22:30:08 [9756] Scanset "ocrad" found word "viagra" with fuzz of 0.0000
> |    line: " viagra hot offer"
> | 2009-04-24 22:30:08 [9756] Scanset "ocrad" generates enough hits (5), skipping further scansets...
> | 2009-04-24 22:30:08 [9756] Message is spam, score = 10.500
> | 2009-04-24 22:30:08 [9756] Adding Hash to "/home/stefan/.fuzzyocr/FuzzyOcr.hashdb"
> | 2009-04-24 22:30:08 [9756] Words found:
> | "cialis" in 2 lines
> | "viagra" in 3 lines
> | (7.5 word occurrences found)
> `----
>
>
> Greets
> Stefan
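
A note on the "fuzz" values in the quoted fuzzyocr.log: they look consistent with an edit distance divided by the length of the wordlist entry ("vagra" is one deletion away from the six-letter "viagra", and 1/6 = 0.1667). The Python sketch below only illustrates that reading. It is an assumption, not FuzzyOcr's actual code (the plugin itself is written in Perl), and the helper names edit_distance and best_fuzz are made up for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def best_fuzz(word: str, line: str) -> float:
    """Smallest edit distance to any token of an OCR line, divided by len(word).

    Hypothetical helper: assumes fuzz = distance / length of the wordlist entry.
    """
    tokens = line.split()
    if not tokens:
        return 1.0
    return min(edit_distance(word, tok) for tok in tokens) / len(word)


if __name__ == "__main__":
    # Lines taken from the quoted log.
    print(round(best_fuzz("viagra", "l ls lo x vagra loo mg lo x cals omg"), 4))
    print(round(best_fuzz("cialis", "lgg cialis special offer"), 4))
```

Under this assumption, "cals" on the same OCR line would be two edits from "cialis" (fuzz 0.3333), which may be why the log reports no "cialis" hit for that line; the cutoff FuzzyOcr actually applies is not shown in the log.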