Re: Image MD5sums available, was Re: All image spam

Dirk Bonengel Tue, 07 Mar 2006 23:36:31 -0800

Hi, all,

I wonder if the iXhash Plugin I did last summer would catch these.

FYI, the plugin uses some form(s) of fuzzy MD5 checksums of the complete mail body (not separate MIME parts) and compares the results with those I provide via DNS.

It's available at http://wiki.apache.org/spamassassin/iXhash.
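
To illustrate the general idea of a "fuzzy" checksum over a whole body, here is a minimal Python sketch. It is not the actual iXhash algorithm (see the wiki page above for that); the normalization steps and the function name fuzzy_body_md5 are assumptions made up for this example. The point is only that stripping the parts spammers tend to randomize lets two slightly different bodies hash to the same value.

```python
import hashlib
import re


def fuzzy_body_md5(body: str) -> str:
    """Hypothetical normalization, NOT the real iXhash procedure."""
    text = body.lower()
    # Drop digits, which are often randomized per message (assumption).
    text = re.sub(r"\d+", "", text)
    # Collapse every run of whitespace to a single space.
    text = re.sub(r"\s+", " ", text).strip()
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Two bodies that differ only in case, spacing and a numeric token.
a = fuzzy_body_md5("Buy  STOCK now!\r\n\r\nRef 12345")
b = fuzzy_body_md5("buy stock   now!\nRef 98765")
print(a == b)
```

The resulting hex digest would then be looked up via DNS, in the same way as any other hash-based blocklist query.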

If not, enhancing it to also compute checksums of attachments would be nice to have. If only I had the time...


Dirk


William Stearns schrieb:

Good evening, Jack, all,

On Tue, 7 Mar 2006, Jack Gostl wrote:
I've seen some references to this in threads, but I didn't see an answer.
Starting in late November, we started getting hit with spam that was almost entirely a jpeg. They seem to be mostly "stock recommendations". There is minimal message, usually HTML, and the real spam content is in the image. Despite all the training that I do, this seems to slip through the Bayes algorithms with no more than 50%, and the rest of the tests don't drive the score up high enough to help.
I am currently running SpamAssassin 3.0.3. I tried running these messages through SpamAssassin 3.1 and it doesn't seem to help.
Any suggestions?
We talked about identifying images last summer. There are a few answers, some of which have been discussed in this thread already.

Razor, pyzor, and DCC are designed to score up messages with already-seen mime parts (read: if 3 other people think that image is spam, your spam filter can score it up). As with identifying text parts where the spammer inserts random words to throw those services off, images can be subtly modified so the visible area is essentially identical but the actual image file is different with every spam run.

I offered to put together a catalog of checksums of images used in spam, and have done so. The md5 and sha1 sums of 44,522 spam images can be found at http://www.stearns.org/spamattach/ , broken out by category and in combined files. If anyone wants to take on an interesting project of computing the md5 checksums of attachments, I'd be willing to set those lists up as a dns-queriable rbl (along the lines of 01f5ff6ab05499c94a967409204e6a29.md5.some_rbl.net which would return 127.0.0.2 if known, nothing if not).

I already understand the downsides to this approach (duplicates work of razor, pyzor, and dcc, images can be altered), but figure the checksum work has already been done and will continue to be done anyways.
    Anyone up for it?
    Cheers,
    - Bill
---------------------------------------------------------------------------
        "That man is a success who lived well, laughed often and loved
much: who has gained the respect of intelligent men and the love of
children: who has filled his niche and accomplished his task: who leaves
the world a better place than he found it, whether by an improved poppy,
a perfect poem or a rescued soul; who never lacked appreciation of
earth's beauty or failed to express it; who looked for the best in
others and gave the best he had."
-- Robert Louis Stevenson.
--------------------------------------------------------------------------
William Stearns ([EMAIL PROTECTED]).  Mason, Buildkernel, freedups, p0f,
rsync-backup, ssh-keyinstall, dns-check, more at: http://www.stearns.org
--------------------------------------------------------------------------
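
For anyone considering the project Bill describes, the client side could be sketched roughly as follows in Python. This is an illustration under stated assumptions, not an existing plugin: the zone md5.some_rbl.net is the placeholder from Bill's message and does not exist as a service, and the helper names attachment_md5s, rbl_name and is_listed are invented for this example. It computes the MD5 of each decoded image part and builds the query name in the form Bill suggests.

```python
import hashlib
import socket
from email import message_from_bytes
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Placeholder zone taken from the message above; not a real service.
RBL_ZONE = "md5.some_rbl.net"


def attachment_md5s(msg):
    """Return the MD5 hex digest of every decoded image part in msg."""
    sums = []
    for part in msg.walk():
        if part.is_multipart() or part.get_content_maintype() != "image":
            continue
        payload = part.get_payload(decode=True)  # undo base64/quoted-printable
        if payload:
            sums.append(hashlib.md5(payload).hexdigest())
    return sums


def rbl_name(md5_hex, zone=RBL_ZONE):
    """Build the DNS name to query, e.g. <md5>.md5.some_rbl.net."""
    return f"{md5_hex}.{zone}"


def is_listed(md5_hex, zone=RBL_ZONE):
    """True if the name resolves to 127.0.0.2; a failed lookup means 'not known'."""
    try:
        return socket.gethostbyname(rbl_name(md5_hex, zone)) == "127.0.0.2"
    except OSError:
        return False


# Build a small test message with one stand-in image (no network access needed).
raw = MIMEMultipart()
raw.attach(MIMEText("see attached", "plain"))
raw.attach(MIMEImage(b"fake-jpeg-bytes", _subtype="jpeg"))
msg = message_from_bytes(raw.as_bytes())

sums = attachment_md5s(msg)
for s in sums:
    print(rbl_name(s))
```

Calling is_listed() on each digest would perform the actual lookup once such a zone exists. As noted in the message, an exact MD5 only matches byte-identical images, so a spammer who alters the file on every run will not be caught this way.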
