>-----Original Message-----
>From: David B Funk [mailto:[EMAIL PROTECTED]
>Sent: Thursday, June 09, 2005 2:16 PM
>To: Chris Santerre
>Cc: users@spamassassin.apache.org
>Subject: RE: Gif-Only spams
>
>
>On Thu, 9 Jun 2005, Chris Santerre wrote:
>
>> >My only comment on a system like this is that it could be easily
>> >subverted. A spammer could use automated image editing tools to
>> >randomly change some aspect of the file that would give it a
>> >totally different MD5 sum. Like changing the lower right pixel to
>> >a different color would throw the md5 sum way off.
>>
>> I completely agree. But I'd like to see it tried. Then maybe combine
>> it with distancing techniques to see how distant one MD5 is to
>> another.
>
>Nice try, but the crypto characteristic of MD5 makes this totally
>impractical. One of the attributes of MD5 (by design) is that even
>small changes in the input cause significant changes in the output.
>This is intended to deter attackers from breaking crypto systems
>with incremental guessing methods.
>
>Try this; take a 100Kbyte text file, get a MD5 sum, change one letter
>(say a 'b' to 'c') and re-calculate the MD5 sum.
>Note that almost every digit of that 32 digit hex value has changed,
>even though you've changed only 1 bit out of 800,000 bits of data in
>that file.
>
>There are image processing algorithms that are much better at
>'looking' at two images and giving a 'distance' value. (Only problem
>is that they're compute intensive).
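
For anyone who wants to see the effect David describes without
hunting for a text file, here is a minimal Python sketch of his
experiment. It assumes an in-memory 100,000-byte buffer of the letter
'b' stands in for the "100Kbyte text file"; the helper names
(md5_hex, differing_hex_digits, differing_bits) are made up for this
illustration and are not part of SpamAssassin.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the 32-digit hex MD5 sum of data."""
    return hashlib.md5(data).hexdigest()


def differing_hex_digits(a: str, b: str) -> int:
    """Count the positions at which two hex digests differ."""
    return sum(x != y for x, y in zip(a, b))


def differing_bits(a: str, b: str) -> int:
    """Count the bits that differ between two hex digests."""
    return bin(int(a, 16) ^ int(b, 16)).count("1")


# Assumption: a synthetic 100,000-byte buffer replaces the text file.
original = b"b" * 100_000
# Change the first letter from 'b' (0x62) to 'c' (0x63): a 1-bit change.
modified = b"c" + original[1:]

h1 = md5_hex(original)
h2 = md5_hex(modified)

print("original:", h1)
print("modified:", h2)
print(differing_hex_digits(h1, h2), "of 32 hex digits differ")
print(differing_bits(h1, h2), "of 128 bits differ")
```

The two digests should look unrelated, so comparing them digit by
digit says nothing about how similar the two inputs were. That is
why a "distance" between MD5 sums does not work as a similarity
measure.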

Well then don't use MD5 :) Hell, then just pull a sample from the
image. Not that this will stop spammers from reverse engineering the
code and changing the default sample bits. Change the sample bits
every SA release. Don't know, I'm just spouting off ideas :) He asked!

And for Evan's comment: "You'd end up scoring all legit e-mails that
image hash shows up in."

One single rule should NEVER trigger an email to be labeled spam ;)

--Chris