Hi,

On Thu, 15 Jan 2004, Rose, Bobby wrote:

> From: Alexander Litvinov
>
> > Hint: I think we should store these things in a SQL database instead
> > of in the file system, shouldn't we?
>
> Can't you hide messages in jpeg? If they created an engine that
> embedded a hidden random word in the image wouldn't that change its
> hash and make this database useless?

You could just normalize the images to a 64x64 bitmap with a reduced
color depth (12 bpp; about 48 kbit, or 6 KB, uncompressed) and store
that; there are probably some optical cross-correlation techniques you
can use to measure how similar the images are. Dropping the resolution
makes it more difficult to tweak the image to generate different hashes
and keeps the database size manageable. Though I'd really like to see
optical Bayes poisoning attempts: "Check it out - a blender, a map of
Zaire, Winston Churchill, and two purple upside-down naked chicks - it's
Warhol Spam!"

I still don't think this will work, for a couple of reasons.

First, most spam images are referenced by URL rather than attached to
the message - why expend any of your resources retrieving, storing, and
analyzing images, especially when the retrieval gives information to the
spammer?

Second, image analysis is CPU-intensive, and I can't imagine a
perl-based solution would scale under any kind of substantial load.

Third, even if you did have the spare bandwidth to retrieve the images
(which you don't want to do), you'll completely throw off the accounting
of 'legitimate' bulk emailers' delivery-tracking web bugs, which may
attract more mail rather than less.

Most importantly, I think the Razor folks tried this and found that
spammers were using a lot of default MS & other images, leading to an
unacceptable number of FPs. Spammers and anti-spammers are creative; ask
yourself who else has implemented an image hash database or, if nobody
has, why not.

-- Bob
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
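
For anyone who wants to experiment with the normalization idea from the message (a 64x64 thumbnail at 12 bpp, compared by correlation rather than by exact hash), here is a rough sketch in Python rather than Perl, purely for brevity. It assumes the image has already been decoded into rows of (r, g, b) tuples with 8 bits per channel. The names `normalize` and `similarity` are made up for this sketch, and nearest-neighbor sampling plus a zero-offset normalized correlation are stand-ins for whatever resampling and cross-correlation technique a real implementation would pick.

```python
import math

SIZE = 64   # thumbnail width and height in pixels
BITS = 4    # bits kept per channel: 3 channels x 4 bits = 12 bpp


def normalize(pixels, size=SIZE, bits=BITS):
    """Shrink rows of 8-bit (r, g, b) tuples to size x size at reduced depth.

    Nearest-neighbor sampling is used only to keep the sketch short; an
    averaging filter would likely be more robust against added noise.
    Returns a flat list of size * size * 3 values in the range 0..2**bits - 1.
    """
    height, width = len(pixels), len(pixels[0])
    shift = 8 - bits
    thumb = []
    for y in range(size):
        src_row = pixels[y * height // size]
        for x in range(size):
            r, g, b = src_row[x * width // size]
            thumb.extend((r >> shift, g >> shift, b >> shift))
    return thumb


def similarity(a, b):
    """Zero-offset normalized cross-correlation of two thumbnails, in [-1, 1]."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        # At least one thumbnail is a flat color; correlation is undefined,
        # so fall back to an exact comparison.
        return 1.0 if a == b else 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


if __name__ == "__main__":
    # Hypothetical test image: a 128x128 color gradient.
    original = [[((2 * x) % 256, (2 * y) % 256, (x + y) % 256)
                 for x in range(128)] for y in range(128)]
    # A "tweaked" copy with the lowest bit of every channel flipped,
    # which would change an exact hash of the raw pixel data.
    tweaked = [[(r ^ 1, g ^ 1, b ^ 1) for (r, g, b) in row] for row in original]
    print(similarity(normalize(original), normalize(tweaked)))
```

Low-order bit changes disappear when the color depth is reduced, which is the point of the suggestion: the stored thumbnail is far less sensitive to small tweaks than a hash of the original file. It does nothing about the other objections in the message (remote images, CPU cost, common stock images causing FPs).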