Hi,

On Thu, 15 Jan 2004, Rose, Bobby wrote:
> From: Alexander Litvinov
>
> > Hint: I think we should store these things in a SQL database instead
> > of in the file system, shouldn't we?

> Can't you hide messages in a JPEG? If they created an engine that
> embedded a hidden random word in the image, wouldn't that change its
> hash and make this database useless?

You could just normalize the images to a 64x64 bitmap with a reduced color
depth (12bpp; ~48 kbit, i.e. 6 KB, uncompressed) and store that; there are
probably some optical cross-correlation techniques you can use to measure
how similar the images are. Dropping the resolution makes it more difficult
to tweak the image to generate different hashes and keeps the database size
manageable.
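
To make that concrete, here is a rough sketch of the idea in plain Python.
It is an illustration under my own assumptions, not anything SpamAssassin or
Razor actually does: images are taken to be already decoded into rows of
(r, g, b) tuples, the downsampling is plain nearest-neighbour, and the
"cross-correlation" is only the zero-offset normalized correlation of the
pixel brightness. All function names (normalize, luminance, correlation,
raw_digest, make_gradient, make_checkerboard) are made up for the example.

```python
import hashlib
import math

SIZE = 64                                   # normalized width and height
BITS = 4                                    # bits kept per channel (3 * 4 = 12 bpp)
PACKED_BYTES = SIZE * SIZE * 3 * BITS // 8  # 6144 bytes if stored packed


def normalize(pixels, size=SIZE, bits=BITS):
    """Nearest-neighbour downsample to size x size, keeping only the top
    `bits` bits of each 8-bit channel."""
    h, w = len(pixels), len(pixels[0])
    shift = 8 - bits
    out = []
    for y in range(size):
        src_row = pixels[y * h // size]
        out.append([tuple(c >> shift for c in src_row[x * w // size])
                    for x in range(size)])
    return out


def luminance(norm):
    """Flatten a normalized image into one brightness value per pixel."""
    return [r + g + b for row in norm for (r, g, b) in row]


def correlation(a, b):
    """Zero-offset normalized correlation of two equal-length sequences.
    Returns 0.0 if either sequence is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den else 0.0


def raw_digest(pixels):
    """SHA-1 over the raw pixel bytes, standing in for an exact-match hash."""
    data = bytes(c for row in pixels for px in row for c in px)
    return hashlib.sha1(data).hexdigest()


def make_gradient(w, h):
    """Synthetic test image: a smooth colour gradient."""
    return [[((2 * x) % 256, (2 * y) % 256, (x + y) % 256)
             for x in range(w)] for y in range(h)]


def make_checkerboard(w, h, block=16):
    """Synthetic test image: black and white squares."""
    return [[(255, 255, 255) if (x // block + y // block) % 2 == 0 else (0, 0, 0)
             for x in range(w)] for y in range(h)]


original = make_gradient(128, 128)
# Flip the lowest red bit of every pixel, as a crude stand-in for hidden data.
tweaked = [[(r ^ 1, g, b) for (r, g, b) in row] for row in original]
other = make_checkerboard(128, 128)

print("raw hashes equal:      ", raw_digest(original) == raw_digest(tweaked))
print("normalized forms equal:", normalize(original) == normalize(tweaked))
print("similarity to tweaked: ",
      correlation(luminance(normalize(original)), luminance(normalize(tweaked))))
print("similarity to other:   ",
      correlation(luminance(normalize(original)), luminance(normalize(other))))
```

The low-bit tweak changes the exact hash but disappears once the color depth
is reduced, which is the point of normalizing first. A spammer who changes
more than the low bits would still have to be caught by a similarity
threshold, and picking that threshold is the hard part.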

Though I'd really like to see optical Bayes poisoning attempts: "Check it
out - a blender, a map of Zaire, Winston Churchill, and two purple
upside-down naked chicks - it's Warhol Spam!"

I still don't think this will work, for a couple of reasons. First, most
spam images are referenced by URL rather than attached to the message - why
expend any of your resources retrieving, storing, and analyzing images,
especially when the retrieval gives information to the spammer? Second,
image analysis is CPU-intensive and I can't imagine a Perl-based solution
would scale under any kind of substantial load. And even if you did have
the spare bandwidth to retrieve the images (which you don't want to do),
you'd completely throw off the accounting of 'legitimate' bulk emailers'
delivery-tracking web bugs, which may attract more mail rather than less.

Most importantly, I think the Razor folks tried this and found that
spammers were using a lot of default MS & other images, leading to an
unacceptable number of FPs. Spammers and anti-spammers are creative; ask
yourself who else has implemented an image hash database or, if nobody has,
why not.

-- Bob


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk