On Mon, 21 Aug 2006, John Rudd wrote:
On Aug 21, 2006, at 10:13 PM, Chip M. wrote:

While skimming thru my daily rejected spam pile, did a double take when a
GIF spam seemed to "blink" at me.  Thought it was a sw glitch at first...
then realized the sneaky Borg had adapted again.

Took a look at the frames in PaintShopPro's AnimationShop, and the first
three are all but blank (wee bit of noise), followed by the payload.

Given the way the GIF format works, that is actually a
reasonable way to inject "salt" into a given image to throw
off checksumming.  (If only the programmer who is doing the
technical end of this would get a real job instead of working
for a spammer...)

For animated GIFs, is there a clean break between "frames" of animation, something that netpbm or whatever can easily identify and break out into individual images?

Yes.  Briefly, the GIF format is a sequence of chunks (the spec
calls them blocks).  Before any image data comes along, a chunk
defines the overall size of the GIF (sort of the size of the
canvas), and then you can have a series of other chunks.  One
type of chunk says "draw this image on the virtual canvas at
these coordinates using this palette" and another says "delay
this long".  Putting these two types of chunks together in the
right sequence gives the ability to do animations.  (It also,
incidentally, gives you the ability to do full 24-bit color,
since each image can bring its own 256-color palette and cover
a different part of the canvas.  Few people know GIF is
actually capable of this.  But even though it is capable, it
is a hack, and very wasteful of space, so maybe that's for the
better.)
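
To make the chunk layout concrete, here is a minimal sketch in
Python (standard library only) that walks the top-level blocks of
a GIF and reports the canvas size, each delay, and each image
placement.  The names walk_gif and skip_sub_blocks are made up for
illustration; this is not how netpbm or FuzzyOCR does it, it does
no LZW decoding, and it assumes the file is not truncated.

```python
import struct


def skip_sub_blocks(data, pos):
    """Skip a chain of length-prefixed sub-blocks; return the offset after the 0 terminator."""
    while True:
        size = data[pos]
        pos += 1
        if size == 0:
            return pos
        pos += size


def walk_gif(data):
    """Yield (kind, info) for each top-level block of a GIF held in a bytes object."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF")
    # Logical Screen Descriptor: the "canvas" size.
    width, height, flags = struct.unpack_from("<HHB", data, 6)
    yield "screen", {"width": width, "height": height}
    pos = 13
    if flags & 0x80:  # global color table present
        pos += 3 * (2 << (flags & 0x07))
    while pos < len(data):
        marker = data[pos]
        pos += 1
        if marker == 0x3B:  # trailer: end of the GIF stream
            yield "trailer", {}
            return
        if marker == 0x21:  # extension block
            label = data[pos]
            pos += 1
            if label == 0xF9:  # Graphic Control Extension carries the delay
                delay = struct.unpack_from("<H", data, pos + 2)[0]
                yield "delay", {"centiseconds": delay}
            pos = skip_sub_blocks(data, pos)
        elif marker == 0x2C:  # Image Descriptor: one frame
            left, top, w, h, iflags = struct.unpack_from("<HHHHB", data, pos)
            pos += 9
            if iflags & 0x80:  # local color table (this frame's own palette)
                pos += 3 * (2 << (iflags & 0x07))
            pos += 1  # LZW minimum code size byte
            pos = skip_sub_blocks(data, pos)  # compressed pixel data, not decoded here
            yield "image", {"left": left, "top": top, "width": w,
                            "height": h, "interlaced": bool(iflags & 0x40)}
        else:
            raise ValueError("unexpected block marker 0x%02x" % marker)
```

Counting the "image" entries gives the number of frames without
decoding any pixel data, which is cheap enough to do before
deciding whether to hand the file to anything heavier.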

It would be CPU intensive, but the right way to fight it might be to run FuzzyOCR on each frame. And/or have a setting for "maximum frames to process", and if the GIF goes over that number of frames, give it a huge spam score.

Yeah, that is a bit tricky.  I can think of a way to do a
denial-of-service attack against the "run it on each frame"
approach, but I won't share what that is.  In theory, if that
happens, one could write a plugin to examine the internal
structure of the GIF and detect that.

The one thing that would be important to guard against is
suddenly flagging all animated GIFs as spam.  Although I think
they're really tacky and annoying, that doesn't mean that they
are actually spam.
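
One way the "maximum frames to process" idea could look, as a
rough Python sketch.  Everything here is hypothetical: the
function name, the ocr_score callback (standing in for whatever
per-frame OCR check is used), and the default numbers are all
assumptions, not FuzzyOCR settings.  It bounds the work per
message, and it adds a penalty only when the frame cap is
exceeded, so a short animation that OCRs clean scores zero.

```python
from itertools import islice

_END = object()  # sentinel so a frame value of None is still counted as a frame


def score_gif_frames(frames, ocr_score, max_frames=10, over_limit_penalty=5.0):
    """Return the worst per-frame score, plus a penalty if frames exceed max_frames.

    frames     -- any iterable of decoded frames (may be lazy)
    ocr_score  -- hypothetical callable mapping one frame to a spam score
    """
    it = iter(frames)
    worst = 0.0
    # Examine at most max_frames frames, so CPU cost per message is bounded.
    for frame in islice(it, max_frames):
        worst = max(worst, ocr_score(frame))
    # Probe for one more frame; if it exists, the GIF went over the cap.
    if next(it, _END) is not _END:
        worst += over_limit_penalty
    return worst
```

Taking the maximum rather than the sum means the near-blank
leading frames do not dilute the score of the payload frame.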

For interlaced ... I have no idea. Depends a lot on how the interlaced images are stored, I guess. And whether or not netpbm can generate the final image for processing, instead of having to work on the interlaced data.

I'm pretty sure it should be able to.  If I recall correctly,
interlaced GIFs just have the rows in a different order.
It should be no problem to get the full image.
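
For reference, the row reordering can be sketched in a few lines
of Python.  GIF interlacing stores the rows in four passes (every
8th row from row 0, every 8th from row 4, every 4th from row 2,
then every 2nd from row 1).  The function names below are made up
for illustration, and the sketch assumes the rows have already
been LZW-decoded into a list in stored order.

```python
def interlaced_row_order(height):
    """Return the display row index for each stored row of an interlaced GIF."""
    order = []
    # The four interlace passes as (first row, step).
    for start, step in ((0, 8), (4, 8), (2, 4), (1, 2)):
        order.extend(range(start, height, step))
    return order


def deinterlace(stored_rows):
    """Rearrange rows from stored (interlaced) order into top-to-bottom order."""
    out = [None] * len(stored_rows)
    for stored_index, dest in enumerate(interlaced_row_order(len(stored_rows))):
        out[dest] = stored_rows[stored_index]
    return out
```

So an OCR pass only ever needs to see the reordered rows; the
interlacing itself hides nothing.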

  - Logan
