On Tue, 2009-02-24 at 16:14 +0100, Per Jessen wrote:
> Karsten Bräckelmann wrote:
>
> > Yes, I too get spam with google groups URIs. Typically scores quite
> > high, though. So I guess the answer would be "I deal with it by
> > running SA..." ;)
> >
> > More seriously, unless you provide raw samples [1], including the
> > rules hit on your system, there's probably not much else to say.
    ^^^^^^^^^
Per, you missed that. :)

> Hi Guenther
>
> here's a couple of examples that made it through my filter:
>
> http://jessen.ch/files/googlegroup-spam-example1.eml

Hrm, crap. Indeed, doesn't look good here either.

First of all, my (few) samples seem to be rather similar concerning the
body -- the URIs are different, though. Unlike your samples, mine
tripped over (my) Bayes. Given differences in mail stream and training,
that is plausible.

Other than that, there are only rare hits on blacklists in these
samples. I did see IXHASH, PYZOR, BRBL and probably some other stuff I
forgot, though. Didn't check whether my trusted networks should be
modified for RCVD rules to trigger better. FWIW, my samples do trigger
more IP blacklists, and generally score above 10.

Given *these* samples (still don't know about Johann's though), I'd go
the same way as Jason and Ned. Not scoring a whopping 5, but creating a
bunch of specialized, moderately scoring rules. I've done that in the
past often enough.

Whether you, Johann or anyone else can score google.com or google groups
URIs on sight, with at least 1 or 2 points, depends on your mail stream.
YM(M)V.

However, there are some highly abusive patterns sticking out. A google
URI with a ../ in the path? Sure! Score 2. :)  Alternating alpha and
numbers might be worth another point. A question mark in a google groups
URI? Punish that.

I could go on like that. There are a lot of patterns in the URIs which
usually do not appear in legitimate mail. Each could be a special uri
rule, scored moderately. Plus meta rules combining these patterns. And
probably meta rules adding in other bits, like yahoo freemail sender for
example.

Well, HTH

  guenther


-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
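
To make the URI patterns mentioned above a bit more concrete, here is a rough, untested sketch of what such rules might look like in a local .cf file. The rule names (LOCAL_*), the regexes and all scores other than the "Score 2" for ../ are assumptions made up for illustration. In particular, "three letter/digit runs in a row" is an arbitrary reading of "alternating alpha and numbers", and the From check is only a crude stand-in for a proper freemail test. Check everything against your own ham before trusting it.

```
# Hypothetical local rules; names, regexes and scores are illustrative only.

# A google URI with ../ in the path
uri      LOCAL_GOOGLE_DOTDOT  /^https?:\/\/(?:[a-z0-9-]+\.)*google\.com\/.*\.\.\//i
describe LOCAL_GOOGLE_DOTDOT  Google URI with ../ in the path
score    LOCAL_GOOGLE_DOTDOT  2.0

# A question mark in a google groups URI
uri      LOCAL_GGROUPS_QMARK  /^https?:\/\/groups\.google\.com\/[^?]*\?/i
describe LOCAL_GGROUPS_QMARK  Google groups URI with a query string
score    LOCAL_GGROUPS_QMARK  1.0

# Alternating letters and digits (three letter/digit runs is an arbitrary cut-off)
uri      LOCAL_GGROUPS_ALNUM  /^https?:\/\/groups\.google\.com\/\S*(?:[a-z]+\d+){3}/i
describe LOCAL_GGROUPS_ALNUM  Google groups URI alternating letters and digits
score    LOCAL_GGROUPS_ALNUM  1.0

# Meta rule: at least two of the URI patterns in the same message
meta     LOCAL_GGROUPS_MULTI  (LOCAL_GOOGLE_DOTDOT + LOCAL_GGROUPS_QMARK + LOCAL_GGROUPS_ALNUM >= 2)
describe LOCAL_GGROUPS_MULTI  Several suspicious google URI patterns
score    LOCAL_GGROUPS_MULTI  1.0

# Meta rule: any URI pattern plus a yahoo sender (sub-rule, not scored on its own)
header   __LOCAL_FROM_YAHOO   From:addr =~ /\@yahoo\.[a-z.]+$/i
meta     LOCAL_GGROUPS_YAHOO  (__LOCAL_FROM_YAHOO && (LOCAL_GOOGLE_DOTDOT || LOCAL_GGROUPS_QMARK || LOCAL_GGROUPS_ALNUM))
describe LOCAL_GGROUPS_YAHOO  Suspicious google URI from a yahoo sender
score    LOCAL_GGROUPS_YAHOO  1.0
```

Running "spamassassin --lint" after adding rules like these will catch syntax errors. Keeping each pattern in its own moderately scored rule, instead of one rule scoring 5, means a single false positive costs little while the metas add weight only when several signs coincide.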