Here's one thing to try for your particular situation. This rule could also match some strange base-64 encoded files, but that's extremely unlikely -- I ran it against my spam corpus, and it hit only 7 lines out of 260 megabytes, so you should be OK:

body      GENETICS_DATA  /([ACGT]{3,}[CGT][ACGT]?\s*){3,}/
describe  GENETICS_DATA  A, C, T, G, who do we appreciate?
score     GENETICS_DATA  -5

The rule, unfortunately, will match a long line of C, G, or T -- but will not match all As. It should be possible to craft it a bit better, but to do so, I believe, would make the regexp really slow.

I wouldn't recommend this rule for general consumption, obviously, but if you're in the habit of getting genetics data...

  -Dave

Geoff Gibbs just mooed:
> David G. Andersen wrote:
>
> > > > anyone else seeing false-positives more often with 2.11?
> > >
> > > Yes, I have had to roll back to 2.01.
> >
> > A bit of a suggestion, since you're seeing false positives in a highly
> > specific domain. I've been creating word-frequency-based whitelists
> > from various mailing lists I'm on (alas, little genetics talk).
> > But I've found great success on matching networking-geek specific
> > terms, and would think the same approach would prove quite fruitful
> > for genetics specific terms. Spammers, happily, don't often say
> > adenosine. :-)
>
> That is an interesting suggestion, although most of the false positives
> were not related to genetic specific terms. Solid blocks of ACGT do
> trigger the whole line of shouting, but an empty Subject should
> not trigger Subject is all in capitals. An e-mail with a base-64
> attachment should not count as spam with no other trigger.
> I also had one e-mail that triggered the ascii form and whole line
> of shouting, where I cannot see a whole line of shouting and I have
> not yet had time to work out what triggered the form, but it is
> not obvious to the beginner (me).

--
work: [EMAIL PROTECTED]          me: [EMAIL PROTECTED]
      MIT Laboratory for Computer Science      http://www.angio.net/
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
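
A quick way to sanity-check the GENETICS_DATA pattern outside SpamAssassin is sketched below in Python. This is only a rough check, not part of the rule itself: it assumes Python's `re` module treats this pattern the same way Perl does (it uses only constructs common to both), and the `hits` helper and all of the sample lines are made up for illustration.

```python
import re

# Same pattern as the GENETICS_DATA body rule (case-sensitive, as in the rule).
GENETICS_DATA = re.compile(r"([ACGT]{3,}[CGT][ACGT]?\s*){3,}")


def hits(line: str) -> bool:
    """Return True if the pattern matches anywhere in the line."""
    return GENETICS_DATA.search(line) is not None


# Hypothetical sample lines, invented for this check.
samples = {
    "sequence": "ACGTTGCA GGCTAAGC TTGACCGT",  # mixed bases in blocks
    "all_c": "C" * 40,                         # long run of a single C
    "all_a": "A" * 40,                         # long run of a single A
    "prose": "Please find the attached results for the cat gene.",
}

for name, line in samples.items():
    print(f"{name:10s} {hits(line)}")
```

With these samples, the sequence line and the all-C line match, while the all-A line and the ordinary prose do not, which is consistent with the caveat about long runs of C, G, or T. Running the same helper over a sample of your own ham and spam would give a better feel for the false-hit rate than these toy inputs.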