John Hardin wrote:
> rawbody __TWO_WORD_LINES /^\S\+\s\+\S\+$/
> tflags  __TWO_WORD_LINES multiple
> meta    STACKED_TEXT (__TWO_WORD_LINES > 10)
> 
> Likely somewhat FP-prone...

I think it would be quite FP-prone: think of emailed system logs,
lists, invoices, etc.  Your example used lots of real words, so I'd
trust Bayes to catch it.  It also had a URL, so the URIBLs can detect
it too.
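
To illustrate the FP concern, here is a rough Python sketch (not
SpamAssassin itself) that counts two-word lines the way I assume the
rule is meant to.  Two assumptions: the intended pattern has unescaped
quantifiers, /^\S+\s+\S+$/ (in Perl, as in Python, \+ matches a literal
plus sign), and the sample message is made up to look like something a
monitoring script might mail out.

```python
import re

# Assumed intent of the quoted rule: a line with exactly two
# whitespace-separated words.
TWO_WORD_LINE = re.compile(r"^\S+\s+\S+$")
THRESHOLD = 10  # mirrors the meta rule: more than 10 matching lines


def count_two_word_lines(body: str) -> int:
    """Count the lines in body that consist of exactly two words."""
    return sum(1 for line in body.splitlines() if TWO_WORD_LINE.match(line))


def would_hit(body: str) -> bool:
    """True if the count exceeds the meta rule's threshold."""
    return count_two_word_lines(body) > THRESHOLD


# Hypothetical ham: a short host status list.
sample = "\n".join(f"host{i:02d} OK" for i in range(12))
print(count_two_word_lines(sample), would_hit(sample))
```

A dozen "hostNN OK" lines is enough to cross the threshold, and plenty
of legitimate reports, price lists and name lists look like that.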

Finally, IIRC, some of the fuzzy checksum mechanisms key on patterns
in paragraph structure like that (at least one such approach was
mentioned as well-loved at the last MIT Spam Conference), so make sure
you're using Razor2, Pyzor, iXhash, and, if permissible, DCC.  I'm not
sure which of those use this method, though iXhash certainly does not.
