Hello,
I've implemented a Bayesian filtering scheme on my system that runs
concurrently with SpamAssassin. It works really well, but I am starting to
think there is an easy attack that would render the filtering useless.
What if, at the end of every message, spammers appended a list of a
thousand or more randomly selected common dictionary words? Wouldn't these
words overwhelm a Bayesian filtering scheme? Sure, the spam phrases would
still be present in the top part of the message, but the common, non-spam
words at the bottom would make the message appear, statistically, less
spam-like, perhaps enough to get it by the filter. Further, as these
messages were included in a user's spam corpus, would not legitimate
messages start to appear, statistically speaking, like spam, thus
increasing false positives?
Perhaps this notion is based on a misunderstanding of how Bayesian
filtering works, or perhaps there are ways of working around it, but has
anyone given this idea any thought?
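
To make the question concrete, here is a minimal sketch of the padding
attack against a toy filter. Everything in it is an assumption for
illustration: the per-token spam probabilities (0.99 for spam phrases, 0.40
for common dictionary words) are made up, and combine() and
most_interesting() are hypothetical helpers, not SpamAssassin's code. It
compares combining every token with combining only the 15 tokens farthest
from 0.5, the approach described in Paul Graham's "A Plan for Spam".

```python
import math

def combine(probs):
    """Naive-Bayes combination of per-token spam probabilities.

    Equivalent to prod(p) / (prod(p) + prod(1 - p)), but computed from
    summed log-odds so that long token lists do not underflow or overflow.
    """
    log_odds = sum(math.log(p) - math.log(1.0 - p) for p in probs)
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)

def most_interesting(probs, n=15):
    """Keep the n probabilities farthest from the neutral value 0.5."""
    return sorted(probs, key=lambda p: abs(p - 0.5), reverse=True)[:n]

# Hypothetical token probabilities, chosen only for illustration.
spam_tokens = [0.99] * 10   # spam phrases at the top of the message
padding = [0.40] * 1000     # common dictionary words appended at the end

all_tokens = spam_tokens + padding

# Every token counts: the padding swamps the spam phrases.
score_all = combine(all_tokens)

# Only the 15 most extreme tokens count: mild padding barely matters.
score_top = combine(most_interesting(all_tokens))

print(f"all tokens:     {score_all:.6f}")
print(f"top 15 tokens:  {score_top:.6f}")
```

With these made-up numbers the padding wins when every token is counted and
loses when only the most extreme tokens are counted. The sketch does not
settle the question, though: padding words that happen to be strongly
associated with one user's legitimate mail would have probabilities near 0
and could crowd into the top 15, and it does not model the second worry
about padded spam being trained into the spam corpus.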
Thanks,
Chris Eykamp
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk