Ross Vandegrift said the following on 19/11/02 14:17:
On Tue, Nov 19, 2002 at 09:39:13AM +0000, Matt Sergeant wrote:

The spammers have. An even better way they've found is to include a snippet from a legit mailing list, but put it in a white text on white background box. This was discussed on the spambayes mailing list.

Now, I am not a statistician but I am a mathematician.  If my
understanding of Bayesian statistics is correct (and if people actually
are being accurate when they call this method Bayesian), this shouldn't
matter at all - that's the beauty of the process.
They are not being accurate when they call it Bayesian. It is, at best, naive Bayesian.

If the Bayesian analysis actually takes into account the joint and
conditional densities of word frequency, and it has a reasonable way to
assign an expectation to them (i.e., if the corpus is seeded with real
non-spam and real spam), the fact that a spam has been seeded with real
words should show up in the joint and conditional frequency analysis.
This would allow the filter to assign a spam score, though perhaps with
a smaller confidence interval.
See now I did two years of a Maths degree, and you've already gone way over my head :-)

What does "joint and conditional frequency analysis" mean?
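
One plausible reading of the phrase, sketched on a toy example: a "conditional" frequency is how often a word turns up given that the message is spam (or non-spam), and a "joint" frequency is how often two words turn up together in the same message. A naive Bayesian filter only keeps the per-word conditional frequencies and multiplies them as if the words were independent, so it never looks at co-occurrence. The corpus, the words, and the helper names below are all invented for illustration; this is not how SpamAssassin or spambayes actually compute anything.

```python
from collections import Counter
from itertools import combinations

# Toy corpus, invented for illustration: each message is a set of words.
spam = [
    {"viagra", "free", "click"},
    {"free", "click", "offer"},
    {"viagra", "offer", "patch"},   # spam padded with a mailing-list word
    {"free", "viagra", "kernel"},   # likewise
]
ham = [
    {"patch", "kernel", "build"},
    {"kernel", "build", "offer"},
    {"patch", "build", "free"},
    {"kernel", "patch", "list"},
]

def word_freq(messages):
    """Conditional frequency: fraction of messages in one class containing each word."""
    counts = Counter(w for m in messages for w in m)
    return {w: c / len(messages) for w, c in counts.items()}

def pair_freq(messages):
    """Joint frequency: fraction of messages in one class containing both words of a pair."""
    counts = Counter(
        frozenset(p) for m in messages for p in combinations(sorted(m), 2)
    )
    return {p: c / len(messages) for p, c in counts.items()}

p_word_spam = word_freq(spam)
p_pair_spam = pair_freq(spam)
p_pair_ham = pair_freq(ham)

pair = frozenset({"viagra", "kernel"})

# Naive estimate: pretend the two words are independent given the class.
naive_spam = p_word_spam["viagra"] * p_word_spam["kernel"]

# Joint estimate: count how often the pair really occurs together.
joint_spam = p_pair_spam.get(pair, 0.0)
joint_ham = p_pair_ham.get(pair, 0.0)

print("P(viagra | spam)          =", p_word_spam["viagra"])
print("P(kernel | spam)          =", p_word_spam["kernel"])
print("naive P(both | spam)      =", naive_spam)
print("joint P(both | spam)      =", joint_spam)
print("joint P(both | non-spam)  =", joint_ham)
```

In this made-up data the pair "viagra" plus "kernel" never occurs in non-spam, so a filter that tracked joint frequencies could still flag a padded spam, while a naive filter only sees that "kernel" on its own looks innocent. Whether that holds on a real corpus is a separate question, since pair counts get sparse very quickly.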



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
