-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Arpi writes:
>Hi,
>
>> On Mon, Jan 19, 2004 at 03:21:06PM -0500, Larry Gilson wrote:
>> > http://useast.spamassassin.org/doc/Mail_SpamAssassin_Conf.html#learning%20options
>> > 
>> > bayes_ignore_header header_name
>> 
>> ::bangs head on wall::   How did I miss *that*?  Thanks for correcting
>> my careless reading.
>> 
>> In a broader sense though, shouldn't fields like To: be excluded by
>> default?  It seems like if I receive more than 50% spam, this is a
>> recipe for disaster.  Of course, some spam won't have a valid To:
>> field, but it seems like constant things like this will be very bad
>> arbiters.
>
>Although I agree that this Bayes behaviour on To: is good, this thread
>raised an interesting question for me:
>Does the Bayes calculation take the spam:ham ratio into account?
>
>So, if I have a constant header line (word), present in every spam and every
>ham message, but I get 10 times more spam than ham (so the counters on this
>word are 10 times bigger in the spam column than in the ham column), then will
>Bayes think this word means a 10:1 spam probability? Which is bad, of course!!
>It means nothing: it is equally likely to indicate spam as ham.
>
>And we all have some constant headers, just think of the Received:
>line including your mail server name/ip...
>
>I wonder if the Bayes DB normalizes the spam/ham counts by the total
>number of spam/ham messages? Then it would find that my word is
>present in 100% of all spam messages, and 100% of all ham messages,
>so it means 50% spam probability (instead of 10:1, which means about 91%)

Hi Arpi --

yep, it does.  That's why the db keeps a total count of messages in
each category, in the nspam and nham counters.
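
To make the arithmetic concrete, here is a rough sketch in Python of
what normalizing by nspam and nham buys you.  It is not SpamAssassin's
actual code, and the real calculation may apply further smoothing
before combining tokens.  The function names and the 1000/100 corpus
sizes are made up for illustration; only nspam and nham are names from
the db.

```python
def token_spam_probability(spam_hits, ham_hits, nspam, nham):
    """Per-token spam probability using per-class frequencies.

    spam_hits / ham_hits: messages of each class containing the token.
    nspam / nham: total messages trained in each class.
    (Illustrative sketch only, not SpamAssassin's implementation.)
    """
    if nspam <= 0 or nham <= 0:
        raise ValueError("need at least one trained message in each class")
    spam_ratio = spam_hits / nspam   # fraction of spam containing the token
    ham_ratio = ham_hits / nham      # fraction of ham containing the token
    if spam_ratio + ham_ratio == 0:
        return 0.5                   # token never seen: no evidence either way
    return spam_ratio / (spam_ratio + ham_ratio)


def naive_probability(spam_hits, ham_hits):
    """What you would get from raw counts, ignoring corpus sizes."""
    return spam_hits / (spam_hits + ham_hits)


# A constant header token seen in every message, with 10x more spam than ham.
constant = token_spam_probability(1000, 100, nspam=1000, nham=100)
naive = naive_probability(1000, 100)
print(constant, naive)
```

With the per-class totals in the denominator, the constant token comes
out neutral; using raw counts alone it would look strongly spammy.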

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFADbQkQTcbUG5Y7woRAla9AKDwS2rqoRk0q8/6jJYeC9ejA608AgCcDyOt
nNGtn2IUVmnDT+iKKV2wl00=
=aE6E
-----END PGP SIGNATURE-----



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
