On 3/15/2016 6:26 PM, John Hardin wrote:
> On Tue, 15 Mar 2016, Ted Mittelstaedt wrote:
>>> we have scripts checking any samples against current bayes
>>> classification and ignore them if they already have BAYES_99,
>>
>> Is this even necessary? I thought the learner automatically
>> rejected everything already tagged.
>
> Already *learned*. There's nothing preventing you from learning messages
> that scored BAYES_999 (or BAYES_00).
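
For illustration, a pre-filter of the kind described above (skip samples
that already hit BAYES_99) might be sketched roughly as follows. This is
only a sketch under assumptions: it assumes each sample carries an
X-Spam-Status header listing the rule names that hit, and the function
name and SKIP_RULES set are made up for this example, not taken from
anyone's actual scripts.

```python
import email
import re

# Rule names treated as "Bayes already considers this spam" (an assumption).
SKIP_RULES = {"BAYES_99", "BAYES_999"}


def already_bayes_spam(raw_message: str) -> bool:
    """Return True if the X-Spam-Status header lists one of SKIP_RULES."""
    msg = email.message_from_string(raw_message)
    status = str(msg.get("X-Spam-Status", ""))
    # Rule names are upper-case tokens; the header may be folded across
    # several lines, so match tokens rather than splitting on commas.
    hits = set(re.findall(r"\b[A-Z][A-Z0-9_]+\b", status))
    return bool(hits & SKIP_RULES)
```

Samples for which this returns False would then be handed to
sa-learn --spam; the rest would be skipped.
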
How exactly would it be a bad thing to learn, as spam, a message that
already had this score? (Spam that had been verified by hand to be spam,
or spam that arrived at a honeypot address where it could not possibly
be legit.)
How would it be a bad thing to learn a piece of spam that had already
been caught by another rule and tagged as spam?
I guess my question is - if I have a piece of spam - a piece of mail
that I am absolutely positive is real, honest to God spam - not a
possible false positive - but real spam - how would it be bad in any
way to feed that into the learner as spam?
Ted