On 3/15/2016 2:48 PM, Reindl Harald wrote:


On 15.03.2016 at 22:24, Ted Mittelstaedt wrote:
Baloney - spamoney!!!

I do not use autolearning, and ALL my spam is either hand-selected or it
comes from honeypot addresses that have NEVER been on my domains - I get
these honeypot addresses by scanning the mail log and looking for
guesses by spammers - when I see a popular address in the "guess bin"
I set it up as a honeypot - and within 6 months it's getting thousands
of spams a week. And the ham comes from me and from a select group of
users who have large amounts of mail stored on the system that is all
clean.
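
A rough sketch of the "guess bin" scan described above, in Python. It assumes Postfix-style rejection lines ("Recipient address rejected: User unknown"); that log format, the function name guessed_addresses and the min_hits threshold are illustrative assumptions, not what is actually run here, so adjust the pattern for your own MTA.

```python
import re
from collections import Counter

# Assumed Postfix-style rejection line; other MTAs log this differently.
REJECT_RE = re.compile(
    r"<([^<>@\s]+@[^<>\s]+)>: Recipient address rejected: User unknown"
)

def guessed_addresses(log_lines, min_hits=5):
    """Count nonexistent recipients that senders tried, most popular first.

    Returns (address, hits) pairs with at least min_hits rejections;
    these are the candidates for new honeypot addresses.
    """
    counts = Counter()
    for line in log_lines:
        match = REJECT_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return [(addr, n) for addr, n in counts.most_common() if n >= min_hits]
```

Calling guessed_addresses(open("/var/log/maillog")) would list the popular guesses; the log path is also only an example.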

Bayes is NOT the answer to everything!!!!

no, but it is for most things if your corpus is well maintained and you
don't forget already learned samples - otherwise it's easy to trick in
the long run, and it won't catch seasonal junk or will end up
misclassifying seasonal ham

we have scripts that check every sample against the current bayes
classification and ignore those that already hit BAYES_99,

Is this even necessary?  I thought the learner automatically
rejected everything already tagged.

there is
not much left to train, and with a whole year of data, have fun trying
to bypass it, especially when it's scored properly
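
A minimal sketch of that kind of pre-filter, in Python. It is not the actual script mentioned above; it assumes each sample has just been passed through SpamAssassin so that its X-Spam-Status header reflects the current Bayes state, and the names rules_hit and needs_training are made up for illustration.

```python
import re
from email import message_from_string

def rules_hit(raw_message):
    """Return the set of rule names listed in X-Spam-Status tests=..."""
    msg = message_from_string(raw_message)
    status = str(msg.get("X-Spam-Status", ""))
    # The rule list may be folded over several lines after a comma.
    status = re.sub(r",\s+", ",", status)
    match = re.search(r"tests=(\S*)", status)
    return set(match.group(1).split(",")) if match else set()

def needs_training(raw_message, skip_rule="BAYES_99"):
    """True unless the current scan already gave the sample skip_rule."""
    return skip_rule not in rules_hit(raw_message)
```

Only samples for which needs_training() returns True would then be handed to the learner.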


All my spam from 2015 that was fed into the Bayes learner is backed up;
there's probably about 3GB of it.  I show 369686 spams and 15128675
tokens in the database.  I don't think I'm forgetting already
learned samples, since the spam and token counts increase every time I
feed it.
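
For anyone who wants to compare those counters before and after a training run, here is a small Python sketch that parses the output of "sa-learn --dump magic". The column layout it expects (four columns, then "non-token data: <name>") is my understanding of that output and should be treated as an assumption; parse_dump_magic is a made-up helper name.

```python
import re

# Assumed layout: prob, spam count, value, atime, "non-token data: name"
MAGIC_RE = re.compile(
    r"^\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data:\s*(.+?)\s*$"
)

def parse_dump_magic(text):
    """Map each 'non-token data' name (nspam, nham, ntokens, ...) to its value."""
    counters = {}
    for line in text.splitlines():
        match = MAGIC_RE.match(line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters
```

Capturing the output once before and once after feeding the learner and comparing the nspam and ntokens entries shows whether the run actually added anything.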

If you think there is a different, better way to train it,
I can create a test db, feed last year's spam into it, and see if it works any better.

Ted
