On Thu, 5 Jan 2017, Nicola Piazzi wrote:
Each minute it learns the messages of the last minute, so it reads and learns
each message only once.
There is a certain amount of overhead involved in reading the mailbox and
processing messages even if they have already been learned...
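One way to get the "each message only once" behaviour without depending on exact cron timing is to remember the timestamp of the newest message already handled and select only newer ones. The following is a minimal sketch, assuming the maillog database can be reduced to (message id, received time) rows; the function name and the row layout are hypothetical and not taken from Nicola's script.

```python
from datetime import datetime


def select_new_messages(rows, last_seen):
    """Return the rows newer than last_seen, plus the new high-water mark.

    rows      -- iterable of (message_id, received_at) tuples (assumed layout)
    last_seen -- datetime of the newest message handled by a previous run
    """
    # Keep only messages received strictly after the previous mark,
    # so a message is never handed to the learner twice.
    fresh = sorted((r for r in rows if r[1] > last_seen), key=lambda r: r[1])
    # If nothing is new, the mark stays where it was.
    new_mark = fresh[-1][1] if fresh else last_seen
    return fresh, new_mark
```

The caller would store `new_mark` somewhere persistent between runs; a run that finds nothing new returns an empty list and can exit before any mailbox is opened.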
The messages are ones sent from internal hosts, so it learns that those words are not spam.
Internal messages are not spam.
...until you get infected by a spambot.
Bayes training should be manually reviewed. Blind training is fragile and
invites the system to go badly off the rails when for some reason it makes
a poor decision that is self-reinforcing.
Nicola Piazzi
CED - Sistemi
COMET s.p.a.
Via Michelino, 105 - 40127 Bologna - Italia
Tel. +39 051.6079.293
Cell. +39 328.21.73.470
Web: www.gruppocomet.it
-----Original Message-----
From: John Hardin [mailto:jhar...@impsec.org]
Sent: Thursday, 5 January 2017 17:35
To: users@spamassassin.apache.org
Subject: Re: learn ham
On Thu, 5 Jan 2017, Marc Stürmer wrote:
On 2017-01-04 10:58, Nicola Piazzi wrote:
I found it useful to put a little script like this in cron.
Each minute cron launches this script, which takes the messages of the last
minute by reading from the maillog database.
What's the purpose of this script, what's the reasoning behind running
this thingie every minute?
What you are doing is training the Bayes filter every minute. Training a
filter is something which should never be done unattended, but always
supervised, because otherwise you will get bad results over time.
The execution of the training program can safely be automated, though I'd agree
once per minute is a bit excessive. The classification of messages into the
folders that are trained from is what needs manual supervision.
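The split described above, with automated execution over manually classified folders, could look roughly like this. It is a sketch under assumptions: the folder paths are hypothetical, and the only `sa-learn` options used are `--ham` and `--spam`, each pointed at a folder that a person has already reviewed.

```python
import subprocess


def build_sa_learn_commands(ham_dir, spam_dir):
    """Build one sa-learn invocation per manually reviewed folder."""
    return [
        ["sa-learn", "--ham", str(ham_dir)],
        ["sa-learn", "--spam", str(spam_dir)],
    ]


def train(ham_dir, spam_dir, runner=subprocess.run):
    """Run the training commands; the runner is injectable for dry runs."""
    for argv in build_sa_learn_commands(ham_dir, spam_dir):
        # check=True makes a failing sa-learn run raise instead of
        # passing silently when launched from cron.
        runner(argv, check=True)
```

A cron job would call something like `train("/path/to/reviewed/ham", "/path/to/reviewed/spam")` at a relaxed interval such as hourly or nightly. Nothing in the script decides what is ham or spam; that decision stays with whoever files messages into the two folders.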
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Individual liberties are always "loopholes" to absolute authority.
-----------------------------------------------------------------------
381 days since the first successful real return to launch site (SpaceX)