On Thu, 5 Jan 2017, Nicola Piazzi wrote:

Each minute it learns the messages of the last minute, so it reads and learns 
each message only once

There is a certain amount of overhead involved in reading the mailbox and processing messages even if they have already been learned...
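
One way to limit that overhead is to hand the learner only files it has not been given before. A minimal sketch, assuming messages are stored one per file (Maildir style) and the state file lives outside the mail folder; the `new_messages` helper and the state file are hypothetical and not part of SpamAssassin:

```python
import json
from pathlib import Path


def new_messages(folder, state_file):
    """Return files in `folder` not yet recorded in `state_file`, then record them."""
    folder = Path(folder)
    state_file = Path(state_file)  # assumed to live outside `folder`

    # File names handed out by earlier runs (empty on the first run).
    seen = set(json.loads(state_file.read_text())) if state_file.exists() else set()

    fresh = sorted(
        p for p in folder.iterdir() if p.is_file() and p.name not in seen
    )

    # Remember the new names so the next run skips them.
    seen.update(p.name for p in fresh)
    state_file.write_text(json.dumps(sorted(seen)))
    return fresh
```

A second call over an unchanged folder returns an empty list, so nothing has to be opened or parsed again.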

These are messages sent from internal senders, so it learns that their words are not spam

Internal messages are not spam

...until you get infected by a spambot.

Bayes training should be manually reviewed. Blind training is fragile and invites the system to go badly off the rails when for some reason it makes a poor decision that is self-reinforcing.


Nicola Piazzi
CED - Sistemi
COMET s.p.a.
Via Michelino, 105 - 40127 Bologna - Italia
Tel.  +39 051.6079.293
Cell. +39 328.21.73.470
Web: www.gruppocomet.it



-----Original Message-----
From: John Hardin [mailto:jhar...@impsec.org]
Sent: Thursday, 5 January 2017 17:35
To: users@spamassassin.apache.org
Subject: Re: learn ham

On Thu, 5 Jan 2017, Marc Stürmer wrote:

On 2017-01-04 10:58, Nicola Piazzi wrote:

 I found it useful to put a little script like this in cron

 Each minute cron launches this script, which takes the messages of the last
minute, reading them from the maillog database

What's the purpose of this script, what's the reasoning behind running
this thingie every minute?

What you are doing is training the Bayes filter every minute. Training a
filter should never be done unattended but always supervised; otherwise
you will get bad results over time.

The execution of the training program can safely be automated, though I'd agree 
once per minute is a bit excessive. The classification of messages into the 
folders that are trained from is what needs manual supervision.
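
A minimal sketch of that split, assuming a person sorts mail into two reviewed folders and cron only runs the learner over them. The folder paths and the `build_commands` and `train` helpers are hypothetical; `--ham`, `--spam`, `--no-sync` and `--sync` are existing sa-learn options:

```python
import subprocess


def build_commands(ham_dir, spam_dir):
    """sa-learn invocations for one training pass over reviewed folders."""
    return [
        # --no-sync skips the database sync after each folder;
        # the final --sync then does it once.
        ["sa-learn", "--no-sync", "--ham", str(ham_dir)],
        ["sa-learn", "--no-sync", "--spam", str(spam_dir)],
        ["sa-learn", "--sync"],
    ]


def train(ham_dir, spam_dir, run=subprocess.run):
    """Run each command in order, stopping at the first failure."""
    for cmd in build_commands(ham_dir, spam_dir):
        run(cmd, check=True)


# Example call with hypothetical paths; only manually classified
# mail should ever be placed in these folders:
# train("/var/spool/bayes/reviewed-ham", "/var/spool/bayes/reviewed-spam")
```

Cron could call this at whatever interval suits, for example hourly rather than every minute; the interval here is only a suggestion.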

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Individual liberties are always "loopholes" to absolute authority.
-----------------------------------------------------------------------
 381 days since the first successful real return to launch site (SpaceX)
