--As of August 23, 2014 3:22:13 AM +0200, Karsten Bräckelmann is alleged
to have said:
On Fri, 2014-08-22 at 17:32 -0700, Ian Zimmerman wrote:
Isn't inotify a bit of overkill for this? If you have a dedicated
maildir for training, you know that anything in maildir/new is, uh,
new. So you process it and move it to maildir/cur. What am I missing?
The new/ directory is for delivery, messages moved will end up in cur/.
Training on messages in new/ means training solely on classification.
These messages have not been seen by a human, and he's most likely not
even aware there's new mail at all.
Messages moved (copied) into dedicated (ham|spam) learning folders will
be placed in cur/.
Thus, training on content in dedicated learning folders' new/ dirs won't
work, because human-reviewed mail does not go there. And training on
new/ dirs in general is like overriding all of the precautionary measures
of SA auto-learning, and blindly training on anything and everything above
or below the required_score threshold.
Besides, moving messages from new/ to cur/ is the IMAP server's duty. No
third-party script should ever mess with that.
--As for the rest, it is mine.
Good points, but inotify might still be overkill. `ls maildir/cur/ | grep
',.*S'` will give you all messages that have been seen in the mailbox, so
you can run on a periodic schedule fairly easily. I'm not sure whether you
need the immediate notification inotify gives.
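
A minimal sketch of that periodic scan, in Python rather than shell, in case
it helps. It assumes the usual Maildir naming where flags follow a ":2,"
suffix in the file name; the function name seen_messages and the maildir
path are made up for illustration. Matching only the flags part is a bit
stricter than the grep above, which would also hit a comma and an S
elsewhere in the name.

```python
import os


def seen_messages(maildir):
    """Return sorted file names in maildir/cur whose flags include S (seen)."""
    cur = os.path.join(maildir, "cur")
    seen = []
    for name in os.listdir(cur):
        # Maildir file names end in ":2,<flags>"; split on the last ":2,".
        base, sep, flags = name.rpartition(":2,")
        # sep is empty when the name carries no flags section at all.
        if sep and "S" in flags:
            seen.append(name)
    return sorted(seen)
```

Run from cron (or any scheduler) against the learning folder, the returned
names are the messages a human has at least opened.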
That said: It's still an interesting and possibly useful approach. My
current system is that I have a 'misfiled spam' folder, and I train on
everything in it every night. (And auto clean it out every night as well.)
I let autolearn take care of normal ham. (The occasional misfiled ham I've
always handled manually, as they are so few it's never been worth
automating.)
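
Roughly, the nightly job amounts to something like the sketch below. This is
an illustration under assumptions, not my actual script: the function name
train_and_clean, the folder layout (messages sitting in the folder's cur/),
and the injectable run argument are all hypothetical. The only real external
piece is `sa-learn --spam`, which takes message files as arguments.

```python
import os
import subprocess


def train_and_clean(folder, run=subprocess.run):
    """Feed every message in folder/cur to sa-learn as spam, then delete them."""
    cur = os.path.join(folder, "cur")
    paths = sorted(os.path.join(cur, n) for n in os.listdir(cur))
    if not paths:
        return []  # nothing misfiled today, so skip calling sa-learn
    # Assumption: sa-learn is on PATH and may be run by this user.
    run(["sa-learn", "--spam"] + paths, check=True)
    # Only reached if sa-learn succeeded; check=True raises on failure,
    # so untrained messages are never deleted.
    for p in paths:
        os.remove(p)
    return paths
```

Deleting only after a successful sa-learn run means a failed night leaves
the folder intact for the next attempt.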
inotify won't work for me - I'm on a BSD where inotify doesn't exist - but
it's an interesting approach.
Daniel T. Staal
---------------------------------------------------------------
This email copyright the author. Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes. This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---------------------------------------------------------------