On Mon, 23 Jul 2018, Nick Bright wrote:

> On 7/23/2018 7:55 PM, Reindl Harald wrote:
>> And even if so, what's the point of storing the surrounding messages in
>> the corpus, which you should keep forever in case you need to rebuild
>> from scratch later?
>>
>> What is the problem you are trying to solve, and why can't you just store
>> the attachment instead of the whole mail containing it?
>
> The problem I'm trying to solve is "how to implement a training system on
> my server".
>
> I suppose I could de-encapsulate an attachment with a script before
> feeding it to sa-learn?

If your mailbox server speaks IMAP, supports public folders, and you have
access to the back-end storage (e.g. Dovecot), then you could implement a
report-spam folder submission system.
E.g. your users drop spam messages into the report-spam folder and your script
runs on the back side, extracting the messages, feeding each one to
"spamc -L spam" and then moving it into a "report-done" folder for archival
purposes.
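
A rough sketch of what such a back-side script could look like, assuming the
folders are stored as Maildir directories that the script can read and write
directly. The function name, the LEARN_CMD constant and the ok_codes default
are illustrative assumptions, not part of any existing tool. It also assumes
spamd was started with --allow-tell, since spamd refuses learn requests
otherwise. The accepted exit codes are a guess to be checked against spamc(1)
for your version.

```python
import mailbox
import subprocess

# Assumption: spamd runs with --allow-tell so that "spamc -L spam" is accepted.
LEARN_CMD = ["spamc", "-L", "spam"]


def process_reports(report_dir, done_dir, learn_cmd=LEARN_CMD, ok_codes=(0, 5, 6)):
    """Feed each message in report_dir to learn_cmd, then archive it in done_dir.

    ok_codes is an assumption: verify which exit codes your spamc uses
    for "learned" and "already learned" before relying on it.
    Returns the number of messages that were learned and moved.
    """
    reports = mailbox.Maildir(report_dir, create=False)
    done = mailbox.Maildir(done_dir, create=True)
    learned = 0
    for key in list(reports.keys()):
        raw = reports.get_bytes(key)
        # The raw message goes to the learner on stdin, unchanged.
        result = subprocess.run(learn_cmd, input=raw, stdout=subprocess.DEVNULL)
        if result.returncode not in ok_codes:
            continue  # leave the message in place so a later run can retry
        done.add(raw)
        reports.remove(key)
        learned += 1
    return learned
```

Run from cron as the user that owns the mail store, a call might look like
process_reports("/var/vmail/public/.report-spam", "/var/vmail/public/.report-done"),
where both paths are hypothetical and depend on your Dovecot layout.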
That, or you have to glue together some kind of de-MIME-ifying scripts inside
procmail to feed 'spamc -L spam', and hope that your users use some predictable
kind of MIME labeling so you can automate the unwrapping process (good luck).
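
For the unwrapping step, here is a minimal sketch using Python's standard
email package. It assumes users forward the spam "as attachment", so that it
arrives as a message/rfc822 MIME part. Inline forwards cannot be recovered
this way. The function name is made up for illustration. Note that the inner
message is re-serialized, so header folding may differ slightly from the
original bytes.

```python
from email import message_from_bytes, policy


def extract_forwarded(raw):
    """Return each message/rfc822 attachment in raw as its own bytes object."""
    outer = message_from_bytes(raw, policy=policy.default)
    found = []
    stack = [outer]
    while stack:
        part = stack.pop()
        if part.get_content_type() == "message/rfc822" and part.is_multipart():
            # The parser stores the embedded message as the single payload item.
            found.append(part.get_payload(0).as_bytes())
        elif part.is_multipart():
            # Descend into containers, but never into an embedded message,
            # so attachments nested inside the spam itself are left alone.
            stack.extend(reversed(part.get_payload()))
    return found
```

Each returned item can then be piped to the learner on its own. A wrapper
message with no message/rfc822 part yields an empty list, which is the signal
to fall back to manual review.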
Either way you are at the mercy of your users to make valid judgments about
whether a particular message is actual spam (and not just some
marketing/newsletter thing they signed up for and then forgot).
--
Dave Funk University of Iowa
<dbfunk (at) engineering.uiowa.edu> College of Engineering
319/335-5751 FAX: 319/384-0549 1256 Seamans Center
Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{