On Mon, 23 Jul 2018, Nick Bright wrote:

On 7/23/2018 7:55 PM, Reindl Harald wrote:
And even if - what's the point of storing the surrounding messages in the corpus, which you should keep forever in case you need to rebuild from scratch later? What is the problem you are trying to solve, and why can't you just store the attachment instead of the whole mail containing it?
The problem I'm trying to solve is "how to implement a training system on my server".

I suppose I could de-encapsulate an attachment with a script before feeding it to sa-learn?

If your mailbox server speaks IMAP, supports public folders, and you have access to the back-end storage (e.g. Dovecot), then you could implement a report-spam folder submission system.

E.g. your users drop spam messages into the report-spam folder, and your script runs on the back end, extracting the messages, feeding them to "spamc -L spam" and then moving them into a "report-done" folder for archival purposes.
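
A rough sketch of that back-end script, in Python, in case it helps. It assumes the two folders are plain Maildir directories that the script can read and write, that spamd was started with --allow-tell so spamc's learn option ("-L spam") is accepted, and that it is run from cron or similar. The function names and the folder paths in the usage note are made up for illustration, not anything Dovecot or SpamAssassin provides.

```python
import mailbox
import subprocess


def spamc_learn(raw: bytes) -> None:
    """Pipe one raw message to spamc for learning as spam.

    Assumption: spamd was started with --allow-tell. The exit status is
    deliberately not interpreted in this sketch.
    """
    subprocess.run(["spamc", "-L", "spam"], input=raw, check=False)


def process_reports(report_dir: str, done_dir: str, learn=spamc_learn) -> int:
    """Learn every message in report_dir, then archive it in done_dir.

    Both paths are assumed to be Maildir directories (cur/new/tmp).
    Returns the number of messages handled.
    """
    reports = mailbox.Maildir(report_dir, factory=None, create=False)
    done = mailbox.Maildir(done_dir, factory=None, create=True)
    handled = 0
    for key in list(reports.keys()):
        raw = reports.get_bytes(key)
        learn(raw)
        done.add(raw)        # archive a copy first ...
        reports.remove(key)  # ... then drop it from the report folder
        handled += 1
    return handled
```

Usage would be something like process_reports("/path/to/.report-spam", "/path/to/.report-done"), with the real paths depending on how your storage is laid out. The learner is passed in as a parameter so you can swap in sa-learn or a dry-run function without touching the folder handling.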

That, or you have to glue together some kind of de-MIME-ifying scripts inside procmail to feed 'spamc -L spam' and hope that your users use some predictable kind of MIME labeling so you can automate the unwrapping process (good luck).
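
For the unwrapping step, here is a minimal sketch in Python of what such a script might look like. It assumes users forward the spam "as attachment", so the original arrives as a message/rfc822 part; anything forwarded inline will not be found. The function name is hypothetical, and the output would still have to be piped to your learner.

```python
from email import message_from_bytes
from email.message import Message
from typing import List


def extract_attached_messages(raw: bytes) -> List[bytes]:
    """Return each message/rfc822 attachment of a wrapper mail as raw bytes.

    Assumption: the reported spam was forwarded as an attachment.
    Parts inside an extracted message are not searched any further.
    """
    found: List[bytes] = []

    def collect(part: Message) -> None:
        payload = part.get_payload()
        if part.get_content_type() == "message/rfc822":
            # A parsed message/rfc822 part holds a list with one Message.
            if isinstance(payload, list):
                found.extend(inner.as_bytes() for inner in payload)
            return
        if part.is_multipart():
            for sub in payload:
                collect(sub)

    collect(message_from_bytes(raw))
    return found
```

A wrapper with no attached message yields an empty list, which is the case where you would have to decide whether to skip the report or learn the wrapper itself.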

Either way you are at the mercy of your users to make valid judgments about whether a particular message is actual spam (and not just some marketing/newsletter thing they signed up for and then forgot).



--
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
