On Sun, 20 Jul 2003, Daniel Carrera muttered drunkenly:
> Hello,
>
> I'm trying to figure out how to use 'sa-learn' to train SA's spam filter.
> I have already looked at the man page.
>
> I am not entirely sure what kind of input is acceptable to sa-learn.
>
> On teaching ham:
> ================
> I gather that I can give sa-learn an entire mailbox and it'll know how to
> interpret it.  Am I right?  I am using mutt.  Are mutt's mail-boxes similar
> to what sa-learn expects?

sa-learn understands Unix mbox format (messages preceded by a line
matching the regex "^From "), and maildir format (grabbing all files in a
directory save for those starting with a dot, so it can handle mh and
Gnus's nnml formats, too). Anything else, you'll have to teach it.

> On teaching spam:
> =================
> This is the tricky one.  Most of the spam I get is successfully filtered
> by the current SA rules.  These are stored in a separate file.  I would
> like to use this file to train SA.  The problem is that SA alters the
> header significantly.  SA's output includes an analysis explaining why it
> thinks that the given email is spam.  This will do bad things to the
> statistical approach.  Is there a way to use this output to train SA?

That's fine; SA understands what it (and all previous versions of it) did
to the headers and body, and automatically reverses it (the equivalent of
a `spamassassin -d') before learning.

If I were you I'd take care to sa-learn spam that SA misses (and ham that
SA thinks is spam) appropriately; that is what will have the most impact
on the effectiveness of SA.

-- 
`We cannot get a new line down the pipe due to a blockage and we cannot
 dig up the road to clear the blockage because it is covered with the
 wrong type of tarmac.' --- British Telecom, via Mark Lowes

_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
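
A footnote on the two input formats described above, for anyone wanting to check a mailbox before feeding it to sa-learn. This is only a rough Python sketch of the rules as stated in this message (a "^From " line starts a message; in a directory, names starting with a dot are skipped). It is not sa-learn's actual code, and the helper names split_mbox and is_message_file are made up for illustration. Real mbox readers are usually stricter, for instance about the blank line before each separator.

```python
import re

# A message starts at any line matching "^From " (note the trailing space,
# which is what distinguishes the separator from a "From:" header).
FROM_LINE = re.compile(r"^From ", re.MULTILINE)


def split_mbox(text):
    """Split mbox-formatted text into a list of message strings.

    Hypothetical helper for illustration; not part of SpamAssassin.
    Each returned string begins with its "From " separator line.
    """
    starts = [m.start() for m in FROM_LINE.finditer(text)]
    # Each message runs from its own start to the next start (or end of text).
    ends = starts[1:] + [len(text)]
    return [text[s:e] for s, e in zip(starts, ends)]


def is_message_file(name):
    """Directory rule as described above: skip names starting with a dot."""
    return not name.startswith(".")


sample = (
    "From alice@example.com Sun Jul 20 10:00:00 2003\n"
    "From: alice@example.com\n"
    "Subject: first\n"
    "\n"
    "Hello.\n"
    "\n"
    "From bob@example.com Sun Jul 20 11:00:00 2003\n"
    "Subject: second\n"
    "\n"
    ">From the archives: body lines starting with From are escaped.\n"
)

messages = split_mbox(sample)
print(len(messages))  # prints 2
```

If the count matches what mutt shows for the folder, the file is most likely a plain mbox of the kind described above.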