On 8/11/2014 11:42 AM, Jeff Rice wrote:
> Hello,
> I'm trying to work out a way to have my Sieve filter save a "pristine"
> version of email messages as a backup, primarily to use for training the
> spam filter. What I would like is to have every message saved into a single,
> site-wide directory (in the global sieve) before being processed
> additionally and delivered. The messages in that directory will be used
> to train the spam filter without having to worry about removing
> SpamAssassin headers and so forth.

Provided I understand you correctly, my first thought is that saving a
duplicate copy of every single message that arrives on this system seems
wasteful. Why not save only the messages that would actually be useful
for spam training purposes?

>
> I thought fileinto :copy might do what I wanted, but this creates a
> backup directory individually for each user. That's unmanageable for
> the spam training process I use. redirect *could* work, but that adds a
> header during the process so the email saved would not be "pristine".
>
> I'm thinking of using the extprograms plugin to pipe to a program that
> will do a simple copy. That feels very hackish, however, and I'm hoping
> there is a more elegant solution.
>

There is: the Dovecot Antispam plugin. It does what you describe, and
it avoids the problem of storing a duplicate copy of every message.

In short, when a user drags a message from any folder to "Junk", you'll
receive a "pristine" copy of the message at any local address you
specify, delivered to any folder you specify (e.g., "Train as SPAM")
within that "training user's" mailbox.

Conversely, when a user drags a message from "Junk" to any other folder,
you'll receive a copy of the message in your "Train as HAM" folder.

Then, you can point your anti-spam solution's training executable at
these two "pristine master corpus" folders. If you ever need to
reclassify messages, or expunge them, doing so is trivial with this
master corpus approach.

> Am I missing something obvious here?
>
> Thanks!
> Jeff

I'm happy to provide a sample script for the antispam plugin's mailtrain
back-end, as that's the one I use.

Cheers,

-Ben
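
For illustration only, here is a minimal Python sketch of the last step described above: pointing a training executable at the two master corpus folders. It is not the sample script offered in the message. It assumes SpamAssassin's sa-learn is the trainer (the thread mentions SpamAssassin, but the actual training tool is not stated), that the training user's mail lives in a Maildir at the hypothetical path shown, and that folders use the dot-prefixed Maildir++ naming. The names TRAINING_MAILDIR, CORPUS, build_commands, and train are made up for this example.

```python
#!/usr/bin/env python3
"""Sketch: train a spam filter from two "master corpus" Maildir folders."""
import shlex
import subprocess
from pathlib import Path

# Hypothetical location of the training user's Maildir; adjust to your setup.
TRAINING_MAILDIR = Path("/var/vmail/example.com/training/Maildir")

# Folder names follow the examples in the message above. The leading dot
# assumes a Maildir++ layout, where each folder is a dot-prefixed directory.
CORPUS = {
    "--spam": ".Train as SPAM",
    "--ham": ".Train as HAM",
}


def build_commands(maildir):
    """Return one sa-learn command per existing cur/ or new/ directory."""
    commands = []
    for flag, folder in CORPUS.items():
        for sub in ("cur", "new"):
            path = Path(maildir) / folder / sub
            if path.is_dir():
                # sa-learn accepts a directory and learns every file in it.
                commands.append(["sa-learn", flag, str(path)])
    return commands


def train(maildir=TRAINING_MAILDIR, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(maildir):
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    train()
```

By default the script only prints the commands it would run; calling train(dry_run=False) executes them. Because the corpus folders hold unmodified copies, the same commands can be re-run after reclassifying or expunging messages.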