On 03Nov2012 11:03, Russell L. Harris <rlhar...@broadcaster.org> wrote:
| I thank you for taking the trouble to give a detailed reply, Cameron.
| I have printed it out, and I plan to study it carefully in the
| morning.

On 03Nov2012 09:53, Jamie Paul Griffin <ja...@kode5.net> wrote:
| I just have my mail delivered by smtp so use OpenBSD spamd and
| spamassassin as well as clamav and unofficial sigs, with procmail sorting
| as i mentioned. So my set up is different of course.

I should have made it clear that my setup is a bit roundabout.

The natural macro for this would simply pipe the message to
email-add-spam-subject and then delete it or save it to the "known spam"
bucket. I save it to a special spool folder for two reasons:

  - it is quicker to just save a message to a folder than to pipe the
    message to a program which does some work, making for a snappier
    user experience; I don't care that the subject line isn't in my
    rules until a few seconds later (mailfiler will pick it up as part
    of its regular scan)

  - I've already got my system monitoring maildirs as spools with simple
    rules, so folding this in was very easy

The important things are the script to add a new rule, and telling your
filtering software about the rule update. Procmail rereads (and
therefore recompiles, alas) the rules file every time you fire it up; my
mailfiler notices rule file changes and reloads the files if they get
updated.

I outlined my setup to give background and to show that a small leading
blacklist and an "UNKNOWN" folder for messages matching no filing rule
diverts most stuff away from your inbox fairly effectively without
spamassassin et al.

Regarding filing tools: I used to use procmail. At some point I decided
its rule syntax was too painful, especially if you want to do a few
things with _every_ filing, like X-Labels, log lines and so forth, so
some years ago I wrote cats2procmailrc to take a simple rule syntax and
transcribe a procmailrc.

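To illustrate the reload-on-change behaviour mentioned above, here is a
minimal Python sketch. It is not mailfiler's code: the RuleFile class,
its one-rule-per-line parser and the mtime/size check are all
assumptions made for the example. The point is only that the rules are
reparsed when the file changes, not once per message.

```python
import os


class RuleFile:
    """Hypothetical rules-file holder; reparses only when the file changes."""

    def __init__(self, path):
        self.path = path
        self._stamp = None   # (mtime_ns, size) recorded at the last load
        self.rules = []
        self.loads = 0       # how many times the file has been parsed

    def _parse(self):
        # Placeholder parser (an assumption): one rule per line,
        # skipping blank lines and "#" comments.
        with open(self.path, encoding="utf-8") as f:
            return [line.strip() for line in f
                    if line.strip() and not line.lstrip().startswith("#")]

    def current(self):
        """Return the rules, reloading first if the file changed on disk."""
        st = os.stat(self.path)
        stamp = (st.st_mtime_ns, st.st_size)
        if stamp != self._stamp:
            self.rules = self._parse()
            self._stamp = stamp
            self.loads += 1
        return self.rules
```

A filing loop would call current() before handling each message, so an
edit made by a rule-adding script is picked up on the next message
without restarting the filer.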
And I finally decided to write something that directly understood my
rule syntax, which has several advantages: it reads the rules once (more
performant!), doesn't need a wrapper script to watch maildirs, and
leaves me free to make the rules say what I want instead of what can be
said to procmail.

My core gripe with procmail, aside from the from-scratch startup per
message thing, is that it works entirely off regexps. This is not a good
way to parse email addresses. These are all equivalent:

  c...@zip.com.au
  Cameron Simpson <c...@zip.com.au>
  (Cameron Simpson) c...@zip.com.au

Matching those while not matching:

  c...@zip.com.au
  foo.cs.zip.com.au

and so forth just does not work reliably with a regexp.

A mailfiler rule like this:

  me to-me c...@zip.com.au

files to the folder "me" with the tag/x-label "to-me" if the to/cc/bcc
contains "c...@zip.com.au" in the address component as extracted by a
proper RFC2822 parser. No regexps, just string equality tests. It also
parses each message header just once, on demand, so to test hundreds of
rules the parsing happens only once. And of course the rules are parsed
when I start mailfiler, not for each message.

The other upside of extracting the core address part is that you can do
this:

  friends Friends from:(FRIENDS)

which means: match if the address in the From: header is in my "friends"
group, a set of addresses pulled in from a text db. Again, this is
parsed just once, at load time. So very fast.

When I was using procmail I actually had code that generated an enormous
regexp with tens of addresses in it. Ghastly!

  :0
  * ^(to|cc):.*\<(cameron\.simpson@gmail\.com|cameron\.simpson@me\.com|cs@zip\.com\.au|...
  * ^from:.*(huge regexp for "family" etc kilobytes long...

My now obsolete .procmailrc for the spool-in folder is 1036401 bytes
long. Nasty!

Cheers,
-- 
Cameron Simpson <c...@zip.com.au>

Very few things happen at the right time, and the rest do not happen at all.
The conscientious historian will correct these defects.
        - Mark Twain, _A Horse's Tale_
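
As an illustration of the "parse the address, then test for string
equality" approach described in the message, here is a minimal Python
sketch using the standard library's email.utils.getaddresses. It is not
mailfiler's implementation: the function names, the dict-of-lists header
layout and the case-insensitive comparison are assumptions made for the
example, and the addresses in the usage note are placeholders.

```python
from email.utils import getaddresses


def core_addresses(header_values):
    """Return the set of bare addresses found in a list of header values."""
    # getaddresses() splits each value into (display name, address) pairs,
    # so "Jane <jane@example.com>" and "jane@example.com (Jane)" both
    # yield the same address part.
    # Assumption: compare case-insensitively by lowercasing everything.
    return {addr.lower() for _name, addr in getaddresses(header_values) if addr}


def rule_matches(headers, wanted):
    """True if any To/Cc/Bcc address equals one of the wanted addresses.

    headers: hypothetical mapping of lowercased header name to a list of
             raw header values.
    wanted:  iterable of bare addresses, e.g. one address or a whole group.
    """
    values = []
    for name in ("to", "cc", "bcc"):
        values.extend(headers.get(name, []))
    found = core_addresses(values)
    return not found.isdisjoint(a.lower() for a in wanted)
```

With headers = {"to": ["Jane Doe <jane@example.com>"]}, calling
rule_matches(headers, {"jane@example.com"}) succeeds, while a header
containing only xjane@example.com does not match, because whole
addresses are compared rather than substrings. A group such as "friends"
is just a larger wanted set built once at load time.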