Right now, if mail gets tagged with X-Spam-Status in procmail, we pipe it to sa-learn --single --spam; otherwise we pipe it through sa-learn --single --ham. So every message goes through sa-learn before delivery.
Assuming you do this on all mail, this is a Bad Idea(tm). It means you are polluting your Bayes database with false positives and false negatives: every message SpamAssassin misclassifies gets learned under the wrong label. You're much better off using the built-in auto-learn functionality, which only learns from messages scoring well clear of the spam threshold (below -2 for ham and above 15 for spam, IIRC, but the thresholds are configurable).
Consider: A spam gets through at 4.5 points. The way you described your setup, you will learn it as ham - which means next time, it will probably get an even lower score. Or a valid message gets a 6.5 for some reason (someone reports it to Razor/Pyzor by mistake, the sender has gotten on a DNSBL, whatever) - and you learn it as spam, which means the next similar valid message is going to get a higher score than it would otherwise.
Since auto-learn recognizes the margin of error and stays outside it, you have a very low risk of learning a misclassified message, and you can still train on false positives and false negatives manually when you get the chance.
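To make the difference concrete, here is a rough sketch of the two decision rules in Python. It is a simplification, not SpamAssassin's actual code: the -2 and 15 limits are the from-memory values mentioned above, the 5.0 tagging threshold is an assumption, and the function names are made up for illustration.

```python
# Simplified model of the two learning policies discussed above.
# All three numbers are assumptions: -2/15 are the recalled auto-learn
# limits, 5.0 is an assumed "tag as spam" threshold.
AUTOLEARN_HAM_BELOW = -2.0
AUTOLEARN_SPAM_ABOVE = 15.0
TAG_AS_SPAM_AT = 5.0


def learn_on_tag(score):
    """Pipe-everything policy: every message is learned, one way or the other."""
    return "spam" if score >= TAG_AS_SPAM_AT else "ham"


def auto_learn(score):
    """Auto-learn style policy: learn only when the score is far from the threshold."""
    if score < AUTOLEARN_HAM_BELOW:
        return "ham"
    if score > AUTOLEARN_SPAM_ABOVE:
        return "spam"
    return None  # inside the margin of error: leave it for manual training


if __name__ == "__main__":
    # 4.5 is the missed spam and 6.5 the mis-tagged valid message from the example.
    for score in (-3.1, 4.5, 6.5, 22.0):
        print(score, learn_on_tag(score), auto_learn(score))
```

With these numbers, the 4.5 spam and the 6.5 valid message are both learned (wrongly) by the first rule and skipped by the second, which is the whole point.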
Either way - auto-learn or pipe through on arrival - you still have to manually learn false positives/negatives... but with the built-in auto-learn you aren't *reducing* the accuracy until you catch it.
Do you think it is better to batch sa-learn on a whole mailbox, or is there no difference?
Batch-running sa-learn gives you the chance to verify things, but takes more effort, and there's a delay before Bayes gets the new data. Piping to sa-learn based on "X-Spam-Status: Yes" is less administration, but reduces the usefulness of Bayes by reinforcing errors. Using auto-learn is a good compromise: you don't learn *everything* automatically, but most of it is automatic and you run much less risk of polluting the data.
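If you do go the batch route, a small wrapper is enough. The sketch below assumes sa-learn is on your PATH, that you keep hand-verified mail in mbox files, and that the paths shown (which are hypothetical) exist on your system. It defaults to a dry run that only prints the commands.

```python
import subprocess


def sa_learn_batch_cmd(kind, mbox_path):
    """Build an sa-learn command line for one hand-verified mbox file."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, "--mbox", mbox_path]


def run_batch(jobs, dry_run=True):
    """Run (or, in a dry run, just print) one sa-learn command per (kind, path) job."""
    for kind, path in jobs:
        cmd = sa_learn_batch_cmd(kind, path)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical mailbox locations; substitute your own verified folders.
    run_batch([
        ("spam", "/home/user/mail/verified-spam"),
        ("ham", "/home/user/mail/verified-ham"),
    ])
```

Call run_batch(..., dry_run=False) from cron or by hand once you have looked over the folders.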
Kelson Vibber
SpeedGate Communications <www.speed.net>
_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk