On Fri, 28 Jan 2011, David F. Skoll wrote:

> On Fri, 28 Jan 2011 18:10:08 +0000
> Dominic Benson <domi...@lenny.cus.org> wrote:
>
> > Recently, in order to balance the ham/spam ratio given to sa-learn, I
> > have started to pass mail submitted by authenticated users to
> > sa-learn --ham.
>
> > I haven't seen any mention of this strategy on-list or on the web, so
> > I'm interested in whether (a) anyone else does this, and (b) is there
> > a good reason not to do it that I haven't thought of?
>
> It's possibly a good idea, but you want to be really careful of one
> thing: Make sure your users are savvy enough not to have their
> accounts phished. It'll take just one compromised account that blasts
> out a spam run to destroy the usefulness of your Bayes data.

Amen to that. It's sad how many supposedly educated people (say,
engineering professors ;) fall for phishes and get their accounts
pwned. 419 spammers love to target university systems: semi-clueless
users and fat pipes.

One other semi-issue with that strategy: half of Bayes is based upon
header contents. Your outgoing messages are not going to have headers
that are representative of incoming messages.

-- 
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{