Re: Training Bayes on outbound mail

David B Funk Fri, 28 Jan 2011 11:25:20 -0800

On Fri, 28 Jan 2011, David F. Skoll wrote:

> On Fri, 28 Jan 2011 18:10:08 +0000
> Dominic Benson <domi...@lenny.cus.org> wrote:
>
> > Recently, in order to balance the ham/spam ratio given to sa-learn, I
> > have started to pass mail submitted by authenticated users to
> > sa-learn --ham.
>
> > I haven't seen any mention of this strategy on-list or on the web, so
> > I'm interested in whether (a) anyone else does this, and (b) is there
> > a good reason not to do it that I haven't thought of?
>
> It's possibly a good idea, but you want to be really careful of one
> thing: Make sure your users are savvy enough not to have their
> accounts phished.  It'll take just one compromised account that blasts
> out a spam run to destroy the usefulness of your Bayes data.


Amen to that. Sad how many supposedly educated people (say, engineering
professors ;) fall for phishes and get their accounts pwned. 419 spammers
love to target university systems: semi-clueless users and fat pipes.

One other semi-issue with that strategy: half of Bayes is based upon
header contents, and your outgoing messages are not going to have headers
that are representative of incoming messages.
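
For the compromised-account risk, here is a rough sketch of one possible
guard, not something from this thread's setups: stop feeding a sender's
mail to sa-learn --ham once that sender exceeds a per-hour message count.
The class name, the thresholds and the learn_ham helper are all
hypothetical; the only real command assumed is sa-learn --ham reading one
message on stdin.

```python
import subprocess
import time
from collections import defaultdict, deque

# Hypothetical thresholds; tune them for your own site.
MAX_MSGS = 20        # messages allowed per sender within the window
WINDOW_SECS = 3600   # sliding window length in seconds


class HamTrainingGate:
    """Decide whether an authenticated sender's mail is safe to train as ham."""

    def __init__(self, max_msgs=MAX_MSGS, window_secs=WINDOW_SECS):
        self.max_msgs = max_msgs
        self.window_secs = window_secs
        self._seen = defaultdict(deque)  # sender -> timestamps of recent mail
        self._blocked = set()            # senders that tripped the limit

    def allow(self, sender, now=None):
        """Record one message from sender; return True if it may be trained."""
        now = time.time() if now is None else now
        stamps = self._seen[sender]
        stamps.append(now)
        # Drop timestamps that have aged out of the sliding window.
        while stamps and now - stamps[0] > self.window_secs:
            stamps.popleft()
        if len(stamps) > self.max_msgs:
            # Looks like a bulk run: stop training on this sender until
            # an admin clears the block by hand.
            self._blocked.add(sender)
        return sender not in self._blocked


def learn_ham(raw_message, learner=("sa-learn", "--ham")):
    """Pipe one raw message (bytes) to the learner command on stdin."""
    subprocess.run(list(learner), input=raw_message, check=True)
```

A submission hook would call gate.allow(sender) first and only then
learn_ham(message). Messages learned before the limit trips stay in the
database, so a spam run that starts slowly would still need cleaning up
afterwards, for example with sa-learn --forget on the offending messages.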

-- 
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
