On Sat, 2013-11-09 at 01:34 -0200, Sergio Durigan Junior wrote:
> On Friday, November 08 2013, Karsten Bräckelmann wrote:
> > You mentioned that's a fresh install, actually not even in production
> > yet. The Bayes sub-system requires some training (minimum of 200 ham and
> > spam each) by default, before Bayes rules kick in for scanning.
> >
> > Instead of -c check only, use the -R option to print the report. You'll
> > notice there is no BAYES_xx rule (yet).
>
> Thanks. I had used -R before, without much success. But yeah, I found
> some discussions on this list about Bayes databases, and people saying
> that at least 200 messages are needed before Bayes can start doing its
> job.
>
> BTW, one spam has just sneaked in right now. On the one hand I'm sad
> because of those false-negatives, but OTOH I'm happy because I'll be
> able to train the database faster :-).

You don't have any kind of archive of spam? If you do, train on the
recent ones. Feel free to exceed the minimum limit, but don't bother too
much with old spam. It changes much faster over time than ham does.

Also, at least until you have reached the minimum required training, do
train with identified spam, too. Same with ham. For now, keep the spam
to ham training ratio somewhere around 1:1.

> > > service, etc. I was expecting that I'd get a high rate after feeding
> > > the spam to SpamAssassin, but that's not happening. Any suggestions?
> >
> > In addition to required initial training:
> >
> > The Bayesian classifier works on a per-token (think: word) basis. Thus,
> > depending on the tokens in the message and existing ones in the db, the
> > impact of learning can vary quite a lot -- from hardly noticeable to
> > clear detection.
>
> All right. Since I don't have a good database yet (only 4 or 5 spams
> learned), I won't worry about it for now. Let's see when I have a
> bigger DB...

Do train. Spam, as well as ham. If you have some recent-ish archives.

> Thanks a lot,

You're welcome.
:)

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
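
For anyone following the thread: the 200 ham / 200 spam threshold mentioned above can be checked against the learned-message counters that `sa-learn --dump magic` prints. The snippet below is only a rough sketch. It assumes the dump has five whitespace-separated columns with the counter value in the third one and a label such as `non-token data: nspam` in the last; the function names `bayes_counts` and `bayes_active` are made up for illustration and are not part of SpamAssassin.

```python
MIN_TRAINED = 200  # default minimum of learned spam and of learned ham


def bayes_counts(dump_text):
    """Return (nspam, nham) parsed from `sa-learn --dump magic` style text.

    Assumed line layout (an assumption, verify against your own output):
        0.000          0          5          0  non-token data: nspam
    """
    counts = {"nspam": 0, "nham": 0}
    prefix = "non-token data: "
    for line in dump_text.splitlines():
        fields = line.split(None, 4)  # four numeric columns, then the label
        if len(fields) != 5 or not fields[4].startswith(prefix):
            continue
        name = fields[4][len(prefix):].strip()
        if name in counts:
            counts[name] = int(fields[2])  # counter sits in the third column
    return counts["nspam"], counts["nham"]


def bayes_active(nspam, nham, minimum=MIN_TRAINED):
    """Bayes rules only fire once BOTH counters have reached the minimum."""
    return nspam >= minimum and nham >= minimum


if __name__ == "__main__":
    # Made-up sample resembling a fresh install with 5 spams learned.
    sample = (
        "0.000          0          3          0  non-token data: bayes db version\n"
        "0.000          0          5          0  non-token data: nspam\n"
        "0.000          0          0          0  non-token data: nham\n"
    )
    nspam, nham = bayes_counts(sample)
    print(nspam, nham, bayes_active(nspam, nham))
```

With only a handful of spams and no ham learned, as in the sample, the check comes out false, which matches the missing BAYES_xx rule in the report.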