BAYES_00 hits 15.27% of spam on yours; the %ofspam column on the top ham
rules and the %ofham column on the top spam rules must be buggy.

I'm not running the version with the 5th column.  It must be buggy.
I'll play with it in a bit.
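
A quick sanity check on the numbers quoted below, sketched in Python. It assumes the five columns after the rule name are count, %ofrules, %ofmail, %ofspam and %ofham (the column order is an assumption, since the grep dropped the header), and it backs out the message totals from the BAYES_99 and BAYES_00 rows. Under that assumption the 622.75 figure is just the ham hit count divided by the number of spam messages, which is consistent with the cross-class column being the broken one.

```python
# Rows taken from the stats quoted in this thread.
# Assumption: the 4th column in the spam list is %ofspam and the
# 5th column in the ham list is %ofham.
SPAM_BAYES_99_HITS, SPAM_BAYES_99_PCT_OF_SPAM = 305, 46.56
HAM_BAYES_00_HITS, HAM_BAYES_00_PCT_OF_HAM = 4079, 74.76

# Back out how many messages were classified as spam and as ham.
spam_total = round(SPAM_BAYES_99_HITS / (SPAM_BAYES_99_PCT_OF_SPAM / 100))
ham_total = round(HAM_BAYES_00_HITS / (HAM_BAYES_00_PCT_OF_HAM / 100))


def pct(hits, total):
    """Percentage of `total` messages that produced `hits`, to 2 places."""
    return round(100.0 * hits / total, 2)


print(spam_total, ham_total)               # 655 5456
print(pct(HAM_BAYES_00_HITS, ham_total))   # 74.76  (ham hits / ham total)
print(pct(HAM_BAYES_00_HITS, spam_total))  # 622.75 (ham hits / spam total)
```

The same check reproduces the 117.71 shown for BAYES_50 in the ham list (771 hits over 655 spam messages).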
 
Dallas
 
 


________________________________

        From: Andy Jezierski [mailto:[EMAIL PROTECTED] 
        Sent: Wednesday, July 27, 2005 10:44 AM
        To: users@spamassassin.apache.org
        Subject: RE: generating rule stats from spamd logs
        
        

        > > > > > Another Dallas miracle!
        > > > > 
        > > > > Oh? Er, how does it determine if a message was ham or spam?
        > > > > It "looks like" it is rather random based on the reports.
        > > > > BAYES_99 may well hit on 84.33% of spam. But I doubt, given
        > > > > its score, it hits on 44.53% of ham.
        > > > 
        > 
        > The code should be right... It uses SpamAssassin's judgement, i.e.
        > 
        > "info: spamd: result: Y 20 - BAYES_99,..."
        > "info: spamd: result: . -2 - AWL,...."
        > 
        > 44.53% of your ham hit BAYES_99... That's gotta tell you something
        > is wrong!  My bayes hits break down like
        > 
        > # ./sa-stats.pl -f spamdlog -n 500 | grep BAYES
        > For spam...
        >   10    BAYES_99                        15351     4.46%   45.42%  60.57%
        >   19    BAYES_50                         6443     1.87%   19.06%  25.42%
        >   31    BAYES_80                         1154     0.34%    3.41%   4.55%
        >   32    BAYES_60                         1147     0.33%    3.39%   4.53%
        >   38    BAYES_95                          864     0.25%    2.56%   3.41%
        >  102    BAYES_00                          187     0.05%    0.55%   0.74%
        >  152    BAYES_40                           92     0.03%    0.27%   0.36%
        >  209    BAYES_20                           53     0.02%    0.16%   0.21%
        >  228    BAYES_05                           44     0.01%    0.13%   0.17%
        > 
        > For ham...
        >    2    BAYES_00                         6959    15.73%   20.59%  82.32%
        >    9    BAYES_50                          623     1.41%    1.84%   7.37%
        >   20    BAYES_40                          296     0.67%    0.88%   3.50%
        >   24    BAYES_20                          267     0.60%    0.79%   3.16%
        >   29    BAYES_05                          217     0.49%    0.64%   2.57%
        >   73    BAYES_60                           51     0.12%    0.15%   0.60%
        >  113    BAYES_99                           24     0.05%    0.07%   0.28%
        >  142    BAYES_80                           14     0.03%    0.04%   0.17%
        >  280    BAYES_95                            2     0.00%    0.01%   0.02%
        > 
        > So, BAYES_99 hits 0.28% of my ham and 60.57% of my spam.  
        > 
        > 
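
For readers following along, here is a minimal Python sketch of the kind of tallying described above: classify each spamd result line as spam (Y) or ham (.) using SpamAssassin's own verdict, then divide each rule's hit count by the number of messages in the same class. It is not the sa-stats.pl code. The regular expression, the function names and the sample lines are assumptions for illustration, based only on the two log lines quoted above.

```python
import re
from collections import Counter

# Assumed shape of a result line: "... spamd: result: <Y|.> <score> - RULE,RULE"
RESULT_RE = re.compile(r"spamd: result: ([Y.]) (-?\d+(?:\.\d+)?) - (\S*)")


def rule_stats(lines):
    """Return (messages per class, rule hit counts per class)."""
    totals = Counter()
    hits = {"spam": Counter(), "ham": Counter()}
    for line in lines:
        match = RESULT_RE.search(line)
        if match is None:
            continue  # not a spamd result line
        kind = "spam" if match.group(1) == "Y" else "ham"
        totals[kind] += 1
        for rule in filter(None, match.group(3).split(",")):
            hits[kind][rule] += 1
    return totals, hits


def percent_of(hits, totals, rule, kind):
    """Share of `kind` messages that hit `rule`, as a percentage."""
    if totals[kind] == 0:
        return 0.0
    return 100.0 * hits[kind][rule] / totals[kind]


# Hypothetical sample lines, made up for this example.
SAMPLE_LINES = [
    "info: spamd: result: Y 20 - BAYES_99,AWL",
    "info: spamd: result: . -2 - AWL,BAYES_00",
    "info: spamd: result: . -1 - BAYES_00",
    "some unrelated log line",
]

totals, hits = rule_stats(SAMPLE_LINES)
print(percent_of(hits, totals, "BAYES_00", "ham"))   # 100.0
print(percent_of(hits, totals, "AWL", "ham"))        # 50.0
print(percent_of(hits, totals, "BAYES_99", "spam"))  # 100.0
```

Note that each percentage uses the hit count and the message total from the same class; mixing the two gives figures above 100%.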
        
        So from your explanation I should be ignoring the %ofham column in
        the spam stats and the %ofspam column in the ham stats?  Otherwise
        the stats don't seem to make much sense:
        
        python# ./sa-stats -f maillog.0 -n 500 | grep BAYES 
        
        spam rules... 
           3    BAYES_99                          305     3.49    4.99    46.56    5.59
          10    BAYES_50                          172     1.97    2.81    26.26    3.15
          23    BAYES_00                          100     1.14    1.64    15.27    1.83
          77    BAYES_80                           21     0.24    0.34     3.21    0.38
          85    BAYES_95                           19     0.22    0.31     2.90    0.35
         111    BAYES_60                           14     0.16    0.23     2.14    0.26
         131    BAYES_05                           12     0.14    0.20     1.83    0.22
         186    BAYES_20                            7     0.08    0.11     1.07    0.13
         224    BAYES_40                            5     0.06    0.08     0.76    0.09
         373    SARE_BAYES_5x8                      2     0.02    0.03     0.31    0.04
         387    SARE_BAYES_6x8                      2     0.02    0.03     0.31    0.04
         412    SARE_BAYES_7x8                      2     0.02    0.03     0.31    0.04
        
        ham rules... 
           1    BAYES_00                         4079    14.05   66.75   622.75   74.76
        
        BAYES_00 hitting 622% of spam??? 
        
           6    BAYES_50                          771     2.65   12.62   117.71   14.13
          25    BAYES_40                          238     0.82    3.89    36.34    4.36
          35    BAYES_20                          190     0.65    3.11    29.01    3.48
          40    BAYES_05                          148     0.51    2.42    22.60    2.71
         173    BAYES_60                           15     0.05    0.25     2.29    0.27
         232    BAYES_80                            9     0.03    0.15     1.37    0.16
         310    BAYES_95                            5     0.02    0.08     0.76    0.09
         349    SARE_BAYES_6x6                      4     0.01    0.07     0.61    0.07
         416    SARE_BAYES_5x8                      2     0.01    0.03     0.31    0.04
         496    SARE_BAYES_5x7                      1     0.00    0.02     0.15    0.02
        
        
        
        Andy
