I would like to know how to get these stats too.

From: Robert Chalmers [mailto:rob...@chalmers.com.au]
Sent: Tuesday, March 08, 2016 5:25 AM
To: users@spamassassin.apache.org
Subject: Re: Missed spam, suggestions?

Can I ask, how are you getting these stats please?

Thanks
On 8 Mar 2016, at 05:11, David B Funk <dbf...@engineering.uiowa.edu> wrote:

On Mon, 7 Mar 2016, Charles Sprickman wrote:


I’ve been running with some daily training for a little over a week and I’m 
seeing less spam in my inbox.  A few things have slipped through because 
Bayes tipped them below the default score; these were two phishing emails.

Here’s some rule stats for anyone interested:

TOP SPAM RULES FIRED

RANK  RULE NAME               COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM

   1  TXREP                   13171      8.47    40.38    91.00   72.91
   2  HTML_MESSAGE            12714      8.18    38.98    87.85   90.80
   3  DCC_CHECK               10593      6.81    32.48    73.19   33.78
   4  RDNS_NONE               10269      6.60    31.48    70.95    5.63
   5  SPF_HELO_PASS           10070      6.48    30.87    69.58   23.41
   6  URIBL_BLACK              9711      6.25    29.77    67.10    1.58
   7  BODY_NEWDOMAIN_FMBLA     9550      6.14    29.28    65.98    1.64
   8  FROM_NEWDOMAIN_FMBLA     9483      6.10    29.07    65.52    1.36
   9  BAYES_99                 8486      5.46    26.02    58.63    1.18
  10  BAYES_999                8141      5.24    24.96    56.25    1.06

TOP HAM RULES FIRED

RANK  RULE NAME               COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM

   1  HTML_MESSAGE            16473      9.13    50.51    87.85   90.80
   2  DKIM_SIGNED             13776      7.64    42.24    13.81   75.93
   3  TXREP                   13228      7.33    40.56    91.00   72.91
   4  DKIM_VALID              12962      7.19    39.74    11.93   71.44
   5  RCVD_IN_DNSWL_NONE       9941      5.51    30.48     8.08   54.79
   6  DKIM_VALID_AU            8711      4.83    26.71     7.99   48.01
   7  BAYES_00                 8390      4.65    25.72     1.84   46.24
   8  RCVD_IN_JMF_W            7369      4.09    22.59     2.54   40.62
   9  RCVD_IN_MSPIKE_WL        6713      3.72    20.58     4.39   37.00
  10  BAYES_50                 6201      3.44    19.01    25.56   34.18

Based upon your stats, it looks like you need more Bayes training. Your Bayes 
00/99 hits should rank higher in the rules-fired stats, and BAYES_50 shouldn't 
be in the top 10 at all.
(Of course, if you've only been training for a week, that would explain it.)

For example, here are my top-10 hits (for a one-month interval).

TOP SPAM RULES FIRED
----------------------------------------------------------------------
RANK    RULE NAME                       COUNT  %OFMAIL %OFSPAM  %OFHAM  S/O
----------------------------------------------------------------------
  1    T__BOTNET_NOTRUST               114907   60.32   86.81   42.66  0.5755
  2    BAYES_99                        109138   32.98   82.45    0.01  0.9998
  3    BAYES_999                       104903   31.70   79.25    0.01  0.9999
  4    HTML_MESSAGE                    90850    79.41   68.63   86.59  0.3456
  5    URIBL_BLACK                     90845    27.61   68.63    0.27  0.9942
  6    T_QUARANTINE_1                  90640    27.40   68.47    0.02  0.9996
  7    URIBL_DBL_SPAM                  79152    24.02   59.79    0.17  0.9956
  8    KAM_VERY_BLACK_DBL              74301    22.45   56.13    0.00  1.0000
  9    L_FROM_SPAMMER1k                73667    22.26   55.65    0.00  1.0000
 10    T__RECEIVED_1                   72413    42.60   54.70   34.54  0.5135

TOP HAM RULES FIRED
----------------------------------------------------------------------
RANK    RULE NAME                       COUNT  %OFMAIL %OFSPAM  %OFHAM  S/O
----------------------------------------------------------------------
  1    BAYES_00                        182674   56.03    2.11   91.97  0.0150
  2    HTML_MESSAGE                    171992   79.41   68.63   86.59  0.3456
  3    SPF_PASS                        136623   63.08   54.52   68.78  0.3457
  4    T_RP_MATCHES_RCVD               130879   53.75   35.54   65.89  0.2644
  5    T__RECEIVED_2                   125492   53.76   39.62   63.18  0.2947
  6    DKIM_SIGNED                     114808   38.57    9.72   57.80  0.1008
  7    DKIM_VALID                      105385   34.70    7.16   53.06  0.0825
  8    RCVD_IN_DNSWL_NONE              92951    29.90    4.56   46.80  0.0609
  9    T__BOTNET_NOTRUST               84741    60.32   86.81   42.66  0.5755
 10    KHOP_RCVD_TRUST                 84623    26.44    2.19   42.60  0.0331

Note how highly BAYES 00/99 ranked. What you don't see is that BAYES_50 is way 
down in the mud (ranked below 50th).

BTW, this is with a Bayes that is mostly fed via auto-learning. I occasionally
hand-feed corner cases that get misclassified (usually things like phishes, or 
conference announcements that can look shaky).
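
The S/O column in the two tables above is not defined in the thread. Working
backwards from the rows, it appears to be the share of a rule's hits that
landed on spam: spam hits / (spam hits + ham hits). The Python sketch below
checks that reading against two rules that appear in both tables, and shows
how the other columns could be tallied. It is an illustration only, not the
script either poster used; `rule_stats`, `s_over_o`, and the
`(is_spam, rule_names)` input format are hypothetical.

```python
from collections import Counter


def s_over_o(spam_hits, ham_hits):
    """Fraction of a rule's hits that were on spam (inferred S/O definition)."""
    return spam_hits / (spam_hits + ham_hits)


def rule_stats(messages):
    """Tally per-rule columns from (is_spam, rule_names) pairs.

    The input format is an assumption made for this sketch; a real setup
    would first have to extract these pairs from its own mail logs.
    """
    spam_hits, ham_hits = Counter(), Counter()
    n_spam = n_ham = 0
    for is_spam, rules in messages:
        if is_spam:
            n_spam += 1
            spam_hits.update(set(rules))  # count each rule once per message
        else:
            n_ham += 1
            ham_hits.update(set(rules))

    stats = {}
    for rule in set(spam_hits) | set(ham_hits):
        s, h = spam_hits[rule], ham_hits[rule]
        stats[rule] = {
            "spam_count": s,
            "ham_count": h,
            "pct_of_mail": 100.0 * (s + h) / (n_spam + n_ham),
            "pct_of_spam": 100.0 * s / n_spam if n_spam else 0.0,
            "pct_of_ham": 100.0 * h / n_ham if n_ham else 0.0,
            "s_over_o": s_over_o(s, h),
        }
    return stats


# Counts copied from the one-month tables above (spam-table count, ham-table count).
print(round(s_over_o(90850, 171992), 4))   # HTML_MESSAGE, table lists 0.3456
print(round(s_over_o(114907, 84741), 4))   # T__BOTNET_NOTRUST, table lists 0.5755
```

Read this way, an S/O near 1.0 means a rule fires almost only on spam and an
S/O near 0.0 means it fires almost only on ham, which is the pattern BAYES_99
(0.9998) and BAYES_00 (0.0150) show in the tables above.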


--
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{

Robert Chalmers
rob...@chalmers.com.au  Quantum Radio: http://tinyurl.com/lwwddov
Mac mini 6.2 - 2012, Intel Core i7, 2.3 GHz, Memory: 16 GB. El-Capitan 10.11. XCode 7.2.1
2TB: Drive 0: HGST HTS721010A9E630. Upper bay. Drive 1: ST1000LM024 HN-M101MBB. Lower Bay


