TL;DR
You want Dallas Engelken's "sa-stats.pl", NOT the one from SA.
This is confusing because there are two different programs named
"sa-stats.pl".
The one that comes with SpamAssassin (what you're referring to) is an
engine stats reporting tool; it does not do rule-hits analysis.
The tool that Charles Sprickman and I used is the one from Dallas
Engelken.
See: http://wiki.apache.org/spamassassin/StatsAndAnalyzers
Be sure to search that page for references to Dallas Engelken.
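To make "rule hits analysis" concrete, here is a minimal Python sketch of the idea: tally which rules fired on spam and on ham, then rank them. This is not Dallas Engelken's script and not its algorithm. The log line shape (`spamd: result: Y 12 - RULE_A,RULE_B scantime=...`) is an assumption about how spamd logs results on a typical install, and the names `count_rule_hits` and `top_rules` are made up for illustration.

```python
import re
from collections import Counter

# ASSUMPTION: spamd result lines look like
#   "... spamd: result: Y 12 - BAYES_99,HTML_MESSAGE scantime=1.2,size=..."
# where "Y" marks spam and "." marks ham. Adjust the pattern to your logs.
RESULT_RE = re.compile(r"spamd: result: ([Y.]) (\S+) - (\S*)\s*scantime=")


def count_rule_hits(lines):
    """Return (spam_hits, ham_hits, n_spam, n_ham) from an iterable of log lines."""
    spam_hits, ham_hits = Counter(), Counter()
    n_spam = n_ham = 0
    for line in lines:
        match = RESULT_RE.search(line)
        if not match:
            continue  # not a result line
        rules = [r for r in match.group(3).split(",") if r]
        if match.group(1) == "Y":
            n_spam += 1
            spam_hits.update(rules)
        else:
            n_ham += 1
            ham_hits.update(rules)
    return spam_hits, ham_hits, n_spam, n_ham


def top_rules(hits, total_messages, n=10):
    """Top-n rules as (rule, count, percent of messages in that class)."""
    return [
        (rule, count, 100.0 * count / total_messages if total_messages else 0.0)
        for rule, count in hits.most_common(n)
    ]
```

Feeding it the lines of a mail log (for example `count_rule_hits(open(path))`, with `path` pointing at wherever your spamd output goes) and passing the resulting counters to `top_rules` gives a rough equivalent of the "TOP SPAM RULES FIRED" and "TOP HAM RULES FIRED" listings quoted further down.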
On Fri, 11 Mar 2016, Robert Chalmers wrote:
The sa-stats.pl I refer to is here:
https://spamassassin.apache.org/full/3.0.x/dist/tools/sa-stats.pl
It’s not the same as the ones shown in other posts. I don’t know what
those are.
It has output like this:
zeus:~ robert$ perl sa-stats.pl
Report Title : SpamAssassin - Spam Statistics
Report Date : 2016-03-11
Period Beginning : Fri 11 Mar 00:00:00 2016
Period Ending : Sat 12 Mar 00:00:00 2016
Reporting Period : 24.00 hrs
--------------------------------------------------
Note: 'ham' = 'nonspam'
Total spam detected : 22 ( 51.16%)
Total ham accepted : 21 ( 48.84%)
-------------------
Total emails processed : 43 ( 2/hr)
Average spam threshold : 3.00
Average spam score : 4.46
Average ham score : -2.10
Spam kbytes processed : 397 ( 17 kb/hr)
Ham kbytes processed : 147 ( 6 kb/hr)
Total kbytes processed : 545 ( 23 kb/hr)
Spam analysis time : 339 s ( 14 s/hr)
Ham analysis time : 366 s ( 15 s/hr)
Total analysis time : 706 s ( 29 s/hr)
Statistics by Hour
----------------------------------------------------
Hour Spam Ham
------------- ----------------- --------------
2016-03-11 00 0 ( 0%) 13 (100%)
2016-03-11 01 0 ( 0%) 0 ( 0%)
2016-03-11 02 2 (100%) 0 ( 0%)
2016-03-11 03 4 (100%) 0 ( 0%)
2016-03-11 04 4 ( 57%) 3 ( 42%)
2016-03-11 05 6 ( 75%) 2 ( 25%)
2016-03-11 06 6 (100%) 0 ( 0%)
2016-03-11 07 0 ( 0%) 3 (100%)
2016-03-11 08 0 ( 0%) 0 ( 0%)
2016-03-11 09 0 ( 0%) 0 ( 0%)
2016-03-11 10 0 ( 0%) 0 ( 0%)
2016-03-11 11 0 ( 0%) 0 ( 0%)
2016-03-11 12 0 ( 0%) 0 ( 0%)
2016-03-11 13 0 ( 0%) 0 ( 0%)
2016-03-11 14 0 ( 0%) 0 ( 0%)
2016-03-11 15 0 ( 0%) 0 ( 0%)
2016-03-11 16 0 ( 0%) 0 ( 0%)
2016-03-11 17 0 ( 0%) 0 ( 0%)
2016-03-11 18 0 ( 0%) 0 ( 0%)
2016-03-11 19 0 ( 0%) 0 ( 0%)
2016-03-11 20 0 ( 0%) 0 ( 0%)
2016-03-11 21 0 ( 0%) 0 ( 0%)
2016-03-11 22 0 ( 0%) 0 ( 0%)
2016-03-11 23 0 ( 0%) 0 ( 0%)
Done. Report generated in 1 sec by sa-stats.pl, version 6256.
On 10 Mar 2016, at 21:38, Erickarlo Porro <epo...@earthcam.com> wrote:
I would like to know how to get these stats too.
From: Robert Chalmers [mailto:rob...@chalmers.com.au]
Sent: Tuesday, March 08, 2016 5:25 AM
To: users@spamassassin.apache.org
Subject: Re: Missed spam, suggestions?
Can I ask, how are you getting these stats please?
Thanks
On 8 Mar 2016, at 05:11, David B Funk <dbf...@engineering.uiowa.edu>
wrote:
On Mon, 7 Mar 2016, Charles Sprickman wrote:
I’ve been running with some daily training for a little over a week and
I’m seeing less spam in my inbox. I’ve seen a few things slip through
because Bayes tipped them below the default score; these were two
phishing emails.
Here are some rule stats for anyone interested:
TOP SPAM RULES FIRED
RANK  RULE NAME             COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
   1  TXREP                 13171      8.47    40.38    91.00   72.91
   2  HTML_MESSAGE          12714      8.18    38.98    87.85   90.80
   3  DCC_CHECK             10593      6.81    32.48    73.19   33.78
   4  RDNS_NONE             10269      6.60    31.48    70.95    5.63
   5  SPF_HELO_PASS         10070      6.48    30.87    69.58   23.41
   6  URIBL_BLACK            9711      6.25    29.77    67.10    1.58
   7  BODY_NEWDOMAIN_FMBLA   9550      6.14    29.28    65.98    1.64
   8  FROM_NEWDOMAIN_FMBLA   9483      6.10    29.07    65.52    1.36
   9  BAYES_99               8486      5.46    26.02    58.63    1.18
  10  BAYES_999              8141      5.24    24.96    56.25    1.06
TOP HAM RULES FIRED
RANK  RULE NAME             COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
   1  HTML_MESSAGE          16473      9.13    50.51    87.85   90.80
   2  DKIM_SIGNED           13776      7.64    42.24    13.81   75.93
   3  TXREP                 13228      7.33    40.56    91.00   72.91
   4  DKIM_VALID            12962      7.19    39.74    11.93   71.44
   5  RCVD_IN_DNSWL_NONE     9941      5.51    30.48     8.08   54.79
   6  DKIM_VALID_AU          8711      4.83    26.71     7.99   48.01
   7  BAYES_00               8390      4.65    25.72     1.84   46.24
   8  RCVD_IN_JMF_W          7369      4.09    22.59     2.54   40.62
   9  RCVD_IN_MSPIKE_WL      6713      3.72    20.58     4.39   37.00
  10  BAYES_50               6201      3.44    19.01    25.56   34.18
Based upon your stats it looks like you need more Bayes training. Your Bayes
00/99 hits should rank higher in the rules-fired
stats and BAYES_50 shouldn't be in the top-10 at all.
(Of course, if you've only been training for a week, that would explain it.)
For example, here's my top-10 hits (for a one month interval).
TOP SPAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM S/O
----------------------------------------------------------------------
1 T__BOTNET_NOTRUST 114907 60.32 86.81 42.66 0.5755
2 BAYES_99 109138 32.98 82.45 0.01 0.9998
3 BAYES_999 104903 31.70 79.25 0.01 0.9999
4 HTML_MESSAGE 90850 79.41 68.63 86.59 0.3456
5 URIBL_BLACK 90845 27.61 68.63 0.27 0.9942
6 T_QUARANTINE_1 90640 27.40 68.47 0.02 0.9996
7 URIBL_DBL_SPAM 79152 24.02 59.79 0.17 0.9956
8 KAM_VERY_BLACK_DBL 74301 22.45 56.13 0.00 1.0000
9 L_FROM_SPAMMER1k 73667 22.26 55.65 0.00 1.0000
10 T__RECEIVED_1 72413 42.60 54.70 34.54 0.5135
TOP HAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME COUNT %OFMAIL %OFSPAM %OFHAM S/O
----------------------------------------------------------------------
1 BAYES_00 182674 56.03 2.11 91.97 0.0150
2 HTML_MESSAGE 171992 79.41 68.63 86.59 0.3456
3 SPF_PASS 136623 63.08 54.52 68.78 0.3457
4 T_RP_MATCHES_RCVD 130879 53.75 35.54 65.89 0.2644
5 T__RECEIVED_2 125492 53.76 39.62 63.18 0.2947
6 DKIM_SIGNED 114808 38.57 9.72 57.80 0.1008
7 DKIM_VALID 105385 34.70 7.16 53.06 0.0825
8 RCVD_IN_DNSWL_NONE 92951 29.90 4.56 46.80 0.0609
9 T__BOTNET_NOTRUST 84741 60.32 86.81 42.66 0.5755
10 KHOP_RCVD_TRUST 84623 26.44 2.19 42.60 0.0331
Note how highly BAYES 00/99 ranked. What you don't see is that BAYES_50 is way
down in the mud (below 50 rank).
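The S/O column is not defined anywhere in this thread. The counts in the two tables above are consistent with reading it as spam hits divided by total hits (spam plus ham) for a rule, so here is a small Python sketch under that assumption. The function name `so_ratio` is made up for illustration; only the two rules that appear in both tables have both counts available.

```python
def so_ratio(spam_count, ham_count):
    """Share of a rule's hits that were on spam: spam / (spam + ham).

    ASSUMPTION: this is what the S/O column reports; it is inferred from
    the numbers in the tables, not taken from the script's documentation.
    """
    total = spam_count + ham_count
    if total == 0:
        return None  # rule never fired, ratio undefined
    return spam_count / total


# (rule, COUNT in the spam table, COUNT in the ham table)
rows = [
    ("T__BOTNET_NOTRUST", 114907, 84741),
    ("HTML_MESSAGE", 90850, 171992),
]
for rule, spam_count, ham_count in rows:
    print(f"{rule:<20} {so_ratio(spam_count, ham_count):.4f}")
```

Both values come out matching the S/O column in the tables (0.5755 and 0.3456). On this reading, a value near 1.0 means the rule fires almost only on spam (BAYES_99, URIBL_BLACK), a value near 0.0 means almost only on ham (BAYES_00), and a value around 0.5 means the rule says little either way.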
BTW, this is with a Bayes that is mostly fed via auto-learning. I occasionally
hand feed corner cases that get mis-classified (usually things like phishes, or
conference announcements that can look shaky).
--
Dave Funk University of Iowa
<dbfunk (at) engineering.uiowa.edu> College of Engineering
319/335-5751 FAX: 319/384-0549 1256 Seamans Center
Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
Robert Chalmers
rob...@chalmers.com.au Quantum Radio: http://tinyurl.com/lwwddov
Mac mini 6.2 - 2012, Intel Core i7,2.3 GHz, Memory:16 GB. El-Capitan 10.11.
XCode 7.2.1
2TB: Drive 0:HGST HTS721010A9E630. Upper bay. Drive 1:ST1000LM024 HN-M101MBB.
Lower Bay