I'd code it in myself, but I'm not fluent in Perl (PHP guy), and of course the string-parsing functions confuse the hell out of me. LOL. I thought there might be a lot of Perl coders here who could make this a snap. [Recipient-domain-based filtering, and a date range as well.]

Thanks so much!

--
Matthew Yette
Senior Engineer - NOC/Operations
MA Polce Consulting, Inc.
[EMAIL PROTECTED]
315-838-1644 (w)
315-356-0597 (f)
AIM/Yahoo: MAPolceNOC
MSN: [EMAIL PROTECTED]

-----Original Message-----
From: Matthew Yette
Sent: Thursday, July 28, 2005 12:07 PM
To: users@spamassassin.apache.org
Subject: RE: generating rule stats from spamd logs

Is there any way to modify this code to accept another command-line argument for domain-specific stats? Meaning, I want to look for all rule hits for mail destined for domain.com.

--
Matthew Yette

-----Original Message-----
From: Dallas L. Engelken [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 27, 2005 1:02 PM
To: users@spamassassin.apache.org
Subject: RE: generating rule stats from spamd logs

My mistake. It is fixed, hopefully for good.

v0.9 - http://www.rulesemporium.com/programs/sa-stats.txt

TOP SPAM RULES FIRED
------------------------------------------------------------
RANK    RULE NAME                  COUNT %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
------------------------------------------------------------
   1    UNPARSEABLE_RELAY          25322     7.35    74.72    99.76   99.13
   2    URIBL_SBL                  22241     6.46    65.63    87.63    0.38
   3    URIBL_JP_SURBL             21419     6.22    63.20    84.39    0.28
   4    URIBL_BLACK                19436     5.64    57.35    76.57    0.93
   5    RAZOR2_CF_RANGE_51_100     17562     5.10    51.82    69.19    1.34
   6    RAZOR2_CHECK               17475     5.07    51.57    68.85    1.15
   7    SARE_SPEC_ROLEX_REP        16553     4.81    48.84    65.22    0.29
   8    SPOOF_COM2OTH              16537     4.80    48.80    65.15    0.05
   9    RAZOR2_CF_RANGE_E8_51_100  16329     4.74    48.18    64.33    0.16
  10    BAYES_99                   15380     4.47    45.38    60.59    0.28
------------------------------------------------------------

TOP HAM RULES FIRED
------------------------------------------------------------
RANK    RULE NAME                  COUNT %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
------------------------------------------------------------
   1    UNPARSEABLE_RELAY           8433    18.93    24.88    99.76   99.13
   2    BAYES_00                    7005    15.72    20.67     0.74   82.34
   3    AWL                         4904    11.01    14.47    26.64   57.65
   4    HTML_MESSAGE                3813     8.56    11.25    22.92   44.82
   5    NO_REAL_NAME                1453     3.26     4.29    37.79   17.08
   6    HTML_80_90                  1279     2.87     3.77    10.98   15.03
   7    MIME_HTML_ONLY               972     2.18     2.87     6.88   11.43
   8    HTML_FONT_BIG                794     1.78     2.34     9.28    9.33
   9    BAYES_50                     625     1.40     1.84    25.40    7.35
  10    HTML_FONT_FACE_BAD           545     1.22     1.61     0.76    6.41
------------------------------------------------------------

________________________________
From: Steve Martin [mailto:[EMAIL PROTECTED]
Sent: Wednesday, July 27, 2005 11:44 AM
To: Andy Jezierski
Cc: Dallas L. Engelken; users@spamassassin.apache.org
Subject: Re: generating rule stats from spamd logs

He only fixed the spam rules section. The TOP HAM RULES section still has these two incorrect computations:

    my $perc2=sprintf("%.2f",($HAM_RULES{$key}/$NUM_SPAM)*100);
    my $perc3=sprintf("%.2f",($SPAM_RULES{$key}/$NUM_HAM)*100);

As written, they compute:

    Number of times a rule fired on ham / total number of spam messages.
    Number of times a rule fired on spam / total number of ham messages.

They should be:

    my $perc2=sprintf("%.2f",($SPAM_RULES{$key}/$NUM_SPAM)*100);
    my $perc3=sprintf("%.2f",($HAM_RULES{$key}/$NUM_HAM)*100);

On Jul 27, 2005, at 11:32 AM, Andy Jezierski wrote:

"Dallas L. Engelken" <[EMAIL PROTECTED]> wrote on 07/27/2005 11:26:54 AM:

> > -----Original Message-----
> > From: Chris Thielen [mailto:[EMAIL PROTECTED]
> > Sent: Wednesday, July 27, 2005 11:02 AM
> > To: Dallas L. Engelken
> > Cc: users@spamassassin.apache.org
> > Subject: Re: generating rule stats from spamd logs
> >
> > Dallas L. Engelken wrote:
> >
> > >BAYES_00 hits 15.27 of spam on yours, the %ofspam on top ham rules and
> > >%ofham on top spam rules must be buggy.
> > >
> > >i'm not running that version with the 5th column. It must be buggy.
> > >i play with it after bit.
> > >
> > >Dallas
> > >
> >
> > Dallas,
> >
> > Did you see the patch I sent to the SARE list? Just need to
> > swap two hash lookups.
> >
> >
>
> Yup yup. http://www.rulesemporium.com/programs/sa-stats.txt updated.
>
> D

Something's still a little fishy. SA 3.1 latest SVN, if it makes any difference.

python# ./sa-stats -f maillog.0 -n 5

Email:    6111  Autolearn:   226  AvgScore:   2.15  AvgScanTime:  3.91 sec
Spam:      655  Autolearn:   133  AvgScore:  14.81  AvgScanTime:  3.76 sec
Ham:      5456  Autolearn:    93  AvgScore:   0.63  AvgScanTime:  3.93 sec

Time Spent Running SA:         6.64 hours
Time Spent Processing Spam:    0.68 hours
Time Spent Processing Ham:     5.96 hours

TOP SPAM RULES FIRED
------------------------------------------------------------
RANK    RULE NAME                  COUNT %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
------------------------------------------------------------
   1    HTML_MESSAGE                 496     5.67     8.12    75.73   62.19
   2    DCC_CHECK                    310     3.55     5.07    47.33    7.02
   3    BAYES_99                     305     3.49     4.99    46.56    0.02
   4    RAZOR2_CHECK                 277     3.17     4.53    42.29    4.23
   5    DIGEST_MULTIPLE              251     2.87     4.11    38.32    2.42
------------------------------------------------------------

TOP HAM RULES FIRED
------------------------------------------------------------
RANK    RULE NAME                  COUNT %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
------------------------------------------------------------
   1    BAYES_00                    4079    14.05    66.75   622.75    1.83
   2    HTML_MESSAGE                3393    11.68    55.52   518.02    9.09
   3    NO_REAL_NAME                1053     3.63    17.23   160.76    1.06
   4    HTML_80_90                   931     3.21    15.23   142.14    2.35
   5    LG_4C_2V_3C                  798     2.75    13.06   121.83    2.20
------------------------------------------------------------

--
Steve Martin http://www.cheezmo.com/
Smart Calibration, LLC http://www.smartcalibration.com/
The Widescreen Movie Center http://www.widemovies.com/
Letterboxed Movie TV Schedule http://www.widemovies.com/lbx.html
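
The swapped hash lookups discussed in this thread account for the impossible %OFSPAM values in the TOP HAM RULES report above (622.75 for BAYES_00): the 4079 ham hits were divided by the 655 spam messages. Below is a minimal Python sketch of that arithmetic. It is not the actual Perl from sa-stats. The dictionary and function names are made up for illustration and only mirror the script's %SPAM_RULES and %HAM_RULES hashes, and the spam hit count of 100 for BAYES_00 is an inference from the 1.83 and 15.27 figures quoted in the thread, not a value taken from a log.

```python
# Message totals taken from the sa-stats run quoted in the thread.
NUM_SPAM = 655
NUM_HAM = 5456

# Per-rule hit counts, keyed like the script's %SPAM_RULES / %HAM_RULES hashes.
# The ham count for BAYES_00 is printed in the report. The spam count (100) is
# an ASSUMPTION inferred from the printed 1.83 and 15.27 percentages.
HAM_RULES = {"BAYES_00": 4079}
SPAM_RULES = {"BAYES_00": 100}


def pct(hits, total):
    """Return hits as a percentage of total, rounded to two decimals."""
    return round(hits / total * 100, 2) if total else 0.0


def buggy_ham_row(rule):
    # Swapped lookups: ham hits over the spam total, spam hits over the ham total.
    return pct(HAM_RULES[rule], NUM_SPAM), pct(SPAM_RULES[rule], NUM_HAM)


def fixed_ham_row(rule):
    # Each hit count is divided by the total for its own class.
    return pct(SPAM_RULES[rule], NUM_SPAM), pct(HAM_RULES[rule], NUM_HAM)


print("buggy %OFSPAM, %OFHAM:", buggy_ham_row("BAYES_00"))
print("fixed %OFSPAM, %OFHAM:", fixed_ham_row("BAYES_00"))
```

Run as is, this prints (622.75, 1.83) for the buggy pair and (15.27, 74.76) for the corrected one. The corrected %OFSPAM agrees with the 15.27 that Dallas quoted for BAYES_00.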