Hi,

On Fri, 23 Jan 2004 17:15:19 -0600 "Smart,Dan" <[EMAIL PROTECTED]>
wrote:

> I would suggest you use SpamStats from http://www.gryzor.com/tools/
> I ran both, and SpamStats reported more of both Spam and Ham.  I
> suspect sa-stats is missing some records.

Actually, I'm starting to believe the opposite:

# rm /tmp/spamcompare.txt; (/usr/local/bin/raa_spamstats.pl -file \
/var/log/mail > /tmp/spamcompare.txt; \
~apthorpe/pjx/sa-stats/sa-stats.pl -l /var/log/mail -s "Jan 27 23:55" \
-e now >> /tmp/spamcompare.txt)

##### /tmp/spamcompare.txt #####
File /var/log/mail : from Jan 27 23:55:08 to Jan 28 04:11:39
Total number of emails processed by the spam filter : 76
Number of spams                         :         3 (  3.95%)
Number of clean messages                :        73 ( 96.05%)
Average message analysis time           :     10.15 seconds
Average spam analysis time              :      4.90 seconds
Average clean message analysis time     :     10.36 seconds
Average message score                   :    -31.18
Average spam score                      :     28.63
Average clean message score             :    -33.57
Total spam volume                       :        38 kbytes
Total clean volume                      :       551 kbytes

Report Title     : SpamAssassin - Spam Statistics
Report Date      : 2004-01-28
Period Beginning : Tue Jan 27 23:55:00 2004
Period Ending    : Wed Jan 28 04:13:52 2004

Reporting Period : 4.31 hrs
--------------------------------------------------

Note: 'ham' = 'nonspam'

Total spam detected    :        3 (   3.85%)
Total ham accepted     :       75 (  96.15%)
                        -------------------
Total emails processed :       78 (   18/hr)

Average spam threshold :        6.30
Average spam score     :       28.63
Average ham score      :      -33.57

Spam kbytes processed  :       38   (    9 kb/hr)
Ham kbytes processed   :      551   (  128 kb/hr)
Total kbytes processed :      590   (  137 kb/hr)

Spam analysis time     :       14 s (    3 s/hr)
Ham analysis time      :      776 s (  180 s/hr)
Total analysis time    :      791 s (  183 s/hr)


Statistics by Hour
-------------------------------------
Hour                 Spam         Ham
--------------   --------    --------
2004-01-27, 23          0           7
2004-01-28, 00          1          26
2004-01-28, 01          2          12
2004-01-28, 02          0          18
2004-01-28, 03          0          10
2004-01-28, 04          0           2


Done. Report generated in 0 sec by /home/apthorpe/pjx/sa-stats/sa-stats.pl, version 1.13.

Running
`egrep 'spamd\[[0-9]*\]' /var/log/mail | egrep -c 'clean message|identified spam'`
turns up 78 lines, which matches the sa-stats.pl total. I haven't had
the time to walk through spamstats.pl yet (I've hacked on this code as
well, so it's not like I'm playing favorites...)
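
For anyone who wants to repeat that cross-check with a per-hour breakdown, here is a rough Python sketch of the same two-stage filter. It is not taken from either script. The sample lines are invented, and the syslog timestamp prefix ("Jan 28 00:14:22 host ...") is an assumption about how /var/log/mail is laid out.

```python
import re
from collections import Counter

# Same two filters as the egrep pipeline: a spamd[PID] tag, then a verdict phrase.
SPAMD_TAG = re.compile(r"spamd\[[0-9]*\]")
VERDICT = re.compile(r"clean message|identified spam")
# Assumption: classic syslog prefix, e.g. "Jan 28 00:14:22 host ...".
STAMP = re.compile(r"^(\w{3})\s+(\d{1,2})\s+(\d{2}):")

# Invented sample lines for illustration only; not real log data.
SAMPLE = [
    "Jan 27 23:58:01 mx spamd[411]: clean message (-33.1/6.3) for bob:1000 in 9.8 seconds, 7012 bytes.",
    "Jan 28 00:14:22 mx spamd[411]: identified spam (28.6/6.3) for bob:1000 in 4.9 seconds, 12900 bytes.",
    "Jan 28 00:14:30 mx spamd[411]: connection from localhost [127.0.0.1] at port 40112",
    "Jan 28 00:20:00 mx sendmail[500]: clean message queue",
]


def tally(lines):
    """Return (totals, by_hour) counters for spamd verdict lines."""
    totals = Counter()
    by_hour = Counter()
    for line in lines:
        if not SPAMD_TAG.search(line):
            continue  # not a spamd line, same as the first egrep
        verdict = VERDICT.search(line)
        if verdict is None:
            continue  # spamd line without a verdict, same as the second egrep
        kind = "spam" if verdict.group(0) == "identified spam" else "ham"
        totals[kind] += 1
        stamp = STAMP.match(line)
        hour = "{} {:0>2} {}".format(*stamp.groups()) if stamp else "unknown"
        by_hour[(hour, kind)] += 1
    return totals, by_hour


if __name__ == "__main__":
    totals, by_hour = tally(SAMPLE)
    print("spam:", totals["spam"], "ham:", totals["ham"])
    for (hour, kind), count in sorted(by_hour.items()):
        print(hour, kind, count)
```

To run it against a real log, pass an open file instead of SAMPLE, for example `tally(open("/var/log/mail", errors="replace"))`, and compare the hourly counts with the "Statistics by Hour" table from sa-stats.pl.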

The results are inconsistent: spamstats reports 76 messages (3 spam, 73
clean), while sa-stats.pl reports 78 (3 spam, 75 ham). It's 4am local
time, I've had three glasses of wine (hooray for Australian reds!), and
I'm burned out from studying for Oracle certification, so I don't know
yet what's wrong. Given that sa-stats.pl and egrep agree on the totals,
though, I suspect spamstats.pl (probably a regex failing on an edge
case...). I need to run a known test case through both scripts and then
analyze it by hand to figure out what's broken. Any other JAPHs out
there who want to take a crack at analyzing both scripts for consistency?
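
One way to hunt for that kind of edge case is to list every line the loose egrep-style filter counts but a stricter parser rejects. The Python sketch below does that. The STRICT pattern is a hypothetical stand-in, not the regex spamstats.pl actually uses, and the sample lines (including the integer score "-40") are invented to show the idea.

```python
import re

# Loose filter: the same two tests as the egrep pipeline.
LOOSE_TAG = re.compile(r"spamd\[[0-9]*\]")
LOOSE_VERDICT = re.compile(r"clean message|identified spam")

# Hypothetical strict pattern (an assumption, NOT spamstats.pl's real regex):
# it insists on decimal scores, a decimal duration, and a byte count.
STRICT = re.compile(
    r"spamd\[\d+\]: (clean message|identified spam) "
    r"\((-?\d+\.\d+)/(\d+\.\d+)\) for (\S+) "
    r"in (\d+\.\d+) seconds, (\d+) bytes"
)

# Invented sample lines for illustration only; not real log data.
SAMPLE = [
    "Jan 28 01:02:03 mx spamd[411]: clean message (-33.1/6.3) for bob:1000 in 9.8 seconds, 7012 bytes.",
    "Jan 28 01:05:09 mx spamd[411]: identified spam (28.6/6.3) for bob:1000 in 4.9 seconds, 12900 bytes.",
    "Jan 28 01:07:44 mx spamd[411]: clean message (-40/6.3) for bob:1000 in 11.0 seconds, 3100 bytes.",
]


def disagreements(lines):
    """Return lines the loose filter counts but the strict pattern rejects."""
    missed = []
    for line in lines:
        counted = LOOSE_TAG.search(line) and LOOSE_VERDICT.search(line)
        if counted and not STRICT.search(line):
            missed.append(line)
    return missed


if __name__ == "__main__":
    for line in disagreements(SAMPLE):
        print("counted by egrep, rejected by strict pattern:", line)
```

Swapping the real pattern from either script into STRICT and feeding it the actual log should surface the messages the two reports disagree on, if a regex is in fact the cause.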

-- Bob


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
