> -----Original Message-----
> From: Brian Ipsen [mailto:[EMAIL PROTECTED]
> Sent: Sunday, August 24, 2003 10:29 AM
> To: [EMAIL PROTECTED]
> Subject: [Qmail-scanner-general]Suggestion: Option to archive
> all messages tagged by SpamAssassin
>
>
> Hi!
>
> I miss an option, where it is possible to specify that
> qmail-scanner should archive all mails that SpamAssassin
> identifies as spam. The reason for this is that I'd like to
> be able to gather statistics on what rules are triggered on
> each message - and I can only do this either by storing a
> copy of each message - or enabling debug-log in SpamAssassin,
> which unfortunately requires some disk-space. The other way
> around I'm able to process each message and store the needed
> data in an SQL database - and afterwards delete the message.
>
> Regards,
>
> /Brian
>
I would start by pulling the tests=(.*) part out of the X-Spam-Status header and splitting the matching tests:

    @tests = split(/,/, $1);

With 2.60-x you will run into a small problem: the TERSE report is no longer an on|off option. The X-Spam-Status _REPORT_ is automatically TERSE and will fold at 78 chars, so on emails that match a large number of rules your tests= will look like

    X-Spam-Status: Yes, hits=14.6 required=5.0 tests=BAYES_99,CLICK_BELOW_CAPS,
            DATE_MISSING,SUBJ_HAS_SPACES,SUBJ_HAS_UNIQ_ID autolearn=no
            version=2.60-rc2

so $1 will hold only the part of the list on the first physical line (for the header above, "BAYES_99,CLICK_BELOW_CAPS,"; with scores appended, something like "BAYES_99=5.4,CLICK_BELOW_CAPS=0.5,DATE_MISSING=1.917,") and will not grab the fold. You would need to set a $next_header=0 flag and watch for \t's that mark a header continuation. It'll take a little work, but it will be much easier than anything else you are thinking about doing (IMHO).

Then, once you have all the rules in @tests, you can

    foreach my $test (@tests) {
        $sql = "INSERT INTO test_hits (msgid,rule,score) VALUES (?,?,?)";
        ..
        ..
        # undef is DBI's NULL; pass 0 instead if the score column is NOT NULL
        $sth->execute($msgid, $test, undef);
    }

I use the score field, so in the X-Spam-Status: header I run _TESTSSCORES(,)_ ("as above, except with scores appended (eg. AWL=-3.0,...)") instead of _TESTS(,)_ ("tests hit separated by , (or other separator)"). Then in my foreach loop I split again on the = sign:

    foreach my $test (@tests) {
        $sql = "INSERT INTO test_hits (msgid,rule,score) VALUES (?,?,?)";
        ..
        ..
        my ($rule, $score) = split(/=/, $test);
        $sth->execute($msgid, $rule, $score);
    }

My test_hits table contains 5 columns:

    CREATE TABLE test_hits (
      id int(10) unsigned NOT NULL auto_increment,
      msgid varchar(254) NOT NULL default '',
      rule varchar(64) NOT NULL default '',
      score float(5,2) NOT NULL default '0.00',
      t timestamp(14) NOT NULL,
      PRIMARY KEY (id),
      KEY msgid (msgid),
      KEY rule (rule)
    ) TYPE=MyISAM COMMENT='SpamAssassin Rule Matches';

with indexes on msgid and rule, so I can easily show all rules that match for a specific msgid, or show how many messages match a certain rule.
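For what it's worth, here is a rough sketch of the unfold-then-split step described above. It is only one way to do it, not something taken from qmail-scanner or SpamAssassin: the sub names unfold_headers and spam_tests are made up for illustration, and it assumes you already have the raw header block as one string. Instead of tracking a $next_header flag line by line, it joins continuation lines (those starting with a space or tab) onto the previous header in one pass, then pulls out the tests= list and splits each entry on the = sign.

```perl
use strict;
use warnings;

# Join folded header lines: a line starting with a space or tab
# continues the previous header. Returns one string per header.
sub unfold_headers {
    my ($raw) = @_;
    $raw =~ s/\r?\n[ \t]+/ /g;
    return split /\r?\n/, $raw;
}

# Return a list of [rule, score] pairs from X-Spam-Status.
# score is undef when the header carries rule names only (_TESTS(,)_).
sub spam_tests {
    my ($raw) = @_;
    for my $header (unfold_headers($raw)) {
        next unless $header =~ /^X-Spam-Status:/i;
        # A fold inside the list leaves ", " behind; squeeze it back to ","
        (my $status = $header) =~ s/,\s+/,/g;
        next unless $status =~ /\btests=(\S*)/;
        my $list = $1;
        return map {
            my ($rule, $score) = split /=/, $_, 2;
            [ $rule, $score ];
        } grep { length } split /,/, $list;
    }
    return;
}

# Made-up sample header, folded in the middle of the tests= list
my $raw = join "\n",
    'X-Spam-Status: Yes, hits=14.6 required=5.0 tests=BAYES_99=5.4,CLICK_BELOW_CAPS=0.5,',
    "\tDATE_MISSING=1.917 autolearn=no version=2.60-rc2",
    'Subject: test';

for my $hit (spam_tests($raw)) {
    my ($rule, $score) = @$hit;
    printf "%s %s\n", $rule, defined $score ? $score : 'NULL';
}
```

Each [rule, score] pair can then be handed to $sth->execute($msgid, $rule, $score) exactly as in the loops above.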
You could extend this as needed to include env_sender, recips, spam score, etc.

Enjoy, and good luck!

dallas