Hi John,
I am quite sure that the script is running and that $DOMAIN and $SPAM are
set correctly (I define them earlier in the script, in a part not shown
here), because I get a copy of each message in $DIRCOLLECTSPAM and nothing
in the learning folder, /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur/*

I ran the dump command you suggested, and it gave me this:

0.000          0          3          0  non-token data: bayes db version
0.000          0       1337          0  non-token data: nspam
0.000          0          6          0  non-token data: nham
0.000          0      41188          0  non-token data: ntokens
0.000          0  920269009          0  non-token data: oldest atime
0.000          0 1213715208          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0          0          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count



-----Original Message-----
From: John Hardin [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 17, 2008 3:50 AM
To: NGSS
Cc: users@spamassassin.apache.org; [EMAIL PROTECTED]
Subject: RE: SA experts needed here - SPAM examples

On Tue, 17 Jun 2008, NGSS wrote:

> HI,
> Thanks for the response.
>
> May I know how I can capture the output of the sa trainer?

Well, if you're running the script from cron, stdout and stderr should 
automatically be emailed to the owner of the cron job - unless you are 
explicitly redirecting that output.
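
If you would rather keep the output in a file than rely on cron's email, a minimal sketch along these lines should work. `run_logged` and the log path in the usage note are hypothetical names of mine, not anything that ships with SpamAssassin.

```shell
#!/bin/sh
# Hypothetical helper: run a command and append both its stdout and
# stderr to a log file, preceded by a timestamped header line.
# The return status is that of the command itself.
run_logged() {
    logfile=$1
    shift
    {
        echo "=== $(date) : $*"
        "$@"
    } >>"$logfile" 2>&1
}
```

It could be used as `run_logged /var/log/sa-train.log /usr/bin/sa-learn --spam ./*`, assuming the cron user can write to that file.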

> I am using the following script to do the training:
>
> cd /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur
> /usr/bin/sa-learn --spam ./*
> cp -a /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur/* $DIRCOLLECTSPAM
> rm -rf /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur/*
> cd /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/new
> /usr/bin/sa-learn --spam ./*
> cp -a /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/new/* $DIRCOLLECTSPAM
> rm -rf /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/new/*

Do you see a report of how many messages were seen and how many were 
learned from when you run that interactively? You should see the same 
output in email from the cron job.

Question: what sets $DOMAIN and $SPAM for the cron job? Remember, cron 
scripts start out with an empty environment. The cron job may not be 
learning anything because the directory paths are screwed up due to 
$DOMAIN and/or $SPAM not being set.
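
One way to guard against that, sketched under the assumption that the script is plain POSIX sh, is to check the variables before any path is built from them. `require_vars` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: return non-zero if any named variable is
# unset or empty, reporting the first offender on stderr.
require_vars() {
    for name in "$@"; do
        # Indirect lookup of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Required variable $name is not set" >&2
            return 1
        fi
    done
}
```

Near the top of the script this would be called as `require_vars DOMAIN SPAM DIRCOLLECTSPAM || exit 1`. For a single variable, the shell's own `${DOMAIN:?DOMAIN is not set}` expansion does much the same job.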

It's a good idea to do something like this for dynamic paths:

   if ! cd "/home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur"
   then
     echo "Could not cd to /home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur"
     exit 1
   fi
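
Since the same steps are repeated for cur and new, the check can be folded into one function. This is only a sketch: `learn_dir` and the `SA_LEARN` override are hypothetical names, and unlike the original script it archives and deletes the messages only if sa-learn exits successfully.

```shell
#!/bin/sh
# Path to the learner; overridable so the function can be dry-run.
SA_LEARN="${SA_LEARN:-/usr/bin/sa-learn}"

# Hypothetical helper: learn every message in one Maildir
# subdirectory as spam, copy the messages to an archive directory,
# then remove them from the source directory.
learn_dir() {
    dir=$1
    archive=$2
    if ! cd "$dir"; then
        echo "Could not cd to $dir" >&2
        return 1
    fi
    # If the directory is empty the glob stays unexpanded; do nothing.
    set -- ./*
    [ -e "$1" ] || return 0
    # Stop here if learning fails, so no message is deleted unlearned.
    "$SA_LEARN" --spam "$@" || return 1
    cp -a "$@" "$archive"/ && rm -f "$@"
}
```

The script body then reduces to `learn_dir "/home/vpopmail/domains/$DOMAIN/$SPAM/Maildir/cur" "$DIRCOLLECTSPAM"` and the same call for new.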

> I do the same for the ham in another section of the same script, which is
> not shown here.

Good.

You might want to add this to the end of your script to get bayes database 
stats afterward:

   /usr/bin/sa-learn --dump magic
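
If you only care about the message counts, the dump can be filtered with awk. This is a rough sketch that assumes the usual column layout of the magic dump, with the value in the third column and the field name last. `bayes_count` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: read `sa-learn --dump magic` output on stdin
# and print the value of one single-word field, such as nspam or nham.
bayes_count() {
    awk -v field="$1" '$NF == field { print $3 }'
}
```

For example, `/usr/bin/sa-learn --dump magic | bayes_count nham` prints the number of ham messages learned. The counts matter because Bayes scoring is only applied once both reach the configured minimums (bayes_min_spam_num and bayes_min_ham_num, which I believe default to 200 each).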

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  [EMAIL PROTECTED]    FALaholic #11174     pgpk -a [EMAIL PROTECTED]
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   Where We Want You To Go Today 07/05/07: Microsoft patents in-OS
   adware architecture incorporating spyware, profiling, competitor
   suppression and delivery confirmation (U.S. Patent #20070157227)
-----------------------------------------------------------------------
  2 days until SWMBO's Birthday
