Hello M.Lewis,

Saturday, October 15, 2005, 10:40:29 PM, you wrote:


ML> Is there a best practices recommendation for how often to run sa-learn ?

IMO, the best approach (least system impact, least problematic) is to
run sa-learn on each individual message as it's identified.

However, that's probably not feasible/practical for most systems. When
emails are batched (such as moved/copied to "spam" and "not-spam"
folders, or uploaded via ftp, or whatever), then obviously you need to
also batch the sa-learn runs.

In my experience, sa-learn cycles every 10 minutes have worked very
well.

It costs almost nothing, when the ham/spam files are empty, to run:
>  while : ; do                             # run without end
>  while [[ ! -s pause-learn ]]; do         # allow for pause
>  for file in /...mail-spool.../*.*am ; do # loop thru learn spools
>    if [[ $file == *.spam ]]               # spam spool? (assumes a .spam suffix)
>    then sa-learn --spam "$file"
>    else sa-learn --ham  "$file"
>    fi
>  done
>  sleep 600    # sleep 10 min between each cycle
>  done
>  if [[ -s expire-learn ]]                 # daily expire?
>  then sa-learn --force-expire ; rm expire-learn
>  fi
>  sleep 600    # sleep 10 min when paused
>  done
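
One pass of the loop above could be sketched as a function like the
one below. The learn_cycle name, the .spam/.ham file suffixes, and the
SA_LEARN override (handy for a dry run) are assumptions for
illustration, not something taken from a real setup. Empty or missing
spool files are skipped, which is why an idle cycle costs so little.

```shell
# Hypothetical helper: one learning pass over a spool directory.
# SA_LEARN is an assumed override so the pass can be dry-run or stubbed.
SA_LEARN=${SA_LEARN:-sa-learn}

learn_cycle() {
  lc_dir=$1
  for lc_file in "$lc_dir"/*.*am; do
    # Skip empty spools, and the unexpanded pattern when nothing matches.
    [ -s "$lc_file" ] || continue
    case $lc_file in
      *.spam) "$SA_LEARN" --spam "$lc_file" ;;   # assumed spam suffix
      *.ham)  "$SA_LEARN" --ham  "$lc_file" ;;   # assumed ham suffix
    esac
  done
}
```

Calling learn_cycle with the spool directory as its only argument,
followed by a sleep, would give one iteration of the inner loop.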

The problems I've run into are:
1) the files being fed into sa-learn get excessively large; frequent
runs minimize that exposure.
2) sa-learn tries to do its expire while other things are running; so
add a once-a-day expire, and pause the learning runs while it runs.
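
The pause/expire hand-off in point 2 relies on two flag files that the
loop tests for being non-empty. A minimal sketch of the side that sets
and clears them might look like this; the function names and the
directory argument are assumptions, and something like a daily cron
job would be expected to call them.

```shell
# Hypothetical companion to the learning loop. Names are illustrative.

# Ask the loop to pause and to run its daily expire.
request_expire() {
  re_dir=$1
  # Write the expire flag first so it is already in place when the
  # loop notices the pause. Both files must be non-empty for -s.
  date > "$re_dir/expire-learn"
  date > "$re_dir/pause-learn"
}

# Lift the pause once the loop has removed expire-learn.
# Returns non-zero while the expire is still pending.
resume_learning() {
  rl_dir=$1
  if [ -e "$rl_dir/expire-learn" ]; then
    return 1
  fi
  rm -f "$rl_dir/pause-learn"
}
```

A caller would run request_expire once, then retry resume_learning
every few minutes until it succeeds.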

Bob Menschel
