Hello all,

I'm sorry I don't have more time to put into this, because it would be nice
to put together a proper patch set for this. However, I hacked together an
interface between qmail-scanner and Gary Arnold's Bayespam package at
http://www.garyarnold.com/projects.php and thought I'd share it with the
group. Hopefully somebody out there cares enough to make this a regular
feature. =)

First, I made a new module sub-bayesian.pl:
-------------------------------------------
sub bayesian {
  # Only run bayes_spam_check if mail is from a "remote" SMTP client
  return if (defined($ENV{'RELAYCLIENT'}));

  # gettimeofday and tv_interval come from Time::HiRes, which
  # qmail-scanner-queue.pl is expected to load already
  my ($start_bayesian_time)=[gettimeofday];
  my ($stop_bayesian_time,$bayesian_time);
  my ($bayes_status)=0;

  # These should really be configurable options
  my ($bayes_bin)="/usr/local/bin/bayes_spam_check.pl";
  my ($bayes_db)="/tmp/rating.db";
  my ($output);

  &debug("bayesian: run $bayes_bin -r $bayes_db < $scandir/$wmaildir/new/$file_id");

  $output=`$bayes_bin -r $bayes_db < $scandir/$wmaildir/new/$file_id 2>&1`;
  # $? is the raw wait status; any positive value is treated as spam
  $bayes_status=$?;

  if ($bayes_status > 0) {
    $quarantine_description="Bayesian spam filter: spam detected";
    &debug($quarantine_description);
    $quarantine_event=$quarantine_description . "\n" . $output;
  }

  $stop_bayesian_time=[gettimeofday];
  $bayesian_time = tv_interval ($start_bayesian_time, $stop_bayesian_time);
  &debug("bayesian: finished scan of file \"$scandir/$wmaildir/new/$file_id\" in $bayesian_time secs");
}
-------------------------------------------

The filter location and database are hard-coded - to any enterprising
individual who feels like making this a configure option, I salute you!
Bayespam doesn't output anything useful, so I can't trap interesting tidbits
like the score it applied to each message, but maybe I'll hack that in
later.
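
To see by hand how the filter rates a single saved message, the call the
module makes can be repeated from a shell. This is only a sketch: it
assumes, as the module does, that a nonzero exit status from
bayes_spam_check.pl means spam. The BAYES_BIN and BAYES_DB variables and
the classify_message function are names invented here for illustration,
and the separate handling of statuses 126 and 127 (command not runnable)
is an addition the module does not have.

```shell
#!/bin/sh
# Hypothetical helper: rate one saved message the way the module does.
# BAYES_BIN and BAYES_DB default to the paths hard-coded in the module.
BAYES_BIN="${BAYES_BIN:-/usr/local/bin/bayes_spam_check.pl}"
BAYES_DB="${BAYES_DB:-/tmp/rating.db}"

classify_message() {
    # $1: file containing one raw message
    [ -r "$1" ] || { echo "error: cannot read $1"; return 1; }
    status=0
    "$BAYES_BIN" -r "$BAYES_DB" < "$1" >/dev/null 2>&1 || status=$?
    case "$status" in
        0)       echo "ham" ;;
        126|127) echo "error: cannot run $BAYES_BIN" ;;
        *)       echo "spam" ;;   # assumption: nonzero exit means spam
    esac
}

# Rate the file given on the command line, if any
if [ "$#" -gt 0 ]; then
    classify_message "$1"
fi
```

Saved as, say, rate-message.sh (a made-up name), it would be run as
"sh rate-message.sh /path/to/message".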

I also hacked the configure script and hard-coded an entry to include the
filter at line 1014:

SCANNER_ARRAY="$SCANNER_ARRAY,\"bayesian\""

Once this was done, I did the following:
1. Ran
     ./configure --log-details yes --admin myadmin --domain mydomain.com \
       --scanners auto
2. Copied the script to /var/qmail/bin
3. chmod 4755 /var/qmail/bin/qmail-scanner-queue.pl

4. Unpacked the Bayespam source
5. Installed the binaries to /usr/local/bin
6. Set up nospam@ and spam@ users on the server that will do the processing.
7. Set up maildir for those two users.
8. Forwarded a bunch of saved spam to spam@myfilterhost.
9. Forwarded, well, most of my inbox to nospam@myfilterhost.
10. Made a script to run via cron every hour:
      bayes_process_email.pl -g /home/nospam/.maildir/new \
        -s /home/spam/.maildir/new -o /tmp/rating.db -r 1

11. In my case, I had the suidperl issue, so I installed the
qmail-scanner-queue.c application.

12. Installed the filter by setting tcp.smtp up so:
  192.168.:allow,RELAYCLIENT="",QMAILQUEUE="/var/qmail/bin/qmail-queue"
  127.0.0.1:allow,RELAYCLIENT="",QMAILQUEUE="/var/qmail/bin/qmail-queue"
  :allow,QMAILQUEUE="/var/qmail/bin/qmail-scanner-queue"
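
A missing comma between two variable assignments is easy to overlook in
a rules file like this one. The sketch below checks for that slip before
rebuilding the rules database. It assumes your setup compiles the rules
with tcprules from ucspi-tcp; check_rules and rebuild_rules are
hypothetical helper names, and the rules file path is whatever your
install uses.

```shell
#!/bin/sh
# Hypothetical pre-flight check for a tcp.smtp rules file.
# check_rules prints any line where a closing quote runs straight into
# the next NAME="..." assignment, i.e. the separating comma is missing.
check_rules() {
    if grep -nE '"[A-Z_]+="' "$1"; then
        return 1    # at least one suspect line was printed
    fi
    return 0
}

rebuild_rules() {
    # $1: rules file; tcprules writes $1.cdb via the temporary file $1.tmp
    check_rules "$1" && tcprules "$1.cdb" "$1.tmp" < "$1"
}

# Rebuild from the rules file given on the command line, if any
if [ "$#" -gt 0 ]; then
    rebuild_rules "$1"
fi
```

The check is deliberately narrow: it only catches a quote followed
directly by an upper-case variable name, not every possible syntax error.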

This is pretty neat. I've processed dozens of messages so far, with only 1
false positive and only 2 false negatives. Users can improve it on their
own by forwarding incoming spam to spam@myfilterhost, which has two
advantages:
a. Administrators don't have to hunt for keywords or other things to filter
on, which is always a pain in the rear.
b. Forwarding mail to the spam@ address to add it to the spam corpus is
VERY effective at reducing future spam. It finds even unrelated items that
are statistically similar.
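
Forwarded messages presumably only take effect once the hourly job from
step 10 has rebuilt the rating database, so it is worth making that job
easy to run and check. A minimal sketch of a wrapper for it follows. The
variable names, the run_training function, and the script itself are
hypothetical; the command line is the one from step 10 with only the
paths parameterised.

```shell
#!/bin/sh
# Hypothetical wrapper around the hourly retraining command from step 10.
# Defaults mirror the paths used in this post.
GOOD_DIR="${GOOD_DIR:-/home/nospam/.maildir/new}"
SPAM_DIR="${SPAM_DIR:-/home/spam/.maildir/new}"
RATING_DB="${RATING_DB:-/tmp/rating.db}"
TRAINER="${TRAINER:-bayes_process_email.pl}"

run_training() {
    # Same arguments as step 10; only the paths are parameterised
    "$TRAINER" -g "$GOOD_DIR" -s "$SPAM_DIR" -o "$RATING_DB" -r 1
}

# Run only when the trainer is installed, so a missing install gives
# a clear message in the mail cron sends
if command -v "$TRAINER" >/dev/null 2>&1; then
    run_training
else
    echo "retrain skipped: $TRAINER not found in PATH" >&2
fi
```

Installed under a name of your choosing, it could be called from an
hourly crontab entry in place of the raw command.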

The only angst right now is that quarantined mail never makes it on to the
real mail server. It stays in the quarantine, and the notification message
doesn't contain enough information to be worth sending the user. Thus, we've
set up Pine on the box so administrators can come along and clean out the
maildir for the quarantine location Bayespam creates. So far this hasn't
been a problem because false positives have been so rare, but some day I
will probably hack the script so the quarantine message contains the body of
the e-mail as well. That way administrators can forward it right to the user
and CC: it to nospam@myfilterhost.

Good luck. I'm happy to entertain emails about what I did if more details
are required, but please don't ask me for tech support because I just don't
have the time to serve in that capacity.

Regards,
Chad


_______________________________________________
Qmail-scanner-general mailing list
https://lists.sourceforge.net/lists/listinfo/qmail-scanner-general
