Re: autolearn_discriminator callback not getting called.

psychobyte Fri, 10 May 2013 08:58:16 -0700

So I was able to write a plugin that overrides the AWL check_from_in_auto_whitelist() eval rule. Thanks for the help, Karsten.


===== /etc/spamassassin/25_AwlIgnore.cf


##AWL ignore address types (periods in names are not supported)
## @see  AwlIgnore.pm
awl_ignore_from         postmaster mailer-daemon

## Overridden AWL params
#
header AWL eval:awl_ignore_check_from_in_auto_whitelist()
describe AWL                    From: address is in the auto white-list
tflags AWL                      userconf noautolearn
priority AWL                    1000


===== /etc/spamassassin/init_AwlIgnore.pre

## Enable the AWL Ignore plugin
loadplugin Mail::SpamAssassin::Plugin::AwlIgnore

===== AwlIgnore.pm

package Mail::SpamAssassin::Plugin::AwlIgnore;
#

# This plugin overrides the AWL check_from_in_auto_whitelist() method in order
# to keep specific types of addresses out of the whitelist database.
# For example, don't add addresses like postmas...@example.com to the AWL database.

#
# To activate this plugin,
# 1) Enable the Mail::SpamAssassin::Plugin::AWL plugin
# 2) Enable this plugin, e.g. loadplugin Mail::SpamAssassin::Plugin::AwlIgnore
#      check /etc/spamassassin/init_AwlIgnore.pre
# 3) Update your config. Check 25_AwlIgnore.cf. Should look something like this:

#
#
### AWL ignore address types (periods in names are not supported)
## @see  AwlIgnore.pm
# awl_ignore_from         postmaster mailer-daemon
#
## Overridden AWL params
#
# header AWL eval:awl_ignore_check_from_in_auto_whitelist()
# describe AWL                    From: address is in the auto white-list
# tflags AWL                      userconf noautolearn
# priority AWL                    1000
#
#

# @todo - support ignoring local addresses w/ "." in them, i.e. user.name...@example.com




use Mail::SpamAssassin::Plugin;
use strict;

use vars qw(@ISA);
@ISA = qw(Mail::SpamAssassin::Plugin);

# constructor: register the eval rule
sub new {
  my $class = shift;
  my $mailsaobject = shift;

  # some boilerplate...
  $class = ref($class) || $class;
  my $self = $class->SUPER::new($mailsaobject);
  bless ($self, $class);
  $self->set_config($mailsaobject->{conf});

  $self->register_eval_rule ('awl_ignore_check_from_in_auto_whitelist');
  return $self;
}

#
# Load params from config
#
sub set_config {
  my($self, $conf) = @_;
  my @cmds;

=item awl_ignore_from

Ignore address types from going into the AWL database.

=cut

  push (@cmds, {
        setting => 'awl_ignore_from',
        type => $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST
  });
    $conf->{parser}->register_commands(\@cmds);
}

#
# Replace check_from_in_auto_whitelist()
#
sub awl_ignore_check_from_in_auto_whitelist {
    my ($self, $pms) = @_;

    return 0 unless ($pms->{conf}->{use_auto_whitelist});

    my $timer = $self->{main}->time_method("total_awl");

    my $from = lc $pms->get('From:addr');
    return 0 unless $from =~ /\S/;

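   ## ignore addresses in awl_ignore_from
   ## (anchored and quoted so "postmaster" matches only postmaster@...,
   ## not e.g. notpostmaster@...)
   foreach (keys %{$pms->{conf}->{awl_ignore_from}}) {
     if ($from =~ /^\Q$_\E\@/) {
       dbg("auto-whitelist: AWL ignoring ". $from);
       return 0;
     }
   }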

   # find the earliest usable "originating IP".  ignore private nets
   my $origip;

   foreach my $rly (reverse (@{$pms->{relays_trusted}},
                             @{$pms->{relays_untrusted}}))
   {
     next if ($rly->{ip_private});
     if ($rly->{ip}) {
       $origip = $rly->{ip}; last;
     }
   }

   my $scores = $pms->{conf}->{scores};
   my $tflags = $pms->{conf}->{tflags};
   my $points = 0;
   my $signedby = $pms->get_tag('DKIMDOMAIN');
   undef $signedby  if defined $signedby && $signedby eq '';

   foreach my $test (@{$pms->{test_names_hit}}) {
     # ignore tests with 0 score in this scoreset,
     # or if the test is marked as "noautolearn"
     next if !$scores->{$test};
     next if exists $tflags->{$test} && $tflags->{$test} =~ /\bnoautolearn\b/;

     $points += $scores->{$test};
   }

   my $awlpoints = (sprintf "%0.3f", $points) + 0;

   # Create the AWL object
   my $whitelist;
   eval {
     $whitelist = Mail::SpamAssassin::AutoWhitelist->new($pms->{main});

     my $meanscore;
     { # check
       my $timer = $self->{main}->time_method("check_awl");
       $meanscore = $whitelist->check_address($from, $origip, $signedby);
     }
     my $delta = 0;

     dbg("auto-whitelist: AWL active, pre-score: %s, autolearn score: %s, ".
         "mean: %s, IP: %s, address: %s %s",
         $pms->{score}, $awlpoints,
         !defined $meanscore ? 'undef' : sprintf("%.3f",$meanscore),
         $origip || 'undef',
         $from,  $signedby ? "signed by $signedby" : '(not signed)');

     if (defined $meanscore) {
       $delta = $meanscore - $awlpoints;
       $delta *= $pms->{main}->{conf}->{auto_whitelist_factor};

       $pms->set_tag('AWL', sprintf("%2.1f",$delta));
       if (defined $meanscore) {
         $pms->set_tag('AWLMEAN', sprintf("%2.1f", $meanscore));
       }
       $pms->set_tag('AWLCOUNT', sprintf("%2.1f", $whitelist->count()));
       $pms->set_tag('AWLPRESCORE', sprintf("%2.1f", $pms->{score}));
     }

     # Update the AWL *before* adding the new score, otherwise
     # early high-scoring messages are reinforced compared to
     # later ones. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=159704
     if (!$pms->{disable_auto_learning}) {
       my $timer = $self->{main}->time_method("update_awl");
       $whitelist->add_score($awlpoints);
     }

     # now redundant, got_hit() takes care of it
     # for my $set (0..3) {  # current AWL score changes with each hit
     #   $pms->{conf}->{scoreset}->[$set]->{"AWL"} = sprintf("%0.3f",$delta);
     # }

     if ($delta != 0) {
       $pms->got_hit("AWL", "AWL: ", ruletype => 'eval',
                     score => sprintf("%0.3f", $delta));
     }

     $whitelist->finish();
     1;
    } or do {
      my $eval_stat = $@ ne '' ? $@ : "errno=$!";  chomp $eval_stat;
      warn("auto-whitelist-awl: open of auto-whitelist file failed: $eval_stat\n");

      # try an unlock, in case we got that far
      eval { $whitelist->finish(); } if $whitelist;
      return 0;
    };

    dbg("auto-whitelist: post auto-whitelist score: %.3f", $pms->{score});

    # test hit is above
    return 0;
}


sub dbg { Mail::SpamAssassin::dbg (@_); }

1;
=====
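
For anyone reading along, the score adjustment the plugin applies (delta = (mean - current points) * auto_whitelist_factor) can be checked in isolation. This is only a minimal sketch: awl_delta() is a hypothetical helper, not part of SpamAssassin, and the factor of 0.5 and the scores are assumed values for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the arithmetic in the plugin above:
# delta = (mean - current points) * factor.
sub awl_delta {
    my ($meanscore, $awlpoints, $factor) = @_;
    return 0 unless defined $meanscore;   # no history for this sender yet
    return ($meanscore - $awlpoints) * $factor;
}

# Assumed values, for illustration only.
my $factor = 0.5;
printf "%.3f\n", awl_delta(2.0,   10.0, $factor);   # prints -4.000
printf "%.3f\n", awl_delta(undef, 10.0, $factor);   # prints 0.000
```

So a sender with a historical mean of 2.0 whose current message scores 10.0 would get pulled 4 points back toward the mean, and a sender with no history gets no adjustment.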








On 05/08/2013 04:53 PM, Karsten Bräckelmann wrote:

On Wed, 2013-05-08 at 09:18 -0700, psychobyte wrote:

My goal is to keep several types of addresses from getting into the AWL
database.  So I'm creating a plugin that will set a lower priority for
autolearn_discriminator(), run the autolearn_discriminator callback,
check a list of ignored address types, then call
inhibit_further_callbacks() to stop the address from getting into the AWL
database.

Does that process sound right? Below is a test implementation, however
it doesn't seem to work. The autolearn_discriminator callback doesn't
get called. What am I doing wrong?

First and foremost, the AWL plugin has been disabled by default in the
3.3 branch. Since that required commenting out the loadplugin line in
v310.pre, it is possible that change was not performed on an upgrade
from 3.2. Or you might have decided to re-enable it site-wide.

Anyway, are you sure you mean the AWL database? The Auto Whitelist (AWL)
plugin, better described as a general, fading history score averager,
was designed to protect from FPs by identifying known senders based on
the address and netblock. Your goal to prevent certain address types
from being added to the database matches the AWL approach.

The autolearn_discriminator callback is used exclusively by Bayes, which
is something completely different.


Your approach of inhibit_further_callbacks() would work with e.g. Bayes
and callback registered code. It would not work with AWL, though, which
uses an eval() rule.

Moreover, regardless of the approach, callback or not -- unless you
either require a lot of custom configuration options, changing patterns
during runtime, or a ton of complicated code -- you are likely making
this harder than it needs to be.

What about a simple patch, testing the address against a couple REs
specifying the "several types of addresses" you want to exclude from
AWL.

See M::SA::Plugin::AWL.pm check_from_in_auto_whitelist(). Conveniently,
the code features an example right where you should inject your code.

   return 0 unless $from =~ /\S/;

That is right after the local $from variable gets initialized with the
address to use for AWL. And it does what you want: skip AWL (by ending
the eval() rule early, returning 0) if the address does (not) match a
pattern.

If your list of address types (patterns) is going to be mostly static, a
very few commands there would get you what you want.
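
A minimal sketch of what such a check might look like, runnable on its own. The pattern list and the awl_should_ignore() helper are hypothetical names for illustration, not part of SpamAssassin. In a patched check_from_in_auto_whitelist(), the equivalent would presumably be a single "return 0 if ..." line right after the existing "return 0 unless $from =~ /\S/;".

```perl
use strict;
use warnings;

# Hypothetical patterns; adjust to the address types you want to exclude.
my @awl_ignore_res = (
    qr/^postmaster\@/,
    qr/^mailer-daemon\@/,
);

# Hypothetical helper: returns 1 if the address should be skipped by AWL.
sub awl_should_ignore {
    my ($from) = @_;
    $from = lc $from;                 # AWL also lowercases the From address
    for my $re (@awl_ignore_res) {
        return 1 if $from =~ $re;
    }
    return 0;
}

for my $addr ('Postmaster@example.com', 'alice@example.com') {
    printf "%s => %s\n", $addr, awl_should_ignore($addr) ? 'ignore' : 'keep';
}
```

The patterns are anchored at the start of the local part, so an address such as notpostmaster@example.com would still be tracked by AWL.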
