Martin Gregorie wrote:

> OTOH I have a similar plot. The idea is that mail from an exact address
> that I've previously sent mail to will not be spam. My system consists
> of two parts:
>
> - the first automatically records every address I've sent mail to.
>   This uses a table in a PostgreSQL database which contains the
>   address and a manually set flag to show whether the address is
>   within one of my domains.
>
> - the second part is an SA plugin and a rule. The plugin checks
>   whether an email's sender address matches the database and is not
>   one of my marked addresses. If this condition is met the email is
>   whitelisted.
>
>   The 'my domain' flag is needed to weed out cases where the sender
>   address is forged using a valid address in my domain. I only filter
>   incoming mail: this would not work if I filtered internal mail.
>
> I built the plugin by modifying the SentOutDB.pm plugin from
> http://whatever.frukt.org/mdf-sentoutdb.text.shtml

Yes, this is a useful technique to cut down false positives.

It was introduced with amavisd-new-2.4.2 (June 2006), where the feature
is known as 'pen pals'. Initially it only dealt with envelope sender
and recipient address pairs, but was later extended to take into
account the Message-ID, References and In-Reply-To header fields,
which extended its usefulness to mailing list threads.

From the release notes:

- new feature: "pen pals soft-whitelisting" lowers spam score of received
  replies (or followup correspondence) to a message previously sent by a
  local user to this address;

  How it works:
  * SQL logging stores records about all mail messages processed by amavisd,
    their sender, recipients, delivery status, mail contents type (no changes
    there, this feature was introduced with amavisd-new-2.3.0); for the
    purpose of pen pals scheme only records with local-domain senders matter;
  * when a message is received, a SQL lookup against a SQL logging database
    is performed, looking for previous messages sent in reverse direction,
    i.e. from a local user (which is now a recipient of the current mail)
    to the address that is now the sender of the message being processed;
    A SELECT clause in $sql_clause{'sel_penpals'} is used, which by default
    only considers records of previous messages that were actually
    delivered (not rejected, discarded or bounced), and were not infected.
    SQL lookup returns a timestamp of the most recent such message (if any),
    the difference (in seconds) between the current time and the timestamp
    is an 'age' as used in the following formula;
  * an exponential decay formula calculates score points to be deducted
    from the SA score:
      weight = 1 / 2^(age/penpals_halflife)
      score_boost = -penpals_bonus_score * weight
    i.e. penpals_bonus_score is multiplied by 1, 1/2, 1/4, 1/8, 1/16, ...
    at age 0, 1*halflife, 2*halflife, 3*halflife, 4*halflife ...
    weight is a continuous function of age (actually, in steps of one second);
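
To make the decay formula above concrete, here is a rough sketch in Python
(not the actual amavisd code, which is Perl). The function name, the clamping
of negative ages to zero and the sample values for the half-life and the bonus
score are my own assumptions for illustration, not amavisd defaults:

```python
def penpals_score_boost(age_seconds, bonus_score, halflife_seconds):
    """Return the (negative) score adjustment for a reply of the given age.

    Implements the formula quoted from the release notes:
        weight      = 1 / 2^(age / penpals_halflife)
        score_boost = -penpals_bonus_score * weight
    """
    # Assumption: a negative age (e.g. clock skew) is treated as age 0.
    if age_seconds < 0:
        age_seconds = 0
    weight = 1.0 / 2.0 ** (age_seconds / halflife_seconds)
    return -bonus_score * weight


# Illustrative values only: a 7-day half-life and a bonus of 5 points.
halflife = 7 * 24 * 3600
bonus = 5.0

# Ages of 0, 1, 2, 3 and 4 half-lives give weights 1, 1/2, 1/4, 1/8, 1/16.
for n in range(5):
    print(n, penpals_score_boost(n * halflife, bonus, halflife))
```

With these sample values, a reply arriving one half-life after the original
outgoing message gets half of the bonus deducted from its SA score, and the
deduction keeps halving with every further half-life.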


Mark
