I've been using Bayes and auto-whitelisting for only several
weeks, but on a site-wide basis (i.e. spamc/spamd called from
procmail, which in turn is called from sendmail).

I've been very happy with the Bayes learner, but today a
pretty bad spam got into one of the mailing lists I manage.  Here are
some of the headers and the first paragraph of the spam
(indented by 4):

----------------------------

    Received: from stat.math.ethz.ch (hypatia [129.132.58.23])
            by stat.math.ethz.ch (8.12.9/8.12.6) with ESMTP id h5KFENeT023548;
            Fri, 20 Jun 2003 17:14:23 +0200 (MEST)
    Received: from franz.stat.wisc.edu (www.omegahat.org [128.105.174.32])
            by stat.math.ethz.ch (8.12.9/8.12.6) with ESMTP id h5KFEDeT023527
            for <[EMAIL PROTECTED]>; Fri, 20 Jun 2003 17:14:14 +0200 (MEST)
    Received: from smtp1.nix.paypal.com ([65.206.228.74])
            by franz.stat.wisc.edu with esmtp (Exim 3.35 #1 (Debian))
            id 19TNau-0000iR-00
            for <[EMAIL PROTECTED]>; Fri, 20 Jun 2003 10:14:29 -0500
    Received: from oma-krapp02.omaha.local (oma-krapp02.omaha.local [10.10.4.142])
            by smtp1.nix.paypal.com (Postfix) with SMTP id F32E73F616
            for <[EMAIL PROTECTED]>; Fri, 20 Jun 2003 08:13:44 -0700 (PDT)
    Precedence: bulk
    Auto-Submitted: auto-replied
    MIME-Version: 1.0
    Content-Type: text/plain; charset = "us-ascii"
    X-Mailer: KANA Response 7.01.102
    Message-Id: <[EMAIL PROTECTED]>
    X-Virus-Scanned: by amavisd-milter (http://amavis.org/)
    X-Virus-Scanned: by amavisd-milter (http://amavis.org/)
    Content-Transfer-Encoding: 8bit
    X-MIME-Autoconverted: from quoted-printable to 8bit by stat.math.ethz.ch id
            h5KFEDeT023527
    X-BeenThere: [EMAIL PROTECTED]
    X-Mailman-Version: 2.1.2
    Reply-To: PayPal Customer Service 2 <[EMAIL PROTECTED]>
  <....>
    List-Help: <mailto:[EMAIL PROTECTED]>
    Errors-To: [EMAIL PROTECTED]
    X-Spam-Status: No, hits=-103.4 required=5.0
            tests=BAYES_60,CLICK_BELOW,KNOWN_MAILING_LIST,RCVD_IN_BONDEDSENDER,USER_IN_WHITELIST
            autolearn=ham version=2.54
    X-Spam-Level: 
    X-Spam-Checker-Version: SpamAssassin 2.54 (1.174.2.17-2003-05-11-exp)
    From: PayPal Customer Service 2 <[EMAIL PROTECTED]>
    Sender: [EMAIL PROTECTED]
    To: <[EMAIL PROTECTED]>
    Cc: 
    Subject: AutoResponse - Email Returned SAXK  (KMM31468613V21988L0KM)
    Date: Fri, 20 Jun 2003 08:13:46 -0700



    Thank you for contacting PayPal Customer Service.  

    In an effort to assist you as quickly and efficiently as possible, please 
    direct all customer service inquires through our website. Click on the 
    hyperlink below to go to the PayPal website. After entering your email 
    address and password into the Member Log In box, you can submit your 
    inquiry via our Customer Service Contact form. If you indicate the type of 
    question you have with as much detail as you can, we will be able to 
    provide you with the best customer service possible.

----------------------------
and it ends with

--------------------------------
    Please do not reply to this e-mail.  Mail sent to this address will not be 
    answered.

    ********************************************
    Original Email:
    See the attached file for details.
--------------------------------
i.e., there may have been an attachment (a virus?) that the
mailing list software (Mailman) had already dropped before I got
to see the message.

As you can see, it reached a score of -103 thanks to "USER_IN_WHITELIST",
and it also hit RCVD_IN_BONDEDSENDER.  Could the latter be
because the legitimate [EMAIL PROTECTED] goes to franz.stat.wisc.edu
and is alias-forwarded from there to stat.math.ethz.ch?

What can I do to avoid this?

I've already learned the message as "spam", but of course
I learned it with the mailing list headers and footer and
without attachments, not in the form spamd saw it.

One option I can see is to effectively disable whitelisting by
changing the score of USER_IN_WHITELIST from -100 to -1 (or so).
But the message above would still get a negative score,
so I also need to do something about the RCVD_IN_BONDEDSENDER
problem.
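
To make the arithmetic concrete, here is a minimal Python sketch (not
SpamAssassin code) that just sums the per-rule scores reported by
"spamassassin -t" further down in this message and recomputes the total
under the changes I am considering.  The names RULE_SCORES, total_score
and is_spam are my own, and treating "total >= required" as the spam
threshold is my assumption about how the comparison is done.

```python
# Per-rule scores copied from the "spamassassin -t" report in this message.
RULE_SCORES = {
    "BAYES_80": 2.9,
    "USER_IN_WHITELIST": -100.0,
    "RCVD_IN_BONDEDSENDER": -4.2,
    "CLICK_BELOW": 0.0,
}
REQUIRED = 5.0  # "5 required" in the report


def total_score(scores, overrides=None):
    """Sum the rule scores, optionally replacing some of them."""
    merged = dict(scores)
    merged.update(overrides or {})
    return round(sum(merged.values()), 1)


def is_spam(total, required=REQUIRED):
    """Assumed threshold check: spam when the total reaches 'required'."""
    return total >= required


current = total_score(RULE_SCORES)
weakened = total_score(RULE_SCORES, {"USER_IN_WHITELIST": -1.0})
no_bonus = total_score(
    RULE_SCORES,
    {"USER_IN_WHITELIST": -1.0, "RCVD_IN_BONDEDSENDER": 0.0},
)
print(current, weakened, no_bonus)  # prints -101.3 -2.3 1.9
```

So with the whitelist score at -1 the total is still -2.3, and even with
the bonded-sender bonus zeroed as well it is only 1.9, which is below
the 5.0 threshold.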

When I try the message with "spamassassin -t  < ...",
I still get a -101 score, even though I tried to stop the
autolearning.

  Content analysis details:   (-101.30 points, 5 required)
  BAYES_80           (2.9 points)  BODY: Bayesian classifier says spam probability is 80 to 90%
                     [score: 0.8791]
  USER_IN_WHITELIST  (-100.0 points) From: address is in the user's white-list
  RCVD_IN_BONDEDSENDER (-4.2 points) RBL: Bonded sender, see http://www.bondedsender.org/referred.html
                     [RBL check: found 74.228.206.65.query.bondedsender.org., type: 127.0.0.10]
  CLICK_BELOW        (0.0 points)  Asks you to click below
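
The name in the "RBL check" line follows the usual DNS list convention
of reversing the octets of an IPv4 address and appending the list's
zone.  A minimal Python sketch of that construction (dnsbl_query_name is
a hypothetical helper of mine; it only builds the name and does no DNS
lookup):

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name for an IPv4 DNS-list lookup.

    The octets are reversed and the zone is appended, e.g.
    1.2.3.4 in zone "example.org" becomes 4.3.2.1.example.org.
    """
    # IPv4Address() raises ValueError for anything that is not a valid IPv4 address.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


# Address taken from the "Received: from smtp1.nix.paypal.com" header above.
print(dnsbl_query_name("65.206.228.74", "query.bondedsender.org"))
# prints 74.228.206.65.query.bondedsender.org
```

The result matches the name in the report, so the address that was
looked up is 65.206.228.74, the smtp1.nix.paypal.com relay from the
Received headers, and not franz.stat.wisc.edu (128.105.174.32).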

==> How can I look at my whitelist database and modify it?
    I have a *.dir and a *.pag file.

I'd appreciate advice as soon as possible, because I'd like to finish
work for the weekend and be able to stay calm...
Thank you in advance!

Martin Maechler <[EMAIL PROTECTED]>     http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16    Leonhardstr. 27
ETH (Federal Inst. Technology)  8092 Zurich     SWITZERLAND
phone: x-41-1-632-3408          fax: ...-1228                   <><


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
