I've only been using Bayes and auto-whitelisting for several weeks, but on a site-wide basis (i.e., spamc/spamd triggered from procmail, which is called from sendmail).
I've been very happy with the Bayes learner, but today a pretty bad spam got through to one of the mailing lists I manage. Here are some of the headers and the first paragraph of the spam (indented by 4):

----------------------------
    Received: from stat.math.ethz.ch (hypatia [129.132.58.23])
        by stat.math.ethz.ch (8.12.9/8.12.6) with ESMTP id h5KFENeT023548;
        Fri, 20 Jun 2003 17:14:23 +0200 (MEST)
    Received: from franz.stat.wisc.edu (www.omegahat.org [128.105.174.32])
        by stat.math.ethz.ch (8.12.9/8.12.6) with ESMTP id h5KFEDeT023527
        for <[EMAIL PROTECTED]>; Fri, 20 Jun 2003 17:14:14 +0200 (MEST)
    Received: from smtp1.nix.paypal.com ([65.206.228.74])
        by franz.stat.wisc.edu with esmtp (Exim 3.35 #1 (Debian))
        id 19TNau-0000iR-00
        for <[EMAIL PROTECTED]>; Fri, 20 Jun 2003 10:14:29 -0500
    Received: from oma-krapp02.omaha.local (oma-krapp02.omaha.local [10.10.4.142])
        by smtp1.nix.paypal.com (Postfix) with SMTP id F32E73F616
        for <[EMAIL PROTECTED]>; Fri, 20 Jun 2003 08:13:44 -0700 (PDT)
    Precedence: bulk
    Auto-Submitted: auto-replied
    MIME-Version: 1.0
    Content-Type: text/plain; charset = "us-ascii"
    X-Mailer: KANA Response 7.01.102
    Message-Id: <[EMAIL PROTECTED]>
    X-Virus-Scanned: by amavisd-milter (http://amavis.org/)
    X-Virus-Scanned: by amavisd-milter (http://amavis.org/)
    Content-Transfer-Encoding: 8bit
    X-MIME-Autoconverted: from quoted-printable to 8bit by stat.math.ethz.ch
        id h5KFEDeT023527
    X-BeenThere: [EMAIL PROTECTED]
    X-Mailman-Version: 2.1.2
    Reply-To: PayPal Customer Service 2 <[EMAIL PROTECTED]>
    <....>
    List-Help: <mailto:[EMAIL PROTECTED]>
    Errors-To: [EMAIL PROTECTED]
    X-Spam-Status: No, hits=-103.4 required=5.0
        tests=BAYES_60,CLICK_BELOW,KNOWN_MAILING_LIST,RCVD_IN_BONDEDSENDER,
              USER_IN_WHITELIST
        autolearn=ham version=2.54
    X-Spam-Level:
    X-Spam-Checker-Version: SpamAssassin 2.54 (1.174.2.17-2003-05-11-exp)
    From: PayPal Customer Service 2 <[EMAIL PROTECTED]>
    Sender: [EMAIL PROTECTED]
    To: <[EMAIL PROTECTED]>
    Cc:
    Subject: AutoResponse - Email Returned SAXK (KMM31468613V21988L0KM)
    Date: Fri, 20 Jun 2003 08:13:46 -0700

    Thank you for contacting PayPal Customer Service.

    In an effort to assist you as quickly and efficiently as possible,
    please direct all customer service inquires through our website.
    Click on the hyperlink below to go to the PayPal website. After
    entering your email address and password into the Member Log In box,
    you can submit your inquiry via our Customer Service Contact form.
    If you indicate the type of question you have with as much detail as
    you can, we will be able to provide you with the best customer
    service possible.
----------------------------

and it ends with

--------------------------------
    Please do not reply to this e-mail. Mail sent to this address will
    not be answered.

    ********************************************
    Original Email:

    See the attached file for details.
--------------------------------

i.e., there may have been an attachment (a virus?) that the mailing list software (Mailman) had already dropped before I got to see it.

You can see the score of -103 the message managed to get via "USER_IN_WHITELIST", and it also hit RCVD_IN_BONDEDSENDER. Could the latter be because the legitimate [EMAIL PROTECTED] goes to franz.stat.wisc.edu and is alias-forwarded from there to stat.math.ethz.ch?

What can I do to avoid this? I've already learned the message as "spam", but of course I've learned it with the mailing list headers and footer and without any attachment, rather than in the form it had when spamd looked at it.

One thing I can see is to basically disable whitelisting by changing the score of USER_IN_WHITELIST from -100 to -1 (or so). But that would still give the above message a negative score, so I also need to do something about the RCVD_IN_BONDEDSENDER problem.

When I try the message with "spamassassin -t < ...", I still get a -101 score, even though I tried to stop the autolearning.

    Content analysis details:   (-101.30 points, 5 required)
    BAYES_80             (2.9 points)    BODY: Bayesian classifier says spam
                                         probability is 80 to 90%
                                         [score: 0.8791]
    USER_IN_WHITELIST    (-100.0 points) From: address is in the user's white-list
    RCVD_IN_BONDEDSENDER (-4.2 points)   RBL: Bonded sender, see
                                         http://www.bondedsender.org/referred.html
                                         [RBL check: found
                                         74.228.206.65.query.bondedsender.org.,
                                         type: 127.0.0.10]
    CLICK_BELOW          (0.0 points)    Asks you to click below

==> How can I look at my whitelist database and modify it? I have a *.dir and a *.pag file.

Thanks a lot for any advice, "ASAP" if possible, because I'd like to finish work for the weekend and be able to stay calm...

Thank you in advance!

Martin Maechler <[EMAIL PROTECTED]>     http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16    Leonhardstr. 27
ETH (Federal Inst. Technology)  8092 Zurich     SWITZERLAND
phone: x-41-1-632-3408          fax: ...-1228                   <><

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
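
P.S. To make the score arithmetic above concrete, here is a minimal sketch that only re-adds the per-rule scores from the "spamassassin -t" report; it does not call SpamAssassin at all. The helper total_score and the variable names are hypothetical, and the numbers are simply the ones shown in the report (other versions or local configurations may assign different scores).

```python
# Per-rule scores copied from the "Content analysis details" report above.
reported_scores = {
    "BAYES_80": 2.9,
    "USER_IN_WHITELIST": -100.0,
    "RCVD_IN_BONDEDSENDER": -4.2,
    "CLICK_BELOW": 0.0,
}
REQUIRED = 5.0  # "5 required" in the report


def total_score(scores, overrides=None):
    """Sum the rule scores, optionally replacing some of them.

    Hypothetical helper for illustration; not part of SpamAssassin.
    """
    merged = dict(scores)
    merged.update(overrides or {})
    return round(sum(merged.values()), 2)


# As reported: whitelist at -100, bonded sender at -4.2.
current = total_score(reported_scores)

# Assumption: USER_IN_WHITELIST rescored from -100 to -1.
weakened = total_score(reported_scores, {"USER_IN_WHITELIST": -1.0})

# Assumption: additionally, RCVD_IN_BONDEDSENDER rescored to 0.
no_bonded = total_score(
    reported_scores,
    {"USER_IN_WHITELIST": -1.0, "RCVD_IN_BONDEDSENDER": 0.0},
)

for label, value in [
    ("as reported", current),
    ("whitelist at -1", weakened),
    ("whitelist at -1, bonded sender at 0", no_bonded),
]:
    verdict = "spam" if value >= REQUIRED else "not spam"
    print(f"{label}: {value:+.2f} ({verdict})")
```

With these numbers the totals come out as -101.30, -2.30 and +1.90, so under the stated assumptions even neutralising both rules would leave this message below the 5.0 threshold; the Bayes score alone is not enough to flag it.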