-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


"Ben Wing" writes:
>well, i get false positives with an empty body ...

Yep, that's a pretty serious sign -- the header data in that message
(sent from yourself, to yourself, via your own relays, right?) is being
recognised as spam.

Try using "spamassassin -D -Lt < msg > out" and watch the bayes tokens
and their values on stderr.  e.g. here's an example from
sample-nonspam.txt for me:

debug: bayes token 'N:NNNN-NN-NN' => 1.60066644848413e-05
debug: bayes token 'organizations' => 0.000215113954418233
debug: bayes token 'rarely' => 0.00032196289646918
debug: bayes token 'ICANN' => 0.000451721242653233
debug: bayes token 'deeper' => 0.000471516213847502
debug: bayes token 'commentary' => 0.000647412755716005
debug: bayes token 'depth' => 0.000680151706700379
debug: bayes token '1994' => 0.000726045883940621
debug: bayes token 'voices' => 0.000756680731364276
debug: bayes token 'Dawson' => 0.000880523731587561
debug: bayes token 'Host' => 0.000880523731587561
debug: bayes token 'roots' => 0.000942206654991244
debug: bayes token 'deceptive' => 0.00114225053078556
debug: bayes token 'Topic' => 0.00124825986078886
debug: bayes token 'columnists' => 0.00124825986078886
debug: bayes token 'Sitescooper' => 0.00127790973871734
debug: bayes token 'ash' => 0.00162537764350453
debug: bayes token 'PDA' => 0.00167601246105919
debug: bayes token 'UD:slashdot.org' => 0.00167601246105919
debug: bayes token 'obsession' => 0.00198523985239852
debug: bayes token 'intersection' => 0.00206130268199234
debug: bayes token 'Layer' => 0.00232900432900433
debug: bayes token 'distinctive' => 0.00267661691542289
debug: bayes token 'separates' => 0.00281675392670157
debug: bayes token 'UD:quicktopic.com' => 0.00281675392670157
debug: bayes token 'U*dawson' => 0.0033416149068323
debug: bayes token 'H*F:D*world.std.com' => 0.00664197530864198
debug: bayes token 'H*F:D*std.com' => 0.00664197530864198
debug: bayes token 'www.pgp.com' => 0.00881967213114754
debug: bayes token 'UD:pgp.com' => 0.00881967213114754
debug: bayes token 'H*m:192' => 0.0104581626770632
debug: bayes token 'examples' => 0.0123431642679307
debug: bayes token 'Log' => 0.0130133209114604
debug: bayes token 'behaviors' => 0.0131219512195122
debug: bayes token '2,000' => 0.0131219512195122
debug: bayes token 'Hail' => 0.0131219512195122
debug: bayes token 'SIGNED' => 0.0134839529349941
debug: bayes token 'immoral' => 0.0173548387096774
debug: bayes token 'aggregator' => 0.0173548387096774
debug: bayes token 'subscribe' => 0.0214775262438607
debug: bayes token 'UD:shtml' => 0.02559193319822
debug: bayes token 'HTo:D*std.com' => 0.0256190476190476
debug: bayes token 'HTo:D*world.std.com' => 0.0256190476190476
debug: bayes token 'H*F:U*dawson' => 0.0256190476190476
debug: bayes token 'UnBlinking' => 0.0256190476190476
debug: bayes token 'unmatched' => 0.0256190476190476
debug: bayes token 'H*m:193' => 0.0256190476190476
debug: bayes token 'sk:www.sit' => 0.0256190476190476
debug: bayes token 'Scout' => 0.0256190476190476
debug: bayes token 'SIGNATURE' => 0.0257894126485889
debug: bayes token 'culture' => 0.0272021597517014
debug: bayes token 'N:N.N.N' => 0.0272793722027467
debug: bayes token 'Gary' => 0.0320014392974647
debug: bayes token 'PGP' => 0.0358753189283018
debug: bayes token 'HPrecedence:list' => 0.037141126102354
debug: bayes token 'separate' => 0.039440168771582
debug: bayes token 'topical' => 0.958
debug: bayes token 'ping' => 0.0451277464637061
debug: bayes token 'ISSN' => 0.0489090909090909
debug: bayes token 'UD:rdf' => 0.0489090909090909
debug: bayes token 'pursues' => 0.0489090909090909
debug: bayes token 'stock's' => 0.0489090909090909
debug: bayes token 'resuming' => 0.0489090909090909
debug: bayes token 'excise' => 0.0489090909090909
debug: bayes token 'D*tbtf.com' => 0.0489090909090909
debug: bayes token 'H*r:world.std.com' => 0.0489090909090909
debug: bayes token 'comment' => 0.0539053222173553
debug: bayes token 'BEGIN' => 0.0556296837236107
debug: bayes token 'runs' => 0.0561664508720611
debug: bayes token 'morning' => 0.0640287802717383
debug: bayes token 'forum' => 0.0645257315925537
debug: bayes token 'blog' => 0.0670958180925054
debug: bayes token 'sk:_______' => 0.0675631545686896
debug: bayes token 'prohibited' => 0.0712432072884898
debug: bayes token 'Copy' => 0.925232790783064
debug: bayes token 'Sun' => 0.0760314122684923
debug: bayes token 'affect' => 0.0785519839660173
debug: bayes token 'archive' => 0.0795060650813824
debug: bayes token 'compelling' => 0.0863543258179187
debug: bayes token 'subscription' => 0.0998350082839608
debug: bayes token 'H*m:102' => 0.105326764576386
debug: bayes token 'dead' => 0.10852287665769
debug: bayes token 'H*c:plain' => 0.109216762405638
debug: bayes token 'issue' => 0.118565944969698
debug: bayes token 'utterly' => 0.121118249899843
debug: bayes token 'H*c:us-ascii' => 0.124295594576641
debug: bayes token 'END' => 0.125704773987869
debug: bayes token 'file' => 0.131541664289693
debug: bayes token 'writing' => 0.133370455069644
debug: bayes token 'sources' => 0.141737437136924
debug: bayes token 'Version' => 0.142590596342089
debug: bayes token 'promises' => 0.144716859485728
debug: bayes token 'UD:org' => 0.146916282340951
debug: bayes token 'consider' => 0.150153742517993

All of those are quite low, so combined they result in a score of:

debug: bayes: score = 0

- --j.

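For anyone who would rather sift that stderr output than eyeball it, here
is a minimal sketch in Python. It assumes the debug lines have exactly the
"debug: bayes token '...' => value" shape quoted above, and that stderr was
saved to a file first (for instance by adding "2> debug.log" to the
command). The function names and the 0.5 cut-off are made up for
illustration; none of this is part of SpamAssassin itself.

```python
import re
import sys

# Matches one line of the debug output as quoted in this thread.
# The greedy (.*) keeps tokens that contain an apostrophe, e.g. 'stock's'.
TOKEN_LINE = re.compile(r"^debug: bayes token '(.*)' => (\S+)$")


def parse_bayes_tokens(text):
    """Return a list of (token, probability) pairs found in debug output."""
    pairs = []
    for line in text.splitlines():
        match = TOKEN_LINE.match(line.strip())
        if match:
            pairs.append((match.group(1), float(match.group(2))))
    return pairs


def spammy_tokens(pairs, cutoff=0.5):
    """Tokens whose probability is above cutoff, most spam-like first."""
    hits = [(tok, p) for tok, p in pairs if p > cutoff]
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Read saved stderr on stdin and print only the spam-leaning tokens.
    for token, prob in spammy_tokens(parse_bayes_tokens(sys.stdin.read())):
        print(f"{prob:.4f}  {token}")
```

Saved as, say, bayes_tokens.py (a hypothetical name), it could be run as
"python bayes_tokens.py < debug.log". On a false positive, the tokens it
prints are the ones worth checking against your training data.
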
>Return-Path: <[EMAIL PROTECTED]>
>Delivered-To: [EMAIL PROTECTED]
>Received: (qmail 38912 invoked by uid 19047); 17 Oct 2003 07:23:43 -0000
>Received: from unknown (HELO mpls-qmqp-02.inet.qwest.net) ([63.231.195.113])
>(envelope-sender <[EMAIL PROTECTED]>)
> by 192.220.74.103 (qmail-ldap-1.03) with SMTP
> for <[EMAIL PROTECTED]>; 17 Oct 2003 07:23:43 -0000
>Received: (qmail 73098 invoked by uid 0); 17 Oct 2003 06:40:41 -0000
>Received: from mpls-pop-02.inet.qwest.net (63.231.195.2)
> by mpls-qmqp-02.inet.qwest.net with QMQP; 17 Oct 2003 06:40:41 -0000
>Received: from ddslppp71.tcsn.uswest.net (HELO neeeeeee) (216.161.150.71)
> by mpls-pop-02.inet.qwest.net with SMTP; 17 Oct 2003 07:23:42 -0000
>Date: Fri, 17 Oct 2003 00:28:01 -0700
>Message-ID: <[EMAIL PROTECTED]>
>From: "Ben Wing" <[EMAIL PROTECTED]>
>To: "Ben Wing" <[EMAIL PROTECTED]>
>Subject: test test
>MIME-Version: 1.0
>Content-Type: text/plain;
> charset="iso-8859-1"
>Content-Transfer-Encoding: 7bit
>X-Priority: 3
>X-MSMail-Priority: Normal
>X-Mailer: Microsoft Outlook Express 6.00.2800.1158
>X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165
>X-Spam-Checker-Version: SpamAssassin 2.60 (1.212-2003-09-23-exp) on 666.com
>X-Spam-Report:
> * 2.1 BAYES_90 BODY: Bayesian spam probability is 90 to 99%
> * [score: 0.9573]
>X-Spam-Status: No, hits=2.1 required=5.0 tests=BAYES_90 autolearn=ham
> version=2.60
>X-Spam-Level: **
>Status:
>
>
>----- Original Message -----
>From: "Justin Mason" <[EMAIL PROTECTED]>
>To: "Martin Radford" <[EMAIL PROTECTED]>
>Cc: "Ben Wing" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
>Sent: Sunday, October 19, 2003 3:41 PM
>Subject: Re: [SAtalk] strange behavior of Bayesian analyzer in SA 2.6
>
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>>
>> Martin Radford writes:
>> >At Fri Oct 17 21:17:54 2003, Ben Wing wrote:
>> >>
>> >> hi. i just upgraded from 2.53 to 2.6 and i'm seeing something
>> >> rather odd about the Bayesian results: nearly every one is almost
>> >> exactly 0%, 50%, or 100%! it's almost as if it's applying an
>> >> extreme rounding function to the actual result. now, these are
>> >> turning out so far to be accurate, but i'm still highly distrustful
>> >> of such "perfect" results. this clustering happened the instant i
>> >> upgraded spam assassin -- in fact, one of the first messages i sent
>> >> after this
>> >
>> >I found this when I first upgraded to one of the pre-releases of 2.60.
>> >The developers said that this was due to changing the method of
>> >calculating the Bayes score. The newer code is much more likely to
>> >cluster around 0, 0.5, and 1. I have seen a few messages outside
>> >those cluster areas, but not too many. I've not seen any FPs, though.
>>
>> If you're seeing FPs, it's strongly indicative of mistakes in the
>> training data -- spam trained as ham or vice-versa, I'm afraid ;)
>>
>> - --j.
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.2.2 (GNU/Linux)
>> Comment: Exmh CVS
>>
>> iD8DBQE/kxMjQTcbUG5Y7woRAgnyAJ9GaPCdey9oNgAT/y2ZiJkahjPuIgCgoxAC
>> vPt8S4fWAKrhfkvq++O4BmI=
>> =JWtb
>> -----END PGP SIGNATURE-----
>>
>>
>>
>> -------------------------------------------------------
>> This SF.net email sponsored by: Enterprise Linux Forum Conference & Expo
>> The Event For Linux Datacenter Solutions & Strategies in The Enterprise
>> Linux in the Boardroom; in the Front Office; & in the Server Room
>> http://www.enterpriselinuxforum.com
>> _______________________________________________
>> Spamassassin-talk mailing list
>> [EMAIL PROTECTED]
>> https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
>
>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: Exmh CVS

iD8DBQE/k12fQTcbUG5Y7woRAiTIAJ4kUN/aAIP81n1NvVqmVmURTdwVkgCfTaq+
ibaeU0UkxgYBEgokyZlvU1Y=
=dHCc
-----END PGP SIGNATURE-----
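
On the clustering around 0, 0.5 and 1 mentioned in the quoted thread: the
thread does not say which combining method 2.60 switched to, so treat the
following purely as an illustration. It sketches chi-squared
(Fisher/Robinson-style) combining of per-token probabilities, one method
known to push results toward those three values. Whether it matches
SpamAssassin's actual code is an assumption, and all names here are
invented for the example.

```python
import math


def chi2q(x2, dof):
    """Survival function of the chi-squared distribution (even dof only)."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("dof must be a positive even integer")
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    # Rounding can push the sum a hair above 1.0, so clamp it.
    return min(total, 1.0)


def combine(probs):
    """Combine per-token spam probabilities (each strictly between 0 and 1)."""
    n = len(probs)
    if n == 0:
        return 0.5  # no evidence either way
    ln_spam = sum(math.log(p) for p in probs)
    ln_ham = sum(math.log(1.0 - p) for p in probs)
    # Near 1 when the tokens are overwhelmingly ham-like (low p).
    ham_evidence = 1.0 - chi2q(-2.0 * ln_spam, 2 * n)
    # Near 1 when the tokens are overwhelmingly spam-like (high p).
    spam_evidence = 1.0 - chi2q(-2.0 * ln_ham, 2 * n)
    return (1.0 + spam_evidence - ham_evidence) / 2.0
```

With this sketch, twenty tokens at 0.01 combine to something very close to
0, twenty at 0.99 to something very close to 1, and an even mix of the two
lands on 0.5. Strong evidence on both sides cancels out rather than
averaging, which is why in-between results are rare.
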