Also this:

Rule      Description                          Score  Total    Ham  Ham %  Spam  Spam %
BAYES_40  Bayes spam probability is 20 to 40%   0.00  2,784  2,721   97.7    63     2.3
BAYES_50  Bayes spam probability is 40 to 60%   0.80    126     93   73.8    33    26.2
BAYES_60  Bayes spam probability is 60 to 80%   1.50    437    127   29.1   310    70.9
BAYES_80  Bayes spam probability is 80 to 95%   7.00    266      1    0.4   265    99.6

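As a quick sanity check on the table, the figures to the right of the Ham and Spam counts look like each count's share of Total. The minimal sketch below recomputes them from the counts quoted above. The names `counts` and `shares` are made up for illustration and are not part of SpamAssassin; only the numbers come from the table.

```python
# Counts copied from the table above: rule -> (total, ham, spam).
counts = {
    "BAYES_40": (2784, 2721, 63),
    "BAYES_50": (126, 93, 33),
    "BAYES_60": (437, 127, 310),
    "BAYES_80": (266, 1, 265),
}


def shares(total, ham, spam):
    """Return (ham %, spam %) of the messages that hit a rule, rounded to 0.1."""
    if ham + spam != total:
        raise ValueError("ham + spam does not add up to total")
    return round(100 * ham / total, 1), round(100 * spam / total, 1)


# Hypothetical helper output: rule -> (ham %, spam %).
table_shares = {rule: shares(*row) for rule, row in counts.items()}

for rule, (ham_pct, spam_pct) in table_shares.items():
    print(f"{rule}: ham {ham_pct}%  spam {spam_pct}%")
```

If the recomputed shares agree with the table, the two percentage columns are simply Ham/Total and Spam/Total per rule, which says nothing yet about why BAYES_00 to BAYES_20 and BAYES_95 or higher never appear.
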
I only have BAYES_40 to BAYES_80 after clearing the Bayes DB and manually re-learning on 2500 ham and 2500 spam messages. So no BAYES rule lower than 40 or higher than 80...

There is 100% something wrong here: Bayes is not a decision maker at all; for me it is useless. This indecisiveness, along with the fact that some mails aren't even Bayes-scored, makes me think there is a bug, or did I implement it wrong?

________________________________
From: Grega via users <users@spamassassin.apache.org>
Sent: Monday, 23 September 2024 15:14
To: users@spamassassin.apache.org
Subject: Re: Bayes in V4 compared to V3

Hi again.

In V4 there is something wrong with Bayes... I received 3 identical mails (1 external sender, 3 internal recipients) and the scores are like this:

2 x like:

 0.00 ARC_SIGNED Message has a ARC signature
-0.10 ARC_VALID Message has a valid ARC signature
-0.40 DCC_REPUT_00_12 DCC reputation between 0 and 12 % (mostly ham)
 0.10 DKIM_INVALID DKIM or DK signature exists, but is not valid
 0.10 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid
-0.00 DMARC_PASS DMARC pass policy
 0.25 GMD_PDF_HORIZ Contains pdf 100-240 (high) x 450-800 (wide)
 0.50 GMD_PDF_SQUARE Contains pdf 180-360 (high) x 180-360 (wide)
 0.00 HTML_MESSAGE HTML included in message
 1.02 MISSING_HEADERS Missing To: header
 1.50 PHISH_LNK_URI Typical phishing tactic - pre filled mail in link
-0.00 RCVD_IN_DNSWL_NONE Sender listed at https://www.dnswl.org/, no trust
 0.00 RCVD_IN_VALIDITY_CERTIFIED_BLOCKED ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
 0.00 RCVD_IN_VALIDITY_RPBL_BLOCKED ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
 0.00 RCVD_IN_VALIDITY_SAFE_BLOCKED ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
-0.00 SPF_HELO_PASS SPF: HELO matches SPF record

and 1 x like:

 0.00 ARC_SIGNED Message has a ARC signature
-0.10 ARC_VALID Message has a valid ARC signature
 1.50 BAYES_60 Bayes spam probability is 60 to 80%
-0.40 DCC_REPUT_00_12 DCC reputation between 0 and 12 % (mostly ham)
 0.10 DKIM_INVALID DKIM or DK signature exists, but is not valid
 0.10 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid
-0.00 DMARC_PASS DMARC pass policy
 0.25 GMD_PDF_HORIZ Contains pdf 100-240 (high) x 450-800 (wide)
 0.50 GMD_PDF_SQUARE Contains pdf 180-360 (high) x 180-360 (wide)
 0.00 HTML_MESSAGE HTML included in message
 1.02 MISSING_HEADERS Missing To: header
 1.50 PHISH_LNK_URI Typical phishing tactic - pre filled mail in link
-0.00 RCVD_IN_DNSWL_NONE Sender listed at https://www.dnswl.org/, no trust
 0.00 RCVD_IN_VALIDITY_CERTIFIED_BLOCKED ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
 0.00 RCVD_IN_VALIDITY_RPBL_BLOCKED ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
 0.00 RCVD_IN_VALIDITY_SAFE_BLOCKED ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
-0.00 SPF_HELO_PASS SPF: HELO matches SPF record

Why does one have "BAYES_60" and the other 2 not?

My thoughts so far:

1. This is not shortcircuit, as only Bayes is different.
2. The mails are identical and the mail server load is... well, non-existent (1-minute load 0.08).
3. Maybe some new logic in Bayes to skip some?
4. Race condition (IDK, I'm not a coder).
5. Bayes behaves inconsistently on BOTH installs I have it on.

________________________________
From: John Hardin <jhar...@impsec.org>
Sent: Friday, 13 September 2024 20:38
To: SpamAssassin-Users
Subject: Re: Bayes in V4 compared to V3

On Fri, 13 Sep 2024, Bill Cole wrote:

> Please send any replies to the list only.

...or to Harald only.

--
 John Hardin KA7OHZ http://www.impsec.org/~jhardin/
 jhar...@impsec.org pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
 USMC Rules of Gunfighting #20: The faster you finish the fight,
 the less shot you will get.
-----------------------------------------------------------------------
 Today: the 459th anniversary of the muslim Ottoman defeat at Malta