Also this:

Rule      Description                          Score  Total  Ham    Ham %  Spam  Spam %
BAYES_40  Bayes spam probability is 20 to 40%  0.00   2,784  2,721  97.7   63    2.3
BAYES_50  Bayes spam probability is 40 to 60%  0.80   126    93     73.8   33    26.2
BAYES_60  Bayes spam probability is 60 to 80%  1.50   437    127    29.1   310   70.9
BAYES_80  Bayes spam probability is 80 to 95%  7.00   266    1      0.4    265   99.6
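
For anyone checking the numbers: the percentages in the table above follow from the Ham and Spam counts divided by Total. A minimal Python sketch that re-derives them, with the counts copied by hand from the table (the `buckets` dict and the `shares` helper are illustrative names only, not anything SpamAssassin provides):

```python
# Counts copied from the table above: rule -> (total, ham, spam).
buckets = {
    "BAYES_40": (2784, 2721, 63),
    "BAYES_50": (126, 93, 33),
    "BAYES_60": (437, 127, 310),
    "BAYES_80": (266, 1, 265),
}


def shares(total, ham, spam):
    """Return (ham %, spam %) of the hits, rounded to one decimal place."""
    return round(100 * ham / total, 1), round(100 * spam / total, 1)


for rule, (total, ham, spam) in buckets.items():
    ham_pct, spam_pct = shares(total, ham, spam)
    print(f"{rule}: {ham_pct}% ham, {spam_pct}% spam")
```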

I only get BAYES_40 to BAYES_80 after clearing the Bayes DB and manually 
re-learning on 2500 ham and 2500 spam messages.
So there is no BAYES rule lower than 40 or higher than 80...

Something is definitely wrong here: Bayes is not a decision maker at all, so 
for me it is useless. This indecisiveness, along with the fact that some mails 
aren't even Bayes scored, makes me think there is a bug, or did I implement it 
wrong?



________________________________
From: Grega via users <users@spamassassin.apache.org>
Sent: Monday, 23 September 2024 15:14
To: users@spamassassin.apache.org
Subject: Re: Bayes in V4 compared to V3


Hi again.


In V4 there is something wrong with bayes...


I received 3 identical mails (1 external sender, 3 internal recipients) and 
scores are like this:


Two of them scored like this:

0.00    ARC_SIGNED      Message has a ARC signature
-0.10   ARC_VALID       Message has a valid ARC signature
-0.40   DCC_REPUT_00_12 DCC reputation between 0 and 12 % (mostly ham)
0.10    DKIM_INVALID    DKIM or DK signature exists, but is not valid
0.10    DKIM_SIGNED     Message has a DKIM or DK signature, not necessarily valid
-0.00   DMARC_PASS      DMARC pass policy
0.25    GMD_PDF_HORIZ   Contains pdf 100-240 (high) x 450-800 (wide)
0.50    GMD_PDF_SQUARE  Contains pdf 180-360 (high) x 180-360 (wide)
0.00    HTML_MESSAGE    HTML included in message
1.02    MISSING_HEADERS Missing To: header
1.50    PHISH_LNK_URI   Typical phishing tactic - pre filled mail in link
-0.00   RCVD_IN_DNSWL_NONE      Sender listed at https://www.dnswl.org/, no trust
0.00    RCVD_IN_VALIDITY_CERTIFIED_BLOCKED      ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
0.00    RCVD_IN_VALIDITY_RPBL_BLOCKED   ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
0.00    RCVD_IN_VALIDITY_SAFE_BLOCKED   ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
-0.00   SPF_HELO_PASS   SPF: HELO matches SPF record



And one scored like this:

0.00    ARC_SIGNED      Message has a ARC signature
-0.10   ARC_VALID       Message has a valid ARC signature
1.50    BAYES_60        Bayes spam probability is 60 to 80%
-0.40   DCC_REPUT_00_12 DCC reputation between 0 and 12 % (mostly ham)
0.10    DKIM_INVALID    DKIM or DK signature exists, but is not valid
0.10    DKIM_SIGNED     Message has a DKIM or DK signature, not necessarily valid
-0.00   DMARC_PASS      DMARC pass policy
0.25    GMD_PDF_HORIZ   Contains pdf 100-240 (high) x 450-800 (wide)
0.50    GMD_PDF_SQUARE  Contains pdf 180-360 (high) x 180-360 (wide)
0.00    HTML_MESSAGE    HTML included in message
1.02    MISSING_HEADERS Missing To: header
1.50    PHISH_LNK_URI   Typical phishing tactic - pre filled mail in link
-0.00   RCVD_IN_DNSWL_NONE      Sender listed at https://www.dnswl.org/, no trust
0.00    RCVD_IN_VALIDITY_CERTIFIED_BLOCKED      ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
0.00    RCVD_IN_VALIDITY_RPBL_BLOCKED   ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
0.00    RCVD_IN_VALIDITY_SAFE_BLOCKED   ADMINISTRATOR NOTICE: The query to Validity was blocked. See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
-0.00   SPF_HELO_PASS   SPF: HELO matches SPF record



Why does one have "BAYES_60" while the other two do not?


My thoughts so far:

  1.  This is not shortcircuit, as only Bayes is different.
  2.  The mails are identical and the mailserver load is, well, nonexistent 
(1-minute load 0.08).
  3.  Maybe some new logic in Bayes to skip some messages?
  4.  A race condition (IDK, I'm not a coder).
  5.  Bayes behaves inconsistently on BOTH installs I have it on.
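
To back up point 1, the two reports can be compared mechanically instead of by eye. Below is a rough Python sketch, assuming each rule line starts with a numeric score followed by the rule name, as in the reports pasted above. The `rule_names` helper and the shortened sample reports are made up for illustration and are not part of SpamAssassin:

```python
def rule_names(report):
    """Collect the rule names from a pasted score report."""
    names = set()
    for line in report.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            float(parts[0])  # rule lines begin with a score such as -0.10
        except ValueError:
            continue  # wrapped description text, not a rule line
        names.add(parts[1])
    return names


# Shortened samples; paste the full reports here in practice.
report_without_bayes = """\
0.00    ARC_SIGNED      Message has a ARC signature
-0.10   ARC_VALID       Message has a valid ARC signature
1.02    MISSING_HEADERS Missing To: header
"""
report_with_bayes = """\
0.00    ARC_SIGNED      Message has a ARC signature
-0.10   ARC_VALID       Message has a valid ARC signature
1.50    BAYES_60        Bayes spam probability is 60 to 80%
1.02    MISSING_HEADERS Missing To: header
"""

# Rules that hit in only one of the two reports.
only_in_one = rule_names(report_with_bayes) ^ rule_names(report_without_bayes)
print(sorted(only_in_one))
```

The symmetric difference lists every rule that appears in exactly one report, so anything beyond the BAYES_* rule showing up there would point to a second difference between the scans.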



________________________________
From: John Hardin <jhar...@impsec.org>
Sent: Friday, 13 September 2024 20:38
To: SpamAssassin-Users
Subject: Re: Bayes in V4 compared to V3

On Fri, 13 Sep 2024, Bill Cole wrote:

> Please send any replies to the list only.

...or to Harald only.


--
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhar...@impsec.org                         pgpk -a jhar...@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   USMC Rules of Gunfighting #20: The faster you finish the fight,
   the less shot you will get.
-----------------------------------------------------------------------
  Today: the 459th anniversary of the muslim Ottoman defeat at Malta
