On 6/22/2023 6:29 AM, Simon Wilson via users wrote:
> How do people work around this? I've trained Bayes, and that is
> applying a -ve offset as expected, but they still end up at over 7.
>
> X-Spam-Status: Yes, score=7.215 tagged_above=-999 required=6.2
>     tests=[BASE64_LENGTH_78_79=0.1, BASE64_LENGTH_79_INF=1.502,
>     BAYES_00=-1.9, DCC_CHECK=1.1, DIGEST_MULTIPLE=0.293,
>     DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
>     ENA_SUBJ_LONG_WORD=2.2, HTML_MESSAGE=0.001, LR_DMARC_PASS=-0.1,
>     MIME_BASE64_TEXT=1.741, MIME_HTML_MOSTLY=0.1, MPART_ALT_DIFF=0.79,
>     PYZOR_CHECK=1.392, RCVD_IN_DNSWL_NONE=-0.0001,
>     RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001,
>     T_SCC_BODY_TEXT_LINE=-0.01, T_TVD_MIME_NO_HEADERS=0.01]
My Spam threshold is higher, so not a real problem for me. But...
1) You might plead your case to KAM off-list and see if he can bump up
his regex length for ENA_SUBJ_LONG_WORD to something longer than 30,
like 33.
2) Lower the score for ENA_SUBJ_LONG_WORD
3) I don't run Pyzor; maybe lower its score a little bit also?
4) Create an offsetting rule, like:
meta DMARC_OFFSET (ENA_SUBJ_LONG_WORD && DKIM_VALID)
score DMARC_OFFSET -2.2
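
Options 2 and 3 might look something like this in local.cf. It is only a
sketch: the replacement value of 1.0 for each rule is an illustrative
assumption, not a recommendation. The arithmetic in the comments uses
the scores from the X-Spam-Status header quoted above.

```
# Hypothetical additions to local.cf; the values are illustrative only.

# Option 2: ENA_SUBJ_LONG_WORD scored 2.2 in the report above.
# At 1.0 the example message would total 7.215 - 1.2 = 6.015,
# just under the required 6.2.
score ENA_SUBJ_LONG_WORD 1.0

# Option 3: PYZOR_CHECK scored 1.392 in the report above.
# At 1.0, on top of option 2: 6.015 - 0.392 = 5.623.
score PYZOR_CHECK 1.0
```

Running `spamassassin --lint` after the edit should catch any syntax
errors before the daemon that calls SpamAssassin is reloaded.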
Yes, for sure, ALL Microsoft DMARC messages hit ENA_SUBJ_LONG_WORD.
dokomo.ne.jp also hits (32 chars). In the near-miss category, mail.ru
comes in OK at 29 characters.
-- Jared Hall