On 11/07/2022 15:44, Matus UHLAR - fantomas wrote:
On 11.07.22 12:57, Bert Van de Poel wrote:
A few times a month we have spam messages getting through, often in
German, that have some spam score but not enough to be
marked/discarded. These messages are always marked by DCC, since
they're of course bulk spam, but it's also not uncommon to see Pyzor
as well. I've been wondering if there are realistic cases where both
DCC and Pyzor would mark as spam while the message was ham.
This is likely to happen if the message is empty or nearly empty.
Some people are stupid and send one or two words or a short link in a
message without a Subject: ...
Oh yeah, that's a case I hadn't thought of, good point!
I feel like when both co-occur it's a pretty solid sign it's spam.
Therefore, I'm wondering if an upstream score boost for that
combination (or a local one) would make sense.
Some examples (I can also supply full emails, but fear this might
prevent my message from arriving):
X-Spam-Status: No, score=4.082 tagged_above=-9999 required=5
tests=[DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.001,
HEADER_FROM_DIFFERENT_DOMAINS=0.25, HTML_IMAGE_RATIO_08=0.001,
HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985,
SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_SCC_BODY_TEXT_LINE=-0.01]
X-Spam-Status: No, score=4.816 tagged_above=-9999 required=5
tests=[DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.001,
HEADER_FROM_DIFFERENT_DOMAINS=0.248, HTML_IMAGE_ONLY_28=0.726,
HTML_IMAGE_RATIO_02=0.001, HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1,
PYZOR_CHECK=1.985, SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652,
T_REMOTE_IMAGE=0.01, T_SCC_BODY_TEXT_LINE=-0.01]
X-Spam-Status: No, score=4.109 tagged_above=-9999 required=5
tests=[DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.029,
HEADER_FROM_DIFFERENT_DOMAINS=0.249, HTML_IMAGE_RATIO_04=0.001,
HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985,
SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_SCC_BODY_TEXT_LINE=-0.01]
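
To see how far such messages sit below the threshold, here is a minimal Python sketch that parses an X-Spam-Status value like the ones above and reports the shortfall when both DCC_CHECK and PYZOR_CHECK fired. The function names (parse_spam_status, shortfall_if_both) are hypothetical helpers made up for illustration, not part of SpamAssassin or amavis, and the parsing assumes the header layout shown in these examples.

```python
import re

def parse_spam_status(header):
    """Return (score, required, {test: score}) from an X-Spam-Status value.

    Assumes the amavis-style layout quoted in this thread:
    'No, score=... tagged_above=... required=... tests=[A=1, B=2]'.
    """
    flat = " ".join(header.split())  # undo header line folding
    score = float(re.search(r"\bscore=(-?[\d.]+)", flat).group(1))
    required = float(re.search(r"\brequired=(-?[\d.]+)", flat).group(1))
    tests_part = re.search(r"tests=\[(.*?)\]", flat).group(1)
    tests = {}
    for item in tests_part.split(","):
        name, _, value = item.strip().partition("=")
        tests[name] = float(value)
    return score, required, tests

def shortfall_if_both(header, first="DCC_CHECK", second="PYZOR_CHECK"):
    """Points missing to reach 'required' if both tests hit, else None."""
    score, required, tests = parse_spam_status(header)
    if first in tests and second in tests:
        return round(required - score, 3)
    return None

# First example from this message.
EXAMPLE = """No, score=4.082 tagged_above=-9999 required=5
tests=[DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.001,
HEADER_FROM_DIFFERENT_DOMAINS=0.25, HTML_IMAGE_RATIO_08=0.001,
HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985,
SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_SCC_BODY_TEXT_LINE=-0.01]"""

if __name__ == "__main__":
    print(shortfall_if_both(EXAMPLE))  # 0.918
```

By the same arithmetic the other two examples fall short by 0.184 and 0.891, so a hypothetical extra score of about 1.0 for the DCC plus Pyzor combination would have pushed all three over required=5. Whether that value is safe for ham is a separate question this sketch does not answer.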
Looks like you should enable Bayes.
Since these headers are generated by amavis, you could train the amavis
user's Bayes database.
We have Bayes running on the main server, but my own local server
doesn't have it, which is why it's missing. I did however take all the
spam I received myself in 2022 that wasn't caught and fed it to sa-learn
(for the amavis user), thanks for that suggestion. Let's hope it works
to remove this minor inconvenience :)