On 2022-03-04 at 09:18:08 UTC-0500 (Fri, 04 Mar 2022 09:18:08 -0500)
Greg Troxel <g...@lexort.com>
is rumored to have said:

> Greg Troxel <g...@lexort.com> writes:
>
>> With stock scores, sendgrid gets
>>
>>  2.1 URIBL_GREY             Contains an URL listed in the URIBL greylist
>>                             [URIs: sendgrid.net]
>>  1.5 KAM_SENDGRID           Sendgrid being exploited by scammers
>>
>> and I find 3.6 a bit much.


Note that those are quasi-independent rules. URIBL looks at all of the URIs in 
a message. KAM_SENDGRID only hits mail transferred through Sendgrid where the 
From header and envelope sender addresses are in unrelated domains.

I may be wrong, but I do not believe that all Sendgrid ham will hit either of 
those rules, although much surely will hit both. The KAM rules don't go through 
QA that would reveal their overlap/independence as the stock rules do, so 
there's not a good way that I can check.

>> But maybe 72% of what sendgrid sends is
>> spam?  (Knowing the spam % is actually a serious question.)
>
> sorry, didn't quite get back to stock for that  test, so I think it's
> only 1.1+1.5=2.6, so tuned for 52% spam...

FWIW, that is NOT how the math works for score determination. Even for the 
stock rules, which get programmatically adjusted as a set, that's not a 
"tuning" target that would be useful or even have a calculable solution.

The rule score tuning doesn't really pay any attention to aggregate score 
values except for whether they land above or below the threshold. If 100% of 
a sender's mail is 
ham that just happens to score 4.2, that's great. If it is 100% spam, all 
scoring 5.2, that's also great. If it is a 50/50 mix that SA scores perfectly 
at either 4.2 or 5.2, that would be astoundingly good. Message scores do NOT 
follow a distribution that can be approximated by any combination of 
statistically useful distributions, so they cannot support the sort of score 
arithmetic you are positing.
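
To make that concrete, here is a rough Python sketch (not anything SA itself 
runs) of what "great" means in those three cases. The sender mixes and the 
names classify and error_counts are hypothetical, and it assumes a message is 
tagged when its score reaches the default threshold of 5.0:

```python
THRESHOLD = 5.0  # assumed default required score


def classify(score, threshold=THRESHOLD):
    """Return True when a message score reaches the spam threshold."""
    return score >= threshold


def error_counts(messages, threshold=THRESHOLD):
    """Count (false positives, false negatives) for (score, is_spam) pairs."""
    false_pos = sum(
        1 for score, is_spam in messages
        if classify(score, threshold) and not is_spam
    )
    false_neg = sum(
        1 for score, is_spam in messages
        if not classify(score, threshold) and is_spam
    )
    return false_pos, false_neg


# Hypothetical senders: only the side of the threshold matters.
all_ham = [(4.2, False)] * 10                         # 100% ham at 4.2
all_spam = [(5.2, True)] * 10                         # 100% spam at 5.2
mixed = [(4.2, False)] * 5 + [(5.2, True)] * 5        # 50/50, scored perfectly

for name, batch in [("all_ham", all_ham), ("all_spam", all_spam), ("mixed", mixed)]:
    print(name, error_counts(batch))  # each prints (0, 0)
```

All three cases come out with zero errors, even though the ham sits at 84% of 
the threshold. Nothing in that result depends on how close 4.2 is to 5.0.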

I wish Justin had originally made the base score -5 and the threshold 0. It's 
20 years too late to fix that, but it would have made it easier for people to 
avoid wrong mathematical assumptions about the value of the aggregate score of 
a message.
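
A small Python sketch of why that shift would be purely cosmetic: the two 
functions below are hypothetical stand-ins, not SA code, and assume a message 
is tagged when its score reaches the threshold. Verdicts are identical either 
way; only the temptation to read score/5 as a percentage goes away.

```python
def is_spam_stock(score):
    """Existing convention: base score 0, threshold 5."""
    return score >= 5.0


def is_spam_shifted(score):
    """Hypothetical convention: base score -5, threshold 0."""
    return (score - 5.0) >= 0.0


# Sample aggregate scores, including the ones discussed in this thread.
scores = [0.0, 2.6, 3.6, 4.2, 5.0, 5.2, 9.9]
verdicts_match = all(is_spam_stock(s) == is_spam_shifted(s) for s in scores)
print(verdicts_match)  # True
```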


-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
