On 4/13/22 9:32 PM, Rob McEwen via mailop wrote:
fwiw, I've confirmed at some point within the past couple of years - directly with Brandon Long of Google - that, yes, Google does have this extra after-connection filtering, where a message can potentially be spam filtered even though the sender's mail server received a "250 OK" response.
Pre-coffee Devil's advocate here. Where do the RFCs say that a 250 (2xy) response message after the end of the DATA means that the message MUST be delivered to the mailbox?
My recollection is that once a receiving server accepts responsibility for a message, it is charged with not losing the message without a good reason to do so. What constitutes a "good reason"? Is actively choosing not to deliver the message sufficient? -- I have always taken this verbiage to be related to trivial things like committing to persistent storage so that things like a power failure don't cause the loss of a message.
But I don't remember anything implying that accepted messages MUST be delivered to the end user's mailbox. -- It's been a while since I've read the RFCs. -- If I'm mis-remembering, please point me to the relevant sections of the relevant RFCs that indicate that accepted messages (250 / 2xy) MUST be delivered to the recipient's mailbox.
If accepted messages MUST be delivered to mailboxes, how does quarantining messages jibe?
How does conditionally re-routing messages based on $CRITERIA jibe with MUST be delivered?
Does $ESP re-routing and delivering the message to a /different/ mailbox, other than the original intended recipient's email address, count as successfully delivering message between sending and receiving server?
I ask in the spirit of a 5 year old asking why fire is hot.
That the message may have been in the recipients' google spam folder - is something I already acknowledged, but that's "besides the point".
Is it /really/ besides the point? The message was successfully delivered from a sending server to a receiving server and it was not lost for trivial reasons (like power failure).
Also, while many mail hosters ALSO do this filtering technique, I consider this to /typically/ be an inferior spam filtering practice, although I'm open to the idea that doing it on a very limited basis might be OK (such as when an attachment has a /strong/ potential to be a zero-day virus that anti-virus systems are not yet detecting... stuff like that).
So -- if I may twist your words a little bit -- you admit that you are okay with filtering /after/ the message is accepted. And that it's a matter of numbers / ratio / scale. -- Is that an acceptable permutation of your words?
Let's put some hypothetical numbers to this. Let's say that this behavior is acceptable in 0.1% of messages, or 1 in 1000 messages. -- If Google processes 1,000,000,000 messages a day, 0.1% would be 1,000,000 messages a day filtered /after/ the message was accepted.
Even five nines (99.999%) of messages being handled properly would allow for 10,000 messages a day to be filtered /after/ the message was accepted.
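For anyone who wants to check the arithmetic, here is a minimal Python sketch. The function name filtered_per_day is made up for illustration, and the billion-a-day volume is the same hypothetical figure used above, not a real Google number.

```python
from fractions import Fraction


def filtered_per_day(daily_messages: int, filtered_percent: str) -> int:
    """Messages per day that fall into the filtered-after-acceptance share."""
    # Fraction parses the decimal string exactly, so there is no float rounding.
    return int(daily_messages * Fraction(filtered_percent) / 100)


DAILY = 1_000_000_000  # hypothetical daily volume, as assumed above

print(filtered_per_day(DAILY, "0.1"))    # the 0.1% (1 in 1000) case
print(filtered_per_day(DAILY, "0.001"))  # what five nines (99.999%) leaves over
```

The percentage is passed as a string so that values like 0.001 stay exact; the two calls correspond to the 1,000,000 and 10,000 messages a day mentioned above.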
ALSO - this reminds me - another inferior practice of some of these largest email providers - including Google - is the lack of support and willingness/ability to make changes in response to egregious filtering mistakes. IT staff of their customers are OFTEN told by these large providers - "it is what it is" - with no willingness to look into SMTP logs and figure out and fix exactly what went wrong - but level of service doesn't scale, right? (But yet they STILL charge premium prices per mailbox - so in spite of this - their REVENUE "scales"!)
On the flip side of the coin, when was the last time you had /any/ company of /any/ size make /any/ change based on a complaint? Of those complaints, how long did the change persist? -- In my limited experience, only the smallest of companies will be willing to make a change. There might even be an inverse relationship between the size and the likelihood that a change will be accepted and persist.
That being said, I have seen evidence that Google has made some changes in response to complaints. Admittedly they take much longer to happen than I think they should.
Slightly changing the subject and getting back to Paul Vixie's original post (the post that started this thread) about Google's lack of spam filtering transparency - unless he's referring to something bad of which I'm not aware - otherwise, I think he's being a little too picky - spam filtering is 1/2 sausage factory (if you saw it up close, you'd be shocked about how crazy it is) - and part front-lines war zone. So, in summary, spam filtering is "a sausage factory on the front lines of a war zone". Then Paul Vixie comes along and asks, "where's your TPS report?" (a reference from the movie "office space" - remember that?) - and those of us running spam filters are like "dude, we're just barely surviving, and everything is so fast paced and changing so often - we can't even think about that TPS report!"
I believe -- what I've heard loosely described as -- the law of large numbers applies equally well on the sending outbound side of the house as it does to the receiving inbound side of the house. In short, yes, Google sends LOTS of spam. But based on percentages, it's probably a much smaller ratio of spam to ham leaving their network than it is leaving most of our networks. So....
There is also the fact that what differentiates ESPs (et al.) is /how/ they do things. So, asking ~ demanding a TPS report -- good choice of words -- from them is in some ways like demanding proprietary information on the ESP's intellectual property.
The computer industry's demand for error information from hard drive manufacturers and the resultant S.M.A.R.T. comes to mind. -- Even if we did get some numbers from the ESPs, could we trust them? Could we even determine what they mean?
Or, in Paul Vixie's defense, maybe Paul is thinking about the fact that gmail's outbound spam has been absolutely INSANE the past several months, with no end or slowdown in sight. It's insane that this has gotten so little attention in recent months, and that Google keeps seemingly getting a free pass over that. So there's that, too.
I think that Google, et al., only get a free pass because some of us give it to them. -- In a word, "don't" (give large ESPs a free pass).
I'm not aware of anything, save for reputation systems, that comes close to justifying giving large ESPs a free pass. -- I have defended my spam filtering techniques to paying customers many times over the years.
Admittedly, I do carefully choose my spam techniques to be ones that apply equally to all messages and are agnostic of who they come from. -- I don't give Google, or any ESP, a free pass. -- The only free passes I give are my own systems not filtering each other.
And it's bizarre that so many are so "OK" with that! (or pretend that this isn't happening?) Or - again - maybe Paul Vixie knows something that many of us don't know about - regarding Google's mail system! I wouldn't rule that out.
The only initial reaction I had to Dr Vixie's comment was that we would be remiss to /not/ mention the biggest players in the email industry as a possible option. Not mentioning them would be akin to not talking about the elephant in the room. -- There's nothing that says that our mention of the elephant must be positive. We could very easily say something like "there's always $BIG_ESP, but they are a black box and you get what you get, use them at your own risk".
-- Grant. . . . unix || die
_______________________________________________ mailop mailing list mailop@mailop.org https://list.mailop.org/listinfo/mailop