Viktor Dukhovni wrote:
> Bob Proulx wrote:
> > > > ... http://postmaster.comcast.net/smtp-error-codes.php#RL000001 (in 
> > > > reply to MAIL FROM command))
> > > 
> > > Look carefully at the log entry.  The "421" is sent in response to "MAIL
> > > FROM", not "RCPT TO".  So the recipient limit does not look entirely
> > > plausible.
> > 
> > If this was other than Comcast I would accept that immediately at face
> > value but a long history says that Comcast is an unreliable narrator
> > in this story.  :-(
> 
> The (in reply to MAIL FROM command) is added by Postfix, not Comcast, so
> that is not subject to doubt.  Because the recipients were not yet
> processed when "MAIL FROM" is received, it seems unlikely that the
> recipient count is the issue, rather simply trying later solved the
> problem.

But until we have results from a test with pipelining disabled, do we
really know exactly when the rejection occurred?  With pipelining,
isn't the entire set of commands sent first and the results for the
set returned afterward?

I haven't heard back with the results of the experiment with turning
off pipelining, so I can't report either way on the outcome.

> The Comcast message is pretty clear, they're rate limiting on total
> recipients per unit time, based on IP reputation.  I see no evidence
> of hiding anything, the link in question publishes the numbers, ...

I don't see how three recipients in an hour can in any way exceed
their published rate limits.  That's the evidence that they are doing
something that they have not published.

> Some legitimate senders may find it difficult to deliver the email they
> expect to send under this policy, but that does not mean that the policy
> is in any way something other than what it appears to be.

They publish that for a sender with an "N/A" SenderScore they will
accept 120 recipients per hour.

    https://postmaster.comcast.net/smtp-error-codes.php#RL000001

Sending 3 recipients per hour, or day, or week in this case, is well
below that published threshold.  My friend's IP most likely falls
into the "N/A" SenderScore category, so it should be allowed to send
up to 120 recipients per hour.  There is no way that three would
exceed this limit.

> > Also the empirical testing showed that forcing one recipient per
> > message succeeds while three per message was rejected.  Interlaced.
> 
> Which only proves that sending at some later time did not run into the
> rate limit.  Nothing else.

Sorry if I failed to describe this adequately.  Let me describe this
again.

The three-recipient messages were always given a 421, even after the
single-recipient messages were accepted.  We could send test messages
with a single recipient and they were accepted immediately, without
delay.  Repeatedly.  But every attempt to send a three-recipient
message resulted in a 421 response, with the message staying in the
queue and retrying until it expired.  This continued after the
single-recipient messages had been accepted.

I understand a 421 is not a permanent failure.  But retrying until the
message expires and generates a bounce notification is still not a
successful delivery.  Over the course of a couple of days this was the
timeline.

    1. Three recipients: 421 repeatedly
    2. Single recipients: status=sent immediately
    3. Three recipients: 421 repeatedly
    4. Single recipients: status=sent immediately

It seems that at some level they are bucketing senders, allowing
single-recipient messages through but blocking multiple-recipient
messages.
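
For reference, one way to force one recipient per message for a
single destination in Postfix is sketched below.  This is a sketch
under assumptions, not necessarily the configuration used in the
tests described here: the transport name "comcast" is made up, the
master.cf columns should be copied from the existing smtp entry, and
the paths assume a typical /etc/postfix layout.

```text
# /etc/postfix/master.cf
# Clone of the stock "smtp" service under a hypothetical name.
comcast   unix  -       -       n       -       -       smtp

# /etc/postfix/main.cf
transport_maps = hash:/etc/postfix/transport
# Per-transport override: at most one recipient per delivery.
comcast_destination_recipient_limit = 1

# /etc/postfix/transport  (then run: postmap /etc/postfix/transport)
comcast.net     comcast:
```

Note that a destination recipient limit of 1 also changes the
meaning of the per-destination concurrency limit from per-domain to
per-recipient, so the matching concurrency setting may need
attention on a busier server.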

And 3 is less than 120 per hour, the lowest of their published rate
limits.  Even with all of the testing, the logs show something in the
range of a dozen to twenty messages over the last week or so, so this
is definitely a very low volume server by any measurement.

Any proposed failure model must account for this behavior, or it does
not match what has been observed experimentally.

> > So while Comcast may be putting sites into buckets and some buckets
> > are allowed to send to three at the same time and others are not,
> > knowing what kind of rule is being applied does not help work with it.
> 
> There is zero evidence that per message recipient counts are pertinent,
> unless the rate limit quantum goes all the way down to 1 recipient
> per unit time, in which case you have a bit of a problem.

I'll emphasize that the evidence did show this as I repeated above.

And yes I think it does show that we have a bit of a problem. :-(

> > > A good test would be to disable "pipelining" in a custom
> > > smtp(8) transport, and use that for Comcast.  That would definitely
> > > rule out recipient count limits if the reject is still at "MAIL FROM".
> > 
> > Sounds like a good thing to test.  Will give that a try.
> 
> Do the pipelining test.  The first recipient should be accepted, but
> then if the subsequent recipients "421" (instead of 450 or similar)
> that'd be unfortunate, I might reach out to comcast email engineers for
> a chat about that...

I do not have any results from the pipeline test but will push my
friend to do the test.
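
For reference, the pipelining test suggested above might be set up
roughly as follows.  This is only a sketch under assumptions: the
transport name "comcast-nopipe" is made up, the master.cf columns
should be copied from the existing smtp entry, and the paths assume
a typical /etc/postfix layout.  The smtp_discard_ehlo_keywords
setting makes the Postfix SMTP client ignore the listed keywords in
the server's EHLO response, so it does not pipeline commands to that
destination.

```text
# /etc/postfix/master.cf
# Clone of the stock "smtp" service under a hypothetical name,
# with the PIPELINING announcement ignored.
comcast-nopipe  unix  -       -       n       -       -       smtp
    -o smtp_discard_ehlo_keywords=pipelining

# /etc/postfix/main.cf
transport_maps = hash:/etc/postfix/transport

# /etc/postfix/transport  (then run: postmap /etc/postfix/transport)
comcast.net     comcast-nopipe:
```

With this in place each command waits for its own reply, so a 421
that is still logged as "in reply to MAIL FROM command" cannot have
been influenced by RCPT TO commands that were not yet sent.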

However, as a workaround we have arranged to relay these very few
messages out through my server instead.  Judging by the traffic
counts I can see in my logs, I send enough mail to Comcast servers
that the SenderScore IP reputation of my outbound address must be
non-zero somewhere in their table, and it is not currently being
rate limited.  Since we have a workaround, it's possible that we
will stop poking at things.
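
For reference, a relay workaround of this kind can be expressed on
the sending host with a transport map entry.  This is a minimal
sketch under assumptions: relay.example.org and port 587 are
placeholders, the paths assume a typical /etc/postfix layout, and
the relaying server must separately be configured to accept mail
from the sending host (for example via mynetworks or SASL
authentication).

```text
# /etc/postfix/main.cf
transport_maps = hash:/etc/postfix/transport

# /etc/postfix/transport  (then run: postmap /etc/postfix/transport)
# Send only comcast.net mail via the relay; the square brackets
# suppress MX lookups on the relay host name.
comcast.net     smtp:[relay.example.org]:587
```

Routing only the affected domain this way leaves delivery to every
other destination unchanged.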

But it would still be nice to have a good Postfix documented Best
Practice for dealing with situations such as this one.

> > Comcast is an unreliable narrator.
> 
> That's hyperbole.  There's no evidence for that.

I agree that was my opinion based upon long experience.  But ask
anyone who has had to deal with Comcast over years and I believe this
would be the majority opinion.  Search the web for ratings of their
customer service and see what you find.  It's not good!

And actually I received direct mail responses from a few others who
read this list.  They commiserated with me, related similar problems,
and suggested the workarounds they have implemented to deal with the
issue.  It's comforting to know we are not the only ones with this
problem.

With things mostly resolved on our end now, though, I am going to go
back to letting sleeping dogs lie on this.

Bob
