Mark wrote:

>> Do yahoo and python.org enforce a shorter time-out?

> Highly doubtful. RFC 2821, Section 4.5.3.2 ("Timeouts") gives you a 2
> minutes window while awaiting the "354 Start Input" reply to a DATA
> command.

Which is of course irrelevant, since SpamAssassin must be called after the client sends the data, not while the client is waiting for permission to send it.

The relevant timeout is the one that applies after the client sends <CRLF>.<CRLF> to end the data and then waits for 250 or an error. RFC2821 4.5.3.2 specifies that this timeout SHOULD be at least 10 minutes.
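For reference, the minimum client-side timeouts listed in RFC2821 4.5.3.2 can be written down as a small table. This is only a sketch for comparing a configured value against the RFC minimum; the dictionary keys and the function name are labels made up for this example, not protocol tokens or any library's API.

```python
from datetime import timedelta

# Minimum per-command client timeouts from RFC 2821, Section 4.5.3.2.
# The keys are hypothetical labels chosen for this sketch.
RFC2821_MIN_CLIENT_TIMEOUTS = {
    "initial_220": timedelta(minutes=5),         # waiting for the greeting
    "mail": timedelta(minutes=5),                # reply to MAIL
    "rcpt": timedelta(minutes=5),                # reply to RCPT
    "data_initiation": timedelta(minutes=2),     # waiting for 354 after DATA
    "data_block": timedelta(minutes=3),          # completion of each send of a data chunk
    "data_termination": timedelta(minutes=10),   # waiting for 250 after <CRLF>.<CRLF>
}


def is_below_rfc_minimum(phase: str, configured: timedelta) -> bool:
    """Return True if the configured timeout is shorter than the RFC minimum."""
    return configured < RFC2821_MIN_CLIENT_TIMEOUTS[phase]
```

With this table, a one minute timeout after data end is flagged (`is_below_rfc_minimum("data_termination", timedelta(minutes=1))` is true), while the 2 minutes quoted above belong to the "data_initiation" phase, which is not the phase where SpamAssassin runs.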

> But please note that timeouts in this Section are a SHOULD, not
> a MUST.

Exactly. But one should remember that RFC2821 has this to say about SHOULD:

2.2.2:
---8<---
there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.
---8<---

I'm not convinced that all mailing list operators satisfy that (see below).

While I have no idea what Yahoo or python.org have set those timeouts to, I have seen mail from list servers with a timeout as low as 1 (one) minute.

(The rationale for one such system was that they needed to empty their mail queue as fast as possible. When told that the low timeout might in reality make them empty their queue slower rather than faster (because their system had to retry mail that would otherwise have gone out on the first try), they simply didn't listen.)

>> Or does some characteristic of list mail make it take longer to
>> process in sa?

> Headers are part of the DATA stream. Hence, at the time a connecting
> server is awaiting your "354 Start Input" reply to their DATA command, the
> whole notion of a (possible) mailing-list is simply not within scope yet
> (as the headers have not been received yet).

But... SpamAssassin should not be called before the DATA is sent. SpamAssassin is a content filter that scans the data (including the headers).

> It's quite possible for a connecting mail server to retry after it
> received, say, one of the many 4.x.x codes. However, what's definitely
> broken in your setup is that, in such a scenario, apparently your LDA
> already delivers (part of an) aborted mail.

The LDA delivers what it gets, and really can't know whether the mail is complete or not. It's the responsibility of the server to decide whether an incomplete mail should be discarded or sent to the LDA.

Once the client has sent data end (<CRLF>.<CRLF>) and the server has received it, it's normal (and correct) for the server to deliver the mail and then send a relevant response to the client.

If the client disconnects (due to a timeout or other problem) after sending <CRLF>.<CRLF> but before getting a response from the server, this tends to result in the server delivering the message and the client retrying it.

RFC2821 specifically mentions this.

4.5.3.2:
---8<---
When the receiver gets the final period terminating the message data, it typically performs processing to deliver the message to a user mailbox. A spurious timeout at this point would be very wasteful and would typically result in delivery of multiple copies of the message, since it has been successfully sent and the server has accepted responsibility for delivery.
---8<---

6.1:
---8<---
To avoid receiving duplicate messages as the result of timeouts, a receiver-SMTP MUST seek to minimize the time required to respond to the final <CRLF>.<CRLF> end of data indicator.
---8<---

The reasoning behind this behaviour is that the server doesn't know whether the client received the 250 response or not. If the client did receive it, the server MUST take responsibility for delivering the message (or notifying the sender of a failure if it can't deliver it). So the server takes the safe route of assuming the client did receive the positive response.

Similarly, since the client doesn't know whether the server managed to deliver the message or not, the client MUST take responsibility for retrying the message (or notifying the sender of a failure).

The basic idea is that it's more important not to lose mail than to always avoid duplicates.
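The effect can be shown with a toy model. This is a sketch under simplifying assumptions, not a description of any real MTA: the function name is made up, the server is assumed to need the same processing time on every attempt, and retry scheduling is ignored.

```python
def simulate_delivery(server_processing_s: float,
                      client_timeout_s: float,
                      max_attempts: int = 3) -> tuple[int, int]:
    """Toy model of the exchange after <CRLF>.<CRLF>.

    Returns (copies_delivered, attempts_made).
    """
    copies = 0
    for attempt in range(1, max_attempts + 1):
        # The server has received data end, so it delivers the message
        # whether or not the client is still waiting for the reply.
        copies += 1
        if server_processing_s <= client_timeout_s:
            # The client saw the 250 and removes the message from its queue.
            return copies, attempt
        # The client timed out before the 250 arrived. It cannot know the
        # message was delivered, so it keeps it queued and retries.
    return copies, max_attempts
```

With a server that needs 90 seconds and a client that waits 60 seconds, `simulate_delivery(90, 60)` returns `(3, 3)`: three copies delivered and the message still sitting in the sender's queue. With a 600 second client timeout, `simulate_delivery(90, 600)` returns `(1, 1)`.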

> That seems too strange for
> words; so, likely, you're not sending out proper reply codes to connecting
> mail servers.

Quite the opposite. It seems perfectly normal. His servers are most likely sending the correct responses to the client, but the client disconnects due to a timeout before getting the response and therefore doesn't know that the mail was eventually accepted for delivery.

> So, the solution is simple: fix your mail server; or have someone do it
> for you.

There are two parts to the solution.

Part One: Fix his mail server so that it does not take overly long to respond to data end (<CRLF>.<CRLF>). This is a MUST.

It is, of course, quite possible that a normally perfectly good mail server is temporarily overloaded, has disk problems, network problems, or for some other temporary reason takes a long time to respond. This is not a failure to follow RFC2821. This seems to be what happened in this case. It is the reason part two is needed.

Part Two: Fix the sending systems so that they do not use an inappropriately low timeout after data end (<CRLF>.<CRLF>). There's a reason why it SHOULD be 10 minutes.
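As an illustration of part two on the sending side, here is a minimal sketch using Python's smtplib. It is only an example of a client that waits long enough, not how any particular list server is configured. Note that smtplib's timeout is a single socket timeout applied to every blocking operation, not a per-phase setting, so using 10 minutes here is a simplification. The constant, the function name, and the "client.example" hostname are assumptions made for this example.

```python
import smtplib

# 10 minutes, the RFC2821 4.5.3.2 minimum for the wait after <CRLF>.<CRLF>.
DATA_TERMINATION_TIMEOUT_S = 10 * 60


def make_client(timeout_s: float = DATA_TERMINATION_TIMEOUT_S) -> smtplib.SMTP:
    """Create an unconnected SMTP client with a generous socket timeout.

    No host is passed, so no connection is opened here; the caller
    connects later with .connect(host, port).
    """
    # local_hostname is a placeholder; it is given explicitly so that the
    # constructor does not have to look up the local FQDN.
    return smtplib.SMTP(local_hostname="client.example", timeout=timeout_s)
```

A caller would then do `client = make_client()` followed by `client.connect(host, port)` and `client.sendmail(...)`, and the client will wait up to 10 minutes for the reply after data end before giving up.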

Regards
/Jonas
--
Jonas Eckerman
Fruktträdet & Förbundet Sveriges Dövblinda
http://www.fsdb.org/
http://www.frukt.org/
http://whatever.frukt.org/
