Re: Odd behaviour under load.

Jonas Eckerman Fri, 08 May 2009 07:14:13 -0700

Mark wrote:

Do yahoo and python.org enforce a shorter time-out?

Highly doubtful. RFC 2821, Section 4.5.3.2 ("Timeouts") gives you a 2
minutes window while awaiting the "354 Start Input" reply to a DATA
command.

Which is of course irrelevant, since SpamAssassin must be called after the client sends the data, not while the client is waiting for permission to send it.

The relevant timeout is the one after the client sends <CRLF>.<CRLF> to end the data and then waits for 250 or an error. RFC2821 4.5.3.2 specifies that this timeout SHOULD be 10 minutes.
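
For reference, the per-stage client timeouts from RFC2821 4.5.3.2 can be written down as a small Python table. This is only a sketch: the names RFC2821_CLIENT_TIMEOUTS and is_below_recommendation are made up for illustration and do not come from any MTA or library.

```python
# Minimum per-stage client timeouts from RFC 2821, section 4.5.3.2, in seconds.
# The dictionary keys and the helper name are hypothetical, chosen for this sketch.
RFC2821_CLIENT_TIMEOUTS = {
    "initial_220": 5 * 60,        # waiting for the greeting
    "mail": 5 * 60,               # waiting for the reply to MAIL
    "rcpt": 5 * 60,               # waiting for the reply to RCPT
    "data_initiation": 2 * 60,    # waiting for 354 after DATA
    "data_block": 3 * 60,         # waiting for each TCP send of a data chunk
    "data_termination": 10 * 60,  # waiting for 250 after <CRLF>.<CRLF>
}


def is_below_recommendation(stage, configured_seconds):
    """Return True if a configured timeout is shorter than the RFC's SHOULD value."""
    return configured_seconds < RFC2821_CLIENT_TIMEOUTS[stage]
```

With this table, a one minute timeout after data end is flagged (is_below_recommendation("data_termination", 60) is True), while the 2 minutes quoted above belongs to a different stage ("data_initiation").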


> But please note that timeouts in this Section are a SHOULD, not
> a MUST.


Exactly. But one should remember that RFC2821 has this to say about SHOULD:

2.2.2:
---8<---

there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.

---8<---

I'm not convinced all mailing list operators satisfy that (see below).

While I have no idea what Yahoo or python.org have set those timeouts to, I have seen mail from listservers with as low as a 1 (one) minute timeout.

(The rationale for one such system was that they needed to empty their mail queue as fast as possible. When told that the low timeout in reality might make them empty their queue slower rather than faster (because their system had to retry mail that would otherwise get out on the first try), they simply didn't listen.)
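
A rough bit of arithmetic shows why. The sketch below is a toy model, not a measurement: the function name, the response times and the 30 minute retry interval are all assumptions picked for illustration.

```python
def time_to_leave_queue(response_times, timeout, retry_interval):
    """Return (elapsed_seconds, attempts) until one attempt gets its reply in time.

    response_times: seconds the receiver needs to answer data end, one per attempt.
    timeout:        seconds the client waits after <CRLF>.<CRLF> before giving up.
    retry_interval: seconds the client waits before trying the message again.
    """
    elapsed = 0
    for attempt, needed in enumerate(response_times, start=1):
        if needed <= timeout:
            return elapsed + needed, attempt
        # The client gives up after `timeout` seconds and queues the mail for retry.
        elapsed += timeout + retry_interval
    raise RuntimeError("no attempt succeeded within the given response times")


# Assumed scenario: the receiver is busy and needs 90 s twice, then 30 s.
busy_receiver = [90, 90, 30]
print(time_to_leave_queue(busy_receiver, timeout=60, retry_interval=1800))   # (3750, 3)
print(time_to_leave_queue(busy_receiver, timeout=600, retry_interval=1800))  # (90, 1)
```

In this made-up case the one minute timeout keeps the message in the queue for over an hour and costs three attempts, while the ten minute timeout gets it out in 90 seconds on the first try.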

Or does some characteristic of list mail make it take longer to
process in sa?

Headers are part of the DATA stream. Hence, at the time a connecting
server is awaiting your "354 Start Input" reply to their DATA command, the
whole notion of a (possible) mailing-list is simply not within scope yet
(as the headers have not been received yet).

But... SpamAssassin should not be called before the DATA is sent. SpamAssassin is a content filter that scans the data (including the headers).

It's quite possible for a connecting mail server to retry after it
received, say, one of the many 4.x.x codes. However, what's definitely
broken in your setup is that, in such a scenario, apparently your LDA
already delivers (part of an) aborted mail.

The LDA delivers what it gets, and really can't know whether the mail is complete or not. It's the responsibility of the server to decide whether a non-complete mail should be discarded or sent to the LDA.

Once the client has sent data end (<CRLF>.<CRLF>) and the server has received it, it's normal (and correct) for the server to deliver the mail and then send a relevant response to the client.

If the client disconnects (due to a timeout or other problem) after sending <CRLF>.<CRLF> but before getting a response from the server, this tends to result in the server delivering the message and the client retrying it.


RFC2821 specifically mentions this.

4.5.3.2:
---8<---

When the receiver gets the final period terminating the message data, it typically performs processing to deliver the message to a user mailbox. A spurious timeout at this point would be very wasteful and would typically result in delivery of multiple copies of the message, since it has been successfully sent and the server has accepted responsibility for delivery.

---8<---

6.1:
---8<---

To avoid receiving duplicate messages as the result of timeouts, a receiver-SMTP MUST seek to minimize the time required to respond to the final <CRLF>.<CRLF> end of data indicator.

---8<---

The reasoning behind this behaviour is that the server doesn't know whether the client received the 250 response or not. If the client did receive it, the server MUST take responsibility for delivering the message (or notifying the sender of a failure if it can't deliver it). So the server takes the safe way of assuming the client did receive the positive response.

Similarly, since the client doesn't know whether the server managed to deliver the message or not, the client MUST take responsibility for retrying the message (or notifying the sender of a failure).

The basic idea is that it's more important to not lose mail than to always avoid duplicates.
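
The exchange described above can be played through with a toy simulation. This is a sketch under stated assumptions, not real SMTP: Receiver and send_with_retries are hypothetical names, and halving the processing time between attempts is just an arbitrary way of saying that the load eases over time.

```python
class Receiver:
    """Toy receiving server: delivers after data end, then replies 250."""

    def __init__(self, processing_seconds):
        self.processing_seconds = processing_seconds
        self.mailbox = []

    def handle_data_end(self, message):
        # The message is delivered whether or not the client is still
        # connected by the time the reply is ready.
        self.mailbox.append(message)
        return self.processing_seconds, "250 OK"


def send_with_retries(receiver, message, client_timeout, max_attempts=5):
    """Return the attempt on which the client saw a 250, or None if it never did."""
    for attempt in range(1, max_attempts + 1):
        seconds_needed, reply = receiver.handle_data_end(message)
        if seconds_needed <= client_timeout:
            return attempt  # the client saw the 250 and stops retrying
        # The client timed out first: it never saw the reply, so it must retry.
        receiver.processing_seconds /= 2  # assumption: load eases between retries
    return None


overloaded = Receiver(processing_seconds=200)
attempts = send_with_retries(overloaded, "list mail", client_timeout=60)
print(attempts, len(overloaded.mailbox))  # 3 3
```

With a 60 second client timeout the recipient ends up with three copies. With a 600 second timeout the same receiver gets one attempt and one copy. Nothing is lost in either case, which is the point of the trade-off.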


> That seems too strange for words; so, likely, you're not sending out
> proper reply codes to connecting mail servers.

Quite the opposite. It seems perfectly normal. His servers are most likely sending the correct responses to the client, but the client disconnects due to a timeout before getting the response and therefore doesn't know that the mail was eventually accepted for delivery.

So, the solution is simple: fix your mail server; or have someone do it
for you.


There are two parts to the solution.

Part One: Fix his mail server so that it does not take overly long to respond to data end (<CRLF>.<CRLF>). This is a MUST.

It is, of course, quite possible that a normally perfectly good mail server is temporarily overloaded, has disk problems, network problems, or for some other temporary reason takes a long time to respond. This is not a failure to follow RFC2821. This seems to be what happened in this case. It is the reason part two is needed.

Part Two: Fix the sending systems so that they do not use an inappropriately low timeout after data end (<CRLF>.<CRLF>). There's a reason why it SHOULD be 10 minutes.


Regards
/Jonas
--
Jonas Eckerman
Fruktträdet & Förbundet Sveriges Dövblinda
http://www.fsdb.org/
http://www.frukt.org/
http://whatever.frukt.org/
