Seems I stumbled across another manifestation of a delayed-ACK holdups
of 100 ms in the communication over a loopback interface between postfix
and a content filter in a proxy (pre-queue) setup ... similar to what
was already resolved for a milter setup:
  http://archives.neohapsis.com/archives/postfix/2007-07/1101.html
  http://archives.neohapsis.com/archives/postfix/2007-07/1129.html

Environment:
  FreeBSD 7.2 or 8.1, postfix-current-2.8.20100728,
  the MTU of a loopback interface is 16384 bytes (a default)

The problem became apparent when plotting a chart of time that
amavisd spends in a DATA state of a SMTP session while receiving
a message from a smtpd service in a proxy setup. For message sizes
between approx 4000 and 34000 bytes the elapsed time is almost
exactly 100 ms (with one small gap, see attached chart), while
for shorter and for longer messages the elapsed time is smaller
and is proportional to a message size.

The problem is reproducible in a simplified setup, where
a proxy content filter is substituted by smtp-sink:

$ smtp-sink -v inet:127.0.0.1:10222 10

and the following is placed in master.cf:

10111    inet n    -       n       -      -       smtpd
    -o smtpd_proxy_filter=[127.0.0.1]:10222
    -o mynetworks=127.0.0.0/8,[::1]
    -o smtpd_client_restrictions=permit_mynetworks,reject

then a sample message is submitted into port 10111:

$ mini_sendmail -f...@example.com -s127.0.0.1 -p10111 x...@example.com \
  < sample.msg

The experiment is repeated with three sample messages of different
sizes: 7.0 KiB, 3.6 KiB and 26 KiB. As expected, the 100 ms delay
only occurs for the middle-size message of 7 KiB.

The three tcpdumps are attached. Each holds two interleaved SMTP
sessions, one between mini_sendmail and smtpd (port 19111),
the other one between smtpd and the smtp-sink (port 10222) acting
as a proxy content filter.

The interesting one is the tcpdump-7kB-delay.log, and its noteworthy
section is:

$ tcpdump -s 0 -n -ttt -r tcpdump-7kB-delay.log
00:00:00.000053 IP 127.0.0.1.22538 > 127.0.0.1.10111: Flags [P.], ack 114,
  win 8960, options [nop,nop,TS val 2346345874 ecr 3462397426], length 7505
00:00:00.000077 IP 127.0.0.1.35048 > 127.0.0.1.10222: Flags [P.], ack 257,
  win 8960, options [nop,nop,TS val 2346345874 ecr 299545718], length 4096
00:00:00.102767 IP 127.0.0.1.10111 > 127.0.0.1.22538: Flags [.], ack 7596,
  win 8960, options [nop,nop,TS val 3462397526 ecr 2346345874], length 0

Actually it seems that the holdup occurs on the primary session
(mini_sendmail -> postfix), and the proxy session just sees the
side effect of a delay on the primary session.

Observations:
- the problem is avoided for short and for long messages;
- setting MTU on a loopback interface to 1500 avoids the problem
  (which is probably just another way of saying that all messages are long);
- even though the received packet was of 7596 bytes and the MTU is 16k,
  the proxy session chopped it at 4096 bytes, leaving the rest for the
  next packet;
- TCP window was never depleted;
- feeding a message across a true ethernet interface (MTU 1500) does
  not eliminate the problem, as long as the proxy session stays on
  a loopback interface (using lo0 for both just makes it easier to
  capture a tcpdump)

I wonder if there is anything that should be done about it.

  Mark

Attachment: tcpdumps.tar.gz
Description: application/compressed-tar

<<attachment: timing-chart.png>>

Reply via email to