This part of the thread sounds really familiar.  I recall someone coming
up with a patch for this a few weeks ago, possibly committing it to
-current.  I'm too tired and it's too late, though; I'll look for it
tomorrow if Matt doesn't find the thread in the archives first.

Mike "Silby" Silbersack


On Sun, 2 Dec 2001, Matthew Dillon wrote:

>
> :curious, as the loopback's MTU is normally 16384.
> :Also, any idea on where does the 4096 limit (1460*2+1176) come from ?
> :
> :     cheers
> :     luigi
>
>     It comes from the size of an mbuf cluster, which is 2K.  If you are
>     trying to send 4100 bytes of data what winds up happening is this:
>
>       * construct 2048 byte mbuf and queue    (TF_MORETOCOME set)
>               1460 byte packet gets pushed out
>       * construct 2048 byte mbuf and queue    (TF_MORETOCOME set)
>               1460 byte packet gets pushed out
>               (1176 bytes left over in mbuf)
>           <<--- ack is received (semi synchronous)
>               1176 bytes in transmit buffer are pushed out due to the ack
>       * construct 4 byte mbuf and queue       (TF_MORETOCOME clear)
>               4 bytes are pushed out due to TCP_NODELAY being set.
>
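
To make the arithmetic in that trace easy to check, here is a rough
sketch of it as a toy model.  It is not kernel code.  The function name
simulate_write and the ack_after parameter are made up for illustration,
and the model assumes that full-sized segments go out as soon as they are
queued, that an ack arriving between sub-writes flushes whatever is left,
and that the final sub-write always flushes.

```python
def simulate_write(total, mss, chunk=2048, ack_after=()):
    """Return the segment sizes sent for one write of `total` bytes.

    chunk     -- bytes queued per sub-write (2K, per the description above)
    ack_after -- 0-based indices of sub-writes after which an ack arrives
                 (hypothetical knob; real ack timing is not this tidy)
    """
    segments = []
    queued = 0
    remaining = total
    index = 0
    while remaining > 0:
        n = min(chunk, remaining)
        remaining -= n
        queued += n
        more_to_come = remaining > 0

        # Full-sized segments are always pushed out.
        while queued >= mss:
            segments.append(mss)
            queued -= mss

        if queued > 0 and not more_to_come:
            # Last sub-write: the flag is clear, so the tail goes out.
            segments.append(queued)
            queued = 0
        elif queued > 0 and index in ack_after:
            # An ack sneaks in between sub-writes and flushes the partial data.
            segments.append(queued)
            queued = 0
        index += 1
    return segments


print(simulate_write(4100, 1460, ack_after={1}))  # [1460, 1460, 1176, 4]
print(simulate_write(4100, 1460))                 # [1460, 1460, 1180]
```

With the ack landing after the second sub-write the model gives the
1460*2+1176 pattern Luigi asked about, plus the stray 4 byte segment.
Without the ack the tail goes out as a single 1180 byte segment.
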
>     There are two localhost MTUs.  If you use 'localhost' the MTU is 16384.
>     If you use the IP address of an ethernet interface on the machine the
>     MTU winds up being 1500 even though it is effectively a localhost
>     connection.  An MTU of 1500 generates the 1460 byte push-outs.
>
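
For reference, the 1460 figure is just the MTU minus the headers.  A
minimal sketch, assuming IPv4 and TCP headers of 20 bytes each with no
options (mss_for_mtu is a made-up helper name, not a kernel function):

```python
IP_HEADER = 20   # bytes, IPv4 header without options (assumption)
TCP_HEADER = 20  # bytes, TCP header without options (assumption)


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IP_HEADER - TCP_HEADER


print(mss_for_mtu(1500))   # 1460
print(mss_for_mtu(16384))  # 16344
```

Under the same assumptions the loopback MTU of 16384 leaves slightly less
than 16384 bytes of payload per segment, which does not change the
argument.
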
>     However, even with an MTU of 16384 you still have the same problem when
>     sending, say, 16384+2052 bytes of data.  After it pushes out a 16384 byte
>     segment it winds up with 2048 bytes queued in the mbuf and a
>     received ack (again, semi synchronous because this is localhost) will
>     cause it to push out the 2048 bytes prematurely, before the last 4 bytes
>     can get queued.
>
>     What we need is a mechanism in the tcp_input() code to NOT call
>     tcp_output() when an ACK is received, under certain circumstances.
>     I was thinking of taking the TF_MORETOCOME flag and causing it to be
>     left set for the duration of the write (except for the last sub-write).
>     At the moment it is set and cleared for each sub-write and the ack wiggles
>     its way in while it happens to be clear.  In any case, this would allow
>     tcp_input() to skip calling tcp_output() prematurely.  But it isn't so
>     easy to implement since the TF_ flags are in the 'tp' structure, not
>     the 'so' socket structure, and higher levels do not have direct access
>     to the tcp-specific 'tp' structure.
>
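
A toy model of the idea, to show what holding the flag across sub-writes
would change.  This is only a sketch under loud assumptions: ToyConn, run,
hold and ack_after are invented names, the flag value is arbitrary, and
none of it reflects the real tcpcb layout or the actual tcp_input() and
tcp_output() code paths.

```python
TF_MORETOCOME = 0x1  # illustrative value only, not the kernel's definition


class ToyConn:
    """Toy stand-in for a TCP connection; not FreeBSD's tcpcb."""

    def __init__(self, mss):
        self.mss = mss
        self.flags = 0
        self.queued = 0
        self.sent = []

    def output(self):
        """Send full segments; send a partial one only if the flag is clear."""
        while self.queued >= self.mss:
            self.sent.append(self.mss)
            self.queued -= self.mss
        if self.queued and not self.flags & TF_MORETOCOME:
            self.sent.append(self.queued)
            self.queued = 0

    def sub_write(self, nbytes, more, hold):
        """Queue one chunk; `hold` leaves the flag set after the call returns."""
        self.queued += nbytes
        if more:
            self.flags |= TF_MORETOCOME
        else:
            self.flags &= ~TF_MORETOCOME
        self.output()
        if not hold:
            self.flags &= ~TF_MORETOCOME  # cleared after each sub-write

    def ack(self):
        """Model the ACK path: skip output() while more data is coming."""
        if self.flags & TF_MORETOCOME:
            return
        self.output()


def run(total, mss, hold, ack_after, chunk=2048):
    """Drive one write of `total` bytes; acks arrive after the given sub-writes."""
    conn = ToyConn(mss)
    remaining = total
    index = 0
    while remaining > 0:
        n = min(chunk, remaining)
        remaining -= n
        conn.sub_write(n, more=remaining > 0, hold=hold)
        if index in ack_after:
            conn.ack()
        index += 1
    return conn.sent


print(run(4100, 1460, hold=False, ack_after={1}))    # [1460, 1460, 1176, 4]
print(run(4100, 1460, hold=True, ack_after={1}))     # [1460, 1460, 1180]
print(run(18436, 16384, hold=False, ack_after={8}))  # [16384, 2048, 4]
print(run(18436, 16384, hold=True, ack_after={8}))   # [16384, 2052]
```

In this model the tiny trailing segment disappears once the flag stays set
until the last sub-write.  The model sidesteps the hard part mentioned
above, namely that the layer doing the sub-writes cannot reach the flag
directly.
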
>                                       -Matt
>                                       Matthew Dillon
>                                       <[EMAIL PROTECTED]>
>
>
> To Unsubscribe: send mail to [EMAIL PROTECTED]
> with "unsubscribe freebsd-hackers" in the body of the message
>


