Rob wrote:
> > > You figured it out correctly. However at that moment TCP flow control
> > > would kick in and save you from local packet loss, so to say.
>
> Hi,
>
> Thanks for the response, but you have actually confused me more. It is
> my understanding that TCP doesn't have flow control (i.e., local to the
> node), it has congestion control, which is end-to-end across the network.
tcp_output() knows about queue overflows: when it tries to send packets
and the interface queue is full, it instantly gives up until ACKs come
in. This avoids real packet drops locally and allows it to retry the
same packet microseconds later without having to wait for the other
side to signal a real packet loss.

> So it is entirely possible to drop packets locally in this method
> with a high-bandwidth, high-latency (so called "long-fat") connection.

Packets are not exactly dropped. tcp_output() generates and adds one
packet after the other to the interface queue. The moment this no
longer works, it breaks out of that loop, even if the window would
allow it to send more. The failing packet is not dropped but retried
later, and further packets are not produced for the time being. It
just defers further delivery.

A local loss would be different. There it would fail with one packet,
drop it, and try again with the next packet, which may go through if
the network card was done with an earlier packet at that moment.
However, this is not what happens. Because all of this happens inside
the kernel, TCP knows how to deal with it in the right way and avoids
any costly recoveries from these cases.

> For example, if there were a gigabit/second link, with a latency of
> 100 milliseconds RTT, and window scaling set to 14 (the max), TCP could
> in theory open its congestion window up to 2^16*2^14 or 2^30 bytes,
> which could be ACK'd more quickly than the net.inet.ip.intr_queue_max
> queue would allow for, causing packets to be dropped locally. Basically,
> the bandwidth-delay product dictates the size the buffer/queue should be,
> and in the above (extreme) example, it should be 0.1s*1Gb/s=12.5MB, which
> is larger than the 50 packets at 1500 bytes each that you get
> with net.inet.ip.intr_queue_max=50.

You are mixing up the socket buffer size with the interface queue
size. Those are not the same.
TCP will converge to the maximal link speed and is then bound by the
output rate of the interface, even if the window gets larger.

> In other words, this is the reason for the net.inet.ip.intr_queue_drops
> counter, right? I'm surprised that more of the tuning guides don't

No, not directly. This counter comes into play when the box is doing
routing. Drops happen when you go from a high-bandwidth link to a
lower-bandwidth link and overload the slow link. On a box not doing
routing you shouldn't see any drops.

> suggest increasing net.inet.ip.intr_queue_max to a higher value - am I
> missing something? The equivalent setting in Linux is 1000, and Windows
> 2k appears to be 1500 (not that this should necessarily be taken as any
> sort of endorsement).

This doesn't really help. With larger queues you just fill the larger
queues, and it takes longer until TCP's limiting kicks in. The queues
have to have some size to provide some smoothing of burstiness, but
should not be too long, so that some direct feedback stays in place.

> If my understanding is incorrect, please let me know. In any case,
> thanks for the help (and thanks to those that have replied off list).

--
Andre
_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"