On Fri, Jun 23, 2006 at 09:16:44PM +0200, Benoit Branciard wrote:
> Package: linux-source-2.6.16
> Version: 2.6.16-2
>
> When a great number of IPv6 TCP connections are initiated from the Linux
> machine at a high rate, some of them get stalled in SYN_SENT state and
> eventually time out after tcp_syn_retries (about 3 minutes).
>
> The remote server does NOT seem to see the connection at all (no
> SYN_RECV reported by netstat).
>
> This behaviour was noticed initially using LDAP queries. Further
> investigation showed the same problem with SMTP requests, but NOT
> with HTTP (maybe related to the short-lived TIME_WAIT state of HTTP
> connections?).
> The failure rate is about 1-2 in 5000 on a busy machine (for example one
> hosting a web server), and harder to obtain on a quiet one.
>
> How to reproduce:
>
> - have a dual-stack LDAP or SMTP server ready, on an IPv6-enabled network
>   (let's call it myserver)
>
> - on the Linux client to be tested, launch a loop of quick TCP
>   connections to myserver:
>
> --> example 1: loop of 5000 anonymous LDAP searches from a bash shell:
>
> $ i=0; while [ $i -lt 5000 ] ; do ldapsearch -H ldap://myserver -x -b dc=mydomain,dc=myroot '(uid=someuid)' > /dev/null ; i=$((i+1)) ; [ $((i%100)) -eq 0 ] && echo $i ; done
>
> --> example 2: loop of 5000 SMTP connections from a bash shell (uses the
> echoping package):
>
> $ i=0; while [ $i -lt 5000 ] ; do echoping -6 -S myserver >/dev/null ; i=$((i+1)) ; [ $((i%100)) -eq 0 ] && echo $i ; done
>
> Both examples should print the query number every hundred connections.
> If a connection gets stalled, the query count hangs, and a netstat
> command (in another shell) should display the stalled SYN_SENT connection:
>
> tcp6       0      0 myclient.mydomain:51930 myserver.mydomain:ldap  TIME_WAIT
> (.. a bunch of other TIME_WAIT closing connections ..)
> tcp6       0      1 myclient.mydomain:51940 myserver.mydomain:ldap  SYN_SENT
>
> The number of TIME_WAIT connections in our case is a few hundred,
> so the tcp_max_tw_buckets value should not be an issue.
>
> The same experiments have NOT shown any stalled connections when using
> IPv4 under the same conditions (either by explicitly specifying the IPv4
> address of myserver, or by means of the "-4" option of echoping).
>
> We are using Debian GNU/Linux 3.1, libc6 2.3.2.ds1-22sarge3, and a
> compiled linux-source-2.6.16 (2.6.16-2) kernel with the stock
> 2.6.16-1-686-smp (or amd64-k8) unmodified config file.
>
> The same results have been obtained using several physical Debian client
> machines with similar config and different ethernet adapters (e1000 and
> tg3), against several LDAP or SMTP servers, and with various ethernet
> switches.
>
> Also noted on a Mandriva Linux 2006.0 client with 2.6.12-18mdk kernel
> and glibc-2.3.5-5mdk.
>
> So this sounds like a general bug in the Linux 2.6 IPv6 TCP stack.
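
The loops quoted above block for about three minutes whenever a connection
stalls, which makes a retest slow. A minimal sketch of a wrapper that caps
each attempt and counts the ones that run out of time, assuming GNU coreutils
`timeout` is available on the test machine. The function name `probe_loop`
and the limit passed to it are hypothetical, chosen only for illustration;
they are not part of the original report.

```shell
# probe_loop COUNT LIMIT CMD [ARGS...]
# Hypothetical helper: runs CMD COUNT times, each under a LIMIT-second
# timeout, and reports how many iterations had to be killed.
probe_loop() {
    local count=$1 limit=$2
    shift 2
    local i=0 stalled=0 rc
    while [ "$i" -lt "$count" ]; do
        i=$((i + 1))
        timeout "$limit" "$@" >/dev/null 2>&1
        rc=$?
        # coreutils timeout exits with status 124 when it killed the command
        if [ "$rc" -eq 124 ]; then
            stalled=$((stalled + 1))
            echo "iteration $i: no answer within ${limit}s" >&2
        fi
    done
    echo "$stalled of $count iterations timed out"
}
```

For example, `probe_loop 5000 10 echoping -6 -S myserver` would repeat the
SMTP test from the report with a 10-second cap per attempt. A timeout alone
cannot tell a lost SYN from a slow server, so any hit still needs to be
checked with netstat for a socket in SYN_SENT.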

Does this error still occur with more recent kernel versions?

Cheers,
        Moritz