On Sun, 3 Aug 2008, Robert Watson wrote:
This is an advance warning that, late next week, I will be merging a fairly
large set of changes to the IPv4 and IPv6 protocols layered over the
inpcb/inpcbinfo kernel infrastructure. To be specific, this affects TCP,
UDP, and raw sockets on both IPv4 and IPv6. I will post a further e-mail
announcement along with patch set and schedule in a day or two once it's
prepared.
The patches are below. They depend on the MFC of rwlock try-locking, which I
did earlier today:
http://www.watson.org/~robert/freebsd/netperf/20080808-7stable-rwlock-inpcb.diff
These include the inpcb/inpcbinfo read/write locking changes (although not yet
for raw/divert sockets). Any testing, especially with heavy UDP loads, would
be much appreciated -- these are fairly complex changes, and also quite a
complex MFC.
Robert N M Watson
Computer Laboratory
University of Cambridge
The thrust of this change is to replace the mutexes protecting the inpcb and
inpcbinfo data structures with read-write locks (rwlocks). These structures
represent, respectively, particular sockets and the global socket lists for
all socket types in IPv4 and IPv6 except for SCTP. When you run netstat,
inpcbinfo is the data structure referencing all connections, and each line in
the netstat output reflects the contents of a specific inpcb.
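
As a rough userspace illustration of that relationship (not the kernel code),
the sketch below models a global list with its own lock and per-connection
entries that each carry a lock. The names struct pcb, struct pcbinfo,
pcbinfo_insert() and pcbinfo_dump() are hypothetical stand-ins, and POSIX
pthread rwlocks are assumed in place of the kernel's rwlock primitives.

```c
/*
 * Userspace analogy only. struct pcb / struct pcbinfo are hypothetical
 * stand-ins for inpcb / inpcbinfo; the kernel does not use pthreads.
 */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct pcb {
    pthread_rwlock_t lock;   /* per-connection lock */
    unsigned short   lport;  /* local port */
    unsigned short   fport;  /* foreign port */
    struct pcb      *next;
};

struct pcbinfo {
    pthread_rwlock_t lock;   /* lock for the global list */
    struct pcb      *head;
};

static void pcbinfo_init(struct pcbinfo *pi)
{
    int rc = pthread_rwlock_init(&pi->lock, NULL);
    assert(rc == 0);
    (void)rc;
    pi->head = NULL;
}

/* Adding a connection modifies the global list, so take the write lock. */
static int pcbinfo_insert(struct pcbinfo *pi, unsigned short lport,
                          unsigned short fport)
{
    struct pcb *p = malloc(sizeof(*p));
    if (p == NULL)
        return -1;
    if (pthread_rwlock_init(&p->lock, NULL) != 0) {
        free(p);
        return -1;
    }
    p->lport = lport;
    p->fport = fport;

    pthread_rwlock_wrlock(&pi->lock);
    p->next = pi->head;
    pi->head = p;
    pthread_rwlock_unlock(&pi->lock);
    return 0;
}

/*
 * netstat-like walk: nothing is modified, so read locks are enough.
 * Prints one line per connection and returns the number of lines.
 */
static int pcbinfo_dump(struct pcbinfo *pi)
{
    int n = 0;

    pthread_rwlock_rdlock(&pi->lock);
    for (struct pcb *p = pi->head; p != NULL; p = p->next) {
        pthread_rwlock_rdlock(&p->lock);
        printf("udp4  *.%hu  *.%hu\n", p->lport, p->fport);
        pthread_rwlock_unlock(&p->lock);
        n++;
    }
    pthread_rwlock_unlock(&pi->lock);
    return n;
}
```

Calling pcbinfo_insert() twice and then pcbinfo_dump() prints two lines, one
per entry, which mirrors how each netstat line corresponds to one inpcb.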
In the current stage of this work, the intent is to improve performance for
datagram-related protocols on SMP systems by allowing concurrent acquisition
of both global and connection locks during receive and transmit. This is
possible because, in the common case, no connection or global state is
modified during UDP/raw receive and transmit at the IP layer, so a read lock
is sufficient to prevent data in those structures from unexpectedly changing.
For receive, socket layer state is modified, but this is separately protected
by socket layer locks. On transmit, no state is modified at any layer, so in
principle we will allow fully parallel transmit from multiple threads down to
about the routing and network interface layers, whereas previously they would
bottleneck in UDP.
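
The effect of read locks on the transmit path can be sketched in userspace.
This is an analogy under stated assumptions, not the kernel implementation:
it uses POSIX pthread rwlocks and barriers, and the names pcbinfo_lock,
pcb_lock, sender() and run_parallel_senders() are hypothetical. Each sender
takes the global lock and then the connection lock for reading, then waits at
a barrier that only opens once every sender is inside the locked region.

```c
/*
 * Userspace analogy only: shows that several threads can hold the same
 * "global" and "connection" read locks at once. All names are hypothetical.
 */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define NSENDERS 4

static pthread_rwlock_t pcbinfo_lock = PTHREAD_RWLOCK_INITIALIZER; /* global */
static pthread_rwlock_t pcb_lock = PTHREAD_RWLOCK_INITIALIZER; /* one socket */
static pthread_barrier_t inside;  /* opens when all senders hold both locks */
static int sent[NSENDERS];        /* one slot per thread, no shared writes */

static void *sender(void *arg)
{
    int id = *(int *)arg;

    /* Transmit modifies no connection or global state: read locks suffice. */
    pthread_rwlock_rdlock(&pcbinfo_lock);
    pthread_rwlock_rdlock(&pcb_lock);

    /* Every sender must be inside the locked region for this to return. */
    pthread_barrier_wait(&inside);
    sent[id] = 1;                 /* stands in for handing the packet down */

    pthread_rwlock_unlock(&pcb_lock);
    pthread_rwlock_unlock(&pcbinfo_lock);
    return NULL;
}

/* Starts NSENDERS threads and returns how many completed a "send". */
static int run_parallel_senders(void)
{
    pthread_t tid[NSENDERS];
    int ids[NSENDERS];
    int total = 0;
    int rc;

    rc = pthread_barrier_init(&inside, NULL, NSENDERS);
    assert(rc == 0);

    for (int i = 0; i < NSENDERS; i++) {
        ids[i] = i;
        sent[i] = 0;
        rc = pthread_create(&tid[i], NULL, sender, &ids[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < NSENDERS; i++) {
        rc = pthread_join(tid[i], NULL);
        assert(rc == 0);
    }
    (void)rc;
    pthread_barrier_destroy(&inside);

    for (int i = 0; i < NSENDERS; i++)
        total += sent[i];
    printf("%d senders held the read locks concurrently\n", total);
    return total;
}
```

If the two locks were exclusive mutexes instead, the first sender would wait
at the barrier while holding them and the others could never get in, so the
program would hang. That is the serialization the read locks are meant to
remove.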
The applications targeted by this change are threaded UDP server
applications, such as BIND9, nsd, and UDP-based memcached. Kris Kennaway and
Paul Saab have done fairly extensive testing with the changes and
demonstrated significant performance improvements due to reduced contention
and overhead. Perhaps they can mention some of those numbers in a follow-up
to this post.
The reason for the heads up is that, while carefully tested, changes of this
sort do come with risks. We've carefully structured them so as to avoid
breaking the ABIs for netstat, etc., but it's not impossible that some
problems will arise as the changes settle. The goal, however, is to see
these performance improvements in 7.1, and since they've had some time to
shake out in 8.x and seen some heavy use, I think now is the right time to
merge them.
In any case, I will send out e-mail in a couple of days with a proposed merge
patch and a schedule for merging. If you are in a position where you might
benefit from these improvements, or have interesting UDP or raw-socket based
applications running on 7.x, please test the candidate patch before it's
merged and report any problems. Unless I receive negative feedback, I will
plan on merging the changes late in the week, and keep a close eye on stable@
for any reports of problems.
Thanks,
Robert N M Watson
Computer Laboratory
University of Cambridge
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"