On 25 Mar 2011, at 21:01, John Baldwin wrote:

> On Tuesday, February 01, 2011 12:54:33 am Jim wrote:
>> I am not sure if anybody has asked this before. I could not find an answer
>> with a rough search on the Internet; if it is a duplicate question, sorry in
>> advance.
>> 
>> My question is: when getting socket options in tcp_ctloutput() in
>> tcp_usrreq.c, why do we need to lock with INP_WLOCK(inp), as setting
>> socket options does? Why don't we just use INP_RLOCK(inp), as it does not
>> appear to change anything in the tcp control block?
> 
> I think mostly it is just because no one has bothered to change it.  
> Realistically it probably won't make any noticeable difference unless your 
> workload consists of doing lots of calls to getsockopt() but not sending any 
> actual traffic on the associated sockets. :)  (Almost all of the other 
> operations on a TCP connection require a write lock on the pcb.)

Just to reiterate John's point here: the critical performance paths for TCP 
(input and output) both require the inpcb lock to be held exclusively, and 
socket options are typically accessed from the same user thread that is doing 
I/O, so acquiring read locks instead of write locks is unlikely to make any 
measurable difference. However, in principle I believe most if not all 
getsockopt()'s in TCP should be fine with just a read lock, and for socket 
options used with UDP, there might well be some benefit to using a read lock, 
since most UDP operations use read locks and not write locks on the inpcb.

I should further note that Jeff Roberson has some exciting in-progress work to 
reduce transmit-input contention on the inpcb that appears to make quite a 
noticeable difference in improving TCP performance. We don't have much global 
lock contention currently when in the steady state, but the per-connection 
locks do get heavily contended. His work is similar to some work done in the 
Linux stack a year or two ago to defer input processing to the user thread 
rather than contending on the inpcb lock, if it's already held. Hopefully we'll 
see the results of that work in 9.0, and possibly backported to 8.x.
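
As a rough illustration of the general idea of deferring input rather than 
contending on the lock, here is a small userspace sketch. It is not Jeff's 
patch and not the Linux code; struct conn, the conn_*() functions, the fixed 
backlog size, and the single-user-thread assumption are all invented for the 
example. When the user thread owns the connection, the input path queues the 
segment and returns at once, and the user thread processes the queue when it 
releases ownership:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define BACKLOG_MAX 16

/* Hypothetical connection state; all names here are illustrative only. */
struct conn {
    pthread_mutex_t mtx;       /* short-held lock protecting the fields below */
    bool owned;                /* true while the user thread owns the connection */
    int backlog[BACKLOG_MAX];  /* segments deferred while owned */
    int nbacklog;
    int processed;             /* count of segments fully processed */
};

static void conn_process(struct conn *c, int seg)
{
    (void)seg;                 /* real protocol processing would go here */
    c->processed++;
}

void conn_init(struct conn *c)
{
    pthread_mutex_init(&c->mtx, NULL);
    c->owned = false;
    c->nbacklog = 0;
    c->processed = 0;
}

/* User thread takes ownership (assumes a single user thread per connection). */
void conn_user_lock(struct conn *c)
{
    pthread_mutex_lock(&c->mtx);
    c->owned = true;
    pthread_mutex_unlock(&c->mtx);
}

/* Input path: process now if unowned, else defer. Returns false on a drop. */
bool conn_input(struct conn *c, int seg)
{
    bool ok = true;

    pthread_mutex_lock(&c->mtx);
    if (!c->owned)
        conn_process(c, seg);
    else if (c->nbacklog < BACKLOG_MAX)
        c->backlog[c->nbacklog++] = seg;
    else
        ok = false;            /* backlog full: drop the segment */
    pthread_mutex_unlock(&c->mtx);
    return ok;
}

/* User thread drains the deferred segments, then releases ownership. */
void conn_user_unlock(struct conn *c)
{
    pthread_mutex_lock(&c->mtx);
    for (int i = 0; i < c->nbacklog; i++)
        conn_process(c, c->backlog[i]);
    c->nbacklog = 0;
    c->owned = false;
    pthread_mutex_unlock(&c->mtx);
}
```

The point of the design is that the input path never waits for the user 
thread: it either does the work itself or hands it off, so the two sides stop 
contending on a long-held per-connection lock.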

I also have a large pending patchset adding connection group support, and 
aligning software lookup tables with hardware work distribution via RSS, which 
is due to go in before 9.0.

Robert
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"