Re: Use of V_tcbinfo lock in tcp_do_segment()

Robert Watson Sun, 22 Jul 2012 10:56:26 -0700


On Thu, 24 May 2012, Vijay Singh wrote:

Folks, I am trying to understand the need to hold the V_tcbinfo lock in partof this function [code included below for reference].
At present this causes the socket upcall to be called with the V_tcbinfolock held, which I'd like to avoid. We release this lock right after thissection.
Looking at the code, it seems the lock is needed if we were in FIN_WAIT_2and tcp_close() the connection. The lock also seems to be protectingV_twq_2msl.
Would it be an acceptable solution if we deferred calling socantrecvmore()till after the lock can be dropped (after the swtich statement).tcp_twstart() can be changed to return tp if the connections survives, orNULL if it doesn't, much like what tcp_close() does. Also a new lock couldbe added to protect the V_rwq_2msl queue.
If this sounds acceptable, I can generate a patch against -CURRENT. I wouldappreciate feedback.


Hi Vijay:

Sorry for responding to your query with such a delay.

A quick glance at tcp_twstart() appears to confirm your description -- I readit as requiring the incpbinfo lock for several reasons:


1) Potential call to tcp_close().

2) Potential call to tcp_tw_2msl_scan() which may in turn call tcp_twclose(),
   which requires the pcbinfo lock (see 5 below).

3) Direct use of V_twq_2msl.

4) Potential call to tcp_tw_2msl_scan(), which uses V_twq_2msl, and may also
   call tcp_tw_close (see 5).

5) In the tcp_twclose() callgraph subtree, the pcbinfo lock is used for
   several things, including tcp_tw_2msl_stop() and in_pcbfree().  The call to
   sofree() may recurse, under some circumstances, back into pru_detach(), and
   then into tcp_detach() which requires the lock.

Especially with the pcbgroup reworking so that the pcbhash rather than pcbinfolock is used in a number of places, there are increasing numbers of placeswhere this sort of optimisation could be made.

One caution to ponder: socantrecvmore() can trigger an upcall (per yourcomment) in the receive path. It used to be that upcalls did worrying things,such as call back down the stack -- I think we've largely eliminated those(NFS UDP was one such example, now fixed?) and therefore they are probably OK,but worth checking the so_upcall consumers to see what the implications mightbe, if any.


Robert


-vijay


relevant code from tcp_do_segment():

       if (thflags & TH_FIN) {
               if (TCPS_HAVERCVDFIN(tp->t_state) == 0) {
                       socantrcvmore(so);
                       /*
                        * If connection is half-synchronized
                        * (ie NEEDSYN flag on) then delay ACK,
                        * so it may be piggybacked when SYN is sent.
                        * Otherwise, since we received a FIN then no
                        * more input can be expected, send ACK now.
                        */
                       if (tp->t_flags & TF_NEEDSYN)
                               tp->t_flags |= TF_DELACK;
                       else
                               tp->t_flags |= TF_ACKNOW;
                       tp->rcv_nxt++;
               }
               switch (tp->t_state) {

               /*
                * In SYN_RECEIVED and ESTABLISHED STATES
                * enter the CLOSE_WAIT state.
                */
               case TCPS_SYN_RECEIVED:
                       tp->t_starttime = ticks;
                       /* FALLTHROUGH */
               case TCPS_ESTABLISHED:
                       tp->t_state = TCPS_CLOSE_WAIT;
                       break;

               /*
                * If still in FIN_WAIT_1 STATE FIN has not been acked so
                * enter the CLOSING state.
                */
               case TCPS_FIN_WAIT_1:
                       tp->t_state = TCPS_CLOSING;
                       break;

               /*
                * In FIN_WAIT_2 state enter the TIME_WAIT state,
                * starting the time-wait timer, turning off the other
                * standard timers.
                */
               case TCPS_FIN_WAIT_2:
                       INP_INFO_WLOCK_ASSERT(&V_tcbinfo);
                       KASSERT(ti_locked == TI_WLOCKED, ("%s: dodata "
                           "TCP_FIN_WAIT_2 ti_locked: %d", __func__,
                           ti_locked));

                       tcp_twstart(tp);
                       INP_INFO_WUNLOCK(&V_tcbinfo);
                       return;
               }
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

Re: Use of V_tcbinfo lock in tcp_do_segment()

Reply via email to