On Thu, Apr 05, 2012 at 05:00:12PM -0700, Jesse Gross wrote: > On Wed, Apr 4, 2012 at 6:08 PM, Simon Horman <ho...@verge.net.au> wrote: > > diff --git a/include/linux/tcp.h b/include/linux/tcp.h > > index 3c7ffdb..36e794b 100644 > > --- a/include/linux/tcp.h > > +++ b/include/linux/tcp.h > > @@ -472,6 +472,11 @@ struct tcp_sock { > > * contains related tcp_cookie_transactions fields. > > */ > > struct tcp_cookie_values *cookie_values; > > + > > + /* For encapsulation sockets. */ > > + __u16 encap_type; > > Do we actually get any value out of encap_type? It seems equivalent > to encap_rcv != NULL.
I agree, but this is in keeping with the UDP implementation. > > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > > index fd54c5f..ce56c6c 100644 > > --- a/net/ipv4/tcp_ipv4.c > > +++ b/net/ipv4/tcp_ipv4.c > > @@ -1714,6 +1715,32 @@ process: > > > > if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb)) > > goto discard_and_relse; > > + > > + tp = tcp_sk(sk); > > I wonder if this is the right place for this hook. It seems odd to be > after the TIME_WAIT check, for example. True. I pushed it down here because I felt that the xfrm4_policy_check and ttl checks made sense and the TIME_WAIT would always fail. If not, perhaps immediately after the socket lookup would be a better location. > > > + if (tp->encap_type) { > > + int (*encap_rcv)(struct sock *sk, struct sk_buff *skb); > > + > > + /* > > + * This is an encapsulation socket so pass the skb to > > + * the socket's udp_encap_rcv() hook. Otherwise, just > > + * fall through and pass this up the TCP socket. > > + * up->encap_rcv() returns the following value: > > + * =0 if skb was successfully passed to the encap > > + * handler or was discarded by it. > > + * >0 if skb should be passed on to TCP. > > + * <0 if skb should be resubmitted as proto -N > > + */ > > + > > + /* if we're overly short, let UDP handle it */ > > There are still a number of references to UDP here. Oops. > > + encap_rcv = ACCESS_ONCE(tp->encap_rcv); > > + if (encap_rcv) { > > + int ret = encap_rcv(sk, skb); > > + if (ret <= 0) > > + return -ret; > > It seems odd to resubmit as a different protocol here. With UDP it's > common to have something that runs over either bare IP or encapsulated > in UDP but I'm not sure that the same really applies here. > > > + } Ok, so perhaps it would be best to just have return encap_rcv(sk, skb); This was the behaviour of my original version of this patch but I thought it may make sense to have behaviour closer to UDP's encap_rcv. > > Doesn't this leak a reference to the socket if the encap_rcv handler > takes the packet? I will look into that. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev