On Thu, Apr 5, 2012 at 5:35 PM, Simon Horman <ho...@verge.net.au> wrote:
> On Thu, Apr 05, 2012 at 05:00:12PM -0700, Jesse Gross wrote:
>> On Wed, Apr 4, 2012 at 6:08 PM, Simon Horman <ho...@verge.net.au> wrote:
>> > diff --git a/include/linux/tcp.h b/include/linux/tcp.h
>> > index 3c7ffdb..36e794b 100644
>> > --- a/include/linux/tcp.h
>> > +++ b/include/linux/tcp.h
>> > @@ -472,6 +472,11 @@ struct tcp_sock {
>> >         * contains related tcp_cookie_transactions fields.
>> >         */
>> >        struct tcp_cookie_values  *cookie_values;
>> > +
>> > +       /* For encapsulation sockets. */
>> > +       __u16            encap_type;
>>
>> Do we actually get any value out of encap_type?  It seems equivalent
>> to encap_rcv != NULL.
>
> I agree, but this is in keeping with the UDP implementation.

I guess that I'm not convinced that exactly matching the UDP
implementation is the right thing to do here.  I think realistically
there aren't going to be a lot of new TCP encapsulations and currently
nothing sets this up from userspace, which is why UDP needs it.  Given
that, it seems best to simplify things.

>> > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
>> > index fd54c5f..ce56c6c 100644
>> > --- a/net/ipv4/tcp_ipv4.c
>> > +++ b/net/ipv4/tcp_ipv4.c
>> > @@ -1714,6 +1715,32 @@ process:
>> >
>> >        if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb))
>> >                goto discard_and_relse;
>> > +
>> > +       tp = tcp_sk(sk);
>>
>> I wonder if this is the right place for this hook.  It seems odd to be
>> after the TIME_WAIT check, for example.
>
> True. I pushed it down here because I felt that the xfrm4_policy_check and
> ttl checks made sense and the TIME_WAIT would always fail.  If not, perhaps
> immediately after the socket lookup would be a better location.

I think what we would ideally like to do is register an IP protocol
handler.  We can't do that because the normal TCP stack has already
taken it but we'd like to get as close to that as possible, so I would
tend to push it as close to the socket lookup as possible.

>> > +               encap_rcv = ACCESS_ONCE(tp->encap_rcv);
>> > +               if (encap_rcv) {
>> > +                       int ret = encap_rcv(sk, skb);
>> > +                       if (ret <= 0)
>> > +                               return -ret;
>>
>> It seems odd to resubmit as a different protocol here.  With UDP it's
>> common to have something that runs over either bare IP or encapsulated
>> in UDP but I'm not sure that the same really applies here.
>>
>> > +               }
>
> Ok, so perhaps it would be best to just have
>
>        return encap_rcv(sk, skb);
>
> This was the behaviour of my original version of this patch but
> I thought it may make sense to have behaviour closer to UDP's encap_rcv.

I think there's probably still value in allowing the encap handler to
reject the packet back to the TCP stack.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to