>> This series adds driver support for the processing of tunnel
>> [specifically vxlan/geneve/gre tunnels] packets which are
>> aggregated [GROed] by the hardware before driver passes
>> such packets upto the stack.

> First off I am pretty sure this isn't GRO.  This is LRO. 
Nopes. This is GRO - each MSS-sized frame will arrive on its
own frag, whereas the headers [both  internal & external]
would arrive on the linear part of the SKB.

> Also I don't know if you have been paying attention to recent
> discussions on the mailing list but the fact is GRO over UDP tunnels
> is still a subject for debate.  This patch takes things in the
> opposite direction of where we are currently talking about going with
> GRO.  I've added Hannes and Tom to this discussion just to make sure I
> have the proper understanding of all this as my protocol knowledge is
> somewhat lacking.
I guess we're on the exact opposite side of the discussion - I.e., we're
the vendor that tries pushing offload capabilities to the device.
Do notice however that we're not planning on pushing anything new
feature-wise, just like we haven't done anything for regular TCP GRO -
All we do is allow our HW/FW to aggregate packets instead of stack.

> Ideally we need to be able to identify that a given packet terminates
> on a local socket in our namespace before we could begin to perform
> any sort of mangling on the local packets.  It is always possible that
> we could be looking at a frame that uses the same UDP port but is not
> the tunnel protocol if we are performing bridging or routing.  The
> current GRO implementation fails in that regard and there are
> discussions between several of us on how to deal with that.  It is
> likely that we would be forcing GRO for tunnels to move a bit further
> up the stack if bridging or routing so that we could verify that the
> frame is not being routed back out before we could aggregate it.
I'm aware of the on-going discussion, but I'm not sure this should
bother us greatly - the aggregation is done based on the
inner TCP connection; I.e., it's guaranteed all the frames belong to
the same TCP connection. While HW requires the UDP port for the
initial classification effort, it doesn't take decision solely on that.

> Also I would be interested in knowing how your hardware handles
> tunnels with outer checksums.  Is it just ignoring the frames in such
> a case, ignoring the checksum, or is it actually validating the frames
> and then merging the resulting checksum?
HW is validating both inner and outer csum; if it wouldn't be able to
due to some limitation, it would not try to pass the packet as a GRO
aggregation but rather as regular seperated packets.
But I don't believe it merges the checksum, only validate each
independently [CSUM_UNNECESSARY].

Reply via email to