On Wed, Aug 17, 2016 at 4:23 PM, David Ahern <d...@cumulusnetworks.com> wrote: > On 8/17/16 5:16 PM, Alexander Duyck wrote: >>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c >>> index 1ecbd7715f6d..6d78f162a88b 100644 >>> --- a/net/openvswitch/actions.c >>> +++ b/net/openvswitch/actions.c >>> @@ -167,6 +167,12 @@ static int push_mpls(struct sk_buff *skb, struct >>> sw_flow_key *key, >>> skb->mac_len); >>> skb_reset_mac_header(skb); >>> >>> + /* for GSO: set MPLS as network header and encapsulated protocol >>> + * header as inner network header >>> + */ >>> + skb_set_network_header(skb, skb->mac_len); >>> + skb_set_inner_network_header(skb, skb->mac_len + MPLS_HLEN); >>> + >>> new_mpls_lse = (__be32 *)skb_mpls_header(skb); >>> *new_mpls_lse = mpls->mpls_lse; >>> >> >> So the one question I would have about this is how attached are you to >> using the network_header to record the offset for the MPLS header? I >> ask because I think from a hardware offloading perspective it would >> make it much easier if instead you used the inner_mac_header to >> represent the offset for the MPLS header. This way device drivers >> could just skip over it like a VLAN and just use network and transport >> header values like they would otherwise. >> > > Where does the network_header relate to if I change the marker to > inner_mac_header? Would it be skipped?
No, the network header would still be the network header. > skb->protocol is set to MPLS. > mac_header points to ethernet address > network_header points to ??? The network_header would point to the IP header like it would be for a non-MPLS frame. > inner protocol is set to what is encapsulated (e.g., ipv4 or ipv6) I am okay with this, but wonder if we actually need it. Do you know of any protocols other than IPv4 or IPv6 that can be carried over MPLS and would expect to be offloaded? If not we may be able to just get away with recording the network header offset and then using the first nibble of the network header to determine the IP version since the value should be 4 or 6 for the two types we are offloading. > inner_mac_header points to start of mpls label. So this is what I would expect. > inner_network points to start of network header. The problem is that using inner_network_header to point to the network header will require me to fork the path pretty significantly for most of the Intel devices that would want to do MPLS GSO. The assumption most drivers make is that if we are offloading things then network_header and inner_network_header will point to either IPv4 or IPv6 headers. Introducing MPLS as the network_header with IPv4 or IPv6 as the inner_network_header throws a kink in the works because we currently ignore inner_network_header for the devices that are doing UDP or GRE tunnel GSO via GSO_PARTIAL with TSO_MANGLEID. > Is that sufficient for h/w drivers? I think of this as working like how we handle it for IP over IP tunnels. In that case we are at L3 so the inner_network_header field is populated, but the transport header stays the same. In the case of MPLS it isn't really L3 it is more of an L2.5 so my preference would be to treat it like it is an L2 tunnel or VLAN and just overwrite the inner_mac_header with the MPLS header offset, and leave the network and transport headers untouched. One other bonus that also occurred to me is that you might be able to get away with doing MPLS offloads for MPLS over IP or GRE tunnels. I hadn't realized that MPLS inside of these tunnels was a thing, I had just noticed it while looking over how the IP-in-IP tunnels are all being handled. However if you move the header tracking to inner_mac_header, and can avoid using skb->inner_protocol by instead using the first nibble of the network_header value then you could probably support segmenting those types of tunnels in hardware. - Alex