On Fri, Apr 25, 2014 at 1:52 PM, <thomas.mo...@orange.com> wrote: > Hi Jesse, > > 2014-04-25, Jesse Gross: >>> Practically speaking, making your layer 3 port code work for >>> MPLS-over-GRE is not entirely trivial: >>> - on emission, compose_output_action__ pops the Ethernet header before >>> push_mpls is called (which is done during commit_odp_actions) ; this >>> does not work since push_mpls actually relies on the presence of the >>> Ethernet header to set the Ethernet ethertype to an MPLS ethertype. One >>> way to fix it would be to adapt MPLS code to support the case where the >>> frame is layer3. Another is to have compose_output_action__ call >>> odp_put_pop_eth_action after commit_odp_actions. I implemented the >>> latter and it works. This latter option has the benefit preserving >>> compatibility with other actions that can be done before outputting and >>> which rely on the Ethernet header being there (such as changing a >>> Ethernet dst) -- such operations do not make sense if the header is >>> popped afterward, but I think we should try to not break if the user is >>> to define such a flow. >> >> I think it's actually OK to break (or really ignore) actions that >> don't make sense. There are some similar examples already - if you try >> to set the TCP port on an L2 packet then the action will just get >> ignored. > > The alternative is making all MPLS code work for both Ethernet and non > Ethernet cases. More on this below. > >>> - on reception, although I haven't fully tried to reach a resolution, >>> there is at least an issue in flow_extract: with the current patch flow >>> extract does not parse MPLS for a layer3 frame. I guess that could be >>> solved although I haven't clarified exactly yet how flow_extract would >>> guess what payload is in the ofppuf->packet (as I understand, this is >>> initially determined in the kernel datapath and stored in skb->protocol, >>> but it has to be propagated to userspace). >> >> I think this would probably have to come as metadata from the >> originator in some form, similar to the input port which also can't be >> extracted from the packet. > > Ok. > > > In theory, this could either be from the >> kernel or OpenFlow directly. > > (I'm not sure how this can come from Openflow, but possibly that's a > side question)
An OpenFlow controller can send a packet-out message to inject a packet. This should behave similar to a received packet but will skip the kernel entirely. >>> Based on the above, I wonder if the code couldn't be made overall >>> simpler by *always* pushing an Ethernet header on reception of a >>> non-Ethernet tunnelled packet (not only when, after flow identification, >>> we conclude that the in_port is layer 3 while the out_port is not). >>> Similarly, on emission, the popping of the Ethernet header would not >>> have to be conditioned to the in/out ports being layer 3 or not. The >>> coexistence of the layer 3 port code with operations relying on the fact >>> that the packet being manipulated is Ethernet, would become a non-issue >>> ; the code related to these operations could remain completely untouched. >>> >>> In practice, that could be done by doing pop_eth based on flow actions >>> (as currently done by your patch), but by doing push_eth unconditionally >>> on reception of a packet from a layer3 tunnel (*not* as a flow action as >>> currently done in your patch). >> >> This is pretty much what is already done in the checked-in LISP code. >> I think at some point, we need to bite the bullet and figure out how >> to make OVS independent of Ethernet so this seems to be the time to do >> it. > > I can understand your point of view, although it seems to me that > normalizing all packets to Ethernet in input would be beneficial in that > it would avoid having to make a bunch of codepaths conditional to the > type of payload (MPLS comes to mind, there might be others). It would be > similarly beneficial in terms of testsuite coverage, and in terms of > avoiding the introduction of regressions in the kernel datapath or in > kernel/user space exchanges. In the past, support for Infiniband has come up a few times. I think if we started adding fake Ethernet headers that would drive us even further away from the possibility of being truly multi-protocol. (I'm still thinking about the earlier messages because I think there are some significant differences between the kernel netlink protocol and OpenFlow). _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev