On Sat, Feb 4, 2012 at 6:52 PM, Kyle Mestery <kmest...@cisco.com> wrote:
> On Feb 4, 2012, at 1:14 PM, Joseph Glanville wrote:
>> Hi
>>
>> I ported the aforementioned patches to 1.4 but haven't gotten much
>> further than that yet.
>> Would be interested in lending a hand soonish. (Bit snowed under atm)
>>
>> Joseph.
>>
> Yes, I also did the same (see the github reference below). At this point, I
> was waiting for Jesse to send out his initial thoughts on the initial design
> before getting started. Will be happy to work with you on this Joseph, lets
> wait and see Jesse's thoughts before diving in deeper.
Great, thanks guys. Obviously this is something that we've wanted to do for a while now but haven't gotten to yet (as an aside, this is the major reason why we haven't pushed for tunneling support in upstream Linux: we wanted to get the model right first). I spent some time thinking about how to do this, and while the plan isn't fully fleshed out yet, here's the rough idea:

When we first implemented tunneling support, all configuration went on the tunnel port itself, including IP addresses and the key (or VNI or whatever you want to call it). For a given packet, the tunnel properties were chosen by outputting to a particular port, and on receive the port was determined by a hash lookup before the flow table. We later added an extension so that the key could be set via an action and included in the main flow table. The goal here would be to do the same thing for the remaining tunnel fields (src/dst IP, ToS, TTL).

The immediate impact is that it would allow us to drop the separate tunnel hash table and the ToS/TTL inheritance mechanisms from the kernel and implement them in userspace. It would also enable learning from multicast the way vxlan does, since the source information is not lost in the tunnel lookup as it currently is.

I think when you go down this path, you essentially end up with a tunnel port (or perhaps some other new construct) that indicates only the encapsulation format. It registers with the IP stack to be the packet handler and implements protocol-specific encapsulation/fragmentation/etc. On receive, it supplies the outer header values along with the packet, so you get something like this as the flow to look up:

in_port(tunnel port), ip(struct ovs_key_ipv4), vxlan(vni), encap(struct ovs_key_ethernet), ...

In the kernel, we can arrange the new fields at the end of the struct so there is no performance cost for non-tunneled packets, and when talking to userspace the new fields won't be included at all if not used.
On transmit, we would have an action for setting those fields, possibly plus a few of the current configuration options, i.e.:

set_tunnel(struct ovs_key_ipv4, vni, csum, ...)

These get used when the packet is encapsulated after being output to a tunnel port.

Once we do this, there is a less direct mapping between kernel vports and those in userspace. I'd like to maintain the current port-based mechanism in addition to vxlan learning, so that basically means we need to look at received packets and assign them to an input port in userspace (and handle stats, etc.). On output, we would need to add the appropriate set_tunnel action instead of a direct output.

The final component, which I don't have a good plan for at the moment, is how to deal with ports on different datapaths, since tunnel packets can only arrive on a single datapath but they might need to be bridged to a physical port on another datapath.

Obviously, this is all very rough and I'll keep working on it, but hopefully it's useful to start thinking about. Thoughts?

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev