On Feb 6, 2012, at 5:55 PM, Jesse Gross wrote:
> On Mon, Feb 6, 2012 at 2:55 PM, Kyle Mestery <kmest...@cisco.com> wrote:
>> On Feb 6, 2012, at 3:19 PM, Jesse Gross wrote:
>>> When we first implemented tunneling support, all configuration went on
>>> the tunnel port itself including IP addresses and the key (or VNI or
>>> whatever you want to call it).  For a given packet, the tunnel
>>> properties were chosen by outputting to a particular port and on
>>> receive the port was done by a hash lookup before the flow table.  We
>>> later added an extension so that the key could be set via an action
>>> and included in the main flow table.  The goal here would be to do the
>>> same thing for the remaining tunnel fields (src/dst IP, ToS, TTL).
>>> The immediate impact is that it would allow us to drop the separate
>>> tunnel hash table and ToS/TTL inheritance mechanisms from the kernel
>>> and implement them in userspace.  It would also enable learning from
>>> multicast like vxlan does since the source information is not lost in
>>> the tunnel lookup like it currently is.
>>> 
>> Being able to set all this information from userspace via extensions is 
>> good. It moves back to handling configuration of tunnel ports into the 
>> protocol (via extensions) instead of punting the problem outside the 
>> protocol. Also, does this mean we would still end up with a single tunnel 
>> per remote host in the case of VXLAN?
> 
> When you say "the protocol" do you mean OpenFlow?  What I was talking
> about was just the communication channel between userspace and kernel
> which is private to OVS but is ultimately used to implement both
> OpenFlow flows and direct logic in ovs-vswitchd.  For the time being
> at least, I wasn't planning on exposing this level of flexibility
> through OpenFlow itself since it has several issues when setting up
> tunnels that require state like IPsec or connectivity monitoring.
> 
Yes, I was referring to OpenFlow, but thanks for the clarification.

> In the kernel you would just end up with a single port (or port-like
> object) per protocol with the flows determining which remote host to
> send packets to.  I'm assuming that the abstraction for vxlan that
> would be exposed (from userspace to the outside world) would be more
> akin to that of a network with a particular multicast group/VNI than a
> set of ports with remote hosts.
> 
OK, that makes sense.

>>> In the kernel we can arrange the new fields at the end of the struct
>>> so there is no performance cost for non-tunneled packets and when
>>> talking to userspace the new fields won't be included at all if not
>>> used.
>>> 
>>> On transmit, we would have an action for setting those fields,
>>> possibly plus a few of the current configuration options.  i.e.:
>>> set_tunnel(struct ovs_key_ipv4, vni, csum...)
>>> These get used when the packet is encapsulated after being output to a
>>> tunnel port.
>>> 
>>> Once we do this, there is a less direct mapping between kernel vports
>>> and those in userspace.  I'd like to maintain the current port-based
>>> mechanism in addition vxlan learning so that basically means we need
>>> to look at received packets and assign them to an input port in
>>> userspace (and handle stats, etc.).  On output we would need to add
>>> the appropriate set_tunnel action instead of a direct output.  The
>>> final component, which I don't have a good plan for at the moment, is
>>> how to deal with ports on different datapaths since the tunnel packets
>>> can only arrive on a single datapath but they might want to be bridged
>>> to a physical port on another datapath.
>>> 
>> So, this somewhat ties into my earlier question about the number of tunnel 
>> ports. This design assumes a single tunnel port per host, but to fit in with 
>> the existing design, we'd need a single tunnel port per datapath? 
>> Essentially it would be good to have a single tunnel port and be able to 
>> have it service multiple datapaths on the host, right? I need to think about 
>> that a bit.
> 
> With this design, you're actually forced to have only a single port
> for all datapaths.  Currently when a packet comes into the IP stack
> for a given tunnel protocol, we do a lookup on the source IP
> (primarily) to determine the port, which is attached to a particular
> datapath.  However, since we won't be doing that pre-lookup before the
> packet hits the flow table anymore, the only choice is to send all
> packets from a particular protocol to a single datapath.
> 
> In theory it's possible to have a flow output to a port on a different
> datapath, it just starts to get messy as it breaks down the
> abstractions.

Yes, understood.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to