On Sat, Feb 4, 2012 at 6:52 PM, Kyle Mestery <kmest...@cisco.com> wrote:
> On Feb 4, 2012, at 1:14 PM, Joseph Glanville wrote:
>> Hi
>>
>> I ported the aforementioned patches to 1.4 but haven't gotten much
>> further than that yet.
>> Would be interested in lending a hand soonish. (Bit snowed under atm)
>>
>> Joseph.
>>
> Yes, I also did the same (see the github reference below). At this point, I
> was waiting for Jesse to send out his initial thoughts on the initial design
> before getting started. Will be happy to work with you on this Joseph, lets
> wait and see Jesse's thoughts before diving in deeper.
Great, thanks guys. Obviously this is something that we've wanted to do for a while now but haven't gotten to yet (as an aside, this is the major reason why we haven't pushed for tunneling support in upstream Linux: we wanted to get the model right first). I spent some time thinking about how to do this, and while the plan isn't fully fleshed out yet, here's the rough idea:

When we first implemented tunneling support, all configuration went on the tunnel port itself, including IP addresses and the key (or VNI or whatever you want to call it). For a given packet, the tunnel properties were chosen by outputting to a particular port, and on receive the port was determined by a hash lookup before the flow table. We later added an extension so that the key could be set via an action and included in the main flow table. The goal here would be to do the same thing for the remaining tunnel fields (src/dst IP, ToS, TTL).

The immediate impact is that it would allow us to drop the separate tunnel hash table and the ToS/TTL inheritance mechanisms from the kernel and implement them in userspace. It would also enable learning from multicast the way vxlan does, since the source information is not lost in the tunnel lookup as it currently is.

I think when you go down this path, you essentially end up with a tunnel port (or perhaps some other new construct) that indicates only the encapsulation format. It registers with the IP stack to be the packet handler and implements protocol-specific encapsulation/fragmentation/etc. On receive, it supplies the outer header values along with the packet, so you get something like this as the flow to look up:

in_port(tunnel port), ip(struct ovs_key_ipv4), vxlan(vni), encap(struct ovs_key_ethernet), ...

In the kernel, we can arrange the new fields at the end of the struct so there is no performance cost for non-tunneled packets, and when talking to userspace the new fields won't be included at all if not used.
On transmit, we would have an action for setting those fields, possibly plus a few of the current configuration options, i.e.:

set_tunnel(struct ovs_key_ipv4, vni, csum, ...)

These get used when the packet is encapsulated after being output to a tunnel port.

Once we do this, there is a less direct mapping between kernel vports and those in userspace. I'd like to maintain the current port-based mechanism in addition to vxlan learning, so that basically means we need to look at received packets and assign them to an input port in userspace (and handle stats, etc.). On output, we would need to add the appropriate set_tunnel action instead of a direct output.

The final component, which I don't have a good plan for at the moment, is how to deal with ports on different datapaths, since tunnel packets can only arrive on a single datapath but they might need to be bridged to a physical port on another datapath.

Obviously, this is all very rough and I'll keep working on it, but hopefully it's useful to start thinking about. Thoughts?

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev