On Fri, Jun 17, 2016 at 6:46 AM, Thadeu Lima de Souza Cascardo
<casca...@redhat.com> wrote:
> On Thu, Jun 16, 2016 at 03:53:12PM -0700, Jesse Gross wrote:
>> On Thu, Jun 16, 2016 at 12:40 PM, Thadeu Lima de Souza Cascardo
>> <casca...@redhat.com> wrote:
>> > The reason we have the first patch in the series is because we can't
>> > identify the port as vxlan otherwise. It is identified as system. In
>> > that case, the port would not be removed, and that would cause the
>> > problems you see during restarts and when removing all ports for
>> > that vport.
>> >
>> > But it happens that if a netdev has been opened for that name and no
>> > type has been used, things will break, as its type will be detected
>> > as system.
>> >
>> > So it happens that the route table opens it and keeps a reference
>> > when inserting it into the tnl port map. Not sure why it didn't
>> > happen to me in the past. It could be explained by a change after a
>> > rebase, but I couldn't find what commit would have broken it.
>> >
>> > Well, there are plenty of ways to fix it. One line of possible
>> > changes involves the routing table: prevent vports from being
>> > considered in tnl ports or the route table, or preventing routes
>> > with only LLAs routes to go to the tnl ports.
>> >
>> > The other line is making sure we will detect the correct type no
>> > matter what: we can either always open such devices with their
>> > appropriate type, doing the vport name to type mapping before open,
>> > or in netdev_get_type_from_name always prefer the vport type before
>> > trying to find an existing netdev.
>> >
>> > I think the last solution would be best. First, any later changes on
>> > routing table or in case anyone calls netdev_open for these names
>> > for any other reason won't break it. Second, doing it on netdev_open
>> > would cause other changes, like the netdev_class used would be
>> > different, and even the route table code would have problems, since
>> > get_addr is not supported for vport types.
>>
>> I definitely agree that it is better to fix the netdev code rather
>> than the routing table. Just trying to avoid this situation would
>> almost certainly mean that it would break at some point in the future.
>>
>> However, I'm not sure that it's really a good idea to have the same
>> device open as netdevs with different types (if I'm understanding your
>> proposal correctly). I think we should try to ensure that we always
>> have the right netdev class when we are dealing with devices,
>> otherwise I suspect that we'll have weird corner cases in the future.
>> If the routing table code isn't aware of some types of devices and
>> doesn't handle them properly then that seems like an appropriate thing
>> to fix within the routing code.
>
> The routing code just receives notifications for all system interfaces
> and opens them up with the NULL type, which is the system type.
>
> In the specific case of vport types, they are not directly opened or
> shouldn't be by other code. For the purpose of the routing code, using
> a different type could even prevent it from using it as intended, that
> is, to get the address list.

When the route table code currently opens them as system, it's not
going to get any useful information out of them, right? We don't expect
these devices to have IP addresses on them or to have packets passed to
them directly. So if it's not going to be useful, then it seems like it
could just check the type before it opens the device and skip it if it
isn't system. (Note that this is still a generic fix because if it did
open the device it would get the correct type.) The other possibility
is to make the netdev operations that the route table code uses no-ops
for non-system devices.

> I can see some potential ways to cause lots of confusion, like having
> a vxlan0 interface on the system (be it a vxlan interface or not) and
> create a vxlan0 port/interface with type=vxlan. Not sure how to fix
> that problem.

I agree this is a problem (though it's one that I think can already
happen today). It seems like the best thing to do is just to check for
and disallow it - having two separate devices with the same name and
somewhat in the same namespace is only going to lead to problems at
some point.
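
To make the two checks discussed in this message concrete (skipping non-system devices before the route table opens them, and refusing a port whose name already belongs to a device of another type), here is a rough, self-contained sketch. It is not OVS code: the device table, the sample names in it, and every `sketch_*` function are hypothetical stand-ins for whatever name-to-type lookup the netdev layer ends up providing, and it only assumes that an unknown name is treated as type "system", as with the NULL type mentioned in the quoted text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy registry; real code would ask the netdev layer. */
struct sketch_dev {
    const char *name;
    const char *type;
};

static const struct sketch_dev sketch_devs[] = {
    { "eth0", "system" },
    { "vxlan0", "system" },          /* System interface named vxlan0. */
    { "vxlan_sys_4789", "vxlan" },   /* Sample vport-backed device. */
};

/* Returns the recorded type for 'name', or NULL if it is unknown. */
const char *
sketch_find_type(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof sketch_devs / sizeof sketch_devs[0]; i++) {
        if (!strcmp(sketch_devs[i].name, name)) {
            return sketch_devs[i].type;
        }
    }
    return NULL;
}

/* Unknown names default to "system" (assumption stated above). */
const char *
sketch_type_from_name(const char *name)
{
    const char *type = sketch_find_type(name);
    return type ? type : "system";
}

/* Route table side: only open devices whose type is "system". */
bool
sketch_route_table_should_open(const char *name)
{
    return !strcmp(sketch_type_from_name(name), "system");
}

/* Port creation side: reject a name already used by another type. */
bool
sketch_name_conflicts(const char *name, const char *requested_type)
{
    const char *existing = sketch_find_type(name);
    return existing && strcmp(existing, requested_type);
}
```

With this toy table, the route table would open eth0 and skip vxlan_sys_4789, and a request to create vxlan0 with type=vxlan would be rejected because a system device of that name already exists.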