A performance issue that has always bothered me: OVSDB has a set data type that matches up with Python's set data type (an unordered collection of unique items), but the in-tree Python library represents this set type as a list. Not only that, but every time Row.__getattr__() is invoked by accessing a set-type column on a Row, it loops through the values, adds them to a new Python set (presumably to remove duplicates)...and then returns them as a sorted list. Every single time the attribute is accessed [1].
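To make the cost concrete, here is a rough sketch of the access pattern described above. It is a simplified model and not the actual ovs.db code: FakeRow, its "ports" column, and the conversions counter are all hypothetical names made up for illustration.

```python
# Simplified model of the behaviour described above, NOT the real ovs.db code.
class FakeRow:
    """Hypothetical stand-in for a Row with a single set-type column."""

    def __init__(self, ports):
        self._ports = list(ports)  # the set-type column is stored as a list
        self.conversions = 0       # counts how often the work is redone

    def __getattr__(self, name):
        # Only reached for names not found the normal way, e.g. "ports".
        if name != "ports":
            raise AttributeError(name)
        self.conversions += 1
        unique = set()
        for value in self._ports:  # O(n) loop on every access
            unique.add(value)
        return sorted(unique)      # plus an O(n log n) sort on every access


row = FakeRow(["p3", "p1", "p2", "p1"])
for _ in range(3):
    ports = row.ports              # each access rebuilds and re-sorts
print(ports, row.conversions)
```

In this model, three accesses mean three full rebuild-and-sort passes, so anything that reads the attribute in a loop pays that cost on every iteration.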
Some of these sets can be quite huge. In OpenStack Neutron, for example, we have a default Port Group that all ports are added to by default. That is many thousands of ports.

Now, it would be very simple to just return a set here, and users would get the benefits of both less overhead on attribute access AND the ability to do O(1) lookups on these sets. Things like "find port groups that have this port" etc. would be *much* cheaper.

The problem is that this breaks the API. You can no longer do things like Port_Group.ports[0], since set objects are unordered and do not have __getitem__(), operations like append() don't exist, etc. This will also break tons of tests, because they tend to rely on the order of objects since they do simple string matching. That latter issue is probably pretty easy to fix by just sorting the results in the tests themselves.

It's probably possible to create a wrapper type that makes a set look enough like a list to not break things, but that's also pretty ugly.

So I guess my question is, "what do we think about breaking the API at some point to fix this?" It's pretty terrible behavior, but it's also annoying when APIs change.

Terry

[1] https://github.com/openvswitch/ovs/blob/d70688a7291edb432fd66b9230a92842fcfd3607/python/ovs/db/data.py#L498-L504