Thanks for the revision. I have some further comments. On Mon, Jan 07, 2013 at 05:32:29PM -0800, Justin Pettit wrote: > + - The Linux datapath implementation creates a different kernel- > + userspace channel for each port instead of sharing a static 16 > + channels to provide better performance isolation.
Previously I commented on this but I didn't see a response in your follow-up. The first time, I said that I think that's going to prompt more questions than it's going to answer. How about: - With the Linux datapath, packets for new flows are now queued separately on a per-port basis, so it should no longer be possible for a large number of new flows arriving on one port to prevent new flows from being processed on other ports. It's still very technical, but I think it is likely to help more readers. I think that part of the problem I'm having with 'channels' and 'n_channels' is that I tend to read the 'n_channels' name as meaning the number of channels actually present in the (possibly sparse) arrays 'channels' and 'epoll_events', instead of the allocated size of the array itself. That could be cured at least partially with a comment on n_channels (currently it has none), but I think that another name would make it more obvious. I like the 'array_size' name that you used in the add_channel() function. So the result would be something like this: /* Upcall messages. */ int array_size; /* Size of 'channels' and 'epoll_events'. */ struct dpif_channel *channels; struct epoll_event *epoll_events; int epoll_fd; /* epoll fd that includes channel socks. */ int n_events; /* Num events returned by epoll_wait(). */ int event_offset; /* Offset into 'epoll_events'. */ In add_channel(), then, a comparison like if (port_no >= dpif->array_size) { would then make perfect sense to me, without the need for a local variable. I'd ordinarily write "sizeof *dpif->channels" and "sizeof *dpif->epoll_events" in the allocations in add_channel(), since that's the typical OVS style: dpif->channels = xrealloc(dpif->channels, array_size * sizeof(struct dpif_channel)); ... dpif->epoll_events = xrealloc(dpif->epoll_events, array_size * sizeof(struct epoll_event)); It might be worth growing these two dynamic arrays by doubling, rather than one at a time, since we could add a large number of elements and it's nice to avoid N**2 behavior in the reallocation. In add_channel(), I think that the assignment dpif->n_channels = array_size; is wrong in the case where the new port_no is less than some existing port number (which can happen if a port is deleted and then a new one created). I think this assignment should go inside the "if" block. del_channel() attempts to use EPOLL_CTL_ADD to remove a channel, but I do not think that this can work. Also, it's not necessary to specify an event to remove a fd from an epoll set, unless we want to support Linux earlier than 2.6.9, which we don't. Should the "if" test at the top of del_channel() check for port_no >= dpif->n_channels (rather than ">")? add_channel() returns success without storing or destroying the nl_sock if epoll_fd < 0. I think this leaks memory and a fd if recv isn't turned on. (It might make more sense not to create a socket at all if recv isn't turned on.) Also, if recv isn't turned on, then dpif_linux_port_add() still creates a Netlink socket and tells the kernel to use that socket. I'd suggest that it should use 0, indicating that the kernel should not send upcalls at all, since no one is going to want to receive them. In dpif_linux_port_get_pid(), I think that "port_no == UINT32_MAX || port_no >= dpif->n_channels" can be simplified to just the second clause. Arguably, destroy_channels() should turn off upcalls for all the vports where we've enabled them, to save kernel time and buffer space when ovs-vswitchd exits gracefully. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev