On Sat, Sep 29, 2012 at 7:05 AM, Luigi Rizzo <ri...@iet.unipi.it> wrote:
> jumping in the conversation: last year when Gaetano and I worked on
> this code, the main loop had multiple performance issues, including
> a) rebuilding from scratch the list of file descriptors on each iteration,
> b) ignoring the response from select() and firing all handlers (and perhaps
>    looking at timeouts ?) on return, and
> c) processing only one packet per iteration.
>
> Our feeling at the time was that:
> - a patch that fixed #a and #b in the main loop would have a much lower
>   chance of being accepted;
> - user-space forwarding was not a priority for the project given the
>   existence of the in-kernel module for linux (though we may disagree on
>   this, given that together with netmap this makes it possible to get much
>   better throughput);
> - we were unsure how much #a could be simplified in the main loop, whereas
>   splitting the event loop in two makes the handling of the forwarding path
>   a lot more efficient;
> - also having a separate forwarding thread probably would save some latency
>   in the forwarding when the main thread is busy with the control path.
>
> and these are the reasons why we went for a separate thread.
> Also, we were not completely sure on how to address
> #a in the main loop, whereas this is somewhat simpler in the secondary
> thread which only has to deal with the dp_netdevs list.
>
> An option to move forward could be to make the entire patch conditional
> (it almost is this way now, but we can certainly work on making the patch
> less intrusive and clearly mark the conditional block) and possibly default
> to off, so that people not comfortable with the threaded extension
> will not have to deal with it.
>
> In the meantime you can perhaps try to
> import some of the performance enhancements in the main code
> (though again i am not sure how efficiently high-speed packet-IO events
> can coexist with the control-plane related one).

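To make the three main-loop issues listed above concrete, here is a minimal Python sketch of a loop that avoids them. It is not Open vSwitch code: `make_loop`, `run_once`, and `BATCH` are hypothetical names, and it assumes non-blocking datagram sockets stand in for the packet sources. Descriptors are registered once, only descriptors reported ready are visited, and each wakeup drains up to a batch of packets.

```python
import selectors
import socket

BATCH = 32  # hypothetical cap on packets handled per ready descriptor


def make_loop(socks):
    """Register every socket once, rather than rebuilding the set on each
    iteration (issue a)."""
    sel = selectors.DefaultSelector()
    for s in socks:
        s.setblocking(False)
        sel.register(s, selectors.EVENT_READ)
    return sel


def run_once(sel, handler, timeout=0.1):
    """Run one loop iteration and return the number of packets handled."""
    handled = 0
    # Visit only the descriptors the poller reported as ready (issue b).
    for key, _events in sel.select(timeout):
        sock = key.fileobj
        # Drain up to BATCH packets per wakeup instead of one (issue c).
        for _ in range(BATCH):
            try:
                pkt = sock.recv(2048)
            except BlockingIOError:
                break  # nothing more queued on this socket
            handler(sock, pkt)
            handled += 1
    return handled
```

The batch cap is there so one busy descriptor cannot starve the others. Whether control-plane events can share such a loop efficiently is exactly the open question raised above.
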
I can definitely see an argument for keeping control traffic and packet
processing separate, both from a strict efficiency perspective and also
just in terms of separation of concerns. If there are benefits to be
gained by making changes to the main loop then it would be nice to get
those everywhere, but that obviously shouldn't block other work.

I would actually prefer to see less rather than more conditional code,
especially with synchronization. If it's conditional, in the best case
you always have to worry about the most complicated case, and in the
worst case you either have to worry about multiple cases or accidentally
break things. These types of situations make me particularly nervous
when they start leaking out into other parts of the code, as we see with
the vlog library.

Given that the source of the gains is structural, rather than due to
fine-grained threading, the most natural approach to me would be a
multiprocess model. This would seem to have all of the benefits with
none of the worries about synchronization. It also has the benefit of
very closely modeling the kernel datapath.
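
As a rough illustration of the multiprocess idea, the Python sketch below forks a separate forwarding process that shares no memory with the control process and is driven only by messages on a control socket, so no locks are needed. It is not Open vSwitch code: `forwarder`, `spawn_forwarder`, and the "any control message means stop" convention are assumptions made for the example.

```python
import os
import select
import socket


def forwarder(ctrl, rx, tx):
    """Forwarding process body: copy packets from rx to tx until the control
    process sends any message on ctrl (hypothetical stop convention)."""
    while True:
        ready, _, _ = select.select([ctrl, rx], [], [])
        if rx in ready:
            tx.send(rx.recv(2048))  # pending packets are forwarded first
        elif ctrl in ready:
            ctrl.recv(16)
            return


def spawn_forwarder(rx, tx):
    """Fork the forwarding process; return (pid, parent's control socket)."""
    parent_ctrl, child_ctrl = socket.socketpair(socket.AF_UNIX,
                                                socket.SOCK_DGRAM)
    pid = os.fork()
    if pid == 0:  # child: runs only the forwarding path
        parent_ctrl.close()
        status = 1
        try:
            forwarder(child_ctrl, rx, tx)
            status = 0
        finally:
            os._exit(status)  # never return into the parent's code
    child_ctrl.close()
    return pid, parent_ctrl
```

The control process keeps its own event loop and talks to the forwarder only through `parent_ctrl`. This is loosely analogous to how userspace exchanges messages with the kernel datapath rather than sharing data structures with it.
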