> On Feb 8, 2006, at 7:06 PM, Jean-Christophe Hugly wrote:
>
>> But should I understand from all this that the "direct" mode will never
>> actually work ? It seems that if you need at least two transports, then
>> none of them can be the hardwired unique one, right ? Unless there's a
>> built-in switch between a built-in self and the built-in other
>> transport.
>
> Some of the transport layers are able to handle the messages to
> "self". However, as we decide to let "self" do this type of work no
> effort was spending on making sure they do it. Our first concern
> was/is/will be about performance, and "self" really do a great job. So
> the quick answer to your question is no, there is no way to limit the
> number of transports to one.
>
> Long ago, before the latest version of the BTL (byte transport
> layer), we had something called the PTL. They were used with another
> set of PML (protocol management layer). I wrote a specific PML
> (called uniq) that was able to handle only one device (plus "self").
> The latency went down by a little bit (around 0.3 micro-seconds).
> Anyway, the old openib PTL never reached a stable state so this will
> not help you :(. As we plan to drop all support for the old
> generation of PML/PTL, I don't think is a wise idea to spend time on
> the openib PTL to make it working with uniq ...
>
> Thanks,
> george.

With the change to ob1/BTLs, there was also a refactoring of data structures that reduced the overall latency through the stack. As Galen indicated, if you do a direct comparison with send/recv semantics, I think you will find that the overall latency through the stack is lower than in other implementations (on the order of 0.5 us).