> > So far, the best latency I got from ompi is 5.24 us, and the best I
> > got from mvapich is 3.15.
> >
> > I am perfectly ready to accept that ompi scales better and that may be
> > more important (except to the marketing dept :-) ), but I do not
> > understand your explanation based on small-message RDMA. Either I
> > misunderstood something badly (my best guess), or the 2 us are lost to
> > something else than an RDMA-size tradeoff.
>
> Again this is small message RDMA with polling versus send/receive
> semantics, we will be adding small message RDMA and should have
> performance equal to that of mvapich for small messages, but it is only
> relevant for a small working set of peers / micro benchmarks.

Thanks a lot. I was being fooled by various levels of size thresholds in
the mvapich code. It was indeed doing RDMA for small messages. After
turning that off, I get numbers comparable to yours. mvapich still beats
ompi by a hair on my configuration (5.11 vs. 5.25 us), but that's in the
near-irrelevant range compared to the other benefits.

From an adoption perspective, though, the ability to shine in
micro-benchmarks is important, even if it means using ad-hoc tuning.
There is some justification for it, after all: there are small clusters
out there (many more than big ones, in fact), so taking maximum advantage
of a small scale is relevant.

When do you plan on having the small-msg RDMA option available?

J-C

--
Jean-Christophe Hugly <j...@pantasys.com>
PANTA
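
For reference, the latency figures quoted in this thread can be compared with a few lines of arithmetic. This is only a sketch: the four numbers are copied from the messages above, while the variable and function names are made up for illustration and do not come from ompi or mvapich.

```python
# Latencies in microseconds, as reported in this thread.
ompi_first_run = 5.24        # ompi, send/receive semantics
mvapich_small_rdma = 3.15    # mvapich with small-message RDMA enabled
mvapich_no_rdma = 5.11       # mvapich after turning small-message RDMA off
ompi_second_run = 5.25       # ompi figure quoted next to the 5.11 result


def gap_us(slower, faster):
    """Absolute latency gap in microseconds, rounded to 2 decimals."""
    return round(slower - faster, 2)


def gap_percent(slower, faster):
    """Gap relative to the faster figure, in percent, rounded to 1 decimal."""
    return round((slower - faster) / faster * 100, 1)


# Gap while mvapich was still using small-message RDMA (the "2 us").
rdma_gap = gap_us(ompi_first_run, mvapich_small_rdma)
rdma_gap_pct = gap_percent(ompi_first_run, mvapich_small_rdma)

# Gap once both stacks use send/receive for small messages ("by a hair").
plain_gap = gap_us(ompi_second_run, mvapich_no_rdma)
plain_gap_pct = gap_percent(ompi_second_run, mvapich_no_rdma)

print(f"with small-message RDMA:    {rdma_gap} us ({rdma_gap_pct}%)")
print(f"without small-message RDMA: {plain_gap} us ({plain_gap_pct}%)")
```

The first gap is the roughly 2 us attributed to small-message RDMA with polling. The second is the remaining difference once that path is disabled in mvapich.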