Re: [O-MPI users] direct openib btl and latency

Galen Shipman Thu, 9 Feb 2006 17:15:37 -0500


On Feb 9, 2006, at 3:03 PM, Jean-Christophe Hugly wrote:

On Thu, 2006-02-09 at 14:05 -0700, Ron Brightwell wrote:
[...]
From an adoption perspective, though, the ability to shine in
micro-benchmarks is important, even if it means using ad-hoc tuning.
There is some justification for it after all. There are small clusters
out there (many more than big ones, in fact), so taking maximum advantage
of a small scale is relevant.
I'm obliged to point out that you jumped to a conclusion -- possibly true
in some cases, but not always.

You assumed that a performance increase for a two-node micro-benchmark
would result in an application performance increase for a small cluster.
Using RDMA for short messages is the default on small clusters *because*
of the two-node micro-benchmark, not because the cluster is small.
No, I assumed it based on comparisons between doing and not doing small
msg rdma at various scales, from a paper Galen pointed out to me.
http://www.cs.unm.edu/~treport/tr/05-10/Infiniband.pdf

Hmm, this is not what I would conclude from my results. In fact, if you
look at the NPB results in my paper, you will see that Open MPI
outperforms in the CG and FT benchmarks at both 32 and 64 nodes without
SRQ. The crossover point you are referring to must be the pairwise
ping-pong benchmark. So I would have to conclude that it is totally
application dependent.


- Galen

Benchmarks are what they are. In the above paper, the tests place the
cross-over at around 64 nodes and that confirms a number of anecdotal
reports I got. It may well be that in some situations, small-msg rdma is
better only for 2 nodes, but that's not such a likely scenario; reality
is sometimes linear (at least at our scale :-) ) after all.

The scale threshold could be tunable, couldn't it?

--
Jean-Christophe Hugly <j...@pantasys.com>
PANTA
