On 31/10/2014 00:24, Gus Correa wrote:
> 2) Any recommendation for the values of the
> various vader btl parameters?
> [There are 12 of them in OMPI 1.8.3!
> That is a real challenge to get right.]
>
> Which values did you use in your benchmarks?
> Defaults?
> Other?
>
> In particular, is there an optimal value for the eager/rendezvous
> threshold value? (btl_vader_eager_limit, default=4kB)
> [The INRIA web site suggests 32kB for the sm+knem counterpart
> (btl_sm_eager_limit, default=4kB).]

There's no perfect value, and no easy way to tune all this.

The impact of direct copy mechanisms such as XPMEM/KNEM/CMA depends on
the contention on your memory bus and caches. If you're doing an
Alltoall, the optimal threshold for enabling them will be much lower
than if you're doing a pingpong, because doing a single copy instead of
two usually helps more when the memory subsystem is overloaded. It
also depends on your process placement and on which caches (and cache
sizes) are shared between the processes.

Unfortunately, microbenchmarks will hardly help you decide on a better
threshold, because performance also depends on the state of buffers in
caches (did the application write the send buffer recently? will the
application read the buffer soon? microbenchmarks ignore these), and each
copy strategy may have a different impact on caches (which process is
reading and writing, from which processor, and from/to which buffer?).

So I'd say don't bother tuning things for too long...
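
If you still want a quick comparison, it is probably more informative to
time your real application at a couple of thresholds than to run a
microbenchmark. The sketch below only builds the mpirun command lines for
such a sweep. The two values are the ones mentioned in this thread (the
4kB default and the 32kB suggested for sm+knem); the application name
./my_app and the process count are hypothetical placeholders, and the
"vader,self" BTL selection is an assumption about your setup.

```python
import shlex


def sweep_commands(app, nprocs, eager_limits):
    """Return one mpirun command line per candidate eager limit (bytes)."""
    commands = []
    for limit in eager_limits:
        argv = [
            "mpirun", "-np", str(nprocs),
            # Assumption: restrict to the vader (shared-memory) and self BTLs.
            "--mca", "btl", "vader,self",
            # Parameter named in this thread; the value is in bytes.
            "--mca", "btl_vader_eager_limit", str(limit),
            app,  # hypothetical application binary
        ]
        commands.append(shlex.join(argv))
    return commands


if __name__ == "__main__":
    # 4096 is the default quoted above; 32768 is the sm+knem suggestion.
    for cmd in sweep_commands("./my_app", 16, [4096, 32768]):
        print(cmd)
```

Run each printed command a few times with your usual process placement and
compare wall-clock times. If the difference is within the noise, keep the
default.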

Brice

