On 31/10/2014 00:24, Gus Correa wrote:
> 2) Any recommendation for the values of the
> various vader btl parameters?
> [There are 12 of them in OMPI 1.8.3!
> That is a real challenge to get right.]
>
> Which values did you use in your benchmarks?
> Defaults?
> Other?
>
> In particular, is there an optimal value for the eager/rendezvous
> threshold value? (btl_vader_eager_limit, default=4kB)
> [The INRIA web site suggests 32kB for the sm+knem counterpart
> (btl_sm_eager_limit, default=4kB).]
There's no perfect value, and no easy way to tune all this. The impact of direct copy mechanisms such as XPMEM/KNEM/CMA depends on the contention in your memory bus and caches. If you're doing an Alltoall, the optimal threshold for enabling them will be much lower than if you're doing a pingpong, because doing a single copy instead of two usually helps more when the memory subsystem is overloaded. It also depends on your process placement and on which cache (and cache size) is shared between the processes.

Unfortunately, microbenchmarks will hardly help you decide on a better threshold, because performance also depends on the state of the buffers in the caches (did the application write the send buffer recently? will the application read the receive buffer soon? microbenchmarks ignore these), and each copy strategy may have a different impact on the caches (which process is reading and writing, from which processor, and from/to which buffer?).

So I'd say don't bother tuning things for too long...

Brice
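If you do want to try a few thresholds, it is probably better to do it on the real application than on a microbenchmark, for the cache-state reasons given above. Here is a minimal sketch of that, assuming Open MPI's usual `mpirun --mca <name> <value>` syntax. The list of candidate values, the process count and the `./my_app` binary are placeholders, not recommendations; only `btl_vader_eager_limit` and the 4 kB / 32 kB figures come from this thread.

```python
import shlex

# Candidate eager limits in bytes: the 4 kB default and the 32 kB value
# mentioned in the thread, plus two points in between (illustrative only).
EAGER_LIMITS = [4096, 8192, 16384, 32768]


def mpirun_command(eager_limit, nprocs, app):
    """Build an mpirun command line that sets btl_vader_eager_limit."""
    return [
        "mpirun",
        "-np", str(nprocs),
        "--mca", "btl", "vader,self",
        "--mca", "btl_vader_eager_limit", str(eager_limit),
    ] + list(app)


def sweep(nprocs, app):
    """Return one command line per candidate eager limit."""
    return [mpirun_command(limit, nprocs, app) for limit in EAGER_LIMITS]


if __name__ == "__main__":
    # "./my_app" is a hypothetical placeholder for the real application.
    for cmd in sweep(16, ["./my_app"]):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The script only prints the command lines; run and time them with the same process placement you use in production, since the best threshold depends on which caches the processes share.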