Re: [OMPI users] mkl threaded works in serail but not in parallel

2016-06-22 Thread Ralph Castain
Sure - for example, if you intend to run 4 threads, then --map-by core:pe=4 (assuming you are running OMPI 1.10 or higher) will bind each process to 4 cores in a disjoint pattern (i.e., no sharing). > On Jun 22, 2016, at 3:37 AM, Gilles Gouaillardet > wrote: > > my point is the way I (almost)

Re: [OMPI users] mkl threaded works in serail but not in parallel

2016-06-22 Thread Gilles Gouaillardet
my point is the way I (almost) always use it is: export KMP_AFFINITY=compact,granularity=fine. The trick is I rely on Open MPI and/or the batch manager to pin MPI tasks on disjoint core sets. That is obviously not the case with mpirun --bind-to none ... but that can be achieved with the appropriate

Re: [OMPI users] mkl threaded works in serail but not in parallel

2016-06-22 Thread Jeff Hammond
KMP_AFFINITY is essential for performance. One just needs to set it to something that distributes the threads properly. Not setting KMP_AFFINITY means no affinity and thus inheriting from process affinity mask. Jeff On Wednesday, June 22, 2016, Gilles Gouaillardet wrote: > my bad, I was assum

Re: [OMPI users] mkl threaded works in serail but not in parallel

2016-06-22 Thread Gilles Gouaillardet
my bad, I was assuming KMP_AFFINITY was used so let me put it this way : do *not* use KMP_AFFINITY with mpirun -bind-to none, otherwise, you will very likely end up doing time sharing ... Cheers, Gilles On 6/22/2016 5:07 PM, Jeff Hammond wrote: Linux should not put more than one thread

Re: [OMPI users] mkl threaded works in serail but not in parallel

2016-06-22 Thread Jeff Hammond
Linux should not put more than one thread on a core if there are free cores. Depending on cache/bandwidth needs, it may or may not be better to colocate on the same socket. KMP_AFFINITY will pin the OpenMP threads. This is often important for MKL performance. See https://software.intel.com/en-u

Re: [OMPI users] mkl threaded works in serail but not in parallel

2016-06-22 Thread Gilles Gouaillardet
Remi, Keep in mind this is still suboptimal. If you run 2 tasks per node, there is a risk that threads from different ranks end up bound to the same core, which means time sharing and a drop in performance. Cheers, Gilles On 6/22/2016 4:45 PM, remi marchal wrote: Dear Gilles, Thanks a lo
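
The risk described above (threads from different ranks landing on the same core) can be checked once each rank's allowed core set is known. The following is only a minimal sketch: the function name `shared_cores` and the example core sets are hypothetical and not taken from the thread, and the bindings are assumed to have been collected separately (for instance from each rank's Cpus_allowed_list).

```python
from itertools import combinations


def shared_cores(bindings):
    """Map each pair of ranks to the cores both ranks are allowed to use.

    `bindings` maps a rank number to an iterable of core ids. An empty
    result means the core sets are disjoint, so no time sharing between
    ranks is expected.
    """
    overlaps = {}
    for (rank_a, cores_a), (rank_b, cores_b) in combinations(sorted(bindings.items()), 2):
        common = set(cores_a) & set(cores_b)
        if common:
            overlaps[(rank_a, rank_b)] = sorted(common)
    return overlaps


# Hypothetical example: two ranks pinned to disjoint sets of 4 cores.
disjoint = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
# Hypothetical example: two unbound ranks that may both use all 8 cores.
unbound = {0: range(8), 1: range(8)}

print(shared_cores(disjoint))
print(shared_cores(unbound))
```

With disjoint sets the result is empty; with two unbound ranks every core is reported as shared, which is the situation where pinning OpenMP threads on top could lead to time sharing.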

Re: [OMPI users] mkl threaded works in serail but not in parallel

2016-06-22 Thread remi marchal
Dear Gilles, Thanks a lot. The mpirun --bind-to-none solved the problem. Thanks a lot, Regards, Rémi > On 22 June 2016 at 09:34, Gilles Gouaillardet wrote: > > Remi, > > > in the same environment, can you > > mpirun -np 1 grep Cpus_allowed_list /proc/self/status > > > it is likely

Re: [OMPI users] mkl threaded works in serail but not in parallel

2016-06-22 Thread Gilles Gouaillardet
Remi, in the same environment, can you run: mpirun -np 1 grep Cpus_allowed_list /proc/self/status ? It is likely Open MPI allows only one core, and in this case, I suspect MKL refuses to do some time sharing and hence transparently reduces the number of threads to 1. /* unless it *does* time sharin
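
To read the output of the check suggested above, it helps to expand the Cpus_allowed_list value into individual core ids. This is a minimal sketch, not part of the original message: the helper names `parse_cpu_list` and `allowed_cores` are hypothetical, and the sample text is made up for illustration. It assumes the usual Linux list format of comma-separated ids and ranges, such as 0-3,8.

```python
def parse_cpu_list(text):
    """Expand a Linux CPU list such as "0-3,8" into a sorted list of core ids."""
    cores = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cores.update(range(int(first), int(last) + 1))
        else:
            cores.add(int(part))
    return sorted(cores)


def allowed_cores(status_text):
    """Return the cores listed on the Cpus_allowed_list line of a /proc/<pid>/status dump."""
    for line in status_text.splitlines():
        if line.startswith("Cpus_allowed_list:"):
            return parse_cpu_list(line.split(":", 1)[1])
    raise ValueError("no Cpus_allowed_list line found")


# Made-up sample resembling what grep would print for a process bound to one core.
sample = "Cpus_allowed_list:\t5\n"
cores = allowed_cores(sample)
print(len(cores), cores)
```

If only one core comes back for a process that is expected to run 4 MKL threads, that matches the situation suspected in the message: the threads have a single core to share.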

Re: [OMPI users] mkl threaded works in serail but not in parallel

2016-06-22 Thread Jeff Hammond
Do you know for sure that MKL is only using one thread or do you merely see that the performance is consistent with it using one thread? If MPI does process pinning, it is possible for all OpenMP threads to run on one core, which means one will observe no speedup from threads (and potentially a sl

[OMPI users] mkl threaded works in serail but not in parallel

2016-06-22 Thread remi marchal
Dear Open MPI users, Today, I faced a strange problem. I am compiling a quantum chemistry package (CASTEP-16) using intel16, mkl threaded libraries and openmpi-18.1. The compilation works fine. When I ask for MKL_NUM_THREAD=4 and call the program in serial mode (without mpirun), it works perf