My bad, I was assuming KMP_AFFINITY was used.

So let me put it this way:

Do *not* use KMP_AFFINITY with mpirun --bind-to none; otherwise, you will very likely end up doing time sharing ...
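
To illustrate why that combination tends to end in time sharing, here is a toy model in Python. It is only a sketch under a stated assumption: the OpenMP runtime places thread i on the i-th core of the set the process is allowed to use, roughly what a compact-style affinity setting would do. The function names are hypothetical, and the real KMP_AFFINITY behaviour depends on the chosen mode and the Intel runtime version.

```python
def pin_threads(allowed_cores, num_threads):
    """Toy model (an assumption, not the Intel runtime): thread i is
    pinned to the i-th core of the set the process may run on."""
    return [allowed_cores[i % len(allowed_cores)] for i in range(num_threads)]


def shared_cores(placements):
    """Return the cores that carry threads from more than one rank."""
    seen = set()
    shared = set()
    for cores in placements:
        for core in set(cores):  # ignore repeats inside a single rank
            if core in seen:
                shared.add(core)
            else:
                seen.add(core)
    return sorted(shared)


node = list(range(8))  # hypothetical 8-core node

# --bind-to none: both ranks may use every core, so both pin to cores 0-3.
unbound = [pin_threads(node, 4), pin_threads(node, 4)]

# Each rank restricted to its own 4 cores: the placements are disjoint.
bound = [pin_threads(node[:4], 4), pin_threads(node[4:], 4)]

print("unbound, shared cores:", shared_cores(unbound))
print("bound, shared cores:  ", shared_cores(bound))
```

In this model the unbound case puts the threads of both ranks on the same four cores, while giving each rank its own core set leaves nothing shared.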


Cheers,


Gilles


On 6/22/2016 5:07 PM, Jeff Hammond wrote:
Linux should not put more than one thread on a core if there are free cores. Depending on cache/bandwidth needs, it may or may not be better to colocate on the same socket.

KMP_AFFINITY will pin the OpenMP threads. This is often important for MKL performance. See https://software.intel.com/en-us/node/522691 for details.

Jeff

On Wed, Jun 22, 2016 at 9:47 AM, Gilles Gouaillardet <gil...@rist.or.jp <mailto:gil...@rist.or.jp>> wrote:

    Remi,


    Keep in mind this is still suboptimal.

    If you run 2 tasks per node, there is a risk that threads from
    different ranks end up bound to the same core, which means time
    sharing and a drop in performance.


    Cheers,


    Gilles


    On 6/22/2016 4:45 PM, remi marchal wrote:
    Dear Gilles,

    Thanks a lot.

    The mpirun --bind-to-none option solved the problem.

    Thanks a lot,

    Regards,

    Rémi





    On 22 June 2016 at 09:34, Gilles Gouaillardet <gil...@rist.or.jp
    <mailto:gil...@rist.or.jp>> wrote:

    Remi,


    In the same environment, can you run

    mpirun -np 1 grep Cpus_allowed_list /proc/self/status


    It is likely that Open MPI allows only one core; in that case, I
    suspect MKL refuses to do time sharing and hence transparently
    reduces the number of threads to 1.
    /* unless it *does* time sharing, and you observed 4 threads
    with the performance of one */
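
The same check can be made from inside a program. The sketch below, in Python, reads the Cpus_allowed_list line that the grep command above prints and expands it into core ids. The function names are hypothetical, and it assumes a Linux /proc filesystem.

```python
def parse_cpu_list(text):
    """Expand a Linux CPU list such as "0-3,8" into sorted core ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def allowed_cpus(status_path="/proc/self/status"):
    """Return the cores this process may run on, or [] if unavailable."""
    try:
        with open(status_path) as status:
            for line in status:
                if line.startswith("Cpus_allowed_list:"):
                    return parse_cpu_list(line.split(":", 1)[1])
    except OSError:
        pass  # not Linux, or /proc is not mounted
    return []


if __name__ == "__main__":
    cpus = allowed_cpus()
    print(f"{len(cpus)} core(s) allowed: {cpus}")
```

If this reports a single core when launched under mpirun but several when launched directly, the binding applied by mpirun is the likely reason the threads are not spreading out.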


    mpirun --bind-to none ...

    will tell Open MPI *not* to bind to one core, and that should
    help a bit.

    Note that this is suboptimal: you should really ask mpirun to
    allocate 4 cores per task, but I cannot remember the correct
    command line for that.
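
One possibility, not confirmed anywhere in this thread, is the PE=n modifier of --map-by that Open MPI documents from the 1.8 series on. The Python sketch below only assembles such a command line. The function name and the program name ./castep.mpi are hypothetical, and the exact option spelling should be checked against the mpirun man page of the installed version.

```python
import os
import shlex


def mpirun_command(program, ranks, cores_per_rank):
    """Build an mpirun argv and environment giving each rank its own cores.

    Assumption: an Open MPI version that accepts --map-by <obj>:PE=n.
    """
    env = dict(os.environ)
    # One MKL thread per core reserved for the rank.
    env["MKL_NUM_THREADS"] = str(cores_per_rank)
    argv = [
        "mpirun", "-np", str(ranks),
        "--map-by", f"slot:PE={cores_per_rank}",  # n processing elements per rank
        "--bind-to", "core",
        "--report-bindings",                      # print the resulting binding
        "-x", "MKL_NUM_THREADS",                  # export the variable to the ranks
        program,
    ]
    return argv, env


if __name__ == "__main__":
    argv, env = mpirun_command("./castep.mpi", ranks=2, cores_per_rank=4)
    print(" ".join(shlex.quote(arg) for arg in argv))
    # To launch for real: subprocess.run(argv, env=env, check=True)
```

The --report-bindings output is a quick way to confirm that each rank really received its own set of cores before trusting any timing.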

    Cheers,

    Gilles




    On 6/22/2016 4:17 PM, remi marchal wrote:
    Dear openmpi users,

    Today, I faced a strange problem.

    I am compiling a quantum chemistry package (CASTEP-16) using
    intel16, the threaded MKL libraries and openmpi-18.1.

    The compilation works fine.

    When I set MKL_NUM_THREADS=4 and call the program in serial
    mode (without mpirun), it works perfectly and uses 4 threads.

    However, when I start the program with mpirun, even with 1 MPI
    process, the program runs but only with 1 thread.

    I have never had this kind of trouble.

    Does anyone have an explanation?

    Regards,

    Rémi






    _______________________________________________
    users mailing list
    us...@open-mpi.org <mailto:us...@open-mpi.org>
    Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
    Link to this post: http://www.open-mpi.org/community/lists/users/2016/06/29495.php





--
Jeff Hammond
jeff.scie...@gmail.com <mailto:jeff.scie...@gmail.com>
http://jeffhammond.github.io/


_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2016/06/29500.php
