Thank you -- I looked at the man page, and it is not clear to me what
pe=2 does. Is that the number of threads per process? So if I want 16 MPI
processes with 2 threads each, i.e. 32 cores (two nodes), would it be

mpirun -map-by core:pe=2 -n 16

?

Sorry if I mangled this.
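
If that is right, the full line I plan to try is something like the
following (just a sketch; "my_app" stands in for my actual executable).
I added --report-bindings so I can check where the ranks actually land:

mpirun --report-bindings -map-by core:pe=2 -n 16 ./my_app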


Ron

---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3


On Fri, Mar 25, 2016 at 2:03 PM, Ralph Castain <r...@open-mpi.org> wrote:
> Okay, what I would suggest is that you use the following cmd line:
>
> mpirun -map-by core:pe=2 (or 8 or whatever number you want)
>
> This should give you the best performance as it will tight-pack the procs and 
> assign them to the correct number of cores. See if that helps
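>
> For example (just a sketch; ./a.out is a placeholder for your executable), 16 procs with 2 cores each would be:
>
> mpirun -map-by core:pe=2 -n 16 --report-bindings ./a.out
>
> and --report-bindings will show you exactly which cores each rank was bound to.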
>
>> On Mar 25, 2016, at 10:38 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>
>> 1.10.2
>>
>> Ron
>>
>> ---
>> Ron Cohen
>> recoh...@gmail.com
>> skypename: ronaldcohen
>> twitter: @recohen3
>>
>>
>> On Fri, Mar 25, 2016 at 1:30 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>> Hmmm…what version of OMPI are you using?
>>>
>>>
>>> On Mar 25, 2016, at 10:27 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>
>>> --report-bindings didn't report anything
>>> ---
>>> Ron Cohen
>>> recoh...@gmail.com
>>> skypename: ronaldcohen
>>> twitter: @recohen3
>>>
>>>
>>> On Fri, Mar 25, 2016 at 1:24 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>
>>> --display-allocation didn't seem to give useful information:
>>>
>>> ======================   ALLOCATED NODES   ======================
>>>       n005: slots=16 max_slots=0 slots_inuse=0 state=UP
>>>       n008.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>>>       n007.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>>>       n006.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>>> =================================================================
>>>
>>> for
>>> mpirun -display-allocation  --map-by ppr:8:node -n 32
>>>
>>> Ron
>>>
>>> ---
>>> Ron Cohen
>>> recoh...@gmail.com
>>> skypename: ronaldcohen
>>> twitter: @recohen3
>>>
>>>
>>> On Fri, Mar 25, 2016 at 1:17 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>
>>> Actually there was the same number of procs per node in each case. I
>>> verified this by logging into the nodes while they were running--in
>>> both cases, 4 per node.
>>>
>>> Ron
>>>
>>> ---
>>> Ron Cohen
>>> recoh...@gmail.com
>>> skypename: ronaldcohen
>>> twitter: @recohen3
>>>
>>>
>>> On Fri, Mar 25, 2016 at 1:14 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>
>>>
>>> On Mar 25, 2016, at 9:59 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>
>>> It is very strange, but my program runs slower with any of these
>>> choices than if I simply use:
>>>
>>> mpirun  -n 16
>>> with
>>> #PBS -l
>>> nodes=n013.cluster.com:ppn=4+n014.cluster.com:ppn=4+n015.cluster.com:ppn=4+n016.cluster.com:ppn=4
>>> for example.
>>>
>>>
>>> This command will tightly pack as many procs as possible on a node - note
>>> that we may well not see the PBS directives regarding number of ppn. Add
>>> --display-allocation and let's see how many slots we think were assigned on
>>> each node.
>>>
>>>
>>> The timing for the latter is 165 seconds, and for
>>> #PBS -l nodes=4:ppn=16,pmem=1gb
>>> mpirun  --map-by ppr:4:node -n 16
>>> it is 368 seconds.
>>>
>>>
>>> It will typically be faster if you pack more procs/node as they can use
>>> shared memory for communication.
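>>> For instance (a sketch, assuming 16 slots per node; ./a.out is a placeholder), an explicit by-slot mapping such as
>>>
>>> mpirun -map-by slot -n 16 ./a.out
>>>
>>> would put all 16 procs on the first node, so every rank talks to its peers through shared memory.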
>>>
>>>
>>> Ron
>>>
>>> ---
>>> Ron Cohen
>>> recoh...@gmail.com
>>> skypename: ronaldcohen
>>> twitter: @recohen3
>>>
>>>
>>> On Fri, Mar 25, 2016 at 12:43 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>
>>>
>>> On Mar 25, 2016, at 9:40 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>
>>> Thank you! I will try it!
>>>
>>>
>>> What would
>>> -cpus-per-proc  4 -n 16
>>> do?
>>>
>>>
>>> This would bind each process to 4 cores, filling each node with procs until
>>> the cores on that node were exhausted, to a total of 16 processes within the
>>> allocation.
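>>>
>>> For example (a sketch; ./a.out is a placeholder):
>>>
>>> mpirun -cpus-per-proc 4 -n 16 ./a.out
>>>
>>> On 16-core nodes that works out to 4 procs per node, each bound to 4 cores.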
>>>
>>>
>>> Ron
>>> ---
>>> Ron Cohen
>>> recoh...@gmail.com
>>> skypename: ronaldcohen
>>> twitter: @recohen3
>>>
>>>
>>> On Fri, Mar 25, 2016 at 12:38 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>
>>> Add -rank-by node to your cmd line. You’ll still get 4 procs/node, but they
>>> will be ranked by node instead of consecutively within a node.
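>>>
>>> For example (a sketch; ./a.out is a placeholder):
>>>
>>> mpirun --map-by ppr:4:node -rank-by node -n 16 ./a.out
>>>
>>> still gives 4 procs per node, but numbers the ranks across the nodes (rank 0 on the first node, rank 1 on the second, and so on) rather than 0-3 on the first node.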
>>>
>>>
>>>
>>> On Mar 25, 2016, at 9:30 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>
>>> I am using
>>>
>>> mpirun  --map-by ppr:4:node -n 16
>>>
>>> and this loads the processes in round robin fashion. This seems to be
>>> twice as slow for my code as loading them node by node, 4 processes
>>> per node.
>>>
>>> How can I load them node by node instead of round robin?
>>>
>>> Thanks!
>>>
>>> Ron
>>>
>>>
>>> ---
>>> Ron Cohen
>>> recoh...@gmail.com
>>> skypename: ronaldcohen
>>> twitter: @recohen3
>>>
>>> ---
>>> Ronald Cohen
>>> Geophysical Laboratory
>>> Carnegie Institution
>>> 5251 Broad Branch Rd., N.W.
>>> Washington, D.C. 20015
