Or is it: mpirun -map-by core:pe=8 -n 16 ?
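
(If I understand the man page, pe=N binds N cores to each rank, so 16 ranks with pe=2 would use 16 x 2 = 32 cores, i.e. two 16-core nodes, while pe=8 would need 16 x 8 = 128 cores, i.e. eight nodes. Please correct me if that reading is wrong.)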

---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3


On Fri, Mar 25, 2016 at 2:10 PM, Ronald Cohen <recoh...@gmail.com> wrote:
> Thank you -- I looked at the man page, and it is not clear to me what
> pe=2 does. Is that the number of threads? So if I want 16 MPI procs
> with 2 threads each, i.e. 32 cores (two nodes), is it
>
> mpirun -map-by core:pe=2 -n 16
>
> ?
>
> Sorry if I mangled this.
>
>
> Ron
>
> On Fri, Mar 25, 2016 at 2:03 PM, Ralph Castain <r...@open-mpi.org> wrote:
>> Okay, what I would suggest is that you use the following cmd line:
>>
>> mpirun -map-by core:pe=2 (or 8 or whatever number you want)
>>
>> This should give you the best performance, as it will tightly pack the procs
>> and assign each one the requested number of cores. See if that helps.
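>>
>> For example (with "my_app" standing in for your executable), you can add
>> --report-bindings to verify where each rank ends up:
>>
>> mpirun -map-by core:pe=2 --report-bindings -n 16 ./my_app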
>>
>>> On Mar 25, 2016, at 10:38 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>
>>> 1.10.2
>>>
>>> Ron
>>>
>>> On Fri, Mar 25, 2016 at 1:30 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>> Hmmm…what version of OMPI are you using?
>>>>
>>>>
>>>> On Mar 25, 2016, at 10:27 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>>
>>>> --report-bindings didn't report anything
>>>>
>>>> On Fri, Mar 25, 2016 at 1:24 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>>
>>>> Adding --display-allocation
>>>> didn't seem to give useful information:
>>>>
>>>> ======================   ALLOCATED NODES   ======================
>>>>       n005: slots=16 max_slots=0 slots_inuse=0 state=UP
>>>>       n008.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>>>>       n007.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>>>>       n006.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>>>> =================================================================
>>>>
>>>> for
>>>> mpirun -display-allocation --map-by ppr:8:node -n 32
>>>>
>>>> Ron
>>>>
>>>> On Fri, Mar 25, 2016 at 1:17 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>>
>>>> Actually, there was the same number of procs per node in each case. I
>>>> verified this by logging into the nodes while they were running -- in
>>>> both cases, 4 per node.
>>>>
>>>> Ron
>>>>
>>>> On Fri, Mar 25, 2016 at 1:14 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>
>>>>
>>>> On Mar 25, 2016, at 9:59 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>>
>>>> It is very strange, but my program runs slower with any of these
>>>> choices than if I simply use:
>>>>
>>>> mpirun -n 16
>>>> with
>>>> #PBS -l
>>>> nodes=n013.cluster.com:ppn=4+n014.cluster.com:ppn=4+n015.cluster.com:ppn=4+n016.cluster.com:ppn=4
>>>> for example.
>>>>
>>>>
>>>> This command will tightly pack as many procs as possible on a node. Note
>>>> that we may well not see the PBS directives regarding the number of ppn. Add
>>>> --display-allocation and let's see how many slots we think were assigned on
>>>> each node.
>>>>
>>>>
>>>> The timing for the latter is 165 seconds, and for
>>>> #PBS -l nodes=4:ppn=16,pmem=1gb
>>>> mpirun --map-by ppr:4:node -n 16
>>>> it is 368 seconds.
>>>>
>>>>
>>>> It will typically be faster if you pack more procs per node, as they can
>>>> use shared memory for communication.
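>>>>
>>>> For instance (with "my_app" standing in for your executable), compare:
>>>>
>>>> mpirun --map-by node -n 16 ./my_app   (ranks spread round-robin across nodes)
>>>> mpirun --map-by core -n 16 ./my_app   (ranks packed onto consecutive cores)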
>>>>
>>>>
>>>> Ron
>>>>
>>>> On Fri, Mar 25, 2016 at 12:43 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>
>>>>
>>>> On Mar 25, 2016, at 9:40 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>>
>>>> Thank you! I will try it!
>>>>
>>>>
>>>> What would
>>>> -cpus-per-proc 4 -n 16
>>>> do?
>>>>
>>>>
>>>> This would bind each process to 4 cores, filling each node with procs until
>>>> the cores on that node were exhausted, to a total of 16 processes within 
>>>> the
>>>> allocation.
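>>>>
>>>> For example (with "my_app" standing in for your executable):
>>>>
>>>> mpirun -cpus-per-proc 4 -n 16 ./my_app
>>>>
>>>> Assuming 16-core nodes like yours, that works out to 4 procs on each of 4 nodes.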
>>>>
>>>>
>>>> Ron
>>>>
>>>> On Fri, Mar 25, 2016 at 12:38 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>
>>>> Add -rank-by node to your cmd line. You’ll still get 4 procs/node, but they
>>>> will be ranked by node instead of consecutively within a node.
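>>>>
>>>> So, with "my_app" standing in for your executable, the full command would be:
>>>>
>>>> mpirun --map-by ppr:4:node --rank-by node -n 16 ./my_app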
>>>>
>>>>
>>>>
>>>> On Mar 25, 2016, at 9:30 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>>
>>>> I am using
>>>>
>>>> mpirun --map-by ppr:4:node -n 16
>>>>
>>>> and this loads the processes in round-robin fashion. This seems to be
>>>> twice as slow for my code as loading them node by node, 4 processes
>>>> per node.
>>>>
>>>> How can I load them node by node instead of round-robin?
>>>>
>>>> Thanks!
>>>>
>>>> Ron
>>>>
>>>> ---
>>>> Ronald Cohen
>>>> Geophysical Laboratory
>>>> Carnegie Institution
>>>> 5251 Broad Branch Rd., N.W.
>>>> Washington, D.C. 20015