The -n 32 run in contrast gave:

[n011.cluster.com:05847] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [B/B/B/B/B/B/B/B][./././././././.]
[n011.cluster.com:05847] MCW rank 1 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]], socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][B/B/B/B/B/B/B/B]
[n011.cluster.com:05847] MCW rank 2 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [B/B/B/B/B/B/B/B][./././././././.]
[n011.cluster.com:05847] MCW rank 3 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]], socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][B/B/B/B/B/B/B/B]
[n011.cluster.com:05847] MCW rank 4 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [B/B/B/B/B/B/B/B][./././././././.]
[n011.cluster.com:05847] MCW rank 5 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]], socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][B/B/B/B/B/B/B/B]
[n011.cluster.com:05847] MCW rank 6 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [B/B/B/B/B/B/B/B][./././././././.]
[n011.cluster.com:05847] MCW rank 7 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]], socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][B/B/B/B/B/B/B/B]
[n019.cluster.com:02562] MCW rank 24 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [B/B/B/B/B/B/B/B][./././././././.]

---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3
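Each line of that --report-bindings output is one rank's binding mask, with one bracket group per socket and B marking a bound core. Every rank here is bound to a full eight-core socket, so on n011 ranks 0, 2, 4, and 6 all share the same eight cores of socket 0. A minimal sketch of binding each rank to its own core instead, assuming Open MPI 1.8-era options and a placeholder executable ./app:

# Sketch: 8 ranks per node, each bound to a single core, so
# --report-bindings shows one B per rank instead of a whole socket.
mpirun --map-by ppr:8:node --bind-to core --report-bindings -n 32 ./app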
On Fri, Mar 25, 2016 at 1:29 PM, Ronald Cohen <recoh...@gmail.com> wrote:
> Sorry, they are in stderr.
>
> What should I learn from:
>
> [n001.cluster.com:27958] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [B/B/B/B/B/B/B/B][./././././././.]
> [n001.cluster.com:27958] MCW rank 1 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]], socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][B/B/B/B/B/B/B/B]
> [n001.cluster.com:27958] MCW rank 2 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [B/B/B/B/B/B/B/B][./././././././.]
> [n001.cluster.com:27958] MCW rank 3 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]], socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][B/B/B/B/B/B/B/B]
> [n001.cluster.com:27958] MCW rank 4 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [B/B/B/B/B/B/B/B][./././././././.]
> [n001.cluster.com:27958] MCW rank 5 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]], socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][B/B/B/B/B/B/B/B]
> [n001.cluster.com:27958] MCW rank 6 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [B/B/B/B/B/B/B/B][./././././././.]
> [n001.cluster.com:27958] MCW rank 7 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]], socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././.][B/B/B/B/B/B/B/B]
> [n002
>
> etc?
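A quick way to digest output like the above is to count binding lines per node daemon. A sketch, assuming the stderr was captured to a file (bindings.err is a placeholder name):

# Each binding line starts with [hostname:pid]; counting lines per
# prefix gives the number of ranks launched on each node.
grep -o '^\[[^]]*\]' bindings.err | sort | uniq -c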
> On Fri, Mar 25, 2016 at 1:27 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>> --report-bindings didn't report anything.
>>
>> On Fri, Mar 25, 2016 at 1:24 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>>> --display-allocation didn't seem to give useful information:
>>>
>>> ====================== ALLOCATED NODES ======================
>>> n005: slots=16 max_slots=0 slots_inuse=0 state=UP
>>> n008.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>>> n007.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>>> n006.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>>> =================================================================
>>>
>>> for
>>>
>>> mpirun -display-allocation --map-by ppr:8:node -n 32
>>>
>>> Ron
>>>
>>> On Fri, Mar 25, 2016 at 1:17 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>> Actually, there was the same number of procs per node in each case. I
>>>> verified this by logging into the nodes while they were running: in
>>>> both cases, 4 per node.
>>>>
>>>> Ron
>>>>
>>>> On Fri, Mar 25, 2016 at 1:14 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>
>>>>>> On Mar 25, 2016, at 9:59 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>>>>
>>>>>> It is very strange, but my program runs slower with any of these
>>>>>> choices than if I simply use:
>>>>>>
>>>>>> mpirun -n 16
>>>>>>
>>>>>> with
>>>>>>
>>>>>> #PBS -l nodes=n013.cluster.com:ppn=4+n014.cluster.com:ppn=4+n015.cluster.com:ppn=4+n016.cluster.com:ppn=4
>>>>>>
>>>>>> for example.
>>>>>
>>>>> This command will tightly pack as many procs as possible on a node - note
>>>>> that we may well not see the PBS directives regarding number of ppn. Add
>>>>> --display-allocation and let's see how many slots we think were assigned
>>>>> on each node.
>>>>>
>>>>>> The timing for the latter is 165 seconds, and for
>>>>>>
>>>>>> #PBS -l nodes=4:ppn=16,pmem=1gb
>>>>>> mpirun --map-by ppr:4:node -n 16
>>>>>>
>>>>>> it is 368 seconds.
>>>>>
>>>>> It will typically be faster if you pack more procs/node, as they can use
>>>>> shared memory for communication.
>>>>>
>>>>>> Ron
>>>>>>
>>>>>> On Fri, Mar 25, 2016 at 12:43 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>
>>>>>>>> On Mar 25, 2016, at 9:40 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Thank you! I will try it!
>>>>>>>>
>>>>>>>> What would
>>>>>>>>
>>>>>>>> -cpus-per-proc 4 -n 16
>>>>>>>>
>>>>>>>> do?
>>>>>>>
>>>>>>> This would bind each process to 4 cores, filling each node with procs
>>>>>>> until the cores on that node were exhausted, to a total of 16 processes
>>>>>>> within the allocation.
>>>>>>>
>>>>>>>> Ron
>>>>>>>>
>>>>>>>> On Fri, Mar 25, 2016 at 12:38 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>>> Add -rank-by node to your cmd line. You'll still get 4 procs/node,
>>>>>>>>> but they will be ranked by node instead of consecutively within a
>>>>>>>>> node.
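Mapping (where procs are placed) and ranking (how they are numbered) are separate controls. To make the options discussed above concrete, here is a sketch of the layouts each invocation should produce on a four-node, 16-core-per-node allocation, assuming Open MPI 1.8-era semantics and placeholder names (./app, n013-n016):

# Tightly packed: fills n013's 16 slots before touching n014.
mpirun -n 16 ./app

# 4 procs per node, ranked consecutively within each node by default:
#   n013: ranks 0-3, n014: ranks 4-7, ...
mpirun --map-by ppr:4:node -n 16 ./app

# Same placement, but ranked by node, i.e. round-robin numbering:
#   n013: ranks 0,4,8,12; n014: ranks 1,5,9,13; ...
mpirun --map-by ppr:4:node -rank-by node -n 16 ./app

# Each proc bound to 4 cores, nodes filled in order, 16 procs total:
mpirun -cpus-per-proc 4 -n 16 ./app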
>>>>>>>>>
>>>>>>>>>> On Mar 25, 2016, at 9:30 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> I am using
>>>>>>>>>>
>>>>>>>>>> mpirun --map-by ppr:4:node -n 16
>>>>>>>>>>
>>>>>>>>>> and this loads the processes in round-robin fashion. This seems to be
>>>>>>>>>> twice as slow for my code as loading them node by node, 4 processes
>>>>>>>>>> per node.
>>>>>>>>>>
>>>>>>>>>> How can I load them node by node instead of round-robin?
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> Ron
>>>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>> Ronald Cohen
>>>>>>>>>> Geophysical Laboratory
>>>>>>>>>> Carnegie Institution
>>>>>>>>>> 5251 Broad Branch Rd., N.W.
>>>>>>>>>> Washington, D.C. 20015
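Pulling the thread together, a sketch of a full submission that combines the resource request with the diagnostics suggested above, assuming PBS and a placeholder executable ./app:

#!/bin/bash
#PBS -l nodes=4:ppn=16,pmem=1gb
cd $PBS_O_WORKDIR
# 4 procs per node; print the allocation and each rank's binding to
# stderr so the placement can be checked against the timings.
mpirun --map-by ppr:4:node -display-allocation --report-bindings -n 16 ./app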