Sorry, they are in stderr.

What should I learn from:

[n001.cluster.com:27958] MCW rank 0 bound to socket 0[core 0[hwt 0]],
socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt
0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core
6[hwt 0]], socket 0[core 7[hwt 0]]: [B/B/B/B/B/B/B/B][./././././././.]
[n001.cluster.com:27958] MCW rank 1 bound to socket 1[core 8[hwt 0]],
socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core
11[hwt 0]], socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket
1[core 14[hwt 0]], socket 1[core 15[hwt 0]]:
[./././././././.][B/B/B/B/B/B/B/B]
[n001.cluster.com:27958] MCW rank 2 bound to socket 0[core 0[hwt 0]],
socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt
0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core
6[hwt 0]], socket 0[core 7[hwt 0]]: [B/B/B/B/B/B/B/B][./././././././.]
[n001.cluster.com:27958] MCW rank 3 bound to socket 1[core 8[hwt 0]],
socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core
11[hwt 0]], socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket
1[core 14[hwt 0]], socket 1[core 15[hwt 0]]:
[./././././././.][B/B/B/B/B/B/B/B]
[n001.cluster.com:27958] MCW rank 4 bound to socket 0[core 0[hwt 0]],
socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt
0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core
6[hwt 0]], socket 0[core 7[hwt 0]]: [B/B/B/B/B/B/B/B][./././././././.]
[n001.cluster.com:27958] MCW rank 5 bound to socket 1[core 8[hwt 0]],
socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core
11[hwt 0]], socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket
1[core 14[hwt 0]], socket 1[core 15[hwt 0]]:
[./././././././.][B/B/B/B/B/B/B/B]
[n001.cluster.com:27958] MCW rank 6 bound to socket 0[core 0[hwt 0]],
socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt
0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core
6[hwt 0]], socket 0[core 7[hwt 0]]: [B/B/B/B/B/B/B/B][./././././././.]
[n001.cluster.com:27958] MCW rank 7 bound to socket 1[core 8[hwt 0]],
socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core
11[hwt 0]], socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket
1[core 14[hwt 0]], socket 1[core 15[hwt 0]]:
[./././././././.][B/B/B/B/B/B/B/B]
[n002

etc?
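
One way to read these lines, offered as a sketch rather than an authoritative answer: in each bracketed mask a "B" marks a core the rank is bound to and a "." marks a core it is not bound to, with one bracket group per socket. In the output above every rank is bound to all 8 cores of one socket rather than to a single core, and the ranks on n001 alternate between socket 0 and socket 1. The Python below pulls that information out of such lines. The names parse_binding, summarize, RANK0 and RANK1 are hypothetical, and the regular expressions assume exactly the line format shown above (unwrapped onto one line per rank); other Open MPI versions may print bindings differently.

```python
import re
from collections import defaultdict

# Two lines copied from the output above, each unwrapped onto a single line.
RANK0 = (
    "[n001.cluster.com:27958] MCW rank 0 bound to socket 0[core 0[hwt 0]], "
    "socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], "
    "socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]], socket 0[core 6[hwt 0]], "
    "socket 0[core 7[hwt 0]]: [B/B/B/B/B/B/B/B][./././././././.]"
)
RANK1 = (
    "[n001.cluster.com:27958] MCW rank 1 bound to socket 1[core 8[hwt 0]], "
    "socket 1[core 9[hwt 0]], socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]], "
    "socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]], socket 1[core 14[hwt 0]], "
    "socket 1[core 15[hwt 0]]: [./././././././.][B/B/B/B/B/B/B/B]"
)

# Assumed line layout: [host:pid] MCW rank N bound to <binding list>: <mask>
LINE_RE = re.compile(
    r"\[(?P<host>[^:\]]+):\d+\] MCW rank (?P<rank>\d+) bound to "
    r"(?P<binding>.*?): (?P<mask>\[.*\])\s*$"
)
# One entry of the binding list, e.g. "socket 0[core 3[hwt 0]]"
CORE_RE = re.compile(r"socket (\d+)\[core (\d+)\[hwt [^\]]*\]\]")


def parse_binding(line):
    """Return host, rank, sockets and cores for one line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    pairs = [(int(s), int(c)) for s, c in CORE_RE.findall(m.group("binding"))]
    return {
        "host": m.group("host"),
        "rank": int(m.group("rank")),
        "sockets": sorted({s for s, _ in pairs}),
        "cores": sorted(c for _, c in pairs),
    }


def summarize(lines):
    """Group the parsed bindings by host, skipping lines that do not parse."""
    by_host = defaultdict(list)
    for line in lines:
        info = parse_binding(line)
        if info is not None:
            by_host[info["host"]].append(info)
    return dict(by_host)


if __name__ == "__main__":
    for host, ranks in summarize([RANK0, RANK1]).items():
        for info in ranks:
            print(host, "rank", info["rank"],
                  "sockets", info["sockets"], "cores", info["cores"])
```

Counting the entries per host in the result of summarize gives the number of ranks placed on each node, and the length of each "cores" list shows whether ranks are bound to a single core or to a whole socket.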

---
Ron Cohen
recoh...@gmail.com
skypename: ronaldcohen
twitter: @recohen3


On Fri, Mar 25, 2016 at 1:27 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>  --report-bindings didn't report anything
>
>
> On Fri, Mar 25, 2016 at 1:24 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>> --display-allocation
>> didn't seem to give useful information:
>>
>> ======================   ALLOCATED NODES   ======================
>>         n005: slots=16 max_slots=0 slots_inuse=0 state=UP
>>         n008.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>>         n007.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>>         n006.cluster.com: slots=16 max_slots=0 slots_inuse=0 state=UP
>> =================================================================
>>
>> for
>> mpirun -display-allocation  --map-by ppr:8:node -n 32
>>
>> Ron
>>
>>
>>
>> On Fri, Mar 25, 2016 at 1:17 PM, Ronald Cohen <recoh...@gmail.com> wrote:
>>> Actually there was the same number of procs per node in each case. I
>>> verified this by logging into the nodes while they were running--in
>>> both cases 4 per node.
>>>
>>> Ron
>>>
>>>
>>>
>>> On Fri, Mar 25, 2016 at 1:14 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>
>>>>> On Mar 25, 2016, at 9:59 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>>>
>>>>> It is very strange but my program runs slower with any of these
>>>>> choices than if I simply use:
>>>>>
>>>>> mpirun  -n 16
>>>>> with
>>>>> #PBS -l nodes=n013.cluster.com:ppn=4+n014.cluster.com:ppn=4+n015.cluster.com:ppn=4+n016.cluster.com:ppn=4
>>>>> for example.
>>>>
>>>> This command will tightly pack as many procs as possible on a node - note
>>>> that we may well not see the PBS directives regarding number of ppn. Add
>>>> --display-allocation and let’s see how many slots we think were assigned on
>>>> each node.
>>>>
>>>>>
>>>>> The timing for the latter is 165 seconds, and for
>>>>> #PBS -l nodes=4:ppn=16,pmem=1gb
>>>>> mpirun  --map-by ppr:4:node -n 16
>>>>> it is 368 seconds.
>>>>
>>>> It will typically be faster if you pack more procs/node as they can use 
>>>> shared memory for communication.
>>>>
>>>>>
>>>>> Ron
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Mar 25, 2016 at 12:43 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>
>>>>>>> On Mar 25, 2016, at 9:40 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>>>>>
>>>>>>> Thank you! I will try it!
>>>>>>>
>>>>>>>
>>>>>>> What would
>>>>>>> -cpus-per-proc  4 -n 16
>>>>>>> do?
>>>>>>
>>>>>> This would bind each process to 4 cores, filling each node with procs 
>>>>>> until the cores on that node were exhausted, to a total of 16 processes 
>>>>>> within the allocation.
>>>>>>
>>>>>>>
>>>>>>> Ron
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Mar 25, 2016 at 12:38 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>> Add -rank-by node to your cmd line. You’ll still get 4 procs/node, but 
>>>>>>>> they will be ranked by node instead of consecutively within a node.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> On Mar 25, 2016, at 9:30 AM, Ronald Cohen <recoh...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> I am using
>>>>>>>>>
>>>>>>>>> mpirun  --map-by ppr:4:node -n 16
>>>>>>>>>
>>>>>>>>> and this loads the processes in round robin fashion. This seems to be
>>>>>>>>> twice as slow for my code as loading them node by node, 4 processes
>>>>>>>>> per node.
>>>>>>>>>
>>>>>>>>> How can I not load them round robin, but node by node?
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Ron
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>> Ronald Cohen
>>>>>>>>> Geophysical Laboratory
>>>>>>>>> Carnegie Institution
>>>>>>>>> 5251 Broad Branch Rd., N.W.
>>>>>>>>> Washington, D.C. 20015
>>>>>>>>> _______________________________________________
>>>>>>>>> users mailing list
>>>>>>>>> us...@open-mpi.org
>>>>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>>>> Link to this post: 
>>>>>>>>> http://www.open-mpi.org/community/lists/users/2016/03/28828.php