Hi,

On 23.08.2014 at 16:46, Noah Knowles wrote:

> Hi Reuti,
> 
> On 08/23/2014 01:38 AM, Reuti wrote:
>> On 23.08.2014 at 02:37, Reuti wrote:
>> 
>>> Hi,
>>> 
>>> On 23.08.2014 at 00:43, Noah Knowles wrote:
>>> 
>>>> Hi, I am using OGS/GE 2011.11p1 on ROCKS. We have a small cluster with a 
>>>> combination of 12- and 16-core blades. We are running an application where 
>>>> the specific assignment of ranks to nodes has a big effect on run time. Is 
>>>> it possible, for example, with NP=64 to specify that
>>>> 
>>>> ranks  0-15 go to a 16-core blade,
>>>> ranks 16-27 go to a 12-core blade,
>>>> ranks 28-39 go to a 12-core blade,
>>>> ranks 40-55 go to a 16-core blade, and
>>>> ranks 56-63 go to a 12-core blade?
>>>> 
>>>> I tried, for this example,
>>>> qsub -binding linear:64  -l 
>>>> h="compute-0-4|compute-0-0|compute-0-1|compute-0-5|compute-0-2"
>>> The binding would only be honored (as it's a soft request) if there were a 
>>> node with 64 cores. It must also be activated in "execd_params" in SGE's 
>>> configuration.
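>>> A minimal sketch of switching it on (assuming a release where core binding 
>>> is controlled by the ENABLE_BINDING execd parameter; check sge_conf(5) for 
>>> your version):
>>> 
>>> # open the global cluster configuration in an editor
>>> qconf -mconf
>>> # and add/extend the line:
>>> # execd_params    ENABLE_BINDING=true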
> OK I see. I misunderstood the way that binding works.
>>> 
>>> 
>>>> (where compute nodes 4-5 are 16 core and the others are 12-core), but that 
>>>> gave me no control over the order in which the nodes were assigned.
>>>> 
>>>> We are experimenting with Intel MPI and Open MPI. I couldn't figure out 
>>>> how to do this with the Intel mpirun options, and rankfiles were causing 
>>>> errors, so I was hoping to accomplish it with qsub.
>>> - Do you have a tight integration of Open MPI into SGE (i.e. compiled with 
>>> "--with-sge")?
> yes
>>> - All 64 are MPI processes, no OpenMP threads?
> correct
>>> - What PE did you use?
> orte

OK, my question was unclear. The important thing is the "allocation_rule" of 
the PE. But either "$fill_up" or "$round_robin" will do, as long as you request 
all 68 slots of the machines you list.
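
You can check which rule the "orte" PE uses with `qconf` (the output line 
below is just an example):

qconf -sp orte | grep allocation_rule

which should print something like:

allocation_rule    $fill_up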

Requesting only 64 slots might lead to the effect that foobar@compute-0-4 gets 
the master slot for sure, but the node won't be filled completely with either 
allocation rule; instead the last node, compute-0-2, is filled completely while 
the first node still has 4 slots free (even attaching an exclusive complex 
which you request in `qsub` won't prevent this).
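
Put together, requesting all 68 slots could look like this (a sketch; 
"job.sh" stands in for your jobscript):

qsub -pe orte 68 \
     -masterq foobar@compute-0-4 \
     -q foobar@compute-0-4,foobar@compute-0-0,foobar@compute-0-1,foobar@compute-0-5,foobar@compute-0-2 \
     job.sh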

-- Reuti


>>> - You always want complete machines, i.e. you could also request 68 cores?
> yes that would be smarter!
>>> - Rank 0 (i.e. the node where the jobscript also runs) can be selected with:
>>> 
>>> `qsub -masterq foobar@compute-0-4 ...`
>>> 
>>> - Additional machines with:
>>> 
>>> "... -q 
>>> foobar@compute-0-4,foobar@compute-0-0,foobar@compute-0-1,foobar@compute-0-5,foobar@compute-0-2"
>>> 
>>> (foobar@compute-0-4 needs to be listed in both options; no order of hosts 
>>> is guaranteed)
>>> 
>>> Creating a rankfile out of the granted machinefile should work (i.e. it 
>>> keeps the allocation). As long as you are alone on these machines, it's 
>>> better to let Open MPI do the binding to cores in the end.
>>> 
>>> Jobscript:
>>> 
>>> # Reorder in the way you need them
>>> sort $PE_HOSTFILE > RESORTED_HOSTFILE
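>>> # NOTE: plain `sort` orders the hosts alphabetically (compute-0-0 first).
>>> # To impose an explicit order instead, a sketch (example host list):
>>> #   for h in compute-0-4 compute-0-0 compute-0-1 compute-0-5 compute-0-2; do
>>> #      grep "^$h" $PE_HOSTFILE
>>> #   done > RESORTED_HOSTFILE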
>>> export PE_HOSTFILE=RESORTED_HOSTFILE
>>> 
>>> # Convert the (resorted) PE hostfile into an Open MPI rankfile.
>>> # Each hostfile line reads: <hostname> <nslots> <queue> <processor-range>
>>> PeHostfile2RankFile()
>>> {
>>>   rank=0
>>>   cat RESORTED_HOSTFILE | while read line; do
>>>      # short hostname and slot count of this node
>>>      host=`echo $line|cut -f1 -d" "|cut -f1 -d"."`
>>>      nslots=`echo $line|cut -f2 -d" "`
>>>      i=0
>>>      while [ $i -lt $nslots ]; do
>>>         echo "rank $rank=$host slot=$i"
>>>         rank=`expr $rank + 1`
>>>         i=`expr $i + 1`
>>>         if [ $rank -eq "$1" ]; then
>>>            # leave both loops, otherwise the remaining hosts would
>>>            # emit ranks beyond the requested count
>>>            break 2
>>>         fi
>>>      done
>>>   done
>>> }
>>> 
>>> PeHostfile2RankFile 64 > RANKFILE
>>> 
>>> mpiexec -np 64 --rankfile RANKFILE ./mpihello
>>> 
>>> (I don't have such machines here, so I gave all ranks the same core to get 
>>> only the list of locations [slot=0], which seems to work)
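>>> 
>>> For the requested layout, the generated RANKFILE would then start like this 
>>> (assuming the resorted hostfile lists compute-0-4 first):
>>> 
>>> rank 0=compute-0-4 slot=0
>>> rank 1=compute-0-4 slot=1
>>> ...
>>> rank 15=compute-0-4 slot=15
>>> rank 16=compute-0-0 slot=0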
>> One additional thought: Open MPI fills the machines according to the given 
>> machinefile. Maybe you don't need to provide a rankfile at all when the 
>> machinefile has already been rearranged.
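>> 
>> A sketch of that variant (note the binding option differs by Open MPI 
>> version: 1.6.x uses --bind-to-core, 1.7 and later --bind-to core):
>> 
>> export PE_HOSTFILE=RESORTED_HOSTFILE
>> mpiexec -np 64 --bind-to-core ./mpihello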
> OK thanks, I'll try that Monday or when the kids are sleeping. Even if I 
> don't need it, it's helpful to see the script too.
> Thanks so much for your very helpful (and quick) replies Reuti!
> Noah
>> 
>> -- Reuti
>> 
>> 
>>> -- Reuti
>>> 
>>> 
>>>> I hope I'm asking this in the right place-- sorry if not.
>>>> Thanks for any help!
>>>> Noah
>>> 
>> 
> 


_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
