Thanks again Reuti. A combination of -q to choose the nodes, np such that all cores on those nodes are used, and using a re-ordered machinefile in the job script did what I want.

On 08/25/2014 03:01 AM, Reuti wrote:
On 23.08.2014 at 16:46, Noah Knowles wrote:

Hi Reuti,

On 08/23/2014 01:38 AM, Reuti wrote:
On 23.08.2014 at 02:37, Reuti wrote:

Hi,

On 23.08.2014 at 00:43, Noah Knowles wrote:

Hi, I am using OGS/GE 2011.11p1 on ROCKS. We have a small cluster with a 
combination of 12- and 16-core blades. We are running an application where the 
specific assignment of ranks to nodes has a big effect on run time. Is it 
possible, for example, with NP=64 to specify that

ranks  0-15 go to a 16-core blade,
ranks 16-27 go to a 12-core blade,
ranks 28-39 go to a 12-core blade,
ranks 40-55 go to a 16-core blade, and
ranks 56-63 go to a 12-core blade?
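
To make the arithmetic behind this layout explicit, here is a minimal sketch that prints which ranks land on which host when hosts are filled one after another in a given order. The function name rank_ranges and the "host slots" input format are illustrative assumptions, not part of SGE or Open MPI; the host order is the one from the qsub attempt in the question.

```shell
#!/bin/sh
# Hypothetical helper: read "host slots" pairs on stdin and print the
# rank range each host receives when hosts are filled in that order.
# $1 is the total number of ranks to place.
rank_ranges() {
    total=$1
    first=0
    while read -r host slots; do
        # All ranks are placed: ignore any remaining hosts.
        [ "$first" -ge "$total" ] && break
        last=$((first + slots - 1))
        # The final host may be only partly filled.
        [ "$last" -ge "$total" ] && last=$((total - 1))
        echo "ranks $first-$last -> $host"
        first=$((last + 1))
    done
}

# The layout from the question: 64 ranks on 16+12+12+16+12 = 68 cores.
rank_ranges 64 <<'EOF'
compute-0-4 16
compute-0-0 12
compute-0-1 12
compute-0-5 16
compute-0-2 12
EOF
```

With this input the last 12-core blade receives only ranks 56-63, so four of its cores stay idle.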

I tried, for this example,
qsub -binding linear:64 -l h="compute-0-4|compute-0-0|compute-0-1|compute-0-5|compute-0-2"
The binding would only be honored (it's a soft request) if there were a node 
with 64 cores. It must also be activated in "execd_params" in SGE's 
configuration.
OK I see. I misunderstood the way that binding works.

(where compute nodes 4-5 are 16 core and the others are 12-core), but that gave 
me no control over the order in which the nodes were assigned.

We are experimenting with Intel MPI and OpenMPI-- I couldn't figure out how to 
do this with the Intel mpirun options, and rankfiles were causing errors, so I 
was hoping to accomplish it with qsub.
- Do you have a tight integration of Open MPI into SGE (i.e. compiled with 
"--with-sge")?
yes
- All 64 are MPI processes, no OpenMP threads?
correct
- What PE did you use?
orte
- You always want complete machines, i.e. you could also request 68 cores?
yes that would be smarter!
- The rank0 (i.e. where also the jobscript runs) can be selected with:

`qsub -masterq foobar@compute-0-4 ...`

- Additional machines with:

"... -q foobar@compute-0-4,foobar@compute-0-0,foobar@compute-0-1,foobar@compute-0-5,foobar@compute-0-2"

(foobar@compute-0-4 needs to be listed in both options; the order of the 
hosts is not guaranteed)
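
Put together, the two options could be assembled roughly as follows. This is only a sketch: the helper queue_instances and the script name job.sh are made up for illustration, "foobar" is the placeholder queue name used above, and the PE name "orte" and the 68 slots are taken from the answers earlier in this thread. The command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: build the comma-separated queue-instance list
# for -q from a queue name ($1) and a list of hosts (remaining args).
queue_instances() {
    queue=$1
    shift
    list=
    for host in "$@"; do
        # Prepend a comma only when the list is not empty yet.
        list="${list:+$list,}$queue@$host"
    done
    echo "$list"
}

master=compute-0-4
# The master host must appear in -masterq and in the -q list.
qlist=$(queue_instances foobar "$master" compute-0-0 compute-0-1 \
        compute-0-5 compute-0-2)

# Print the submission line instead of running it.
echo qsub -pe orte 68 -masterq "foobar@$master" -q "$qlist" job.sh
```

This pins rank 0 to compute-0-4 and restricts the job to the five blades; the order of ranks on the other hosts still has to be fixed inside the job script.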

Creating a rankfile out of the granted machinefile should work (i.e. keeping 
the allocation). As long as you are alone on these machines, it's better to 
let Open MPI do the final binding to cores.

Jobscript:

# Reorder in the way you need them
# (plain "sort" is only a placeholder for your own ordering)
sort "$PE_HOSTFILE" > RESORTED_HOSTFILE
export PE_HOSTFILE=RESORTED_HOSTFILE

# Print one rankfile line per rank, for at most $1 ranks in total
PeHostfile2RankFile()
{
   rank=0
   while read line; do
      # $PE_HOSTFILE columns: hostname, slots, queue, processor range
      host=`echo $line|cut -f1 -d" "|cut -f1 -d"."`
      nslots=`echo $line|cut -f2 -d" "`
      i=0
      while [ $i -lt $nslots ]; do
         # Leave both loops once all requested ranks are placed
         if [ $rank -ge "$1" ]; then
            break 2
         fi
         echo "rank $rank=$host slot=$i"
         rank=`expr $rank + 1`
         i=`expr $i + 1`
      done
   done < RESORTED_HOSTFILE
}

PeHostfile2RankFile 64 > RANKFILE

mpiexec -np 64 --rankfile RANKFILE ./mpihello

(I don't have such machines, so for my test I gave every rank the same core 
[slot=0] just to get the list of locations, which seems to work.)
One additional thought: OpenMPI fills the machines according to the given 
machinefile. Maybe you don't need to provide a rankfile at all when the 
machinefile has already been rearranged.
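
A minimal sketch of that rearrangement, assuming the usual $PE_HOSTFILE layout (hostname, slots, queue, processor range per line) and short host names in the desired order. The function name reorder_hostfile and the output file name MACHINEFILE are illustrative only, and whether Open MPI honors the order under tight integration should be checked on the cluster itself.

```shell
#!/bin/sh
# Hypothetical helper: reorder_hostfile HOSTFILE host1 host2 ...
# Print "host slots=N" lines (Open MPI hostfile syntax) in the order
# in which the hosts are named on the command line.
reorder_hostfile() {
    hostfile=$1
    shift
    for want in "$@"; do
        # Compare on the short host name (everything before the first dot).
        awk -v want="$want" '{
            split($1, part, ".")
            if (part[1] == want) print part[1], "slots=" $2
        }' "$hostfile"
    done
}

# Intended use inside the job script (commented out here, since it
# needs a running SGE job and an MPI binary):
#   reorder_hostfile "$PE_HOSTFILE" compute-0-4 compute-0-0 compute-0-1 \
#       compute-0-5 compute-0-2 > MACHINEFILE
#   mpiexec -np 64 -machinefile MACHINEFILE ./mpihello
```

Hosts that were not granted to the job simply produce no line, so the result never asks for more than SGE allocated.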
OK thanks, I'll try that Monday or when the kids are sleeping. Even if I don't 
need it, it's helpful to see the script too.
Thanks so much for your very helpful (and quick) replies Reuti!
One additional note I forgot to mention: using hostgroups or a pattern, you 
could also shorten the list of machines:

'... -q foobar@compute-0-[40152]', '... -q "*@*0-[40152]"' or even 
'... -q "*@*[40152]"'

depending on the names of your queues/machines (see man `sge_types` section 
"pattern").
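
To get a feeling for which host names such a bracket pattern selects, the shell's own wildcard matching can serve as a stand-in. This is an approximation under the assumption that SGE's patterns behave like shell globs for "*" and "[...]"; the helper matches_pattern is made up for this illustration, and sge_types remains the authority.

```shell
#!/bin/sh
# Hypothetical helper: succeed if name ($2) matches the glob in $1.
# Uses the shell's "case" matching, not SGE's own matcher.
matches_pattern() {
    case $2 in
        $1) return 0 ;;
        *)  return 1 ;;
    esac
}

for h in compute-0-0 compute-0-1 compute-0-2 compute-0-3 \
         compute-0-4 compute-0-5 compute-0-10; do
    for p in 'compute-0-[40152]' '*[40152]'; do
        if matches_pattern "$p" "$h"; then
            echo "$p selects $h"
        fi
    done
done
```

In this approximation compute-0-3 is selected by neither pattern, while the shortest form '*[40152]' also selects compute-0-10, because the bracket only constrains the last character. On a larger cluster the longer forms are the safer choice.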

-- Reuti


Noah
-- Reuti


-- Reuti


I hope I'm asking this in the right place-- sorry if not.
Thanks for any help!
Noah
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
