Hi,

On 23.08.2014 at 00:43, Noah Knowles wrote:

> Hi, I am using OGS/GE 2011.11p1 on ROCKS. We have a small cluster with a
> combination of 12- and 16-core blades. We are running an application where
> the specific assignment of ranks to nodes has a big effect on run time. Is it
> possible, for example, with NP=64 to specify that
>
> ranks 0-15 go to a 16-core blade,
> ranks 16-27 go to a 12-core blade,
> ranks 28-39 go to a 12-core blade,
> ranks 40-55 go to a 16-core blade, and
> ranks 56-63 go to a 12-core blade?
>
> I tried, for this example,
> qsub -binding linear:64 -l
> h="compute-0-4|compute-0-0|compute-0-1|compute-0-5|compute-0-2"

The binding would only be honored (it's a soft request) if there were a node with 64 cores. It must also be activated in "execd_params" in SGE's configuration.

> (where compute nodes 4-5 are 16 core and the others are 12-core), but that
> gave me no control over the order in which the nodes were assigned.
>
> We are experimenting with Intel MPI and OpenMPI-- I couldn't figure out how
> to do this with the Intel mpirun options, and rankfiles were causing errors,
> so I was hoping to accomplish it with qsub.

- Do you have a tight integration of Open MPI into SGE (i.e. compiled with "--with-sge")?
- All 64 are MPI processes, no OpenMP threads?
- What PE did you use?
- You always want complete machines, i.e. you could also request 68 cores?
- The rank0 (i.e. where the jobscript also runs) can be selected with:

  `qsub -masterq foobar@compute-0-4 ...`

- Additional machines with:

  "... -q foobar@compute-0-4,foobar@compute-0-0,foobar@compute-0-1,foobar@compute-0-5,foobar@compute-0-2"

  (foobar@compute-0-4 needs to be listed in both options; no order of hosts is guaranteed)

Creating a rankfile out of the granted machinefile should work (i.e. keeping the allocation). As long as you are alone on these machines, it's better to let Open MPI do the final binding to cores.
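
Since SGE guarantees no order of hosts, the granted hosts have to be put into the wanted order inside the job before the ranks are assigned. A minimal sketch of that step, assuming the usual $PE_HOSTFILE layout of one "host slots queue processor-range" line per host; the function name reorder_hostfile is made up for illustration, and the host names are the ones from the question:

```shell
# Hypothetical helper: print the lines of a PE hostfile in a caller-given order.
#   $1       hostfile (assumed format per line: "host slots queue processor-range")
#   $2...    short host names in the order the ranks should be laid out
reorder_hostfile() {
    hostfile=$1
    shift
    for want in "$@"; do
        # Compare on the short host name (the part before the first dot).
        awk -v want="$want" '{ split($1, a, "."); if (a[1] == want) print }' "$hostfile"
    done
}

# Possible use inside the jobscript (not executed here):
#   reorder_hostfile "$PE_HOSTFILE" compute-0-4 compute-0-0 compute-0-1 \
#       compute-0-5 compute-0-2 > RESORTED_HOSTFILE
```

Hosts that were requested in the list but not granted simply produce no line, so the result never contains machines outside the allocation.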

Jobscript:

    # Reorder in the way you need them
    sort "$PE_HOSTFILE" > RESORTED_HOSTFILE
    export PE_HOSTFILE=RESORTED_HOSTFILE

    # Write one "rank N=host slot=I" line per granted slot,
    # stopping once $1 ranks have been written.
    PeHostfile2RankFile()
    {
        rank=0
        while read -r host nslots rest; do
            host=${host%%.*}    # short host name
            i=0
            while [ "$i" -lt "$nslots" ]; do
                echo "rank $rank=$host slot=$i"
                rank=$((rank + 1))
                i=$((i + 1))
                if [ "$rank" -eq "$1" ]; then
                    break 2     # leave both loops, not only the inner one
                fi
            done
        done < RESORTED_HOSTFILE
    }

    PeHostfile2RankFile 64 > RANKFILE

    mpiexec -np 64 --rankfile RANKFILE ./mpihello

(I don't have such machines, so I gave all ranks the same core [slot=0] to get only the list of locations, which seems to work.)

-- Reuti

> I hope I'm asking this in the right place-- sorry if not.
> Thanks for any help!
> Noah
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users