On 04/04/2011 12:56 AM, Ralph Castain wrote:
> What I still don't understand is why you are trying to do it this way. Why not
> just run
>
>   time mpirun -v -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile .machineN
>   /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_1.def
>
> where machineN contains the names of the nodes where you want the MPI apps to
> execute? mpirun will only execute apps on those nodes, so this accomplishes the
> same thing as your script - only with a lot less pain.
>
> Your script would just contain a sequence of these commands, each with its
> number of procs and machinefile as required.
Maybe I missed why this suggestion of forgetting about ssh/pbsdsh altogether
was not feasible? Just use mpirun (with its great TM support!) to distribute
the MPI jobs.
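Purely as a sketch of what Ralph describes (the machinefile names, the second
.def file and the process counts below are placeholders, not taken from your
setup), such a driver script might be nothing more than:

#!/bin/sh
# Step 1: run the first case on the nodes listed in .machine1
time mpirun -v -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile .machine1 \
    /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_1.def

# Step 2: run the next case on the nodes listed in .machine2
time mpirun -v -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile .machine2 \
    /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_2.def

Each machinefile is just a list of hostnames taken from $PBS_NODEFILE, one per
line.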
A simple example:
vayu1:~/MPI > qsub -lncpus=24,vmem=24gb,walltime=10:00 -wd -I
qsub: waiting for job 574900.vu-pbs to start
qsub: job 574900.vu-pbs ready
[dbs900@v250 ~/MPI]$ wc -l $PBS_NODEFILE
24
[dbs900@v250 ~/MPI]$ head -12 $PBS_NODEFILE > m1
[dbs900@v250 ~/MPI]$ tail -12 $PBS_NODEFILE > m2
[dbs900@v250 ~/MPI]$ mpirun --machinefile m1 ./a2a143 120000 30 & mpirun
--machinefile m2 ./pp143
Check how the processes are distributed ...
vayu1:~ > qps 574900.vu-pbs
Node 0: v250:
PID S RSS VSZ %MEM TIME %CPU COMMAND
11420 S 2104 10396 0.0 00:00:00 0.0 -tcsh
11421 S 620 10552 0.0 00:00:00 0.0 pbs_demux
12471 S 2208 49324 0.0 00:00:00 0.9 /apps/openmpi/1.4.3/bin/mpirun
--machinefile m1 ./a2a143 120000 30
12472 S 2116 49312 0.0 00:00:00 0.0 /apps/openmpi/1.4.3/bin/mpirun
--machinefile m2 ./pp143
12535 R 270160 565668 1.0 00:00:02 82.4 ./a2a143 120000 30
12536 R 270032 565536 1.0 00:00:02 81.4 ./a2a143 120000 30
12537 R 270012 565528 1.0 00:00:02 87.3 ./a2a143 120000 30
12538 R 269992 565532 1.0 00:00:02 93.3 ./a2a143 120000 30
12539 R 269980 565516 1.0 00:00:02 81.4 ./a2a143 120000 30
12540 R 270008 565516 1.0 00:00:02 86.3 ./a2a143 120000 30
12541 R 270008 565516 1.0 00:00:02 96.3 ./a2a143 120000 30
12542 R 272064 567568 1.0 00:00:02 91.3 ./a2a143 120000 30
Node 1: v251:
PID S RSS VSZ %MEM TIME %CPU COMMAND
10367 S 1872 40648 0.0 00:00:00 0.0 orted -mca ess env -mca orte_ess_jobid 1444413440 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri
"1444413440.0;tcp://10.1.3.58:37339"
10368 S 1868 40648 0.0 00:00:00 0.0 orted -mca ess env -mca orte_ess_jobid 1444347904 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 3 --hnp-uri
"1444347904.0;tcp://10.1.3.58:39610"
10372 R 271112 567556 1.0 00:00:04 74.5 ./a2a143 120000 30
10373 R 271036 567564 1.0 00:00:04 71.5 ./a2a143 120000 30
10374 R 271032 567560 1.0 00:00:04 66.5 ./a2a143 120000 30
10375 R 273112 569612 1.1 00:00:04 68.5 ./a2a143 120000 30
10378 R 552280 840712 2.2 00:00:04 100 ./pp143
10379 R 552280 840708 2.2 00:00:04 100 ./pp143
10380 R 552328 841576 2.2 00:00:04 100 ./pp143
10381 R 552788 841216 2.2 00:00:04 99.3 ./pp143
Node 2: v252:
PID S RSS VSZ %MEM TIME %CPU COMMAND
10152 S 1908 40780 0.0 00:00:00 0.0 orted -mca ess env -mca orte_ess_jobid 1444347904 -mca orte_ess_vpid 2 -mca orte_ess_num_procs 3 --hnp-uri
"1444347904.0;tcp://10.1.3.58:39610"
10156 R 552384 840200 2.2 00:00:07 99.3 ./pp143
10157 R 551868 839692 2.2 00:00:06 99.3 ./pp143
10158 R 551400 839184 2.2 00:00:07 100 ./pp143
10159 R 551436 839184 2.2 00:00:06 98.3 ./pp143
10160 R 551760 839692 2.2 00:00:07 100 ./pp143
10161 R 551788 839824 2.2 00:00:07 97.3 ./pp143
10162 R 552256 840332 2.2 00:00:07 100 ./pp143
10163 R 552216 840340 2.2 00:00:07 99.3 ./pp143
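The same thing works in a normal batch job. A rough, untested sketch (the
a2a143/pp143 binaries and the 12/12 node split are just the ones from the
interactive session above; the trailing wait keeps the job alive until both
backgrounded mpiruns finish):

#!/bin/sh
#PBS -l ncpus=24,vmem=24gb,walltime=10:00
cd $PBS_O_WORKDIR

# Split the allocated cpus between the two concurrent MPI jobs
head -12 $PBS_NODEFILE > m1
tail -12 $PBS_NODEFILE > m2

mpirun --machinefile m1 ./a2a143 120000 30 &
mpirun --machinefile m2 ./pp143 &
wait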
You would have to do something smarter to get correct process binding, etc.
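For instance, simply adding -bind-to-core to both mpiruns would not help on a
shared node, since the two jobs don't coordinate and would both bind starting
from the lowest cores. One option (Open MPI 1.4.x style options; hostnames and
core numbers below are purely illustrative, check mpirun(1) for your version)
is to give each mpirun its own rankfile so the jobs land on disjoint cores:

# toy 2+2 example on one shared node: pin the first job to cores 0-1,
# the second to cores 2-3
cat > rf1 <<EOF
rank 0=v250 slot=0
rank 1=v250 slot=1
EOF
cat > rf2 <<EOF
rank 0=v250 slot=2
rank 1=v250 slot=3
EOF
mpirun -np 2 --rankfile rf1 ./a2a143 120000 30 &
mpirun -np 2 --rankfile rf2 ./pp143 &
wait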