On 03.04.2011, at 16:56, Ralph Castain wrote:

> On Apr 3, 2011, at 8:14 AM, Laurence Marks wrote:
>
>> Let me expand on this slightly (in response to Ralph Castain's posting
>> -- I had digest mode set). As currently constructed, a shell script in
>> Wien2k (www.wien2k.at) launches a series of tasks using
>>
>> ($remote $remotemachine "cd $PWD;$t $ttt;rm -f .lock_$lockfile[$p]") >> .time1_$loop &
>>
>> where the standard setting for "remote" is "ssh", "remotemachine" is
>> the appropriate host, "t" is "time", and "ttt" is a concatenation of
>> commands. For instance, when using 2 cores on one node for Task1, 2
>> cores on 2 nodes for Task2, and 2 cores on 1 node for Task3:
>>
>> Task1:
>> mpirun -v -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile .machine1 /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_1.def
>> Task2:
>> mpirun -v -x LD_LIBRARY_PATH -x PATH -np 4 -machinefile .machine2 /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_2.def
>> Task3:
>> mpirun -v -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile .machine3 /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_3.def
>>
>> This is a stable script that works under SGI, Linux, MVAPICH and many
>> others using ssh or rsh (although I've never myself used it with rsh).
>> It is general purpose, i.e. it will run just 1 task across 8x8
>> nodes/cores, or 8 parallel tasks on 8 nodes with 8 cores each, or any
>> scatter of nodes/cores.
>>
>> According to some, ssh is becoming obsolete within supercomputers and
>> the "replacement" is pbsdsh, at least under Torque.
>
> Somebody is playing an April Fools joke on you. The majority of
> supercomputers use ssh as their sole launch mechanism, and I have seen
> no indication that anyone intends to change that situation. That said,
> Torque is certainly popular and a good environment.
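For concreteness, here is a sketch of what the launch line quoted above expands to for Task2, assuming the default settings ($remote = ssh, $t = time); the hostname, working directory, lock-file suffix, and loop counter are hypothetical stand-ins for $remotemachine, $PWD, $lockfile[$p], and $loop:

    # Hypothetical expansion (wrapped here for readability; it is one line):
    (ssh nodeB "cd /path/to/case; time mpirun -v -x LD_LIBRARY_PATH -x PATH -np 4 -machinefile .machine2 /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_2.def; rm -f .lock_2") >> .time1_1 &

The trailing & backgrounds each task so the several mpiruns run concurrently; presumably the parent script then polls the .lock files to detect completion.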
I operate my Linux clusters without `ssh` or `rsh`. I use SGE's `qrsh` instead. How else will you get tight integration with correct accounting and job control? This might be different when you have an AIX or NEC SX machine, as they provide additional control mechanisms.

-- Reuti

>> Getting pbsdsh to work is certainly not as simple as the documentation
>> I've seen suggests. To get it to even partially work, I am using for
>> "remote" a script "pbsh" which writes HOME, PATH, LD_LIBRARY_PATH,
>> etc., as well as the PBS environment variables listed at the bottom of
>> http://www.bear.bham.ac.uk/bluebear/pbsdsh.shtml plus PBS_NODEFILE, to
>> a file $PBS_O_WORKDIR/.tmp_$1, followed by the relevant command, and
>> then runs
>>
>> pbsdsh -h $1 /bin/bash -lc " $PBS_O_WORKDIR/.tmp_$1 "
>>
>> This works fine so long as Task2 does not span 2 nodes (probably 3 as
>> well; I've not tested this). If it does, there is a communications
>> failure and nothing is launched on the 2nd node of Task2.
>>
>> I'm including the script below, as maybe some other environment
>> variables are needed, or some should not be there, in order to properly
>> rebuild the environment so things will work. (And yes, I know there
>> should be tests to see if the variables are set first, and so forth --
>> this is not so clean; it is just an initial version.)
>
> By providing all those PBS-related envars to OMPI, you are causing OMPI
> to think it should use Torque as the launch mechanism. Unfortunately,
> that won't work in this scenario.
>
> When you start a Torque job (get an allocation etc.), Torque puts you on
> one of the allocated nodes and creates a "sister mom" on that node. This
> is your job's "master node". All Torque-based launches must take place
> from that location.
>
> So when you pbsdsh to another node and attempt to execute mpirun with
> those envars set, mpirun attempts to contact the local "sister mom" so
> it can order the launch of any daemons on other nodes... only the
> "sister mom" isn't there! So the connection fails and mpirun aborts.
>
> If mpirun is -only- launching procs on the local node, then it doesn't
> need to launch another daemon (as mpirun will host the local procs
> itself), so it doesn't attempt to contact the "sister mom" and the comm
> failure doesn't occur.
>
> What I still don't understand is why you are trying to do it this way.
> Why not just run
>
> time mpirun -v -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile .machineN /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_1.def
>
> where .machineN contains the names of the nodes where you want the MPI
> apps to execute? mpirun will only execute apps on those nodes, so this
> accomplishes the same thing as your script -- only with a lot less pain.
>
> Your script would just contain a sequence of these commands, each with
> its number of procs and machinefile as required.
>
> Actually, it would be pretty much identical to the script I use when
> doing scaling tests...
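A minimal sketch of the job script Ralph describes, reusing the three tasks from the first message. How the .machineN files are split out of $PBS_NODEFILE is site-specific and assumed to have happened already; the & and wait are added to preserve the concurrent behaviour of the original Wien2k script (drop them for a strictly sequential run):

    #!/bin/bash
    # Run everything from the job's "master node" (where the sister mom
    # lives), so mpirun can launch daemons on the other nodes via Torque.
    cd $PBS_O_WORKDIR

    # Assumed to exist already: .machine1/.machine2/.machine3, each
    # listing the hosts for one task, split out of $PBS_NODEFILE.
    time mpirun -v -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile .machine1 \
        /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_1.def &
    time mpirun -v -x LD_LIBRARY_PATH -x PATH -np 4 -machinefile .machine2 \
        /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_2.def &
    time mpirun -v -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile .machine3 \
        /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_3.def &
    wait   # collect all three concurrent tasks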
>> ----------
>> # Script to replace ssh by pbsdsh
>> # Beta version, April 2011, L. D. Marks
>> #
>> # Remove old file -- needed!
>> rm -f $PBS_O_WORKDIR/.tmp_$1
>>
>> # Create a script that exports the environment we have
>> # This may not be enough
>> # (the shebang must be quoted so the shell does not treat '#' as the
>> # start of a comment and discard the rest of the line)
>> echo '#!/bin/bash' > $PBS_O_WORKDIR/.tmp_$1
>> echo source $HOME/.bashrc >> $PBS_O_WORKDIR/.tmp_$1
>> echo cd $PBS_O_WORKDIR >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PATH=$PBS_O_PATH >> $PBS_O_WORKDIR/.tmp_$1
>> echo export TMPDIR=$TMPDIR >> $PBS_O_WORKDIR/.tmp_$1
>> echo export SCRATCH=$SCRATCH >> $PBS_O_WORKDIR/.tmp_$1
>> echo export LD_LIBRARY_PATH=$LD_LIBRARY_PATH >> $PBS_O_WORKDIR/.tmp_$1
>>
>> # Open MPI needs to have this defined, even if we don't use it
>> echo export PBS_NODEFILE=$PBS_NODEFILE >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_ENVIRONMENT=$PBS_ENVIRONMENT >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_JOBCOOKIE=$PBS_JOBCOOKIE >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_JOBID=$PBS_JOBID >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_JOBNAME=$PBS_JOBNAME >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_MOMPORT=$PBS_MOMPORT >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_NODENUM=$PBS_NODENUM >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_O_HOME=$PBS_O_HOME >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_O_HOST=$PBS_O_HOST >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_O_LANG=$PBS_O_LANG >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_O_LOGNAME=$PBS_O_LOGNAME >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_O_MAIL=$PBS_O_MAIL >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_O_PATH=$PBS_O_PATH >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_O_QUEUE=$PBS_O_QUEUE >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_O_SHELL=$PBS_O_SHELL >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_O_WORKDIR=$PBS_O_WORKDIR >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_QUEUE=$PBS_QUEUE >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_TASKNUM=$PBS_TASKNUM >> $PBS_O_WORKDIR/.tmp_$1
>> echo export PBS_VNODENUM=$PBS_VNODENUM >> $PBS_O_WORKDIR/.tmp_$1
>>
>> # Now the command we want to run
>> echo $2 >> $PBS_O_WORKDIR/.tmp_$1
>>
>> # Make it executable
>> chmod a+x $PBS_O_WORKDIR/.tmp_$1
>>
>> pbsdsh -h $1 /bin/bash -lc " $PBS_O_WORKDIR/.tmp_$1 "
>>
>> # Cleanup if needed (commented out for debugging)
>> # rm $PBS_O_WORKDIR/.tmp_$1
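For reference, the wrapper above slots into the Wien2k launch line from the first message by changing the "remote" setting, since pbsh takes the host as $1 and the command string as $2, just like ssh. A hypothetical invocation (csh syntax is assumed, as suggested by the $lockfile[$p] array indexing; hostname and file names are illustrative):

    # In the Wien2k parallel settings, instead of remote=ssh:
    set remote = pbsh
    # so that the launch line effectively runs, e.g.:
    # (pbsh nodeB "cd /path/to/case; time mpirun ... lapw1Q_2.def; rm -f .lock_2") >> .time1_1 &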
>> On Sat, Apr 2, 2011 at 9:36 PM, Laurence Marks <l-ma...@northwestern.edu> wrote:
>>> I have a problem which may or may not be Open MPI, but since this list
>>> was useful before with a race condition, I am posting.
>>>
>>> I am trying to use pbsdsh as an ssh replacement, pushed by sysadmins
>>> because Torque does not know about ssh tasks launched from a task. In
>>> a simple case, a script launches three MPI tasks in parallel:
>>>
>>> Task1: NodeA
>>> Task2: NodeB and NodeC
>>> Task3: NodeD
>>>
>>> (some cores on each, all handled correctly). Reproducibly (but with
>>> different nodes and numbers of cores), Task1 and Task3 work fine; the
>>> MPI task starts on NodeB but nothing starts on NodeC -- it appears
>>> that NodeC does not communicate. It does not have to be this exact
>>> layout; it could be
>>>
>>> Task1: NodeA NodeB
>>> Task2: NodeC NodeD
>>>
>>> Here NodeC will start, and it looks as if NodeD never starts anything.
>>> I've also run it with 4 tasks (1, 3 and 4 work), and if Task2 only
>>> uses one node (the number of cores does not matter) it is fine.
>>>
>>> --
>>> Laurence Marks
>>> Department of Materials Science and Engineering
>>> MSE Rm 2036 Cook Hall
>>> 2220 N Campus Drive
>>> Northwestern University
>>> Evanston, IL 60208, USA
>>> Tel: (847) 491-3996 Fax: (847) 491-7820
>>> email: L-marks at northwestern dot edu
>>> Web: www.numis.northwestern.edu
>>> Chair, Commission on Electron Crystallography of IUCR
>>> www.numis.northwestern.edu/
>>> Research is to see what everybody else has seen, and to think what
>>> nobody else has thought
>>> -- Albert Szent-Györgyi
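As a closing footnote to Reuti's qrsh remark at the top: under Grid Engine, the whole pbsh wrapper above shrinks to a couple of lines, because qrsh -inherit starts a command on a host of the current job's allocation under the scheduler's control, and -V carries the environment across. A sketch, assuming a parallel environment with control_slaves enabled (the wrapper name and usage convention are hypothetical):

    #!/bin/bash
    # Hypothetical qrsh-based stand-in for "remote": qrsh_wrap <host> "<command>"
    #   -inherit : run on a node of the current allocation, so accounting
    #              and job control stay tightly integrated with SGE
    #   -V       : export the current environment to the remote command
    host=$1
    cmd=$2
    exec qrsh -inherit -V "$host" /bin/bash -lc "cd $SGE_O_WORKDIR; $cmd"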