Hmmm... yes, I guess we did get off-track then. This solution is exactly what 
I proposed in my first response to your thread, and it was repeated by others 
later on. :-/

So long as mpirun is executed on the node where the "sister mom" is located, 
and as long as your script "B" does -not- include an "mpirun" command, this 
will work fine.
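
A minimal sketch of that pattern, assuming Open MPI (the node name "nodeA" 
and the path to script B are placeholders, not taken from this thread):

  # run script B on nodeA via mpirun instead of ssh; B must not call mpirun
  mpirun -np 1 --host nodeA bash -c "/path/to/B"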

On Apr 4, 2011, at 8:38 AM, Laurence Marks wrote:

> Thanks, I think we may have had a miscommunication here; I assume that
> on the computer where they have disabled rsh and ssh there is
> "something" else available to communicate with, so we don't need to use
> pbsdsh. If there isn't, there is not much a lowly user like me can do.
> 
> I think we can close this, since like many things the answer is
> "simple" once you find it, and I think I have. Forget pbsdsh, which
> seems to be a bit flaky and probably is not being maintained much.
> Instead, use mpirun to replace ssh. In other words, replace
> 
> ssh A B
> 
> which executes command B on node A, with
> 
> mpirun -np 1 --host A bash -c " B "
> 
> (with variables appropriately substituted, or with csh instead of
> bash). Then -x (in OMPI) can be used to export whatever is needed in
> the environment, etc., which pbsdsh lacks, and other MPI
> implementations should have similar ways of exporting the environment.
> With whatever minor changes are needed for other flavors of MPI, I
> believe this should be 99% robust and portable. It passes the simple
> test with B set to "sleep 600": terminating the process that launched
> the mpirun kills the sleep on the remote node (unlike ssh on some, but
> not all, computers).
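> 
> For example (a sketch only; the host name "n001" is a placeholder, and
> the variables exported with -x are just illustrations):
> 
>   # run "sleep 600" on host n001, exporting two variables to its environment
>   mpirun -np 1 --host n001 -x LD_LIBRARY_PATH -x PATH bash -c "sleep 600"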
> 
> On Mon, Apr 4, 2011 at 6:35 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> I apologize - I realized late last night that I had a typo in my recommended 
>> command. It should read:
>> 
>> mpirun -mca plm rsh -mca plm_rsh_agent pbsdsh -mca ras ^tm --machinefile 
>> m1....
>>                                      ^^^^^^^^^^^^^^^^^^^
>> 
>> Also, if you know that #procs <= #cores on your nodes, you can greatly 
>> improve performance by adding "--bind-to-core".
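>> 
>> Purely as an illustration (the executable name at the end is a
>> placeholder), the full corrected line with binding added would look
>> something like:
>> 
>>   mpirun -mca plm rsh -mca plm_rsh_agent pbsdsh -mca ras ^tm \
>>          --machinefile m1 --bind-to-core ./my_mpi_app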
>> 
>> 
>> 
>> On Apr 3, 2011, at 5:28 PM, Laurence Marks wrote:
>> 
>>> And, before someone wonders: while Wien2k is a commercial code, it is
>>> about 500 euros for a lifetime licence, so this is not the same as VASP
>>> or Gaussian, which cost $$$$$. I have no financial interest in the code,
>>> but like many others I help make it better (semi-GNU).
>>> 
>>> On Sun, Apr 3, 2011 at 6:25 PM, Laurence Marks <l-ma...@northwestern.edu> 
>>> wrote:
>>>> Thanks. I will test this tomorrow.
>>>> 
>>>> Many people run Wien2k with openmpi, as you say; I only became aware of
>>>> the issue of Wien2k (and perhaps other codes) leaving orphaned
>>>> processes running a few days ago. I also know someone who wants to run
>>>> Wien2k on a system where both rsh and ssh are banned. Personally, since
>>>> I don't want to be banned from the supercomputers I use, I want to find
>>>> an adequate patch for myself --- and then try to persuade the
>>>> developers to adopt it.
>>>> 
>>>> On Sun, Apr 3, 2011 at 6:13 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>> 
>>>>> On Apr 3, 2011, at 4:37 PM, Laurence Marks wrote:
>>>>> 
>>>>>> On Sun, Apr 3, 2011 at 5:08 PM, Reuti <re...@staff.uni-marburg.de> wrote:
>>>>>>> Am 03.04.2011 um 23:59 schrieb David Singleton:
>>>>>>> 
>>>>>>>> On 04/04/2011 12:56 AM, Ralph Castain wrote:
>>>>>>>>> 
>>>>>>>>> What I still don't understand is why you are trying to do it this 
>>>>>>>>> way. Why not just run
>>>>>>>>> 
>>>>>>>>> time mpirun -v -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile 
>>>>>>>>> .machineN /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_1.def
>>>>>>>>> 
>>>>>>>>> where machineN contains the names of the nodes where you want the MPI 
>>>>>>>>> apps to execute? mpirun will only execute apps on those nodes, so 
>>>>>>>>> this accomplishes the same thing as your script - only with a lot 
>>>>>>>>> less pain.
>>>>>>>>> 
>>>>>>>>> Your script would just contain a sequence of these commands, each 
>>>>>>>>> with its number of procs and machinefile as required.
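>>>>>>>>> 
>>>>>>>>> For instance, a sketch of such a script (the second machinefile and
>>>>>>>>> def file are only illustrative extensions of the command above):
>>>>>>>>> 
>>>>>>>>>   mpirun -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile .machine1 \
>>>>>>>>>       /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_1.def
>>>>>>>>>   mpirun -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile .machine2 \
>>>>>>>>>       /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_2.def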
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> Maybe I missed why this suggestion of forgetting about the ssh/pbsdsh 
>>>>>>>> altogether
>>>>>>>> was not feasible?  Just use mpirun (with its great tm support!) to 
>>>>>>>> distribute
>>>>>>>> MPI jobs.
>>>>>>> 
>>>>>>> Wien2k has a two-stage startup, e.g. for 16 slots:
>>>>>>> 
>>>>>>> a) start `ssh` 4 times in the background, to go to some of the granted 
>>>>>>> nodes
>>>>>>> b) on each of those nodes, use `mpirun` to start processes on the 
>>>>>>> remaining nodes, 3 for each call
>>>>>>> 
>>>>>>> Problems:
>>>>>>> 
>>>>>>> 1) controlling `ssh` under Torque
>>>>>>> 2) providing a partial host list to `mpirun`, maybe by disabling the 
>>>>>>> default tight integration
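>>>>>>> 
>>>>>>> Schematically it is something like this (only a sketch; host names and
>>>>>>> the helper script are placeholders, not Wien2k's actual scripts):
>>>>>>> 
>>>>>>>   # stage a): from the first node, ssh in the background to 4 of the
>>>>>>>   # granted nodes
>>>>>>>   for host in n01 n05 n09 n13; do
>>>>>>>       ssh $host "cd $PWD && ./run_group.sh" &
>>>>>>>   done
>>>>>>>   wait
>>>>>>>   # stage b): run_group.sh on each of those nodes then calls mpirun
>>>>>>>   # with a partial host list (e.g. .machine1) for its share of the slots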
>>>>>>> 
>>>>>>> -- Reuti
>>>>>>> 
>>>>>> 
>>>>>> 1) The MPI tasks can be started on only one node (Reuti: "setenv
>>>>>> MPI_REMOTE 0" in parallel_options, which was introduced for other
>>>>>> reasons in 9.3 and later releases). That seems to be safe, and maybe
>>>>>> the only viable method with OMPI, as pbsdsh appears to be unable to
>>>>>> launch MPI tasks correctly (or needs some environment variables that I
>>>>>> don't know about).
>>>>>> 2) This is already done (Reuti: these are .machine0, .machine1, etc.).
>>>>>> If you need information about setting up the Wien2k files under qsub
>>>>>> in general, contact me offline or look for Machines2W on the mailing
>>>>>> list; it may be part of the next release, but I'm not sure and I don't
>>>>>> make those decisions.
>>>>>> 
>>>>>> However, there is another layer that Reuti did not mention for this
>>>>>> code: some processes also need to be launched remotely to ensure that
>>>>>> the correct scratch directories are used (i.e. local storage, which is
>>>>>> faster than NFS or similar). Maybe pbsdsh can be used for this; I am
>>>>>> still testing and am not sure. It may be enough to create a script
>>>>>> that exports all the important environment variables (as they may not
>>>>>> all be set in .bashrc or .cshrc), although there might be issues
>>>>>> making this fully portable. Since there are > 1000 licenses of Wien2k,
>>>>>> it has to cope with different OSs and not just OMPI.
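>>>>>> 
>>>>>> A rough sketch of such a wrapper (the variable names and paths here
>>>>>> are only placeholders):
>>>>>> 
>>>>>>   #!/bin/bash
>>>>>>   # export what the remote process needs, since .bashrc/.cshrc may not
>>>>>>   export SCRATCH=/scratch/$USER
>>>>>>   export WIENROOT=$HOME/WIEN2k
>>>>>>   cd $SCRATCH
>>>>>>   exec "$@"    # run the real command in the prepared environment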
>>>>>> 
>>>>> 
>>>>> Here is what I would do, based on my knowledge of OMPI's internals (and I 
>>>>> wrote the launchers :-)):
>>>>> 
>>>>> 1. do not use your script - you don't want all those PBS envars to 
>>>>> confuse OMPI
>>>>> 
>>>>> 2. mpirun -mca plm rsh -launch-agent pbsdsh -mca ras ^tm --machinefile 
>>>>> m1....
>>>>> 
>>>>> This command line tells mpirun to use the "rsh/ssh" launcher, but to
>>>>> substitute "pbsdsh" for "ssh". It also tells it to ignore the
>>>>> PBS_NODEFILE and just use the machinefile for the nodes to be used for
>>>>> that job.
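>>>>> 
>>>>> Purely as an illustration (these host names are made up), the
>>>>> machinefile m1 just lists the nodes to use, one line per process slot:
>>>>> 
>>>>>   n001
>>>>>   n001
>>>>>   n002
>>>>>   n002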
>>>>> 
>>>>> I can't swear this will work as I have never verified that pbsdsh and ssh 
>>>>> have the same syntax, but I -think- that was true. If so, then this might 
>>>>> do what you are attempting.
>>>>> 
>>>>> 
>>>>> I know people have run Wien2k with OMPI before - but I have never heard 
>>>>> of the problems you are reporting.
>>>>> 
>>>>> 
>>>>>>> 
>>>>>>>> A simple example:
>>>>>>>> 
>>>>>>>> vayu1:~/MPI > qsub -lncpus=24,vmem=24gb,walltime=10:00 -wd -I
>>>>>>>> qsub: waiting for job 574900.vu-pbs to start
>>>>>>>> qsub: job 574900.vu-pbs ready
>>>>>>>> 
>>>>>>>> [dbs900@v250 ~/MPI]$ wc -l $PBS_NODEFILE
>>>>>>>> 24
>>>>>>>> [dbs900@v250 ~/MPI]$ head -12 $PBS_NODEFILE > m1
>>>>>>>> [dbs900@v250 ~/MPI]$ tail -12 $PBS_NODEFILE > m2
>>>>>>>> [dbs900@v250 ~/MPI]$ mpirun --machinefile m1 ./a2a143 120000 30 & 
>>>>>>>> mpirun --machinefile m2 ./pp143
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Check how the processes are distributed ...
>>>>>>>> 
>>>>>>>> vayu1:~ > qps 574900.vu-pbs
>>>>>>>> Node 0: v250:
>>>>>>>>  PID S   RSS    VSZ %MEM     TIME %CPU COMMAND
>>>>>>>> 11420 S  2104  10396  0.0 00:00:00  0.0 -tcsh
>>>>>>>> 11421 S   620  10552  0.0 00:00:00  0.0 pbs_demux
>>>>>>>> 12471 S  2208  49324  0.0 00:00:00  0.9 /apps/openmpi/1.4.3/bin/mpirun 
>>>>>>>> --machinefile m1 ./a2a143 120000 30
>>>>>>>> 12472 S  2116  49312  0.0 00:00:00  0.0 /apps/openmpi/1.4.3/bin/mpirun 
>>>>>>>> --machinefile m2 ./pp143
>>>>>>>> 12535 R 270160 565668  1.0 00:00:02 82.4 ./a2a143 120000 30
>>>>>>>> 12536 R 270032 565536  1.0 00:00:02 81.4 ./a2a143 120000 30
>>>>>>>> 12537 R 270012 565528  1.0 00:00:02 87.3 ./a2a143 120000 30
>>>>>>>> 12538 R 269992 565532  1.0 00:00:02 93.3 ./a2a143 120000 30
>>>>>>>> 12539 R 269980 565516  1.0 00:00:02 81.4 ./a2a143 120000 30
>>>>>>>> 12540 R 270008 565516  1.0 00:00:02 86.3 ./a2a143 120000 30
>>>>>>>> 12541 R 270008 565516  1.0 00:00:02 96.3 ./a2a143 120000 30
>>>>>>>> 12542 R 272064 567568  1.0 00:00:02 91.3 ./a2a143 120000 30
>>>>>>>> Node 1: v251:
>>>>>>>>  PID S   RSS    VSZ %MEM     TIME %CPU COMMAND
>>>>>>>> 10367 S  1872  40648  0.0 00:00:00  0.0 orted -mca ess env -mca 
>>>>>>>> orte_ess_jobid 1444413440 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 
>>>>>>>> 2 --hnp-uri "1444413440.0;tcp://10.1.3.58:37339"
>>>>>>>> 10368 S  1868  40648  0.0 00:00:00  0.0 orted -mca ess env -mca 
>>>>>>>> orte_ess_jobid 1444347904 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 
>>>>>>>> 3 --hnp-uri "1444347904.0;tcp://10.1.3.58:39610"
>>>>>>>> 10372 R 271112 567556  1.0 00:00:04 74.5 ./a2a143 120000 30
>>>>>>>> 10373 R 271036 567564  1.0 00:00:04 71.5 ./a2a143 120000 30
>>>>>>>> 10374 R 271032 567560  1.0 00:00:04 66.5 ./a2a143 120000 30
>>>>>>>> 10375 R 273112 569612  1.1 00:00:04 68.5 ./a2a143 120000 30
>>>>>>>> 10378 R 552280 840712  2.2 00:00:04 100 ./pp143
>>>>>>>> 10379 R 552280 840708  2.2 00:00:04 100 ./pp143
>>>>>>>> 10380 R 552328 841576  2.2 00:00:04 100 ./pp143
>>>>>>>> 10381 R 552788 841216  2.2 00:00:04 99.3 ./pp143
>>>>>>>> Node 2: v252:
>>>>>>>>  PID S   RSS    VSZ %MEM     TIME %CPU COMMAND
>>>>>>>> 10152 S  1908  40780  0.0 00:00:00  0.0 orted -mca ess env -mca 
>>>>>>>> orte_ess_jobid 1444347904 -mca orte_ess_vpid 2 -mca orte_ess_num_procs 
>>>>>>>> 3 --hnp-uri "1444347904.0;tcp://10.1.3.58:39610"
>>>>>>>> 10156 R 552384 840200  2.2 00:00:07 99.3 ./pp143
>>>>>>>> 10157 R 551868 839692  2.2 00:00:06 99.3 ./pp143
>>>>>>>> 10158 R 551400 839184  2.2 00:00:07 100 ./pp143
>>>>>>>> 10159 R 551436 839184  2.2 00:00:06 98.3 ./pp143
>>>>>>>> 10160 R 551760 839692  2.2 00:00:07 100 ./pp143
>>>>>>>> 10161 R 551788 839824  2.2 00:00:07 97.3 ./pp143
>>>>>>>> 10162 R 552256 840332  2.2 00:00:07 100 ./pp143
>>>>>>>> 10163 R 552216 840340  2.2 00:00:07 99.3 ./pp143
>>>>>>>> 
>>>>>>>> 
>>>>>>>> You would have to do something smarter to get correct process binding 
>>>>>>>> etc.