It turns out that the "-x" option must be put on each line of the app file if an
app file is used.

OK, here are the test results on our cluster, in case they are useful to other
Open MPI users (Open MPI 1.4.3 is used on my system):

(1) If I run the mpirun command from the command line as in Jeff's foo test,
everything works fine, the same as in Jeff's test.

(2) Now let me start mpirun from a shell script.

First, the foo script contains:
>>>
#!/bin/sh -f

echo $HOSTNAME: PATH : $PATH
echo $HOSTNAME: LD_LIBRARY_PATH : $LD_LIBRARY_PATH
<<<

The testenvars.bash script contains:
>>>
#!/bin/sh -f
#nohup
#
# >-------------------------------------------------------------------------------------------<
adinahome=/home/yiguang/testdmp881
mpirunfile=$adinahome/bin/mpirun
#
# Set envars for mpirun and orted
#
export PATH=/this/is/a/fake/path:$adinahome/bin:$adinahome/tools:$PATH
export LD_LIBRARY_PATH=/this/is/a/fake/libdir:$adinahome/lib:$LD_LIBRARY_PATH
#
#
# run DMP problem
#
mcaprefix="--prefix $adinahome"
mcaenvars="-x PATH -x LD_LIBRARY_PATH"
mcabtlconn="--mca btl openib,sm,self"
#mcaplmbase="--mca plm_base_verbose 100"

# mpirun is under $adinahome/bin

$mpirunfile --host gulftown,ibnode001 foo
<<<

Now if I run testenvars.bash from the command line:
>>>
[yiguang@gulftown testdmp]$ ./testenvars.bash
gulftown: PATH : /home/yiguang/testdmp881/bin:/home/yiguang/testdmp881/bin:/this/is/a/fake/path:/home/yiguang/testdmp881/bin:/home/yiguang/testdmp881/tools:/usr/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/adina/system8.8/tools:/usr/adina/system8.7/tools:/usr/adina/system8.6/tools:/usr/adina/system8.5/tools:/home/yiguang/bin
gulftown: LD_LIBRARY_PATH : /home/yiguang/testdmp881/lib:/home/yiguang/testdmp881/lib:/this/is/a/fake/libdir:/home/yiguang/testdmp881/lib:
ibnode001: PATH : /home/yiguang/testdmp881/bin:/home/yiguang/testdmp881/bin:/usr/bin:/usr/lib64/qt-3.3/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin
ibnode001: LD_LIBRARY_PATH : /home/yiguang/testdmp881/lib:/home/yiguang/testdmp881/lib:
<<<

If, in the testenvars.bash script, I change the line 
$mpirunfile --host gulftown,ibnode001 foo
-->
mpirun --prefix $adinahome --host gulftown,ibnode001 foo

then I get the same output as above. As expected, giving the full path of
mpirun and giving --prefix have the same effect. The unexpected part is that
/home/yiguang/testdmp881/bin and /home/yiguang/testdmp881/lib are prepended
twice here. Why?
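
To check how many times a directory really appears in one of these lists
without counting by eye, a small helper could be used. This is only a sketch:
count_in_path is a name I made up, and the sample string is the ibnode001
LD_LIBRARY_PATH output from the run above.

```shell
#!/bin/sh
# count_in_path DIR LIST (hypothetical helper): print how many entries of
# the colon-separated LIST are exactly equal to DIR.
# The body runs in a subshell, so the IFS and "set -f" changes do not leak
# into the calling script.
count_in_path() (
    dir=$1
    set -f          # no filename expansion while the list is split
    IFS=:
    count=0
    for entry in $2; do
        if [ "$entry" = "$dir" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
)

# Sample taken from the ibnode001 output above.
sample=/home/yiguang/testdmp881/lib:/home/yiguang/testdmp881/lib:
count_in_path /home/yiguang/testdmp881/lib "$sample"    # prints 2
```

Inside foo, the same function could be called with "$PATH" or
"$LD_LIBRARY_PATH" as the second argument to report the count on each node.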

Now if I change, in the above testenvars.bash script, the line

$mpirunfile --host gulftown,ibnode001 foo
-->
mpirun --prefix $adinahome $mcaenvars --host gulftown,ibnode001 foo

Then run the script:
>>>
[yiguang@gulftown testdmp]$ ./testenvars.bash
gulftown: PATH : /home/yiguang/testdmp881/bin:/this/is/a/fake/path:/home/yiguang/testdmp881/bin:/home/yiguang/testdmp881/tools:/usr/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/adina/system8.8/tools:/usr/adina/system8.7/tools:/usr/adina/system8.6/tools:/usr/adina/system8.5/tools:/home/yiguang/bin
gulftown: LD_LIBRARY_PATH : /home/yiguang/testdmp881/lib:/this/is/a/fake/libdir:/home/yiguang/testdmp881/lib:
ibnode001: PATH : /home/yiguang/testdmp881/bin:/this/is/a/fake/path:/home/yiguang/testdmp881/bin:/home/yiguang/testdmp881/tools:/usr/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/adina/system8.8/tools:/usr/adina/system8.7/tools:/usr/adina/system8.6/tools:/usr/adina/system8.5/tools:/home/yiguang/bin
ibnode001: LD_LIBRARY_PATH : /home/yiguang/testdmp881/lib:/this/is/a/fake/libdir:/home/yiguang/testdmp881/lib:
<<<
This time PATH and LD_LIBRARY_PATH are passed to the slave node, and
/home/yiguang/testdmp881/bin and /home/yiguang/testdmp881/lib are prepended
only once, unlike in the previous test.

So far so good, except for these minor things.

(3) Now I switch to using an app file.

The foo script is as above; the testenvars-app.bash script contains:
>>>
[yiguang@gulftown testdmp]$ cat testenvars-app.bash
#!/bin/sh -f
#nohup
#
# >-------------------------------------------------------------------------------------------<
adinahome=/home/yiguang/testdmp881
mpirunfile=$adinahome/bin/mpirun
#
# Set envars for mpirun and orted
#
export PATH=/this/is/a/fake/path:$adinahome/bin:$adinahome/tools:$PATH
export LD_LIBRARY_PATH=/this/is/a/fake/libdir:$adinahome/lib:$LD_LIBRARY_PATH
#
#
# run DMP problem
#
#mcaprefix="--prefix $adinahome"
mcaenvars="-x PATH -x LD_LIBRARY_PATH"
mcabtlconn="--mca btl openib,sm,self"
#mcaplmbase="--mca plm_base_verbose 100"

$mpirunfile $mcabtlconn --app addmpw-foo-nox
#$mpirunfile $mcaenvars $mcabtlconn --app addmpw-foo-nox
#$mpirunfile $mcabtlconn --app addmpw-foo
<<<

The addmpw-foo-nox app file is:
>>>
[yiguang@gulftown testdmp]$ cat addmpw-foo-nox
--prefix /home/yiguang/testdmp881 -n 1 -host gulftown foo
--prefix /home/yiguang/testdmp881 -n 1 -host ibnode001 foo
<<<
The addmpw-foo app file is:
>>>
[yiguang@gulftown testdmp]$ cat addmpw-foo
--prefix /home/yiguang/testdmp881 -x PATH -x LD_LIBRARY_PATH -n 1 -host gulftown foo
--prefix /home/yiguang/testdmp881 -x PATH -x LD_LIBRARY_PATH -n 1 -host ibnode001 foo
<<<

(a) If I run testenvars-app.bash with the first of its last three lines
active:

$mpirunfile $mcabtlconn --app addmpw-foo-nox

then the output is:
>>>
[yiguang@gulftown testdmp]$ ./testenvars-app.bash
gulftown: PATH : /home/yiguang/testdmp881/bin:/home/yiguang/testdmp881/bin:/this/is/a/fake/path:/home/yiguang/testdmp881/bin:/home/yiguang/testdmp881/tools:/usr/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/adina/system8.8/tools:/usr/adina/system8.7/tools:/usr/adina/system8.6/tools:/usr/adina/system8.5/tools:/home/yiguang/bin
gulftown: LD_LIBRARY_PATH : /home/yiguang/testdmp881/lib:/home/yiguang/testdmp881/lib:/this/is/a/fake/libdir:/home/yiguang/testdmp881/lib:
ibnode001: PATH : /home/yiguang/testdmp881/bin:/home/yiguang/testdmp881/bin:/usr/bin:/usr/lib64/qt-3.3/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin
ibnode001: LD_LIBRARY_PATH : /home/yiguang/testdmp881/lib:/home/yiguang/testdmp881/lib:
<<<

(b) If I choose the second of the last three lines of the testenvars-app.bash
script, that is, uncomment the line
$mpirunfile $mcaenvars $mcabtlconn --app addmpw-foo-nox
and comment out the other two lines, the output is:
>>>
[yiguang@gulftown testdmp]$ ./testenvars-app.bash
gulftown: PATH : /home/yiguang/testdmp881/bin:/home/yiguang/testdmp881/bin:/this/is/a/fake/path:/home/yiguang/testdmp881/bin:/home/yiguang/testdmp881/tools:/usr/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/adina/system8.8/tools:/usr/adina/system8.7/tools:/usr/adina/system8.6/tools:/usr/adina/system8.5/tools:/home/yiguang/bin
gulftown: LD_LIBRARY_PATH : /home/yiguang/testdmp881/lib:/home/yiguang/testdmp881/lib:/this/is/a/fake/libdir:/home/yiguang/testdmp881/lib:
ibnode001: PATH : /home/yiguang/testdmp881/bin:/home/yiguang/testdmp881/bin:/usr/bin:/usr/lib64/qt-3.3/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin
ibnode001: LD_LIBRARY_PATH : /home/yiguang/testdmp881/lib:/home/yiguang/testdmp881/lib:
<<<

(c) Now if I uncomment the last line and comment out the other two of the last
three lines, that is, run
$mpirunfile $mcabtlconn --app addmpw-foo

then the output is:
>>>
[yiguang@gulftown testdmp]$ ./testenvars-app.bash
gulftown: PATH : /home/yiguang/testdmp881/bin:/this/is/a/fake/path:/home/yiguang/testdmp881/bin:/home/yiguang/testdmp881/tools:/usr/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/adina/system8.8/tools:/usr/adina/system8.7/tools:/usr/adina/system8.6/tools:/usr/adina/system8.5/tools:/home/yiguang/bin
gulftown: LD_LIBRARY_PATH : /home/yiguang/testdmp881/lib:/this/is/a/fake/libdir:/home/yiguang/testdmp881/lib:
ibnode001: PATH : /home/yiguang/testdmp881/bin:/this/is/a/fake/path:/home/yiguang/testdmp881/bin:/home/yiguang/testdmp881/tools:/usr/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/adina/system8.8/tools:/usr/adina/system8.7/tools:/usr/adina/system8.6/tools:/usr/adina/system8.5/tools:/home/yiguang/bin
ibnode001: LD_LIBRARY_PATH : /home/yiguang/testdmp881/lib:/this/is/a/fake/libdir:/home/yiguang/testdmp881/lib:
<<<

So from tests (a), (b), and (c), when I use an app file, PATH and
LD_LIBRARY_PATH are passed to the slave node only when "-x" is set on each
line of the app file, similar to the "--prefix" option.
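
Until this is sorted out, one way to avoid forgetting the options on some line
is to generate the app file from a host list, so that every line carries
"--prefix" and "-x". This is a rough sketch only: make_appfile is a name I
made up, and the line format simply copies the addmpw-foo file shown above.

```shell
#!/bin/sh
# make_appfile PREFIX PROGRAM HOST... (hypothetical helper): print one
# app-file line per host, repeating --prefix and both -x options on every
# line, in the same format as the addmpw-foo file.
make_appfile() {
    prefix=$1
    program=$2
    shift 2
    for host in "$@"; do
        printf '%s\n' \
            "--prefix $prefix -x PATH -x LD_LIBRARY_PATH -n 1 -host $host $program"
    done
}

# Reproduces the two lines of addmpw-foo on standard output.
make_appfile /home/yiguang/testdmp881 foo gulftown ibnode001
```

Redirecting the output to a file (for example "> addmpw-foo") gives an app
file that can then be passed to mpirun with --app as in test (c).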

Any conclusion? If the "--prefix" behavior is accepted as a bug to be fixed, I
would think this is another bug, for the "-x" option.

Thanks,
Yiguang




