On Mar 2, 2012, at 9:48 AM, Yiguang Yan wrote:

> (All with the same test script test.bash I post in my previous emails, so run 
> with app file fed to mpirun command.)
> 
> (1) If I put the --prefix in the app file, on each line of it, it works fine 
> as Jeff said.
> 
> (2) Since in the manual, it is said that the full path of mpirun is the same 
> as setting "--prefix". However, with app file, 
> this is not the case. Without "--prefix" on each line of the app file, the 
> full path of mpirun does not work.

Ralph and I just had a phone conversation about this.  We consider it a bug -- 
you shouldn't need to put --prefix in the app file.  Meaning: a prefix set 
outside the app file (e.g., implied by mpirun's full path) is currently being 
ignored if you use an app file, and therefore you have to put --prefix in the 
app file.  We're going to fix that.
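
Until that is fixed, the workaround from (1) can be scripted.  This is only a 
minimal sketch: /opt/openmpi and ./my_mpi_app are placeholders I made up, not 
values from this thread, so substitute your own prefix and program.

```shell
#!/bin/bash
# Sketch: generate an app file in which every line carries its own --prefix.
# ASSUMPTIONS: /opt/openmpi and ./my_mpi_app are placeholders, not from this thread.
prefix="/opt/openmpi"
appfile="$(mktemp)"

# One line per application context; each line repeats --prefix.
for np in 2 2; do
    echo "-np $np --prefix $prefix ./my_mpi_app"
done > "$appfile"

cat "$appfile"
echo "launch with: mpirun --app $appfile"
```

The script only writes the app file and prints the launch command; it does not 
run mpirun itself.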

> (3) With "--prefix $adinahome" set on each line of the app file, it is 
> exclusively put, on each node, the 
> $adinahome/bin into the PATH, and $adinahome/lib into the LD_LIBRARY_PATH(not 
> the $adinahome/lib64 as said 
> in mpirun manual(v1.4.x)).

Correct.

> The envars $PATH and $LD_LIBRARY_PATH set in test.bash script only affect 
> the 
> envars on the submitting node(gulftown in my case). No $PATH or 
> $LD_LIBRARY_PATH is passed to slave nodes 
> even if I use "-x PATH -x LD_LIBRARY_PATH", either fed to mpirun or put on 
> each line of the app file. I am not sure 
> if this is intended, since "--prefix" overwrite the effect of "-x" option, 
> this is different from what I see from the mpirun 
> man page.

Hmm.  Let's do a simple test here...

-----
[9:38] svbu-mpi:~ % cat foo
#!/bin/bash

echo test_env_var: $test_env_var
[9:38] svbu-mpi:~ % ./foo
test_env_var:
[9:38] svbu-mpi:~ % mpirun --host svbu-mpi001,svbu-mpi002 ~/foo
test_env_var:
test_env_var:
[9:38] svbu-mpi:~ % setenv test_env_var THIS-IS-TEST-ENV-VAR
[9:39] svbu-mpi:~ % ./foo
test_env_var: THIS-IS-TEST-ENV-VAR
[9:39] svbu-mpi:~ % mpirun --host svbu-mpi001,svbu-mpi002 ~/foo
test_env_var:
test_env_var:
[9:39] svbu-mpi:~ % mpirun --host svbu-mpi001,svbu-mpi002 -x test_env_var ~/foo
test_env_var: THIS-IS-TEST-ENV-VAR
test_env_var: THIS-IS-TEST-ENV-VAR
[9:39] svbu-mpi:~ % 
-----

So that appears to work.  Let's try with PATH.

-----
[9:41] svbu-mpi:~ % cat foo
#!/bin/bash -f

echo PATH: $PATH
[9:41] svbu-mpi:~ % ./foo
PATH: 
/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/var/opt/intel/composerxe-2011.1.107/bin:/opt/autotools/ac268-am1113-lt242/bin:/cm/shared/apps/valgrind/3.7.0/bin:/cm/shared/apps/mercurial/2.0.2/bin:/cm/shared/apps/gcc/4.4.6/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/sbin:/usr/sbin:/cm/shared/apps/slurm/2.2.4/bin:/cm/shared/apps/slurm/2.2.4/sbin:/cm/shared/apps/proxy/bin:/cm/shared/apps/subversion/1.7.2/bin:/sbin:/usr/sbin

# That's ok. Now let's try with mpirun.

[9:41] svbu-mpi:~ % mpirun --host svbu-mpi001,svbu-mpi002 ~/foo
PATH: 
/home/jsquyres/bogus/bin:/home/jsquyres/bogus/bin:/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/var/opt/intel/composerxe-2011.1.107/bin:/opt/autotools/ac268-am1113-lt242/bin:/cm/shared/apps/valgrind/3.7.0/bin:/cm/shared/apps/mercurial/2.0.2/bin:/cm/shared/apps/gcc/4.4.6/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/sbin:/usr/sbin:/cm/shared/apps/slurm/2.2.4/bin:/cm/shared/apps/slurm/2.2.4/sbin:/cm/shared/apps/proxy/bin:/cm/shared/apps/subversion/1.7.2/bin
PATH: 
/home/jsquyres/bogus/bin:/home/jsquyres/bogus/bin:/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/var/opt/intel/composerxe-2011.1.107/bin:/opt/autotools/ac268-am1113-lt242/bin:/cm/shared/apps/valgrind/3.7.0/bin:/cm/shared/apps/mercurial/2.0.2/bin:/cm/shared/apps/gcc/4.4.6/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/sbin:/usr/sbin:/cm/shared/apps/slurm/2.2.4/bin:/cm/shared/apps/slurm/2.2.4/sbin:/cm/shared/apps/proxy/bin:/cm/shared/apps/subversion/1.7.2/bin

# These look ok (my remote path differs a bit from my local path)
# Now let's add a bogus entry to the local path

[9:41] svbu-mpi:~ % set path = ($path /this/is/a/fake/path)
[9:41] svbu-mpi:~ % ./foo
PATH: 
/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/var/opt/intel/composerxe-2011.1.107/bin:/opt/autotools/ac268-am1113-lt242/bin:/cm/shared/apps/valgrind/3.7.0/bin:/cm/shared/apps/mercurial/2.0.2/bin:/cm/shared/apps/gcc/4.4.6/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/sbin:/usr/sbin:/cm/shared/apps/slurm/2.2.4/bin:/cm/shared/apps/slurm/2.2.4/sbin:/cm/shared/apps/proxy/bin:/cm/shared/apps/subversion/1.7.2/bin:/sbin:/usr/sbin:/this/is/a/fake/path

# Good; the bogus entry is there.  Now try mpirun

[9:41] svbu-mpi:~ % mpirun --host svbu-mpi001,svbu-mpi002 ~/foo
PATH: 
/home/jsquyres/bogus/bin:/home/jsquyres/bogus/bin:/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/var/opt/intel/composerxe-2011.1.107/bin:/opt/autotools/ac268-am1113-lt242/bin:/cm/shared/apps/valgrind/3.7.0/bin:/cm/shared/apps/mercurial/2.0.2/bin:/cm/shared/apps/gcc/4.4.6/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/sbin:/usr/sbin:/cm/shared/apps/slurm/2.2.4/bin:/cm/shared/apps/slurm/2.2.4/sbin:/cm/shared/apps/proxy/bin:/cm/shared/apps/subversion/1.7.2/bin
PATH: 
/home/jsquyres/bogus/bin:/home/jsquyres/bogus/bin:/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/var/opt/intel/composerxe-2011.1.107/bin:/opt/autotools/ac268-am1113-lt242/bin:/cm/shared/apps/valgrind/3.7.0/bin:/cm/shared/apps/mercurial/2.0.2/bin:/cm/shared/apps/gcc/4.4.6/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/sbin:/usr/sbin:/cm/shared/apps/slurm/2.2.4/bin:/cm/shared/apps/slurm/2.2.4/sbin:/cm/shared/apps/proxy/bin:/cm/shared/apps/subversion/1.7.2/bin

# Good -- it's not there.  Now -x PATH

[9:41] svbu-mpi:~ % mpirun --host svbu-mpi001,svbu-mpi002 -x PATH ~/foo
PATH: 
/home/jsquyres/bogus/bin:/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/var/opt/intel/composerxe-2011.1.107/bin:/opt/autotools/ac268-am1113-lt242/bin:/cm/shared/apps/valgrind/3.7.0/bin:/cm/shared/apps/mercurial/2.0.2/bin:/cm/shared/apps/gcc/4.4.6/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/sbin:/usr/sbin:/cm/shared/apps/slurm/2.2.4/bin:/cm/shared/apps/slurm/2.2.4/sbin:/cm/shared/apps/proxy/bin:/cm/shared/apps/subversion/1.7.2/bin:/sbin:/usr/sbin:/this/is/a/fake/path
PATH: 
/home/jsquyres/bogus/bin:/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/var/opt/intel/composerxe-2011.1.107/bin:/opt/autotools/ac268-am1113-lt242/bin:/cm/shared/apps/valgrind/3.7.0/bin:/cm/shared/apps/mercurial/2.0.2/bin:/cm/shared/apps/gcc/4.4.6/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/sbin:/usr/sbin:/cm/shared/apps/slurm/2.2.4/bin:/cm/shared/apps/slurm/2.2.4/sbin:/cm/shared/apps/proxy/bin:/cm/shared/apps/subversion/1.7.2/bin:/sbin:/usr/sbin:/this/is/a/fake/path

# Good -- the entry is now there on the remote nodes.
# Now let's try with --prefix and -x PATH

[9:44] svbu-mpi:~ % mpirun --prefix /home/jsquyres/bogus --host svbu-mpi001,svbu-mpi002 -x PATH ~/foo
PATH: 
/home/jsquyres/bogus/bin:/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/var/opt/intel/composerxe-2011.1.107/bin:/opt/autotools/ac268-am1113-lt242/bin:/cm/shared/apps/valgrind/3.7.0/bin:/cm/shared/apps/mercurial/2.0.2/bin:/cm/shared/apps/gcc/4.4.6/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/sbin:/usr/sbin:/cm/shared/apps/slurm/2.2.4/bin:/cm/shared/apps/slurm/2.2.4/sbin:/cm/shared/apps/proxy/bin:/cm/shared/apps/subversion/1.7.2/bin:/sbin:/usr/sbin:/this/is/a/fake/path
PATH: 
/home/jsquyres/bogus/bin:/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/home/jsquyres/bogus/bin:/users/jsquyres/local/bin:/var/opt/intel/composerxe-2011.1.107/bin:/opt/autotools/ac268-am1113-lt242/bin:/cm/shared/apps/valgrind/3.7.0/bin:/cm/shared/apps/mercurial/2.0.2/bin:/cm/shared/apps/gcc/4.4.6/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/sbin:/usr/sbin:/cm/shared/apps/slurm/2.2.4/bin:/cm/shared/apps/slurm/2.2.4/sbin:/cm/shared/apps/proxy/bin:/cm/shared/apps/subversion/1.7.2/bin:/sbin:/usr/sbin:/this/is/a/fake/path
[9:45] svbu-mpi:~ % 
-----

So it seems to be working for me.  Can you do a few manual tests like this and 
see if there's some combination that's not working properly for you?
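
Since the transcript above is csh and your script is bash, here is a rough 
bash version of the same test.  It is a sketch under assumptions: node01 and 
node02 are placeholder hostnames, probe_env.sh is a made-up file name, and the 
probe has to exist at the same path on every node (as ~/foo does above).

```shell
#!/bin/bash
# Sketch of the manual -x test above, for bash users.
# ASSUMPTIONS: node01,node02 are placeholder hostnames; probe_env.sh is a
# made-up name; $HOME must be visible at the same path on every node.
hosts="node01,node02"
probe="$HOME/probe_env.sh"

# Tiny probe script, same idea as "foo" in the transcript.
cat > "$probe" <<'EOF'
#!/bin/bash
echo "test_env_var: $test_env_var"
EOF
chmod +x "$probe"

# bash equivalent of csh "setenv test_env_var ..."
export test_env_var=THIS-IS-TEST-ENV-VAR

# Local run: the variable should be visible here.
local_out="$("$probe")"
echo "local: $local_out"

# Only attempt the remote runs if mpirun is on the PATH.
if command -v mpirun >/dev/null 2>&1; then
    mpirun --host "$hosts" "$probe"                  # without -x
    mpirun --host "$hosts" -x test_env_var "$probe"  # with -x
fi
```

Comparing the output of the two mpirun lines, and then repeating them with 
--prefix and with an app file, should show which combination misbehaves.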

> I have another question about the btl used for communication. I noticed that 
> rsh is using the tcp for connection, I 
> understand that tcp may be used for initial connection, but how can I know 
> that openib(infiniband) btl is used for my 
> data communication? Any explicit way?


At the moment, there are only implicit ways.

TCP is used for MPI bootstrapping.  The transport used for MPI traffic after 
that is set by the "btl" (byte transfer layer) MCA parameter, as Ralph said.  
You can *force* the OpenFabrics BTL to be used with something like this:

    mpirun --mca btl openib,sm,self ...

"openib" is the OpenFabrics BTL (OpenFabrics used to be called OpenIB, and 
we're kinda stuck with the plugin name now).  "sm" is shared memory, and "self" 
is process loopback.  So with this command line, you'll *only* use these 3 BTLs 
for MPI communication.
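
The same selection can also be made through the environment, because Open MPI 
reads MCA parameters from variables named OMPI_MCA_<param>.  A minimal sketch: 
./my_mpi_app is a placeholder program name, and the verbosity level 30 is just 
a value I picked for illustration, not something from this thread.

```shell
#!/bin/bash
# Sketch: restrict MPI traffic to the openib, sm, and self BTLs via the environment.
# ASSUMPTION: ./my_mpi_app is a placeholder program name.
export OMPI_MCA_btl="openib,sm,self"   # same effect as: --mca btl openib,sm,self
export OMPI_MCA_btl_base_verbose=30    # ask the BTL framework for selection output

# Print the command instead of running it, so the sketch works without MPI installed.
cmd="mpirun -np 2 ./my_mpi_app"
echo "btl list: $OMPI_MCA_btl"
echo "would run: $cmd"
```

With the list restricted like this, a job that cannot use openib between nodes 
should fail rather than quietly fall back to TCP, which is itself a useful 
check.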

Make sense?

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

