Hi Jeff,

Thanks.

I tried what you suggested. Here is the output:

>>>
[yiguang@gulftown testdmp]$ ./test.bash
[gulftown:25052] mca: base: components_open: Looking for plm components
[gulftown:25052] mca: base: components_open: opening plm components
[gulftown:25052] mca: base: components_open: found loaded component rsh
[gulftown:25052] mca: base: components_open: component rsh has no register function
[gulftown:25052] mca: base: components_open: component rsh open function successful
[gulftown:25052] mca: base: components_open: found loaded component slurm
[gulftown:25052] mca: base: components_open: component slurm has no register function
[gulftown:25052] mca: base: components_open: component slurm open function successful
[gulftown:25052] mca: base: components_open: found loaded component tm
[gulftown:25052] mca: base: components_open: component tm has no register function
[gulftown:25052] mca: base: components_open: component tm open function successful
[gulftown:25052] mca:base:select: Auto-selecting plm components
[gulftown:25052] mca:base:select:(  plm) Querying component [rsh]
[gulftown:25052] mca:base:select:(  plm) Query of component [rsh] set priority to 10
[gulftown:25052] mca:base:select:(  plm) Querying component [slurm]
[gulftown:25052] mca:base:select:(  plm) Skipping component [slurm]. Query failed to return a module
[gulftown:25052] mca:base:select:(  plm) Querying component [tm]
[gulftown:25052] mca:base:select:(  plm) Skipping component [tm]. Query failed to return a module
[gulftown:25052] mca:base:select:(  plm) Selected component [rsh]
[gulftown:25052] mca: base: close: component slurm closed
[gulftown:25052] mca: base: close: unloading component slurm
[gulftown:25052] mca: base: close: component tm closed
[gulftown:25052] mca: base: close: unloading component tm
bash: orted: command not found
bash: orted: command not found
bash: orted: command not found
<<<


The following is the content of test.bash:
>>>
[yiguang@gulftown testdmp]$ ./test.bash
#!/bin/sh -f
#nohup
#
# >-------------------------------------------------------------------------------------------<
adinahome=/usr/adina/system8.8dmp
mpirunfile=$adinahome/bin/mpirun
#
# Set envars for mpirun and orted
#
export PATH=$adinahome/bin:$adinahome/tools:$PATH
export LD_LIBRARY_PATH=$adinahome/lib:$LD_LIBRARY_PATH
#
#
# run DMP problem
#
mcaprefix="--prefix $adinahome"
mcarshagent="--mca plm_rsh_agent rsh:ssh"
mcatmpdir="--mca orte_tmpdir_base /tmp"
mcaopenibmsg="--mca btl_openib_warn_default_gid_prefix 0"
mcaenvars="-x PATH -x LD_LIBRARY_PATH"
mcabtlconn="--mca btl openib,sm,self"
mcaplmbase="--mca plm_base_verbose 100"

mcaparams="$mcaprefix $mcaenvars $mcarshagent $mcaopenibmsg $mcabtlconn $mcatmpdir $mcaplmbase"

$mpirunfile $mcaparams --app addmpw-hostname
<<<

The content of addmpw-hostname is:
>>>
-n 1 -host gulftown hostname
-n 1 -host ibnode001 hostname
-n 1 -host ibnode002 hostname
-n 1 -host ibnode003 hostname
<<<

After this, I also tried specifying the location of orted explicitly with:

--mca orte_launch_agent $adinahome/bin/orted

With that option, orted is found on the slave nodes, but the shared libraries 
in $adinahome/lib are still not on LD_LIBRARY_PATH there.
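
To narrow this down, the environment that a non-interactive remote shell actually sees could be inspected on each slave node. The following is only a dry-run sketch: it prints the commands rather than running them, the helper name remote_check_cmd is made up for illustration, and it assumes that passwordless ssh (or rsh) to the nodes listed in addmpw-hostname works.

```shell
#!/bin/sh
# Dry-run sketch: prints the check commands, does not execute them.
adinahome=/usr/adina/system8.8dmp   # install prefix taken from test.bash

# Hypothetical helper: build a command line that asks a non-interactive
# remote shell for its PATH and LD_LIBRARY_PATH and looks for orted.
# $1 = remote agent (ssh or rsh), $2 = host name.
remote_check_cmd() {
    printf '%s\n' "$1 $2 'echo PATH=\$PATH; echo LD_LIBRARY_PATH=\$LD_LIBRARY_PATH; ls $adinahome/bin/orted'"
}

# Remote hosts taken from addmpw-hostname.
for host in ibnode001 ibnode002 ibnode003; do
    remote_check_cmd ssh "$host"
done
```

Running the printed commands by hand would show whether $adinahome/bin and $adinahome/lib reach the remote shells at all.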

Any comments?

Thanks,
Yiguang


