Hi,

> We shouldn't just hang - that isn't right. Can you configure
> OMPI with --enable-debug and then add "-mca plm_base_verbose 5
> -mca state_base_verbose 5" to your cmd line so we can see where
> it is hanging?

The program doesn't hang. It completes without any output from the
program itself and returns exit status "1".

tyr small_prog 55 mpiexec -np 3 -host rs0,sunpc1,linpc1 \
  -mca plm_base_verbose 5 -mca state_base_verbose 5 rank_size
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(state) Querying component 
[app]
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(state) Skipping component 
[app]. Query failed to return a module
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(state) Querying component 
[hnp]
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(state) Query of component 
[hnp] set priority to 60
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(state) Querying component 
[novm]
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(state) Skipping component 
[novm]. Query failed to return a module
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(state) Querying component 
[orted]
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(state) Skipping component 
[orted]. Query failed to return a module
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(state) Querying component 
[staged_hnp]
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(state) Skipping component 
[staged_hnp]. Query failed to return a module
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(state) Querying component 
[staged_orted]
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(state) Skipping component 
[staged_orted]. Query failed to return a module
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(state) Querying component 
[tool]
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(state) Skipping component 
[tool]. Query failed to return a module
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(state) Selected component 
[hnp]
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(  plm) Querying component 
[rsh]
[tyr.informatik.hs-fulda.de:12297] [[INVALID],INVALID] plm:rsh_lookup on agent 
ssh : rsh path NULL
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(  plm) Query of component 
[rsh] set priority to 10
[tyr.informatik.hs-fulda.de:12297] mca:base:select:(  plm) Selected component 
[rsh]
[tyr.informatik.hs-fulda.de:12297] plm:base:set_hnp_name: initial bias 12297 
nodename hash 339128848
[tyr.informatik.hs-fulda.de:12297] plm:base:set_hnp_name: final jobfam 38447
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:rsh_setup on agent ssh : 
rsh path NULL
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:base:receive start comm
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] ACTIVATE JOB [INVALID] STATE 
PENDING INIT AT 
../../../../../openmpi-1.7.4rc2r30094/orte/mca/plm/rsh/plm_rsh_module.c:900
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] ACTIVATING JOB [INVALID] STATE 
PENDING INIT PRI 4
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:base:setup_job
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] ACTIVATE JOB [38447,1] STATE 
INIT_COMPLETE AT 
../../../../openmpi-1.7.4rc2r30094/orte/mca/plm/base/plm_base_launch_support.c:317
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] ACTIVATING JOB [38447,1] STATE 
INIT_COMPLETE PRI 4
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] ACTIVATE JOB [38447,1] STATE 
PENDING ALLOCATION AT 
../../../../openmpi-1.7.4rc2r30094/orte/mca/plm/base/plm_base_launch_support.c:328
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] ACTIVATING JOB [38447,1] STATE 
PENDING ALLOCATION PRI 4
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] ACTIVATE JOB [38447,1] STATE 
ALLOCATION COMPLETE AT 
../../../../openmpi-1.7.4rc2r30094/orte/mca/ras/base/ras_base_allocate.c:423
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] ACTIVATING JOB [38447,1] STATE 
ALLOCATION COMPLETE PRI 4
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] ACTIVATE JOB [38447,1] STATE 
PENDING DAEMON LAUNCH AT 
../../../../openmpi-1.7.4rc2r30094/orte/mca/plm/base/plm_base_launch_support.c:184
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] ACTIVATING JOB [38447,1] STATE 
PENDING DAEMON LAUNCH PRI 4
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:base:setup_vm
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:base:setup_vm creating map
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] setup:vm: working unmanaged 
allocation
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] using dash_host
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] checking node rs0
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] checking node sunpc1
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] checking node linpc1
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:base:setup_vm add new 
daemon [[38447,0],1]
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:base:setup_vm assigning 
new daemon [[38447,0],1] to node rs0
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:base:setup_vm add new 
daemon [[38447,0],2]
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:base:setup_vm assigning 
new daemon [[38447,0],2] to node sunpc1
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:base:setup_vm add new 
daemon [[38447,0],3]
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:base:setup_vm assigning 
new daemon [[38447,0],3] to node linpc1
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:rsh: launching vm
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:rsh: local shell: 2 (tcsh)
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:rsh: assuming same remote 
shell as local shell
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:rsh: remote shell: 2 (tcsh)
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:rsh: final template argv:
        /usr/local/bin/ssh <template>  orted -mca ess env -mca orte_ess_jobid 
2519662592 -mca orte_ess_vpid <template> -mca orte_ess_num_procs 4 -mca 
orte_hnp_uri "2519662592.0;tcp://193.174.24.39:59753" 
--tree-spawn -mca plm_base_verbose 5 -mca state_base_verbose 5 -mca plm rsh
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:rsh:launch daemon 0 not a 
child of mine
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:rsh: adding node rs0 to 
launch list
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:rsh: adding node sunpc1 to 
launch list
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:rsh:launch daemon 3 not a 
child of mine
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:rsh: activating launch 
event
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:rsh: recording launch of 
daemon [[38447,0],1]
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:rsh: executing: 
(/usr/local/bin/ssh) [/usr/local/bin/ssh rs0  orted -mca ess env -mca 
orte_ess_jobid 2519662592 -mca orte_ess_vpid 1 -mca 
orte_ess_num_procs 4 -mca orte_hnp_uri "2519662592.0;tcp://193.174.24.39:59753" 
--tree-spawn -mca plm_base_verbose 5 -mca state_base_verbose 5 -mca plm rsh]
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:rsh: executing: 
(/usr/local/bin/ssh) [/usr/local/bin/ssh sunpc1  orted -mca ess env -mca 
orte_ess_jobid 2519662592 -mca orte_ess_vpid 2 -mca 
orte_ess_num_procs 4 -mca orte_hnp_uri "2519662592.0;tcp://193.174.24.39:59753" 
--tree-spawn -mca plm_base_verbose 5 -mca state_base_verbose 5 -mca plm rsh]
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:rsh: recording launch of 
daemon [[38447,0],2]
X11 forwarding request failed on channel 0
[sunpc1:22290] mca:base:select:(state) Querying component [app]
[sunpc1:22290] mca:base:select:(state) Skipping component [app]. Query failed 
to return a module
[sunpc1:22290] mca:base:select:(state) Querying component [hnp]
[sunpc1:22290] mca:base:select:(state) Skipping component [hnp]. Query failed 
to return a module
[sunpc1:22290] mca:base:select:(state) Querying component [novm]
[sunpc1:22290] mca:base:select:(state) Skipping component [novm]. Query failed 
to return a module
[sunpc1:22290] mca:base:select:(state) Querying component [orted]
[sunpc1:22290] mca:base:select:(state) Query of component [orted] set priority 
to 100
[sunpc1:22290] mca:base:select:(state) Querying component [staged_hnp]
[sunpc1:22290] mca:base:select:(state) Skipping component [staged_hnp]. Query 
failed to return a module
[sunpc1:22290] mca:base:select:(state) Querying component [staged_orted]
[sunpc1:22290] mca:base:select:(state) Skipping component [staged_orted]. Query 
failed to return a module
[sunpc1:22290] mca:base:select:(state) Querying component [tool]
[sunpc1:22290] mca:base:select:(state) Skipping component [tool]. Query failed 
to return a module
[sunpc1:22290] mca:base:select:(state) Selected component [orted]
[sunpc1:22290] mca:base:select:(  plm) Querying component [rsh]
[sunpc1:22290] [[38447,0],2] plm:rsh_lookup on agent ssh : rsh path NULL
[sunpc1:22290] mca:base:select:(  plm) Query of component [rsh] set priority to 
10
[sunpc1:22290] mca:base:select:(  plm) Selected component [rsh]
[sunpc1:22290] [[38447,0],2] plm:rsh_setup on agent ssh : rsh path NULL
[sunpc1:22290] [[38447,0],2] plm:base:receive start comm
[sunpc1:22290] [[38447,0],2] ACTIVATE PROC [[38447,0],0] STATE UNABLE TO SEND 
MSG AT ../../../../openmpi-1.9a1r30100/orte/mca/rml/base/rml_base_frame.c:205
[sunpc1:22290] [[38447,0],2] ACTIVATING PROC [[38447,0],0] STATE UNABLE TO SEND 
MSG PRI 0
[sunpc1:22290] [[38447,0],2] FORCE-TERMINATE AT 
../../../../../openmpi-1.9a1r30100/orte/mca/errmgr/default_orted/errmgr_default_orted.c:259
[sunpc1:22290] [[38447,0],2] ACTIVATE JOB NULL STATE FORCED EXIT AT 
../../../../../openmpi-1.9a1r30100/orte/mca/errmgr/default_orted/errmgr_default_orted.c:259
[sunpc1:22290] [[38447,0],2] ACTIVATING JOB NULL STATE FORCED EXIT PRI 0
[sunpc1:22290] [[38447,0],2] plm:base:receive stop comm
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] daemon 2 failed with status 1
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] ACTIVATE PROC [[38447,0],2] 
STATE FAILED TO START AT 
../../../../../openmpi-1.7.4rc2r30094/orte/mca/plm/rsh/plm_rsh_module.c:304
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] ACTIVATING PROC [[38447,0],2] 
STATE FAILED TO START PRI 0
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:base:orted_cmd sending 
orted_exit commands
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] ACTIVATE JOB NULL STATE 
DAEMONS TERMINATED AT ../../openmpi-1.7.4rc2r30094/orte/orted/orted_comm.c:465
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] ACTIVATING JOB NULL STATE 
DAEMONS TERMINATED PRI 0
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] plm:base:receive stop comm
tyr small_prog 56 [rs0.informatik.hs-fulda.de:03686] mca:base:select:(state) 
Querying component [app]
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(state) Skipping component 
[app]. Query failed to return a module
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(state) Querying component 
[hnp]
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(state) Skipping component 
[hnp]. Query failed to return a module
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(state) Querying component 
[novm]
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(state) Skipping component 
[novm]. Query failed to return a module
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(state) Querying component 
[orted]
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(state) Query of component 
[orted] set priority to 100
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(state) Querying component 
[staged_hnp]
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(state) Skipping component 
[staged_hnp]. Query failed to return a module
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(state) Querying component 
[staged_orted]
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(state) Skipping component 
[staged_orted]. Query failed to return a module
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(state) Querying component 
[tool]
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(state) Skipping component 
[tool]. Query failed to return a module
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(state) Selected component 
[orted]
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(  plm) Querying component 
[rsh]
[rs0.informatik.hs-fulda.de:03686] [[38447,0],1] plm:rsh_lookup on agent ssh : 
rsh path NULL
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(  plm) Query of component 
[rsh] set priority to 10
[rs0.informatik.hs-fulda.de:03686] mca:base:select:(  plm) Selected component 
[rsh]
[rs0.informatik.hs-fulda.de:03686] [[38447,0],1] plm:rsh_setup on agent ssh : 
rsh path NULL
[rs0.informatik.hs-fulda.de:03686] [[38447,0],1] plm:base:receive start comm
[rs0.informatik.hs-fulda.de:03686] [[38447,0],1] ACTIVATE PROC [[38447,0],0] 
STATE UNABLE TO SEND MSG AT 
../../../../openmpi-1.9a1r30100/orte/mca/rml/base/rml_base_frame.c:205
[rs0.informatik.hs-fulda.de:03686] [[38447,0],1] ACTIVATING PROC [[38447,0],0] 
STATE UNABLE TO SEND MSG PRI 0
[rs0.informatik.hs-fulda.de:03686] [[38447,0],1] FORCE-TERMINATE AT 
../../../../../openmpi-1.9a1r30100/orte/mca/errmgr/default_orted/errmgr_default_orted.c:259
[rs0.informatik.hs-fulda.de:03686] [[38447,0],1] ACTIVATE JOB NULL STATE FORCED 
EXIT AT 
../../../../../openmpi-1.9a1r30100/orte/mca/errmgr/default_orted/errmgr_default_orted.c:259
[rs0.informatik.hs-fulda.de:03686] [[38447,0],1] ACTIVATING JOB NULL STATE 
FORCED EXIT PRI 0
[rs0.informatik.hs-fulda.de:03686] [[38447,0],1] plm:base:receive stop comm

tyr small_prog 56 echo $status
1
tyr small_prog 57

tyr small_prog 57 mpiexec -np 3 -host rs0,sunpc1,linpc1 -mca plm_base_verbose 5 \
  -mca state_base_verbose 5 --hetero-nodes --hetero-apps rank_size
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(state) Querying component 
[app]
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(state) Skipping component 
[app]. Query failed to return a module
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(state) Querying component 
[hnp]
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(state) Query of component 
[hnp] set priority to 60
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(state) Querying component 
[novm]
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(state) Skipping component 
[novm]. Query failed to return a module
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(state) Querying component 
[orted]
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(state) Skipping component 
[orted]. Query failed to return a module
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(state) Querying component 
[staged_hnp]
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(state) Skipping component 
[staged_hnp]. Query failed to return a module
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(state) Querying component 
[staged_orted]
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(state) Skipping component 
[staged_orted]. Query failed to return a module
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(state) Querying component 
[tool]
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(state) Skipping component 
[tool]. Query failed to return a module
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(state) Selected component 
[hnp]
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(  plm) Querying component 
[rsh]
[tyr.informatik.hs-fulda.de:12313] [[INVALID],INVALID] plm:rsh_lookup on agent 
ssh : rsh path NULL
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(  plm) Query of component 
[rsh] set priority to 10
[tyr.informatik.hs-fulda.de:12313] mca:base:select:(  plm) Selected component 
[rsh]
[tyr.informatik.hs-fulda.de:12313] plm:base:set_hnp_name: initial bias 12313 
nodename hash 339128848
[tyr.informatik.hs-fulda.de:12313] plm:base:set_hnp_name: final jobfam 38463
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:rsh_setup on agent ssh : 
rsh path NULL
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:base:receive start comm
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] ACTIVATE JOB [INVALID] STATE 
PENDING INIT AT 
../../../../../openmpi-1.7.4rc2r30094/orte/mca/plm/rsh/plm_rsh_module.c:900
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] ACTIVATING JOB [INVALID] STATE 
PENDING INIT PRI 4
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:base:setup_job
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] ACTIVATE JOB [38463,1] STATE 
INIT_COMPLETE AT 
../../../../openmpi-1.7.4rc2r30094/orte/mca/plm/base/plm_base_launch_support.c:317
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] ACTIVATING JOB [38463,1] STATE 
INIT_COMPLETE PRI 4
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] ACTIVATE JOB [38463,1] STATE 
PENDING ALLOCATION AT 
../../../../openmpi-1.7.4rc2r30094/orte/mca/plm/base/plm_base_launch_support.c:328
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] ACTIVATING JOB [38463,1] STATE 
PENDING ALLOCATION PRI 4
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] ACTIVATE JOB [38463,1] STATE 
ALLOCATION COMPLETE AT 
../../../../openmpi-1.7.4rc2r30094/orte/mca/ras/base/ras_base_allocate.c:423
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] ACTIVATING JOB [38463,1] STATE 
ALLOCATION COMPLETE PRI 4
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] ACTIVATE JOB [38463,1] STATE 
PENDING DAEMON LAUNCH AT 
../../../../openmpi-1.7.4rc2r30094/orte/mca/plm/base/plm_base_launch_support.c:184
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] ACTIVATING JOB [38463,1] STATE 
PENDING DAEMON LAUNCH PRI 4
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:base:setup_vm
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:base:setup_vm creating map
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] setup:vm: working unmanaged 
allocation
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] using dash_host
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] checking node rs0
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] checking node sunpc1
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] checking node linpc1
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:base:setup_vm add new 
daemon [[38463,0],1]
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:base:setup_vm assigning 
new daemon [[38463,0],1] to node rs0
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:base:setup_vm add new 
daemon [[38463,0],2]
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:base:setup_vm assigning 
new daemon [[38463,0],2] to node sunpc1
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:base:setup_vm add new 
daemon [[38463,0],3]
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:base:setup_vm assigning 
new daemon [[38463,0],3] to node linpc1
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:rsh: launching vm
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:rsh: local shell: 2 (tcsh)
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:rsh: assuming same remote 
shell as local shell
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:rsh: remote shell: 2 (tcsh)
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:rsh: final template argv:
        /usr/local/bin/ssh <template>  orted -mca orte_hetero_nodes 1 -mca ess 
env -mca orte_ess_jobid 2520711168 -mca orte_ess_vpid <template> -mca 
orte_ess_num_procs 4 -mca orte_hnp_uri 
"2520711168.0;tcp://193.174.24.39:59756" --tree-spawn -mca plm_base_verbose 5 
-mca state_base_verbose 5 -mca plm rsh -mca orte_hetero_apps 1
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:rsh:launch daemon 0 not a 
child of mine
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:rsh: adding node rs0 to 
launch list
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:rsh: adding node sunpc1 to 
launch list
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:rsh:launch daemon 3 not a 
child of mine
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:rsh: activating launch 
event
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:rsh: recording launch of 
daemon [[38463,0],1]
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:rsh: executing: 
(/usr/local/bin/ssh) [/usr/local/bin/ssh rs0  orted -mca orte_hetero_nodes 1 
-mca ess env -mca orte_ess_jobid 2520711168 -mca 
orte_ess_vpid 1 -mca orte_ess_num_procs 4 -mca orte_hnp_uri 
"2520711168.0;tcp://193.174.24.39:59756" --tree-spawn -mca plm_base_verbose 5 
-mca state_base_verbose 5 -mca plm rsh -mca orte_hetero_apps 1]
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:rsh: executing: 
(/usr/local/bin/ssh) [/usr/local/bin/ssh sunpc1  orted -mca orte_hetero_nodes 1 
-mca ess env -mca orte_ess_jobid 2520711168 -mca 
orte_ess_vpid 2 -mca orte_ess_num_procs 4 -mca orte_hnp_uri 
"2520711168.0;tcp://193.174.24.39:59756" --tree-spawn -mca plm_base_verbose 5 
-mca state_base_verbose 5 -mca plm rsh -mca orte_hetero_apps 1]
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:rsh: recording launch of 
daemon [[38463,0],2]
Warning: No xauth data; using fake authentication data for X11 forwarding.
X11 forwarding request failed on channel 0
[sunpc1:22320] mca:base:select:(state) Querying component [app]
[sunpc1:22320] mca:base:select:(state) Skipping component [app]. Query failed 
to return a module
[sunpc1:22320] mca:base:select:(state) Querying component [hnp]
[sunpc1:22320] mca:base:select:(state) Skipping component [hnp]. Query failed 
to return a module
[sunpc1:22320] mca:base:select:(state) Querying component [novm]
[sunpc1:22320] mca:base:select:(state) Skipping component [novm]. Query failed 
to return a module
[sunpc1:22320] mca:base:select:(state) Querying component [orted]
[sunpc1:22320] mca:base:select:(state) Query of component [orted] set priority 
to 100
[sunpc1:22320] mca:base:select:(state) Querying component [staged_hnp]
[sunpc1:22320] mca:base:select:(state) Skipping component [staged_hnp]. Query 
failed to return a module
[sunpc1:22320] mca:base:select:(state) Querying component [staged_orted]
[sunpc1:22320] mca:base:select:(state) Skipping component [staged_orted]. Query 
failed to return a module
[sunpc1:22320] mca:base:select:(state) Querying component [tool]
[sunpc1:22320] mca:base:select:(state) Skipping component [tool]. Query failed 
to return a module
[sunpc1:22320] mca:base:select:(state) Selected component [orted]
[sunpc1:22320] mca:base:select:(  plm) Querying component [rsh]
[sunpc1:22320] [[38463,0],2] plm:rsh_lookup on agent ssh : rsh path NULL
[sunpc1:22320] mca:base:select:(  plm) Query of component [rsh] set priority to 
10
[sunpc1:22320] mca:base:select:(  plm) Selected component [rsh]
[sunpc1:22320] [[38463,0],2] plm:rsh_setup on agent ssh : rsh path NULL
[sunpc1:22320] [[38463,0],2] plm:base:receive start comm
[sunpc1:22320] [[38463,0],2] ACTIVATE PROC [[38463,0],0] STATE UNABLE TO SEND 
MSG AT ../../../../openmpi-1.9a1r30100/orte/mca/rml/base/rml_base_frame.c:205
[sunpc1:22320] [[38463,0],2] ACTIVATING PROC [[38463,0],0] STATE UNABLE TO SEND 
MSG PRI 0
[sunpc1:22320] [[38463,0],2] FORCE-TERMINATE AT 
../../../../../openmpi-1.9a1r30100/orte/mca/errmgr/default_orted/errmgr_default_orted.c:259
[sunpc1:22320] [[38463,0],2] ACTIVATE JOB NULL STATE FORCED EXIT AT 
../../../../../openmpi-1.9a1r30100/orte/mca/errmgr/default_orted/errmgr_default_orted.c:259
[sunpc1:22320] [[38463,0],2] ACTIVATING JOB NULL STATE FORCED EXIT PRI 0
[sunpc1:22320] [[38463,0],2] plm:base:receive stop comm
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] daemon 2 failed with status 1
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] ACTIVATE PROC [[38463,0],2] 
STATE FAILED TO START AT 
../../../../../openmpi-1.7.4rc2r30094/orte/mca/plm/rsh/plm_rsh_module.c:304
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] ACTIVATING PROC [[38463,0],2] 
STATE FAILED TO START PRI 0
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:base:orted_cmd sending 
orted_exit commands
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] ACTIVATE JOB NULL STATE 
DAEMONS TERMINATED AT ../../openmpi-1.7.4rc2r30094/orte/orted/orted_comm.c:465
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] ACTIVATING JOB NULL STATE 
DAEMONS TERMINATED PRI 0
[tyr.informatik.hs-fulda.de:12313] [[38463,0],0] plm:base:receive stop comm
tyr small_prog 58 [rs0.informatik.hs-fulda.de:03718] mca:base:select:(state) 
Querying component [app]
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(state) Skipping component 
[app]. Query failed to return a module
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(state) Querying component 
[hnp]
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(state) Skipping component 
[hnp]. Query failed to return a module
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(state) Querying component 
[novm]
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(state) Skipping component 
[novm]. Query failed to return a module
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(state) Querying component 
[orted]
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(state) Query of component 
[orted] set priority to 100
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(state) Querying component 
[staged_hnp]
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(state) Skipping component 
[staged_hnp]. Query failed to return a module
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(state) Querying component 
[staged_orted]
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(state) Skipping component 
[staged_orted]. Query failed to return a module
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(state) Querying component 
[tool]
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(state) Skipping component 
[tool]. Query failed to return a module
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(state) Selected component 
[orted]
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(  plm) Querying component 
[rsh]
[rs0.informatik.hs-fulda.de:03718] [[38463,0],1] plm:rsh_lookup on agent ssh : 
rsh path NULL
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(  plm) Query of component 
[rsh] set priority to 10
[rs0.informatik.hs-fulda.de:03718] mca:base:select:(  plm) Selected component 
[rsh]
[rs0.informatik.hs-fulda.de:03718] [[38463,0],1] plm:rsh_setup on agent ssh : 
rsh path NULL
[rs0.informatik.hs-fulda.de:03718] [[38463,0],1] plm:base:receive start comm
[rs0.informatik.hs-fulda.de:03718] [[38463,0],1] ACTIVATE PROC [[38463,0],0] 
STATE UNABLE TO SEND MSG AT 
../../../../openmpi-1.9a1r30100/orte/mca/rml/base/rml_base_frame.c:205
[rs0.informatik.hs-fulda.de:03718] [[38463,0],1] ACTIVATING PROC [[38463,0],0] 
STATE UNABLE TO SEND MSG PRI 0
[rs0.informatik.hs-fulda.de:03718] [[38463,0],1] FORCE-TERMINATE AT 
../../../../../openmpi-1.9a1r30100/orte/mca/errmgr/default_orted/errmgr_default_orted.c:259
[rs0.informatik.hs-fulda.de:03718] [[38463,0],1] ACTIVATE JOB NULL STATE FORCED 
EXIT AT 
../../../../../openmpi-1.9a1r30100/orte/mca/errmgr/default_orted/errmgr_default_orted.c:259
[rs0.informatik.hs-fulda.de:03718] [[38463,0],1] ACTIVATING JOB NULL STATE 
FORCED EXIT PRI 0
[rs0.informatik.hs-fulda.de:03718] [[38463,0],1] plm:base:receive stop comm

tyr small_prog 58 echo $status
1
tyr small_prog 59 
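
As a side note, the state messages above embed the source path of the
build that printed them, so the log can be checked for which Open MPI
build each host is running. The following is only a minimal Python
sketch under some assumptions: the verbose output has been saved as
text, the build directories follow the "openmpi-<version>" naming seen
in the paths above, and wrapped continuation lines belong to the most
recent "[host:pid]" prefix. The function name versions_by_host is made
up for illustration and is not part of Open MPI.

```python
import re
from collections import defaultdict

# "[host:pid]" prefix as printed at the start of each log message.
HOST_RE = re.compile(r"\[([A-Za-z0-9.-]+):\d+\]")
# Build directory name inside the source paths, e.g. "openmpi-1.9a1r30100/".
VERSION_RE = re.compile(r"openmpi-([0-9][\w.]*)/")


def versions_by_host(log_text):
    """Map each host name to the set of build versions seen in its messages.

    Assumption: a wrapped continuation line (no "[host:pid]" prefix)
    belongs to the host of the most recent prefixed line.
    """
    seen = defaultdict(set)
    host = None
    for line in log_text.splitlines():
        match = HOST_RE.search(line)
        if match:
            host = match.group(1)
        if host is None:
            continue
        for version in VERSION_RE.findall(line):
            seen[host].add(version)
    return dict(seen)


# A few lines copied from the output above, wrapped as in the mail.
sample = """\
[tyr.informatik.hs-fulda.de:12297] [[38447,0],0] ACTIVATE PROC [[38447,0],2]
STATE FAILED TO START AT
../../../../../openmpi-1.7.4rc2r30094/orte/mca/plm/rsh/plm_rsh_module.c:304
[sunpc1:22290] [[38447,0],2] ACTIVATE PROC [[38447,0],0] STATE UNABLE TO SEND
MSG AT ../../../../openmpi-1.9a1r30100/orte/mca/rml/base/rml_base_frame.c:205
"""

result = versions_by_host(sample)
for name, versions in sorted(result.items()):
    print(name, sorted(versions))
```

In the output above, the messages from tyr carry paths from
openmpi-1.7.4rc2r30094, while those from sunpc1 and rs0 carry paths
from openmpi-1.9a1r30100. Whether that difference is related to the
failure is not something the log alone shows.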



Kind regards

Siegmar





> On Jan 1, 2014, at 1:48 AM, Siegmar Gross
> <siegmar.gr...@informatik.hs-fulda.de> wrote:
> 
> > In the past I could run a small program in a truly heterogeneous
> > system with little-endian (sunpc1, linpc1) and big-endian (rs0, tyr)
> > machines.
> > 
> > tyr small_prog 101 ompi_info | grep MPI:
> >                Open MPI: 1.6.6a1r29175
> > tyr small_prog 102 mpiexec -np 3 -host rs0,sunpc1,linpc1 rank_size
> > I'm process 1 of 3 available processes running on sunpc1.
> > MPI standard 2.1 is supported.
> > I'm process 0 of 3 available processes running on 
> > rs0.informatik.hs-fulda.de.
> > MPI standard 2.1 is supported.
> > I'm process 2 of 3 available processes running on linpc1.
> > MPI standard 2.1 is supported.
> > tyr small_prog 103 
> > 
> > 
> > Now I get no output at all.
> > 
> > tyr small_prog 130 ompi_info | grep MPI:
> >                Open MPI: 1.9a1r30100
> > tyr small_prog 131 mpiexec -np 3 -host rs0,sunpc1,linpc1 rank_size
> > tyr small_prog 132 mpiexec -np 3 -host rs0,sunpc1,linpc1 \
> >  --hetero-nodes --hetero-apps rank_size
> > tyr small_prog 133
> > 
> > 
> > Perhaps this behaviour is intended, because Open MPI doesn't
> > support little- and big-endian machines in the same cluster or
> > virtual computer (LAM-MPI is the only implementation I know of
> > that works in such an environment). On the other hand: does it
> > make sense for the command to finish without any output if it
> > doesn't work (even if "mpiexec" returns "1")?
> 
