Hello!

I have Open MPI v1.8.2rc4r32485.

When I run hello_c, I get this error message:
$ mpirun -np 2 hello_c

An ORTE daemon has unexpectedly failed after launch and before
communicating back to mpirun. This could be caused by a number
of factors, including an inability to create a connection back
to mpirun due to a lack of common network interfaces and/or no
route found between them. Please check network connectivity
(including firewalls and network routing requirements).

When I run with --debug-daemons --mca plm_base_verbose 5 -mca oob_base_verbose 10 -mca rml_base_verbose 10, I get this output:
$ mpirun --debug-daemons --mca plm_base_verbose 5 -mca oob_base_verbose 10 -mca rml_base_verbose 10 -np 2 hello_c
[compiler-2:08780] mca:base:select:( plm) Querying component [isolated]
[compiler-2:08780] mca:base:select:( plm) Query of component [isolated] set priority to 0
[compiler-2:08780] mca:base:select:( plm) Querying component [rsh]
[compiler-2:08780] mca:base:select:( plm) Query of component [rsh] set priority to 10
[compiler-2:08780] mca:base:select:( plm) Querying component [slurm]
[compiler-2:08780] mca:base:select:( plm) Query of component [slurm] set priority to 75
[compiler-2:08780] mca:base:select:( plm) Selected component [slurm]
[compiler-2:08780] mca: base: components_register: registering oob components
[compiler-2:08780] mca: base: components_register: found loaded component tcp
[compiler-2:08780] mca: base: components_register: component tcp register function successful
[compiler-2:08780] mca: base: components_open: opening oob components
[compiler-2:08780] mca: base: components_open: found loaded component tcp
[compiler-2:08780] mca: base: components_open: component tcp open function successful
[compiler-2:08780] mca:oob:select: checking available component tcp
[compiler-2:08780] mca:oob:select: Querying component [tcp]
[compiler-2:08780] oob:tcp: component_available called
[compiler-2:08780] WORKING INTERFACE 1 KERNEL INDEX 1 FAMILY: V4
[compiler-2:08780] WORKING INTERFACE 2 KERNEL INDEX 3 FAMILY: V4
[compiler-2:08780] [[42202,0],0] oob:tcp:init adding 10.0.251.53 to our list of V4 connections
[compiler-2:08780] WORKING INTERFACE 3 KERNEL INDEX 4 FAMILY: V4
[compiler-2:08780] [[42202,0],0] oob:tcp:init adding 10.0.0.4 to our list of V4 connections
[compiler-2:08780] WORKING INTERFACE 4 KERNEL INDEX 5 FAMILY: V4
[compiler-2:08780] [[42202,0],0] oob:tcp:init adding 10.2.251.14 to our list of V4 connections
[compiler-2:08780] WORKING INTERFACE 5 KERNEL INDEX 6 FAMILY: V4
[compiler-2:08780] [[42202,0],0] oob:tcp:init adding 10.128.0.4 to our list of V4 connections
[compiler-2:08780] WORKING INTERFACE 6 KERNEL INDEX 7 FAMILY: V4
[compiler-2:08780] [[42202,0],0] oob:tcp:init adding 93.180.7.38 to our list of V4 connections
[compiler-2:08780] [[42202,0],0] TCP STARTUP
[compiler-2:08780] [[42202,0],0] attempting to bind to IPv4 port 0
[compiler-2:08780] [[42202,0],0] assigned IPv4 port 38420
[compiler-2:08780] mca:oob:select: Adding component to end
[compiler-2:08780] mca:oob:select: Found 1 active transports
[compiler-2:08780] mca: base: components_register: registering rml components
[compiler-2:08780] mca: base: components_register: found loaded component oob
[compiler-2:08780] mca: base: components_register: component oob has no register or open function
[compiler-2:08780] mca: base: components_open: opening rml components
[compiler-2:08780] mca: base: components_open: found loaded component oob
[compiler-2:08780] mca: base: components_open: component oob open function successful
[compiler-2:08780] orte_rml_base_select: initializing rml component oob
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 30 for peer [[WILDCARD],WILDCARD]
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 15 for peer [[WILDCARD],WILDCARD]
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 32 for peer [[WILDCARD],WILDCARD]
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 33 for peer [[WILDCARD],WILDCARD]
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 5 for peer [[WILDCARD],WILDCARD]
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 10 for peer [[WILDCARD],WILDCARD]
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 12 for peer [[WILDCARD],WILDCARD]
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 9 for peer [[WILDCARD],WILDCARD]
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 34 for peer [[WILDCARD],WILDCARD]
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 2 for peer [[WILDCARD],WILDCARD]
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 21 for peer [[WILDCARD],WILDCARD]
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 22 for peer [[WILDCARD],WILDCARD]
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 45 for peer [[WILDCARD],WILDCARD]
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 46 for peer [[WILDCARD],WILDCARD]
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 1 for peer [[WILDCARD],WILDCARD]
[compiler-2:08780] [[42202,0],0] posting recv
[compiler-2:08780] [[42202,0],0] posting persistent recv on tag 27 for peer [[WILDCARD],WILDCARD]
Daemon was launched on node1-130-08 - beginning to initialize
Daemon was launched on node1-130-03 - beginning to initialize
Daemon was launched on node1-130-05 - beginning to initialize
Daemon was launched on node1-130-02 - beginning to initialize
Daemon was launched on node1-130-01 - beginning to initialize
Daemon was launched on node1-130-04 - beginning to initialize
Daemon was launched on node1-130-07 - beginning to initialize
Daemon was launched on node1-130-06 - beginning to initialize
Daemon [[42202,0],3] checking in as pid 7178 on host node1-130-03
[node1-130-03:07178] [[42202,0],3] orted: up and running - waiting for commands!
Daemon [[42202,0],2] checking in as pid 13581 on host node1-130-02
[node1-130-02:13581] [[42202,0],2] orted: up and running - waiting for commands!
Daemon [[42202,0],1] checking in as pid 17220 on host node1-130-01
[node1-130-01:17220] [[42202,0],1] orted: up and running - waiting for commands!
Daemon [[42202,0],5] checking in as pid 6663 on host node1-130-05
[node1-130-05:06663] [[42202,0],5] orted: up and running - waiting for commands!
Daemon [[42202,0],8] checking in as pid 6683 on host node1-130-08
[node1-130-08:06683] [[42202,0],8] orted: up and running - waiting for commands!
Daemon [[42202,0],7] checking in as pid 7877 on host node1-130-07
[node1-130-07:07877] [[42202,0],7] orted: up and running - waiting for commands!
Daemon [[42202,0],4] checking in as pid 7735 on host node1-130-04
[node1-130-04:07735] [[42202,0],4] orted: up and running - waiting for commands!
Daemon [[42202,0],6] checking in as pid 8451 on host node1-130-06
[node1-130-06:08451] [[42202,0],6] orted: up and running - waiting for commands!
srun: error: node1-130-03: task 2: Exited with exit code 1
srun: Terminating job step 657040.1
srun: error: node1-130-02: task 1: Exited with exit code 1
slurmd[node1-130-04]: *** STEP 657040.1 KILLED AT 2014-08-12T12:59:07 WITH SIGNAL 9 ***
slurmd[node1-130-07]: *** STEP 657040.1 KILLED AT 2014-08-12T12:59:07 WITH SIGNAL 9 ***
slurmd[node1-130-06]: *** STEP 657040.1 KILLED AT 2014-08-12T12:59:07 WITH SIGNAL 9 ***
srun: Job step aborted: Waiting up to 2 seconds for job step to finish.
srun: error: node1-130-01: task 0: Exited with exit code 1
srun: error: node1-130-05: task 4: Exited with exit code 1
srun: error: node1-130-08: task 7: Exited with exit code 1
srun: error: node1-130-07: task 6: Exited with exit code 1
srun: error: node1-130-04: task 3: Killed
srun: error: node1-130-06: task 5: Killed
--------------------------------------------------------------------------
An ORTE daemon has unexpectedly failed after launch and before
communicating back to mpirun. This could be caused by a number
of factors, including an inability to create a connection back
to mpirun due to a lack of common network interfaces and/or no
route found between them. Please check network connectivity
(including firewalls and network routing requirements).
--------------------------------------------------------------------------
[compiler-2:08780] [[42202,0],0] orted_cmd: received halt_vm cmd
[compiler-2:08780] mca: base: close: component oob closed
[compiler-2:08780] mca: base: close: unloading component oob
[compiler-2:08780] [[42202,0],0] TCP SHUTDOWN
[compiler-2:08780] mca: base: close: component tcp closed
[compiler-2:08780] mca: base: close: unloading component tcp
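
Since the error message points at a possible lack of common network interfaces, here is a rough sketch for pulling the addresses that oob:tcp registered out of output like the above, so they can be compared with what the compute nodes can actually reach. The function name and the sample text are illustrative only and are not part of Open MPI; the pattern assumes the "oob:tcp:init adding ... to our list" wording seen in this log.

```python
import re

# Matches "oob:tcp:init adding <addr> to our list ..." lines from the verbose
# output. Whitespace is matched loosely because mail clients may wrap long
# log lines in the middle of the message.
ADDING_RE = re.compile(
    r"oob:tcp:init\s+adding\s+(\d{1,3}(?:\.\d{1,3}){3})\s+to\s+our\s+list"
)


def advertised_addresses(log_text):
    """Return the IPv4 addresses in order of first appearance, no duplicates."""
    seen = []
    for addr in ADDING_RE.findall(log_text):
        if addr not in seen:
            seen.append(addr)
    return seen


if __name__ == "__main__":
    # Two lines copied from the output above, including the mail wrapping.
    sample = (
        "[compiler-2:08780] [[42202,0],0] oob:tcp:init adding 10.0.251.53 to our list of\n"
        "V4 connections\n"
        "[compiler-2:08780] [[42202,0],0] oob:tcp:init adding 10.0.0.4 to our list of V4\n"
        "connections\n"
    )
    for addr in advertised_addresses(sample):
        print(addr)
```

Feeding it the full output saved to a file would list every address mpirun offered to the daemons on the compute nodes.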
