Hi Jeff,

Thanks for the reply. FYI, since I originally posted this, I uninstalled
Open MPI 3.0.1 and installed 3.1.0, but I'm still experiencing the same
problem.

When I run the command without the `--mca plm_base_verbose 100` flag, it
hangs indefinitely with no output.

As far as I can tell, these are the additional processes running on each
machine while mpirun is hanging (printed using `ps -aux | less`):

On executing host b09-30:

user     361714  0.4  0.0 293016  8444 pts/0    Sl+  15:10   0:00 mpirun --host b09-30,b09-32 hostname
user     361719  0.0  0.0  37092  5112 pts/0    T    15:10   0:00 /usr/bin/ssh -x b09-32  orted -mca ess "env" -mca ess_base_jobid "638517248" -mca ess_base_vpid 1 -mca ess_base_num_procs "2" -mca orte_node_regex "b[2:9]-30,b[2:9]-32@0(2)" -mca orte_hnp_uri "638517248.0;tcp://169.228.66.102,10.1.100.30:55090" -mca plm "rsh" -mca pmix "^s1,s2,cray,isolated"

On remote host b09-32:

root     175273  0.0  0.0  61752  5824 ?        Ss   15:10   0:00 sshd: [accepted]
sshd     175274  0.0  0.0  61752   708 ?        S    15:10   0:00 sshd: [net]

I only see orted in the arguments of the ssh command on b09-30; it does
not show up as a running process on b09-32. Any ideas what I should try
next?

Thanks,
Max
