Marcin,
You can also try excluding the public subnet(s) (e.g. 1.2.3.0/24) and the
loopback interface, instead of including em4, which does not exist on the
compute nodes. Or you can include only the private subnet(s) that are common
to the frontend and compute nodes.
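
As a rough sketch of the two options above (not a tested recipe for your
cluster): the 10.1.0.0/16 private subnet and the ./a.out binary are made-up
placeholders, and 1.2.3.0/24 is just the example public subnet. The script
only builds and prints the command lines, so it is safe to run anywhere.
Note that the include and exclude forms of a parameter cannot be combined.

```shell
#!/bin/sh
# Placeholder values: substitute the subnets of your own cluster.
PUBLIC_SUBNET="1.2.3.0/24"     # example public subnet
PRIVATE_SUBNET="10.1.0.0/16"   # assumed private subnet, for illustration only
LOOPBACK="127.0.0.0/8"

# Option 1: exclude the public subnet and the loopback interface,
# for both the TCP BTL (MPI traffic) and the OOB (daemon wire-up).
EXCLUDE_LIST="${PUBLIC_SUBNET},${LOOPBACK}"
EXCLUDE_ARGS="--mca btl_tcp_if_exclude ${EXCLUDE_LIST} --mca oob_tcp_if_exclude ${EXCLUDE_LIST}"

# Option 2: include only the private subnet shared by all nodes.
INCLUDE_ARGS="--mca btl_tcp_if_include ${PRIVATE_SUBNET} --mca oob_tcp_if_include ${PRIVATE_SUBNET}"

# Print the resulting command lines instead of launching anything.
echo "mpirun ${EXCLUDE_ARGS} ./a.out"
echo "mpirun ${INCLUDE_ARGS} ./a.out"
```

Using subnets in CIDR notation rather than interface names avoids depending
on a name such as em4 being present on every node.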
Cheers,
Gilles
On Saturday, September 24, 20
For reference, the issue can be tracked at:
https://github.com/open-mpi/ompi/issues/2116
-Nathan
--
Nathaniel Graham
HPC-DES
Los Alamos National Laboratory
From: users on behalf of Gundram Leifert
Sent: Tuesday, September 20, 2016 2:13 AM
To: users@lists.ope
The WinName test was failing because MPI was never finalized. The window was
also not being freed. I have fixed that test and pushed the changes to the
ompi-java-test repo.
I was not seeing failures with 2 processes for any of the tests except for
WinName, but I did have quite a few fail oc
Thanks for a quick answer, Ralph!
This does not work, because em4 is only defined on the frontend node.
Now I get errors from the computes:
[compute-1-4.local:12206] found interface lo
[compute-1-4.local:12206] found interface em1
[compute-1-4.local:12206] mca: base: components_open: component
This isn’t an issue with the SLURM integration - this is a problem with our
OOB not picking the right subnet for connecting back to mpirun. In this
specific case, you probably want
-mca btl_tcp_if_include em4 -mca oob_tcp_if_include em4
since it is the em4 network that ties the comput
Hi,
I have stumbled upon a similar issue, so I wonder whether they might be
related. On one of our systems I get the following error message, both
when using openmpi 1.8.8 and 1.10.4:
$ mpirun -debug-daemons --mca btl tcp,self --mca mca_base_verbose 100
--mca btl_base_verbose 100 ls
[...]
[compute