Hello all,

I am trying to use the approach explained in https://stackoverflow.com/questions/15007164/can-mpi-publish-name-be-used-for-two-separately-started-applications/15008715#15008715 but when I start the master and slave instances on different machines I get the following message:
--------------------------------------------------------------------------
WARNING: Open MPI accepted a TCP connection from what appears to be a
another Open MPI process but cannot find a corresponding process
entry for that peer.

This attempted connection will be ignored; your MPI job may or may not
continue properly.

  Local host: centos64
  PID:        96652
--------------------------------------------------------------------------

This happens for the first remote slave and, after this warning, the second remote slave hangs. Everything runs smoothly if I run all instances on the same host.

Can anybody give me a hint on what to check?

I am using openmpi-v4.0.x-201905010241-888d014, and the mpirun commands are as follows.

On host centos64:

mpirun -H centos64 --ompi-server `cat /tmp/server.uri` \
    -np 1 /home/erico/master &
sleep 2
mpirun -H centos64 --ompi-server `cat /tmp/server.uri` \
    -np 1 /home/erico/slave -i 1 &
sleep 1
mpirun -H centos64 --ompi-server `cat /tmp/server.uri` \
    -np 1 /home/erico/slave -i 2 &
sleep 1

On host centos64-cl:

mpirun -H centos64-cl -oversubscribe --ompi-server `cat /tmp/server.uri` -np 1 /home/erico/slave -i 3 &
mpirun -H centos64-cl -oversubscribe --ompi-server `cat /tmp/server.uri` -np 1 /home/erico/slave -i 4 &

Thanks in advance!!

Erico Silva