Try running it with "--mca oob_base_verbose 100" on both client and server. It will tell us why the connection was refused.
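While you collect that output, it may also be worth checking from linux15 whether the TCP addresses advertised in the port string are reachable at all. Below is a rough sketch of such a check, not an Open MPI tool. It assumes only that the port string contains `tcp://host:port` substrings, as in your example; the rest of the string is internal to Open MPI and is ignored. The function names `tcp_endpoints` and `can_connect` are made up for this illustration.

```python
import re
import socket
import sys


def tcp_endpoints(port_name):
    """Return (host, port) pairs for each tcp://host:port substring found."""
    # Assumption: IPv4 dotted-quad hosts, as in the port string from the post.
    pairs = re.findall(r"tcp://([0-9.]+):(\d+)", port_name)
    return [(host, int(port)) for host, port in pairs]


def can_connect(host, port, timeout=3.0):
    """Attempt a plain TCP connect; True if the connection is established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def main(port_name):
    for host, port in tcp_endpoints(port_name):
        state = "reachable" if can_connect(host, port) else "NOT reachable"
        print(f"{host}:{port} is {state}")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the quoted port string printed by the server as the only argument.
    main(sys.argv[1])
```

Run it on the client machine with the server's port string as the single quoted argument while the server is still waiting in MPI_Comm_accept(). A successful connect only shows that the address and port are reachable over plain TCP; it says nothing about whether the MPI-level handshake would succeed.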
> On Jul 13, 2015, at 2:14 PM, Audet, Martin <martin.au...@cnrc-nrc.gc.ca> wrote:
>
> Hi OMPI_Developers,
>
> It seems that I am unable to establish an MPI communication between two
> independently started MPI programs using the simplest client/server call
> sequence I can imagine (see the two attached files) when the client and
> server processes are started on different machines. Note that I have no
> problems when the client and server programs run on the same machine.
>
> For example, if I do the following on the server machine (running on fn1):
>
> [audet@fn1 mpi]$ mpicc -Wall simpleserver.c -o simpleserver
> [audet@fn1 mpi]$ mpiexec -n 1 ./simpleserver
> Server port = '3054370816.0;tcp://172.17.15.20:54458+3054370817.0;tcp://172.17.15.20:58943:300'
>
> The server prints its port (created with MPI_Open_port()) and waits for a
> connection by calling MPI_Comm_accept().
>
> Now on the client machine (running on linux15), if I compile the client and
> run it with the above port address on the command line, I get:
>
> [audet@linux15 mpi]$ mpicc -Wall simpleclient.c -o simpleclient
> [audet@linux15 mpi]$ mpiexec -n 1 ./simpleclient '3054370816.0;tcp://172.17.15.20:54458+3054370817.0;tcp://172.17.15.20:58943:300'
> trying to connect...
> ------------------------------------------------------------
> A process or daemon was unable to complete a TCP connection
> to another process:
> Local host: linux15
> Remote host: linux15
> This is usually caused by a firewall on the remote host. Please
> check that any firewall (e.g., iptables) has been disabled and
> try again.
> ------------------------------------------------------------
> [linux15:24193] [[13075,0],0]-[[46606,0],0] mca_oob_tcp_peer_send_handler: invalid connection state (6) on socket 16
>
> And then I have to stop the client program by pressing ^C (and also the
> server, which doesn't seem affected).
>
> What's wrong?
>
> And I am almost sure there is no firewall running on linux15.
>
> It is not the first MPI client/server application I am developing (with both
> OpenMPI and mpich).
> These simple MPI client/server programs work well with mpich (version 3.1.3).
>
> This problem happens with both OpenMPI 1.8.3 and 1.8.6.
>
> linux15 and fn1 both run Fedora Core 12 Linux (64 bits) and are connected
> by Gigabit Ethernet (the normal network).
>
> And again, if the client and server run on the same machine (either fn1 or
> linux15), no such problem happens.
>
> Thanks in advance,
>
> Martin Audet
>
> <simpleserver.c><simpleclient.c>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: http://www.open-mpi.org/community/lists/users/2015/07/27271.php