On Jun 11, 2012, at 11:15 AM, BOUVIER Benjamin wrote:

> Thanks for your hints Jeff.
> I've just tried without any firewalls on involved machines, but the issue
> remains.
>
> # /etc/init.d/ip6tables status
> ip6tables: Firewall is not running.
> # /etc/init.d/iptables status
> iptables: Firewall is not running.

Ok.

> The machines have the host names "node1", "node2" and "node3".
> I launch the basic program on one machine, asking node1 and node2 to be
> hosts. Typing `netstat -a | grep node1` from node2 shows me that node1 and
> node2 are connected by tcp, as the connection is marked as ESTABLISHED. I
> have the same thing when I do `netstat -a | grep node2` from node1. However,
> the program keeps blocking.

I'm not entirely clear which combinations are working and which are not. Can you specify which ones are working? You might want to try the ring_c.c program in the OMPI examples/ directory -- it's a trivial "send a message around in a ring" program that will scale up to >=2 processes.

- on node1, "mpirun --host node1,node2 ring_c"
- on node1, "mpirun --host node1,node3 ring_c"
- on node1, "mpirun --host node2,node3 ring_c"
- on node1, "mpirun --host node1,node2,node3 ring_c"

Repeat all 4 from node2.

> What else could provoke that failure ?
> --
> Benjamin BOUVIER
>
> ________________________________________
> To start, I would ensure that all firewalling (e.g., iptables) is disabled
> on all machines involved.
>
> On Jun 11, 2012, at 10:16 AM, BOUVIER Benjamin wrote:
>
>> Hi,
>>
>>> I'd guess that running net pipe with 3 procs may be undefined.
>>
>> It is indeed undefined. Running the net pipe program locally with 3
>> processors blocks, on my computer.
>>
>> This issue is especially weird as there is no problem for running the
>> example program on network with MPICH2 implementation, for 2 processes.
>>
>> However, with MPICH2, it fails with 3 processes and blocks also on connect
>> ("Connection refused"), which could indicate that it's actually a network
>> issue, with both MPICH2 and OMPI.
>> I don't know how many connections OMPI uses
>> to send the data in the example program, but with the assumption that it
>> tries to open 2 connections (while for the same program, MPICH2 only uses
>> one connection, which is another hypothesis), maybe the number of
>> connections is the right place to look. I'll ask MPICH2 users on their
>> mailing list, so as to get their opinion about it.
>>
>> Now that I know the program doesn't work with either the OMPI or the MPICH2
>> implementation, I guess it's not dependent on the MPI implementation.
>>
>> If you have any ideas or comments, I would be pleased to hear them.
>>
>> --
>> Benjamin Bouvier
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
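
The four host combinations suggested in the reply can be run as a small loop rather than typed by hand. The sketch below is not part of the original message. It assumes ring_c has already been built from the Open MPI examples/ directory and can be found the same way on every node; the function name run_ring_tests and the RUN switch are made up for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the four ring_c runs from the list in the reply and
# executes them when RUN=1 is set (RUN is a made-up switch, not an Open MPI option).

run_ring_tests() {
    for hosts in node1,node2 node1,node3 node2,node3 node1,node2,node3; do
        echo "mpirun --host $hosts ring_c"
        if [ "${RUN:-0}" = "1" ]; then
            # stdin comes from /dev/null so the run does not read from the
            # terminal; a run that blocks still has to be interrupted by hand.
            if mpirun --host "$hosts" ring_c </dev/null; then
                echo "  ok: $hosts"
            else
                echo "  FAILED: $hosts"
            fi
        fi
    done
}

run_ring_tests
```

Running it once on node1 and once on node2 with RUN=1, and noting which combinations complete and which block, gives the same information as the eight manual runs.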