Hi,

I have already seen this FAQ. The nodes in the cluster do not have multiple
IP addresses. One thing I forgot to mention is that the systems in the cluster
do not have static IPs; they get their IP addresses through DHCP.

Also, if there is a print statement (printf("hello world\n");) in the slave, it
is correctly printed on the master's console, but none of the MPI commands work.

regards,

Abhishek

I need to make that error string be google-able -- I'll add it to the
faq. :-)

The problem is likely that you have multiple IP addresses, some of
which are not routable to each other (but fail OMPI's routability
assumptions). Check out these FAQ entries:

http://www.open-mpi.org/faq/?category=tcp#tcp-routability
http://www.open-mpi.org/faq/?category=tcp#tcp-selection

Does this help?

On Apr 19, 2007, at 11:07 AM, Babu Bhai wrote:

I have migrated from LAM/MPI to Open MPI. I am not able to
execute simple MPI code in which the master sends an integer to the slave.
If I execute the code on a single machine, i.e. start 2 instances on the same
machine (mpirun -np 2 hello), it works fine.

If I execute it on the cluster using
mpirun --prefix /usr /local -np 2 --host 199.63.34.154,199.63.34.36 hello
it gives the following error:
"btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=113"

>I am using openmpi-1.2

>regards,
>Abhishek

--
Jeff Squyres
Cisco Systems
