OK. I figured out that the problem was the wrong number of arguments being passed to the code.

Thanks,
Prakash

Jeff Squyres (jsquyres) wrote:
I'm assuming that this is during the startup shortly after mpirun,
right?  (i.e., during MPI_INIT)

It looks like MPI processes were unable to connect back to the
rendezvous point (mpirun) during startup.  Do you have any firewalls or
port blocking running in your cluster?
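
One rough way to check for port blocking between nodes is a plain TCP connection test run from a compute node toward the node where mpirun is started. The following is only a sketch and is not part of Open MPI: the helper name `can_connect`, the host name, and the port number are hypothetical placeholders, and the port that mpirun actually listens on has to be determined separately.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration; it only tests basic TCP
    reachability and knows nothing about Open MPI itself.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder values: replace with the mpirun host and a port to test.
    print(can_connect("localhost", 22))
```

A result of False from a compute node, combined with True from the head node itself, would be consistent with a firewall between the nodes, although it does not prove it.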
-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Prakash Velayutham
Sent: Friday, April 14, 2006 11:00 AM
To: us...@open-mpi.org
Cc: Prakash Velayutham
Subject: [OMPI users] Open MPI error

Hi All,

What does this error mean?

******************************************************************************
socket 10: [wins02:19102] [0,0,3]-[0,0,0] mca_oob_tcp_msg_recv: readv failed with errno=104
socket 12: [wins01:19281] [0,0,4]-[0,0,0] mca_oob_tcp_msg_recv: readv failed with errno=104
socket 6: [wins05:00939] [0,0,1]-[0,0,0] mca_oob_tcp_msg_send_handler: writev failed with errno=104
socket 6: [wins05:00939] [0,0,1] ORTE_ERROR_LOG: Communication failure in file gpr_proxy_put_get.c at line 143
socket 6: [wins05:00939] [0,0,1]-[0,0,0] mca_oob_tcp_peer_complete_connect: connection failed (errno=111) - retrying (pid=939)
socket 6: [wins05:00939] mca_oob_tcp_peer_timer_handler
socket 6: [wins05:00939] [0,0,1]-[0,0,0] mca_oob_tcp_peer_complete_connect: connection failed (errno=111) - retrying (pid=939)
socket 6: [wins05:00939] mca_oob_tcp_peer_timer_handler
socket 6: [wins05:00939] [0,0,1]-[0,0,0] mca_oob_tcp_peer_complete_connect: connection failed (errno=111) - retrying (pid=939)
******************************************************************************
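
The errno values in the output above can be translated into symbolic names. On Linux, 104 is ECONNRESET ("Connection reset by peer") and 111 is ECONNREFUSED ("Connection refused"); the numbers differ on other platforms. The sketch below pulls the `errno=N` values out of a pasted log and looks them up on the machine where it runs. The function names `errnos_in_log` and `describe` are made up for this illustration.

```python
import errno
import os
import re


def errnos_in_log(text):
    """Return the distinct errno numbers written as 'errno=N', in first-seen order."""
    seen = []
    for match in re.finditer(r"errno=(\d+)", text):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


def describe(number):
    """Format an errno number with its symbolic name and message on this platform."""
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{number} {name}: {os.strerror(number)}"


if __name__ == "__main__":
    sample = (
        "mca_oob_tcp_msg_recv: readv failed with errno=104\n"
        "mca_oob_tcp_peer_complete_connect: connection failed (errno=111) - retrying\n"
    )
    for number in errnos_in_log(sample):
        print(describe(number))
```

The lookup should be run on the same operating system that produced the log, since the mapping from number to name is platform specific.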

I am still debugging the code I am working on, but just wanted to get
some insight into where I should be looking.

I am running openmpi-1.0.1.

Thanks,
Prakash
