Lydia,

This message means that the TCP driver tried to send an illegal buffer. It looks like the send failed while sending 16 bytes, which is pretty uncommon. What is this application? Do you get the same error message when running with fewer nodes?
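
For reference, "Bad address" is the text the system prints for errno EFAULT, which writev() reports when an iovec entry points at memory the process cannot read. Below is a minimal sketch that reproduces the same errno outside of MPI, assuming a Linux-like system where mmap() with MAP_ANONYMOUS is available. The helper name writev_errno_for_unreadable_buffer is hypothetical and is not part of Open MPI; it only illustrates the failure mode, not how the TCP BTL builds its fragments.

```c
#define _DEFAULT_SOURCE   /* assumption: needed for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not part of Open MPI).
 * Calls writev() on a local socket pair with one iovec entry whose base
 * points at a page with no access rights, and returns the errno that
 * writev() reported. Returns 0 if writev() unexpectedly succeeded and
 * -1 if the setup itself failed. len should not exceed one page. */
int writev_errno_for_unreadable_buffer(size_t len)
{
    int fds[2];
    long page = sysconf(_SC_PAGESIZE);

    assert(len > 0);
    if (page <= 0)
        return -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;

    /* Map one page with PROT_NONE so that the kernel cannot copy from it. */
    void *bad = mmap(NULL, (size_t)page, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (bad == MAP_FAILED) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    struct iovec iov;
    iov.iov_base = bad;   /* deliberately invalid source buffer */
    iov.iov_len  = len;   /* e.g. 16, as in the reported message */

    errno = 0;
    ssize_t rc = writev(fds[0], &iov, 1);
    int err = (rc < 0) ? errno : 0;

    munmap(bad, (size_t)page);
    close(fds[0]);
    close(fds[1]);
    return err;
}
```

Calling writev_errno_for_unreadable_buffer(16) and passing the result to strerror() should give the same "Bad address" text on such a system. In a real application the bad pointer would more likely come from a freed or corrupted buffer than from a deliberately protected page.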

  Thanks,
    George.

On Feb 2, 2008, at 4:37 AM, Lydia Heck wrote:


In one of our big runs (512 cpus) the code fails and produces on a list
of nodes the following type of error:

I have searched the FAQs but could not find an answer there.
There are difficulties getting the code to run because of its sheer size
but there is no other indication of the problem.

Does the following error message mean that some of the nodes have given up?


mca_btl_tcp_frag_send] mca_btl_tcp_frag_send: writev error
([361eca8[m2234][0,1,283][m2317, 16][0,)
       1Bad address,422(3)
][[
/ws/hpc-ct-7.1/builds/7.1.build-ct7.1-003c/ompi-ct7.1/ompi/mca/btl/ tcp/btl_tcp_frag.c:114:mca_btl_tcp
_frag_send]
/ws/hpc-ct-7.1/builds/7.1.build-ct7.1-003c/ompi-ct7.1/ompi/mca/btl/ tcp/btl_tcp_frag.c[m22 41][0,1,430][m2140[m2152][0,1,150][mca_btl_tcp_frag_send: writev error (3c759a8,
16)
       Bad address(3)


Lydia

------------------------------------------
Dr E L  Heck

University of Durham
Institute for Computational Cosmology
Ogden Centre
Department of Physics
South Road

DURHAM, DH1 3LE
United Kingdom

e-mail: lydia.h...@durham.ac.uk

Tel.: + 44 191 - 334 3628
Fax.: + 44 191 - 334 3645
___________________________________________
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

