I'm testing a couple of applications with OpenMPI v1.2b on more than 1000
processors, and am getting TCP errors. These apps ran fine on fewer
processors.

The errors can be different for different runs. Here's one:

[blade90][0,1,223][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=113
[blade82][0,1,203][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=113

I've also appended the output from a second type of error, seen on another
trial run.

I only have a single interface, and understand that I'm pushing the capacity
of a single GigE link. But I'd like to know what these errors signify.
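
For reference, the numeric errno values in these messages can be mapped to
symbolic names on the node where they occurred. What follows is only a
minimal sketch, assuming Python is available on a Linux compute node; the
helper name describe_errno is hypothetical, and errno numbering is
platform-specific, so it should be run on the same kind of machine that
produced the log. On Linux, 113 is normally EHOSTUNREACH ("No route to
host") and 104 is normally ECONNRESET ("Connection reset by peer").

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno on this platform."""
    # errno.errorcode maps numbers to names such as "ECONNRESET";
    # unknown numbers fall back to a placeholder instead of raising.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # The two values that appear in the output quoted in this message.
    for code in (113, 104):
        name, message = describe_errno(code)
        print(f"errno={code}: {name} ({message})")
```

This only names the errors; it does not by itself say why the connections
were refused or reset at this process count.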

Thanks,

Todd

-----



[blade6][0,1,10][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade309:12625] mca_btl_tcp_frag_send: writev failed with errno=104
[blade309:12625] mca_btl_tcp_frag_send: writev failed with errno=104
[blade5][0,1,9][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade134:12179] mca_btl_tcp_frag_send: writev failed with errno=104
[blade3][0,1,4][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade484][0,1,1060][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade146][0,1,400][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking]
[blade157][0,1,444][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking]
[blade212][0,1,532][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade134:12182] mca_btl_tcp_frag_send: writev failed with errno=104
recv() failed with errno=104
recv() failed with errno=104
[blade146][0,1,402][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade157][0,1,446][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade4][0,1,6][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade485][0,1,1062][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade214][0,1,534][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade146][0,1,403][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking]
[blade4][0,1,7][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade486][0,1,1063][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade157][0,1,447][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking]
[blade215][0,1,535][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
recv() failed with errno=104
recv() failed with errno=104
[blade146][0,1,401][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade157][0,1,445][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade3][0,1,5][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade485][0,1,1061][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade213][0,1,533][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade62][0,1,124][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] [blade71:12423] mca_btl_tcp_frag_send: writev failed with errno=104
[blade132][0,1,344][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking]
[blade389][0,1,872][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
recv() failed with errno=104
[blade132][0,1,347][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade390][0,1,873][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
recv() failed with errno=104
[blade62][0,1,125][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade62][0,1,127][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade72:12411] mca_btl_tcp_frag_send: writev failed with errno=104
[blade132][0,1,345][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade391][0,1,875][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade390][0,1,874][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade62][0,1,126][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
[blade132][0,1,346][../../../../../ompi/mca/btl/tcp/btl_tcp_endpoint.c:415:mca_btl_tcp_endpoint_recv_blocking] recv() failed with errno=104
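
Because the failures differ from run to run, it may help to summarize which
hosts and which errno values show up in a captured log. The sketch below is
one possible way to do that, not part of Open MPI: the function names
tally_errnos and tally_hosts are hypothetical, and the patterns assume the
"[host]" / "[host:pid]" tags and "errno=N" tokens look the way they do in
the output above.

```python
import re
from collections import Counter


def tally_errnos(log_text):
    """Count how often each errno value appears in the log text."""
    return Counter(int(value) for value in re.findall(r"errno=(\d+)", log_text))


def tally_hosts(log_text):
    """Count host names that appear in [host] or [host:pid] tags."""
    # A host tag starts with a letter, so "[0,1,223]" and "[../../...]" are skipped.
    return Counter(re.findall(r"\[([A-Za-z][\w.-]*?)(?::\d+)?\]", log_text))


if __name__ == "__main__":
    # Two shortened lines in the style of the output above (path abbreviated).
    sample = (
        "[blade90][0,1,223][../btl_tcp_endpoint.c:572:fn] connect() failed with errno=113\n"
        "[blade309:12625] mca_btl_tcp_frag_send: writev failed with errno=104\n"
    )
    print(tally_errnos(sample))
    print(tally_hosts(sample))
```

Saving mpirun's stderr to a file and passing its contents to these functions
would give a per-host and per-errno count for each trial run.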

