Re: [OMPI users] writev error: Bad address

Ross Boylan Wed, 5 Feb 2014 21:58:26 -0500 (EST)

On 1/31/2014 1:08 PM, Ross Boylan wrote:

I am getting the following error, amidst many successful message sends:
[n10][[50048,1],1][../../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:118:mca_btl_tcp_frag_send]
 mca_btl_tcp_frag_send: writev error (0x7f6155970038, 578659815)
         Bad address(1)

I think I've tracked down the immediate cause: I was sending a very large object (from R--I assume serialized into a byte stream) that was over 3G. I'm not sure why it would produce that error, but it doesn't seem that surprising that something would go wrong.
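
One plausible mechanism, offered as an assumption rather than a confirmed diagnosis for Open MPI 1.4.2: the MPI C interface takes message counts as a C int, so a single byte-typed message cannot describe more than 2^31 - 1 bytes, and an object over 3G exceeds that. A common workaround is to split the serialized payload into pieces below the limit and send them one at a time. The sketch below only computes the piece boundaries; chunk_ranges and INT_MAX are illustrative names and are not part of Rmpi or Open MPI.

```python
# Largest value of a 32-bit signed C int, the type MPI uses for counts.
INT_MAX = 2**31 - 1


def chunk_ranges(total_bytes, max_chunk=INT_MAX):
    """Return (offset, length) pairs that cover total_bytes in order.

    Every length is at most max_chunk, so each piece could be
    described by a single int-sized count. Hypothetical helper.
    """
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")

    ranges = []
    offset = 0
    while offset < total_bytes:
        # Take as much as allowed, or whatever is left at the end.
        length = min(max_chunk, total_bytes - offset)
        ranges.append((offset, length))
        offset += length
    return ranges
```

Calling chunk_ranges(3 * 1024**3) yields two pieces, so a 3G payload would go out as two sends instead of one. In practice a smaller max_chunk (hundreds of megabytes, say) may be kinder to memory on both ends.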


Ross

Any ideas about what is going on or what I can do to fix it?
I am using the openmpi-bin 1.4.2-4 Debian package on a cluster running Debian squeeze.
I couldn't find a config.log file; there is /etc/openmpi/openmpi-mca-params.conf, which is completely commented out.
Invocation is from R 3.0.1 (Debian package) with Rmpi 0.6.3 built by me from source in a local directory. My sends all use mpi.isend.Robj and the receives use mpi.recv.Robj, both from the Rmpi library.
The jobs were started with rmpilaunch; it and the hosts file are included in the attachments. The connections are TCP. rmpilaunch leaves me in an R session on the master. I invoked the code inside the toplevel() function toward the bottom of dbox-master.R.
The program source files and other background information are in the attached file. n10 has the output of ompi_info --all, and n1011 has other info for both nodes that were active (n10 was master; n11 had some slaves).


_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
