Have you tried upgrading to a newer version of Open MPI?  The 1.4.x series is 
several generations old.  Open MPI 1.7.4 was just released yesterday.
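
On the size issue in the quoted follow-up below: MPI message counts are C
ints, so a single message cannot describe more than 2^31 - 1 elements, and a
serialized object over 3G exceeds that if it goes out as one byte buffer.  I
have not confirmed that this is what triggers the writev error, but splitting
the byte stream into smaller pieces before sending is a common workaround.
Here is a minimal sketch of the splitting step in plain Python.  The names
MAX_CHUNK, split_into_chunks, and reassemble are made up for illustration;
they are not part of Rmpi or Open MPI.

```python
# Hypothetical helpers for illustration only; not an Rmpi or Open MPI API.

# Assumed per-message limit: the largest value a signed 32-bit C int can hold.
MAX_CHUNK = 2**31 - 1


def split_into_chunks(payload, chunk_size=MAX_CHUNK):
    """Yield consecutive slices of payload, each at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    view = memoryview(payload)  # slicing a memoryview does not copy the data
    for start in range(0, len(view), chunk_size):
        yield view[start:start + chunk_size]


def reassemble(chunks):
    """Concatenate chunks, in the order received, back into one bytes object."""
    return b"".join(chunks)
```

Each chunk would then go out as its own message, and the receiver would
concatenate the chunks in order before deserializing.  A much smaller
chunk_size than the default is probably the safer choice in practice.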


On Feb 5, 2014, at 9:58 PM, Ross Boylan <r...@biostat.ucsf.edu> wrote:

> On 1/31/2014 1:08 PM, Ross Boylan wrote:
>> I am getting the following error, amidst many successful message sends:
>> [n10][[50048,1],1][../../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:118:mca_btl_tcp_frag_send]
>>  mca_btl_tcp_frag_send: writev error (0x7f6155970038, 578659815)
>>         Bad address(1)
>> 
> I think I've tracked down the immediate cause: I was sending a very large 
> object (from R--I assume serialized into a byte stream) that was over 3G.  
> I'm not sure why it would produce that error, but it doesn't seem that 
> surprising that something would go wrong.
> 
> Ross
>> Any ideas about what is going on or what I can do to fix it?
>> 
>> I am using the openmpi-bin 1.4.2-4 Debian package on a cluster running 
>> Debian squeeze.
>> 
>> I couldn't find a config.log file; there is 
>> /etc/openmpi/openmpi-mca-params.conf, which is completely commented out.
>> 
>> Invocation is from R 3.0.1 (debian package) with Rmpi 0.6.3 built by me from 
>> source in a local directory. My sends all use mpi.isend.Robj and the 
>> receives use mpi.recv.Robj, both from the Rmpi library.
>> 
>> The jobs were started with rmpilaunch; it and the hosts file are included in 
>> the attachments. TCP connections.  rmpilaunch leaves me in an R session on 
>> the master.  I invoked the code inside the toplevel() function toward the 
>> bottom of dbox-master.R. 
>> 
>> The program source files and other background information are in the attached 
>> file.    n10 has the output of ompi_info --all, and n1011 has other info for 
>> both nodes that were active (n10 was master; n11 had some slaves).
>> 
>> 
>> _______________________________________________
>> users mailing list
>> 
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
