With the standard TCP/IP stack it hung with an unknown but large(ish) quantity of data. When I ran just one Bcast it was fine, but Bcasts run in lots, in separate MPI_COMM_WORLDs, hung. All the details are in some recent posts.
I could not figure it out and moved back to my PVM solution.

--- On Wed, 25/8/10, Rahul Nabar <rpna...@gmail.com> wrote:

From: Rahul Nabar <rpna...@gmail.com>
Subject: Re: [OMPI users] IMB-MPI broadcast test stalls for large core counts: debug ideas?
To: "Open MPI Users" <us...@open-mpi.org>
Received: Wednesday, 25 August, 2010, 3:38 AM

On Mon, Aug 23, 2010 at 8:39 PM, Randolph Pullen
<randolph_pul...@yahoo.com.au> wrote:
>
> I have had a similar load related problem with Bcast.

Thanks Randolph! That's interesting to know! What was the hardware you
were using? Does your bcast fail at the exact same point too?

>
> I don't know what caused it though. With this one, what about the
> possibility of a buffer overrun or network saturation?

How can I test for a buffer overrun? For network saturation I guess I
could use something like mrtg to monitor the bandwidth used. On the
other hand, all 32 servers are connected to a single dedicated
Nexus5000. The back-plane carries no other traffic. Hence I am
skeptical that just 41943040 saturated what Cisco rates as a 10GigE
fabric. But I might be wrong.

--
Rahul

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
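A quick back-of-the-envelope check of the saturation argument above. This is only a sketch under stated assumptions: it treats 41943040 as a payload size in bytes (40 MiB, the kind of message size IMB-MPI1 sweeps up to), takes the 10GigE line rate at face value, and uses an illustrative 90% efficiency figure for TCP/IP and Ethernet framing overhead; none of these numbers come from the thread itself.

```python
link_rate_bps = 10e9          # 10GigE nominal line rate, bits/second
payload_bytes = 41943040      # assumed: the 41943040 from the thread, in bytes (40 MiB)
efficiency = 0.9              # illustrative allowance for protocol/framing overhead

# Time for one such message to cross a single 10GigE link.
wire_time_s = payload_bytes * 8 / (link_rate_bps * efficiency)
print(f"one 40 MiB transfer occupies a link for ~{wire_time_s * 1e3:.0f} ms")
```

On these assumptions a single 40 MiB message holds a link for only a few tens of milliseconds, which supports the skepticism in the message: one broadcast payload alone is unlikely to saturate a dedicated 10GigE fabric, so sustained aggregate load from many concurrent Bcasts (or a buffering problem) would be the more plausible culprit.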