Standard TCP/IP stack.
It hung with an unknown but large(ish) quantity of data. When I ran just one
Bcast it was fine, but multiple concurrent Bcasts in separate MPI_WORLDs hung.
All the details are in my recent posts.

I could not figure it out and moved back to my PVM solution.


--- On Wed, 25/8/10, Rahul Nabar <rpna...@gmail.com> wrote:

From: Rahul Nabar <rpna...@gmail.com>
Subject: Re: [OMPI users] IMB-MPI broadcast test stalls for large core counts: 
debug ideas?
To: "Open MPI Users" <us...@open-mpi.org>
Received: Wednesday, 25 August, 2010, 3:38 AM

On Mon, Aug 23, 2010 at 8:39 PM, Randolph Pullen
<randolph_pul...@yahoo.com.au> wrote:
>
> I have had a similar load related problem with Bcast.

Thanks Randolph! That's interesting to know! What was the hardware you
were using? Does your bcast fail at the exact same point too?

>
> I don't know what caused it though.  With this one, what about the 
> possibility of a buffer overrun or network saturation?

How can I test for a buffer overrun?

For network saturation I guess I could use something like mrtg to
monitor the bandwidth used. On the other hand, all 32 servers are
connected to a single dedicated Nexus5000, and the back-plane carries no
other traffic. Hence I am skeptical that just 41943040 bytes (40 MiB)
saturated what Cisco rates as a 10GigE fabric. But I might be wrong.
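[Editor's note: a lightweight alternative to mrtg for spot-checking link utilization is to sample the kernel's per-interface byte counters directly. This is a minimal sketch only, assuming a Linux host where /proc/net/dev is available; the sampling interval and the reporting format are illustrative, not part of the original discussion.]

```python
import time

def iface_bytes(path="/proc/net/dev"):
    """Parse /proc/net/dev into {interface: (rx_bytes, tx_bytes)}."""
    counters = {}
    with open(path) as f:
        for line in f.readlines()[2:]:  # skip the two header lines
            iface, data = line.split(":", 1)
            fields = data.split()
            # Field 0 is RX bytes, field 8 is TX bytes (see proc(5)).
            counters[iface.strip()] = (int(fields[0]), int(fields[8]))
    return counters

def throughput(interval=1.0):
    """Sample the counters twice and return {interface: (rx_Bps, tx_Bps)}."""
    before = iface_bytes()
    time.sleep(interval)
    after = iface_bytes()
    return {i: ((after[i][0] - before[i][0]) / interval,
                (after[i][1] - before[i][1]) / interval)
            for i in before if i in after}

if __name__ == "__main__":
    for iface, (rx, tx) in sorted(throughput().items()):
        print(f"{iface}: rx {rx:.0f} B/s, tx {tx:.0f} B/s")
```

Running this on each node while the IMB broadcast test is in flight would show whether any link actually approaches the 10GigE line rate, or whether the stall happens at far lower utilization.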

-- 
Rahul

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
