Network saturation could produce arbitrarily long delays, but the total data 
load we are talking about is really small.  It is the responsibility of an 
MPI library to do one of the following:

1) Use a reliable message protocol for each message (e.g. InfiniBand RC or 
TCP/IP), or
2) Detect lost packets and retransmit them if the protocol is unreliable 
(e.g. InfiniBand UD or UDP/IP); a minimal sketch of this approach follows 
below.
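For what it's worth, here is a stop-and-wait sketch of option 2 in C over 
UDP.  Everything in it is invented for illustration: the peer at 127.0.0.1 
port 9999 is assumed to echo the raw sequence number back as an ACK, and a 
real transport would run a windowed protocol with many packets in flight, 
but the detect-and-retransmit idea is the same.

/* Stop-and-wait retransmit over UDP: send one sequence-numbered packet,
 * wait for an ACK, and retransmit on timeout.  The peer address and the
 * ACK format are assumptions made for this sketch. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    struct timeval tv = { 1, 0 };              /* 1-second ACK timeout */
    setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);

    struct sockaddr_in peer;
    memset(&peer, 0, sizeof peer);
    peer.sin_family = AF_INET;
    peer.sin_port = htons(9999);               /* hypothetical receiver */
    inet_pton(AF_INET, "127.0.0.1", &peer.sin_addr);

    unsigned seq = 0;
    char pkt[64];
    int len = snprintf(pkt, sizeof pkt, "seq=%u payload", seq);

    for (int attempt = 0; attempt < 5; attempt++) {   /* bounded retries */
        sendto(s, pkt, (size_t)len, 0,
               (struct sockaddr *)&peer, sizeof peer);

        unsigned ack;
        if (recv(s, &ack, sizeof ack, 0) == (ssize_t)sizeof ack && ack == seq)
            break;                       /* delivered and acknowledged */
        /* timeout or wrong ACK: packet presumed lost, so retransmit */
    }
    close(s);
    return 0;
}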

It is the responsibility of an MPI library to manage any MPI or system 
buffers to prevent overrun. That is why I mentioned that 1/2 MB messages 
would use the rendezvous protocol.  The send side would push a descriptor 
(called an envelope) to the receive side. The receive side would push back 
an OK_to_send once a matching receive was posted.  The 1/2 MB of message 
data would not begin to flow across the network until the receive buffer 
was known.  A hand-rolled illustration of this handshake follows below.
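To make the handshake concrete, here is a rendezvous acted out between two 
ranks with ordinary MPI point-to-point calls.  The tag values and the 
envelope layout are made up for this sketch; a real MPI library performs 
this exchange inside the transport layer, not with user-visible messages.

/* Hand-rolled rendezvous: rank 0 sends, rank 1 receives.
 * Run with at least 2 ranks. */
#include <mpi.h>
#include <stdlib.h>

#define MSG_BYTES (512 * 1024)  /* the 1/2 MB message from this thread */
#define TAG_ENV   100           /* envelope: "I have a message for you" */
#define TAG_OK    101           /* OK_to_send: "matching receive posted" */
#define TAG_DATA  102           /* the bulk data itself */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char *buf = malloc(MSG_BYTES);

    if (rank == 0) {
        long envelope = MSG_BYTES;       /* describes the pending send */
        MPI_Send(&envelope, 1, MPI_LONG, 1, TAG_ENV, MPI_COMM_WORLD);

        int ok;                          /* block until receiver is ready */
        MPI_Recv(&ok, 1, MPI_INT, 1, TAG_OK, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

        /* only now does the 1/2 MB of data start across the network */
        MPI_Send(buf, MSG_BYTES, MPI_CHAR, 1, TAG_DATA, MPI_COMM_WORLD);
    } else if (rank == 1) {
        long envelope;
        MPI_Recv(&envelope, 1, MPI_LONG, 0, TAG_ENV, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

        int ok = 1;                      /* receive buffer is posted */
        MPI_Send(&ok, 1, MPI_INT, 0, TAG_OK, MPI_COMM_WORLD);

        MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 0, TAG_DATA, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}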

It is also the responsibility of an MPI library to detect when MPI-level 
messages have become undeliverable and fail the job.

Bugs are always a possibility, but unless there is something very unusual 
about the cluster and interconnect, or this is an unstable version of MPI, 
it seems very unlikely that this use of MPI_Bcast with so few tasks and 
only a 1/2 MB message would trip on one.  80 tasks is a very small number 
in modern parallel computing; collectives involving thousands of tasks 
have become pretty standard.  A self-contained reproducer is sketched 
below.
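If someone wants to rule out the basics, the scenario in this thread is 
easy to reproduce in a few lines.  This is just a sketch of the broadcast 
being discussed (1/2 MB from rank 0); the buffer contents are arbitrary.

/* Broadcast a 1/2 MB buffer from rank 0 to every task. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int n = 512 * 1024;
    char *buf = malloc(n);
    if (rank == 0)
        for (int i = 0; i < n; i++)
            buf[i] = (char)(i & 0xff);   /* arbitrary payload */

    MPI_Bcast(buf, n, MPI_CHAR, 0, MPI_COMM_WORLD);

    free(buf);
    MPI_Finalize();
    return 0;
}

Launching it with something like "mpirun -np 80 ./bcast_test" matches the 
scale in question.  If this hangs or fails on a healthy cluster, that 
would point at the installation rather than the application.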


Dick Treumann  -  MPI Team 
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846         Fax (845) 433-8363


users-boun...@open-mpi.org wrote on 08/23/2010 09:39:29 PM:


> 
> I have had a similar load-related problem with Bcast.  I don't know 
> what caused it, though.  With this one, what about the possibility of 
> a buffer overrun or network saturation?