I did not take the time to fully understand your approach, so this 
may sound like a dumb question:

Do you have an MPI_Bcast ROOT process in every MPI_COMM_WORLD, and does 
every non-ROOT MPI_Bcast call correctly identify the rank of that ROOT in its 
MPI_COMM_WORLD?

An MPI_Bcast call will hang when there is no root task in the communicator or 
when the root task's rank is given incorrectly.
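
By way of illustration, a minimal correct-usage sketch (rank 0 is chosen as 
root here purely for illustration; the essential point is that every rank in 
the communicator makes the call and passes the same, valid root):

/* Minimal sketch: every rank in the communicator calls MPI_Bcast
 * with the same root rank (0 here, chosen only for illustration).
 * If the root value differs between ranks, or no process in the
 * communicator actually holds that rank, the call will hang. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, data = 0;
    const int root = 0;              /* must be identical on all ranks */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == root)
        data = 42;                   /* the root supplies the value */

    /* Every rank, root and non-root alike, must make this call. */
    MPI_Bcast(&data, 1, MPI_INT, root, MPI_COMM_WORLD);

    printf("rank %d received %d\n", rank, data);
    MPI_Finalize();
    return 0;
}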


Dick Treumann  -  MPI Team 
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846         Fax (845) 433-8363




From: Randolph Pullen <randolph_pul...@yahoo.com.au>
To: us...@open-mpi.org
Date: 08/07/2010 01:23 AM
Subject: [OMPI users] MPI_Bcast issue
Sent by: users-boun...@open-mpi.org




I seem to be having a problem with MPI_Bcast.
My massive, I/O-intensive data movement program must broadcast from n to n 
nodes. The problem starts because I require two processes per node, a sender 
and a receiver, which I have implemented as separate MPI processes rather 
than tackling the complexities of threads on MPI.

Consequently, broadcast and calls like alltoall are not completely 
helpful. The dataset is huge, and each node must end up with a complete 
copy built from the large number of contributing broadcasts from the sending 
nodes. Network efficiency and run time are paramount.

As I don’t want to needlessly broadcast all this data to the sending nodes, 
and I already have a perfectly good MPI program that distributes globally from 
a single node (1 to N), I took the unusual decision to start N copies of 
this program, spawning the MPI system from the PVM system, in an effort 
to get my N-to-N concurrent transfers.

It seems that the broadcasts running in the concurrent MPI environments 
collide and cause all but the first process to hang waiting for its 
broadcast. This theory seems to be confirmed by introducing a sleep of 
n-1 seconds before the first MPI_Bcast call on each node, which makes the 
code work perfectly (total run time 55 seconds, 3 nodes, standard TCP stack).

My guess is that, unlike PVM, Open MPI implements broadcasts with 
point-to-point sends rather than multicasts. Can someone confirm this? Is this a bug?

Is there any multicast or N-to-N broadcast where sender processes can 
avoid participating when they don’t need to?
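
For concreteness, a rough MPI_Comm_split sketch of the sort of arrangement I 
have in mind follows, assuming that is even the right approach; the even/odd 
receiver test and the names are just placeholders for my real sender/receiver 
split:

/* Rough sketch only: split MPI_COMM_WORLD so a broadcast involves
 * only the ranks that actually need the data.  The even/odd test
 * below is a placeholder for the real sender/receiver split. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Placeholder rule: even ranks take part in the broadcast. */
    int takes_part = (world_rank % 2 == 0);

    /* Ranks that opt in get colour 0; the rest pass MPI_UNDEFINED
     * and receive MPI_COMM_NULL, so they never enter the collective. */
    MPI_Comm bcast_comm;
    MPI_Comm_split(MPI_COMM_WORLD,
                   takes_part ? 0 : MPI_UNDEFINED,
                   world_rank, &bcast_comm);

    if (bcast_comm != MPI_COMM_NULL) {
        int payload = 0;
        /* Root is rank 0 of the new communicator, not of MPI_COMM_WORLD. */
        MPI_Bcast(&payload, 1, MPI_INT, 0, bcast_comm);
        MPI_Comm_free(&bcast_comm);
    }

    MPI_Finalize();
    return 0;
}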

Thanks in advance
Randolph


 _______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

