MPI_COMM_SELF is one example. The only task it contains is the local task.
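A minimal sketch illustrating this (not from the original exchange): querying MPI_COMM_SELF always reports a size of 1, with the local task as rank 0.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided, size, rank;

        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

        /* MPI_COMM_SELF contains exactly one task: the local one. */
        MPI_Comm_size(MPI_COMM_SELF, &size);   /* always 1 */
        MPI_Comm_rank(MPI_COMM_SELF, &rank);   /* always 0 */
        printf("MPI_COMM_SELF: size=%d, rank=%d\n", size, rank);

        MPI_Finalize();
        return 0;
    }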
The other case I had in mind is where there is a master doing all spawns. The master is launched as an MPI "job" but it has only one task. In that master, even MPI_COMM_WORLD is what I called a "single task communicator". Because the collective spawn call is "collective" across only one task in this case, it does not have the same sort of dependency on what other tasks do. I think it is common for a single-task master to have responsibility for all spawns in the kind of model yours sounds like. I did not study the conversation enough to know if you are doing all spawn calls from a "single task communicator", so I was trying to give a broadly useful explanation.

Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846  Fax (845) 433-8363

users-boun...@open-mpi.org wrote on 09/25/2009 02:59:04 AM:

> Subject: Re: [OMPI users] Multi-threading with OpenMPI ?
> From: Ashika Umanga Umagiliya
> To: Open MPI Users
> Date: 09/25/2009 03:00 AM
>
> Thank you Dick for your detailed reply,
>
> I am sorry, could you explain more what you meant by "unless you are
> calling MPI_Comm_spawn on a single task communicator you would need
> to have a different input communicator for each thread that will
> make an MPI_Comm_spawn call"? I am confused by the term "single
> task communicator".
>
> Best Regards,
> umanga
>
> Richard Treumann wrote:
> It is dangerous to hold a local lock (like a mutex) across a
> blocking MPI call unless you can be 100% sure everything that must
> happen remotely will be completely independent of what is done with
> local locks & communication dependencies on other tasks.
>
> It is likely that an MPI_Comm_spawn call in which the spawning
> communicator is MPI_COMM_SELF would be safe to serialize with a
> mutex. But be careful and do not view this as an approach to making
> MPI applications thread-safe in general. Also, unless you are
> calling MPI_Comm_spawn on a single task communicator, you would need
> to have a different input communicator for each thread that will
> make an MPI_Comm_spawn call. MPI requires that collective calls on a
> given communicator be made in the same order by all participating tasks.
>
> If there are two or more tasks making the MPI_Comm_spawn call
> collectively from multiple threads (even with per-thread input
> communicators), then using a local lock this way is pretty sure to
> deadlock at some point. Say task 0 serializes spawning threads as A
> then B and task 1 serializes them as B then A. The job will deadlock
> because task 0 cannot free its lock for thread A until task 1 makes
> the spawn call for thread A as well. That will never happen if task
> 1 is stuck in a lock that will not release until task 0 makes its
> call for thread B.
>
> When you look at the code for a particular task and consider thread
> interactions within the task, the use of the lock looks safe. It is
> only when you consider the dependencies on what other tasks are
> doing that the danger becomes clear. This particular case is pretty
> easy to see, but sometimes, when there is a temptation to hold a local
> mutex across a blocking MPI call, the chain of dependencies that
> can lead to deadlock becomes very hard to predict.
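A minimal sketch of the "likely safe" pattern described above, assuming POSIX threads; the helper name and worker executable are illustrative, not from the thread. Each thread spawns on MPI_COMM_SELF, and a process-local mutex serializes the calls.

    #include <mpi.h>
    #include <pthread.h>

    /* Process-local lock that serializes spawn calls made by different threads. */
    static pthread_mutex_t spawn_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Illustrative helper: spawn one child with MPI_COMM_SELF as the spawning
     * communicator.  Because MPI_COMM_SELF contains only the local task, the
     * "collective" spawn does not depend on the order in which other tasks
     * make their spawn calls, so holding a local mutex across it is the case
     * described above as likely safe. */
    static MPI_Comm spawn_one_worker(const char *worker_exe)  /* name is hypothetical */
    {
        MPI_Comm intercomm;

        pthread_mutex_lock(&spawn_lock);
        MPI_Comm_spawn((char *)worker_exe,          /* cast for older MPI-2 prototypes */
                       MPI_ARGV_NULL, 1, MPI_INFO_NULL,
                       0 /* root */, MPI_COMM_SELF,
                       &intercomm, MPI_ERRCODES_IGNORE);
        pthread_mutex_unlock(&spawn_lock);

        return intercomm;
    }

With a multi-task spawning communicator, by contrast, every task must make the collective call in the same order, which is exactly where the A-then-B versus B-then-A deadlock above comes from.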
> BTW - maybe this is obvious, but you also need to protect the logic
> which calls MPI_Init_thread to make sure you do not have a race in
> which 2 threads each race to test the flag for whether
> MPI_Init_thread has already been called. If two threads do:
> 1) if (MPI_Inited_flag == FALSE) {
> 2)     set MPI_Inited_flag
> 3)     MPI_Init_thread
> 4) }
> you have a couple of race conditions:
> 1) Two threads may both try to call MPI_Init_thread if one thread
> tests "if (MPI_Inited_flag == FALSE)" while the other is between
> statements 1 & 2.
> 2) If some thread tests "if (MPI_Inited_flag == FALSE)" while
> another thread is between statements 2 and 3, that thread could
> assume MPI_Init_thread is done and make the MPI_Comm_spawn call
> before the thread that is trying to initialize MPI manages to do it.
>
> Dick
>
> Dick Treumann - MPI Team
> IBM Systems & Technology Group
> Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
> Tele (845) 433-7846  Fax (845) 433-8363
>
> users-boun...@open-mpi.org wrote on 09/17/2009 11:36:48 PM:
>
> > Subject: Re: [OMPI users] Multi-threading with OpenMPI ?
> > From: Ralph Castain
> > To: Open MPI Users
> > Date: 09/17/2009 11:37 PM
> >
> > Only thing I can suggest is to place a thread lock around the call to
> > comm_spawn so that only one thread at a time can execute that
> > function. The call to mpi_init_thread is fine - you just need to
> > explicitly protect the call to comm_spawn.
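Pulling Ralph's suggestion and Dick's warning together, one common way to close the initialization race, sketched here under the assumption that POSIX threads are in use (the helper names are illustrative), is to let pthread_once perform the flag test and the call to MPI_Init_thread atomically:

    #include <mpi.h>
    #include <pthread.h>

    static pthread_once_t mpi_init_once  = PTHREAD_ONCE_INIT;
    static int            mpi_thread_level = MPI_THREAD_SINGLE;

    /* Runs exactly once, no matter how many threads request initialization. */
    static void init_mpi(void)
    {
        /* Passing NULL for argc/argv is permitted by the MPI standard. */
        MPI_Init_thread(NULL, NULL, MPI_THREAD_MULTIPLE, &mpi_thread_level);
    }

    /* Every thread calls this before its first MPI call (for example, before
     * a serialized MPI_Comm_spawn).  The first caller initializes MPI; later
     * callers block inside pthread_once until initialization has completed,
     * which avoids both races described above. */
    static void ensure_mpi_initialized(void)
    {
        pthread_once(&mpi_init_once, init_mpi);
    }

An equivalent approach is to hold a mutex around both the flag test and the MPI_Init_thread call, setting the flag only after MPI_Init_thread has returned.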