Hi Dick,
Jeff paraphrased an unnamed source as suggesting that "any MPI
program that relies on a barrier for correctness is an incorrect MPI
application." That is probably too strong.
How about this assertion?
*If there are no wildcard receives, every MPI_Barrier call is
semantically irrelevant.*
Whether that holds depends on what 'semantically irrelevant' means. It is
clear that one can write a wildcard-free program that deadlocks because of
an incorrectly inserted barrier, and that removing the barrier avoids the
deadlock. (Imagine P1 doing Send; Barrier and P2 doing Barrier;
Receive(non-wildcard).)
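To make that concrete, here is a minimal C sketch (my own illustration, not
an example from our paper; run it with exactly two ranks). MPI_Ssend is used
so the send is guaranteed to be synchronous; with a plain MPI_Send the
deadlock would only be possible, depending on buffering:

#include <mpi.h>

int main(int argc, char **argv)      /* run with exactly 2 ranks */
{
    int rank, buf = 42;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* "P1": blocks in the synchronous send until rank 1 posts its receive */
        MPI_Ssend(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Barrier(MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* "P2": blocks in the barrier waiting for rank 0, which never gets there */
        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Recv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    /* Deadlock above; remove the barrier (or move it after the receive)
       and the program terminates normally. */
    MPI_Finalize();
    return 0;
}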
So a wildcard-free program may still deadlock (a semantically noticeable
effect) because of its barriers. I'm sure you did not mean to include this
degenerate nit-pick; otherwise, yes, you are right! A proof exists in a
paper by Siegel (cited in our EuroPVM/MPI 2008 paper) for a subset of MPI.
Our work takes that idea further and offers a complete checking algorithm
for one test harness (data set), as described below.
The exact criterion for locating semantically irrelevant barriers (we call
them Functionally Irrelevant Barriers, or FIB, in our paper) is given in
our EuroPVM/MPI 2008 paper. The analysis involves ordering paths built from
two relations, IntraCB and InterCB, where CB stands for Completes-Before.
What is IntraCB? Imagine two MPI sends from P1 to P2, in that order. MPI
forces them to complete in program order. Now imagine P1 sending to P2 and
then to P3. These can complete out of program order. Why? Because MPI
guarantees only point-to-point non-overtaking. It also makes sense
practically: the first send may be shipping a gigabyte to P2 while the
second ships a single byte to P3.
IntraCB is thus a relation weaker than program order. We define IntraCB
precisely in our CAV 2008 paper (available on our website). The basic idea
is simple: roughly 6-7 rules, like the ones illustrated above, capture
IntraCB.
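Here is a small C illustration of the two cases above (again my own sketch,
mapping P1, P2, P3 to ranks 0, 1, 2; run with at least three ranks):

#include <mpi.h>

int main(int argc, char **argv)      /* run with at least 3 ranks */
{
    int rank, a = 1, b = 2;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Case 1: same destination, same tag -- point-to-point
           non-overtaking forces these to match in program order,
           so there is an IntraCB edge from the first send to the second. */
        MPI_Send(&a, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Send(&b, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);

        /* Case 2: different destinations -- no completion order is implied;
           the (possibly huge) message to rank 1 may finish after the
           (tiny) message to rank 2. */
        MPI_Send(&a, 1, MPI_INT, 1, 1, MPI_COMM_WORLD);
        MPI_Send(&b, 1, MPI_INT, 2, 1, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&a, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(&b, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(&a, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    } else if (rank == 2) {
        MPI_Recv(&b, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}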
In our EuroPVM/MPI 2008 paper, we then show how to lift IntraCB to InterCB
by computing a "closure" through barriers. This is what defines MPI
ordering paths, and it is again a simple idea.
The gist of FIB is this: if removing a barrier affects some ordering path,
then that barrier is functionally relevant; otherwise it is not. FIB
performs this analysis over all possible ordering paths.
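For intuition only (my example, not one from the paper): in the
wildcard-free program below, the receive already names its source, so
removing the barrier cannot change which send matches which receive; a
FIB-style analysis should flag this barrier as removable.

#include <mpi.h>

int main(int argc, char **argv)      /* any number of ranks >= 2 */
{
    int rank, buf = 7;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Send(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    /* This barrier adds no ordering that matters to the (deterministic)
       send/receive match above; it is functionally irrelevant. */
    MPI_Barrier(MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}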
How are all the ordering paths determined? For this, FIB needs help from
our POE algorithm (CAV 2008), which generates the RELEVANT executions of an
MPI program. Basically, POE gives you a semantically minimal set of
interleavings of an MPI program (close to minimal; it can be slightly
bloated). Here is the idea: if you write an MPI program in which P1 sends
to P2, P3 sends to P2, and P2 posts a wildcard receive that can match
either sender, our POE algorithm generates two interleavings, one per
possible match. These are sufficient; there is no need to consider all
permutations of posting send1, send2, and receive(*). The POE algorithm is
essential for FIB to work.
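Here is a C rendering of roughly that situation (my sketch; I have given
the middle rank two wildcard receives so that both sends are consumed). The
only nondeterminism is which sender matches the first wildcard receive, so
two interleavings cover all relevant behaviors:

#include <mpi.h>

int main(int argc, char **argv)      /* run with at least 3 ranks */
{
    int rank, msg, got;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0 || rank == 2) {
        /* two senders, both targeting rank 1 */
        msg = rank;
        MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* two wildcard receives; either sender may match the first one */
        MPI_Recv(&got, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(&got, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}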
In fact, with a few mouse clicks you can do all of this:
1) download ISP from
http://www.cs.utah.edu/formal_verification/ISP-release
2) fire it up
3) if running under Linux, use the --fib flag; under Windows the flag is
on by default
4) ISP verifies the program for assert failures, MPI object leaks, and
deadlocks
5) if ISP finishes without finding a deadlock (i.e., all goes well), it
prints the list of FIBs.
Please try it; we will appreciate it greatly! We may well have overlooked
something, and we will be very grateful for any feedback that helps improve
our ISP tool, which contains the FIB implementation.
As a bonus, if you read the first 3-4 pages of our EuroPVM/MPI 2008 paper,
you will find some "brain teasers": small examples where you can decide for
yourself whether the barriers could be removed. You can then type those
3-4 line examples into ISP and see what it says about their FIB status.
I'm not fibbing... :-)
Cheers,
Ganesh
p.s. I said that FIB does the analysis for one data set. As shown in our
paper, in many cases a static analyzer can determine that a program is
data independent. In that case, the FIB analysis holds for all inputs
(input = test harness = data set).
--
It is the exception that tests the rule.
If someone can provide an example of an MPI_Barrier that is required
by an application based on MPI communication, and that does not use
wildcard receive, I am interested in seeing it. I do not know of a
counterexample, but I also do not have a proof of the assertion I place
before the group.
No fair using examples with non-MPI interactions among tasks or with
job steering by asynchronous triggers from outside the job. I can
construct them myself.
MPI_WIN_FENCE is semantically required in some situations and examples
that show a semantic need for MPI_WIN_FENCE do not count against the
assertion.
I have appreciated the descriptions from Gus, Ashley, and others of
some non-semantic justifications for an MPI_Barrier.
Dick Treumann - MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363