Hi slimtimmy

I have been involved in several of the MPI Forum's discussions of how
MPI_Cancel should work, and I agree with your interpretation. By my reading
of the standard, the MPI_Wait must not hang and the cancel must succeed.
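
For what it is worth, here is roughly what I would expect on the slave side
of your test case (just a sketch, reusing the buffer, size, masterRank and
TAG_UNMATCHED2 names from your program). The Wait should return, and the
status should tell you whether the cancel actually took effect:

    // start the send, then try to cancel it, exactly as in your program
    MPI::Request request = MPI::COMM_WORLD.Isend(buffer, size, MPI::INT,
        masterRank, TAG_UNMATCHED2);

    request.Cancel();

    // Per the standard, this Wait must return whether or not the message
    // is ever matched by a receive.
    MPI::Status status;
    request.Wait(status);

    // Is_cancelled() is the C++ binding for MPI_Test_cancelled; it reports
    // whether the cancel succeeded or the send completed first.
    if (status.Is_cancelled())
        cout << "slave (" << rank << "): send was cancelled" << endl;
    else
        cout << "slave (" << rank << "): send completed normally" << endl;

Either outcome is legal: if the send is matched before the cancel can take
effect, Is_cancelled() simply reports false. What should not happen is the
Wait hanging.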

Making an MPI implementation work exactly as the standard describes may
have performance implications, and MPI_Cancel is rarely used, so as a
practical matter an implementation may choose to fudge the letter of the
law for better performance.

There may also be people who would argue that you and I have misread the
standard, and I am happy to follow up (offline if they wish) with anyone
who interprets it differently. The MPI Forum is working on MPI 2.2 right
now, and if there is something that needs fixing in the MPI standard, now
is the time to get a resolution.

                Regards - Dick

Dick Treumann  -  MPI Team/TCEM
IBM Systems & Technology Group
Dept 0lva / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846         Fax (845) 433-8363


users-boun...@open-mpi.org wrote on 04/15/2008 03:14:39 PM:

> I encountered some problems when using MPI_CANCEL. I call
> Request::Cancel followed by Request::Wait to ensure that the request has
> been cancelled. However, Request::Wait does not return when I send larger
> messages. The following code should reproduce this behaviour:
>
> #include "mpi.h"
> #include <iostream>
>
> using namespace std;
>
> enum Tags
> {
>      TAG_UNMATCHED1,
>      TAG_UNMATCHED2
> };
>
> int main()
> {
>      MPI::Init();
>
>      const int rank = MPI::COMM_WORLD.Get_rank();
>      const int numProcesses = MPI::COMM_WORLD.Get_size();
>      const int masterRank = 0;
>
>      if (rank == masterRank)
>      {
>          cout << "master" << endl;
>          const int numSlaves = numProcesses - 1;
>          for(int i = 0; i < numSlaves; ++i)
>          {
>              const int slaveRank = i + 1;
>              int buffer;
>              MPI::COMM_WORLD.Recv(&buffer, 1, MPI::INT, slaveRank,
>                  TAG_UNMATCHED1);
>          }
>
>      }
>      else
>      {
>          cout << "slave " << rank << endl;
>          //const int size = 1;
>          const int size = 10000;
>          int buffer[size];
>          MPI::Request request = MPI::COMM_WORLD.Isend(buffer, size, MPI::INT,
>              masterRank, TAG_UNMATCHED2);
>
>          cout << "slave ("<< rank<<"): sent data" << endl;
>
>          request.Cancel();
>
>          cout << "slave ("<< rank<<"): cancel issued" << endl;
>
>          request.Wait();
>
>          cout << "slave ("<< rank<<"): finished" << endl;
>      }
>
>
>      MPI::Finalize();
>
>      return 0;
> }
>
>
> If I set size to 1, everything works as expected: the slave process
> finishes execution. However, if I use a bigger buffer (in this case
> 10000), the wait blocks forever. That's the output of the program when
> run with two processes:
>
> master
> slave 1
> slave (1): sent data
> slave (1): cancel issued
>
>
> Have I misinterpreted the standard? Or does Request::Wait block until
> the message is delivered?
>
>
