MPI 3.1, Section 5.12, is pretty clear on the matter: "It is erroneous to call MPI_REQUEST_FREE or MPI_CANCEL for a request associated with a nonblocking collective operation."
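Since the standard forbids cancelling a nonblocking collective, the usual way to stop such a listener is to *complete* the collective instead: rank 0 posts the matching iBcast carrying a shutdown sentinel, and the listener exits when it sees it. A minimal sketch against the same Java bindings as the testbed below (the SHUTDOWN sentinel and the loop structure are my own illustration, not from the thread; run under mpirun with the Open MPI Java bindings):

```java
import java.nio.ByteBuffer;

import mpi.MPI;
import mpi.MPIException;
import mpi.Request;

class ListenerLoop extends Thread {
    // Hypothetical sentinel value agreed on by all ranks.
    static final byte SHUTDOWN = 1;

    final ByteBuffer b = ByteBuffer.allocateDirect(100);

    @Override
    public void run() {
        try {
            while (true) {
                // Every rank participates in the collective, so the wait
                // completes normally once the root broadcasts.
                Request req = MPI.COMM_WORLD.iBcast(b, b.limit(), MPI.BYTE, 0);
                req.waitFor();
                if (b.get(0) == SHUTDOWN) {
                    break; // root asked us to stop; exit cleanly, no cancel needed
                }
                // ... handle a normal broadcast payload here ...
            }
        } catch (MPIException e) {
            e.printStackTrace();
        }
    }
}
```

To shut the listeners down, the root fills its buffer with the sentinel (`b.put(0, SHUTDOWN)`) and posts its own `iBcast` followed by `waitFor()`; every pending wait then returns on its own.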
-Nathan

> On Jun 9, 2017, at 5:33 AM, Markus <mjero...@gmail.com> wrote:
>
> Dear MPI Users and Maintainers,
>
> I am using Open MPI 1.10.4 with multithread support and Java bindings
> enabled. I use MPI from Java, with one process per machine and multiple
> threads per process.
>
> I was trying to build a broadcast listener thread which calls MPI_Ibcast,
> followed by MPI_Wait.
>
> I use the request object returned by MPI_Ibcast to shut the listener down,
> calling MPI_Cancel on that request from the main thread. This results in:
>
> [fe-402-1:2972] *** An error occurred in MPI_Cancel
> [fe-402-1:2972] *** reported by process [1275002881,17179869185]
> [fe-402-1:2972] *** on communicator MPI_COMM_WORLD
> [fe-402-1:2972] *** MPI_ERR_REQUEST: invalid request
> [fe-402-1:2972] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> [fe-402-1:2972] *** and potentially your MPI job)
>
> which indicates that the request is invalid in some fashion. I have already
> checked that it is not MPI_REQUEST_NULL. I have also set up a simple
> testbed where nothing else happens except that one broadcast. The request
> object is always reported as invalid, no matter from where I call cancel().
>
> As far as I understand the MPI specification, cancel is also supposed to
> work for nonblocking collective communication (which includes my
> broadcasts). I haven't found any advice yet, so I hope to find some help on
> this mailing list.
>
> Kind regards,
> Markus Jeromin
>
> PS: Testbed for calling MPI_Cancel, written in Java.
> _______
>
> package distributed.mpi;
>
> import java.nio.ByteBuffer;
>
> import mpi.MPI;
> import mpi.MPIException;
> import mpi.Request;
>
> /**
>  * Testing MPI_Cancel on MPI_Ibcast.<br>
>  * The program does not terminate because the listeners are still running,
>  * waiting for the native call MPI_Wait to return. MPI_Cancel is called,
>  * but the listener never unblocks (i.e. MPI_Wait never returns).
>  *
>  * @author mjeromin
>  */
> public class BroadcastTestCancel {
>
>     static int myrank;
>
>     /**
>      * Listener that waits for incoming broadcasts from the specified root.
>      * Uses nonblocking MPI_Ibcast followed by MPI_Wait.
>      */
>     static class Listener extends Thread {
>
>         ByteBuffer b = ByteBuffer.allocateDirect(100);
>         public Request req = null;
>
>         @Override
>         public void run() {
>             super.run();
>             try {
>                 req = MPI.COMM_WORLD.iBcast(b, b.limit(), MPI.BYTE, 0);
>                 System.out.println(myrank + ": waiting for bcast (that will never come)");
>                 req.waitFor();
>             } catch (MPIException e) {
>                 e.printStackTrace();
>             }
>             System.out.println(myrank + ": listener unblocked");
>         }
>     }
>
>     public static void main(String[] args) throws MPIException, InterruptedException {
>
>         // we need full thread support
>         int threadSupport = MPI.InitThread(args, MPI.THREAD_MULTIPLE);
>         if (threadSupport != MPI.THREAD_MULTIPLE) {
>             System.out.println(myrank + ": no multithread support. Aborting.");
>             MPI.Finalize();
>             return;
>         }
>
>         // enabling or disabling exceptions makes no difference here
>         MPI.COMM_WORLD.setErrhandler(MPI.ERRORS_RETURN);
>
>         myrank = MPI.COMM_WORLD.getRank();
>
>         // start receiving listeners, but no sender (which would be rank 0)
>         if (myrank > 0) {
>             Listener l = new Listener();
>             l.start();
>
>             // let the listener reach waitFor()
>             Thread.sleep(5000);
>
>             // call MPI_Cancel (the matching send will never arrive)
>             try {
>                 l.req.cancel();
>             } catch (MPIException e) {
>                 // depends on the error handler
>                 System.out.println(myrank + ": MPI Exception\n" + e.toString());
>             }
>         }
>
>         // don't call MPI_Finalize too early (waiting here is not strictly
>         // necessary, but just to be sure)
>         Thread.sleep(15000);
>
>         System.out.println(myrank + ": calling finish");
>         MPI.Finalize();
>         System.out.println(myrank + ": finished");
>     }
> }
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users