Hi,

My supercomputer has OpenMPI 1.4, and I am running into a frustrating problem with my MPI program. I am using only the following calls, which I expect to be blocking:

MPI_Wtime, MPI_Error_string, MPI_Abort, MPI_Send, MPI_Get_count, MPI_Recv, MPI_Probe, MPI_Init, MPI_Comm_rank, MPI_Comm_size, MPI_Finalize

Somehow I get this error when I do a large number of sequential communications:

"c002:2.0.Exhausted 1048576 MQ irecv request descriptors, which usually indicates a user program error or insufficient request descriptors (PSM_MQ_RECVREQS_MAX=1048576)"

This seems counter-intuitive to me. I don't think I should be using any irecvs at all, since I specifically want to rely on the documented blocking behavior of MPI_Recv (not MPI_Irecv, which I am not using).

My main program is quite large, but I have managed to replicate the behavior in this much smaller program, which executes a number of MPI_Send or MPI_Recv calls in a loop. The program's default behavior is to run 2,000,000 iterations. When I turn that up to 20,000,000, it generates the PSM_MQ_RECVREQS_MAX error after a short time.

I would appreciate it if anyone could advise why this might be happening in the "test" case. Basically, what causes my presumably blocking MPI_Recv calls to "accumulate" such a large number of "irecv request descriptors"? I expect each call to block and be resolved immediately, and the count to go down, as soon as the matching MPI_Send is posted.

I appreciate your assistance. Thank you!

Jonathan Stone
Research Assistant, U. Oklahoma
Attachment: crash.c