David, I do not see any mechanism restricting access to the requests to a single thread. What thread model are you using?
From an implementation perspective, your code is correct only if you initialize the MPI library with MPI_THREAD_MULTIPLE and the library accepts that level. Otherwise there is an assumption that the application is single-threaded, and the MPI behavior with multiple threads is implementation dependent. Please read the MPI standard's description of MPI_Init_thread for more details (a short sketch of that initialization follows at the end of this message).

Regards,
  george.

On May 19, 2011, at 02:34, David Büttner wrote:

> Hello,
>
> I am working on a hybrid MPI (OpenMPI 1.4.3) and Pthreads code. I use
> MPI_Isend and MPI_Irecv for communication and MPI_Test/MPI_Wait to check
> whether it is done. I do this repeatedly in the outer loop of my code.
> MPI_Test is used in the inner loop to check whether a function that depends
> on the received data can be called.
> The program regularly crashed (only when not using printf...), and after
> debugging it I found the following problem:
>
> In MPI_Isend I have an invalid read of memory. I could only fix the problem
> by not re-using
>
>     MPI_Request req_s, req_r;
>
> but instead using
>
>     MPI_Request *req_s;
>     MPI_Request *req_r;
>
> and re-allocating them before each MPI_Isend/MPI_Irecv.
>
> The documentation says that MPI_Wait and MPI_Test (if successful) deallocate
> the request objects and set them to MPI_REQUEST_NULL. It also says that
> MPI_Isend and MPI_Irecv allocate the objects and associate them with the
> request handles.
>
> As I understand this, I should either be able to pass a pointer to an
> MPI_Request that I have not initialized (this crashes), or a pointer that I
> have obtained with malloc(sizeof(MPI_Request)) (or the address of an
> MPI_Request req), which is set and reset inside the functions. But this
> version crashes, too. What works is a pointer that I allocate before the
> MPI_Isend/MPI_Irecv and free after the MPI_Wait in every iteration. In other
> words: it only works if I never reuse any MPI_Request and create a new one
> every time.
>
> Is this how it is supposed to be? I believe that reusing the memory would be
> a lot more efficient (fewer calls to malloc...). Am I missing something
> here, or am I doing something wrong?
>
> Let me provide some more detailed information about my problem:
>
> I am running the program on a 30-node InfiniBand cluster. Each node has 4
> single-core Opteron CPUs. I run 1 MPI rank per node and 4 threads per rank
> (one thread per core). I compile with mpicc of OpenMPI, using gcc underneath.
> Some pseudo-code of the program can be found at the end of this e-mail.
>
> I was able to reproduce the problem with different numbers of nodes and even
> on a single node. The problem does not occur when I put printf debugging
> output into the code, which pointed me towards a memory problem in which
> some write touches memory it is not supposed to.
> I ran the tests under valgrind with --leak-check=full and
> --show-reachable=yes, which pointed me either to MPI_Isend or to MPI_Wait,
> depending on whether the threads spin in a loop until MPI_Test returns
> success or call MPI_Wait, respectively.
>
> I would appreciate your help with this. Am I missing something important
> here? Is there a way to reuse the requests across iterations other than the
> way I thought it should work?
> Or is there a way to re-initialize the allocated memory before the
> MPI_Isend/MPI_Irecv so that I at least don't have to call free and malloc
> each time?
>
> Thank you very much for your help!
> Kind regards,
> David Büttner
>
> _____________________
> Pseudo-code of the program:
>
> MPI_Request *req_s;
> MPI_Request *req_r;
>
> OUTER-LOOP
> {
>     if (0 == threadid)
>     {
>         req_s = malloc(sizeof(MPI_Request));
>         req_r = malloc(sizeof(MPI_Request));
>         MPI_Isend(..., req_s);
>         MPI_Irecv(..., req_r);
>     }
>     pthread_barrier_wait
>     INNER-LOOP (while NOT_DONE or RET)
>     {
>         if (TRYLOCK && NOT_DONE)
>         {
>             if (MPI_TEST(req_r))
>             {
>                 Call_Function_A;
>                 NOT_DONE = 0;
>             }
>         }
>         RET = Call_Function_B;
>     }
>     pthread_barrier_wait
>     if (0 == threadid)
>     {
>         MPI_WAIT(req_s);
>         MPI_WAIT(req_r);
>         free(req_s);
>         free(req_r);
>     }
> }
> _____________
>
> --
> David Büttner, Informatik, Technische Universität München
> TUM I-10 - FMI 01.06.059 - Tel. 089 / 289-17676
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

"To preserve the freedom of the human mind then and freedom of the press, every spirit should be ready to devote itself to martyrdom; for as long as we may think as we will, and speak as we think, the condition of man will proceed in improvement."
  -- Thomas Jefferson, 1799
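A minimal sketch of the initialization George mentions above, assuming a hybrid MPI + Pthreads program (the surrounding structure is illustrative, not taken from David's code). The check on the provided level is the important part, since an MPI library is free to grant less than MPI_THREAD_MULTIPLE:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;

    /* Request full thread support so that several threads may make
       MPI calls (e.g. MPI_Test) concurrently. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    if (provided < MPI_THREAD_MULTIPLE) {
        /* The library granted a lower level: MPI calls must then be
           confined according to the level actually provided. */
        fprintf(stderr, "MPI_THREAD_MULTIPLE not available (provided=%d)\n",
                provided);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* ... create the pthreads and do the Isend/Irecv/Test/Wait work ... */

    MPI_Finalize();
    return 0;
}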
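On the reuse question: after MPI_Wait, or an MPI_Test that returns success, the request handle is reset to MPI_REQUEST_NULL and the same variable can be passed to the next MPI_Isend/MPI_Irecv, so a fresh malloc per iteration should not be necessary. Below is a minimal sketch of the pseudo-code above along those lines, assuming MPI_THREAD_MULTIPLE is granted; the thread count, iteration count, ring neighbours, tags, and names such as worker are illustrative assumptions, not taken from David's program. Accesses to the shared receive request are serialized by a mutex:

#include <mpi.h>
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define ITERS    8

static pthread_barrier_t barrier;
static pthread_mutex_t   req_lock = PTHREAD_MUTEX_INITIALIZER;

/* Requests are plain variables reused in every iteration: MPI_Wait
   (or a successful MPI_Test) resets them to MPI_REQUEST_NULL, after
   which they may be passed to the next MPI_Isend/MPI_Irecv. */
static MPI_Request req_s = MPI_REQUEST_NULL;
static MPI_Request req_r = MPI_REQUEST_NULL;

static int sendbuf, recvbuf;
static int not_done;               /* shared flag, protected by req_lock */

static void *worker(void *arg)
{
    long tid = (long)arg;
    int  rank, size;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    for (int it = 0; it < ITERS; ++it) {
        if (tid == 0) {
            sendbuf = rank * 1000 + it;
            MPI_Isend(&sendbuf, 1, MPI_INT, (rank + 1) % size, 0,
                      MPI_COMM_WORLD, &req_s);
            MPI_Irecv(&recvbuf, 1, MPI_INT, (rank + size - 1) % size, 0,
                      MPI_COMM_WORLD, &req_r);
            not_done = 1;
        }
        pthread_barrier_wait(&barrier);

        for (;;) {
            int finished = 0;
            /* Only one thread at a time may touch req_r. */
            if (pthread_mutex_trylock(&req_lock) == 0) {
                if (not_done) {
                    int flag = 0;
                    MPI_Test(&req_r, &flag, MPI_STATUS_IGNORE);
                    if (flag) {
                        /* recvbuf is valid here: consume it exactly once
                           (Call_Function_A in the pseudo-code). */
                        not_done = 0;
                    }
                }
                finished = !not_done;
                pthread_mutex_unlock(&req_lock);
            }
            if (finished)
                break;
            /* ... independent work (Call_Function_B) goes here ... */
        }
        pthread_barrier_wait(&barrier);

        if (tid == 0) {
            /* req_r is already MPI_REQUEST_NULL if MPI_Test completed it;
               waiting on a null request is a no-op, so both waits are safe. */
            MPI_Wait(&req_s, MPI_STATUS_IGNORE);
            MPI_Wait(&req_r, MPI_STATUS_IGNORE);
        }
    }
    return NULL;
}

int main(int argc, char **argv)
{
    int provided;
    pthread_t threads[NTHREADS];

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "need MPI_THREAD_MULTIPLE\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    pthread_barrier_init(&barrier, NULL, NTHREADS);
    for (long t = 0; t < NTHREADS; ++t)
        pthread_create(&threads[t], NULL, worker, (void *)t);
    for (int t = 0; t < NTHREADS; ++t)
        pthread_join(threads[t], NULL);
    pthread_barrier_destroy(&barrier);

    MPI_Finalize();
    return 0;
}

The design choice here is simply to confine MPI_Isend/MPI_Irecv/MPI_Wait to thread 0 and to guard the shared request and flag with one mutex, so no malloc or free of MPI_Request objects is needed anywhere in the loop.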