David,
I do not see any mechanism confining access to the requests to a single
thread. What threading model are you using?
From an implementation perspective, your code is correct only if you
initialize the MPI library with MPI_THREAD_MULTIPLE and the library grants
that level. Otherwise MPI assumes the application is single-threaded, and the
behavior is implementation dependent. Please read the MPI standard regarding
MPI_Init_thread for more details.
Regards,
george.
On May 19, 2011, at 02:34, David Büttner wrote:
> Hello,
>
> I am working on a hybrid MPI (OpenMPI 1.4.3) and Pthread code. I am using
> MPI_Isend and MPI_Irecv for communication and MPI_Test/MPI_Wait to check if
> it is done. I do this repeatedly in the outer loop of my code. MPI_Test is
> used in the inner loop to check whether a function that depends on the
> received data can be called.
> The program regularly crashed (only when not using printf...) and after
> debugging it I figured out the following problem:
>
> In MPI_Isend I have an invalid read of memory. I fixed the problem not by
> re-using a
>
> MPI_Request req_s, req_r;
>
> but by using
>
> MPI_Request* req_s;
> MPI_Request* req_r;
>
> and re-allocating them before the MPI_Isend/recv.
>
> The documentation says that in MPI_Wait and MPI_Test (if successful) the
> request objects are deallocated and set to MPI_REQUEST_NULL.
> It also says that MPI_Isend and MPI_Irecv allocate the objects and
> associate them with the request handles.
>
> As I understand this, it either means I can use a pointer to MPI_Request
> which I don't have to initialize (this doesn't work but crashes), or that I
> can use an MPI_Request pointer which I have initialized with
> malloc(sizeof(MPI_Request)) (or pass the address of an MPI_Request req),
> which is set and unset in the functions. But this version crashes, too.
> What works is using a pointer which I allocate before the MPI_Isend/recv and
> free after MPI_Wait in every iteration. In other words: it only works if I
> don't reuse any kind of MPI_Request, only if I recreate one every time.
>
> Is this how it should be? I believe that reusing the memory would be a lot
> more efficient (fewer calls to malloc...). Am I missing something here, or
> am I doing something wrong?
>
>
> Let me provide some more detailed information about my problem:
>
> I am running the program on a 30 node infiniband cluster. Each node has 4
> single core Opteron CPUs. I am running 1 MPI Rank per node and 4 threads per
> rank (-> one thread per core).
> I am compiling with mpicc of OpenMPI using gcc below.
> Some pseudo-code of the program can be found at the end of this e-mail.
>
> I was able to reproduce the problem using different numbers of nodes, and
> even using one node only. The problem does not arise when I put
> printf-debugging information into the code. This pointed me in the direction
> of a memory problem, where some write accesses memory it is not supposed to.
> I ran the tests using valgrind with --leak-check=full and
> --show-reachable=yes, which pointed me either to MPI_Isend or MPI_Wait,
> depending on whether I had the threads spin in a loop waiting for MPI_Test
> to return success, or used MPI_Wait, respectively.
>
> I would appreciate your help with this. Am I missing something important
> here? Is there a way to re-use the request across iterations other than the
> way I thought it should work?
> Or is there a way to re-initialize the allocated memory before the
> MPI_Isend/recv, so that I at least don't have to call free and malloc each
> time?
>
> Thank you very much for your help!
> Kind regards,
> David Büttner
>
> _____________________
> Pseudo-Code of program:
>
> MPI_Request* req_s;
> MPI_Request* req_r;
> OUTER-LOOP
> {
>     if(0 == threadid)
>     {
>         req_s = malloc(sizeof(MPI_Request));
>         req_r = malloc(sizeof(MPI_Request));
>         MPI_Isend(..., req_s);
>         MPI_Irecv(..., req_r);
>     }
>     pthread_barrier_wait
>     INNER-LOOP (while NOT_DONE or RET)
>     {
>         if(TRYLOCK && NOT_DONE)
>         {
>             if(MPI_Test(req_r))
>             {
>                 Call_Function_A;
>                 NOT_DONE = 0;
>             }
>         }
>         RET = Call_Function_B;
>     }
>     pthread_barrier_wait
>     if(0 == threadid)
>     {
>         MPI_Wait(req_s);
>         MPI_Wait(req_r);
>         free(req_s);
>         free(req_r);
>     }
> }
> _____________
>
>
> --
> David Büttner, Informatik, Technische Universität München
> TUM I-10 - FMI 01.06.059 - Tel. 089 / 289-17676
>
> _______________________________________________
> users mailing list
> [email protected]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
"To preserve the freedom of the human mind then and freedom of the press, every
spirit should be ready to devote itself to martyrdom; for as long as we may
think as we will, and speak as we think, the condition of man will proceed in
improvement."
-- Thomas Jefferson, 1799