Hi Ralph,

Thanks for the quick answer!

> Try running the "ring" program in our example directory and see if that works

I just did this, and it works. (I ran ring_c.c)

Looking at your ring_c.c code, I see that it is quite similar to my test program, but one thing that differs is the datatype: the ring program uses MPI_INT, whereas my test uses MPI_CHARACTER. I tried changing MPI_INT to MPI_CHARACTER in ring_c.c (and the type of the variable "message" from int to char), and then ring_c.c fails in the same way as my test code. Conversely, my test code works if I change MPI_CHARACTER to MPI_INT.

So it looks like there is a bug that is triggered when using MPI_CHARACTER, whereas the same code works with MPI_INT.

/ Elias


Quoting Ralph Castain <r...@open-mpi.org>:

Try running the "ring" program in our example directory and see if that works

On Mar 16, 2014, at 4:26 PM, Elias Rudberg <elias.rudb...@it.uu.se> wrote:

Hello!

I would like to report a bug in Open MPI 1.7.4 when compiled with --enable-mpi-thread-multiple.

The bug can be reproduced with the following test program (mpi-send-recv.c):
===========================================
#include <mpi.h>
#include <stdio.h>
int main() {
 MPI_Init(NULL, NULL);
 int rank;
 MPI_Comm_rank(MPI_COMM_WORLD, &rank);
 printf("Rank %d at start\n", rank);
 if (rank)
   MPI_Send(NULL, 0, MPI_CHARACTER, 0, 0, MPI_COMM_WORLD);
 else
   MPI_Recv(NULL, 0, MPI_CHARACTER, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
 printf("Rank %d at end\n", rank);
 MPI_Finalize();
 return 0;
}
===========================================

With Open MPI 1.7.4 compiled with --enable-mpi-thread-multiple, the test program above fails like this:
$ mpirun -np 2 ./a.out
Rank 0 at start
Rank 1 at start
[elias-p6-2022scm:2743] *** An error occurred in MPI_Recv
[elias-p6-2022scm:2743] *** reported by process [140733606985729,140256452018176]
[elias-p6-2022scm:2743] *** on communicator MPI_COMM_WORLD
[elias-p6-2022scm:2743] *** MPI_ERR_TYPE: invalid datatype
[elias-p6-2022scm:2743] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[elias-p6-2022scm:2743] ***    and potentially your MPI job)

Steps I use to reproduce this in Ubuntu:

(1) Download openmpi-1.7.4.tar.gz

(2) Configure like this:
./configure --enable-mpi-thread-multiple

(3) make

(4) Compile test program like this:
mpicc mpi-send-recv.c

(5) Run like this:
mpirun -np 2 ./a.out
This gives the error above.

Of course, in my actual application I will want to call MPI_Init_thread with MPI_THREAD_MULTIPLE instead of just MPI_Init, but that does not seem to matter for this error; the same error occurs regardless of whether I call MPI_Init or MPI_Init_thread. So I just used MPI_Init in the test code above to keep it as short as possible.

Do you agree that this is a bug, or am I doing something wrong?

Any ideas for workarounds to make things work with --enable-mpi-thread-multiple? (I do need threads, so skipping --enable-mpi-thread-multiple is probably not an option for me.)

Best regards,
Elias


_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
