Hi users list, I would like to report a bug in the CUDA-aware OpenMPI 1.7.3 implementation. I'm using CUDA 5.0 and Ubuntu 12.04.
Attached you will find an example code file to reproduce the bug. The point is that MPI_Reduce fully works with normal CPU memory, but using GPU memory leads to a segfault. (GPU memory is used when USE_GPU is defined.)

The segfault looks like this:

[peak64g-36:25527] *** Process received signal ***
[peak64g-36:25527] Signal: Segmentation fault (11)
[peak64g-36:25527] Signal code: Invalid permissions (2)
[peak64g-36:25527] Failing at address: 0x600100200
[peak64g-36:25527] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7ff2abdb24a0]
[peak64g-36:25527] [ 1] /data/zaspel/openmpi-1.7.3_build/lib/libmpi.so.1(+0x7d410) [0x7ff2ac4b9410]
[peak64g-36:25527] [ 2] /data/zaspel/openmpi-1.7.3_build/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_reduce_intra_basic_linear+0x371) [0x7ff2a5987531]
[peak64g-36:25527] [ 3] /data/zaspel/openmpi-1.7.3_build/lib/libmpi.so.1(MPI_Reduce+0x135) [0x7ff2ac499d55]
[peak64g-36:25527] [ 4] /home/zaspel/testMPI/test_reduction() [0x400ca0]
[peak64g-36:25527] [ 5] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7ff2abd9d76d]
[peak64g-36:25527] [ 6] /home/zaspel/testMPI/test_reduction() [0x400af9]
[peak64g-36:25527] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 25527 on node peak64g-36 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

Best regards,
Peter
test_reduction.cu
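For readers without the attachment, a minimal sketch of this kind of reproducer could look roughly as follows. It is an illustration only, not the contents of test_reduction.cu: the buffer size N, the MPI_DOUBLE / MPI_SUM combination and the root rank 0 are assumptions. Only the USE_GPU switch and the MPI_Reduce call are taken from the report above.

```cuda
// Hypothetical reproducer sketch; NOT the attached test_reduction.cu.
// Assumed values: N, MPI_DOUBLE, MPI_SUM, root rank 0.
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

#define N 1024  /* element count, chosen arbitrarily for illustration */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *sendbuf = NULL;
    double *recvbuf = NULL;

#ifdef USE_GPU
    /* Device buffers: a CUDA-aware MPI build is expected to accept
       these pointers directly in MPI calls. */
    if (cudaMalloc((void **)&sendbuf, N * sizeof(double)) != cudaSuccess ||
        cudaMalloc((void **)&recvbuf, N * sizeof(double)) != cudaSuccess) {
        fprintf(stderr, "rank %d: cudaMalloc failed\n", rank);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    cudaMemset(sendbuf, 0, N * sizeof(double));
#else
    /* Plain host buffers: the variant that works according to the report. */
    sendbuf = (double *)calloc(N, sizeof(double));
    recvbuf = (double *)calloc(N, sizeof(double));
    if (sendbuf == NULL || recvbuf == NULL) {
        fprintf(stderr, "rank %d: calloc failed\n", rank);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
#endif

    /* The call that appears in the backtrace when device memory is used. */
    MPI_Reduce(sendbuf, recvbuf, N, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("MPI_Reduce returned\n");

#ifdef USE_GPU
    cudaFree(sendbuf);
    cudaFree(recvbuf);
#else
    free(sendbuf);
    free(recvbuf);
#endif

    MPI_Finalize();
    return 0;
}
```

Building once with and once without -DUSE_GPU gives the two variants being compared: host buffers versus device buffers passed to the same MPI_Reduce call.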