Hi Rolf,
OK, I didn't know that. Sorry.

Yes, it would be a pretty important feature in cases where you are doing reduction operations on many, many entries in parallel. In such cases, each individual reduction is neither complex nor time-consuming, but potentially hundreds of thousands of reductions are done at the same time. This is definitely a point where a CUDA-aware implementation can give some performance improvements.

I'm curious: rather complex operations like allgatherv are CUDA-aware, but a reduction is not. Is there a reason for this? Is there documentation on which MPI calls are CUDA-aware and which are not?

Best regards
Peter

On 12/02/2013 02:18 PM, Rolf vandeVaart wrote:
> Thanks for the report. CUDA-aware Open MPI does not currently support doing
> reduction operations on GPU memory.
> Is this a feature you would be interested in?
>
> Rolf
>
>> -----Original Message-----
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Peter Zaspel
>> Sent: Friday, November 29, 2013 11:24 AM
>> To: us...@open-mpi.org
>> Subject: [OMPI users] Bug in MPI_REDUCE in CUDA-aware MPI
>>
>> Hi users list,
>>
>> I would like to report a bug in the CUDA-aware OpenMPI 1.7.3
>> implementation. I'm using CUDA 5.0 and Ubuntu 12.04.
>>
>> Attached, you will find an example code file, to reproduce the bug.
>> The point is that MPI_Reduce with normal CPU memory fully works but the
>> use of GPU memory leads to a segfault. (GPU memory is used when defining
>> USE_GPU).
>>
>> The segfault looks like this:
>>
>> [peak64g-36:25527] *** Process received signal ***
>> [peak64g-36:25527] Signal: Segmentation fault (11)
>> [peak64g-36:25527] Signal code: Invalid permissions (2)
>> [peak64g-36:25527] Failing at address: 0x600100200
>> [peak64g-36:25527] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7ff2abdb24a0]
>> [peak64g-36:25527] [ 1] /data/zaspel/openmpi-1.7.3_build/lib/libmpi.so.1(+0x7d410) [0x7ff2ac4b9410]
>> [peak64g-36:25527] [ 2] /data/zaspel/openmpi-1.7.3_build/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_reduce_intra_basic_linear+0x371) [0x7ff2a5987531]
>> [peak64g-36:25527] [ 3] /data/zaspel/openmpi-1.7.3_build/lib/libmpi.so.1(MPI_Reduce+0x135) [0x7ff2ac499d55]
>> [peak64g-36:25527] [ 4] /home/zaspel/testMPI/test_reduction() [0x400ca0]
>> [peak64g-36:25527] [ 5] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7ff2abd9d76d]
>> [peak64g-36:25527] [ 6] /home/zaspel/testMPI/test_reduction() [0x400af9]
>> [peak64g-36:25527] *** End of error message ***
>> --------------------------------------------------------------------------
>> mpirun noticed that process rank 0 with PID 25527 on node peak64g-36 exited
>> on signal 11 (Segmentation fault).
>> --------------------------------------------------------------------------
>>
>> Best regards,
>>
>> Peter
--
Dipl.-Inform. Peter Zaspel
Institut fuer Numerische Simulation, Universitaet Bonn
Wegelerstr.6, 53115 Bonn, Germany
tel: +49 228 73-2748       mailto:zas...@ins.uni-bonn.de
fax: +49 228 73-7527       http://wissrech.ins.uni-bonn.de/people/zaspel.html
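
To illustrate the workload discussed in this thread (many small, independent reductions done at once), here is a minimal sketch in plain C. It uses neither MPI nor CUDA: it only mimics the result the root rank would see from an element-wise MPI_Reduce with MPI_SUM on doubles. The function name `elementwise_sum_reduce` and the contiguous "one buffer per rank" layout are assumptions made for illustration; they are not taken from the attached test program or from Open MPI.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of MPI or Open MPI).
 *
 * send   : n_ranks buffers of `count` doubles each, stored back to back,
 *          so the buffer of rank r starts at send + r * count
 * recv   : output buffer of `count` doubles
 *
 * Computes recv[i] = sum over all ranks r of send[r * count + i],
 * i.e. the element-wise result of a sum reduction across ranks.
 */
void elementwise_sum_reduce(const double *send, size_t n_ranks,
                            size_t count, double *recv)
{
    assert(send != NULL && recv != NULL);

    /* Each index i is its own small reduction over n_ranks values.
     * The iterations of this outer loop are independent of each other,
     * which is what makes the pattern a good fit for a GPU. */
    for (size_t i = 0; i < count; ++i) {
        double acc = 0.0;
        for (size_t r = 0; r < n_ranks; ++r) {
            acc += send[r * count + i];
        }
        recv[i] = acc;
    }
}
```

With `count` in the hundreds of thousands and only a few ranks, the inner loop is short and the outer loop carries almost all of the work, which matches the case described above where each reduction is cheap but very many run at the same time.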