Hi Jiri, 

Thank you for your reply. That is good to know; I will ignore those 
error messages then.

Best regards,
Kristina

> Am 09.06.2016 um 09:13 schrieb Jiri Kraus <jkr...@nvidia.com>:
> 
> Hi Kristina,
> 
> Although it is reported by cuda-memcheck as an error, it really is an 
> expected return code from cuPointerGetAttributes. The CUDA-aware build of 
> OpenMPI calls cuPointerGetAttributes to query whether a pointer is a host 
> or a device pointer. Memory allocated with the system allocator (malloc, 
> global, stack, and static data) is not part of the Unified Virtual Address 
> space (UVA) known to the driver, and therefore cuPointerGetAttributes 
> returns CUDA_ERROR_INVALID_VALUE for such pointers. For OpenMPI this simply 
> means that the pointer is a host pointer.
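> 
> As a rough illustration (a minimal standalone sketch that uses the 
> single-attribute driver call cuPointerGetAttribute rather than OpenMPI's 
> actual code path), the kind of query that triggers the report looks like 
> this:
> 
>     #include <stdio.h>
>     #include <stdlib.h>
>     #include <cuda.h>
>     #include <cuda_runtime.h>
> 
>     int main(void)
>     {
>         cudaSetDevice(0);                              /* select the device, as in your reproducer */
> 
>         int *host_buf = (int *)malloc(sizeof(int));    /* system allocator, not in UVA */
>         int *dev_buf  = NULL;
>         cudaMalloc((void **)&dev_buf, sizeof(int));    /* device memory, part of UVA */
> 
>         CUmemorytype type = CU_MEMORYTYPE_HOST;
>         CUresult r_host = cuPointerGetAttribute(&type,
>                               CU_POINTER_ATTRIBUTE_MEMORY_TYPE,
>                               (CUdeviceptr)host_buf);
>         CUresult r_dev  = cuPointerGetAttribute(&type,
>                               CU_POINTER_ATTRIBUTE_MEMORY_TYPE,
>                               (CUdeviceptr)dev_buf);
> 
>         /* r_host is CUDA_ERROR_INVALID_VALUE -> treated as a host pointer,
>            r_dev is CUDA_SUCCESS and type is CU_MEMORYTYPE_DEVICE.          */
>         printf("malloc'd buffer: %d, cudaMalloc'd buffer: %d (type %d)\n",
>                (int)r_host, (int)r_dev, (int)type);
> 
>         cudaFree(dev_buf);
>         free(host_buf);
>         return 0;
>     }
> 
> Built with something along the lines of "nvcc ptrcheck.c -o ptrcheck -lcuda" 
> and run under cuda-memcheck, the failing query on the malloc'd buffer should 
> show up as the same kind of CUDA_ERROR_INVALID_VALUE report, even though the 
> program is behaving correctly.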
> 
> Hope this helps
> 
> Jiri
> 
>> -----Original Message-----
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of users-
>> requ...@open-mpi.org
>> Sent: Mittwoch, 8. Juni 2016 18:00
>> To: us...@open-mpi.org
>> Subject: users Digest, Vol 3525, Issue 3
>> 
>> Message: 1
>> Date: Wed, 8 Jun 2016 14:59:24 +0200
>> From: Kristina Tesch <kristina.te...@gmx.de>
>> To: us...@open-mpi.org
>> Subject: [OMPI users] cuda-memcheck reports errors for MPI functions after use of cudaSetDevice
>> 
>> Hello everyone,
>> 
>> In my application I use CUDA-aware OpenMPI 1.10.2 together with CUDA 7.5.
>> If I call cudaSetDevice(), cuda-memcheck reports the following error for
>> all subsequent MPI function calls:
>> 
>> ========= CUDA-MEMCHECK
>> ========= Program hit CUDA_ERROR_INVALID_VALUE (error 1) due to "invalid argument" on CUDA API call to cuPointerGetAttributes.
>> =========     Saved host backtrace up to driver entry point at error
>> =========     Host Frame:/usr/lib64/libcuda.so.1 (cuPointerGetAttributes + 0x18d) [0x144ffd]
>> =========     Host Frame:/modules/opt/spack/linux-x86_64/gcc-5.3.0/openmpicuda-1.10.2-y246ecjlhmkkh7lhbrgvdwpazc4mgetr/lib/libmpi.so.12 [0xb0f52]
>> =========     Host Frame:/modules/opt/spack/linux-x86_64/gcc-5.3.0/openmpicuda-1.10.2-y246ecjlhmkkh7lhbrgvdwpazc4mgetr/lib/libopen-pal.so.13 (mca_cuda_convertor_init + 0xac) [0x3cbcc]
>> =========     Host Frame:/modules/opt/spack/linux-x86_64/gcc-5.3.0/openmpicuda-1.10.2-y246ecjlhmkkh7lhbrgvdwpazc4mgetr/lib/libopen-pal.so.13 (opal_convertor_prepare_for_recv + 0x25) [0x33f65]
>> =========     Host Frame:/modules/opt/spack/linux-x86_64/gcc-5.3.0/openmpicuda-1.10.2-y246ecjlhmkkh7lhbrgvdwpazc4mgetr/lib/libmpi.so.12 (mca_pml_ob1_recv_req_start + 0x15e) [0x1b487e]
>> =========     Host Frame:/modules/opt/spack/linux-x86_64/gcc-5.3.0/openmpicuda-1.10.2-y246ecjlhmkkh7lhbrgvdwpazc4mgetr/lib/libmpi.so.12 (mca_pml_ob1_irecv + 0xc4) [0x1ab464]
>> =========     Host Frame:/modules/opt/spack/linux-x86_64/gcc-5.3.0/openmpicuda-1.10.2-y246ecjlhmkkh7lhbrgvdwpazc4mgetr/lib/libmpi.so.12 (ompi_coll_tuned_barrier_intra_recursivedoubling + 0xde) [0x13d79e]
>> =========     Host Frame:/modules/opt/spack/linux-x86_64/gcc-5.3.0/openmpicuda-1.10.2-y246ecjlhmkkh7lhbrgvdwpazc4mgetr/lib/libmpi.so.12 (MPI_Barrier + 0x72) [0x86eb2]
>> =========     Host Frame:./Errortest [0x2cb3]
>> =========     Host Frame:/usr/lib64/libc.so.6 (__libc_start_main + 0xf5) [0x21b15]
>> =========     Host Frame:./Errortest [0x2b99]
>> 
>> A minimal example that reproduces the error on my system is:
>> #include <cuda_runtime.h>
>> #include <mpi.h>
>> 
>> int main(int argc, char *argv[]) {
>> 
>>    MPI_Init(&argc, &argv);
>> 
>>    cudaSetDevice(0);
>> 
>>    MPI_Barrier(MPI_COMM_WORLD);
>> 
>>    MPI_Finalize();
>>    return 0;
>> }
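>> 
>> I build and run it roughly like this (the compiler wrapper and CUDA paths
>> are of course specific to my installation):
>> 
>>     mpicc Errortest.c -o Errortest -I$CUDA_HOME/include -L$CUDA_HOME/lib64 -lcudart
>>     mpirun -np 2 cuda-memcheck ./Errortest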
>> 
>> I see the same behavior when the order of cudaSetDevice() and MPI_Init()
>> is swapped. How can I avoid these errors and still select the GPU to work on?
>> 
>> Thank you,
>> Kristina
>> 
