>> return 1;
>> }
>> }
>>
>> for (int i = 0; i < num_threads; i++) {
>>   if (pthread_join(threads[i], NULL)) {
>>     fprintf(stderr, "Error joining thread\n");
Sent: Wednesday, November 27, 2019 5:43 PM
To: George Bosilca <bosi...@icl.utk.edu>
Cc: Zhang, Junchao <jczh...@mcs.anl.gov>; Open MPI Users <users@lists.open-mpi.org>
Subject: Re: [OMPI users] CUDA mpi question
I was pointed to "2.7. Synchronization and Memory Ordering" of
https://docs.nvidia.com/pdf/GPUDirect_RDMA.pdf. It is on topic, but
unfortunately it is too short and I could not understand it.
I also checked cudaStreamAddCallback/cudaLaunchHostFunc, whose documentation
says the host function "must not make any CUDA API calls".
On Wed, Nov 27, 2019 at 5:02 PM Zhang, Junchao wrote:
> On Wed, Nov 27, 2019 at 3:16 PM George Bosilca wrote:
>
>> Short and portable answer: you need to sync before the Isend or you will
>> send garbage data.
>>
> Ideally, I want to formulate my code into a series of asynchronous "kernel
> launch, kernel launch, ..." without synchronization.
Short and portable answer: you need to sync before the Isend or you will
send garbage data.
Assuming you are willing to go for a less portable solution you can get the
OMPI streams and add your kernels inside, so that the sequential order will
guarantee correctness of your isend. We have 2 hidden ...
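For reference, a minimal sketch of the portable approach described above
("sync before the Isend"), again assuming a CUDA-aware Open MPI so the device
buffer can be passed to MPI directly; compute_step, the ranks, and the message
size are placeholders. The kernels are chained on one stream, and a single
cudaStreamSynchronize on that stream before MPI_Isend is what guarantees the
buffer is complete; without it the send may pick up stale data.

// Sketch only: hypothetical names, assumes CUDA-aware Open MPI and 2 ranks.
#include <mpi.h>
#include <cuda_runtime.h>

#define N 1024

__global__ void compute_step(double *buf, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) buf[i] += 1.0;
}

int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);
  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  double *d_buf;
  cudaStream_t s;
  cudaStreamCreate(&s);
  cudaMalloc((void **)&d_buf, N * sizeof(double));
  cudaMemsetAsync(d_buf, 0, N * sizeof(double), s);

  // "kernel launch, kernel launch, ..." all ordered on the same stream
  compute_step<<<(N + 255) / 256, 256, 0, s>>>(d_buf, N);
  compute_step<<<(N + 255) / 256, 256, 0, s>>>(d_buf, N);

  // One synchronization before the Isend so the buffer is fully written
  cudaStreamSynchronize(s);

  MPI_Request req;
  if (rank == 0)
    MPI_Isend(d_buf, N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &req);
  else
    MPI_Irecv(d_buf, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &req);
  MPI_Wait(&req, MPI_STATUS_IGNORE);

  cudaFree(d_buf);
  cudaStreamDestroy(s);
  MPI_Finalize();
  return 0;
}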