On Sun, Dec 13, 2020 at 4:34 PM Barry Smith <[email protected]> wrote:
> > > On Dec 13, 2020, at 8:39 AM, Mark Adams <[email protected]> wrote: > > So how do I declare > > struct myMat { > PetscScalar *a; > PetscInt *i, *j, nrows; > } > > so that I can use, or give me an alternative idiom, > > struct myMat h_mat, *d_mat; > .... > cudaMemcpy( d_mat, &h_mat, sizeof(myMat), > cudaMemcpyHostToDevice)) > > > This will copy the pointers (not the entries in the arrays) which means > in both d_mat and h_mat a and i,j refer to Kokkos (or CUDA etc) memory so > Yes. Note, h_mat is, in my usage, a temporary container. It is a mix of host and device memory and is just there to stuff device pointers and raw (meta) data, to be copied whole onto the device. > > struct myMat { > PetscKokkosScalar *a; > PetscKokkosInt *i, *j; > > PetscInt nrows; > > } > > > the nrows is annoying because in d_mat it refers to device memory while > in h_mat it refers to host memory so we cannot label it clearly. > Kokkos wraps all this ugliness, but in raw Cuda I do it this manual way. It's low level and you have to keep track in your head (names help) what is going on. But this mixed h_mat is just around in a local scope. A better name for it might be h_temp_mat. I think your PetscKokkosScalar is a Scalar in the default execution's memory space. > > Sorry, by code reader I meant someone reading or maintaining the code. > > To me the above struct is clearer, I know immediately that *a and *i, > *j point to Kokkos space. While labeled as PetscScalar *a I have to figure > out somehow by examining other code where it points (this is a waste of > time I think). > I think Junchao is doing something like this in aijkokkosimpl.hpp, as I said. He and I use variable names to make it clear where they are (eg, d_mat for a Mat on the device). Junchoa had to use Kokkos data type to talk to Kokkos, but we both use naming conventions. Here is a whole method that does this idiom. I have a Mat with a "data" pointer (this is a bad name but this was a void* at first) to its device "self". I'm not sure how your suggestion would impact this. PETSC_EXTERN PetscErrorCode LandauCUDACreateMatMaps(P4estVertexMaps *maps, pointInterpolationP4est (*points)[LANDAU_MAX_Q_FACE], PetscInt Nf, PetscInt Nq) { P4estVertexMaps h_maps; PetscFunctionBegin; h_maps.num_elements =maps->num_elements; h_maps.num_face = maps->num_face; h_maps.num_reduced = maps->num_reduced; h_maps.deviceType = maps->deviceType; h_maps.Nf = Nf; h_maps.Nq = Nq; CUDA_SAFE_CALL(cudaMalloc((void **)&h_maps.c_maps, maps->num_reduced * sizeof *points)); CUDA_SAFE_CALL(cudaMemcpy( h_maps.c_maps, maps->c_maps, maps->num_reduced * sizeof *points, cudaMemcpyHostToDevice)); CUDA_SAFE_CALL(cudaMalloc((void **)&h_maps.gIdx, maps->num_elements * sizeof *maps->gIdx)); CUDA_SAFE_CALL(cudaMemcpy( h_maps.gIdx, maps->gIdx, maps->num_elements * sizeof *maps->gIdx, cudaMemcpyHostToDevice)); CUDA_SAFE_CALL(cudaMalloc((void **)&maps->data, sizeof(P4estVertexMaps))); CUDA_SAFE_CALL(cudaMemcpy( maps->data, &h_maps, sizeof(P4estVertexMaps), cudaMemcpyHostToDevice)); PetscFunctionReturn(0); } Mark
