Glad to hear it works. That said, without the MatShellSetVecType() call the code was erroring for me, not leaking memory. Were you also passing -vec_type cuda on the command line? Mark recently reported a similar leak, and I am wondering what caused yours. An MWE would be great.
BTW, the branch also provides MatDenseReplaceArray() now. The last thing I want to do is to let the user provide MatMat callbacks (AB and AtB at least) for MATSHELL. Can SLEPc benefit from such a feature?

> On May 10, 2020, at 7:47 PM, Jose E. Roman <[email protected]> wrote:
>
> Thanks for the hints. I have modified my branch. I was missing the
> MatShellSetVecType() call. Now everything works fine and all tests are clean.
>
> Jose
>
>> On May 9, 2020, at 21:32, Stefano Zampini <[email protected]> wrote:
>>
>> Jose
>>
>> I have just pushed an updated example with the MatMat operation, and I do
>> not see the memory leak. Can you check?
>>
>> zampins@jasmine:~/petsc$ make -f gmakefile.test test search='mat%' searchin='ex69' PETSC_OPTIONS='-malloc -malloc_dump -malloc_debug'
>> /usr/bin/python /home/zampins/petsc/config/gmakegentest.py --petsc-dir=/home/zampins/petsc --petsc-arch=arch-gpu-double-openmp-openblas --testdir=./arch-gpu-double-openmp-openblas/tests
>> Using MAKEFLAGS: -- PETSC_OPTIONS=-malloc -malloc_dump -malloc_debug searchin=ex69 search=mat%
>> CC arch-gpu-double-openmp-openblas/tests/mat/tests/ex69.o
>> CLINKER arch-gpu-double-openmp-openblas/tests/mat/tests/ex69
>> TEST arch-gpu-double-openmp-openblas/tests/counts/mat_tests-ex69_1.counts
>> ok mat_tests-ex69_1+nsize-1test-0_l-0_use_shell-0
>> ok diff-mat_tests-ex69_1+nsize-1test-0_l-0_use_shell-0
>> ok mat_tests-ex69_1+nsize-1test-0_l-0_use_shell-1
>> ok diff-mat_tests-ex69_1+nsize-1test-0_l-0_use_shell-1
>> ok mat_tests-ex69_1+nsize-1test-0_l-5_use_shell-0
>> ok diff-mat_tests-ex69_1+nsize-1test-0_l-5_use_shell-0
>> ok mat_tests-ex69_1+nsize-1test-0_l-5_use_shell-1
>> ok diff-mat_tests-ex69_1+nsize-1test-0_l-5_use_shell-1
>> ok mat_tests-ex69_1+nsize-1test-1_l-0_use_shell-0
>> ok diff-mat_tests-ex69_1+nsize-1test-1_l-0_use_shell-0
>> ok mat_tests-ex69_1+nsize-1test-1_l-0_use_shell-1
>> ok diff-mat_tests-ex69_1+nsize-1test-1_l-0_use_shell-1
>> ok mat_tests-ex69_1+nsize-1test-1_l-5_use_shell-0
>> ok diff-mat_tests-ex69_1+nsize-1test-1_l-5_use_shell-0
>> ok mat_tests-ex69_1+nsize-1test-1_l-5_use_shell-1
>> ok diff-mat_tests-ex69_1+nsize-1test-1_l-5_use_shell-1
>> ok mat_tests-ex69_1+nsize-1test-2_l-0_use_shell-0
>> ok diff-mat_tests-ex69_1+nsize-1test-2_l-0_use_shell-0
>> ok mat_tests-ex69_1+nsize-1test-2_l-0_use_shell-1
>> ok diff-mat_tests-ex69_1+nsize-1test-2_l-0_use_shell-1
>> ok mat_tests-ex69_1+nsize-1test-2_l-5_use_shell-0
>> ok diff-mat_tests-ex69_1+nsize-1test-2_l-5_use_shell-0
>> ok mat_tests-ex69_1+nsize-1test-2_l-5_use_shell-1
>> ok diff-mat_tests-ex69_1+nsize-1test-2_l-5_use_shell-1
>> ok mat_tests-ex69_1+nsize-2test-0_l-0_use_shell-0
>> ok diff-mat_tests-ex69_1+nsize-2test-0_l-0_use_shell-0
>> ok mat_tests-ex69_1+nsize-2test-0_l-0_use_shell-1
>> ok diff-mat_tests-ex69_1+nsize-2test-0_l-0_use_shell-1
>> ok mat_tests-ex69_1+nsize-2test-0_l-5_use_shell-0
>> ok diff-mat_tests-ex69_1+nsize-2test-0_l-5_use_shell-0
>> ok mat_tests-ex69_1+nsize-2test-0_l-5_use_shell-1
>> ok diff-mat_tests-ex69_1+nsize-2test-0_l-5_use_shell-1
>> ok mat_tests-ex69_1+nsize-2test-1_l-0_use_shell-0
>> ok diff-mat_tests-ex69_1+nsize-2test-1_l-0_use_shell-0
>> ok mat_tests-ex69_1+nsize-2test-1_l-0_use_shell-1
>> ok diff-mat_tests-ex69_1+nsize-2test-1_l-0_use_shell-1
>> ok mat_tests-ex69_1+nsize-2test-1_l-5_use_shell-0
>> ok diff-mat_tests-ex69_1+nsize-2test-1_l-5_use_shell-0
>> ok mat_tests-ex69_1+nsize-2test-1_l-5_use_shell-1
>> ok diff-mat_tests-ex69_1+nsize-2test-1_l-5_use_shell-1
>> ok mat_tests-ex69_1+nsize-2test-2_l-0_use_shell-0
>> ok diff-mat_tests-ex69_1+nsize-2test-2_l-0_use_shell-0
>> ok mat_tests-ex69_1+nsize-2test-2_l-0_use_shell-1
>> ok diff-mat_tests-ex69_1+nsize-2test-2_l-0_use_shell-1
>> ok mat_tests-ex69_1+nsize-2test-2_l-5_use_shell-0
>> ok diff-mat_tests-ex69_1+nsize-2test-2_l-5_use_shell-0
>> ok mat_tests-ex69_1+nsize-2test-2_l-5_use_shell-1
>> ok diff-mat_tests-ex69_1+nsize-2test-2_l-5_use_shell-1
>>
>> # -------------
>> # Summary
>> # -------------
>> # success 48/48 tests (100.0%)
>> # failed 0/48 tests (0.0%)
>> # todo 0/48 tests (0.0%)
>> # skip 0/48 tests (0.0%)
>> #
>> # Wall clock time for tests: 58 sec
>> # Approximate CPU time (not incl. build time): 62.11 sec
>> #
>> # Timing summary (actual test time / total CPU time):
>> # mat_tests-ex69_1: 2.30 sec / 62.11 sec
>>
>>> On May 9, 2020, at 9:28 PM, Jose E. Roman <[email protected]> wrote:
>>>
>>>> On May 9, 2020, at 20:00, Stefano Zampini <[email protected]> wrote:
>>>>
>>>> On Sat, May 9, 2020 at 19:43, Jose E. Roman <[email protected]> wrote:
>>>>
>>>>> On May 9, 2020, at 12:45, Stefano Zampini <[email protected]> wrote:
>>>>>
>>>>> Jose
>>>>>
>>>>> I have just pushed a test
>>>>> https://gitlab.com/petsc/petsc/-/blob/d64c2bc63c8d5d1a8c689f1abc762ae2722bba26/src/mat/tests/ex69.c
>>>>> See if it fits your framework, and feel free to modify the test to add
>>>>> more checks
>>>>
>>>> Almost good. The following modification of the example fails with -test 1:
>>>>
>>>> diff --git a/src/mat/tests/ex69.c b/src/mat/tests/ex69.c
>>>> index e562f1e2e3..2df2c89be1 100644
>>>> --- a/src/mat/tests/ex69.c
>>>> +++ b/src/mat/tests/ex69.c
>>>> @@ -84,6 +84,10 @@ int main(int argc,char **argv)
>>>>    }
>>>>    ierr = VecCUDARestoreArray(v,&vv);CHKERRQ(ierr);
>>>>
>>>> +  if (test==1) {
>>>> +    ierr = MatDenseCUDAGetArray(B,&aa);CHKERRQ(ierr);
>>>> +    if (aa) SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_USER,"Expected a null pointer");
>>>> +  }
>>>>
>>>>    /* free work space */
>>>>    ierr = MatDestroy(&B);CHKERRQ(ierr);
>>>>
>>>> I would expect that after MatDenseCUDAResetArray() the pointer is NULL
>>>> because it was set so in line 60. In the CPU counterpart it works as
>>>> expected.
>>>>
>>>> Pushed a fix for this, thanks.
>>>>
>>>> Another comment is: in line 60 you have changed MatDenseCUDAPlaceArray()
>>>> to MatDenseCUDAReplaceArray(). This is ok, but it is strange because
>>>> MatDenseReplaceArray() does not exist. So the interface is different in
>>>> GPU vs CPU, but I guess it is necessary here.
>>>>
>>>> I think we do not support calling PlaceArray twice anywhere in PETSc. This is
>>>> why I have added MatDenseCUDAReplaceArray(). If you need support for the
>>>> CPU case too, I can add it.
>>>
>>> Yes, please. It is better to have the same thing in both cases.
>>>
>>> I am attaching the modified example, which now performs a mat-mat product. If I
>>> do A*B it works well, but if I replace A with a shell matrix I get a memory
>>> leak.
>>>
>>> [ 0]32 bytes VecCUDAAllocateCheck() line 34 in /home/users/proy/copa/jroman/soft/petsc/src/vec/vec/impls/seq/seqcuda/veccuda2.cu
>>> [ 0]32 bytes VecCUDAAllocateCheck() line 34 in /home/users/proy/copa/jroman/soft/petsc/src/vec/vec/impls/seq/seqcuda/veccuda2.cu
>>>
>>>> Thanks.
>>>> Jose
>>>>
>>>>> On Fri, May 8, 2020 at 18:48, Jose E. Roman <[email protected]> wrote:
>>>>> Attached. Run with -test 1 or -test 2
>>>>>
>>>>>> On May 8, 2020, at 17:14, Stefano Zampini <[email protected]> wrote:
>>>>>>
>>>>>> Jose
>>>>>>
>>>>>> Just send me a MWE and I'll fix the case for you
>>>>>>
>>>>>> Thanks
>>>>>> Stefano
>>>>>
>>>>> --
>>>>> Stefano
>>>>
>>>> --
>>>> Stefano
>>> <ex69.c>
