> On May 10, 2020, at 8:56 PM, Jose E. Roman <[email protected]> wrote:
>
>
>
>> El 10 may 2020, a las 19:42, Stefano Zampini <[email protected]>
>> escribió:
>>
>> Glad to hear it works. Anyway, without the MatShellVecSetType call the code
>> was erroring for me, not leaking memory.
>> Where you also providing -vec_type cuda at command line or what?
>> Mark recently noted a similar leak, and I was wondering what was the cause
>> for yours. A MWE would be great.
>
> I never use -vec_type cuda because all SLEPc tests that use CUDA create
> vectors with MatCreateVecs(). Until now I had never tested examples with
> shell matrices.
>
> This kind of issues would be more easy to detect if macros such as
> PetscCheckSameTypeAndComm() would actually compare type_name.
VECCUDA specific code checks for type names (either VECSEQCUDA or VECMPICUDA)
and these errored in my case. (Try by simply removing MatShellSetVecType)
zampins@jasmine:~/petsc/src/mat/tests$ git diff
diff --git a/src/mat/tests/ex69.c b/src/mat/tests/ex69.c
index b04652d..2a9374d 100644
--- a/src/mat/tests/ex69.c
+++ b/src/mat/tests/ex69.c
@@ -89,7 +89,7 @@ int main(int argc,char **argv)
if (use_shell) {
ierr =
MatCreateShell(PetscObjectComm((PetscObject)v),nloc,nloc,n,n,A,&S);CHKERRQ(ierr);
ierr =
MatShellSetOperation(S,MATOP_MULT,(void(*)(void))MatMult_S);CHKERRQ(ierr);
- ierr = MatShellSetVecType(S,vtype);CHKERRQ(ierr);
+ //ierr = MatShellSetVecType(S,vtype);CHKERRQ(ierr);
/* we could have called the general convertor also */
/* ierr = MatConvert(A,MATSHELL,MAT_INITIAL_MATRIX,&S);CHKERRQ(ierr); */
} else {
zampins@jasmine:~/petsc/src/mat/tests$ ./ex69 -use_shell
[0]PETSC ERROR: --------------------- Error Message
--------------------------------------------------------------
[0]PETSC ERROR: Invalid argument
[0]PETSC ERROR: Object (seq) is not seqcuda or mpicuda
[0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for
trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.13-213-g441560f GIT Date:
2020-04-25 17:01:13 +0300
[0]PETSC ERROR: ./ex69 on a arch-gpu-double-openmp-openblas named jasmine by
zampins Sun May 10 21:03:09 2020
[0]PETSC ERROR: Configure options --download-cub=1 --download-hara-commit=HEAD
--download-hara=1 --download-kblas=1 --download-magma=1
--download-openblas-use-pthreads=1 --download-openblas=1
--download-superlu_dist-commit=HEAD --download-superlu_dist=1
--download-mumps=1 --download-scalapack=1
--with-cc=/opt/ecrc/intel/2018/compilers_and_libraries/linux/mpi/intel64/bin/mpicc
--with-cuda-gencodearch=37 --with-cuda=1
--with-cudac=/opt/ecrc/cuda/10.1/bin/nvcc --with-cxx-dialect=C++11
--with-cxx=/opt/ecrc/intel/2018/compilers_and_libraries/linux/mpi/intel64/bin/mpicxx
--with-fc=/opt/ecrc/intel/2018/compilers_and_libraries/linux/mpi/intel64/bin/mpif90
--with-fortran-bindings=0 --with-magma-fortran-bindings=0
--with-opencl-include=/opt/ecrc/cuda/10.1/include
--with-opencl-lib="-L/opt/ecrc/cuda/10.1/lib64 -lOpenCL" --with-openmp=1
--with-precision=double PETSC_ARCH=arch-gpu-double-openmp-openblas
[0]PETSC ERROR: #1 VecCUDAGetArrayRead() line 2206 in
/home/zampins/petsc/src/vec/vec/interface/rvector.c
[0]PETSC ERROR: #2 MatMultAdd_SeqAIJCUSPARSE() line 1448 in
/home/zampins/petsc/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu
[0]PETSC ERROR: #3 MatMult_SeqAIJCUSPARSE() line 1375 in
/home/zampins/petsc/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu
[0]PETSC ERROR: #4 MatMult() line 2392 in
/home/zampins/petsc/src/mat/interface/matrix.c
[0]PETSC ERROR: #5 MatMult_S() line 12 in ex69.c
[0]PETSC ERROR: #6 MatMult_Shell() line 586 in
/home/zampins/petsc/src/mat/impls/shell/shell.c
[0]PETSC ERROR: #7 MatMult() line 2392 in
/home/zampins/petsc/src/mat/interface/matrix.c
[0]PETSC ERROR: #8 MatProductNumeric_X_Dense() line 305 in
/home/zampins/petsc/src/mat/interface/matproduct.c
[0]PETSC ERROR: #9 MatProductNumeric() line 976 in
/home/zampins/petsc/src/mat/interface/matproduct.c
[0]PETSC ERROR: #10 main() line 109 in ex69.c
[0]PETSC ERROR: PETSc Option Table entries:
[0]PETSC ERROR: -use_gpu_aware_mpi 0
[0]PETSC ERROR: -use_shell
[0]PETSC ERROR: ----------------End of Error Message -------send entire error
message to [email protected]
application called MPI_Abort(MPI_COMM_SELF, 109062) - process 0
[unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=109062
:
system msg for write_line failure : Bad file descriptor
>
>
>>
>> BTW, the branch also provide MatDenseReplaceArray() now.
>
> Yes, I saw it. Thanks.
>
>> Last thing I want to do is to support the user to provide MatMat (AB and AtB
>> at least) callbacks for MATSHELL
>> Can SLEPc benefit from such a feature ?
>
> Some solvers yes.
>
>>
>>> On May 10, 2020, at 7:47 PM, Jose E. Roman <[email protected]> wrote:
>>>
>>> Thanks for the hints. I have modified my branch. I was missing the
>>> MatShellSetVecType() call. Now everything works fine and all tests are
>>> clean.
>>>
>>> Jose
>>>
>>>
>>>> El 9 may 2020, a las 21:32, Stefano Zampini <[email protected]>
>>>> escribió:
>>>>
>>>> Jose
>>>>
>>>> I have just pushed an updated example with the MatMat operation, and I do
>>>> not see the memory leak. Can you check?
>>>>
>>>> zampins@jasmine:~/petsc$ make -f gmakefile.test test search='mat%'
>>>> searchin='ex69' PETSC_OPTIONS='-malloc -malloc_dump -malloc_debug'
>>>> /usr/bin/python /home/zampins/petsc/config/gmakegentest.py
>>>> --petsc-dir=/home/zampins/petsc
>>>> --petsc-arch=arch-gpu-double-openmp-openblas
>>>> --testdir=./arch-gpu-double-openmp-openblas/tests
>>>> Using MAKEFLAGS: -- PETSC_OPTIONS=-malloc -malloc_dump -malloc_debug
>>>> searchin=ex69 search=mat%
>>>> CC arch-gpu-double-openmp-openblas/tests/mat/tests/ex69.o
>>>> CLINKER arch-gpu-double-openmp-openblas/tests/mat/tests/ex69
>>>> TEST
>>>> arch-gpu-double-openmp-openblas/tests/counts/mat_tests-ex69_1.counts
>>>> ok mat_tests-ex69_1+nsize-1test-0_l-0_use_shell-0
>>>> ok diff-mat_tests-ex69_1+nsize-1test-0_l-0_use_shell-0
>>>> ok mat_tests-ex69_1+nsize-1test-0_l-0_use_shell-1
>>>> ok diff-mat_tests-ex69_1+nsize-1test-0_l-0_use_shell-1
>>>> ok mat_tests-ex69_1+nsize-1test-0_l-5_use_shell-0
>>>> ok diff-mat_tests-ex69_1+nsize-1test-0_l-5_use_shell-0
>>>> ok mat_tests-ex69_1+nsize-1test-0_l-5_use_shell-1
>>>> ok diff-mat_tests-ex69_1+nsize-1test-0_l-5_use_shell-1
>>>> ok mat_tests-ex69_1+nsize-1test-1_l-0_use_shell-0
>>>> ok diff-mat_tests-ex69_1+nsize-1test-1_l-0_use_shell-0
>>>> ok mat_tests-ex69_1+nsize-1test-1_l-0_use_shell-1
>>>> ok diff-mat_tests-ex69_1+nsize-1test-1_l-0_use_shell-1
>>>> ok mat_tests-ex69_1+nsize-1test-1_l-5_use_shell-0
>>>> ok diff-mat_tests-ex69_1+nsize-1test-1_l-5_use_shell-0
>>>> ok mat_tests-ex69_1+nsize-1test-1_l-5_use_shell-1
>>>> ok diff-mat_tests-ex69_1+nsize-1test-1_l-5_use_shell-1
>>>> ok mat_tests-ex69_1+nsize-1test-2_l-0_use_shell-0
>>>> ok diff-mat_tests-ex69_1+nsize-1test-2_l-0_use_shell-0
>>>> ok mat_tests-ex69_1+nsize-1test-2_l-0_use_shell-1
>>>> ok diff-mat_tests-ex69_1+nsize-1test-2_l-0_use_shell-1
>>>> ok mat_tests-ex69_1+nsize-1test-2_l-5_use_shell-0
>>>> ok diff-mat_tests-ex69_1+nsize-1test-2_l-5_use_shell-0
>>>> ok mat_tests-ex69_1+nsize-1test-2_l-5_use_shell-1
>>>> ok diff-mat_tests-ex69_1+nsize-1test-2_l-5_use_shell-1
>>>> ok mat_tests-ex69_1+nsize-2test-0_l-0_use_shell-0
>>>> ok diff-mat_tests-ex69_1+nsize-2test-0_l-0_use_shell-0
>>>> ok mat_tests-ex69_1+nsize-2test-0_l-0_use_shell-1
>>>> ok diff-mat_tests-ex69_1+nsize-2test-0_l-0_use_shell-1
>>>> ok mat_tests-ex69_1+nsize-2test-0_l-5_use_shell-0
>>>> ok diff-mat_tests-ex69_1+nsize-2test-0_l-5_use_shell-0
>>>> ok mat_tests-ex69_1+nsize-2test-0_l-5_use_shell-1
>>>> ok diff-mat_tests-ex69_1+nsize-2test-0_l-5_use_shell-1
>>>> ok mat_tests-ex69_1+nsize-2test-1_l-0_use_shell-0
>>>> ok diff-mat_tests-ex69_1+nsize-2test-1_l-0_use_shell-0
>>>> ok mat_tests-ex69_1+nsize-2test-1_l-0_use_shell-1
>>>> ok diff-mat_tests-ex69_1+nsize-2test-1_l-0_use_shell-1
>>>> ok mat_tests-ex69_1+nsize-2test-1_l-5_use_shell-0
>>>> ok diff-mat_tests-ex69_1+nsize-2test-1_l-5_use_shell-0
>>>> ok mat_tests-ex69_1+nsize-2test-1_l-5_use_shell-1
>>>> ok diff-mat_tests-ex69_1+nsize-2test-1_l-5_use_shell-1
>>>> ok mat_tests-ex69_1+nsize-2test-2_l-0_use_shell-0
>>>> ok diff-mat_tests-ex69_1+nsize-2test-2_l-0_use_shell-0
>>>> ok mat_tests-ex69_1+nsize-2test-2_l-0_use_shell-1
>>>> ok diff-mat_tests-ex69_1+nsize-2test-2_l-0_use_shell-1
>>>> ok mat_tests-ex69_1+nsize-2test-2_l-5_use_shell-0
>>>> ok diff-mat_tests-ex69_1+nsize-2test-2_l-5_use_shell-0
>>>> ok mat_tests-ex69_1+nsize-2test-2_l-5_use_shell-1
>>>> ok diff-mat_tests-ex69_1+nsize-2test-2_l-5_use_shell-1
>>>>
>>>> # -------------
>>>> # Summary
>>>> # -------------
>>>> # success 48/48 tests (100.0%)
>>>> # failed 0/48 tests (0.0%)
>>>> # todo 0/48 tests (0.0%)
>>>> # skip 0/48 tests (0.0%)
>>>> #
>>>> # Wall clock time for tests: 58 sec
>>>> # Approximate CPU time (not incl. build time): 62.11 sec
>>>> #
>>>> # Timing summary (actual test time / total CPU time):
>>>> # mat_tests-ex69_1: 2.30 sec / 62.11 sec
>>>>
>>>>> On May 9, 2020, at 9:28 PM, Jose E. Roman <[email protected]> wrote:
>>>>>
>>>>>
>>>>>
>>>>>> El 9 may 2020, a las 20:00, Stefano Zampini <[email protected]>
>>>>>> escribió:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Il giorno sab 9 mag 2020 alle ore 19:43 Jose E. Roman
>>>>>> <[email protected]> ha scritto:
>>>>>>
>>>>>>
>>>>>>> El 9 may 2020, a las 12:45, Stefano Zampini <[email protected]>
>>>>>>> escribió:
>>>>>>>
>>>>>>> Jose
>>>>>>>
>>>>>>> I have just pushed a test
>>>>>>> https://gitlab.com/petsc/petsc/-/blob/d64c2bc63c8d5d1a8c689f1abc762ae2722bba26/src/mat/tests/ex69.c
>>>>>>> See if it fits your framework, and feel free to modify the test to add
>>>>>>> more checks
>>>>>>
>>>>>> Almost good. The following modification of the example fails with -test
>>>>>> 1:
>>>>>>
>>>>>>
>>>>>> diff --git a/src/mat/tests/ex69.c b/src/mat/tests/ex69.c
>>>>>> index e562f1e2e3..2df2c89be1 100644
>>>>>> --- a/src/mat/tests/ex69.c
>>>>>> +++ b/src/mat/tests/ex69.c
>>>>>> @@ -84,6 +84,10 @@ int main(int argc,char **argv)
>>>>>> }
>>>>>> ierr = VecCUDARestoreArray(v,&vv);CHKERRQ(ierr);
>>>>>>
>>>>>> + if (test==1) {
>>>>>> + ierr = MatDenseCUDAGetArray(B,&aa);CHKERRQ(ierr);
>>>>>> + if (aa) SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_USER,"Expected a null
>>>>>> pointer");
>>>>>> + }
>>>>>>
>>>>>> /* free work space */
>>>>>> ierr = MatDestroy(&B);CHKERRQ(ierr);
>>>>>>
>>>>>>
>>>>>>
>>>>>> I would expect that after MatDenseCUDAResetArray() the pointer is NULL
>>>>>> because it was set so in line 60. In the CPU counterpart it works as
>>>>>> expected.
>>>>>>
>>>>>> Pushed a fix for this, thanks.
>>>>>>
>>>>>> Another comment is: in line 60 you have changed MatDenseCUDAPlaceArray()
>>>>>> to MatDenseCUDAReplaceArray(). This is ok, but it is strange because
>>>>>> MatDenseReplaceArray() does not exist. So the interface is different in
>>>>>> GPU vs CPU, but I guess it is necessary here.
>>>>>>
>>>>>> I think we do not support calling PlaceArray twice anywhere PETSc. This
>>>>>> is why I have added MatDenseCUDAReplaceArray(). If you need support for
>>>>>> the CPU case too, I can add it.
>>>>>
>>>>> Yes, please. It is better to have the same thing in both cases.
>>>>>
>>>>> I am attaching the modified example, now performs a mat-mat product. If I
>>>>> do A*B it works well, but if I replace A with a shell matrix I get a
>>>>> memory leak.
>>>>>
>>>>> [ 0]32 bytes VecCUDAAllocateCheck() line 34 in
>>>>> /home/users/proy/copa/jroman/soft/petsc/src/vec/vec/impls/seq/seqcuda/veccuda2.cu
>>>>> [ 0]32 bytes VecCUDAAllocateCheck() line 34 in
>>>>> /home/users/proy/copa/jroman/soft/petsc/src/vec/vec/impls/seq/seqcuda/veccuda2.cu
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks.
>>>>>> Jose
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Il giorno ven 8 mag 2020 alle ore 18:48 Jose E. Roman
>>>>>>> <[email protected]> ha scritto:
>>>>>>> Attached. Run with -test 1 or -test 2
>>>>>>>
>>>>>>>> El 8 may 2020, a las 17:14, Stefano Zampini
>>>>>>>> <[email protected]> escribió:
>>>>>>>>
>>>>>>>> Jose
>>>>>>>>
>>>>>>>> Just send me a MWE and I’ll fix the case for you
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Stefano
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Stefano
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Stefano
>>>>> <ex69.c>
>>>>
>>>
>>
>