Yes, these messages are caused by a lack of resources, even though the error
message says cuSPARSE was not initialized.
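
A rough way to look for this kind of contention is to watch GPU memory while the 8-rank test runs. The sketch below is only an illustration: the helper names (`parse_gpu_memory`, `nearly_full`, `query_gpu_memory`) and the 0.9 threshold are made up here and are not part of PETSc. It assumes `nvidia-smi` is on the PATH and reports numeric memory values. Memory is only one of the resources ranks can compete for, so a clean result does not rule contention out.

```python
import subprocess

# Fields requested from nvidia-smi; with "nounits" the memory values are in MiB.
QUERY = "index,memory.used,memory.total"


def parse_gpu_memory(csv_text):
    """Parse 'index, used, total' CSV lines into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        index, used, total = (field.strip() for field in line.split(","))
        gpus.append(
            {"index": int(index), "used_mib": int(used), "total_mib": int(total)}
        )
    return gpus


def nearly_full(gpus, threshold=0.9):
    """Return indices of GPUs whose used/total memory ratio is >= threshold."""
    return [g["index"] for g in gpus if g["used_mib"] / g["total_mib"] >= threshold]


def query_gpu_memory():
    """Run nvidia-smi once and parse its output (assumes numeric fields)."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_gpu_memory(out)
```

Calling `nearly_full(query_gpu_memory())` from a second shell while the test is running would list any GPUs that are close to full at that moment.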

  Barry


> On Dec 9, 2020, at 8:01 PM, Junchao Zhang <[email protected]> wrote:
> 
> Could be GPU resource competition. Note this test uses nsize=8.
> --Junchao Zhang
> 
> 
> On Wed, Dec 9, 2020 at 7:15 PM Mark Adams <[email protected]> wrote:
> And this is a CUDA 11 complex build:
> https://gitlab.com/petsc/petsc/-/jobs/901108135
> On Wed, Dec 9, 2020 at 8:11 PM Mark Adams <[email protected]> wrote:
> My MR is generating an error. The error message says cusparse has not been
> initialized, so I added a cusparse init, but I still get the error (appended;
> branch adams/landau-gpu-assembly,
> https://gitlab.com/petsc/petsc/-/tree/adams/landau-gpu-assembly). Any
> ideas would be appreciated.
> 
> I am trying to reproduce this on Summit and it fails with a timeout limit of 
> 60s, but it only runs for a few seconds (see timers). Any ideas?
> 
> 19:58 adams/landau-gpu-assembly= ~/petsc$ make -f gmakefile test search='ksp_ksp_tutorials-ex71_bddc_cusparse' PETSC_ARCH=arch-summit-opt-gnu-cuda
> Using MAKEFLAGS: PETSC_ARCH=arch-summit-opt-gnu-cuda search=ksp_ksp_tutorials-ex71_bddc_cusparse
>         TEST arch-summit-opt-gnu-cuda/tests/counts/ksp_ksp_tutorials-ex71_bddc_cusparse.counts
> not ok ksp_ksp_tutorials-ex71_bddc_cusparse # Exceeded timeout limit of 60 s
>  ok ksp_ksp_tutorials-ex71_bddc_cusparse # SKIP Command failed so no diff
> 
> # -------------
> #   Summary
> # -------------
> # FAILED ksp_ksp_tutorials-ex71_bddc_cusparse
> # success 0/1 tests (0.0%)
> # failed 1/1 tests (100.0%)
> # todo 0/1 tests (0.0%)
> # skip 0/1 tests (0.0%)
> #
> # Wall clock time for tests: 3 sec
> # Approximate CPU time (not incl. build time): 3.14 sec
> 
> 
> 
> 
> 
> not ok ksp_ksp_tutorials-ex71_bddc_cusparse # Error code: 201
> # [1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> # [1]PETSC ERROR: GPU error
> # [1]PETSC ERROR: cuSPARSE error 1 (CUSPARSE_STATUS_NOT_INITIALIZED) : initialization error
> # [1]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> # [1]PETSC ERROR: Petsc Development GIT revision: v3.14.2-85-gd60087d  GIT Date: 2020-12-09 17:49:59 -0500
> # [1]PETSC ERROR: ../ex71 on a  named frog by petsc Wed Dec  9 18:41:10 2020
> # [1]PETSC ERROR: Configure options --package-prefix-hash=/home/petsc/petsc-hash-pkgs --with-make-test-np=2 COPTFLAGS="-g -O" FOPTFLAGS="-g -O" CXXOPTFLAGS="-g -O" --with-scalar-type=complex --with-precision=single --with-cuda-dir=/usr/local/cuda-11.0 PETSC_ARCH=arch-ci-linux-cuda11-complex
> # [1]PETSC ERROR: #1 MatConvert_SeqAIJ_SeqAIJCUSPARSE() line 2708 in /home/petsc/builds/KFnbdjNX/0/petsc/petsc/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu
> # [1]PETSC ERROR: #2 MatCreate_SeqAIJCUSPARSE() line 2739 in /home/petsc/builds/KFnbdjNX/0/petsc/petsc/src/mat/impls/aij/seq/seqcusparse/aijcusparse.cu
