Mark,

   There are several possible issues.

1)  Calling PetscOptions inside threads. I looked quickly at the code and it 
seems like it should be ok, but perhaps it is not. This is one reason why 
having stuff like PetscOptionsBegin inside a low-level creation routine such as 
VecCreate_SeqCUDA_Private is normally not done in PETSc. Eventually this needs 
to be moved or reworked. 

2) PetscCUDAInitializeCheck is not thread safe. If it is being called for the 
first time by multiple threads there can be trouble. So edit init.c and under

#if defined(PETSC_HAVE_CUDA)
  ierr = PetscOptionsCheckCUDA(logView);CHKERRQ(ierr);
#endif

#if defined(PETSC_HAVE_HIP)
  ierr = PetscOptionsCheckHIP(logView);CHKERRQ(ierr);
#endif

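put in something like (pseudocode)

#if defined thread safety
PetscCUPMInitializeCheck
#endif

that is, when PETSc is built with thread safety, call the CUDA (or HIP) 
initialize check right there. This will force the initialization to be done 
before any threads are used.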

3) a million more possibilities

I would fix 2 first and if that does not solve the problem comment out the 
options stuff in 1 and see if that resolves the problem.
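
The idea behind 2 can be sketched in plain C. This is only an illustration of 
the "initialize once, serially, before the threaded loop" pattern, not PETSc 
code: DeviceInitializeCheck, SolveBlock, and SolveAllBlocks are hypothetical 
names standing in for the lazy CUDA initialize check and the per-block solves 
in the OpenMP loop.

```c
#include <assert.h>

/* Hypothetical stand-in for a lazily initialized device runtime.
   These names are illustrative only; they are not PETSc APIs. */
static int device_initialized = 0;
static int init_calls         = 0;

/* Not thread safe on first use: two threads could both observe
   device_initialized == 0 and both run the initialization. */
static void DeviceInitializeCheck(void)
{
  if (!device_initialized) {
    init_calls++;
    device_initialized = 1;
  }
}

/* Per-block work that lazily checks initialization, the way a
   low-level creation routine might. */
static int SolveBlock(int block)
{
  DeviceInitializeCheck(); /* only reads the flag once it is already set */
  return 2 * block;
}

/* Force the initialization serially, then run the threaded loop.
   Returns how many times the initialization body actually ran. */
int SolveAllBlocks(int nblocks, int *results)
{
  int i;

  DeviceInitializeCheck(); /* done before any threads are used */
  assert(device_initialized);

#pragma omp parallel for
  for (i = 0; i < nblocks; i++) results[i] = SolveBlock(i);

  return init_calls;
}
```

With the serial call in place, every thread finds the flag already set, so the 
unsafe first-time path is never entered concurrently. Without OpenMP enabled 
the pragma is ignored and the loop simply runs serially.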

Good luck,

  Barry




> On Jan 17, 2021, at 4:45 PM, Mark Adams <[email protected]> wrote:
> 
> I have put the Fieldsplit additive solver loop in an OpenMP loop and it seems 
> to work with CPU solvers. The VecScatters seem to work and the KSP solver 
> seems to work on the CPU. This is an MPI serial code.
> 
> It fails with a cuSparse Jacobi solver with the error below, on each thread 
> (two threads/blocks in this example).
> I am not following this stack trace completely; e.g., is the error perhaps in 
> VecInitializePackage? 
> I don't understand the thread safety mechanism (e.g., I'm not sure why the CPU 
> solvers worked), but I am thinking that this mechanism did not get used 
> properly in cuSparse. Understandable. 
> 
> Can anyone shed any light on what might be going wrong here?
> 
> Thanks,
> Mark
> 
> ======= Backtrace: =========
> /lib64/libc.so.6(cfree+0x4a0)[0x200021839be0]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(PetscFreeAlign+0x4c)[0x2000002a368c]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(PetscOptionsEnd_Private+0xf4)[0x2000002e53f0]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x7c6c28)[0x2000008b6c28]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecCreate_SeqCUDA+0x11c)[0x20000052c510]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecSetType+0x670)[0x200000549664]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecCreateSeqCUDA+0x150)[0x20000052c0b0]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x43c198)[0x20000052c198]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecDuplicate+0x44)[0x200000542168]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecDuplicateVecs_Default+0x148)[0x200000543820]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecDuplicateVecs+0x54)[0x2000005425f4]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(KSPCreateVecs+0x4b4)[0x2000016f0aec]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x16b8b84)[0x2000017a8b84]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(KSPSetUp+0x918)[0x2000016d8368]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x15ec754)[0x2000016dc754]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(KSPSolve+0x44)[0x2000016df100]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x14ff5a8)[0x2000015ef5a8]
> /sw/summit/gcc/6.4.0/lib64/libgomp.so.1(GOMP_parallel+0x74)[0x200021711074]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x14efb28)[0x2000015dfb28]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(PCApply+0x588)[0x200001685c70]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x16b6374)[0x2000017a6374]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x16b6738)[0x2000017a6738]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x15ed594)[0x2000016dd594]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(KSPSolve+0x44)[0x2000016df100]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(SNESSolve_NEWTONLS+0x11f4)[0x20000189dcd8]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(SNESSolve+0x1b54)[0x2000018401a8]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x18b0d3c)[0x2000019a0d3c]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(TSStep+0x408)[0x2000019284a0]
> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(TSSolve+0x15cc)[0x20000192a334]
> ./ex2[0x1000a348]
> /lib64/libc.so.6(+0x25200)[0x2000217c5200]
> /lib64/libc.so.6(__libc_start_main+0xc4)[0x2000217c53f4]
