Thanks Pierre, that works.

On Sun, Jan 24, 2021 at 11:42 AM Pierre Jolivet <[email protected]> wrote:
> On 24 Jan 2021, at 4:54 PM, Mark Adams <[email protected]> wrote:
>
> Hi Sherry, I have this running with OMP, with cuSparse solves (PETSc CPU
> factorizations)
>
> Building SuperLU_dist w/o _OPENMP was not easy for me.
>
> Expanding on Sherry’s answer on Thu Jan 21, it should be as easy as adding
> to your configure script
> '--download-superlu-cmake-arguments=-Denable_openmp=FALSE'
> '--download-superlu_dist-cmake-arguments=-Denable_openmp=FALSE'.
> Is that not easy enough, or is it not working?
>
> Thanks,
> Pierre
>
> We need to get a better way to do this. (Satish or Barry?)
>
> SuperLU works with one thread and two subdomains. With two threads I see
> this (appended). So this seems to be working in that before it was hanging.
>
> I set the solver up so that it does not use threads the first time it is
> called so that solvers can get any lazy allocations done in serial. This is
> just to be safe in that we do not use a Krylov method here and I don't
> believe "preonly" allocates any work vectors, and SuperLU does the symbolic
> factorizations without threads.
>
> Let me know how you want to proceed.
>
> Thanks,
> Mark
>
> ijcusparse -dm_vec_type cuda' NC=2 |g energy
> 0) species-0: charge density= -1.6022862392985e+01 z-momentum= -3.4369550192576e-19 energy= 9.6063873494138e+04
> 0) species-1: charge density= 1.6029950760009e+01 z-momentum= -2.7844197929124e-18 energy= 9.6333444502318e+04
> 0) Total: charge density= 7.0883670236874e-03, momentum= -3.1281152948382e-18, energy= 1.9239731799646e+05 (m_i[0]/m_e = 1835.47, 92 cells)
> ex2: /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/externalpackages/git.superlu_dist/SRC/dSchCompUdt-cuda.c:157: pdgstrf: Assertion `jjj-1<nub' failed.
> [h16n13:21073] *** Process received signal ***
> [h16n13:21073] Signal: Aborted (6)
> [h16n13:21073] Signal code: User function (kill, sigsend, abort, etc.) (0)
> [h16n13:21073] [ 0] [0x2000000504d8]
> [h16n13:21073] [ 1] /lib64/libc.so.6(abort+0x2b4)[0x200020ef2094]
> [h16n13:21073] [ 2] /lib64/libc.so.6(+0x356d4)[0x200020ee56d4]
> [h16n13:21073] [ 3] /lib64/libc.so.6(__assert_fail+0x64)[0x200020ee57c4]
> [h16n13:21073] [ 4] /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libsuperlu_dist.so.6(pdgstrf+0x3848)[0x2000022fe5d8]
> [h16n13:21073] [ 5] /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libsuperlu_dist.so.6(pdgssvx+0x1220)[0x2000022dc4a8]
> [h16n13:21073] [ 6] /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x9aff28)[0x200000a9ff28]
> [h16n13:21073] [ 7] /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(MatLUFactorNumeric+0x144)[0x2000007d273c]
> [h16n13:21073] [ 8] /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0xecffc4)[0x200000fbffc4]
> [h16n13:21073] [ 9] /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(PCSetUp+0x134)[0x20000107dd38]
> [h16n13:21073] [10] /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(KSPSetUp+0x9f8)[0x2000010b272c]
> [h16n13:21073] [11] /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0xfc46f0)[0x2000010b46f0]
> [h16n13:21073] [12] /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(KSPSolve+0x20)[0x2000010b6fb8]
> [h16n13:21073] [13] /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0xf1e4f8)[0x20000100e4f8]
> [h16n13:21073] [14] /sw/summit/gcc/6.4.0/lib64/libgomp.so.1(+0x1a51c)[0x200020e2a51c]
> [h16n13:21073] [15] /lib64/libpthread.so.0(+0x8b94)[0x200020e78b94]
> [h16n13:21073] [16] /lib64/libc.so.6(clone+0xe4)[0x200020fd85f4]
> [h16n13:21073] *** End of error message ***
> ERROR: One or more process (first noticed rank 0) terminated with signal 6 (core dumped)
> make: [runasm] Error 134 (ignored)
>
> On Thu, Jan 21, 2021 at 11:57 PM Xiaoye S. Li <[email protected]> wrote:
>
>> All the OpenMP calls are surrounded by
>>
>> #ifdef _OPENMP
>> ...
>> #endif
>>
>> You can disable openmp during CMake installation, with the following:
>> -Denable_openmp=FALSE
>> (the default is true)
>>
>> (I think Satish knows how to do this with PETSc installation)
>>
>> -------
>> The reason to use mixed MPI & OpenMP is mainly less memory consumption,
>> compared to pure MPI. Timewise probably it is just slightly faster. (I
>> think that's the case with many codes.)
>>
>> Sherry
>>
>> On Thu, Jan 21, 2021 at 7:20 PM Mark Adams <[email protected]> wrote:
>>
>>> On Thu, Jan 21, 2021 at 10:16 PM Barry Smith <[email protected]> wrote:
>>>
>>>> On Jan 21, 2021, at 9:11 PM, Mark Adams <[email protected]> wrote:
>>>>
>>>> I have tried it and it hangs, but that is expected. This is not
>>>> something she has prepared for.
>>>>
>>>> I am working with Sherry on it.
>>>>
>>>> And she is fine with just one thread and suggests it if she is in a
>>>> thread.
>>>>
>>>> Now that I think about it, I don't understand why she needs OpenMP if
>>>> she can live with OMP_NUM_THREADS=1.
>>>>
>>>> It is very possible it was just a coding decision by one of her
>>>> students and with a few ifdef in her code she would not need the OpenMP
>>>> but I don't have the time or energy to check her code and design decision.
>>>>
>>>
>>> Oh yea there are OMP calls like omp_num_threads() that need something. There
>>> is probably an omp1.h file somewhere in the world like our serial MPI.
>>>
>>>> Barry
>>>>
>>>> Mark
>>>>
>>>> On Thu, Jan 21, 2021 at 9:30 PM Barry Smith <[email protected]> wrote:
>>>>
>>>>> On Jan 21, 2021, at 5:37 PM, Mark Adams <[email protected]> wrote:
>>>>>
>>>>> This did not work. I verified that MPI_Init_thread is being called
>>>>> correctly and that MPI returns that it supports this highest level of
>>>>> thread safety.
>>>>>
>>>>> I am going to ask ORNL.
>>>>>
>>>>> And if I use:
>>>>>
>>>>> -fieldsplit_i1_ksp_norm_type none
>>>>> -fieldsplit_i1_ksp_max_it 300
>>>>>
>>>>> for all 9 "i" variables, I can run normal iterations on the 10th
>>>>> variable, in a 10 species problem, and it works perfectly with 10 threads.
>>>>>
>>>>> So it is definitely that VecNorm is not thread safe.
>>>>>
>>>>> And, I want to call SuperLU_dist, which uses threads, but I don't want
>>>>> SuperLU to start using threads. Is there a way to tell SuperLU that there
>>>>> are no threads but have PETSc use them?
>>>>>
>>>>> My interpretation and Satish's for many years is that SuperLU_DIST
>>>>> has to be built with and use OpenMP in order to work with CUDA.
>>>>>
>>>>>   def formCMakeConfigureArgs(self):
>>>>>     args = config.package.CMakePackage.formCMakeConfigureArgs(self)
>>>>>     if self.openmp.found:
>>>>>       self.usesopenmp = 'yes'
>>>>>     else:
>>>>>       args.append('-DCMAKE_DISABLE_FIND_PACKAGE_OpenMP=TRUE')
>>>>>     if self.cuda.found:
>>>>>       if not self.openmp.found:
>>>>>         raise RuntimeError('SuperLU_DIST GPU code currently requires OpenMP. Use --with-openmp=1')
>>>>>
>>>>> But this could be ok. You use OpenMP and then it uses OpenMP
>>>>> internally, each doing their own business (what could go wrong :-)).
>>>>>
>>>>> Have you tried it?
>>>>>
>>>>> Barry
>>>>>
>>>>> Thanks,
>>>>> Mark
>>>>>
>>>>> On Thu, Jan 21, 2021 at 5:19 PM Mark Adams <[email protected]> wrote:
>>>>>
>>>>>> OK, the problem is probably:
>>>>>>
>>>>>> PetscMPIInt PETSC_MPI_THREAD_REQUIRED = MPI_THREAD_FUNNELED;
>>>>>>
>>>>>> There is an example that sets:
>>>>>>
>>>>>> PETSC_MPI_THREAD_REQUIRED = MPI_THREAD_MULTIPLE;
>>>>>>
>>>>>> This is what I need.
>>>>>>
>>>>>> On Thu, Jan 21, 2021 at 2:26 PM Mark Adams <[email protected]> wrote:
>>>>>>
>>>>>>> On Thu, Jan 21, 2021 at 2:11 PM Matthew Knepley <[email protected]> wrote:
>>>>>>>
>>>>>>>> On Thu, Jan 21, 2021 at 2:02 PM Mark Adams <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> On Thu, Jan 21, 2021 at 1:44 PM Matthew Knepley <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> On Thu, Jan 21, 2021 at 11:16 AM Mark Adams <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yes, the problem is that each KSP solver is running in an OMP
>>>>>>>>>>> thread (so at this point it only works for SELF, and it's Landau so
>>>>>>>>>>> it is all I need). It looks like MPI reductions called with a
>>>>>>>>>>> comm_self are not thread safe (e.g., they could say, this is one
>>>>>>>>>>> proc, thus, just copy send --> recv, but they don't)
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Instead of using SELF, how about Comm_dup() for each thread?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> OK, raw MPI_Comm_dup. I tried PetscCommDup. Let me try this.
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>
>>>>>>>> You would have to dup them all outside the OMP section, since it is
>>>>>>>> not threadsafe. Then each thread uses one I think.
>>>>>>>>
>>>>>>>
>>>>>>> Yea sure. I do it in SetUp.
>>>>>>>
>>>>>>> Well that worked to get *different Comms*, finally, but I still get the
>>>>>>> same problem. The number of iterations differs wildly. This is two
>>>>>>> species and two threads (13 SNES its, and that is not deterministic).
>>>>>>> Way below is one thread (8 its) and fairly uniform iteration counts.
>>>>>>>
>>>>>>> Maybe this MPI is just not thread safe at all. Let me look into it.
>>>>>>> Thanks anyway,
>>>>>>>
>>>>>>>   0 SNES Function norm 4.974994975313e-03
>>>>>>> In PCFieldSplitSetFields_FieldSplit with -------------- link: 0x80017c60. Comms pc=0x67ad27c0 ksp=*0x7ffe1600* newcomm=0x8014b6e0
>>>>>>> In PCFieldSplitSetFields_FieldSplit with -------------- link: 0x7ffdabc0. Comms pc=0x67ad27c0 ksp=*0x7fff70d0* newcomm=0x7ffe9980
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 282
>>>>>>>   1 SNES Function norm 1.836376279964e-05
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 19
>>>>>>>   2 SNES Function norm 3.059930074740e-07
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 15
>>>>>>>   3 SNES Function norm 4.744275398121e-08
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 4
>>>>>>>   4 SNES Function norm 4.014828563316e-08
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 456
>>>>>>>   5 SNES Function norm 5.670836337808e-09
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 2
>>>>>>>   6 SNES Function norm 2.410421401323e-09
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 18
>>>>>>>   7 SNES Function norm 6.533948191791e-10
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 458
>>>>>>>   8 SNES Function norm 1.008133815842e-10
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 9
>>>>>>>   9 SNES Function norm 1.690450876038e-11
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 4
>>>>>>>  10 SNES Function norm 1.336383986009e-11
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 463
>>>>>>>  11 SNES Function norm 1.873022410774e-12
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 113
>>>>>>>  12 SNES Function norm 1.801834606518e-13
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_ATOL iterations 1
>>>>>>>  13 SNES Function norm 1.004397317339e-13
>>>>>>> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 13
>>>>>>>
>>>>>>>   0 SNES Function norm 4.974994975313e-03
>>>>>>> In PCFieldSplitSetFields_FieldSplit with -------------- link: 0x6e265010. Comms pc=0x56450340 ksp=0x6e2168d0 newcomm=0x6e265090
>>>>>>> In PCFieldSplitSetFields_FieldSplit with -------------- link: 0x6e25bc40. Comms pc=0x56450340 ksp=0x6e22c1d0 newcomm=0x6e21e8f0
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 282
>>>>>>>   1 SNES Function norm 1.836376279963e-05
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 380
>>>>>>>   2 SNES Function norm 3.018499983019e-07
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 387
>>>>>>>   3 SNES Function norm 1.826353175637e-08
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 391
>>>>>>>   4 SNES Function norm 1.378600599548e-09
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 392
>>>>>>>   5 SNES Function norm 1.077289085611e-10
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 394
>>>>>>>   6 SNES Function norm 8.571891727748e-12
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 395
>>>>>>>   7 SNES Function norm 6.897647643450e-13
>>>>>>>     Linear fieldsplit_e_ solve converged due to CONVERGED_RTOL iterations 395
>>>>>>>   8 SNES Function norm 5.606434614114e-14
>>>>>>> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 8
>>>>>>>
>>>>>>>>
>>>>>>>> Matt
>>>>>>>>
>>>>>>>>> Matt
>>>>>>>>>>
>>>>>>>>>>> On Thu, Jan 21, 2021 at 10:46 AM Matthew Knepley <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jan 21, 2021 at 10:34 AM Mark Adams <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> It looks like PETSc is just too clever for me. I am trying to
>>>>>>>>>>>>> get a different MPI_Comm into each block, but PETSc is thwarting
>>>>>>>>>>>>> me:
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> It looks like you are using SELF. Is that what you want? Do you
>>>>>>>>>>>> want a bunch of comms with the same group, but independent
>>>>>>>>>>>> somehow? I am confused.
>>>>>>>>>>>>
>>>>>>>>>>>> Matt
>>>>>>>>>>>>
>>>>>>>>>>>>>   if (jac->use_openmp) {
>>>>>>>>>>>>>     ierr = KSPCreate(MPI_COMM_SELF,&ilink->ksp);CHKERRQ(ierr);
>>>>>>>>>>>>>     PetscPrintf(PETSC_COMM_SELF,"In PCFieldSplitSetFields_FieldSplit with -------------- link: %p. Comms %p %p\n",ilink,PetscObjectComm((PetscObject)pc),PetscObjectComm((PetscObject)ilink->ksp));
>>>>>>>>>>>>>   } else {
>>>>>>>>>>>>>     ierr = KSPCreate(PetscObjectComm((PetscObject)pc),&ilink->ksp);CHKERRQ(ierr);
>>>>>>>>>>>>>   }
>>>>>>>>>>>>>
>>>>>>>>>>>>> produces:
>>>>>>>>>>>>>
>>>>>>>>>>>>> In PCFieldSplitSetFields_FieldSplit with -------------- link: 0x7e9cb4f0. Comms 0x660c6ad0 0x660c6ad0
>>>>>>>>>>>>> In PCFieldSplitSetFields_FieldSplit with -------------- link: 0x7e88f7d0. Comms 0x660c6ad0 0x660c6ad0
>>>>>>>>>>>>>
>>>>>>>>>>>>> How can I work around this?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jan 21, 2021 at 7:41 AM Mark Adams <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Jan 20, 2021 at 6:21 PM Barry Smith <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Jan 20, 2021, at 3:09 PM, Mark Adams <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So I put in a temporary hack to get the first Fieldsplit
>>>>>>>>>>>>>>> apply to NOT use OMP and it sort of works.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Preonly/lu is fine. GMRES calls vector creates/dups in every
>>>>>>>>>>>>>>> solve so that is a big problem.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It should definitely not be creating vectors "in every"
>>>>>>>>>>>>>>> solve. But it does do lazy allocation of needed restart
>>>>>>>>>>>>>>> vectors, which may make it look like it is creating vectors
>>>>>>>>>>>>>>> in "every" solve. You can use -ksp_gmres_preallocate to force
>>>>>>>>>>>>>>> it to create all the restart vectors up front at KSPSetUp().
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Well, I run the first solve w/o OMP and I see Vec dups in
>>>>>>>>>>>>>> cuSparse Vecs in the 2nd solve.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Why is creating vectors "at every solve" a problem? It is
>>>>>>>>>>>>>>> not thread safe I guess?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It dies when it looks at the options database, in a Free in
>>>>>>>>>>>>>> the get-options method to be exact (see stacks).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ======= Backtrace: =========
>>>>>>>>>>>>>> /lib64/libc.so.6(cfree+0x4a0)[0x200021839be0]
>>>>>>>>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(PetscFreeAlign+0x4c)[0x2000002a368c]
>>>>>>>>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(PetscOptionsEnd_Private+0xf4)[0x2000002e53f0]
>>>>>>>>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x7c6c28)[0x2000008b6c28]
>>>>>>>>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecCreate_SeqCUDA+0x11c)[0x20000052c510]
>>>>>>>>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecSetType+0x670)[0x200000549664]
>>>>>>>>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecCreateSeqCUDA+0x150)[0x20000052c0b0]
>>>>>>>>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(+0x43c198)[0x20000052c198]
>>>>>>>>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecDuplicate+0x44)[0x200000542168]
>>>>>>>>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecDuplicateVecs_Default+0x148)[0x200000543820]
>>>>>>>>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(VecDuplicateVecs+0x54)[0x2000005425f4]
>>>>>>>>>>>>>> /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-cuda-omp/lib/libpetsc.so.3.014(KSPCreateVecs+0x4b4)[0x2000016f0aec]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Richardson works except the convergence test gets confused,
>>>>>>>>>>>>>>> presumably because MPI reductions with PETSC_COMM_SELF are not
>>>>>>>>>>>>>>> threadsafe.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> One fix for the norms might be to create each
>>>>>>>>>>>>>>> subdomain solver with a different communicator.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes you could do that. It might actually be the correct
>>>>>>>>>>>>>>> thing to do also: if you have multiple threads call MPI
>>>>>>>>>>>>>>> reductions on the same communicator, that would be a problem.
>>>>>>>>>>>>>>> Each KSP should get a new MPI_Comm.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> OK. I will only do this.
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> What most experimenters take for granted before they begin
>>>>>>>>>>>> their experiments is infinitely more interesting than any results
>>>>>>>>>>>> to which their experiments lead.
>>>>>>>>>>>> -- Norbert Wiener
>>>>>>>>>>>>
>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>>>>>>> <http://www.cse.buffalo.edu/~knepley/>
