Please run with the -malloc_debug option or, even better, run under Valgrind: https://petsc.org/release/faq/
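
As a rough sketch of what those two runs might look like for the command quoted below: the process count NP=2 is a placeholder (the thread does not say how many ranks are needed to trigger the crash), only some of the solver options are repeated, and the Valgrind flags follow the pattern suggested in the PETSc FAQ, so please check the FAQ for the exact invocation on your system.

```shell
# Placeholder rank count; use the smallest number of processes that still fails.
NP=2
# Solver options copied from the quoted message; append the remaining ones as needed.
OPTS="-ksp_type cg -pc_type pfmg -dm_mat_type hyprestruct"

# 1) PETSc's own allocation checking
mpiexec -n "$NP" ./cobpor $OPTS -malloc_debug

# 2) Valgrind memcheck, one log file per process (%p expands to the PID);
#    "-malloc off" is meant to turn off PETSc's malloc wrapper so Valgrind sees the raw allocations
mpiexec -n "$NP" valgrind --tool=memcheck -q --num-callers=20 \
    --log-file=valgrind.log.%p ./cobpor $OPTS -malloc off
```

The first invalid write or free that Valgrind reports is usually closer to the real bug than the point where glibc finally aborts.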
> On Oct 26, 2023, at 10:35 AM, Joauma Marichal via petsc-users <[email protected]> wrote:
>
> Hello,
>
> Here is a very simple version where I have issues.
>
> Which I run as follows:
>
> cd Grid_generation
> make clean
> make all
> ./grid_generation
> cd ..
> make clean
> make all
> ./cobpor # on 1 proc
> # OR
> mpiexec ./cobpor -ksp_type cg -pc_type pfmg -dm_mat_type hyprestruct -pc_pfmg_skip_relax 1 -pc_pfmg_rap_time non-Galerkin # on multiple procs
>
> The error that I get is the following:
> munmap_chunk(): invalid pointer
> [cns266:2552391] *** Process received signal ***
> [cns266:2552391] Signal: Aborted (6)
> [cns266:2552391] Signal code: (-6)
> [cns266:2552391] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7fd7fd194b20]
> [cns266:2552391] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7fd7fd194a9f]
> [cns266:2552391] [ 2] /lib64/libc.so.6(abort+0x127)[0x7fd7fd167e05]
> [cns266:2552391] [ 3] /lib64/libc.so.6(+0x91037)[0x7fd7fd1d7037]
> [cns266:2552391] [ 4] /lib64/libc.so.6(+0x9819c)[0x7fd7fd1de19c]
> [cns266:2552391] [ 5] /lib64/libc.so.6(+0x9844c)[0x7fd7fd1de44c]
> [cns266:2552391] [ 6] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscFreeAlign+0xe)[0x7fd7fe63d50e]
> [cns266:2552391] [ 7] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetMatType+0x3d)[0x7fd7feab87ad]
> [cns266:2552391] [ 8] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetFromOptions+0x109)[0x7fd7feab8b59]
> [cns266:2552391] [ 9] ./cobpor[0x402df9]
> [cns266:2552391] [10] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7fd7fd180cf3]
> [cns266:2552391] [11] ./cobpor[0x40304e]
> [cns266:2552391] *** End of error message ***
>
> Thanks a lot for your help.
>
> Best regards,
>
> Joauma
>
> From: Matthew Knepley <[email protected]>
> Date: Wednesday, 25 October 2023 at 14:45
> To: Joauma Marichal <[email protected]>
> Cc: [email protected] <[email protected]>, [email protected] <[email protected]>
> Subject: Re: [petsc-maint] DMSwarm on multiple processors
>
> On Wed, Oct 25, 2023 at 8:32 AM Joauma Marichal via petsc-maint <[email protected]> wrote:
> Hello,
>
> I am using the DMSwarm library in some Eulerian-Lagrangian approach to have vapor bubbles in water.
> I have obtained nice results recently and wanted to perform bigger simulations. Unfortunately, when I increase the number of processors used to run the simulation, I get the following error:
>
> free(): invalid size
> [cns136:590327] *** Process received signal ***
> [cns136:590327] Signal: Aborted (6)
> [cns136:590327] Signal code: (-6)
> [cns136:590327] [ 0] /lib64/libc.so.6(+0x4eb20)[0x7f56cd4c9b20]
> [cns136:590327] [ 1] /lib64/libc.so.6(gsignal+0x10f)[0x7f56cd4c9a9f]
> [cns136:590327] [ 2] /lib64/libc.so.6(abort+0x127)[0x7f56cd49ce05]
> [cns136:590327] [ 3] /lib64/libc.so.6(+0x91037)[0x7f56cd50c037]
> [cns136:590327] [ 4] /lib64/libc.so.6(+0x9819c)[0x7f56cd51319c]
> [cns136:590327] [ 5] /lib64/libc.so.6(+0x99aac)[0x7f56cd514aac]
> [cns136:590327] [ 6] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUpRanks+0x4c4)[0x7f56cea71e64]
> [cns136:590327] [ 7] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(+0x841642)[0x7f56cea83642]
> [cns136:590327] [ 8] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(PetscSFSetUp+0x9e)[0x7f56cea7043e]
> [cns136:590327] [ 9] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(VecScatterCreate+0x164e)[0x7f56cea7bbde]
> [cns136:590327] [10] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA_3D+0x3e38)[0x7f56cee84dd8]
> [cns136:590327] [11] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp_DA+0xd8)[0x7f56cee9b448]
> [cns136:590327] [12] /gpfs/home/acad/ucl-tfl/marichaj/marha/lib_petsc/lib/libpetsc.so.3.019(DMSetUp+0x20)[0x7f56cededa20]
> [cns136:590327] [13] ./cobpor[0x4418dc]
> [cns136:590327] [14] ./cobpor[0x408b63]
> [cns136:590327] [15] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f56cd4b5cf3]
> [cns136:590327] [16] ./cobpor[0x40bdee]
> [cns136:590327] *** End of error message ***
> --------------------------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpiexec noticed that process rank 84 with PID 590327 on node cns136 exited on signal 6 (Aborted).
> --------------------------------------------------------------------------
>
> When I reduce the number of processors the error disappears and when I run my code without the vapor bubbles it also works.
> The problem seems to take place at this moment:
>
> DMCreate(PETSC_COMM_WORLD,swarm);
> DMSetType(*swarm,DMSWARM);
> DMSetDimension(*swarm,3);
> DMSwarmSetType(*swarm,DMSWARM_PIC);
> DMSwarmSetCellDM(*swarm,*dmcell);
>
> Thanks a lot for your help.
>
> Things that would help us track this down:
>
> 1) The smallest example where it fails
> 2) The smallest number of processes where it fails
> 3) A stack trace of the failure
> 4) A simple example that we can run that also fails
>
> Thanks,
>
> Matt
>
> Best regards,
>
> Joauma
>
> --
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
