Re: [petsc-users] performance regression with GAMG

Stephan Kramer Mon, 14 Aug 2023 08:03:36 -0700

Many thanks for looking into this, Mark

My 3D tests were not that different and I see you lowered the threshold.
Note, you can set the threshold to zero, but your test is running so much
differently than mine there is something else going on.
Note, the new, bad, coarsening rate of 30:1 is what we tend to shoot for
in 3D.


So it is not clear what the problem is.  Some questions:

* do you have a picture of this mesh to show me?

It's just a standard hexahedral cubed sphere mesh with the refinement level giving the number of times each of the six sides has been subdivided: so Level_5 means 2^5 x 2^5 squares per side, which are extruded to 16 layers. So the total number of elements at Level_5 is 6 x 32 x 32 x 16 = 98304 hexes. And everything doubles in all 3 dimensions (so 2^3) going to the next Level.
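
As a quick check of the element counts described above, here is a minimal Python sketch. The function name `hex_count` and its parameters are made up for illustration; it assumes the number of layers doubles with each level starting from 16 at Level_5, which is how "everything doubles in all 3 dimensions" is read here.

```python
def hex_count(level, base_level=5, base_layers=16):
    """Number of hexes in the extruded cubed sphere mesh at a given level.

    Assumption: 6 sides, 2^level x 2^level squares per side, and a layer
    count that doubles per level starting from base_layers at base_level.
    """
    if level < base_level:
        raise ValueError("layer count is only described from Level_5 upwards")
    sides = 6
    squares_per_edge = 2 ** level
    layers = base_layers * 2 ** (level - base_level)
    return sides * squares_per_edge * squares_per_edge * layers


for level in (5, 6, 7):
    print(f"Level_{level}: {hex_count(level)} hexes")
```

Each level should come out 8 times larger than the previous one, matching the 2^3 growth mentioned above.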

* what do you mean by Q1-Q2 elements?

Q2-Q1, basically Taylor-Hood on hexes, so (tri)quadratic for velocity and (tri)linear for pressure.

I guess you could argue we could/should just do good old geometric multigrid instead. More generally, we do use this solver configuration a lot for tetrahedral Taylor-Hood (P2-P1), in particular also for our adaptive mesh runs. Would it be worth checking whether we have the same performance issues with tetrahedral P2-P1?


It would be nice to see if the new and old codes are similar without
aggressive coarsening.
Aggressive coarsening was the intended target of the major change in this
time frame, as you noticed.
If these jobs are easy to run, could you check that the old and new
versions are similar with "-pc_gamg_square_graph 0" (and you only need
one time step)?
All you need to do is check that the first coarse grid has about the same
number of equations (large).

Unfortunately we're seeing some memory errors when we use this option, and I'm not entirely clear whether we're just running out of memory and need to put it on a special queue.

The run with square_graph 0 using new PETSc managed to get through one solve at level 5, and is giving the following mg levels:


          rows=174, cols=174, bs=6
          total: nonzeros=30276, allocated nonzeros=30276
--
          rows=2106, cols=2106, bs=6
          total: nonzeros=4238532, allocated nonzeros=4238532
--
          rows=21828, cols=21828, bs=6
          total: nonzeros=62588232, allocated nonzeros=62588232
--
          rows=589824, cols=589824, bs=6
          total: nonzeros=1082528928, allocated nonzeros=1082528928
--
          rows=2433222, cols=2433222, bs=3
          total: nonzeros=456526098, allocated nonzeros=456526098

comparing with square_graph 100 with new PETSc

          rows=96, cols=96, bs=6
          total: nonzeros=9216, allocated nonzeros=9216
--
          rows=1440, cols=1440, bs=6
          total: nonzeros=647856, allocated nonzeros=647856
--
          rows=97242, cols=97242, bs=6
          total: nonzeros=65656836, allocated nonzeros=65656836
--
          rows=2433222, cols=2433222, bs=3
          total: nonzeros=456526098, allocated nonzeros=456526098

and old PETSc with square_graph 100

          rows=90, cols=90, bs=6
          total: nonzeros=8100, allocated nonzeros=8100
--
          rows=1872, cols=1872, bs=6
          total: nonzeros=1234080, allocated nonzeros=1234080
--
          rows=47652, cols=47652, bs=6
          total: nonzeros=23343264, allocated nonzeros=23343264
--
          rows=2433222, cols=2433222, bs=3
          total: nonzeros=456526098, allocated nonzeros=456526098
--

Unfortunately old PETSc with square_graph 0 did not complete a single solve before giving the memory error.
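
To make the three hierarchies above easier to compare, here is a small Python sketch that turns the listed level sizes into coarsening ratios and an operator complexity (sum of nonzeros over all levels divided by the nonzeros of the finest level). The numbers are copied from the excerpts above; the dictionary and function names are only for illustration, and taking ratios per block row (rows / bs) rather than per row is a choice made here because the block size changes from 3 to 6 on the first coarse level.

```python
# Level sizes copied from the excerpts above, finest level first:
# (rows, block size, nonzeros).
hierarchies = {
    "new PETSc, square_graph 0": [
        (2433222, 3, 456526098),
        (589824, 6, 1082528928),
        (21828, 6, 62588232),
        (2106, 6, 4238532),
        (174, 6, 30276),
    ],
    "new PETSc, square_graph 100": [
        (2433222, 3, 456526098),
        (97242, 6, 65656836),
        (1440, 6, 647856),
        (96, 6, 9216),
    ],
    "old PETSc, square_graph 100": [
        (2433222, 3, 456526098),
        (47652, 6, 23343264),
        (1872, 6, 1234080),
        (90, 6, 8100),
    ],
}


def coarsening_ratios(levels):
    """Ratio of block rows (rows / bs) between consecutive levels."""
    nodes = [rows / bs for rows, bs, _ in levels]
    return [fine / coarse for fine, coarse in zip(nodes, nodes[1:])]


def row_ratios(levels):
    """Ratio of raw row counts between consecutive levels."""
    rows = [r for r, _, _ in levels]
    return [fine / coarse for fine, coarse in zip(rows, rows[1:])]


def operator_complexity(levels):
    """Total nonzeros over all levels relative to the finest level."""
    return sum(nnz for _, _, nnz in levels) / levels[0][2]


for name, levels in hierarchies.items():
    per_node = ", ".join(f"{r:.1f}" for r in coarsening_ratios(levels))
    per_row = ", ".join(f"{r:.1f}" for r in row_ratios(levels))
    print(name)
    print(f"  coarsening per block row: {per_node}")
    print(f"  coarsening per row:       {per_row}")
    print(f"  operator complexity:      {operator_complexity(levels):.2f}")
```

With these numbers, the square_graph 0 hierarchy stores roughly 3.5 times the nonzeros of the fine operator, which may be part of why those runs hit memory errors, though that is only a guess from the level sizes.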


BTW, I am starting to think I should add the old method back as an option.
I did not think this change would cause large differences.

Yes, I think that would be much appreciated. Let us know if we can do any testing.


Best wishes
Stephan


Thanks,
Mark

Note that we are providing the rigid body near nullspace,
hence the bs=3 to bs=6.
We have tried different values for the gamg_threshold but it doesn't
really seem to significantly alter the coarsening amount in that first
step.

Do you have any suggestions for further things we should try/look at?
Any feedback would be much appreciated

Best wishes
Stephan Kramer

Full logs including log_view timings available from
https://github.com/stephankramer/petsc-scaling/

In particular:


https://github.com/stephankramer/petsc-scaling/blob/main/before/Level_5/output_2.dat

https://github.com/stephankramer/petsc-scaling/blob/main/after/Level_5/output_2.dat

https://github.com/stephankramer/petsc-scaling/blob/main/before/Level_6/output_2.dat

https://github.com/stephankramer/petsc-scaling/blob/main/after/Level_6/output_2.dat

https://github.com/stephankramer/petsc-scaling/blob/main/before/Level_7/output_2.dat

https://github.com/stephankramer/petsc-scaling/blob/main/after/Level_7/output_2.dat
