Dear Lucas,

Without seeing your code, it is difficult to nail down the issue. But by far the most common mistake that has led to this type of problem in my own codes is forgetting to initialize the AffineConstraints object with an index set of the locally relevant DoFs, i.e., missing this line: https://github.com/dealii/dealii/blob/ea23d6bb90739b6bd2f7af96a2b9b73bb10c7298/examples/step-40/step-40.cc#L294
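For reference, the relevant pattern looks roughly like the following fragment from a setup_system()-style function (a sketch only; 'dof_handler' and 'constraints' are placeholders for your own objects):

#include <deal.II/dofs/dof_tools.h>
#include <deal.II/lac/affine_constraints.h>

// Sketch only: 'dof_handler' and 'constraints' stand in for your own
// objects. In recent deal.II versions extract_locally_relevant_dofs()
// returns the IndexSet; older versions fill one passed as a second
// argument.
const IndexSet locally_relevant_dofs =
  DoFTools::extract_locally_relevant_dofs(dof_handler);

constraints.clear();
constraints.reinit(locally_relevant_dofs); // <-- the crucial line
DoFTools::make_hanging_node_constraints(dof_handler, constraints);
// ... boundary conditions, etc. ...
constraints.close();

Without that reinit() call, the constraint storage is not restricted to the locally relevant part and can grow with the global problem size on every process, which shows up exactly as the kind of out-of-memory failure you describe.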

In general, deal.II has been used for far larger problems than this, so most data structures should be scalable. But of course, there are many places that might be problematic, and, as a library, deal.II might have many components that have not been tested at scale.

I have found it useful to add lines like https://github.com/kronbichler/multigrid/blob/6b43f32b4758a169af5b4bb54546ad279d6fee9f/poisson_dg/program.cc#L245-L247 with the 'print_time' function given here https://github.com/kronbichler/multigrid/blob/6b43f32b4758a169af5b4bb54546ad279d6fee9f/common/laplace_operator.h#L38-L52 (the name is not ideal and the function only lives on a side branch; the point is simply to print some statistics) at various places in the code in order to identify the problem. With measurements of the actual memory consumption at such checkpoints, you can often identify the problem already at a smaller scale, even though it might take two or three attempts to place the timers in the right spots.
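If you do not want to pull in that helper, a rough equivalent based only on deal.II's own utilities could look like the sketch below (the function name and the label argument are made up here; it simply reports the resident set size from /proc/self/status via Utilities::System::get_memory_stats()):

#include <deal.II/base/conditional_ostream.h>
#include <deal.II/base/mpi.h>
#include <deal.II/base/utilities.h>

#include <iostream>
#include <string>

// Rough sketch of a memory-report helper (not the print_time function
// from the links above): print min/avg/max VmRSS over all MPI ranks.
void print_memory(const MPI_Comm comm, const std::string &label)
{
  dealii::Utilities::System::MemoryStats stats;
  dealii::Utilities::System::get_memory_stats(stats);

  // VmRSS is reported in kB; convert to MB for readability.
  const dealii::Utilities::MPI::MinMaxAvg mem =
    dealii::Utilities::MPI::min_max_avg(stats.VmRSS / 1024.0, comm);

  dealii::ConditionalOStream pcout(
    std::cout, dealii::Utilities::MPI::this_mpi_process(comm) == 0);
  pcout << label << ": VmRSS [MB] min/avg/max = " << mem.min << " / "
        << mem.avg << " / " << mem.max << std::endl;
}

Calling such a function after creating the triangulation, after distributing DoFs, after setting up constraints and the sparsity pattern, and after assembly usually narrows down which step is responsible.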

Best,
Martin

On 07.06.23 21:04, Lucas Myers wrote:
Hi everyone,

I'm trying to run a scaling analysis of my code, but when I make the system too large the job dies with a "Killed" message, which I strongly suspect is a memory issue. I'm unsure why this is happening, because I am increasing resources in proportion to the system size.

Details: I'm trying to get a sense of weak scaling in 3D, so I use a subdivided hyper-rectangle. Since the configuration is nearly constant in the third dimension, I use p1 = (-20, -20, -5) and p2 = (20, 20, 5) for the defining points. To keep the number of DoFs per core roughly constant, I do runs with the following sets of parameters (5 runs total; a rough sketch of the mesh setup follows the list):

hyper-rectangle subdivisions: (4, 4, 1), (4, 4, 2), (4, 8, 2), (4, 4, 1), (4, 4, 2)
global refines: 5, 5, 5, 6, 6
# cores: 128, 256, 512, 1024, 2048
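For concreteness, the mesh setup looks roughly like this (a sketch; 'triangulation' and 'mpi_communicator' are placeholders, and the hard-coded values are those of the first run):

#include <deal.II/base/point.h>
#include <deal.II/distributed/tria.h>
#include <deal.II/grid/grid_generator.h>

#include <vector>

using namespace dealii;

// Sketch of the mesh setup; subdivisions and refinement level vary per
// run as listed above.
parallel::distributed::Triangulation<3> triangulation(mpi_communicator);

const std::vector<unsigned int> subdivisions = {4, 4, 1}; // first run
GridGenerator::subdivided_hyper_rectangle(triangulation,
                                          subdivisions,
                                          Point<3>(-20, -20, -5),
                                          Point<3>(20, 20, 5));
triangulation.refine_global(5); // 5 or 6 depending on the run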

Each node has 128 cores, so doubling the number of cores also doubles the amount of available memory. However, it seems that the memory runs out even before the system is assembled (so it can't be the solver causing the problem). Are there any data structures in deal.II that might scale poorly (memory-wise) in this scenario? And are there any nice ways of figuring out what is eating all the memory?

- Lucas