I am running in release mode. I am attaching my cycle 3 results for both debug and release modes. I will also try to reproduce the plot of wall time versus number of processors for 52M DOFs given in the tutorial problem; that would be a better way to compare performance.
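As a rough way to put numbers on the debug-versus-release comparison, here is a minimal Python sketch that computes the per-section speedup from the cycle 3 wall times in the quoted message below. The figures are copied from those two tables; the names `debug`, `release`, and `speedups` are only illustrative and are not part of deal.II or step-40.

```python
# Wall times in seconds, copied from the cycle 3 tables in the quoted message.
debug = {"assembly": 1.85, "output": 0.532, "refine": 2.01,
         "setup": 0.926, "solve": 0.52}
release = {"assembly": 0.0412, "output": 0.204, "refine": 0.282,
           "setup": 0.0349, "solve": 0.506}

# Total wall-clock times reported at the top of each table.
debug_total = 5.84
release_total = 1.07


def speedups(slow, fast):
    """Return the slow/fast wall-time ratio for each section found in both runs."""
    return {name: slow[name] / fast[name] for name in slow if name in fast}


if __name__ == "__main__":
    ratios = speedups(debug, release)
    # Print sections from largest to smallest speedup.
    for name, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
        print(f"{name:10s} {ratio:6.1f}x")
    print(f"{'total':10s} {debug_total / release_total:6.1f}x")
```

With these numbers, assembly and setup gain the most (roughly 45x and 27x), while solve is essentially unchanged between the two modes.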
On Wednesday, April 5, 2023 at 10:24:37 PM UTC+5:30 Wolfgang Bangerth wrote:
>
> Wasim,
>
> > I am trying to run step-40 on my local system and match the performance
> > against the results given in the tutorial problem. I am running the
> > problem on a single processor. My total time is significantly less than
> > that given in the tutorial problem. But my output takes up around 50% of
> > the total time, which isn't the case in the tutorial problem.
> > I cannot figure out why my total time is significantly less than that in
> > the tutorial and why my output takes 50% of the total wall clock time.
> > I am attaching both results for reference.
>
> Are you running in debug or release mode?
>
> For reference, here is what I get in debug mode for cycle 3:
>
> Cycle 3:
>    Number of active cells:       7096
>    Number of degrees of freedom: 31639
>    Solved in 11 iterations.
>
>
> +---------------------------------------------+------------+------------+
> | Total wallclock time elapsed since start    |      5.84s |            |
> |                                             |            |            |
> | Section                         | no. calls |  wall time | % of total |
> +---------------------------------+-----------+------------+------------+
> | assembly                        |         1 |      1.85s |        32% |
> | output                          |         1 |     0.532s |       9.1% |
> | refine                          |         1 |      2.01s |        34% |
> | setup                           |         1 |     0.926s |        16% |
> | solve                           |         1 |      0.52s |       8.9% |
> +---------------------------------+-----------+------------+------------+
>
>
> And here for release mode:
>
> Cycle 3:
>    Number of active cells:       7096
>    Number of degrees of freedom: 31639
>    Solved in 11 iterations.
>
>
> +---------------------------------------------+------------+------------+
> | Total wallclock time elapsed since start    |      1.07s |            |
> |                                             |            |            |
> | Section                         | no. calls |  wall time | % of total |
> +---------------------------------+-----------+------------+------------+
> | assembly                        |         1 |    0.0412s |       3.9% |
> | output                          |         1 |     0.204s |        19% |
> | refine                          |         1 |     0.282s |        26% |
> | setup                           |         1 |    0.0349s |       3.3% |
> | solve                           |         1 |     0.506s |        47% |
> +---------------------------------+-----------+------------+------------+
>
>
> As a general rule, though, these tiny problems are not of great
> interest. step-40 is written to be run on substantial numbers of
> processes, on millions or billions of degrees of freedom.
>
> Best
> W.
>
> --
> ------------------------------------------------------------------------
> Wolfgang Bangerth          email: bang...@colostate.edu
>                            www: http://www.math.colostate.edu/~bangerth/