Dear Wayne,
For performance it certainly matters, because some components of our
codes have more low-level checks in debug mode than others, and because
the compiler optimizations do not have the same effect on all parts of
our code. Make sure to test the release mode and see if it makes more
sense. We'd be happy to help from there.
On 19.10.22 13:50, 'yy.wayne' via deal.II User Group wrote:
Thanks, Martin!
I had never considered Debug versus optimized mode before. The CMake output
says I'm using Debug mode.
Some more information: the computation is done with deal.II 9.4.0 in an Oracle
VirtualBox VM, with 1 MPI process in Qt Creator, and the CPU is an Intel 10600KF.
I didn't change the CMakeLists and just copied it from the examples, so I think
it is in debug mode by default.
On Wednesday, 19 October 2022 at 19:40:01 UTC+8, Martin Kronbichler wrote:
Dear Wayne,
I am a bit surprised by your numbers and find them rather high, at
least with the chosen problem sizes. I would expect the
matrix-free solver to run in less than a second for 111,000
unknowns on typical computers, not almost 10 seconds. I need to
honestly say that I do not have a good explanation at this point.
I did not write this tutorial program, but I know more or less
what should happen. Let me ask a basic question first: Did you
record the timings with release mode? The numbers would make more
sense if they are based on the debug mode.
On 19.10.22 12:08, 'yy.wayne' via deal.II User Group wrote:
Thanks for your reply, Peter.
The matrix-free run is basically the same as in step-75, except that I
substituted the coarse grid solver. For fe_degree=6 without GMG, and with
fe_degree decreasing by 1 on each level for pMG, the solve_system()
function runtime is 24.1s. It decomposes into *MatrixFree MG
operators construction*(1.36s), MatrixFree MG transfers(2.73s),
KLU coarse grid solver(5.7s), *setting smoother_data and
compute_inverse_diagonal for level matrices*(3.4s) CG
The two parts in bold cost a lot more (133s and 62s, respectively) in
the matrix-based multigrid case. I noticed that, just as in step-16, the
finest level matrix is assembled twice (once for system_matrix and
once for mg_matrices[maxlevel]), so assembly takes more time.
On Wednesday, 19 October 2022 at 17:10:27 UTC+8, <> wrote:
Hi Wayne,
your numbers make total sense. Don't forget that you are
running at high order: degree=6! The number of non-zeroes
per element stiffness matrix is ((degree + 1)^dim)^2, and the
cost of computing the element stiffness matrix is even
((degree + 1)^dim)^3 if I am not mistaken (3 nested loops: i,
j and q). Higher orders are definitely made for matrix-free
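To make the scaling above concrete, here is a minimal sketch that tabulates the two estimates for degrees 1 to 6. The function name `element_matrix_costs` is hypothetical, and `dim = 2` is an assumption (the thread does not state the dimension). The operation count is only the rough i, j, q loop estimate given above, not a measured cost.

```python
def element_matrix_costs(degree, dim):
    """Rough per-cell estimates for a tensor-product element.

    Returns (dofs_per_cell, matrix_entries, assembly_ops), where
    dofs_per_cell  = (degree + 1)^dim
    matrix_entries = dofs_per_cell^2   (dense element stiffness matrix)
    assembly_ops   = dofs_per_cell^3   (nested loops over i, j and q,
                                        assuming as many quadrature points
                                        as local DoFs)
    """
    n = (degree + 1) ** dim
    return n, n ** 2, n ** 3


if __name__ == "__main__":
    dim = 2  # assumption: not stated in the thread
    print("degree  dofs/cell  entries  assembly ops")
    for degree in range(1, 7):
        n, entries, ops = element_matrix_costs(degree, dim)
        print(f"{degree:6d}  {n:9d}  {entries:7d}  {ops:12d}")
```

Under these assumptions, going from degree 1 to degree 6 in 2D raises the per-cell assembly estimate from 64 to 117,649 operations, which is consistent with assembly dominating the matrix-based run.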
Out of curiosity: how large is the setup cost of MG in the
case of the matrix-free run? As a comment: don't be surprised
that the setup costs are relatively high compared to the
solution process: you are probably setting up a new
Triangulation-, DoFHandler-, MatrixFree-, ... -object per
level. In many simulations, you can reuse these objects,
since you don't perform AMR every time step.
On Wednesday, 19 October 2022 at 10:38:34 UTC+2 yy.wayne wrote:
Hello everyone,
I modified step-75 a little bit and tried to test its
runtime. However, the result is rather inexplicable from
my point of view, especially the *disproportionate
assembly time and solve time*. Here are the changes:
1. A matrix-based version of step-75 is constructed to
compare with the matrix-free one.
2. No mesh refinement and no GMG, and fe_degree is
constant across all cells within every cycle. fe_degree
increases by one after each cycle. I chose this setting to compare
the runtime as a function of fe_degree.
3. A direct solver on the coarsest grid. I think it won't
affect the runtime since the coarsest grid never changes.
For the final cycle, fe_degree=6 and DoFs=111,361.
For the matrix-based method, the overall runtime is 301s, of which
setup system(84s) and solve system(214s) take up most. In
step-75, solve system actually does all of multigrid matrix
assembly, smoother construction, and CG solving.
Runtime of this case is shown:
On each level I print the time spent assembling the level matrix. *The
solve system time mostly decomposes into MG matrices
assembling(83.9+33.6+...=133s), smoother set up(65s),
coarse grid solve(6s) and CG solve(2.56).* My question is
why the actual CG solve takes only 2.56 out of 301 seconds
for this problem. The time spent on assembling and
smoother construction accounts for so much that they seem a
For the matrix-free method, however, the runtime is much smaller
because no matrices are assembled. Besides, the CG solve costs more,
I guess because matrix-free requires more computation. But that the
*smoother construction time reduces
significantly* as well is beyond my expectation.
The matrix-free framework saves assembly time, but it seems
too efficient to be real. The text in bold marks my main
confusion. Could someone share some experience with the
time consumption of matrix-free and multigrid methods?
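To put the matrix-based timings quoted in this message in proportion, the following sketch tallies the reported parts of solve system and expresses each as a share of the 301s total. The numbers are copied from the message. The dictionary keys and the helper name `percent_of` are hypothetical labels chosen for illustration.

```python
# Timings in seconds, as reported for the matrix-based run (fe_degree=6).
TOTAL_RUNTIME = 301.0
SOLVE_SYSTEM = 214.0

# Reported parts of solve system; labels are illustrative only.
solve_parts = {
    "MG matrix assembly": 133.0,
    "smoother setup": 65.0,
    "coarse grid solve": 6.0,
    "CG solve": 2.56,
}


def percent_of(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    for name, seconds in solve_parts.items():
        share = percent_of(seconds, TOTAL_RUNTIME)
        print(f"{name:20s} {seconds:7.2f}s  {share:5.1f}% of total")
    listed = sum(solve_parts.values())
    print(f"listed parts: {listed:.2f}s of {SOLVE_SYSTEM:.0f}s in solve system")
```

The listed parts add up to slightly less than the reported 214s, so some time inside solve system is not itemized in the message.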
The deal.II project is located at
For mailing list/forum options, see
You received this message because you are subscribed to the
Google Groups "deal.II User Group" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to
To view this discussion on the web visit