Yi-Chung,
> I'm willing to work on this part. Please let me know how I should start. I
> believe this code will help the community.
No doubt! Thank you for your offer of help!
I'm going to comment in more detail below, but will point out that for all
major development, it is always helpful to work in a way that lets you get
feedback early and often. There is nothing worse than working for two or
three months, uploading all of your code, and then getting feedback that
something could have been done in a much simpler way, or should have been done
differently to make it more general. In other words, whenever you have
something that is working, put it into a GitHub pull request and let others
take a look and comment on it!
> My application is about IC designs that
> may have millions to billions of cells. A fully distributed triangulation
> helps to reduce memory usage. The current shared_memory system can
> handle 20M cells (single core) on a system with 32 GB of main memory.
That's already quite impressive :-) What kind of meshes do you have that
require so many cells? Are they so geometrically complicated that they
require that many cells already at the coarse level?
> Actually, this is the interesting part. We try to simulate the thermal
> profile of an integrated circuit. A CPU has billions of transistors inside,
> and each of them has its own power trace as the RHS. That is why we have to
> give it a large coarse mesh at the beginning. I did some model reduction for
> the transistors, but I still want my tool to simulate 100M cells to ensure
> accuracy.
That makes sense. My question was more about why you need a *coarse*
mesh that is so fine. If your geometry is relatively simple, but you need high
resolution, then you can just take a coarse mesh with simple cells and refine
it a sufficient number of times. That already works -- we have done
computations on 100M cells many times.
The only reason I can see for a very fine coarse mesh is if you need to
resolve geometries with lots and lots and lots of curved edges and corners.
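Just to show what I mean by "take a coarse mesh with simple cells and refine
it": here is a minimal sketch, assuming deal.II is built with p4est and MPI.
The brick dimensions, the coarse subdivisions, and the number of refinement
cycles are of course made up for illustration.

  #include <deal.II/base/mpi.h>
  #include <deal.II/base/point.h>
  #include <deal.II/distributed/tria.h>
  #include <deal.II/grid/grid_generator.h>

  #include <iostream>

  int main(int argc, char *argv[])
  {
    dealii::Utilities::MPI::MPI_InitFinalize mpi_initialization(argc, argv, 1);

    // A simple brick as the coarse mesh: only a handful of coarse cells.
    // The "chip" dimensions here are invented for the sake of the example.
    dealii::parallel::distributed::Triangulation<3> triangulation(MPI_COMM_WORLD);
    dealii::GridGenerator::subdivided_hyper_rectangle(
      triangulation,
      {10, 10, 2},                          // coarse subdivisions in x, y, z
      dealii::Point<3>(0, 0, 0),
      dealii::Point<3>(0.01, 0.01, 0.002)); // footprint in meters (made up)

    // Each refinement cycle multiplies the number of cells by 8 in 3d, so
    // six cycles turn the 200 coarse cells into roughly 52M active cells,
    // distributed across the MPI ranks.
    triangulation.refine_global(6);

    if (dealii::Utilities::MPI::this_mpi_process(MPI_COMM_WORLD) == 0)
      std::cout << "Global number of cells: "
                << triangulation.n_global_active_cells() << std::endl;
  }

The point is that the coarse mesh itself never has to be large: the fine
resolution comes from the refinement cycles, not from the coarse cells.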
Yes, I'm willing to believe this. The algorithm wasn't intended for
meshes of this size, though we did test it with ~300k cells in 2d and we
know that it scales like O(N). So 200 seconds seems like a long time. Is
this in debug mode?
> Unfortunately, it is not in debug mode. I guess the reorder is more like
> O(N^2) or O(N^4), if I recall correctly.
Hm, that is strange. Would you be willing to share one or two of your meshes
with one or a few million cells? If you can't share them publicly, can you
share them with me?
> It searches for the cell with the minimum number of neighbors and then
> searches again recursively for its neighbors. With an increasing number of
> dofs, the time increases exponentially.
Ah, I see -- that's the reordering in the Cuthill-McKee step in
parallel::distributed::Triangulation. Interesting. I wonder whether we could
come up with a more scalable implementation of this algorithm by building up
different data structures.
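To make "different data structures" a bit more concrete: the usual way to get
Cuthill-McKee down to roughly linear time is to build the full cell-adjacency
list once up front and then run a plain breadth-first sweep over it, instead
of searching the triangulation for neighbors again at every step. The
following is only a generic sketch of that idea over an abstract adjacency
list; the function name and data layout are invented for illustration, and
this is not the code that currently sits in the library.

  #include <algorithm>
  #include <cstddef>
  #include <queue>
  #include <vector>

  // Generic Cuthill-McKee ordering over a precomputed adjacency list.
  // adjacency[c] lists the cells that share a face with cell c. Because the
  // adjacency is built once, the sweep itself is a breadth-first search and
  // costs on the order of the total number of neighbor links, i.e. roughly
  // O(N) for a mesh with a bounded number of faces per cell.
  std::vector<std::size_t>
  cuthill_mckee_order(const std::vector<std::vector<std::size_t>> &adjacency)
  {
    const std::size_t n = adjacency.size();
    std::vector<std::size_t> order;
    order.reserve(n);
    std::vector<char> seen(n, 0);
    std::size_t n_seen = 0;

    while (n_seen < n)
      {
        // Start the next connected component at an unseen cell of minimal
        // degree (the "cell with the minimum number of neighbors").
        std::size_t root = n;
        for (std::size_t c = 0; c < n; ++c)
          if (!seen[c] &&
              (root == n || adjacency[c].size() < adjacency[root].size()))
            root = c;

        std::queue<std::size_t> bfs;
        bfs.push(root);
        seen[root] = 1;
        ++n_seen;

        while (!bfs.empty())
          {
            const std::size_t cell = bfs.front();
            bfs.pop();
            order.push_back(cell);

            // Enqueue unseen neighbors, lowest degree first.
            std::vector<std::size_t> next;
            for (const std::size_t nb : adjacency[cell])
              if (!seen[nb])
                {
                  seen[nb] = 1;
                  ++n_seen;
                  next.push_back(nb);
                }
            std::sort(next.begin(), next.end(),
                      [&adjacency](const std::size_t a, const std::size_t b) {
                        return adjacency[a].size() < adjacency[b].size();
                      });
            for (const std::size_t nb : next)
              bfs.push(nb);
          }
      }

    return order; // order[i] is the old index of the cell that becomes cell i
  }

The only remaining non-linear piece is the per-cell sort of at most a handful
of neighbors, which is a constant factor rather than something that grows
with N.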
> In my setup program, a steady-state 3-D thermal simulation (distributed
> trial) for a problem with 5M dofs on two cores requires 200 sec for the
> reorder, 80 sec for setup_system, 80 sec of solver time (PETSc-MPI), 100 sec
> for output, 80 sec for create_tria, and 45 sec for assembly. Each core
> requires 9 GB of memory. This is why I want to reduce memory usage.
Yes, I can see why the reorder step is annoying then.
Best
Wolfgang
--
------------------------------------------------------------------------
Wolfgang Bangerth email: bange...@colostate.edu
www: http://www.math.colostate.edu/~bangerth/