Hi Wolfgang, 

On Tuesday, January 31, 2017 at 10:57:35 PM UTC, Wolfgang Bangerth wrote:
>
>
> Yi-Chung, 
>
> > Thank you for your prompt reply. I was wondering if I can integrate 
> > other partition tools, such as metis or parmetis to handle the fully 
> > distributed triangulation. I can develop that part by myself (or with 
> > some help from the community). Do you have any suggestions? 
>
> I believe that it would not be impossible to develop this, but it will 
> probably not be a small exercise. You would have to develop a fair share 
> of code, and gain an understanding of parts of the library you don't 
> usually get to see if you use deal.II. 
>
> If you're willing to do this, we can guide you along the process, but 
> it's going to be a bit of work for sure. 
>
I'm willing to work on this part. Please let me know how I should start. 
I believe this code will help the community. 
 

> > My following 
> > project also relies on this since I will try to manage number of cells 
> > in each processor. With p4est, it is hard to manage number of cells 
> > based on a in-house algorithm. 
>
> That is actually not quite true. p4est (and the deal.II classes that use 
> it) allow you to describe a "weight" for each cell, and p4est will 
> partition in a way that the sum of weights is roughly equal among all 
> processors. 
>

I know that p4est can partition by weight. However, my current work needs 
to control how the cells are distributed. It would be better to control 
the partitioning myself rather than leave it to p4est.
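
For completeness (and for anyone else following the thread), this is 
roughly how I understand the weighting mechanism is attached. Take it as a 
sketch only: the exact signal name and parameter types have moved around 
between deal.II versions, and estimate_cell_work() is just a hypothetical 
per-cell cost function.

#include <deal.II/distributed/tria.h>

using namespace dealii;

// Hypothetical application-specific cost estimate for one cell, e.g. the
// number of transistor power traces located inside that cell. A constant
// is returned here only to keep the sketch self-contained.
unsigned int estimate_cell_work(
  const parallel::distributed::Triangulation<3>::cell_iterator &)
{
  return 1;
}

void attach_cell_weights(parallel::distributed::Triangulation<3> &triangulation)
{
  // p4est then balances the sum of these weights per process, not the
  // raw number of cells.
  triangulation.signals.cell_weight.connect(
    [](const parallel::distributed::Triangulation<3>::cell_iterator &cell,
       const parallel::distributed::Triangulation<3>::CellStatus /*status*/)
      -> unsigned int
    {
      return estimate_cell_work(cell);
    });

  // The weights take effect the next time the mesh is (re)partitioned,
  // e.g. in execute_coarsening_and_refinement() or repartition().
}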
 

> > My application is about IC designs that 
> > may have million to billion cells. A fully distributed triangulation 
> > helps to reduce memory usage. The current shared_memory system can 
> > handle 20M (single core) in system of 32GB main memory. 
>
> That's already quite impressive :-) What kind of meshes do you have that 
> require so many cells? Are they geometrically incredibly complicated to 
> require that many cells already at the coarse level? 
>
Actually, this is the interesting part. We try to simulate the thermal 
profile of an integrated circuit. A CPU has billions of transistors 
inside, and each of them has its own power trace as the RHS. That is why 
we have to start with a large coarse mesh. I did some model reduction for 
the transistors, but I still want my tool to be able to simulate 100M 
cells to ensure accuracy.
 

>
> > Any design of 1M cells on the distributed triangulation have problem of 
> > computation time because of the reorder step. This is why I bypassed it 
> > and provided a sorted mesh to grid_in (read_msh). For a problem of 5M 
> > cells, I can save 200sec at the step of create_triangulation. 
>
> Yes, I'm willing to believe this. The algorithm wasn't intended for 
> meshes of this size, though we did test it with ~300k cells in 2d and we 
> know that it scales like O(N). So 200 seconds seems like a long time. Is 
> this in debug mode? 
>
Unfortunately, no, it is not in debug mode. If I recall correctly, the 
reorder behaves more like O(N^2), or even worse, than like O(N). It 
searches for the cell with the minimum number of neighbors and then 
searches recursively for its neighbors, so with an increasing number of 
cells the time grows much faster than linearly. For a 2-D problem it might 
be fast, but a 3-D problem with >1M dofs takes a while to reorder.
In my setup, a steady-state 3-D thermal simulation (distributed 
triangulation) with 5M dofs on two cores requires 200 sec for the reorder, 
80 sec for setup_system, 80 sec for the solver (PETSc MPI), 100 sec for 
output, 80 sec for create_triangulation, and 45 sec for assembly. Each 
core requires 9 GB of memory. This is why I want to reduce memory usage.
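
A minimal sketch of how per-phase timings like these can be collected with 
deal.II's TimerOutput (the mesh file name and the section labels are only 
illustrative):

#include <deal.II/base/conditional_ostream.h>
#include <deal.II/base/mpi.h>
#include <deal.II/base/timer.h>
#include <deal.II/distributed/tria.h>
#include <deal.II/grid/grid_in.h>

#include <fstream>
#include <iostream>

using namespace dealii;

int main(int argc, char *argv[])
{
  Utilities::MPI::MPI_InitFinalize mpi_init(argc, argv, 1);

  ConditionalOStream pcout(std::cout,
    Utilities::MPI::this_mpi_process(MPI_COMM_WORLD) == 0);
  TimerOutput timer(MPI_COMM_WORLD, pcout,
                    TimerOutput::summary, TimerOutput::wall_times);

  parallel::distributed::Triangulation<3> triangulation(MPI_COMM_WORLD);

  {
    // Reading the .msh includes the reordering step discussed above.
    TimerOutput::Scope t(timer, "read_msh / create_triangulation");
    GridIn<3> grid_in;
    grid_in.attach_triangulation(triangulation);
    std::ifstream input("chip.msh");   // hypothetical mesh file
    grid_in.read_msh(input);
  }

  // setup_system, assembly, the PETSc solve and the output would each get
  // their own TimerOutput::Scope in the same way; the per-section wall
  // times are printed as a summary table at the end of the run.

  return 0;
}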

My solution is to create the mesh myself, since I generate it from a real 
IC. Here is an example, in case anyone wants to generate such a mesh. For 
a 3x3 2-D mesh, I would normally number the cells along x first and then 
y. However, since p4est wants a coarse mesh with minimal (or at least 
balanced) communication, it is better to group neighbors together, so the 
cell ids become the following:

7 8 9      7 8 9
4 5 6  =>  3 4 6
1 2 3      1 2 5

In this way, it is easy to distribute the cells by simply splitting this 
ordering linearly among the processors.
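
To make the grouping concrete: one way to obtain such a numbering is to 
sort the cells by their Morton (Z-order) index instead of by x-then-y; for 
the 3x3 example this reproduces exactly the numbering above. A small 
standalone sketch (plain C++, not deal.II-specific):

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <utility>
#include <vector>

// Interleave the bits of x and y (Morton / Z-order index). Cells that are
// close in this index are also close in space, which gives the "group
// neighbors together" property in the coarse-cell ordering.
std::uint64_t morton_index(std::uint32_t x, std::uint32_t y)
{
  std::uint64_t index = 0;
  for (unsigned int bit = 0; bit < 32; ++bit)
    {
      index |= (std::uint64_t(x >> bit) & 1u) << (2 * bit);
      index |= (std::uint64_t(y >> bit) & 1u) << (2 * bit + 1);
    }
  return index;
}

int main()
{
  const unsigned int nx = 3, ny = 3;

  // All (i,j) cell positions of the structured grid.
  std::vector<std::pair<unsigned int, unsigned int>> cells;
  for (unsigned int j = 0; j < ny; ++j)
    for (unsigned int i = 0; i < nx; ++i)
      cells.emplace_back(i, j);

  // Sort by Morton index instead of the usual "x first, then y" order.
  std::sort(cells.begin(), cells.end(),
            [](const auto &a, const auto &b) {
              return morton_index(a.first, a.second) <
                     morton_index(b.first, b.second);
            });

  // Assign new cell ids 1..9 in this order and print the grid row by row
  // (top row first), which gives the right-hand picture above.
  std::vector<unsigned int> new_id(nx * ny);
  for (unsigned int k = 0; k < cells.size(); ++k)
    new_id[cells[k].second * nx + cells[k].first] = k + 1;

  for (int j = ny - 1; j >= 0; --j)
    {
      for (unsigned int i = 0; i < nx; ++i)
        std::cout << new_id[j * nx + i] << ' ';
      std::cout << '\n';
    }
}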


Best
YC Chen

> Best 
>   Wolfgang 
>
>
> -- 
> ------------------------------------------------------------------------ 
> Wolfgang Bangerth          email:                 bang...@colostate.edu 
>                             www: http://www.math.colostate.edu/~bangerth/ 
>

