Lucas,

Why are you interested in hybrid parallelism? Are you hoping to improve the performance of your code, or is it simply something you want to try? If the solver is the bottleneck in your code, you should focus on finding a better preconditioner. With that being said, matrix-free methods tend to be faster than matrix-based methods if you can use them.

As to your questions: in general, you can assume that when you use Trilinos or PETSc, we only use distributed (MPI) parallelism. However, when you use deal.II's own data structures (Solver, Vector, etc.), we use multithreading; we just don't advertise it. The matrix-free framework supports MPI-3.0 shared memory. We have CUDA matrix-free methods, but we are rewriting them using Kokkos. Hopefully, the refactor will be done by the end of next month.
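If you want to try the MPI-3.0 shared-memory path of the matrix-free framework, a rough sketch of how to turn it on would look like the following. It assumes a recent deal.II release (9.3 or newer), where MatrixFree::AdditionalData has a communicator_sm field, and it assumes you already have a mapping, DoFHandler, constraints, and quadrature set up in your own code:

#include <deal.II/base/mpi.h>
#include <deal.II/base/quadrature.h>
#include <deal.II/dofs/dof_handler.h>
#include <deal.II/fe/mapping.h>
#include <deal.II/lac/affine_constraints.h>
#include <deal.II/lac/la_parallel_vector.h>
#include <deal.II/matrix_free/matrix_free.h>

using namespace dealii;

// Rough sketch (assumes deal.II 9.3+ so that AdditionalData has
// communicator_sm); the arguments are placeholders for objects you
// already have in your program.
template <int dim>
void setup_matrix_free_with_shared_memory(
  const Mapping<dim>              &mapping,
  const DoFHandler<dim>           &dof_handler,
  const AffineConstraints<double> &constraints,
  const Quadrature<1>             &quadrature,
  MatrixFree<dim, double>         &matrix_free)
{
  // Group the MPI ranks that live on the same compute node into one
  // sub-communicator; MPI-3.0 lets these ranks share memory windows.
  MPI_Comm communicator_sm;
  MPI_Comm_split_type(MPI_COMM_WORLD,
                      MPI_COMM_TYPE_SHARED,
                      Utilities::MPI::this_mpi_process(MPI_COMM_WORLD),
                      MPI_INFO_NULL,
                      &communicator_sm);

  // Hand that communicator to the matrix-free framework.
  typename MatrixFree<dim, double>::AdditionalData additional_data;
  additional_data.communicator_sm = communicator_sm;

  matrix_free.reinit(mapping, dof_handler, constraints, quadrature,
                     additional_data);

  // Vectors initialized through this object can then access the on-node
  // part of their ghost data directly through shared memory instead of
  // exchanging MPI messages for it.
  LinearAlgebra::distributed::Vector<double> vec;
  matrix_free.initialize_dof_vector(vec);
}

Note that the shared-memory communicator has to stay valid as long as the MatrixFree object uses it.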
Best,
Bruno

On Thursday, March 16, 2023 at 2:40:24 PM UTC-4 lucasm...@gmail.com wrote:
> Hi folks,
>
> I'm wondering if there's somewhere I can look to get a broad overview of
> the different parallelization schemes available in deal.II for various
> pieces of the library, and maybe what people's experiences have been. As I
> understand it, any matrix/vector assembly can be done with a
> shared-parallel scheme with the threads framework, but I'm less sure about
> solvers (which are typically the bottleneck in my applications). To relay
> my understanding so far (and ask some specific questions):
>
> Matrix-based AMG methods:
>
> - PETSc and Trilinos Epetra only use MPI distributed parallelism
> - Trilinos Tpetra with MueLu is hybrid, but requires Kokkos. Do the
>   Tpetra wrappers have associated solvers? And how do they work with
>   Kokkos via CUDA?
>
> Matrix-based GMG methods:
>
> - Use deal.II's own distributed vector, and wraps a regular deal.II
>   solver (although perhaps uses a Trilinos smoother). Are any of the
>   solvers hybrid-parallelized? And are there deal.II-specific smoothers
>   to avoid copying to Epetra vectors?
>
> Matrix-free GMG methods:
>
> - Use deal.II's distributed vector and wrap regular solver. Additionally,
>   the matrix multiplication is done via a matrix-free operator. Is it
>   possible (or easy) to write a custom matrix-free operator which takes
>   advantage of shared parallelism? And will the solver take advantage of
>   shared-memory parallelism when (say) adding vectors?
>
> Matrix-free GMG methods with hp-adaptivity:
>
> - The only example that I've seen is step-75, and that uses Trilinos
>   AMG for the coarse solver, which I think is limited to MPI distributed
>   parallelism. Could the coarse solver be adapted to use shared parallelism,
>   and are the other aspects of this solver able to be parallelized?
>
> As a final question: which of these aspects are eligible for handling by
> CUDA, MPI-3.0 shared memory, and Kokkos? And have folks found a significant
> speed-up by implementing any of these tools in their code?
>
> Thanks so much for any insight,
> Lucas