On Tue, Apr 15, 2025 at 1:45 PM Junchao Zhang <junchao.zh...@gmail.com> wrote:
> Hi, Justin,
> I don't know ASM well enough. I just browsed its code. It seems it has a
> lot of matrix partitioning and indexing operations, which definitely are
> not done on GPUs currently.
> But you could still try that, as petsc will copy data from the device to
> the host as needed to perform host-only operations. You can profile with
> -log_view -log_view_gpu_time so that we can see how expensive these
> operations are.
>
> Barry should know more about ASM.
>

Usually, ASM is a tactic for solving in parallel. Each process has a diagonal
matrix block. ASM means overlapping blocks, whereas BJacobi means
non-overlapping. The block will likely be extracted on the CPU, but pushed
down to the GPU for solves.

  Thanks,

     Matt

> --Junchao Zhang
>
>
> On Tue, Apr 15, 2025 at 10:43 AM Angus, Justin Ray <ang...@llnl.gov>
> wrote:
>
>> Hi Junchao,
>>
>> Thanks for the reply.
>>
>> Does ASM work the same on GPU systems as it does on CPU systems?
>>
>> *From: *Junchao Zhang <junchao.zh...@gmail.com>
>> *Date: *Monday, April 14, 2025 at 7:35 PM
>> *To: *Angus, Justin Ray <ang...@llnl.gov>
>> *Cc: *petsc-dev@mcs.anl.gov <petsc-dev@mcs.anl.gov>, Ghosh, Debojyoti <
>> gho...@llnl.gov>
>> *Subject: *Re: [petsc-dev] Additive Schwarz Method + ILU on GPU platforms
>>
>> Petsc supports ILU0/ICC0 numeric factorization (without reordering) and
>> then triangular solve on GPUs. It is done by calling vendor libraries (ex.
>> cusparse).
>>
>> We have options -pc_factor_mat_factor_on_host <bool>
>> -pc_factor_mat_solve_on_host <bool> to force doing the factorization and
>> MatSolve on the host for device matrix types.
>>
>> You can try to see if it works for your case.
>>
>> --Junchao Zhang
>>
>>
>> On Mon, Apr 14, 2025 at 4:39 PM Angus, Justin Ray via petsc-dev <
>> petsc-dev@mcs.anl.gov> wrote:
>>
>> Hello,
>>
>> A project I work on uses GMRES via PETSc. In particular, we have had good
>> successes using the Additive Schwarz Method + ILU preconditioner setup
>> using a CPU-based code. I found online where it is stated that “Parts of
>> most preconditioners run directly on the GPU”
>> (https://petsc.org/release/faq/).
>> Is ASM + ILU also available for GPU platforms?
>>
>> -Justin
>>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
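
To illustrate the distinction drawn above between ASM (overlapping blocks) and BJacobi (non-overlapping blocks), here is a minimal sketch in plain Python. It is not PETSc code and does not reflect how PETSc implements either preconditioner. The function names (`subdomains`, `solve_dense`, `apply_schwarz`) are hypothetical, the matrix is a small dense list of lists, and each block is solved exactly by Gaussian elimination as a stand-in for the ILU sub-solve discussed in the thread.

```python
def subdomains(n, nblocks, overlap=0):
    """Split range(n) into nblocks contiguous index sets.

    Each set is widened by `overlap` indices on both sides (clipped to
    [0, n)). overlap=0 gives the non-overlapping, block-Jacobi-style
    partition; overlap>0 gives overlapping, ASM-style sets.
    """
    base, extra = divmod(n, nblocks)
    blocks, start = [], 0
    for b in range(nblocks):
        size = base + (1 if b < extra else 0)
        lo = max(0, start - overlap)
        hi = min(n, start + size + overlap)
        blocks.append(list(range(lo, hi)))
        start += size
    return blocks


def solve_dense(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Assumption: an exact solve stands in for the approximate ILU solve.
    """
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def apply_schwarz(A, r, blocks):
    """Apply z = sum_i R_i^T A_i^{-1} R_i r for the given index sets.

    A_i is the diagonal block of A restricted to block i; contributions
    from overlapping blocks are simply added.
    """
    z = [0.0] * len(r)
    for idx in blocks:
        A_sub = [[A[i][j] for j in idx] for i in idx]  # extract the block
        r_sub = [r[i] for i in idx]                    # restrict residual
        z_sub = solve_dense(A_sub, r_sub)              # local solve
        for i, zi in zip(idx, z_sub):                  # prolong and add
            z[i] += zi
    return z


if __name__ == "__main__":
    n = 8
    # 1D Laplacian test matrix: 2 on the diagonal, -1 off the diagonal.
    A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
          for j in range(n)] for i in range(n)]
    r = [1.0] * n
    print("BJacobi-style blocks:", subdomains(n, 2, overlap=0))
    print("ASM-style blocks:    ", subdomains(n, 2, overlap=1))
    print("z (overlap=1):", apply_schwarz(A, r, subdomains(n, 2, overlap=1)))
```

In PETSc itself the corresponding setup is normally selected at run time with options such as -ksp_type gmres -pc_type asm -sub_pc_type ilu -pc_asm_overlap <n>. Which steps then run on the device is best checked with the profiling options mentioned earlier in the thread.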