If you are using direct solvers on each block on each GPU (several matrices per GPU), you could pull apart, for example, MatSolve_SeqAIJCUSPARSE() and launch each of the matrix solves on a separate stream. You could use a MatSolveBegin/MatSolveEnd style or, as Jed may prefer, a Wait() model. It is maybe a couple of hours of coding to produce a prototype MatSolveBegin/MatSolveEnd from MatSolve_SeqAIJCUSPARSE().
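
To make the Begin/End idea concrete, here is a rough host-side sketch in plain C. It is only an analogy: POSIX threads stand in for CUDA streams, the blocks are tiny dense LU factors rather than cuSPARSE triangular factors, and every name in it (BlockSolve, BlockSolveBegin, BlockSolveEnd, demo_max_error) is made up for illustration and is not PETSc API. The point is just the shape of the split: Begin starts a block solve and returns immediately, End waits for it.

```c
#include <pthread.h>

/* Hypothetical illustration only: none of these names exist in PETSc.
   A host thread plays the role that a CUDA stream would play on the GPU. */
typedef struct {
  int           n;      /* block size */
  const double *lu;     /* packed n x n LU factors, row-major, unit lower L, no pivoting */
  const double *b;      /* right-hand side */
  double       *x;      /* solution (also used as workspace for L y = b) */
  pthread_t     thread; /* stand-in for a per-solve stream */
} BlockSolve;

static void *block_solve_worker(void *arg)
{
  BlockSolve *s = (BlockSolve *)arg;
  int         n = s->n;

  /* Forward substitution: L y = b, with y stored in x. */
  for (int i = 0; i < n; i++) {
    double sum = s->b[i];
    for (int j = 0; j < i; j++) sum -= s->lu[i * n + j] * s->x[j];
    s->x[i] = sum;
  }
  /* Back substitution: U x = y, overwriting y in place. */
  for (int i = n - 1; i >= 0; i--) {
    double sum = s->x[i];
    for (int j = i + 1; j < n; j++) sum -= s->lu[i * n + j] * s->x[j];
    s->x[i] = sum / s->lu[i * n + i];
  }
  return NULL;
}

/* Begin: start the solve and return without waiting for it. */
int BlockSolveBegin(BlockSolve *s)
{
  return pthread_create(&s->thread, NULL, block_solve_worker, s);
}

/* End: block until the solve started by BlockSolveBegin() has finished. */
int BlockSolveEnd(BlockSolve *s)
{
  return pthread_join(s->thread, NULL);
}

/* Two independent 2x2 blocks solved concurrently.
   Block A = [[4,2],[2,4]] with b = (8,10), exact x = (1,2).
   Block B = [[1,1],[2,3]] with b = (2,3),  exact x = (3,-1).
   Returns the largest absolute error, or 1e300 if a thread call failed. */
double demo_max_error(void)
{
  static const double luA[4] = {4.0, 2.0, 0.5, 3.0};
  static const double luB[4] = {1.0, 1.0, 2.0, 1.0};
  const double        bA[2]  = {8.0, 10.0}, bB[2] = {2.0, 3.0};
  const double        expect[2][2] = {{1.0, 2.0}, {3.0, -1.0}};
  double              xA[2], xB[2];
  BlockSolve          s[2] = {{2, luA, bA, xA}, {2, luB, bB, xB}};
  int                 started = 0, failed = 0;
  double              err = 0.0;

  /* Launch all block solves first ... */
  for (int k = 0; k < 2; k++) {
    if (BlockSolveBegin(&s[k]) != 0) { failed = 1; break; }
    started++;
  }
  /* ... then wait for every solve that was actually started. */
  for (int k = 0; k < started; k++) {
    if (BlockSolveEnd(&s[k]) != 0) failed = 1;
  }
  if (failed) return 1.0e300;

  for (int k = 0; k < 2; k++) {
    for (int i = 0; i < 2; i++) {
      double d = s[k].x[i] - expect[k][i];
      if (d < 0.0) d = -d;
      if (d > err) err = d;
    }
  }
  return err;
}
```

Calling demo_max_error() runs both block solves between one round of Begin calls and one round of End calls. In a real GPU version the Begin would enqueue the triangular solves on that block's stream and the End would synchronize on it; how that maps onto the existing MatSolve_SeqAIJCUSPARSE() internals is not shown here.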
Note that pulling apart a non-coupled single MatAIJ that contains all the matrices would be hugely expensive. Better to build each matrix already separate, or use MatNest with only diagonal blocks.

  Barry

> On Dec 30, 2020, at 5:46 PM, Jed Brown wrote:
>
> Mark Adams writes:
>
>> I see that ASM has a DM and can get subdomains from it. I have a DMForest
>> and I would like an ASM that has a subdomain for each field. How might I go
>> about doing this? (The fields are not coupled in the matrix, so this would
>> give a block diagonal matrix, and thus be exact with LU sub solvers.)
>
> The fields are already not coupled or you want to filter the matrix and give
> back a single matrix with coupling removed?
>
> You can use Fieldsplit to get the math of field-based block Jacobi (or ASM,
> but overlap with fields tends to be expensive). Neither FieldSplit nor ASM can
> run the (additive) solves concurrently (and most libraries would need
> something to drive the threads).
>
>> I am then going to want to get these separate solves to be run in parallel
>> on a GPU (I'm talking with Sherry about getting SuperLU working on these
>> small problems). In looking at PCApply_ASM it looks like this will take
>> some thought. KSPSolve would need to be non-blocking, etc., or a new apply
>> op might be needed.
>>
>> Thanks,
>> Mark
