I thought that I'd re-up this thread, since it didn't get any responses. I apologize if the questions are a bit broad. Typically, I spend a bit of time trying to get some code working myself before posting a question here, but I don't currently have a machine with an NVIDIA GPU. If the CUDA support in deal.II is at the point where our code could take advantage of it, I'd be able to justify the investment to get access to a compatible machine.

Thanks!
Steve

On Monday, October 29, 2018 at 10:43:48 AM UTC-4, Stephen DeWitt wrote:
>
> Hello all,
> I'm interested to know what the status is for using CUDA with matrix free
> calculations. We have a PRISMS-PF user who is interested in GPU
> calculations, and I'd like to get a better idea of what would be involved
> in adding CUDA support on our end.
>
> So far I've read through the "CUDA Support" issue
> <https://github.com/dealii/dealii/issues/7037>, the "Roadmap for
> inclusion of GPU implementation of matrix free in Deal.II" issue
> <https://github.com/dealii/dealii/issues/2351>, the Doxygen documentation
> for classes in the CUDAWrappers namespace
> <https://www.dealii.org/9.0.0/doxygen/deal.II/group__CUDAWrappers.html>,
> and the manuscript by Karl Ljungkvist
> <http://scs.org/wp-content/uploads/2017/06/4_Final_Manuscript-1.pdf>. Are
> there any other pages I should be looking at?
>
> My understanding from these pages is that deal.II has partial support for
> using CUDA with matrix free calculations. Currently, calculations can be
> done with scalar variables (but not vector variables) and adaptive meshes.
>
> A few (somewhat inter-related) questions:
> 1). Do all of the tools exist to create a GPU version of step-48? Has
> anyone done so?
> 2). What exactly would be involved in creating a GPU version of step-48?
> Is it just changing the CPU Vector, MatrixFree, and FEEvaluation classes to
> their GPU counterparts, plus packaging some data (plus local apply
> functions?) into a CUDAWrappers::MatrixFree< dim, Number >::Data
> <https://www.dealii.org/9.0.0/doxygen/deal.II/structCUDAWrappers_1_1MatrixFree_1_1Data.html>
> struct?
> 3). Most of the discussions seemed to revolve around linear solves. For
> something like step-48 with explicit updates, will the current paradigm
> work well? Or would that require shuttling data between the GPU and CPU
> every time step, causing too much overhead? (I know that in general GPUs
> can work very well for explicit codes.)
>
> Thanks!
> Steve