I managed to pass the following options to PETSc using a GPU node on Perlmutter.
-mat_type aijcusparse -vec_type cuda -log_view -options_left
Below is a summary of the test using 4 MPI tasks and 1 GPU per task.
- #PETSc Option Table entries:
-log_view
-mat_type aijcusparse
-options_left
-vec_type cuda
#End of PETSc Option Table entries
WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There is one unused database option. It is:
Option left: name:-mat_type value: aijcusparse
The -mat_type option has not been used. In the application code, we use
ierr = MatCreateAIJ(PETSC_COMM_WORLD,mlocal,mlocal,m,n,
                    d_nz,PETSC_NULL,o_nz,PETSC_NULL,&A);CHKERRQ(ierr);
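Since MatCreateAIJ() hard-wires the matrix type, one fix is to create the matrix generically so -mat_type can take effect. A minimal sketch, assuming the same variables (mlocal, m, n, d_nz, o_nz) as in the call above:

```c
/* Create the matrix without fixing its type, so -mat_type aijcusparse
   (or any other type) can be selected at runtime. */
ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
ierr = MatSetSizes(A, mlocal, mlocal, m, n);CHKERRQ(ierr);
ierr = MatSetFromOptions(A);CHKERRQ(ierr);   /* reads -mat_type */
/* Preallocation: each call is a no-op unless it matches the actual type
   (aijcusparse inherits from aij, so these still apply on the GPU). */
ierr = MatMPIAIJSetPreallocation(A, d_nz, NULL, o_nz, NULL);CHKERRQ(ierr);
ierr = MatSeqAIJSetPreallocation(A, d_nz, NULL);CHKERRQ(ierr);
```

The same pattern applies to vectors (VecCreate(), VecSetSizes(), VecSetFromOptions()) so that -vec_type cuda is honored as well.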
- The percent flops on the GPU for KSPSolve is 17%.
In comparison with a CPU run using 16 MPI tasks, the GPU run is an order of
magnitude slower. How can I improve the GPU performance?
Thanks,
Cho
________________________________
From: Ng, Cho-Kuen <[email protected]>
Sent: Friday, June 30, 2023 7:57 AM
To: Barry Smith <[email protected]>; Mark Adams <[email protected]>
Cc: Matthew Knepley <[email protected]>; [email protected]
<[email protected]>
Subject: Re: [petsc-users] Using PETSc GPU backend
Barry, Mark and Matt,
Thank you all for the suggestions. I will modify the code so we can pass
runtime options.
Cho
________________________________
From: Barry Smith <[email protected]>
Sent: Friday, June 30, 2023 7:01 AM
To: Mark Adams <[email protected]>
Cc: Matthew Knepley <[email protected]>; Ng, Cho-Kuen <[email protected]>;
[email protected] <[email protected]>
Subject: Re: [petsc-users] Using PETSc GPU backend
Note that options like -mat_type aijcusparse -vec_type cuda only work if the
program is set up to allow runtime swapping of matrix and vector types. If you
have a call to MatCreateMPIAIJ() or another type-specific constructor, these
options do nothing. But because Mark had you use -options_left, the program
will tell you at the end that it did not use the option, so you will know.
On Jun 30, 2023, at 9:30 AM, Mark Adams <[email protected]> wrote:
PetscCall(PetscInitialize(&argc, &argv, NULL, help)); gives us the args and you
run:
a.out -mat_type aijcusparse -vec_type cuda -log_view -options_left
Mark
On Fri, Jun 30, 2023 at 6:16 AM Matthew Knepley
<[email protected]> wrote:
On Fri, Jun 30, 2023 at 1:13 AM Ng, Cho-Kuen via petsc-users
<[email protected]> wrote:
Mark,
The application code reads in parameters from an input file, where we can put
the PETSc runtime options. Then we pass the options to PetscInitialize(...).
Does that sound right?
PETSc will read command-line arguments automatically in PetscInitialize()
unless you shut it off.
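If the options live in an input file instead, they can be loaded at or right after initialization. A sketch, where the file name "petsc_options.txt" is a placeholder:

```c
/* Option 1: pass the options file directly to PetscInitialize();
   the third argument is an optional PETSc options-file name. */
PetscCall(PetscInitialize(&argc, &argv, "petsc_options.txt", help));

/* Option 2: insert the file into the options database explicitly.
   PETSC_FALSE means a missing file is not an error. */
PetscCall(PetscOptionsInsertFile(PETSC_COMM_WORLD, NULL,
                                 "petsc_options.txt", PETSC_FALSE));
```

The file itself just lists options, one per line, e.g. "-mat_type aijcusparse".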
Thanks,
Matt
Cho
________________________________
From: Ng, Cho-Kuen <[email protected]>
Sent: Thursday, June 29, 2023 8:32 PM
To: Mark Adams <[email protected]>
Cc: [email protected]<mailto:[email protected]>
<[email protected]<mailto:[email protected]>>
Subject: Re: [petsc-users] Using PETSc GPU backend
Mark,
Thanks for the information. How do I pass runtime options to the executable,
say, a.out, when there is no provision for appending arguments? Do I need to
change the C++ main to read in the options?
Cho
________________________________
From: Mark Adams <[email protected]>
Sent: Thursday, June 29, 2023 5:55 PM
To: Ng, Cho-Kuen <[email protected]>
Cc: [email protected]<mailto:[email protected]>
<[email protected]<mailto:[email protected]>>
Subject: Re: [petsc-users] Using PETSc GPU backend
Run with options: -mat_type aijcusparse -vec_type cuda -log_view -options_left
The last column of the performance data (from -log_view) will be the percent
flops on the GPU. Check that that is > 0.
The end of the output will list the options that were used and options that
were _not_ used (if any). Check that there are no options left.
Mark
On Thu, Jun 29, 2023 at 7:50 PM Ng, Cho-Kuen via petsc-users
<[email protected]> wrote:
I installed PETSc on Perlmutter using "spack install petsc+cuda+zoltan" and
loaded it with "spack load petsc/fwge6pf". Then I compiled the application code
(purely CPU code) against the petsc package, hoping to get a performance
improvement from the petsc GPU backend. However, the timing was the same using
the same number of MPI tasks with and without GPU accelerators.
Have I missed something in the process, for example, setting up PETSc options
at runtime to use the GPU backend?
Thanks,
Cho
--
What most experimenters take for granted before they begin their experiments is
infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
https://www.cse.buffalo.edu/~knepley/