It’s 48 cores per node, but they are split across 4 NUMA domains (CMGs) of 12 cores each.
So you may want to experiment in hybrid mode (4 MPI ranks x 12 OpenMP threads, etc.) if possible.
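
As a rough illustration of the hybrid layouts worth trying, the sketch below lists the ranks x threads splits of one node that keep the rank count a multiple of the 4 CMGs. The variable and function names (`CORES_PER_NODE`, `NUMA_DOMAINS`, `hybrid_layouts`) are made up for this example, and it only prints candidate layouts; it does not launch anything.

```shell
# Assumption: 48 compute cores per node in 4 NUMA domains (CMGs), as stated above.
CORES_PER_NODE=48
NUMA_DOMAINS=4

# Print every "ranks x threads" split where the rank count is a multiple of
# the number of NUMA domains and divides the core count evenly.
hybrid_layouts() {
  ranks=$NUMA_DOMAINS
  while [ "$ranks" -le "$CORES_PER_NODE" ]; do
    if [ $((CORES_PER_NODE % ranks)) -eq 0 ]; then
      echo "${ranks}x$((CORES_PER_NODE / ranks))"
    fi
    ranks=$((ranks + NUMA_DOMAINS))
  done
}

hybrid_layouts
```

For a given line such as 4x12, the second number is the value one would presumably export as OMP_NUM_THREADS, with the first number as the MPI process count for the job.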

-Sarat.

From: Mark Adams <[email protected]>
Sent: Friday, April 16, 2021 2:10 PM
To: petsc-dev <[email protected]>
Cc: Satish Balay <[email protected]>; Sreepathi, Sarat <[email protected]>
Subject: Re: [petsc-dev] [EXTERNAL] Re: building on Fugaku

Sarat, is there anything special that you do for Kokkos - OpenMP?

Or do you just set OMP_NUM_THREADS=48?

Also, I am confused about the number of cores here. Is it 48 or 64 per node/socket?

On Fri, Apr 16, 2021 at 2:03 PM Mark Adams <[email protected]> wrote:
Cool, I have it running too. Need to add Sarat's flags and test ex2.

On Fri, Apr 16, 2021 at 1:57 PM Satish Balay via petsc-dev <[email protected]> wrote:
Mark,

The following build works for me:

Satish

----

pjsub --interact -L "node=1" -L "rscunit=rscunit_ft01" -L "elapse=1:00:00" \
  --sparam "wait-time=1200"

. /vol0004/apps/oss/spack/share/spack/setup-env.sh
spack load fujitsu-mpi%gcc
spack load [email protected] arch=linux-rhel8-a64fx
./configure COPTFLAGS='-Ofast -march=armv8.2-a+sve -msve-vector-bits=512' \
  CXXOPTFLAGS='-Ofast -march=armv8.2-a+sve -msve-vector-bits=512' \
  FOPTFLAGS='-Ofast -march=armv8.2-a+sve -msve-vector-bits=512' \
  --with-openmp=1 \
  --download-p4est --download-zlib --download-kokkos --download-kokkos-kernels \
  --download-kokkos-commit=origin/develop \
  --download-kokkos-kernels-commit=origin/develop \
  '--download-kokkos-cmake-arguments=-DBUILD_TESTING=OFF -DKokkos_ENABLE_LIBDL=OFF -DKokkos_ENABLE_AGGRESSIVE_VECTORIZATION=ON' \
  --download-cmake=https://github.com/Kitware/CMake/releases/download/v3.20.1/cmake-3.20.1.tar.gz \
  --download-fblaslapack=1
make PETSC_DIR=/vol0004/ra010009/a04201/petsc.z PETSC_ARCH=arch-linux-c-debug all
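
The configure line above passes the same optimization string to COPTFLAGS, CXXOPTFLAGS, and FOPTFLAGS. One way to keep the three in sync when experimenting with flags might be to define the string once, as in the sketch below. The names `SVE_OPT` and `opt_flag_args` are hypothetical and not part of PETSc; the function only prints the three assignments for inspection.

```shell
# Assumption: the same SVE flags are wanted for C, C++ and Fortran, as in the build above.
SVE_OPT='-Ofast -march=armv8.2-a+sve -msve-vector-bits=512'

# Print one VAR=value assignment per line for the three PETSc optimization variables.
opt_flag_args() {
  for var in COPTFLAGS CXXOPTFLAGS FOPTFLAGS; do
    printf '%s=%s\n' "$var" "$SVE_OPT"
  done
}

opt_flag_args
```

Each printed line would need to reach ./configure as a single quoted argument, for example "COPTFLAGS=$SVE_OPT", because the value contains spaces.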


To test - redo job allocation using max-proc-per-node:

login6$ pjsub --interact -L "node=1" -L "rscunit=rscunit_ft01" \
  -L "elapse=1:00:00" --sparam "wait-time=1200" --mpi "max-proc-per-node=16"

[a04201@c31-3201c petsc.z]$ . /vol0004/apps/oss/spack/share/spack/setup-env.sh
[a04201@c31-3201c petsc.z]$ spack load fujitsu-mpi%gcc
[a04201@c31-3201c petsc.z]$ spack load [email protected] arch=linux-rhel8-a64fx
[a04201@c31-3201c petsc.z]$ make check
Running check examples to verify correct installation
Using PETSC_DIR=/vol0004/ra010009/a04201/petsc.z and PETSC_ARCH=arch-linux-c-debug
C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
C/C++ example src/snes/tutorials/ex3k run successfully with kokkos-kernels
Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process
Completed test examples
[a04201@c31-3201c petsc.z]$
