On Sat, May 23, 2020 at 1:49 AM Drew Parsons <dpars...@debian.org> wrote:
> On 2020-05-23 14:18, Jed Brown wrote:
> > Drew Parsons <dpars...@debian.org> writes:
> >
> >> Hi, the Debian project is discussing whether we should start
> >> providing a 64 bit build of PETSc (which means we'd have to upgrade
> >> our entire computational library stack, starting from BLAS and going
> >> through MPI, MUMPS, etc).
> >
> > You don't need to change BLAS or MPI.
>
> I see, the PETSc API allows for PetscBLASInt and PetscMPIInt distinct
> from PetscInt. That gives us more flexibility. (In any case, the Debian
> BLAS maintainer is already providing blas64 packages. We've started
> discussions about MPI.)
>
> But what about MUMPS? Would MUMPS need to be built with 64 bit support
> to work with 64-bit PETSc?
> (The MUMPS docs indicate that its 64 bit support needs 64-bit versions
> of BLAS, SCOTCH, METIS and MPI.)

In MUMPS's manual that is called full 64-bit. Out of the same memory
bandwidth concern, MUMPS also supports selective 64-bit, in the sense
that it uses int64_t only for selected variables, so one can still use
it with 32-bit BLAS, MPI, etc. We support selective 64-bit MUMPS
starting from petsc-3.13.0.

> >> A default PETSc build uses 32 bit addressing to index vectors and
> >> matrices. 64 bit addressing can be switched on by configuring with
> >> --with-64-bit-indices=1, allowing much larger systems to be handled.
> >>
> >> My question for petsc-maint is, is there a reason why 64 bit indexing
> >> is not already activated by default on 64-bit systems? Certainly C
> >> pointers and type int would already be 64 bit on these systems.
> >
> > Umm, x86-64 Linux is LP64, so int is 32-bit. ILP64 is relatively
> > exotic these days.
>
> Oh ok. I had assumed int was 64 bit on x86-64. Thanks for the
> correction.
>
> >> Is it a question of performance? Is 32 bit indexing executed faster
> >> (in the sense of 2 operations per clock cycle), such that 64-bit
> >> addressing is accompanied with a drop in performance?
> >
> > Sparse iterative solvers are entirely limited by memory bandwidth;
> > sizeof(double) + sizeof(int64_t) = 16 incurs a performance hit
> > relative to 12 for int32_t. It has nothing to do with clock cycles
> > for instructions, just memory bandwidth (and usage, but that is less
> > often an issue).
>
> >> In that case we'd only want to use 64-bit PETSc if the system being
> >> modelled is large enough to actually need it. Or is there a different
> >> reason that 64 bit indexing is not switched on by default?
> >
> > It's just about performance, as above.
>
> Thanks Jed. That's good justification for us to keep our current 32-bit
> build then, and provide a separate 64-bit build alongside it.
>
> > There are two situations in which 64-bit is needed. Historically
> > (supercomputing with thinner nodes), it has been that you're solving
> > problems with more than 2B dofs. In today's age of fat nodes, it also
> > happens that a matrix on a single MPI rank has more than 2B nonzeros.
> > This is especially common when using direct solvers. We'd like to
> > address the latter case by only promoting the row offsets (thereby
> > avoiding the memory hit of promoting column indices):
> >
> > https://gitlab.com/petsc/petsc/-/issues/333
>
> An interesting extra challenge.
>
> > I wonder if you are aware of any static analysis tools that can flag
> > implicit conversions of this sort:
> >
> >   int64_t n = ...;
> >   for (int32_t i=0; i<n; i++) {
> >     ...
> >   }
> >
> > There is -fsanitize=signed-integer-overflow (which generates a runtime
> > error message), but that requires data to cause overflow at every
> > possible location.
>
> I'll ask the Debian gcc team and the Science team if they have ideas
> about this.
>
> Drew
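For reference, on the PETSc side a 64-bit-index build that also pulls in
MUMPS (and so gets the selective 64-bit path, as I understand the 3.13
support above) can be configured roughly as follows. This is only a
sketch using the --download-* convenience options; a Debian package
build would point at system libraries instead, and the option spellings
are worth checking against ./configure --help:

  ./configure --with-64-bit-indices=1 \
              --download-mumps --download-scalapack \
              --download-metis --download-parmetis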
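On the implicit-conversion question, here is a small self-contained
illustration of why that pattern is hard to catch statically. The file
name, the numbers, and the -Wconversion observations are just my own
sketch, not something already established in this thread:

  /* truncate_demo.c - the int64_t -> int32_t hazard discussed above.
     Build e.g. with
       gcc -Wall -Wconversion -fsanitize=signed-integer-overflow truncate_demo.c
     (my suggestion; worth double-checking against the compiler versions
     Debian actually ships). */
  #include <inttypes.h>
  #include <stdio.h>

  int main(void)
  {
    int64_t n = INT64_C(3000000000);  /* > INT32_MAX, e.g. nonzeros on one rank */
    int64_t visited = 0;

    /* The pattern from the thread: the comparison promotes i to int64_t,
       so there is no narrowing conversion here and -Wconversion stays
       silent. The hazard is i overflowing (undefined behaviour) once it
       passes INT32_MAX, which only the runtime sanitizer can report, and
       only if the loop actually runs that far. */
    for (int32_t i = 0; i < n; i++) {
      visited++;
      if (visited > 5) break;  /* keep the demo short */
    }

    /* A plain narrowing assignment, by contrast, is flagged at compile
       time by -Wconversion ("may change value"). */
    int32_t m = n;
    printf("n = %" PRId64 ", truncated m = %" PRId32 "\n", n, m);
    return 0;
  }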