Ernesto,
the coll/tuned module (which should handle collective subroutines by
default) has a known issue when matching but non-identical signatures are
used:
for example, one rank uses one vector of n bytes, and another rank uses n
bytes.
Is there a chance your application might use this pattern?
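If you are not sure, one quick way to check is to exclude coll/tuned for a
test run and see whether the behavior changes. This is only a sketch, not a
confirmed fix: ./your_app is a placeholder for your real executable, and the
other mpirun options you already use should stay as they are.

```shell
# Sketch only: '^tuned' excludes the tuned component, so Open MPI selects
# from its remaining coll components instead.
# ./your_app is a placeholder (assumption); keep your other mpirun options
# (-x, -hostfile, --mca btl_tcp_if_include, ...) unchanged.
mpirun --mca coll '^tuned' -np 140 -npernode 4 ./your_app
```

If the problem disappears with coll/tuned excluded, that points toward this
issue; collectives may be slower in that configuration.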
Forgot to mention that in all 3 situations, mpirun is called as follows (35
nodes, 4 MPI ranks per node):
    mpirun -x LD_LIBRARY_PATH=:::... -hostfile /tmp/hostfile.txt -np 140 -npernode 4 --mca btl_tcp_if_include eth0
So I have a question 3): should I add some extra option to the mpirun command?
Thank you for the quick answer, George. I wanted to investigate the problem
further before replying.
Below I show 3 situations of my C++ (and Fortran) application, which runs on
top of PETSc, OpenMPI, and MKL. All 3 situations use MKL 2019.0.5 compiled with
INTEL.
At the end, I have 2 questions