Hello, I am attempting to run High Performance Linpack (HPL 2.3) across two nodes with Open MPI 4.1.4 and MLNX_OFED_LINUX-5.6-2.0.9.0-rhel8.6-x86_64. Within a minute or so, the run always crashes with:

[node002:04556] *** An error occurred in MPI_Recv
[node002:04556] *** reported by process [1007222785,24]
[node002:04556] *** on communicator MPI COMMUNICATOR 5 SPLIT FROM 3
[node002:04556] *** MPI_ERR_TRUNCATE: message truncated
[node002:04556] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[node002:04556] *** and potentially your MPI job)

I have reverted to Open MPI 4.1.2, with which I have had no issues on other systems, but the problem persists on this cluster.

Any suggestions on how to diagnose this?

Thank you,
Bart