Hi Kurt,

Without knowing your exact MPI launch command, my crystal ball thinks you might want to try the --mpi=pmix flag for srun, as documented for Slurm + Open MPI: https://slurm.schedmd.com/mpi_guide.html#open_mpi
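
For illustration, a launch along those lines might look like the sketch below. The binary name ./my_mpi_app and the task count of 4 are placeholders, not taken from your job, and --mpi=pmix will only work if your Slurm installation was built with PMIx support, which the output of "srun --mpi=list" should show.

```shell
# Hypothetical values: replace with your own binary and task count.
APP=./my_mpi_app
NTASKS=4

if command -v srun >/dev/null 2>&1; then
    # Show which MPI plugin types this Slurm installation offers.
    srun --mpi=list
    # Launch the job through the PMIx plugin (assumes "pmix" is in the list above).
    srun --mpi=pmix -n "$NTASKS" "$APP"
else
    # srun is not on PATH, so this is probably not a Slurm login or compute node.
    echo "srun not found; run this from a Slurm login node or inside an allocation"
fi
```

If pmix does not appear in the list, the flag will be rejected and a different launch method is needed.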

-Joachim

________________________________
From: users <users-boun...@lists.open-mpi.org> on behalf of Mccall, Kurt E. (MSFC-EV41) via users <users@lists.open-mpi.org>
Sent: Thursday, June 15, 2023 11:56:28 PM
To: users@lists.open-mpi.org <users@lists.open-mpi.org>
Cc: Mccall, Kurt E. (MSFC-EV41) <kurt.e.mcc...@nasa.gov>
Subject: [OMPI users] OpenMPI crashes with TCP connection error

My job immediately crashes with the error message below. I don’t know where to begin looking for the cause of the error, or what information to provide to help you understand it. Maybe you could clue me in 😊.

I am on RedHat 4.18.0, using Slurm 20.11.8 and OpenMPI 4.1.5 compiled with gcc 8.5.0. I built OpenMPI with the following “configure” command:

./configure --prefix=/opt/openmpi/4.1.5_gnu --with-slurm --enable-debug

WARNING: Open MPI accepted a TCP connection from what appears to be a
another Open MPI process but cannot find a corresponding process
entry for that peer.

This attempted connection will be ignored; your MPI job may or may not
continue properly.

  Local host: n001
  PID:        985481