Hello Kurt,

The host name looks a little odd. Do you by chance have a reproducer, along with instructions on how you’re running it, that we could try?
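In the meantime, the trailing ^X in the reported name (n001^X) is caret notation for the control character 0x18, so one possibility (only a guess at this point) is that a stray byte is ending up in the host string your code passes to MPI_Comm_spawn, for example through the "host" info key. Below is a minimal sketch of a check you could run on that string before the spawn call. The helper name first_control_char is hypothetical and is not part of Open MPI.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not an Open MPI function.
 * Returns the index of the first control character in host
 * (for example 0x18, which terminals display as ^X),
 * or -1 if the string contains none. */
static int first_control_char(const char *host)
{
    for (size_t i = 0; host[i] != '\0'; i++) {
        /* Cast to unsigned char: passing a negative char to iscntrl
         * is undefined behavior. */
        if (iscntrl((unsigned char)host[i])) {
            return (int)i;
        }
    }
    return -1;
}
```

If this returns something other than -1 for the host string just before the spawn, the string is probably being built or copied incorrectly (for instance, a buffer that is not null-terminated where expected).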
Howard

From: users <[email protected]> on behalf of "Mccall, Kurt E. (MSFC-EV41) via users" <[email protected]>
Reply-To: Open MPI Users <[email protected]>
Date: Monday, July 1, 2024 at 9:36 AM
To: "OpenMpi User List ([email protected])" <[email protected]>
Cc: "Mccall, Kurt E. (MSFC-EV41)" <[email protected]>
Subject: [EXTERNAL] [OMPI users] Slurm or OpenMPI error?

Using OpenMPI 5.0.3 and Slurm 20.11.8. Is this error message issued by Slurm or by OpenMPI? A Google search on the error message yielded nothing.

--------------------------------------------------------------------------
At least one of the requested hosts is not included in the current allocation.
Missing requested host: n001^X
Please check your allocation or your request.
--------------------------------------------------------------------------

Following that error, MPI_Comm_spawn failed on the named node, n001.

[n001:00000] *** An error occurred in MPI_Comm_spawn
[n001:00000] *** reported by process [595787777,0]
[n001:00000] *** on communicator MPI_COMM_SELF
[n001:00000] *** MPI_ERR_UNKNOWN: unknown error
[n001:00000] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[n001:00000] *** and MPI will try to terminate your MPI job as well)
^@1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
^@1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal

Thanks,
Kurt
