Using OpenMPI 5.0.3 and Slurm 20.11.8.
Is this error message issued by Slurm or by OpenMPI? A Google search on the
error message yielded nothing.
--
At least one of the requested hosts is not included in the current
a
Hello Kurt,
The host name looks a little odd. Do you by chance have a reproducer and
instructions on how you’re running it that we could try?
Howard
From: users on behalf of "Mccall, Kurt E.
(MSFC-EV41) via users"
Reply-To: Open MPI Users
Date: Monday, July 1, 2024 at 9:36 AM
To: "OpenMpi
Howard,
I don’t know where that ^X following the hostname came from. The node is
definitely named n001. I will try to create a reproducer.
Thanks,
Kurt
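
One way to check whether a stray control character (such as the ^X above) is really attached to a host name is to scan the names for non-printable characters. The following is only a minimal sketch: the helper name find_control_chars and the sample host list are hypothetical, and how the host names are obtained on a given cluster is left open.

```python
import unicodedata


def find_control_chars(hostname):
    """Return (index, code point) pairs for control characters in hostname."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(hostname)
        if unicodedata.category(ch) == "Cc"  # "Cc" is the Unicode control category
    ]


# Hypothetical host list; "n001\x18" mimics a name followed by ^X (0x18).
hosts = ["n001", "n001\x18", "n002"]
for h in hosts:
    bad = find_control_chars(h)
    if bad:
        # repr() makes the otherwise invisible character visible
        print(repr(h), bad)
```

Printing with repr() is the useful part here, since a control character is usually invisible in a terminal or a log file.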
From: Pritchard Jr., Howard
Sent: Monday, July 1, 2024 11:03 AM
To: Open MPI Users
Cc: Mccall, Kurt E. (MSFC-EV41)
Subject: Re: [EXTERN
Howard,
I should note that this code ran fine up to the point that our sysadmins
updated something on the cluster.
That makes me think it is a configuration issue, and that it wouldn’t give you
any insight if you ran my reproducer. It would succeed for you and still fail
for me.
What do you t
On a Cray XC (requiring the aprun launcher to get from batch node to compute
node), 4.0.5 works but 4.1.1 and 4.1.6 do not (even on a single node). The
newer ones throw this:
--
An ORTE daemon has unexpectedly failed after launch a
Hi Christoph,
First, a big caveat and disclaimer: I'm not sure if any Open MPI developers
still have access to Cray XC systems, so all I can do is make suggestions.
What's probably happening is that ORTE thinks it is going to fork off the
application processes on the head node itself. Tha