Thanks Howard. Here is what I got.
batch35:/p/work/borchert> mpirun -n 1 -d ./a.out
[batch35:62735] procdir: /p/work/borchert/ompi.batch35.34110/pid.62735/0/0
[batch35:62735] jobdir: /p/work/borchert/ompi.batch35.34110/pid.62735/0
[batch35:62735] top: /p/work/borchert/ompi.batch35.34110/pid.62735
Hi Chris
I wonder if something's messed up with the way ALPS is interpreting node names
on the system.
Could you try doing the following:
1. get a two node allocation on your cluster
2. run aprun -n 2 -N 1 hostname
3. take the hostnames returned, then run aprun -n 2 -N 1 -L X,Y hostname
Where X=
It’s the same output and the same result:
batch13:~> aprun -n 2 -N 1 hostname
nid00418
nid00419
batch13:~> aprun -n 2 -N 1 -L nid00418,nid00419 hostname
aprun: -L node_list contains an invalid entry
Usage: aprun [global_options] [command_options] cmd1
...
Thanks,
Chris
-Original Message
Okay, try setting this environment variable and see if the mpirun command works:
export OMPI_MCA_ras=alps
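For reference, the same selection can also be made per invocation rather than through the environment. This is only a minimal sketch: it assumes Open MPI's generic `--mca <framework> <component>` option to mpirun and reuses the ./a.out test binary from earlier in the thread. The mpirun calls are guarded so the snippet does nothing harmful on a machine without Open MPI or without that binary.

```shell
# Option 1: select the ALPS resource allocation (ras) component via the environment.
export OMPI_MCA_ras=alps

# Only launch if mpirun and the test binary from earlier in the thread are present.
if command -v mpirun >/dev/null 2>&1 && [ -x ./a.out ]; then
    # Picks up OMPI_MCA_ras from the environment.
    mpirun -n 1 ./a.out
    # Option 2: the same selection passed on the command line for a single run.
    mpirun --mca ras alps -n 1 ./a.out
fi
```

The environment variable applies to every mpirun started from that shell, while the command-line form only affects the one run, which may be handier while testing.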
On 7/11/24, 8:10 AM, "Borchert, Christopher B ERDC-RDE-ITL-MS CIV"
<christopher.b.borch...@erdc.dren.mil> wrote:
It’s the same output and the same result:
batch13:~> aprun -n 2
That did it! Thanks Howard!
-Original Message-
From: Pritchard Jr., Howard
Sent: Thursday, July 11, 2024 9:14 AM
To: Borchert, Christopher B ERDC-RDE-ITL-MS CIV; Open MPI Users
Subject: Re: [EXTERNAL] [OMPI users] Invalid -L flag added to aprun
Okay, try setting this environment var
Okay. Something must have broken between 4.0.x and 4.1.x to give the PBS Pro ras
component priority over ALPS, even on Cray XC systems.
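One way to look into this on an affected install is to list the ras components in the build and any priority parameters they expose. This is a rough sketch, not a confirmed diagnosis: `show_ras_components` is a hypothetical helper name, and whether each component actually reports a priority parameter on a given build is an assumption.

```shell
# Hypothetical helper: print the ras components in this Open MPI build
# and any priority-related parameters the ras framework reports.
show_ras_components() {
    if ! command -v ompi_info >/dev/null 2>&1; then
        echo "ompi_info not found in PATH"
        return 0
    fi
    # Components compiled into this build.
    ompi_info | grep "MCA ras" || true
    # Full parameter listing for the ras framework, filtered to priority settings.
    ompi_info --param ras all --level 9 | grep -i priority || true
}

show_ras_components
```

Comparing this output between a 4.0.x and a 4.1.x install might show whether the selection order changed.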
On 7/11/24, 8:21 AM, "Borchert, Christopher B ERDC-RDE-ITL-MS CIV"
<christopher.b.borch...@erdc.dren.mil> wrote:
That did it! Thanks Howard!
-Original Messag