Hi Mihai,
This problem is not related to Slurm. It is rather about the question:
"when does command substitution happen?"
When you write
srun echo Running on host: $(hostname)
$(hostname) is replaced by the output of the hostname command *before*
the line is "submitted" to srun. This means that srun will happily run
the echo on any (remote) node, but always with the name of the host on
which the substitution took place, i.e. the node executing the batch script.
If you want to avoid this, one possible solution is
srun bash -c 'echo Running on host: $(hostname)'
In this case the single quotes keep the local shell from expanding
$(hostname), so the command substitution happens only after srun has
started bash on a (potentially remote) node.
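The difference can also be reproduced without Slurm. The following is only a minimal sketch: fake_remote is a hypothetical stand-in for srun, and the variable WHERE is a made-up marker that plays the role of the hostname. Neither is part of Slurm.

```shell
#!/bin/bash
# fake_remote is a hypothetical stand-in for srun: it runs its arguments
# in an environment where WHERE is set to "remote".
fake_remote() {
    WHERE=remote "$@"
}

# In the calling shell, WHERE is "local" (this plays the role of the
# node that executes the batch script).
export WHERE=local

# Unquoted: the calling shell expands $(printenv WHERE) first, so
# fake_remote only ever sees the finished string.
early=$(fake_remote echo "expanded on: $(printenv WHERE)")

# Single-quoted: the text is passed through untouched and expanded by
# the inner bash, which already runs with WHERE=remote.
late=$(fake_remote bash -c 'echo "expanded on: $(printenv WHERE)"')

echo "$early"
echo "$late"
```

The first line printed reports "local" and the second "remote", which mirrors the two srun variants above.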
Regards,
Hermann
On 5/28/24 10:54, Mihai Ciubancan via slurm-users wrote:
Hello,
My name is Mihai and I have an issue with a small GPU cluster managed
with Slurm 22.05.11. I get two different outputs when I try to find
out the names of the nodes (one correct and one wrong). The script is:
#!/bin/bash
#SBATCH --job-name=test
#SBATCH --output=/data/mihai/res.txt
#SBATCH --partition=eli
#SBATCH --nodes=2
srun echo Running on host: $(hostname)
srun hostname
srun sleep 15
And the output looks like this:
cat res.txt
Running on host: mihai-x8640
Running on host: mihai-x8640
mihaigpu2
mihai-x8640
As you can see, the output of the command 'srun echo Running on host:
$(hostname)' is the same for both tasks, as if the job had run twice on
the same node, while the command 'srun hostname' gives me the correct output.
Do you have any idea why the outputs of the 2 commands are different?
Thank you,
Mihai