I will admit that I have not used sbcast, but from reading the man pages I
think it does not do what you hope.
The sbcast command will indeed run on the first allocated node, so the source
file must be accessible from there. The man page does say that shared file
systems are a better solution than sbcast, and I think that is the clue; sbcast
is designed for clusters where there is no shared file system available between
all the nodes. If there were such a shared file system, the main part of the
script (which executes on every node, including the first) could simply copy
any file from shared storage to local storage. That pattern is common where
the shared storage (e.g. NFS) is not very fast and each node has fast local
storage to make use of.
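For the shared-file-system case, the copy can be done without sbcast at all. A minimal sketch of such a batch script, with hypothetical paths (/nfs/project/mybinary and /tmp/mybinary are placeholders to adjust for your site):

```shell
#!/bin/bash
#SBATCH --nodes=4

# One task per node: each node copies the binary from slow shared
# storage (e.g. NFS) to its own fast local storage.
srun --ntasks-per-node=1 cp /nfs/project/mybinary /tmp/mybinary

# Run the actual job using the node-local copy on every node.
srun /tmp/mybinary
```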
It sounds like sbcast also has some useful options for copying different files
to different subsets of the allocated nodes; there is an example for
heterogeneous jobs: https://slurm.schedmd.com/heterogeneous_jobs.html
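For the case sbcast is designed for (no shared file system across all nodes), the usual pattern is to call sbcast at the top of the batch script. A sketch, assuming the source binary is readable on the first allocated node (where the batch script runs) and using hypothetical paths:

```shell
#!/bin/bash
#SBATCH --nodes=4

# sbcast runs on the first allocated node; it reads the source file
# there and broadcasts a copy to the destination path on every node
# allocated to the job.
sbcast ~/mybinary /tmp/mybinary

# All tasks can now execute the node-local copy.
srun /tmp/mybinary
```

This only works if the source file is accessible from the first allocated node, which is the crux of the question below.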
Hopefully someone else has actually used it and can give an idea of how to do
what you need to do.
William
From: slurm-users On Behalf Of Hector Yuen
Sent: 18 April 2020 09:06
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] Correct way to do sbcast with sbatch
Hello, I have a very basic question about using sbcast.
I start from the master node (the one running slurmctld), and I have my binary
there.
When I submit a job, I do it through sbatch. The first command I want to run
is an sbcast, but by then I am already inside one of the nodes allocated to
the job, which doesn't have the binary in the first place.
What is the correct way to use sbcast inside sbatch?
Thanks
--
-h