[slurm-users] How to get the path to original sbatch script

2018-05-25 Thread 程迪
Hi everyone, I just found that sbatch copies the original sbatch script to a new location, and I cannot get the path to the original script. Is there any way to solve this? I use the path to copy related files, because I need to populate a scratch folder to run my job. Di Cheng Engineer of Resear

[slurm-users] Nested sruns

2018-05-25 Thread Raymond Norris
Slurm: 17.11.4 I want to run an interactive job on a compute node. I know that I'm going to need to run an MPI app, so I request a bunch of tasks upfront: srun -n 16 --gres=gpu:4 --pty $SHELL This creates a job with 4 nodes. ... SLURM_CPUS_ON_NODE=4 SLURM_DISTRIBUTION=block SLURM

Re: [slurm-users] Controller / backup controller q's

2018-05-25 Thread Will Dennis
On Friday, May 25, 2018 5:31 AM, Pär Lindfors wrote: > Time to start upgrading to Ubuntu 18.04 now then? :-) Not yet time for us... There are problems with U18.04 that render it unusable for our environment. > For a 10 node cluster it might make more sense to run slurmctld and slurmdbd > on the

Re: [slurm-users] Controller / backup controller q's

2018-05-25 Thread Will Dennis
No cluster mgr/framework in use... Custom-compiled and packaged the Slurm 16.05.4 release into .rpm/.deb files, and used them to install the different nodes. Although the homedirs are no longer shared, the nodes do have access to shared storage, one mounted as a subdir of the home directory (wh

[slurm-users] sbatch option --propagate ignored

2018-05-25 Thread Hendryk Bockelmann
Hello, we recently updated from Slurm 16.05.x to 17.11.5 and found that the sbatch option --propagate is no longer honored. Although it is documented in the man pages, the following does not modify the core file and stack size limits on the compute nodes: #SBATCH --propagate=STACK,CORE but it can sti

Re: [slurm-users] Why SlurmUser is set to slurm by default?

2018-05-25 Thread Douglas Jacobsen
SlurmUser == root also has implications for strigger. It allows any user to set slurmctld executed striggers. This can be OK, or not, depending on your use cases and user community. User-specified strigger commands would run on the same node as the slurmctld process, and so the user-specified sc

Re: [slurm-users] Why SlurmUser is set to slurm by default?

2018-05-25 Thread Pär Lindfors
Hi Taras, On 05/24/2018 11:17 AM, Taras Shapovalov wrote: > We always use the default value for SlurmUser, but now we have realized > that we don't really get why it is user slurm, but not root. Sometimes > it is useful to run SlurmctlProlog as root, but then slurmctld will also > run as root. Oth

Re: [slurm-users] Controller / backup controller q's

2018-05-25 Thread John Hearns
Will, I know I will regret chiming in here. Are you able to say what cluster manager or framework you are using? I don't see a problem in running two different distributions. But as Per says look at your development environment. For my part, I would ask have you thought about containerisation? ie

Re: [slurm-users] Controller / backup controller q's

2018-05-25 Thread Pär Lindfors
Hi Will, On 05/24/2018 05:43 PM, Will Dennis wrote: > (we were using CentOS 7.x > originally, now the compute nodes are on Ubuntu 16.04.) Currently, we > have a single controller (slurmctld) node, an accounting db node > (slurmdbd), > and 10 compute/worker nodes (slurmd.) Time to start upgrading

Re: [slurm-users] Controller / backup controller q's

2018-05-25 Thread Benjamin Redling
On 24.05.2018 at 17:43, Will Dennis wrote: > 3) What are the steps to replace a primary controller, given that a > backup controller exists? (Hopefully this is already documented > somewhere that I haven’t found yet) Why not drive such a small cluster with a single primary controller in a mig