> On Aug 28, 2018, at 6:35 AM, Chris Samuel <ch...@csamuel.org> wrote:
>
> On Tuesday, 28 August 2018 10:21:45 AM AEST Chris Samuel wrote:
>
>> That won't happen on a well configured Slurm system as it is Slurm's role to
>> clear up any processes from that job left around once that job exits.
>
> Sorry Reid, for some reason I misunderstood your email and the fact you were
> talking about job steps! :-(
>
> One other option in this case is that you can say add 2 cores per node for the
> daemons to the overall job request and then do in your jobs
>
> srun --ntasks-per-node=1 -c 2 ./foo.py &
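
[For context, the layout Chris describes above -- the real work plus 2 extra cores per node for the daemons, with the daemon step started in the background -- might look roughly like the sketch below. The node count, core counts, and the ./foo.py / ./main_work names are placeholders, not anything taken from the thread:

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=18      # e.g. 16 cores of real work + 2 extra cores per node for the daemon

# one daemon task per node on 2 of the extra cores, left running in the background
srun --ntasks-per-node=1 -c 2 ./foo.py &

# the main work on the remaining cores; any daemon steps still running are
# cleaned up by Slurm when the batch script exits
srun --ntasks-per-node=16 -c 1 ./main_work
]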
Thanks Chris. I tried the following:

$ srun --ntasks-per-node=1 -c1 -- sleep 15 &
[1] 180948
$ srun --ntasks-per-node=1 -c1 -- hostname
srun: Job step creation temporarily disabled, retrying
srun: Job step created
cn001.localdomain
[1]+  Done                    srun --ntasks-per-node=1 -c1 -- sleep 15

and the second srun still waits until the first is complete. This is surprising to me: my understanding is that the first srun should allocate only one CPU, leaving 35 for the second srun, which also needs only one CPU and so should not have to wait. Is this behavior expected? Am I missing something?

Thanks,
Reid
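
[For anyone wanting to reproduce the test above inside a batch job, a minimal sketch is below. It assumes a single-node allocation with 36 CPUs (inferred from the "leaving 35" figure) and otherwise default settings; `time` just reports how long the second step is held up behind the first:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=36               # assumed 36-CPU node, per the "35 remaining" figure above

# start a 1-CPU step in the background, then immediately try a second 1-CPU step
srun --ntasks=1 -c 1 -- sleep 15 &
time srun --ntasks=1 -c 1 -- hostname
wait
]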