That's great! Thanks David!

On Wed, May 6, 2020 at 11:35 AM David Braun <dlbr...@umich.edu> wrote:
> I'm not sure I understand the problem. If you want to make sure the
> preamble and postamble run even if the main job doesn't run, you can use
> '-d'.
>
> From the man page:
>
>     -d, --dependency=<dependency_list>
>         Defer the start of this job until the specified dependencies
>         have been satisfied. <dependency_list> is of the form
>         <type:job_id[:job_id][,type:job_id[:job_id]]> or
>         <type:job_id[:job_id][?type:job_id[:job_id]]>. All dependencies
>         must be satisfied if the "," separator is used. Any dependency
>         may be satisfied if the "?" separator is used. Many jobs can
>         share the same dependency and these jobs may even belong to
>         different users. The value may be changed after job submission
>         using the scontrol command. Once a job dependency fails due to
>         the termination state of a preceding job, the dependent job
>         will never be run, even if the preceding job is requeued and
>         has a different termination state in a subsequent execution.
>
> For instance, create a job that contains this:
>
>     # --parsable makes sbatch print just the job ID rather than
>     # "Submitted batch job <id>", so the IDs can be chained below
>     preamble_id=`sbatch --parsable preamble.job`
>     main_id=`sbatch --parsable -d afterok:$preamble_id main.job`
>     sbatch -d afterany:$main_id postamble.job
>
> Best,
>
> D
>
> On Wed, May 6, 2020 at 2:19 PM Maria Semple <ma...@rstudio.com> wrote:
>
>> Hi Chris,
>>
>> I think my question isn't quite clear, but I'm also pretty confident
>> the answer is no at this point. The idea is that the script is a sort
>> of template for running a job: an end user can submit a custom job with
>> their own desired resource requests, which fill in the template. I'm
>> not in control of the Slurm cluster that will ultimately run the job,
>> nor of the details of the job itself. For example, template-job.sh
>> might look like this:
>>
>>     #!/bin/bash
>>     srun -c 1 --mem=1k echo "Preamble"
>>     srun -c <CPUs> --mem=<Memory>m /bin/sh -c <user's shell script>
>>     srun -c 1 --mem=1k echo "Postamble"
>>
>> My goal is that even if the user requests 10 CPUs when the cluster only
>> has 4 available, the Preamble and Postamble steps will always run. But
>> as I said, it seems that's not possible, since the maximum number of
>> CPUs has to be set on the sbatch allocation, and the whole job would be
>> rejected on the grounds that too many CPUs were requested. Is that
>> correct?
>>
>> On Tue, May 5, 2020, 11:13 PM Chris Samuel <ch...@csamuel.org> wrote:
>>
>>> On Tuesday, 5 May 2020 11:00:27 PM PDT Maria Semple wrote:
>>>
>>> > Is there no way to achieve what I want then? I'd like the first and
>>> > last job steps to always be able to run, even if the second step
>>> > needs too many resources (based on the cluster).
>>>
>>> That should just work.
>>>
>>> #!/bin/bash
>>> #SBATCH -c 2
>>> #SBATCH -n 1
>>>
>>> srun -c 1 echo hello
>>> srun -c 4 echo big wide
>>> srun -c 1 echo world
>>>
>>> gives:
>>>
>>> hello
>>> srun: Job step's --cpus-per-task value exceeds that of job (4 > 2).
>>> Job step may never run.
>>> srun: error: Unable to create step for job 604659: More processors
>>> requested than permitted
>>> world
>>>
>>> > As a side note, do you know why it's not even possible to restrict
>>> > the number of resources a single step uses (i.e. set fewer CPUs
>>> > than are available to the full job)?
>>>
>>> My suspicion is that you've not set up Slurm to use cgroups to
>>> restrict the resources a job can use to just those requested.
>>>
>>> https://slurm.schedmd.com/cgroups.html
>>>
>>> All the best,
>>> Chris
>>> --
>>> Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
>>
>
--
Thanks,
Maria
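
The cgroup setup Chris points to usually comes down to a couple of lines
of site configuration. A minimal sketch, assuming the proctrack/cgroup and
task/cgroup plugins are available on the cluster (exact settings vary by
site and Slurm version, so treat this as an illustration rather than a
verified config):

    # slurm.conf -- hand process tracking and task containment to cgroups
    ProctrackType=proctrack/cgroup
    TaskPlugin=task/cgroup

    # cgroup.conf -- confine each step to the CPUs/memory it requested
    ConstrainCores=yes
    ConstrainRAMSpace=yes

With those set (and the Slurm daemons restarted), a step such as
"srun -c 1" inside a 4-CPU allocation is confined to one core rather than
seeing all four, which is the per-step restriction Maria asked about.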