IIRC, Slurm parses the batch file for options until it hits the first
non-comment line, and blank lines count as non-comment lines.
You may want to double-check the gaps in the option section of
your batch script.
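For example, in a minimal sketch like this (assuming blank lines do end
option parsing as described), the second directive would be silently
ignored:

#!/bin/bash
#SBATCH --partition=standard

#SBATCH --time=10:0:0   # never parsed: the blank line above ended option processing

Keeping all #SBATCH lines in one unbroken block at the top avoids the
issue.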
That being said, you say you removed the '&' at the end of the
command, which should help.
If they are all exiting with exit code 9, you need to look at the code
for your a.out to see what code 9 means, as your program is what is
raising that error. Slurm sees the exit code and, if it is non-zero,
interprets the job as failed.
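As a quick check (path taken from your script below), you can run the
binary interactively and inspect its exit status directly:

/home/arkoroy.sps.iitmandi/ferro-detun/input1/a_1.out
echo "exit code: $?"

Then search the source for where it exits with 9, e.g. an exit(9) or
return 9 in an error path (hypothetical; depends on your code).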
Brian Andrus
On 8/19/2024 12:50 AM, Arko Roy via slurm-users wrote:
Thanks Loris and Gareth. Here is the job submission script; if you
find any errors, please let me know.
Since I am not the admin but just a user, I think I don't have access
to the prolog and epilog files.
If the jobs are independent, why do you want to run them all on the same
node?
I am running sequential codes: essentially 50 copies of the same code
with a variation in a parameter.
Since I am using the Slurm scheduler, the nodes and cores are
allocated depending on the available resources. So there are instances
when 20 of the jobs go to 20 free cores on one node and the remaining
30 go to 30 free cores on another node. It turns out that only 1 job
out of the 20 and 1 job out of the 30 complete successfully with exit
code 0, and the rest get terminated with exit code 9.
For information, I run sjobexitmod -l jobid to check the exit codes.
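For a cross-check, sacct reports the same field, e.g.:
sacct -j <jobid> --format=JobID,State,ExitCode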
----------------------------------
The submission script is as follows:
#!/bin/bash
################
# Setting slurm options
################
# lines starting with "#SBATCH" define your job's parameters
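# note: lines beginning with "##SBATCH" are plain comments and ignored by Slurm;
# only lines beginning exactly with "#SBATCH" are parsed as options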
# requesting the type of node on which to run job
##SBATCH --partition <partition name>
#SBATCH --partition=standard
# telling slurm how many instances of this job to spawn (typically 1)
##SBATCH --ntasks <number>
##SBATCH --ntasks=1
#SBATCH --nodes=1
##SBATCH -N 1
##SBATCH --ntasks-per-node=1
# setting number of CPUs per task (1 for serial jobs)
##SBATCH --cpus-per-task <number>
##SBATCH --cpus-per-task=1
# setting memory requirements
##SBATCH --mem-per-cpu <memory in MB>
#SBATCH --mem-per-cpu=1G
# propagating max time for job to run
##SBATCH --time <days-hours:minutes:seconds>
##SBATCH --time <hours:minutes:seconds>
##SBATCH --time <minutes>
#SBATCH --time 10:0:0
#SBATCH --job-name gstate
#module load compiler/intel/2018_4
module load fftw-3.3.10-intel-2021.6.0-ppbepka
echo "Running on $(hostname)"
echo "We are in $(pwd)"
################
# run the program
################
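# note: the trailing '&' below backgrounds a_1.out, so the batch script
# (and hence the job) can end before the program finishes: the issue
# discussed above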
/home/arkoroy.sps.iitmandi/ferro-detun/input1/a_1.out &
--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com