Hello, guys,
I am doing a parsing job on the output of the sacct command, and I know the
fields that can be specified for output.
The difficulty I am facing is that I lack detailed information about those
fields. I need to do calculations on the fields, so I need to understand what
values they
We're in the midst of transitioning our SGE cluster to slurm 20.02.6, running
on up-to-date CentOS-7. We built RPMs from the standard tarball against CUDA
10.1. These RPMs worked just fine on our first GPU test node (with Tesla K80s)
using "AutoDetect=nvml" in /etc/gres.conf. However, we just
Hello, slurm users and Brian,
Thanks a lot for your reply. The thing is, I actually do know the fields; I
just need detailed info about them. For example, you may get an "Unknown"
value for some time fields, and the MaxVMSize field is an empty string except
for some job steps. I need to know the
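For what it's worth, one way to guard against those values before doing
arithmetic on them (the column position and the '|' delimiter are assumptions
for illustration, not from the message):

# keep only records whose MaxVMSize is present and not "Unknown"
awk -F'|' '$3 != "" && $3 != "Unknown"' sacct_output.txt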
My mistake - from slurm.conf(5):
SrunProlog runs on the node where the "srun" is executing.
i.e. the login node, which explains why the directory is not being created on
the compute node, while the echoes work.
--
David Chin, PhD (he/him) Sr. SysAdmin, URCF, Drexel
dw...@drexel.edu
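For reference, the two prolog hooks in slurm.conf (a minimal sketch; the
script paths are placeholders):

SrunProlog=/etc/slurm/srun_prolog.sh   # executed by srun, i.e. on the node where srun is invoked
TaskProlog=/etc/slurm/task_prolog.sh   # executed by slurmd on the compute node before each task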
Hi all:
Prentice wrote:
> I don't see how that bug is related. That bug is about requiring the
> libnvidia-ml.so library for an RPM that was built with NVML
> Autodetect enabled. His problem is the opposite - he's already using
> NVML autodetect, but wants to disable that feature on a single node,
Well, reading the source it looks like xcgroup_set_params is just writing to
the devices.allow and devices.deny files. I haven't yet found what cg->path is
being set to, but presumably it is something like
/sys/fs/cgroup/slurm/uid_##/job_#/step_0 or the equivalent for the job
in question.
I'm sti
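If that reading is right, replicating the setup by hand under cgroup v1 would
amount to something like this (the uid/job/step path, the device numbers, and
the devices-controller mount point are assumptions, not taken from the
message):

CG=/sys/fs/cgroup/devices/slurm/uid_1000/job_123/step_0
mkdir -p "$CG"
echo "a *:* rwm"   > "$CG/devices.deny"    # deny access to all devices first
echo "c 195:0 rwm" > "$CG/devices.allow"   # then allow /dev/nvidia0 (char major 195, minor 0)
echo $$ > "$CG/tasks"                      # move the current shell into the cgroup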
Hello
I am trying to debug an issue with EGL support (updated NVIDIA drivers and
now EGLGetDisplay and EGLQueryDevicesExt are failing if they can't access all
/dev/nvidia# devices in slurm) and am wondering how slurm uses device cgroups
so I can implement the same cgroup setup by hand and te
They're also listed on the sacct online man page:
https://slurm.schedmd.com/sacct.html
Scroll down until you see the text box with the white text on a black
background - you can't miss it.
Also, depending on how you're parsing the output, you might want to skip
printing the headers, which ca
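For example, a header-free, machine-parsable invocation for scripting might
look like this (the job id and the field list are just illustrations):

sacct -j 12345 --noheader --parsable2 --format=JobID,Elapsed,State,MaxVMSize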
Hi Brian:
This works just as I expect for sbatch.
The example srun execution I showed was a non-array job, so the first half of
the "if []" statement holds. It is the second half, which deals with job
arrays, that has the period.
The value of TMP is correct, i.e. "/local/scratch/80472"
And t
I think it isn't running the way you think it is, or there is something not
shown in the description.
You have:
export TMP="/local/scratch/${SLURM_ARRAY_JOB_ID}.${SLURM_ARRAY_TASK_ID}"
Notice that period in there.
Then you have:
node001::~$ echo $TMP
/local/scratch/80472
There is
Hi, Brian:
So, this is my SrunProlog script -- I want a job-specific tmp dir, which makes
for easy cleanup at end of job:
#!/bin/bash
if [[ -z ${SLURM_ARRAY_JOB_ID+x} ]]
then
export TMP="/local/scratch/${SLURM_JOB_ID}"
export TMPDIR="${TMP}"
export LOCAL_TMPDIR="${TMP}"
export BE
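For reference, a filled-in version of such a prolog could look like the
following; the array-job branch and the mkdir are assumptions added for
illustration, and note (per the SrunProlog discussion above) that this runs
where srun is invoked, not necessarily on the compute node:

#!/bin/bash
# job-specific scratch dir: array jobs get "<array_job_id>.<task_id>", others "<job_id>"
if [[ -z ${SLURM_ARRAY_JOB_ID+x} ]]
then
    export TMP="/local/scratch/${SLURM_JOB_ID}"
else
    export TMP="/local/scratch/${SLURM_ARRAY_JOB_ID}.${SLURM_ARRAY_TASK_ID}"
fi
export TMPDIR="${TMP}"
export LOCAL_TMPDIR="${TMP}"
mkdir -p "${TMP}"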
It seems to me, if you are using srun directly to get an interactive
shell, you can just run the script once you get your shell.
You can set the variables and then run srun. It automatically exports
the environment.
If you want to change a particular one (or more), use something like
--ex
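A concrete sketch of that pattern (the variable name is a placeholder):

export MYVAR=foo
srun --pty bash                          # current environment, including MYVAR, is propagated
srun --export=ALL,MYVAR=bar --pty bash   # or override selected variables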
Dear Slurm users,
we are running a cluster that has a flat account structure. All accounts
have a monthly limit that can only change on the 1st of a month. Users
assigned to the very same account shall not compete against each other
(created with fairshare=parent) and their fairshare shall be cal
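A sketch of the kind of association setup being described (account and user
names are placeholders):

sacctmgr add account projA
sacctmgr add user alice account=projA fairshare=parent
sacctmgr add user bob account=projA fairshare=parent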