Do you know if the job is actually being killed? We had an issue on an older
version of Slurm where we got OOM errors but the tasks actually completed.
The OOM was reported when the job exited and was a false error.
Also, there are several bug reports open right now about an issue similar to
what
We had issues getting TMPDIR to work as well. We finally did this in our
prolog:
export SLURM_TMPDIR="/tmp/slurm/${SLURM_JOB_ID}"
This works.
-Roger
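
For readers trying to reproduce this, here is a minimal sketch of one way a per-job TMPDIR could be wired up. It is not the script from this message. It assumes the script is configured as a TaskProlog in slurm.conf, where Slurm reads lines of the form "export NAME=value" from the script's standard output and adds them to the task's environment. SCRATCH_BASE and emit_tmpdir_exports are hypothetical names, and the sketch assumes the base directory is writable by the job's user.

```shell
#!/bin/sh
# Hypothetical TaskProlog sketch (not the original poster's script).
# Assumption: Slurm runs this as a TaskProlog, so any stdout line of the
# form "export NAME=value" is added to the task's environment.

# SCRATCH_BASE is an illustrative name, not a Slurm variable.
SCRATCH_BASE="${SCRATCH_BASE:-/tmp/slurm}"

emit_tmpdir_exports() {
    # One scratch directory per job, named after the job ID.
    job_dir="${SCRATCH_BASE}/${SLURM_JOB_ID}"
    mkdir -p "$job_dir" || return 1
    # Printing (rather than exporting) is what hands the values to Slurm.
    echo "export SLURM_TMPDIR=${job_dir}"
    echo "export TMPDIR=${job_dir}"
}

# Only act when Slurm has provided a job ID.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    emit_tmpdir_exports
fi
```

Removing the directory when the job ends (for example from an epilog) is not shown here.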
From: slurm-users On Behalf Of
Ellestad, Erik
Sent: Tuesday, May 12, 2020 10:40 AM
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] R
Our prolog script just does this:
export SLURM_TMPDIR="/tmp/slurm/${SLURM_JOB_ID}"
This has worked for us.
-Roger
-Original Message-
From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of
Angelines
Sent: Thursday, December 5, 2019 9:58 AM
To: slurm-users@lists.sc
8.08.4.
Thanks in advance!
-Roger
Roger Moye
HPC Engineer
713.425.6236 Office
713.898.0021 Mobile
QUANTLAB Financial, LLC
3 Greenway Plaza
Suite 200
Houston, Texas 77046
www.quantlab.com
hat it can run? There were plenty of healthy nodes
for this job, so I'd prefer that the job not remain held indefinitely.
Thanks!
-Roger
accomplish this? Without this, node 1 fills up first before
any cores on node 2 are assigned.
Thanks in advance!
-Roger
We are having the exact same problem with $TMPDIR. I wonder if a bug has
crept in? I spoke to the SchedMD guys at SC18 last week and they were not
aware of a bug, but since more than one person is having this difficulty,
something must be wrong somewhere.
-Roger
From: slurm-users [mailto:sl
he step is running or
pending. Once it is finished, the information disappears. Is there a way to
see all job steps associated with a job regardless of the state of the step?
Thanks so much!
-Roger Moye