You appear to have HT/SMT enabled, so I would guess Slurm is treating the
node as 256 threads across 128 cores.
In other words, it'll depend on how jobs request resources (by thread or by
core).
You can force Slurm to ignore this distinction, if that's what you really
want.
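for example, a minimal sketch of the relevant slurm.conf pieces (node name and
counts are illustrative, assuming 2 sockets x 64 cores x 2 threads):
  NodeName=node001 Sockets=2 CoresPerSocket=64 ThreadsPerCore=2
  SelectType=select/cons_tres            # cons_res on older releases
  SelectTypeParameters=CR_Core_Memory
with CR_Core*, allocation is per physical core, so both hyperthreads of a core
always go to the same job; alternatively, setting CPUs=128 on the NodeName line
makes Slurm simply count cores and ignore the threads.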
regards, mark hahn
srun -N 1 -n 1 -p testA sleep 10
then the cpurawtime recorded by Slurm for this job is 640s, but the job
actually used only about 10s;
so I want to know whether there is any way to get the real CPU time used by
this job in Slurm.
if you really mean cpu time (compute-bound, not elapsed),
then don't you just want the CPU time that accounting already records?
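for what it's worth, sacct records both views: CPUTimeRAW is elapsed time
multiplied by the number of allocated CPUs, while TotalCPU is the CPU time the
processes actually consumed, e.g. (jobid is a placeholder):
  sacct -j <jobid> --format=JobID,Elapsed,AllocCPUS,CPUTimeRAW,TotalCPU
so a 10-second sleep allocated 64 CPUs shows CPUTimeRAW=640 but a TotalCPU of
nearly zero.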
You might want login nodes of different clusters to trust each other.
The big win is that you entirely avoid the presence of private keys on the
cluster.
We've used this widely in ComputeCanada since about 2003.
regards, mark hahn.
Is there no way to set or define a custom variable at the node level?
you could use a per-node Feature for this, but a partition would also work.
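a sketch of the Feature approach (names are purely illustrative):
  NodeName=node[01-04] ... Feature=bigscratch     # in slurm.conf
  sbatch --constraint=bigscratch job.sh           # how users select it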
bbles.
(So only helps when the workload doesn't keep all resources busy.)
regards, mark hahn.
it's worth noting that host-based trust has a lot of nice properties
for this kind of intra-cluster authentication. and in particular,
you don't need fragile and potentially dangerous keys sitting around.
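the moving parts are roughly (a sketch; these are the usual OpenSSH knobs, not
anything Slurm-specific):
  HostbasedAuthentication yes      # in sshd_config on the nodes
  HostbasedAuthentication yes      # in ssh_config on the clients
  EnableSSHKeysign yes             # also client-side ssh_config
plus /etc/ssh/shosts.equiv listing the trusted hostnames and the nodes' host
keys collected into ssh_known_hosts.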
regards, mark hahn
As a follow up to my last problem, I would like to know how I can tell
Slurm to increase the virtual memory size for a process.
have you read the messages on this list?
first, you can ask for the correct amount of memory.
this approach assumes that it is dangerous to allow VSZ > RSS.
that's c
d a large file.
According to [1], " No value is provided by cgroups for virtual memory size
('vsize') "
[1] https://slurm.schedmd.com/slurm.conf.html
depends on whether "ConstrainSwapSpace=yes" appears in cgroup.conf.
(it's yes on the system above)
regards, mark hahn.
Slurm uses the cgroup controls to tell the kernel how much memory to
permit the job step to use.
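the corresponding cgroup.conf knobs look like this (a sketch; whether you
constrain swap at all is a local policy choice):
  ConstrainRAMSpace=yes
  ConstrainSwapSpace=yes
  AllowedSwapSpace=0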
I would like to solve the problem locally for blast; I am not seeking a
system-wide solution right now.
there's nothing unique about your system or blast (which is extremely common
on many large slurm installs).
regards, mark hahn
(above, you're referring to the base
cgroup, not the cgroup for your job.) of course, manually fighting
Slurm is a Fairly Bad Idea.
you should read the documentation on cgroups to understand how these work.
memsw basically corresponds to VSZ in ps, whereas mem corresponds with RSS.
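you can see the limits Slurm set for a running job directly in the cgroup (v1)
hierarchy, e.g. (uid and jobid are placeholders):
  cat /sys/fs/cgroup/memory/slurm/uid_<uid>/job_<jobid>/memory.limit_in_bytes
  cat /sys/fs/cgroup/memory/slurm/uid_<uid>/job_<jobid>/memory.memsw.limit_in_bytes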
regards, mark hahn.
different, ad-hoc format), and partly to keep systems loosely coupled.
regards, mark hahn
that if you want, you can write a 10-line python script to generate
a report (maybe joining data in a way Grafana doesn't let you), or, if you
want to create automated actions (email notices, etc.), even modifications to
Slurm controls.
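even a shell pipeline against sacct can do a join Grafana won't; e.g., summing
CPU seconds per user over the last week (dates and fields are just an example):
  sacct -a -X -S $(date -d '7 days ago' +%F) --parsable2 --format=User,CPUTimeRAW |
    awk -F'|' 'NR>1 {cpu[$1]+=$2} END {for (u in cpu) print u, cpu[u]}'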
regards,
sorting the pending queue. to some extent, they let you view
a set of jobs as a unit, but you can also organize sets of jobs
via jobname.
regards, mark hahn
try CoresPerSocket=12 here, to match the provided lscpu?
(normally also ThreadsPerCore=2, since HT is enabled.)
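i.e. a NodeName line along these lines (fill in the socket count from your
lscpu; the hostname is illustrative):
  NodeName=yournode Sockets=2 CoresPerSocket=12 ThreadsPerCore=2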
regards,
indeed, the install would have to be performed via srun.
regards,
g and subtracting is not.
regards, mark hahn.
a callout to query the number of
free licenses, and consider a job eligible to start if its declared
usage fits (gres in Slurm terms, I think).
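(for comparison, the simple static form built into Slurm looks like this,
with the name and counts illustrative:
  Licenses=foo:10          # in slurm.conf
  sbatch -L foo:2 job.sh   # the job declares its usage
but that can't see licenses consumed outside Slurm, hence the callout idea.)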
regards, mark hahn
In theory, these small jobs could slip in and run alongside the large jobs,
what are your SelectType and SelectTypeParameters settings?
ExclusiveUser=YES on partitions?
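a quick way to check the current values (standard scontrol usage):
  scontrol show config | grep -i select
  scontrol show partition yourpartition   # should show ExclusiveUser/OverSubscribe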
regards, mark hahn.
that script calculates and sets the --licenses.
I would like to expose the user to Slurm as much
as possible, and use scripts as little as possible.
Well, that exposes it to a lot of error and possibly abuse.
regards,
need. I simply want to enforce the memory limits as specified by the user
at job submission time. This seems to have been the behavior in previous
versions.
but cgroups (with Constrain) do that all by themselves.
If someone could post just a simple slurm.conf file that forces the memory
limits to be enforced.
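a minimal sketch of the pieces that matter (standard parameter names, not a
complete slurm.conf):
  TaskPlugin=task/cgroup
  SelectType=select/cons_tres            # cons_res on older releases
  SelectTypeParameters=CR_Core_Memory    # include Memory so it is tracked
and in cgroup.conf:
  ConstrainRAMSpace=yes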
this! we've had a lot of discussion
on how to collect this information as well, even whether
it would be worth doing in a prolog script...
regards,
in order to work well, it needs a particular distribution
of job priorities.
regards, mark hahn.
it's a good idea to use the pam adopt-to-slurm plugin,
which makes even scheduler-oblivious mpirun behave better.
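(that's pam_slurm_adopt; the usual setup is roughly one line in the compute
nodes' /etc/pam.d/sshd account stack:
  account    required    pam_slurm_adopt.so
so an incoming ssh is adopted into the user's job cgroup, or rejected if they
have no job on that node.)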
regards, mark hahn
When I use sacct to show job stats, it always has a blank entry for the
MaxRSS field. Is there something that needs to be enabled to get that in?
missing for steps as well or only when using --allocations?
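note that MaxRSS is a per-step number, so it is blank on the allocation line;
compare e.g.:
  sacct -j <jobid> --format=JobID,JobName,MaxRSS,Elapsed
and check that JobAcctGatherType is set (jobacct_gather/linux or
jobacct_gather/cgroup), otherwise nothing is sampled at all.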
regards, mark hahn.
he resources, as long
as it's only what's allocated to their jobs, doesn't interfere with other
users, and is hopefully reasonably efficient. heck, we configure clusters
with hostbased trust, so it's easy for users to ssh among nodes.
regards,
ption parsing.
regards,
I'll be very grateful if anyone can explain where the 30-second clock
hides!
how about a timeout from elsewhere? for instance, when I see a 30s delay,
I normally at least check DNS, which can introduce such quantized delays.
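e.g. a quick check from a compute node (the hostname is whatever lookup you
suspect is slow):
  time getent hosts your-controller-hostname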
regards, mark hahn.
on the compute
nodes? over-quota even? I would certainly examine the slurm logs on
the compute nodes.
regards, mark hahn.
#!/bin/bash
#SBATCH -c 2
#SBATCH -o slurm-gpu-job.out
#SBATCH -p gpu.q
#SBATCH -w mk-gpu-1
#SBATCH --gres=gpu:1
could it be that sbatch is defaulting to --mem=0, meaning "all the node's
memory"?
regards, mark hahn.
/step_extern
4:devices:/slurm/uid_3000566/job_17268219/step_extern
3:cpuset:/slurm/uid_3000566/job_17268219/step_extern
2:cpuacct,cpu:/
1:name=systemd:/system.slice/sshd.service
regards, mark hahn
look at slurmd logs on the nodes.
regards, mark hahn.
Is there a way to instruct SBATCH to submit a job with a certain number of
cores without specifying anything else? I don't care which nodes or sockets
they run on. They would only use one thread per core.
not just --ntasks?
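e.g. (16 is just an example count):
  #SBATCH --ntasks=16
  #SBATCH --hint=nomultithread   # one thread per core on SMT nodes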
regards, mark hahn.
I would have expected a different approach: use a unique string for the
jobname, and always verify after submission. after all, squeue provides
a --name parameter for this (efficient query by logical job "identity").
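e.g. (the name is just an illustration):
  sbatch --job-name=myrun_001 job.sh
  squeue --name=myrun_001 -h -o '%i %T'   # prints jobid/state only if it is really queued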
regards, mark hahn.
name somewhat richer (username, account, etc)
regards, mark hahn.
eric jobid, and why would configuring
the scratch space be too slow to perform in the job prolog?
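a prolog sketch, assuming per-job scratch under /local/scratch (the variables
are ones Slurm exports to the Prolog environment):
  #!/bin/bash
  mkdir -p /local/scratch/${SLURM_JOB_ID}
  chown ${SLURM_JOB_USER} /local/scratch/${SLURM_JOB_ID}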
regards, mark hahn.
Also why aren't you using the Slurm commands to run things?
Which command?
srun or sbatch
difference between 1 task/node and all
threads/node?
regards, mark hahn.
our).
I would be interested to know whether other Slurm sites do this successfully,
particularly in avoiding the victim-stays-suspended priority inversion.
thanks,
tem on ssd)?
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/ch-fscache
if files are being re-read, this would be effective, fast, and convenient,
and wouldn't require any staging or hooks into Slurm.
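roughly: run cachefilesd with its cache directory on the SSD, and add the fsc
option to the NFS mount (server and path are illustrative):
  mount -t nfs -o fsc fileserver:/export /project
so re-reads come from the local cache instead of the network.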
regards, mark hahn
nteractive (and salloc's is). this may affect partition choice, etc.
regards, mark hahn.
-requiring script. why not just:
salloc --x11 srun ./whateveryourscriptwas
regards,
but are not bothering the scheduler during the job.
an alternative would be to run something like GNU Parallel within the job.
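e.g. inside one allocation (a sketch, assuming the job requested
--cpus-per-task and the inputs are files):
  parallel -j "$SLURM_CPUS_PER_TASK" ./process_one.sh ::: inputs/*.dat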
regards, mark hahn.
erver, and also on the same (trusted/routable) network
as the compute node?
In which case you don't want Slurm doing anything at all. Just let the
X client read DISPLAY from the environment propagated by Slurm.
regards, mark hahn
We have a use-case where the GRES being tracked on a particular partition
are GPU cards, but aren't being used by applications that would require them
exclusively (lightweight direct rendering rather than GP-GPU/CUDA
the issue is that slurm/kernel can't arbitrate resources on the GPU,
so overs