Chandler Sobel-Sorenson writes:
> Perhaps there is a way to import it into a spreadsheet?
You can use `sacct -P -l`, which gives you '|'-separated output that
should be possible to import into a spreadsheet.
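For example, something along these lines should produce a file most spreadsheet
programs can open (the field names are just examples; `sacct -e` lists everything
that is available):

$ sacct -P -o JobID,JobName,Partition,State,Elapsed,MaxRSS -S 2022-12-01 > jobs.txt
$ sacct -P --delimiter=',' -o JobID,JobName,State,Elapsed -S 2022-12-01 > jobs.csv

The second form writes a comma-separated file directly, but commas inside job
names will break it, so '|' is usually the safer delimiter.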
(Personally I only use `-l` when I'm looking for the name of an
attribute and am to
Is there a recommended way to read output from `sacct` with the `-l` or
`--long` option? I have dual monitors and shrunk the terminal's font down to 6
pt or so until I could barely read it, giving me 675 columns. This was still
not enough...
Perhaps there is a way of displaying it so the li
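(One trick that helps with very wide output is to chop the lines instead of
letting the terminal wrap them, and scroll sideways:

$ sacct -l | less -S

`less -S` chops lines longer than the screen; the left/right arrow keys scroll
horizontally.)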
Hi Paul,
sorry to say, but that has to be some coincidence on your system. I've never
seen Slurm report core numbers higher than the total number of cores.
I have, e.g., an Intel Platinum 8160 here: 24 cores per socket, no HyperThreading
activated.
Yet here the last lines of /
You can use slurm with hyperthreaded cores. It takes awareness when
configuring and requesting the resources.
The can of worms you are opening is the stance (in HPC) that
hyperthreading is detrimental. If you are using HPC as intended, I
completely agree with this stance. The objective is to b
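If you do want Slurm to know about the threads, a minimal sketch (node names and
counts below are placeholders, not taken from this thread) is to describe them in
slurm.conf and let jobs opt out of the second thread per core:

# slurm.conf -- illustrative node line; SelectTypeParameters=CR_Core makes Slurm
# allocate whole cores rather than individual threads
NodeName=node[1-5] Sockets=1 CoresPerSocket=4 ThreadsPerCore=2 State=UNKNOWN

# a job that wants one task per physical core can then ask for:
$ srun --ntasks=4 --hint=nomultithread ./my_app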
Hi, thanks for getting back to me.
I have been doing some more experimenting, and I think that the issue is
because the Azure VMs for my nodes are HyperThreaded.
Slurm sees the cluster as 5 nodes with 1 CPU each and seems to ignore the
HyperThreading, hence Slurm sees the cluster as a 5-CPU cluster.
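One quick way to check what a node actually reports is to run slurmd in
config-dump mode on one of the VMs; the output below is only illustrative, not
from this cluster:

$ slurmd -C
NodeName=azhpc-1 CPUs=2 Boards=1 SocketsPerBoard=1 CoresPerSocket=1 ThreadsPerCore=2 RealMemory=7168

Comparing that line with the NodeName definition in slurm.conf usually shows
where a CPU-count mismatch comes from.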
Nice find. Thanks for sharing back.
On Tue, Dec 13, 2022 at 10:39 AM Paul Raines
wrote:
>
> Yes, looks like SLURM is using the apicid that is in /proc/cpuinfo.
> The first 14 CPUs (procs 0-13) have apicid
> 0,2,4,6,8,10,12,14,16,20,22,24,26,28 in /proc/cpuinfo.
>
> So after setting Cp
Gary,
Well, your first issue is using CycleCloud, but that is mostly opinion :)
Your error states there aren't enough CPUs in the partition, which means
we should take a look at the partition settings.
Take a look at 'scontrol show partition hpc' and see how many nodes are
assigned to it. Als
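For reference, the fields worth checking in that output (names as scontrol
prints them; the values here are placeholders):

$ scontrol show partition hpc
PartitionName=hpc
   Nodes=...  TotalNodes=...  TotalCPUs=...
   MaxNodes=... MaxCPUsPerNode=...

TotalCPUs is roughly the upper bound on what a single job can ask for in that
partition.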
Yes, looks like SLURM is using the apicid that is in /proc/cpuinfo.
The first 14 CPUs (procs 0-13) have apicid
0,2,4,6,8,10,12,14,16,20,22,24,26,28 in /proc/cpuinfo.
So after setting CpuSpecList=0,2,4,6,8,10,12,14,16,18,20,22,24,26
in slurm.conf it appears to be doing what I want
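For context, the relevant slurm.conf line would look roughly like this (the
socket/core/thread counts are placeholders, not this host's real topology):

NodeName=foobar Sockets=2 CoresPerSocket=14 ThreadsPerCore=2 CpuSpecList=0,2,4,6,8,10,12,14,16,18,20,22,24,26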
In the slurm.conf manual they state the CpuSpecList IDs are "abstract", and
the CPU management docs reinforce that the abstract Slurm IDs are not related
to the Linux hardware IDs, so that is probably the source of the behavior.
I unfortunately don't have more information.
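If it helps to see the two numbering schemes side by side, hwloc's lstopo prints
both a logical index (L#) and the OS/physical index (P#) for every hardware
thread; the values below are only illustrative:

$ lstopo-no-graphics --only pu
PU L#0 (P#0)
PU L#1 (P#28)
...

(This is just a way to compare orderings; whether Slurm's abstract IDs follow the
logical order is an assumption on my part, not something stated in the docs.)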
On Tue,
Hmm. Actually looks like confusion between CPU IDs on the system
and what SLURM thinks the IDs are
# scontrol -d show job 8
...
Nodes=foobar CPU_IDs=14-21 Mem=25600 GRES=
...
# cat /sys/fs/cgroup/system.slice/slurmstepd.scope/job_8/cpuset.cpus.effective
7-10,39-42
-- Paul Raines
Oh, but that does explain the CfgTRES=cpu=14. With the CpuSpecList
below and SlurmdOffSpec I do get CfgTRES=cpu=50, so that makes sense.
The issue remains that though the number of CPUs in CpuSpecList
is taken into account, the exact IDs seem to be ignored.
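A quick way to cross-check those numbers on the node (node name taken from the
earlier output) is:

$ scontrol show node foobar | grep -E 'CPUTot|CfgTRES|AllocTRES'

which shows the total CPU count slurmctld has configured for the node and what
is currently allocated.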
-- Paul Raines
I have tried it both ways with the same result. The assigned CPUs
will be both in and out of the range given to CpuSpecList.
I tried using commas instead of ranges, so used
CpuSpecList=0,1,2,3,4,5,6,7,8,9,10,11,12,13
but it still does not work.
$ srun -p basic -N 1 --ntasks-per-node=1 --m
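One way to see exactly which CPUs a step ends up bound to, independent of what
the accounting reports, is to have the step print its own affinity mask (same
partition and options as in the truncated command above):

$ srun -p basic -N 1 --ntasks-per-node=1 grep Cpus_allowed_list /proc/self/status

The Cpus_allowed_list line comes straight from the kernel, so it reflects the
cgroup/affinity actually applied to the task.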
Dear Slurm users, perhaps you can help me with a problem that I am having
using the scheduler (I am new to this, so please forgive me for any stupid
mistakes/misunderstandings).
I am not able to submit a multi-threaded MPI job on a small demo cluster
that I have set up using Azure CycleCloud that u
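For what it's worth, a minimal hybrid MPI + OpenMP batch script normally looks
something like the sketch below; the partition name, sizes, and binary are
placeholders:

#!/bin/bash
#SBATCH --partition=hpc
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=4

export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
srun --cpus-per-task=${SLURM_CPUS_PER_TASK} ./my_mpi_app

If the partition cannot satisfy ntasks-per-node x cpus-per-task on each node,
you get the kind of "not enough CPUs in the partition" error discussed above.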
Thanks a lot.
That is what I was looking for.
Regards.
> Kilian Cavalotti wrote (12 Dec 2022 20:51):
>
> Hi Sefa,
>
> `scontrol -d show job <jobid>` should give you that information:
>
> # scontrol -d show job 2781284 | grep Nodes=
> NumNodes=10 NumCPUs=256 NumTasks=128 CPUs/T