Hi,
I'm running Slurm 19.05.5.
I've tried to write a job submission script for a heterogeneous job
following the example at https://slurm.schedmd.com/heterogeneous_jobs.html
But it failed with the following error message:
$ sbatch new.bash
sbatch: error: Invalid directive found in batch script:
Hi Marcus,
Thanks for the clarification - I'd actually missed the 'SMT' in the subject.
Marcus Wagner writes:
> Hi Loris,
>
> CPU is the smallest schedulable unit; in the case of SMT, it's threads.
Would it be reasonable to say it's *always* threads and with HT you just
have twice as many as without? H
> What is your hardware configuration? Do you have 1 server with 44
> processor sockets, and each processor has 4 CPU cores? Or is it maybe 1
> server with 1 or more sockets for a total of 44 CPU cores, and each CPU
> core is running 4 hyperthreads?
1 server, 2 sockets, 22 cores each, 4 hyperthrea
What Marcus reports is quite correct. It can be confusing; I think Slurm
uses 'CPU' as a non-specific term meaning 'the smallest assignable
compute object'. With SMT enabled that is the thread, and with it
disabled it is the core.
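
As a rough illustration of the counting described above, the sketch below
multiplies out the figures mentioned in this thread (2 sockets, 22 cores
per socket, 4 hardware threads per core). The function name
`slurm_cpu_count` and its parameters are made up for this example and are
not part of Slurm; it only assumes that a 'CPU' is a thread when SMT is
on and a core when it is off.

```python
def slurm_cpu_count(sockets, cores_per_socket, threads_per_core, smt_enabled=True):
    """Count 'CPUs' under the assumption stated above (illustrative only)."""
    cores = sockets * cores_per_socket
    # With SMT enabled every hardware thread counts as one CPU;
    # with SMT disabled every core counts as one CPU.
    return cores * threads_per_core if smt_enabled else cores


print(slurm_cpu_count(2, 22, 4))                     # 176
print(slurm_cpu_count(2, 22, 4, smt_enabled=False))  # 44
```

Under that assumption, the 176 matches the CPUTot value quoted in this
thread, and 44 is the number of physical cores.
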
We were told by the company that installed the cluster at m
Hi Gizo,
I noticed SLURM_CONF was set to a broken socket when inside salloc,
that's why sinfo was confused.
I've found a workaround: if I run "unset SLURM_CONF" before sinfo, then
sinfo works.
Maybe a bug needs to be reported for this.
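
For scripts, the same workaround might be sketched in Python as below.
This assumes that the only problem is the SLURM_CONF variable inherited
inside salloc; the helper names `env_without` and `run_sinfo` are
hypothetical, and sinfo is called without any options.

```python
import os
import subprocess


def env_without(name, env=None):
    """Return a copy of the environment with the variable `name` removed."""
    cleaned = dict(os.environ if env is None else env)
    cleaned.pop(name, None)  # no error if the variable is not set
    return cleaned


def run_sinfo():
    """Run sinfo with SLURM_CONF unset (hypothetical helper; needs Slurm installed)."""
    return subprocess.run(["sinfo"], env=env_without("SLURM_CONF"), check=True)
```

Calling `run_sinfo()` inside the salloc session would then be the
equivalent of "unset SLURM_CONF" followed by sinfo, without changing the
environment of the calling shell.
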
Best regards,
Angelos
On 3/4/20 2:07 AM, nan...@luis.un
Hi Loris,
CPU is the smallest schedulable unit; in the case of SMT, it's threads.
At the moment we have HT disabled on our systems, so the CPU count is
equal to the number of cores for us. But with HT enabled, the CPU count
is twice as large (at least from Slurm 18.08).
Best
Marcus
On 3/4/20 10:33 AM, Loris Bennett
Hi Alexander,
Alexander Grund writes:
> Hi,
>
> we have a Power9 partition with 44 processors having 4 cores each
> totaling 176.
>
> `scontrol show node ` shows "CoresPerSocket=22" and "CPUTot=176"
> which confuses me. Especially as `whypending` reports e.g. "172 cores
> free: 1"
What's 'whype
On 3/4/20 10:12 AM, Alexander Grund wrote:
we have a Power9 partition with 44 processors having 4 cores each totaling
176.
What is your hardware configuration? Do you have 1 server with 44
processor sockets, and each processor has 4 CPU cores? Or is it maybe 1
server with 1 or more sockets
Hi,
we have a Power9 partition with 44 processors having 4 cores each
totaling 176.
`scontrol show node ` shows "CoresPerSocket=22" and "CPUTot=176"
which confuses me. Especially as `whypending` reports e.g. "172 cores
free: 1"
So what are "CPUs" and what are "Cores" to SLURM? Why does it