On Mon, May 27, 2024 at 2:59 PM Bjørn-Helge Mevik via slurm-users
<slurm-users@lists.schedmd.com> wrote:
>
> Ole Holm Nielsen via slurm-users <slurm-users@lists.schedmd.com> writes:
>
> > Whether or not to enable Hyper-Threading (HT) on your compute nodes
> > depends entirely on the properties of applications that you wish to
> > run on the nodes.  Some applications are faster without HT, others are
> > faster with HT.  When HT is enabled, the "virtual CPU cores" obviously
> > will have only half the memory available per core.
>
> Another consideration is, if you keep HT enabled, do you want Slurm to
> hand out physical cores to jobs, or logical cpus (hyperthreads)?  Again,
> what is best depends on your workload.  On our systems, we tend to
> either turn off HT, or hand out cores.

In the case where Hyper-Threading (HT) is enabled, is it possible to
configure Slurm to achieve the following effects:

1. If the total number of cores used by jobs is less than the number
of physical cores, then hand out physical cores to jobs.
2. When the total number of cores used by jobs exceeds the number of
physical cores, use logical CPUs for the excess part.
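
The two rules above amount to a fill order: one hardware thread per
physical core first, sibling hyperthreads only once every core is
busy. A minimal Python sketch of that ordering follows. It only
illustrates the intended policy and is not something Slurm itself
runs; the names fill_order and allocate, the (core, thread) numbering,
and the 16-core default are assumptions made for the example.

```python
def fill_order(cores: int, threads_per_core: int) -> list[tuple[int, int]]:
    """Return (core, thread) slots in the order they would be handed out.

    Thread 0 of every core comes first, so requests get whole physical
    cores until the cores run out; sibling threads cover only the excess.
    """
    return [(core, thread)
            for thread in range(threads_per_core)
            for core in range(cores)]


def allocate(requested: int, cores: int = 16, threads_per_core: int = 2):
    """Pick the first `requested` slots, failing if the node is too small."""
    slots = fill_order(cores, threads_per_core)
    if requested > len(slots):
        raise ValueError(f"only {len(slots)} logical CPUs available")
    return slots[:requested]


if __name__ == "__main__":
    picked = allocate(18)
    # Slots with thread > 0 share a physical core with an earlier slot.
    shared = sum(1 for _, thread in picked if thread > 0)
    print(f"{len(picked)} CPUs, {shared} on sibling hyperthreads")
```

With 16 physical cores, a request for 18 CPUs takes thread 0 of all 16
cores and then thread 1 of the first two cores.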

I heard about the following method to achieve the above purpose, but
have not tried it so far:

To configure Slurm for managing jobs more effectively when
Hyper-Threading (HT) is enabled, you can implement a strategy that
involves distinguishing between physical and logical cores. Here's a
possible approach to meet the requirements you described:

1. Configure Slurm to Recognize Physical and Logical Cores

First, ensure that Slurm can differentiate between physical and
logical cores. This typically involves setting the CpuBind and
TaskPlugin parameters correctly in the Slurm configuration file
(usually slurm.conf).

# Settings in slurm.conf
TaskPlugin=task/affinity
CpuBind=cores


2. Use Gres (Generic Resources) to Identify Physical and Logical Cores

You can utilize the GRES (Generic RESources) feature to define
additional resource types, such as physical and logical cores. First,
these resources need to be defined in the slurm.conf.

# Define resources in slurm.conf
NodeName=NODENAME Gres=cpu_physical:16,cpu_logical:32 CPUs=32 Boards=1 Sockets=2 CoresPerSocket=8 ThreadsPerCore=2


Here, cpu_physical and cpu_logical are custom resource names, each
followed by a count. The CPUs field should be set to the total number
of logical CPUs (hardware threads) on the node, here
Sockets x CoresPerSocket x ThreadsPerCore = 2 x 8 x 2 = 32.
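
The counts in the node line can be cross-checked from the topology
fields. A small Python sketch of that arithmetic, with hypothetical
helper names (physical_cores, logical_cpus) that are not part of
Slurm:

```python
def physical_cores(sockets: int, cores_per_socket: int) -> int:
    """Number of physical cores on the node."""
    return sockets * cores_per_socket


def logical_cpus(sockets: int, cores_per_socket: int,
                 threads_per_core: int) -> int:
    """Number of logical CPUs (hardware threads) on the node."""
    return physical_cores(sockets, cores_per_socket) * threads_per_core


if __name__ == "__main__":
    # Topology taken from the NodeName line:
    # Sockets=2 CoresPerSocket=8 ThreadsPerCore=2
    print("physical cores:", physical_cores(2, 8))
    print("logical CPUs:  ", logical_cpus(2, 8, 2))
```

For this node the sketch gives 16 physical cores and 32 logical CPUs,
matching CPUs=32 in the example.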

3. Write Job Submission Scripts

When submitting a job, users need to request the appropriate type of
cores based on their needs. For example, if a job requires more cores
than the number of physical cores available, it can request a
combination of physical and logical cores.

#!/bin/bash
#SBATCH --gres=cpu_physical:8,cpu_logical:4
#SBATCH --ntasks=12
#SBATCH --cpus-per-task=1

# Your job execution command

This script requests 8 physical cores and 4 logical cores, totaling 12 cores.
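
Since the two GRES counts and the task count are given separately,
nothing in the script itself forces them to agree. A minimal Python
sketch of a pre-submission check, assuming GRES strings of the simple
name:count form used above (the helpers parse_gres and
request_is_consistent are hypothetical, not Slurm tools):

```python
def parse_gres(spec: str) -> dict[str, int]:
    """Parse 'name:count,name:count' into {name: count}.

    Assumption: only the simple name:count form is handled; a bare
    name counts as 1. Typed GRES (name:type:count) are out of scope.
    """
    counts: dict[str, int] = {}
    for item in spec.split(","):
        name, _, count = item.partition(":")
        counts[name] = int(count) if count else 1
    return counts


def request_is_consistent(gres: str, ntasks: int, cpus_per_task: int) -> bool:
    """True if the requested GRES counts add up to the CPUs the tasks need."""
    return sum(parse_gres(gres).values()) == ntasks * cpus_per_task


if __name__ == "__main__":
    gres = "cpu_physical:8,cpu_logical:4"
    print(parse_gres(gres))
    print(request_is_consistent(gres, ntasks=12, cpus_per_task=1))
```

For the script above, 8 + 4 matches 12 tasks x 1 CPU per task, so the
check passes.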

> --
> B/H

Regards,
Zhao

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
