Various options that might help reduce job fragmentation:
Turn up debugging on slurmctld and add DebugFlags such as TraceJobs,
SelectType, and Steps. With debugging set high enough you can see a good
bit of the logic behind node selection.
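For example, a rough sketch of turning this on at runtime (the same settings
can be made permanent in slurm.conf via SlurmctldDebug= and DebugFlags=):

    # raise slurmctld log verbosity
    scontrol setdebug debug3
    # enable scheduling-related debug flags
    scontrol setdebugflags +TraceJobs
    scontrol setdebugflags +SelectType
    scontrol setdebugflags +Steps

Then watch the slurmctld log while a small job starts to see which nodes the
select plugin considers.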
Another option is CR_LLN for SelectTypeParameters, which schedules jobs onto
the least-loaded nodes.
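A minimal slurm.conf sketch of where that parameter would go (CR_Core_Memory
here is just a stand-in for whatever consumable-resource options you already
use):

    SelectType=select/cons_tres
    SelectTypeParameters=CR_Core_Memory,CR_LLN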
Hi Gerhard,
I am not sure if this counts as an administrative measure, but we do
highly encourage our users to always explicitly specify --nodes=n
together with --ntasks-per-node=m (rather than just --ntasks=n*m and
omitting the --nodes option, which may lead to cores allocated here and
there and eve
[...], and then I'll compromise on throughput to
get the urgent workload through sooner.
Tun
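To illustrate the request style Tun recommends above, a hedged sketch of the
two variants (node and task counts are made up for the example):

    # layout left to the scheduler: 128 tasks wherever cores are free
    #SBATCH --ntasks=128

    # explicit layout: two whole nodes, 64 tasks on each
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=64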
From: Loris Bennett via slurm-users
Sent: 09 April 2024 06:51
To: slurm-users@lists.schedmd.com
Cc: Gerhard Strangar
Subject: [slurm-users] Re: Avoiding fragmentation
Hi Gerhard,
Gerhard Strangar via slurm-users writes:
> Hi,
>
> I'm trying to figure out how to deal with a mix of few- and many-cpu
> jobs. By that I mean most jobs use 128 cpus, but sometimes there are
> jobs with only 16. As soon as that job with only 16 is running, the
> scheduler splits the