> Hopefully others in the group have some
> ideas/explanations. I haven't had to deal with GPU resources in Slurm.
>
> On Fri, Jan 13, 2023 at 4:51 AM Helder Daniel wrote:
>
>> Oh, ok.
>> I guess I was expecting that the GPU job was suspended copying GPU memory
>> to R
> That isn't possible with GANG,SUSPEND. GPU memory isn't
> managed in Slurm so the idea of suspending GPU memory for another job to
> use the rest simply isn't possible.
>
> On Fri, Jan 13, 2023 at 4:08 AM Helder Daniel wrote:
>
>> Hi Kevin
>>
>> I did a "scontrol
/usr/lib/xorg/Xorg
4MiB |
|3 N/A N/A524226 C /bin/python
15362MiB |
+-+
On Fri, 13 Jan 2023 at 12:08, Helder Daniel wrote:
> Hi Kevin
>
> I did a "scontrol show partition".
>
> MemPerNode=UNLIMITED
On Fri, 13 Jan 2023 at 11:16, Kevin Broch wrote:
> The problem might be that OverSubscribe is not enabled? Without it, I don't
> believe the time-slicing can be gang-scheduled.
>
> Can you do a "scontrol show partition" to verify that it is?
>
> On Thu, Jan 12, 2023, Helder Daniel wrote:
Hi,
I am trying to enable gang scheduling on a server with a 32-core CPU
and 4 GPUs.
However, with gang scheduling enabled, the CPU jobs (or GPU jobs) are not
being preempted after the time slice, which is set to 30 secs.
Below is a snapshot of squeue. There are 3 jobs each needing 32 cores. The
first