Thank you for your response. Your explanation helped me understand the
behavior.
After writing and running a new test program that only logs on SIGTERM, I
could confirm that the GraceTime was applied.
Thank you once again.
Below is some sample code for others' reference:
$ cat run-gpu.cu
#inc
On 08/11/2023 at 02:28, 김형진 wrote:
> Hello ~
>
> …
>
> However, as soon as the base QoS job is created, the large QoS job is
> immediately canceled without any waiting time.
>
> But in the slurmctld log, there is a grace time log.
>
> [2023-11-02T11:37:36.589] debug:
Hello ~
Please help me.
Total GPU : 4
Large qos : 3 (max 3 gpus)
Base qos : 2 (max 2 gpus)
I have a total of four GPUs.
When a job with the large QoS is using three GPUs and a job with the base
QoS is created,
I want the large QoS job to wait for a certain period before the base QoS
j