Hi there,

I have a recurring problem with allocated TRES that are not released after all jobs on the node have finished. The TRES are still marked as allocated, and no new jobs that use those TRES can be scheduled on that node.

$ scontrol show node node2
NodeName=node2 Arch=x86_64 CoresPerSocket=64
   CPUAlloc=0 CPUTot=256 CPULoad=0.11
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=gpu:tesla:8
   NodeAddr=node2 NodeHostName=node2 Version=21.08.5
   OS=Linux 5.15.0-89-generic #99-Ubuntu SMP Mon Oct 30 20:42:41 UTC 2023
   RealMemory=1025593 AllocMem=0 FreeMem=1025934 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=AMPERE
   BootTime=2023-11-23T09:01:28 SlurmdStartTime=2023-11-23T09:02:09
   LastBusyTime=2023-11-23T09:03:19
   CfgTRES=cpu=256,mem=1025593M,billing=256,gres/gpu=8,gres/gpu:tesla=8
   AllocTRES=gres/gpu=8
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

Previously, the allocation disappeared after the server had been powered off for a couple of hours (power conservation). The issue has now occurred again, and this time it persists even after the server was off overnight.

Is there any way to release the allocation manually?

Regards,
Gerald Schneider

--
Gerald Schneider

Fraunhofer-Institut für Graphische Datenverarbeitung IGD
Joachim-Jungius-Str. 11 | 18059 Rostock | Germany
Tel. +49 6151 155-309 | +49 381 4024-193 | Fax +49 381 4024-199
gerald.schnei...@igd-r.fraunhofer.de | www.igd.fraunhofer.de