Re: [slurm-users] Header lengths are longer than data received after changing SelectType & GresTypes to use MPS

2020-04-08 Thread Eric Berquist
I just ran into this issue. Specifically, Slurm looks for the NVML header file (which comes with CUDA or DCGM) in addition to the NVML library that comes with the drivers. The check is at https://github.com/SchedMD/slurm/blob/a763a008b7700321b51aad2e619deab00638a379/auxdir/x_ac_nvml.m4#L32. Once you

[slurm-users] Dynamically change JobFileAppend?

2020-02-27 Thread Eric Berquist
Is it possible to dynamically change JobFileAppend/open-mode behavior? I’m using EpilogSlurmctld to automatically requeue jobs that exit with a certain code, and would like to have those append rather than overwrite, but it seems blunt to set `JobFileAppend=1` and force people who want the defau

[slurm-users] Per-partition PreemptExemptTime?

2019-12-03 Thread Eric Berquist
Hello all, I’m trying to implement multiple “ephemeral” queues that allow general usage of project-specific hardware, but with preemption. In one partition, jobs would wait a while before being preempted; in another, preemption would occur almost immediately, using a very short time to emulate a short g