Hi all,
I'm trying to set up GPU GRES types to correctly identify the installed
hardware (generation and memory size). I'm using a mix of explicit
configuration (to set a friendly type name) and autodetection (to handle the
cores and links detection). I'm seeing two related issues which I don't
Alexander Grund wrote:
> Our first approach with `scancel $SLURM_JOB_ID; exit 1` doesn't seem to
> work as the (sbatch) job still gets re-queued.
Try to exit with 0, because it's not your prolog that failed.
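For illustration, the suggestion above could be sketched roughly as follows.
This is only a minimal sketch, not a tested prolog: the function name
cancel_job_from_prolog and the SCANCEL override variable are hypothetical
names introduced here (the override exists only so the helper can be
tried without a cluster), and whether exiting with 0 actually prevents the
requeue depends on your Slurm version and configuration.

```shell
#!/bin/bash
# Hypothetical helper for a prolog script: cancel the job, then report
# success so that the prolog itself is not treated as having failed.
cancel_job_from_prolog() {
    local job_id="$1"
    # SCANCEL is an assumption for dry runs; it defaults to the real scancel.
    "${SCANCEL:-scancel}" "$job_id"
    # Always return 0, as suggested above, regardless of scancel's status.
    return 0
}
```

In the prolog one would then call `cancel_job_from_prolog "$SLURM_JOB_ID"`
followed by `exit 0` at the point where the original script used
`scancel $SLURM_JOB_ID; exit 1`.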
Hi
I made some progress trying to understand the problem I reported some weeks ago:
https://lists.schedmd.com/pipermail/slurm-users/2023-May/010027.html
I noticed that the intermittent connection timeout that I am experiencing
occurs only when using the TCP-based direct connection to establi