Re: [slurm-users] WTERMSIG 15

2021-12-01 Thread Yair Yarom
md/system/slurmd.service > > KillMode=process > > > > Instead of (for ubuntu nodes) > > KillMode=control-group > > > > *From:* slurm-users *On behalf of* > Yair Yarom > *Sent:* Tuesday, 30 November 2021 08:50 > *To:* Slurm User Community List > *Obj
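
For anyone who wants to check which KillMode a node's unit file sets, here is a minimal Python sketch (not from the thread). It assumes the unit file text has already been read into a string; the function name `kill_mode` and the example unit content, including the `ExecStart` path, are hypothetical. With `KillMode=process` systemd only kills the main slurmd process on stop/restart, while `control-group` (systemd's default) kills every process in the unit's control group.

```python
import configparser


def kill_mode(unit_text: str) -> str:
    """Return the KillMode set in a unit file's [Service] section.

    Falls back to "control-group", which is systemd's documented default
    when the directive is absent.
    """
    # strict=False tolerates repeated keys (unit files may repeat directives);
    # interpolation=None keeps '%' specifiers such as %n from raising errors.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # unit-file directive names are case-sensitive
    parser.read_string(unit_text)
    return parser.get("Service", "KillMode", fallback="control-group")


# Hypothetical unit file content, for illustration only.
EXAMPLE_UNIT = """\
[Unit]
Description=Slurm node daemon

[Service]
Type=simple
ExecStart=/usr/sbin/slurmd -D
KillMode=process
"""

if __name__ == "__main__":
    print(kill_mode(EXAMPLE_UNIT))
```

This only parses a single file and ignores drop-in overrides, so on a live node `systemctl show -p KillMode slurmd.service` is the more reliable way to see the effective value.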

Re: [slurm-users] WTERMSIG 15

2021-11-30 Thread LEROY Christine 208562
From: slurm-users On behalf of Yair Yarom Sent: Tuesday, 30 November 2021 08:50 To: Slurm User Community List Subject: Re: [slurm-users] WTERMSIG 15 Hi, There were two cases where this happened to us as well: 1. The systemd slurmd.service wasn't configured properly, and so the jobs

Re: [slurm-users] WTERMSIG 15

2021-11-29 Thread Yair Yarom
Hi, There were two cases where this happened to us as well: 1. The systemd slurmd.service wasn't configured properly, and so the jobs ran under the slurmd.slice. So by restarting slurmd, systemd will send a signal to all processes. You can check if this is the case with 'systemctl status slurmd.se

[slurm-users] WTERMSIG 15

2021-11-29 Thread LEROY Christine 208562
Hello all, I made some modifications to my slurm.conf, then restarted slurmctld on the master and slurmd on the nodes. During this process I lost some jobs (*); curiously, all of these jobs were on Ubuntu nodes. These jobs were fine as far as consumed resources go (**). Any idea what co