On 18/03/2019 23.07, Eric Rosenberg wrote:
>  [2019-03-15T09:48:43.000] update_node: node rn003 reason set to: Kill task 
> failed

For me this usually happens when one of the shared filesystems
is overloaded and processes are stuck in uninterruptible sleep
(state D), and are thus unable to terminate.

The cause in your case may be different.
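
One way to check for this on the affected node is to list the
processes currently in state D. Below is a minimal sketch in Python,
assuming a Linux /proc layout; the function names parse_stat and
find_d_state are made up for illustration and are not part of Slurm.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    open_idx = stat_text.index("(")
    close_idx = stat_text.rindex(")")
    pid = int(stat_text[:open_idx].strip())
    comm = stat_text[open_idx + 1:close_idx]
    state = stat_text[close_idx + 1:].split()[0]
    return pid, comm, state


def find_d_state(proc_root="/proc"):
    """Return a list of (pid, comm) for processes in uninterruptible sleep."""
    if not os.path.isdir(proc_root):
        return []
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or stat was unreadable during the scan
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in find_d_state():
        print(pid, comm)
```

If this prints job processes that stay in D for a long time, the
filesystem they are blocked on is the first place I would look.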

HTH, P

-- 
Dr. Pawel Dziekonski <pawel.dziekon...@kaust.edu.sa>
KAUST Advanced Computing Core Laboratory
https://www.hpc.kaust.edu.sa

