Hi,

We have been having some problems with NFS mounts via Infiniband getting dropped by nodes. We ended up moving our main admin server, which provides NFS and Slurm, from one machine to another.

Now, however, if slurmdbd is started, it segfaults as soon as slurmctld starts. In the slurmdbd.log we have

  slurmdbd: error: We have more allocated time than is possible (7724741 > 7012800) for cluster soroban(1948) from 2017-10-17T16:00:00 - 2017-10-17T17:00:00 tres 1
  slurmdbd: error: We have more time than is possible (7012800+36720+0)(7049520) > 7012800 for cluster soroban(1948) from 2017-10-17T16:00:00 - 2017-10-17T17:00:00 tres 1
  slurmdbd: Warning: Note very large processing time from hourly_rollup for soroban: usec=46390426 began=17:08:17.777
  Segmentation fault (core dumped)

and the corresponding output of strace is

  fstat(3, {st_mode=S_IFREG|0600, st_size=871270, ...}) = 0
  write(3, "[2017-10-17T17:09:04.168] Warnin"..., 132) = 132
  +++ killed by SIGSEGV (core dumped) +++

We're running 17.02.7.

Any ideas?

Cheers,

Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin
Email loris.benn...@fu-berlin.de
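
P.S. In case it helps with reading the error messages above: the limit 7012800 appears to be simply the CPU count times the length of the rollup window in seconds. The short Python sketch below checks that arithmetic. It assumes that the "(1948)" after the cluster name is the number of CPUs and that "tres 1" refers to the CPU TRES; the function names are made up for illustration and are not part of Slurm.

```python
# Figures copied from the slurmdbd error messages quoted in this mail.
CPUS = 1948              # assumption: "(1948)" after the cluster name is the CPU count
SECONDS_PER_HOUR = 3600  # the rollup window runs from 16:00:00 to 17:00:00


def capacity(cpus, seconds):
    """CPU-seconds available in a window of the given length."""
    return cpus * seconds


def overshoot(used, limit):
    """CPU-seconds by which the recorded usage exceeds the limit."""
    return used - limit


limit = capacity(CPUS, SECONDS_PER_HOUR)
print(limit)                                   # 7012800, the limit in both messages
print(overshoot(7724741, limit))               # 711941, first message
print(overshoot(7012800 + 36720 + 0, limit))   # 36720, second message
```

So for that hour the database seems to hold about 711941 more allocated CPU-seconds than the cluster could have provided, if the assumptions above are right.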