> Anyway I suggest to update the operating system to stretch and fix your
> configuration under a more recent version of slurm.

I think I'll get to that soon :)
b

2018-01-15 14:08 GMT+01:00 Gennaro Oliva <oliv...@na.icar.cnr.it>:

> Ciao Elisabetta,
>
> On Mon, Jan 15, 2018 at 01:13:27PM +0100, Elisabetta Falivene wrote:
> > Error messages are not much helping me in guessing what is going on. What
> > should I check to get what is failing?
>
> check slurmctld.log and slurmd.log, you can find them under
> /var/log/slurm-llnl
>
> > PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
> > batch* up infinite 8 unk* node[01-08]
> >
> > Running
> > systemctl status slurmctld.service
> >
> > returns
> >
> > slurmctld.service - Slurm controller daemon
> >    Loaded: loaded (/lib/systemd/system/slurmctld.service; enabled)
> >    Active: failed (Result: timeout) since Mon 2018-01-15 13:03:39 CET; 41s ago
> >   Process: 2098 ExecStart=/usr/sbin/slurmctld $SLURMCTLD_OPTIONS (code=exited, status=0/SUCCESS)
> >
> > slurmctld[2100]: cons_res: select_p_reconfigure
> > slurmctld[2100]: cons_res: select_p_node_init
> > slurmctld[2100]: cons_res: preparing for 1 partitions
> > slurmctld[2100]: Running as primary controller
> > slurmctld[2100]: SchedulerParameters=default_queue_depth=100,max_rpc_cnt=0,max_sched_time=4,partition_job_depth=0
> > slurmctld.service start operation timed out. Terminating.
> > Terminate signal (SIGINT or SIGTERM) received
> > slurmctld[2100]: Saving all slurm state
> > Failed to start Slurm controller daemon.
> > Unit slurmctld.service entered failed state.
>
> Do you have a backup controller?
> Check your slurm.conf under:
> /etc/slurm-llnl
>
> Anyway I suggest to update the operating system to stretch and fix your
> configuration under a more recent version of slurm.
> Best regards
> --
> Gennaro Oliva
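
For anyone following the thread, the two checks suggested above (look for a backup controller in slurm.conf, then read the end of slurmctld.log and slurmd.log) could be scripted roughly as follows. This is only a sketch, not something anyone in the thread ran. The paths are the ones quoted in the thread. The key names `ControlMachine`, `BackupController` and `SlurmctldHost` are an assumption about which slurm.conf entries matter for your Slurm version, and the helper names `controller_settings` and `last_lines` are made up for illustration.

```python
from pathlib import Path

# Assumption: these are the controller-related keys worth looking at.
# Matching is case-insensitive; adjust the tuple for your Slurm version.
CONTROLLER_KEYS = ("controlmachine", "backupcontroller", "slurmctldhost")


def controller_settings(conf_text):
    """Return (key, value) pairs for controller-related lines in slurm.conf text."""
    found = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip().lower() in CONTROLLER_KEYS:
            found.append((key.strip(), value.strip()))
    return found


def last_lines(path, count=20):
    """Return the last `count` lines of a file, or [] if it cannot be read."""
    try:
        text = Path(path).read_text(errors="replace")
    except OSError:
        return []
    return text.splitlines()[-count:]


if __name__ == "__main__":
    # Paths as given in the thread (Debian slurm-llnl packaging).
    conf_lines = last_lines("/etc/slurm-llnl/slurm.conf", count=100000)
    for key, value in controller_settings("\n".join(conf_lines)):
        print(f"{key} = {value}")

    for name in ("slurmctld.log", "slurmd.log"):
        print(f"--- {name} ---")
        print("\n".join(last_lines(f"/var/log/slurm-llnl/{name}")))
```

Run it as root (or a user that can read the logs) on the controller node. If no `BackupController` line shows up, that would suggest no backup controller is configured, and the tail of slurmctld.log is the next place to look for why the start timed out.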