Hi Nathalie,

Which Slurm version and which OS version are you using?
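
In case it helps, here is a minimal sketch for collecting both pieces of information. It assumes the Slurm client tools (sinfo) are on your PATH and that the system provides /etc/os-release; the function names are only illustrative.

```shell
#!/bin/sh
# Report the Slurm version, assuming the client tools are installed.
slurm_version() {
    if command -v sinfo >/dev/null 2>&1; then
        # --version (-V) prints the version and exits
        sinfo --version
    else
        echo "sinfo not found on PATH"
    fi
}

# Report the OS name and version, assuming /etc/os-release exists;
# otherwise fall back to the kernel name and release.
os_version() {
    if [ -r /etc/os-release ]; then
        # Source in a subshell so the variables do not leak into this shell
        ( . /etc/os-release && echo "${PRETTY_NAME:-unknown}" )
    else
        uname -sr
    fi
}

slurm_version
os_version
```

Pasting the two lines of output into your reply should be enough.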

FYI: My Slurm Wiki contains all the details of setting up Slurm on CentOS 7: https://wiki.fysik.dtu.dk/niflheim/SLURM

Best regards,
Ole

On 2/13/19 2:58 PM, Nathalie Gocht wrote:
Hey,

I am setting up a one-node cluster; the master and the compute node are on the same machine. My slurm.conf:

ControlMachine=bayes
#
MpiDefault=none
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
StateSaveLocation=/var/spool/slurmctld
SwitchType=switch/none
TaskPlugin=task/none
#
#
# TIMERS
InactiveLimit=0
KillWait=30
MinJobAge=300
SlurmctldTimeout=120
SlurmdTimeout=300
Waittime=0
#
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/builtin
SelectType=select/linear
#
#
# LOGGING AND ACCOUNTING
AccountingStorageLoc=/var/log/slurm-llnl/job_accounting
AccountingStorageType=accounting_storage/filetxt
AccountingStoreJobComment=YES
ClusterName=bayes
JobCompLoc=/var/log/slurm-llnl/job_completion
JobCompType=jobcomp/filetxt
JobAcctGatherFrequency=60
JobAcctGatherType=jobacct_gather/linux
SlurmctldDebug=info
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
SlurmdDebug=info
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
# COMPUTE NODES
GresTypes=gpu
NodeName=bayes Gres=gpu:tesla:1 CPUs=48 Sockets=2 CoresPerSocket=12 ThreadsPerCore=2 State=UNKNOWN
PartitionName=long Nodes=bayes Default=YES MaxTime=INFINITE State=UP

I started the control daemon, but I get this status:

$ systemctl status slurmctld.service
● slurmctld.service - Slurm controller daemon
   Loaded: loaded (/lib/systemd/system/slurmctld.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Wed 2019-02-13 14:43:02 CET; 7min ago
     Docs: man:slurmctld(8)
  Process: 40552 ExecStart=/usr/sbin/slurmctld $SLURMCTLD_OPTIONS (code=exited, status=0/SUCCE
 Main PID: 40560 (code=exited, status=1/FAILURE)

$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
long*        up   infinite      1   idle bayes

I tried to start the slurmd daemon, but it times out. slurmd -Dvvv gives:

slurmd: error: chmod(/var/spool/slurmd, 0755): Operation not permitted
slurmd: error: Unable to initialize slurmd spooldir
slurmd: error: slurmd initialization failed

Does anyone know what's going on?
