Hi Sean,

Thank you! It was a permissions issue and it’s not complaining anymore about cred/munge.

I appreciate your help.

Thanks,
Jesse

> On Jan 23, 2024, at 3:34 PM, Sean Crosby <scro...@unimelb.edu.au> wrote:
>
> slurmctld runs as the user slurm, whereas slurmd runs as root.
>
> Make sure the permissions on /app/slurm-24.0.8/lib/slurm allow the user slurm to read the files
>
> e.g. you could do (as root)
>
> sudo -u slurm ls /app/slurm-24.0.8/lib/slurm
>
> and see if the slurm user can read the directory (as well as the libraries within it)
>
> Sean
>
> From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Jesse Aiton <je...@clarkeconsulting.com>
> Sent: Wednesday, 24 January 2024 10:14
> To: slurm-users@lists.schedmd.com
> Subject: [EXT] [slurm-users] error: Couldn't find the specified plugin name for cred/munge looking at all files
>
> Hello Slurm Folks,
>
> I have a weird issue where on the same server, which acts as both a controller and a node, slurmctld can’t find cred_munge.so
>
> slurmctld: debug3: Trying to load plugin /app/slurm-24.0.8/lib/slurm/cred_munge.so
> slurmctld: debug4: /app/slurm-24.0.8/lib/slurm/cred_munge.so: Does not exist or not a regular file.
> slurmctld: error: Couldn't find the specified plugin name for cred/munge looking at all files
> slurmctld: error: cannot open plugin directory /app/slurm-24.0.8/lib/slurm
> slurmctld: error: cannot find cred plugin for cred/munge
> slurmctld: error: cannot create cred context for cred/munge
> slurmctld: fatal: failed to initialize cred plugin
>
> But slurmd can:
>
> slurmd: debug3: Trying to load plugin /app/slurm-24.0.8/lib/slurm/cred_munge.so
> slurmd: debug3: plugin_load_from_file->_verify_syms: found Slurm plugin name:Munge credential signature plugin type:cred/munge version:0x180800
> slurmd: cred/munge: init: Munge credential signature plugin loaded
> slurmd: debug3: Success.
>
> This is on Ubuntu 20.04 and happens both with Slurm 20.11.09 and 24.0.8
>
> Thank you,
>
> Jesse
>
>
> # slurm.conf file generated by configurator easy.html.
> # Put this file on all nodes of your cluster.
> # See the slurm.conf man page for more information.
> #
> ClusterName=prod-cluster
> SlurmctldHost=controller
> #
> #MailProg=/bin/mail
> #MpiDefault=
> #MpiParams=ports=#-#
> ProctrackType=proctrack/cgroup
> ReturnToService=1
> SlurmctldPidFile=/var/run/slurmctld.pid
> #SlurmctldPort=6817
> SlurmdPidFile=/var/run/slurmd.pid
> #SlurmdPort=6818
> SlurmdSpoolDir=/var/spool/slurmd
> SlurmUser=slurm
> #SlurmdUser=root
> StateSaveLocation=/var/spool/slurmctld
> #SwitchType=
> TaskPlugin=task/affinity,task/cgroup
> #
> #
> # TIMERS
> #KillWait=30
> #MinJobAge=300
> #SlurmctldTimeout=120
> #SlurmdTimeout=300
> #
> #
> # SCHEDULING
> SchedulerType=sched/backfill
> SelectType=select/cons_tres
> #
> #
> # LOGGING AND ACCOUNTING
> #AccountingStorageType=
> #JobAcctGatherFrequency=30
> #JobAcctGatherType=
> #SlurmctldDebug=info
> SlurmctldLogFile=/var/log/slurmctld.log
> #SlurmdDebug=info
> SlurmdLogFile=/var/log/slurmd.log
> #
> #
> # COMPUTE NODES
> NodeName=controller CPUs=1 State=UNKNOWN
> NodeName=node CPUs=1 State=UNKNOWN
> PartitionName=prod-part Nodes=ALL Default=YES MaxTime=INFINITE State=UP
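
For anyone who finds this thread later with the same error: the "sudo -u slurm ls" check quoted above tells you whether the slurm user is blocked, but not which directory on the path is blocking it. Below is a rough Python sketch that walks from / down to the plugin file and reports the first component the user cannot get through. It only looks at the classic owner/group/other mode bits, so it assumes no ACLs, SELinux/AppArmor rules, or root-style capability overrides are involved. The helper names (allows, first_blocker) are made up for this illustration and are not part of Slurm.

```python
import os
import stat


def allows(path, uid, gids, need):
    """Return True if the mode bits on `path` grant `need` (e.g. 0o5)
    to a user with the given uid and set of group ids.

    Approximation only: ignores ACLs, capabilities and security modules.
    """
    st = os.stat(path)
    if uid == st.st_uid:
        bits = (st.st_mode >> 6) & 0o7   # owner triplet
    elif st.st_gid in gids:
        bits = (st.st_mode >> 3) & 0o7   # group triplet
    else:
        bits = st.st_mode & 0o7          # "other" triplet
    return bits & need == need


def first_blocker(target, uid, gids):
    """Walk from / down to `target`; return the first path the user
    cannot pass (or read, for the target itself), or None if all are OK."""
    target = os.path.abspath(target)
    chain = [target]
    while os.path.dirname(chain[-1]) != chain[-1]:
        chain.append(os.path.dirname(chain[-1]))
    for p in reversed(chain):
        if os.path.isdir(p):
            # Ancestors only need search (x); the target dir needs r+x to list.
            need = 0o5 if p == target else 0o1
        else:
            need = 0o4                   # regular file: read
        if not allows(p, uid, gids, need):
            return p
    return None


# Example (assumes a local "slurm" account exists; path taken from the thread):
#   import pwd
#   pw = pwd.getpwnam("slurm")
#   gids = set(os.getgrouplist("slurm", pw.pw_gid))
#   print(first_blocker("/app/slurm-24.0.8/lib/slurm/cred_munge.so", pw.pw_uid, gids))
```

If this prints a directory such as /app or /app/slurm-24.0.8, that is the one to chmod or chown. The sudo check from Sean's reply remains the authoritative test, since it exercises the real kernel permission logic.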