Slurm node daemon.
Dec 03 22:50:00 104 slurmd[18754]: slurmd: slurmd version 21.08.4 started
Dec 03 22:50:00 104 slurmd[18754]: slurmd: killing old slurmd[18744]
*[root@nousheen ~]# ping 192.168.60.104*
PING 192.168.60.104 (192.168.60.104) 56(84) bytes of data.
64 bytes from 192.168.60.104: icmp_seq=1
given below.
*[root@nousheen ~]# squeue -j*
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
120 debug SRBD-1 nousheen R 0:54 1 101
121 debug SRBD-2 nousheen R 0:54 1 105
122
Dear Robbert,
Thank you so much for your response. I was so focused on syncing the time that I
missed the date on one of the nodes, which was 1 day behind as you said. I
have corrected it and now I get the following output in status.
*(base) [nousheen@nousheen slurm]$ systemctl status
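A date mismatch like the one described above is easy to miss when only the time of day is compared. As a minimal sketch (not part of the original thread), the Python helper below flags nodes whose reported clock differs from a reference by more than a tolerance. The function name `find_skewed_nodes`, the 60-second tolerance, the node names, and the sample timestamps are all assumptions for illustration; how the timestamps are collected from each node is left out.

```python
from datetime import datetime, timedelta


def find_skewed_nodes(node_times, reference, tolerance=timedelta(seconds=60)):
    """Return {node: offset} for nodes whose clock is off by more than tolerance.

    node_times maps a node name to the datetime it reported; reference is the
    datetime taken on a trusted host at roughly the same moment.
    """
    skewed = {}
    for node, reported in node_times.items():
        offset = reported - reference  # negative means the node is behind
        if abs(offset) > tolerance:
            skewed[node] = offset
    return skewed


# Hypothetical sample: node-b shows the right time of day but the wrong date.
reference = datetime(2021, 12, 3, 22, 50, 0)
sample = {
    "node-a": reference + timedelta(seconds=2),
    "node-b": reference - timedelta(days=1),
}
for node, offset in find_skewed_nodes(sample, reference).items():
    print(f"{node} is off by {offset}")
```

Comparing full datetimes rather than clock times is the point of the sketch: a node that is exactly one day behind looks fine if only hours and minutes are checked.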
Hello Everyone,
I am using Slurm version 21.08.5 and CentOS 7.
I can successfully start slurmd on all compute nodes, but when I start
slurmctld on the server node it gives the following error:
*(base) [nousheen@nousheen ~]$ systemctl status slurmctld.service -l*
● slurmctld.service - Slurm controller
reate cred context for cred/munge
slurmctld: fatal: slurm_cred_creator_ctx_create((null)): Operation not
permitted
Best Regards,
Nousheen Parvaiz
On Tue, Feb 1, 2022 at 9:06 AM Nousheen wrote:
> Dear Ole,
>
> Thank you for your response.
> I am doing it again using your suggest
Dear Ole,
Thank you for your response.
I am doing it again using your suggested link.
Best Regards,
Nousheen Parvaiz
On Mon, Jan 31, 2022 at 2:07 PM Ole Holm Nielsen
wrote:
> Hi Nousheen,
>
> I recommend you again to follow the steps for installing Slurm on a CentOS
> 7 clus
Best Regards,
Nousheen Parvaiz
Ph.D. Scholar
National Center For Bioinformatics
Quaid-i-Azam University, Islamabad
Dear Hermann,
Thank you for your reply. I have given below my slurm.conf and log file.
*# slurm.conf file generated by configurator easy.html.*
# Put this file on all nodes of
systemd[1]: Started Slurm node daemon.
Jan 31 00:22:42 c103008 systemd[1]: slurmd.service: main process exited,
code=exited, status=203/EXEC
Jan 31 00:22:42 c103008 systemd[1]: Unit slurmd.service entered failed
state.
Jan 31 00:22:42 c103008 systemd[1]: slurmd.service failed.
Best Regards,
Nousheen
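systemd reports `status=203/EXEC` when it could not execute the program named in the unit's `ExecStart=` line, most often because the file is missing or not executable. A minimal sketch of that check in Python follows; the helper name `diagnose_executable` and the example path are assumptions, and the real path should be read from the installed `slurmd.service` unit.

```python
import os


def diagnose_executable(path):
    """Classify why a program path might fail to execute."""
    if not os.path.exists(path):
        return "missing"
    if not os.path.isfile(path):
        return "not a regular file"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


# Hypothetical path; take the real one from ExecStart= in slurmd.service.
print(diagnose_executable("/usr/local/sbin/slurmd"))
```

A result other than "ok" for the `ExecStart=` path would be consistent with the 203/EXEC failure in the log; an "ok" result suggests looking elsewhere, for example at a wrong interpreter line or missing shared libraries.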
Dear Jeffrey,
Thank you for your response. I have followed the steps as instructed. After
copying the files to their respective locations, the "systemctl status
slurmctld.service" command gives me the following error:
(base) [nousheen@exxact system]$ systemctl daemon-reload
(base)
=
#RebootProgram=
ReturnToService=1
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=nousheen
#SlurmdUser=root
#SrunEpilog=
#SrunProlog=
StateSaveLocation=/home/nousheen/Documents/SILICS/slurm-21.08.5
Dear Jeffrey,
Thank you so much for your prompt response. It has resolved my problem.
Best Regards,
Nousheen Parvaiz
On Wed, Jan 26, 2022 at 2:11 AM Jeffrey R. Lang wrote:
> Looking at what you provided in your email the groupadd commands are
> failing, due to the requested GID 9
USER -g slurm -s /bin/bash slurm
useradd: group 'slurm' does not exist*
I am totally new to this. Kindly guide me on how to resolve this.
Best Regards,
Nousheen Parvaiz
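The `useradd` error above means the `slurm` group was never created, so the `-g slurm` option has nothing to refer to. As a rough sketch, the group database can be queried from Python with the standard `grp` module before creating the user; `group_exists` is a hypothetical helper name, not part of Slurm.

```python
import grp


def group_exists(name):
    """Return True if a group with this name is in the group database."""
    try:
        grp.getgrnam(name)
    except KeyError:
        return False
    return True


if not group_exists("slurm"):
    print("group 'slurm' is missing; create it before running useradd")
```

Running the check on every node before `useradd` makes it clear whether the earlier `groupadd` step actually succeeded there.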