Hi;

Sorry, as you can see, I made a mistake again. I wrote two different directories:

"The owner of the /var/run/slurm-llnl directory and the
slurmctld.pid and slurmd.pid files should be "noki" user.

chown -R noki:root /var/spool/slurm-llnl"

You should run:

chown -R noki:root /var/run/slurm-llnl
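
To confirm the result afterwards, a check along these lines might help. It is only a sketch: the helper names slurm_user_from_conf and owned_by are made up for illustration, the slurm.conf location in the usage comment is an assumption (it is not given in this thread), and stat -c is the GNU coreutils form found on Linux.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the uncommented SlurmUser= line
# in the given slurm.conf file. Lines such as "#SlurmdUser=root" are not
# matched because the pattern is anchored at the start of the line.
slurm_user_from_conf() {
    sed -n 's/^SlurmUser=\([^[:space:]#]*\).*/\1/p' "$1" | tail -n 1
}

# Hypothetical helper: succeed only if path $1 is owned by user $2.
# stat -c '%U' (GNU coreutils) prints the owner's user name.
owned_by() {
    [ "$(stat -c '%U' "$1")" = "$2" ]
}

# Example usage. The slurm.conf path below is an assumption; adjust it
# to wherever the file lives on your system.
#   user=$(slurm_user_from_conf /etc/slurm-llnl/slurm.conf)
#   owned_by /var/run/slurm-llnl "$user" || echo "owner is not $user"
```

The functions only read the configuration and the ownership; they change nothing, so they can be run before and after the chown.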

Regards;

Ahmet M.


On 19.06.2019 05:55, Noki Lee wrote:
Hi, slurm-users and mercan.

I tried what you said.
noki@noki-System-Product-Name:~$ sudo chown -R noki:root /var/spool/slurm-llnl/
noki@noki-System-Product-Name:/var/spool/slurm-llnl$ ls -l
total 92
-rw------- 1 noki root 198 Jun 19 11:36 assoc_mgr_state
-rw------- 1 noki root 198 Jun 18 20:31 assoc_mgr_state.old
-rw------- 1 noki root  10 Jun 19 11:36 assoc_usage
-rw------- 1 noki root  10 Jun 18 20:31 assoc_usage.old
-rw-r--r-- 1 noki root   5 Jun 11 21:15 clustername
-rw------- 1 noki root  15 Jun 19 11:36 fed_mgr_state
-rw------- 1 noki root  15 Jun 18 20:31 fed_mgr_state.old
-rw------- 1 noki root  35 Jun 19 11:36 job_state
-rw------- 1 noki root  35 Jun 18 20:31 job_state.old
-rw------- 1 noki root  38 Jun 19 11:36 last_config_lite
-rw------- 1 noki root  38 Jun 19  2019 last_config_lite.old
-rw------- 1 noki root 109 Jun 19 11:36 layouts_state_base
-rw------- 1 noki root 109 Jun 18 20:31 layouts_state_base.old
-rw------- 1 noki root 194 Jun 19 11:36 node_state
-rw------- 1 noki root 194 Jun 18 20:31 node_state.old
-rw------- 1 noki root 142 Jun 19 11:36 part_state
-rw------- 1 noki root 142 Jun 18 20:31 part_state.old
-rw------- 1 noki root  10 Jun 19 11:36 qos_usage
-rw------- 1 noki root  10 Jun 18 20:31 qos_usage.old
-rw------- 1 noki root  35 Jun 19 11:36 resv_state
-rw------- 1 noki root  35 Jun 18 20:31 resv_state.old
-rw------- 1 noki root  31 Jun 19 11:36 trigger_state
-rw------- 1 noki root  31 Jun 18 20:31 trigger_state.old
Whether or not I restart both slurmd and slurmctld, slurmctld is fine but slurmd still shows the same issue.
Below are the owners and groups after restarting both slurmd and slurmctld:
noki@noki-System-Product-Name:~$ sudo chown -R noki:root /var/spool/slurm-llnl/
noki@noki-System-Product-Name:/var/spool/slurm-llnl$ ls -l
total 92
-rw------- 1 noki noki 198 Jun 19 11:40 assoc_mgr_state
-rw------- 1 noki root 198 Jun 19 11:36 assoc_mgr_state.old
-rw------- 1 noki noki  10 Jun 19 11:40 assoc_usage
-rw------- 1 noki root  10 Jun 19 11:36 assoc_usage.old
-rw-r--r-- 1 noki root   5 Jun 11 21:15 clustername
-rw------- 1 noki noki  15 Jun 19 11:40 fed_mgr_state
-rw------- 1 noki root  15 Jun 19 11:36 fed_mgr_state.old
-rw------- 1 noki noki  35 Jun 19 11:40 job_state
-rw------- 1 noki root  35 Jun 19 11:36 job_state.old
-rw------- 1 noki noki  38 Jun 19 11:40 last_config_lite
-rw------- 1 noki root  38 Jun 19 11:36 last_config_lite.old
-rw------- 1 noki noki 109 Jun 19 11:40 layouts_state_base
-rw------- 1 noki root 109 Jun 19 11:36 layouts_state_base.old
-rw------- 1 noki noki 194 Jun 19 11:40 node_state
-rw------- 1 noki root 194 Jun 19 11:36 node_state.old
-rw------- 1 noki noki 142 Jun 19 11:40 part_state
-rw------- 1 noki root 142 Jun 19 11:36 part_state.old
-rw------- 1 noki noki  10 Jun 19 11:40 qos_usage
-rw------- 1 noki root  10 Jun 19 11:36 qos_usage.old
-rw------- 1 noki noki  35 Jun 19 11:40 resv_state
-rw------- 1 noki root  35 Jun 19 11:36 resv_state.old
-rw------- 1 noki noki  31 Jun 19 11:40 trigger_state
-rw------- 1 noki root  31 Jun 19 11:36 trigger_state.old
Do you think I need to change the permissions with chmod?

Regards,

On Tue, Jun 18, 2019 at 9:27 PM mercan <ahmet.mer...@uhem.itu.edu.tr> wrote:

    Hi;

    I did not notice the

    SlurmUser=noki

    line. The owner of the /var/run/slurm-llnl directory and the
    slurmctld.pid and slurmd.pid files should be the "noki" user.

    chown -R noki:root /var/spool/slurm-llnl

    Regards;

    Ahmet M.


    On 18.06.2019 15:15, mercan wrote:
    > Hi;
    >
    > The owner of the /var/run/slurm-llnl directory and the slurmctld.pid
    > and slurmd.pid files should be the "slurm" user. Your files' owners
    > are root and noki.
    >
    > chown -R slurm:slurm /var/spool/slurm-llnl
    >
    >
    > Regards;
    >
    > Ahmet M.
    >
    >
    > On 18.06.2019 15:03, Noki Lee wrote:
    >>
    >> Though SLURM works fine for submitting, running, and queueing jobs,
    >> I get the minor error below.
    >>
    >> sudo systemctl status slurmd
    >>
    >> Jun 12 10:20:40 noki-System-Product-Name systemd[1]: slurmd.service:
    >> Can't open PID file /var/run/slurm-llnl/slurmd.pid (yet?) after
    >> start: No such file or directory
    >>
    >> sudo systemctl status slurmctld
    >>
    >> Jun 12 10:20:40 noki-System-Product-Name systemd[1]: slurmd.service:
    >> Can't open PID file /var/run/slurm-llnl/slurmd.pid (yet?) after
    >> start: No such file or directory
    >>
    >> I followed the installation guide from
    >>
    >> ftp://www.microway.com/pub/pub/for-customer/SDSU-Training/Webinar_2_Slurm_II--Ubuntu16.04_and_18.04.pdf
    >>
    >> Could this problem come from the ownership of the slurm.conf file?
    >>
    >> Here are my slurm.conf and the ownership of the slur*.pid files:
    >>
    >> # slurm.conf file generated by configurator easy.html.
    >> # Put this file on all nodes of your cluster.
    >> # See the slurm.conf man page for more information.
    >> #
    >> ControlMachine=noki-System-Product-Name
    >> #ControlAddr=
    >> #
    >> #MailProg=/bin/mail
    >> MpiDefault=none
    >> #MpiParams=ports=#-#
    >> ProctrackType=proctrack/pgid
    >> ReturnToService=1
    >> SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
    >> #SlurmctldPort=6817
    >> SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
    >> #SlurmdPort=6818
    >> SlurmdSpoolDir=/var/spool/slurmd
    >> SlurmUser=noki
    >> #SlurmdUser=root
    >> StateSaveLocation=/var/spool/slurm-llnl
    >> SwitchType=switch/none
    >> TaskPlugin=task/none
    >> #
    >> #
    >> # TIMERS
    >> #KillWait=30
    >> #MinJobAge=300
    >> #SlurmctldTimeout=120
    >> #SlurmdTimeout=300
    >> #
    >> #
    >> # SCHEDULING
    >> FastSchedule=1
    >> SchedulerType=sched/backfill
    >> SelectType=select/linear
    >> #SelectTypeParameters=
    >> #
    >> #
    >> # LOGGING AND ACCOUNTING
    >> AccountingStorageType=accounting_storage/none
    >> ClusterName=linux
    >> #JobAcctGatherFrequency=30
    >> JobAcctGatherType=jobacct_gather/none
    >> #SlurmctldDebug=3
    >> SlurmctldLogFile=/var/log/slurm-llnl/SlurmctldLogFile
    >> #SlurmdDebug=3
    >> SlurmdLogFile=/var/log/slurm-llnl/SlurmdLogFile
    >> #
    >> #
    >> # COMPUTE NODES
    >> NodeName=noki-System-Product-Name CPUs=4 RealMemory=6963 Sockets=1 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
    >> PartitionName=debug Nodes=noki-System-Product-Name Default=YES MaxTime=INFINITE State=UP
    >>
    >> $ ls -l /var/run/slurm-llnl/
    >> total 8
    >> -rw-r--r-- 1 noki root 6 Jun 12 10:20 slurmctld.pid
    >> -rw-r--r-- 1 root root 6 Jun 12 10:20 slurmd.pid
    >>
    >

