From the logs, 2 errors,
8><---
Feb 04 03:08:48 vuwunicoslurmd1.ods.vuw.ac.nz systemd[1]: Starting Slurm controller daemon...
Feb 04 03:08:48 vuwunicoslurmd1.ods.vuw.ac.nz slurmctld[1045020]: slurmctld: error: chdir(/var/log): Permission denied
Feb 04 03:08:48 vuwunicoslurmd1.ods.vuw.ac.nz slur
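If it helps, one common cause of that chdir(/var/log) error is the user slurmctld runs as not being able to enter its log directory. A quick check, as a sketch (the config path and the "slurm" user name are assumptions about your setup):

grep -i '^SlurmUser' /etc/slurm/slurm.conf
ls -ld /var/log
sudo -u slurm ls /var/log    # should not report "Permission denied"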
No,
[root@node5 log]# ls -la /etc/pam.d/*slurm*
ls: cannot access '/etc/pam.d/*slurm*': No such file or directory
Slurm is installed,
[root@node5 log]# rpm -qi slurm
Name        : slurm
Version     : 22.05.9
Release     : 1.el9
Architecture: x86_64
Install Date: Thu Dec 12 21:02:12 2024
Group
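Worth comparing that 22.05.9 against whatever the controller is running, since slurmctld only talks to slurmd within a couple of major releases of itself. A quick way to check both sides, as a sketch:

scontrol --version      # on the controller
slurmd -V               # on the compute node
rpm -qa 'slurm*'        # all installed slurm packages on either node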
Hi Steven,
You can find the list of packages to install based on the node role here:
https://slurm.schedmd.com/quickstart_admin.html#pkg_install
Thanks,
Marko
On Mon, Feb 3, 2025 at 3:51 PM Steven Jones via slurm-users <
slurm-users@lists.schedmd.com> wrote:
> Hi,
>
> After rpmbuilding slurm,
>
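As a rough sketch of the split that guide describes (using the file names from the build below; the base slurm package is generally needed on every node, and slurm-slurmdbd only if you run accounting):

# controller
dnf install ./slurm-24.11.1-1.el9.x86_64.rpm ./slurm-slurmctld-24.11.1-1.el9.x86_64.rpm
# compute nodes
dnf install ./slurm-24.11.1-1.el9.x86_64.rpm ./slurm-slurmd-24.11.1-1.el9.x86_64.rpm
# accounting/database host, if used
dnf install ./slurm-24.11.1-1.el9.x86_64.rpm ./slurm-slurmdbd-24.11.1-1.el9.x86_64.rpm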
Hi,
After rpmbuilding slurm,
Do I need to install all of these or just
slurm-slurmctld-24.11.1-1.el9.x86_64.rpm on the controller and
slurm-slurmd-24.11.1-1.el9.x86_64.rpm on the compute nodes?
-rw-r--r--. 1 root root 18508016 Feb 3 23:46 slurm-24.11.1-1.el9.x86_64.rpm
-rw-r--r--. 1 root roo
Just double checking. Can you check on your worker node:
1. ls -la /etc/pam.d/*slurm*
(just checking if there's a specific pam file for slurmd on your system)
2. scontrol show config | grep -i SlurmdUser
(checking if slurmd is set up with a different user to SlurmUser; see the sketch after this list)
3. grep slurm /e
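For reference, on a default setup that second check comes back roughly like this (a sketch; the uid shown is an assumption, not your output):

$ scontrol show config | grep -iE 'Slurmd?User'
SlurmUser               = slurm(990)
SlurmdUser              = root(0)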
I rebuilt 4 nodes as rocky9.5
8><---
[2025-02-03T21:40:11.978] Node node6 now responding
[2025-02-03T21:41:15.698] _slurm_rpc_submit_batch_job: JobId=17 InitPrio=4294901759 usec=501
[2025-02-03T21:41:16.055] sched: Allocate JobId=17 NodeList=node6 #CPUs=1 Partition=debug
[2025-02-03T21:41:16.059
On 2/3/25 2:33 pm, Steven Jones via slurm-users wrote:
Just built 4 x rocky9 nodes and I do not get that error (but I get
another I know how to fix, I think) so holistically I am thinking the
version difference is too large.
Oh I think I missed this - when you say version difference do you m
We only do isolated on the students’ VirtualBox setups because it’s simpler for
them to get started with. Our production HPC with OpenHPC is definitely
integrated with our Active Directory (directly via sssd, not with an
intermediate product), etc. Not everyone does it that way, but our scale is
Hi,
Thanks, but isolated isn't the goal in my case. The goal is to save admin time
we can't afford and to have a far-reaching setup.
So I have to link the HPC to IPA/IdM and on to AD in a trust; that way user
admins can just drop a student or staff member into an AD group and job done.
That a
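A rough sketch of that arrangement on the IdM side (the realm names are placeholders, not anything from this thread):

# on the IPA/IdM server, enable AD trust support
ipa-adtrust-install
# establish the trust to the AD forest
ipa trust-add --type=ad ad.example.ac.nz --admin Administrator --password
# HPC nodes enrol as IdM clients and pick up AD users through sssd
ipa-client-install --domain idm.example.ac.nz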
Late to the party here, but depending on how much time you have invested, how
much you can tolerate reformats or other more destructive work, etc., you might
consider OpenHPC and its install guide ([1] for RHEL 8 variants, [2] or [3] for
RHEL 9 variants, depending on which version of Warewulf yo
Slurm.conf is copied between nodes.
Just built 4 x rocky9 nodes and I do not get that error (but I get another I
know how to fix, I think) so holistically I am thinking the version difference
is too large.
regards
Steven
From: Chris Samuel via slurm-users
Manisha Yadav writes:
> Could you please confirm if my setup is correct, or if any modifications are
> required on my end?
I don't see anything wrong with the part of the setup that you've shown.
Have you checked with `sprio -l -j <jobid>` whether the jobs get the
extra qos priority? If not, perhaps
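One way to cross-check the qos factor end to end, as a sketch (the job id is a placeholder):

sprio -l -j <jobid>                                  # per-factor priority breakdown for the job
scontrol show config | grep -i PriorityWeightQOS     # weight given to the QOS factor
sacctmgr show qos format=Name,Priority               # priority value set on each qos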