[slurm-users] Re: Print Slurm Stats on Login
> I too would be interested in some lightweight scripts

For lightweight stats I tend to use this excellent script, slurmacct. Its author is a member of this mailing list too (hi):

https://github.com/OleHolmNielsen/Slurm_tools/blob/master/slurmacct/slurmacct

I am currently writing a Prometheus exporter, because the one I have used for years (https://github.com/vpenso/prometheus-slurm-exporter) gives suboptimal results with Slurm 24.04+. (We use very long job arrays on our system, and they somehow break the exporter, which parses the text output of the squeue command.)

cheers
josef

From: Davide DelVento via slurm-users
Sent: Wednesday, 14 August 2024 01:52
To: Paul Edmon
Cc: Reid, Andrew C.E. (Fed); Jeffrey T Frey; slurm-users@lists.schedmd.com
Subject: [slurm-users] Re: Print Slurm Stats on Login

I too would be interested in some lightweight scripts. In my experience, XDMoD has been very labor-intensive to install, maintain and learn. It's great if one needs that level of interactivity, granularity and detail, but for a "quick and dirty" summary in a small department it's not only overkill, it's also impossible given the available staffing.
...

--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
[slurm-users] Re: Print Slurm Stats on Login
This is wonderful, thanks Josef and Ole! I will need to familiarize myself with it, but at a cursory glance it looks like almost exactly what I was looking for!

On Wed, Aug 14, 2024 at 1:44 AM Josef Dvořáček via slurm-users <slurm-users@lists.schedmd.com> wrote:
> For lightweight stats I tend to use this excellent script: slurmacct.
> Author is member of this mailinglist too. (hi):
>
> https://github.com/OleHolmNielsen/Slurm_tools/blob/master/slurmacct/slurmacct
> [...]
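For a login-time summary of the kind discussed in this thread, a minimal sketch in Python is shown here. It assumes squeue is on the PATH and relies only on its `-h` (no header) and `-o "%T"` (job state) options; the function names `count_job_states`, `read_squeue` and `format_summary` are made up for this illustration and do not come from slurmacct or from either exporter mentioned above.

```python
import subprocess
from collections import Counter


def count_job_states(squeue_text: str) -> Counter:
    """Count jobs per state, given one job state per line of squeue output."""
    states = (line.strip() for line in squeue_text.splitlines())
    return Counter(state for state in states if state)


def read_squeue() -> str:
    """Run squeue and return its raw text output (assumes squeue is installed)."""
    # -h suppresses the header line; -o "%T" prints only the job state.
    result = subprocess.run(
        ["squeue", "-h", "-o", "%T"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def format_summary(counts: Counter) -> str:
    """Render a one-line summary, most frequent state first."""
    return ", ".join(f"{state}: {n}" for state, n in counts.most_common())
```

Calling `print(format_summary(count_job_states(read_squeue())))` from a login script would print something like a single line of state counts. Note that squeue may show a pending job array as a single line unless `-r` is added, so the counts depend on which view is wanted; whether that is related to the exporter problem described above is not something this sketch can confirm.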
[slurm-users] REST API - get_user_environment
In previous versions (v0.0.36) of the REST API the job submission endpoint had a field titled "get_user_environment"; however, it doesn't appear to exist in v0.0.40. Is there an equivalent parameter that should be used? What is the suggested approach for mimicking this behavior in v0.0.40?

Best regards,
Juan
[slurm-users] Re: REST API - get_user_environment
On 14-08-2024 19:52, jpuerto--- via slurm-users wrote:
> In previous versions (v0.0.36) of the REST API the job submission endpoint had a field titled "get_user_environment"; however, it doesn't appear to exist in v0.0.40. Is there an equivalent parameter that should be used? What is the suggested approach for mimicking this behavior in v0.0.40?

What software are you referring to with the mentioned versions? REST prerequisites are listed in https://slurm.schedmd.com/rest_quickstart.html

/Ole
[slurm-users] Upgrade compute node to 24.05.2
G'Day all,

I've been upgrading my cluster from 20.11.0 in small steps to get to 24.05.2. Currently I have all nodes on 23.02.8, the controller on 24.05.2 and a single test node on 24.05.2. All are CentOS 7.9 (the upgrade to Oracle Linux 8.10 is Phase 2 of the upgrades).

When I check the slurmd status on the test node I get:

[root@hpc-dev-01 24.05.2]# systemctl status slurmd
● slurmd.service - Slurm node daemon
   Loaded: loaded (/usr/lib/systemd/system/slurmd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2024-08-15 10:45:15 AEST; 24s ago
 Main PID: 46391 (slurmd)
    Tasks: 1
   Memory: 1.2M
   CGroup: /system.slice/slurmd.service
           └─46391 /usr/sbin/slurmd --systemd

Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: Considering each NUMA node as a socket
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: Node reconfigured socket/core boundaries SocketsPerBoard=4:8(hw) CoresPerSocket=16:8(hw)
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: Considering each NUMA node as a socket
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: slurmd version 24.05.2 started
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: *plugin_load_from_file: Incompatible Slurm plugin /usr/lib64/slurm/mpi_none.so version (23.02.8)*
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: error: Couldn't load specified plugin name for mpi/none: Incompatible plugin version
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: error: MPI: Cannot create context for mpi/none
Aug 15 10:45:15 hpc-dev-01 systemd[1]: Started Slurm node daemon.
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: slurmd started on Thu, 15 Aug 2024 10:45:15 +1000
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: CPUs=64 Boards=1 Sockets=8 Cores=8 Threads=1 Memory=257778 TmpDisk=15998 Uptime=2898769 CPUSpecL...ve=(null)
Hint: Some lines were ellipsized, use -l to show in full.
[root@hpc-dev-01 24.05.2]#

We don't use MPI (life science workloads)... should I remove the file? If it is version 23.02.8, then doesn't 24.05.2 have that plugin built in? There are no references to mpi in the slurm.conf file.

Sid
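The log above reports that /usr/lib64/slurm/mpi_none.so identifies itself as version 23.02.8 while slurmd is 24.05.2, which suggests that a plugin file from the older install may still be on disk. Before removing anything by hand, it can help to list every plugin the daemon rejects. The following is a minimal sketch in Python that extracts those entries from log text; the function name `find_incompatible_plugins` is made up for this illustration, and the pattern is based only on the message format shown in this post.

```python
import re

# Matches slurmd log messages of the form seen above:
#   "Incompatible Slurm plugin /usr/lib64/slurm/mpi_none.so version (23.02.8)"
INCOMPATIBLE = re.compile(
    r"Incompatible Slurm plugin (?P<path>\S+) version \((?P<version>[^)]+)\)"
)


def find_incompatible_plugins(log_text: str) -> list[tuple[str, str]]:
    """Return (plugin path, reported plugin version) for each rejected plugin."""
    return [
        (match.group("path"), match.group("version"))
        for match in INCOMPATIBLE.finditer(log_text)
    ]
```

The log text could come from `journalctl -u slurmd`, for example. On an RPM-based system such as the CentOS 7.9 nodes described here, `rpm -qf` on each reported path should show which package, if any, owns the file; whether the file is safe to remove depends on how the upgrade packages were built and installed, which this sketch does not determine.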