[slurm-users] Re: Print Slurm Stats on Login

2024-08-14 Thread Josef Dvořáček via slurm-users
> I too would be interested in some lightweight scripts

For lightweight stats I tend to use this excellent script: slurmacct. Its
author is a member of this mailing list too (hi):

https://github.com/OleHolmNielsen/Slurm_tools/blob/master/slurmacct/slurmacct

I am currently writing a Prometheus exporter, because the one I've used for
years (https://github.com/vpenso/prometheus-slurm-exporter) gives suboptimal
results with Slurm 24.04+.
(We use very long job arrays on our system, and they somehow break that
exporter, which parses the text output of the squeue command.)
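
A minimal sketch of one way to make parsing squeue text less fragile: request
an explicit, delimiter-separated format instead of the default column layout.
This is not the exporter mentioned above or its replacement; the function
names and the sample data are made up for illustration, and read_squeue()
assumes squeue is on PATH. It uses the documented squeue options --noheader,
--array (one array element per line) and --format with %i (job ID) and %T
(job state).

```python
import subprocess
from collections import Counter


def count_job_states(squeue_text: str, sep: str = "|") -> Counter:
    """Count jobs per state from delimiter-separated squeue output.

    Each line is expected to look like "<jobid><sep><state>". Lines that do
    not split into exactly two fields are skipped rather than guessed at.
    """
    counts = Counter()
    for line in squeue_text.splitlines():
        fields = line.strip().split(sep)
        if len(fields) != 2:
            continue
        _job_id, state = fields
        counts[state] += 1
    return counts


def read_squeue() -> str:
    """Run squeue with an explicit format (assumes squeue is on PATH)."""
    cmd = ["squeue", "--noheader", "--array", "--format=%i|%T"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical sample output, one array element per line.
sample = "1001_1|RUNNING\n1001_2|RUNNING\n1002_7|PENDING\n"
print(dict(count_job_states(sample)))
```

On a real cluster, count_job_states(read_squeue()) would give the same kind of
per-state totals. With very long arrays, --array can produce a lot of lines,
so whether to use it is a trade-off to check on your own system.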

cheers

josef


From: Davide DelVento via slurm-users 
Sent: Wednesday, 14 August 2024 01:52
To: Paul Edmon 
Cc: Reid, Andrew C.E. (Fed); Jeffrey T Frey; slurm-users@lists.schedmd.com
Subject: [slurm-users] Re: Print Slurm Stats on Login

I too would be interested in some lightweight scripts. In my experience,
XDMoD takes a great deal of work to install, maintain and learn. It's great
if one needs that level of interactivity, granularity and detail, but for a
"quick and dirty" summary in a small department it's not only overkill, it's
also impossible given the available staffing.
...

-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: Print Slurm Stats on Login

2024-08-14 Thread Davide DelVento via slurm-users
This is wonderful, thanks Josef and Ole! I will need to familiarize myself
with it, but at a cursory glance it looks like almost exactly what I was
looking for!


[slurm-users] REST API - get_user_environment

2024-08-14 Thread jpuerto--- via slurm-users
In previous versions (v0.0.36) of the REST API the job submission endpoint had 
a field titled "get_user_environment"; however, it doesn't appear to exist in 
v0.0.40. Is there an equivalent parameter that should be used? What is the 
suggested approach for mimicking this behavior in v0.0.40?

Best regards,

Juan


[slurm-users] Re: REST API - get_user_environment

2024-08-14 Thread Ole Holm Nielsen via slurm-users

On 14-08-2024 19:52, jpuerto--- via slurm-users wrote:

> In previous versions (v0.0.36) of the REST API the job submission endpoint
> had a field titled "get_user_environment"; however, it doesn't appear to
> exist in v0.0.40. Is there an equivalent parameter that should be used?
> What is the suggested approach for mimicking this behavior in v0.0.40?

What software are you referring to with the mentioned versions?  REST
prerequisites are listed in https://slurm.schedmd.com/rest_quickstart.html


/Ole


[slurm-users] Upgrade compute node to 24.05.2

2024-08-14 Thread Sid Young via slurm-users
G'Day all,

I've been upgrading my cluster from 20.11.0 in small steps to get to
24.05.2. Currently I have all nodes on 23.02.8, the controller on 24.05.2
and a single test node on 24.05.2. All are CentOS 7.9 (upgrade to Oracle
Linux 8.10 is Phase 2 of the upgrades).

When I check the slurmd status on the test node I get:

[root@hpc-dev-01 24.05.2]# systemctl status slurmd
● slurmd.service - Slurm node daemon
   Loaded: loaded (/usr/lib/systemd/system/slurmd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2024-08-15 10:45:15 AEST; 24s ago
 Main PID: 46391 (slurmd)
    Tasks: 1
   Memory: 1.2M
   CGroup: /system.slice/slurmd.service
           └─46391 /usr/sbin/slurmd --systemd

Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: Considering each NUMA node as a socket
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: Node reconfigured socket/core boundaries SocketsPerBoard=4:8(hw) CoresPerSocket=16:8(hw)
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: Considering each NUMA node as a socket
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: slurmd version 24.05.2 started
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: *plugin_load_from_file: Incompatible Slurm plugin /usr/lib64/slurm/mpi_none.so version (23.02.8)*
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: error: Couldn't load specified plugin name for mpi/none: Incompatible plugin version
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: error: MPI: Cannot create context for mpi/none
Aug 15 10:45:15 hpc-dev-01 systemd[1]: Started Slurm node daemon.
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: slurmd started on Thu, 15 Aug 2024 10:45:15 +1000
Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: CPUs=64 Boards=1 Sockets=8 Cores=8 Threads=1 Memory=257778 TmpDisk=15998 Uptime=2898769 CPUSpecL...ve=(null)
Hint: Some lines were ellipsized, use -l to show in full.
[root@hpc-dev-01 24.05.2]#

We don't use MPI (life science workloads)... should I remove the file? If
it is version 23.02.8 then doesn't 24.05.2 have that plugin built in? There
are no references to MPI in the slurm.conf file.
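
A small sketch for listing every plugin that slurmd rejects this way, assuming
the log text has been saved to a string (for example from journalctl output).
It only matches the "Incompatible Slurm plugin <path> version (<version>)"
message shown in the log above; the function name and the sample line are
illustrative, and the script reports files without deciding whether they
should be removed or reinstalled.

```python
import re

# Matches the message format seen in the slurmd log above.
PLUGIN_ERROR = re.compile(
    r"Incompatible Slurm plugin (?P<path>\S+) version \((?P<version>[\d.]+)\)"
)


def find_stale_plugins(log_text: str) -> list[tuple[str, str]]:
    """Return (plugin path, plugin version) pairs reported as incompatible."""
    return [
        (match.group("path"), match.group("version"))
        for match in PLUGIN_ERROR.finditer(log_text)
    ]


# Sample line copied from the log in this message (emphasis markers removed).
sample = (
    "Aug 15 10:45:15 hpc-dev-01 slurmd[46391]: slurmd: plugin_load_from_file: "
    "Incompatible Slurm plugin /usr/lib64/slurm/mpi_none.so version (23.02.8)"
)

for path, version in find_stale_plugins(sample):
    print(f"{path} was built for Slurm {version}")
```

Each reported path can then be compared with what the installed 24.05.2
packages provide, to see whether the file is a leftover from the older
release.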



Sid
