[slurm-users] dynamical configuration || meta configuration mgmt

2024-05-29 Thread Heckes, Frank via slurm-users
Hello all, I’m sorry if this has been asked and answered before, but I couldn’t find anything related. Does anyone know whether a framework of sorts exists that allows certain SLURM configuration parameters to be changed when particular conditions in the batch system’s state are detected, and of c…
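No such framework is named in the thread, but Slurm's own `strigger` utility can serve as one building block for state-driven reconfiguration; a minimal sketch (the script path is hypothetical, and the script's contents are left to the site):

```shell
# Run a custom program whenever a node goes down; that program could in
# turn call scontrol to adjust runtime parameters (path is an assumption):
strigger --set --node --down --program=/usr/local/sbin/on_node_down.sh
```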

Re: [slurm-users] after upgrade to 23.11.1 nodes stuck in completion state

2024-01-30 Thread Heckes, Frank
This is scary news. I just updated to 23.11.1 but couldn't confirm the problems described so far. I'll do some more extensive and intensive tests. In case of disaster: does anyone know how to roll back the DB, since some new DB object attributes are introduced in 23.11.1? I never had the chanc…
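For future readers: the usual rollback path is a database dump taken before the slurmdbd upgrade, since slurmdbd converts the accounting schema in place on its first start with the new version. A sketch, assuming MariaDB/MySQL storage (database name and credentials are assumptions; check your slurmdbd.conf):

```shell
# Stop slurmdbd first so the dump is consistent:
systemctl stop slurmdbd
mysqldump --single-transaction -u slurm -p slurm_acct_db > slurm_acct_db-pre-upgrade.sql
# Upgrade, then start slurmdbd (it converts the schema on first start).
# Rollback = reinstall the old slurmdbd and restore the dump:
#   mysql -u slurm -p slurm_acct_db < slurm_acct_db-pre-upgrade.sql
```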

[slurm-users] parastation (mpi)

2023-11-24 Thread Heckes, Frank
Hello all, some of our scientists found the toolchains based on ParaStation to be faster than those utilizing OpenMPI or impi. I couldn’t find any eb files for toolchains based on this MPI implementation. My colleagues are using these toolchains on the Jülich clusters (especially Juwels). My question is w…

[slurm-users] Accounting/access on total usage

2023-01-16 Thread Heckes, Frank
Hi all, I hope I haven’t overlooked a posting or documentation, but I didn’t find anything related. Does anyone know whether it’s possible to configure an ‘intrinsic’ SLURM accounting scheme or mechanism like: 1. All groups and users have an account with a total amount of TRES (cputime, …
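The closest built-in mechanism I'm aware of is association limits in slurmdbd, e.g. GrpTRESMins; a sketch (account name and limit values are hypothetical):

```shell
# Cap an account at 100,000 CPU-minutes, summed over all its users:
sacctmgr modify account physics set GrpTRESMins=cpu=100000
# Inspect the resulting association limits:
sacctmgr show assoc account=physics format=Account,User,GrpTRESMins
```

Note that such limits are only enforced when slurm.conf sets AccountingStorageEnforce to include `limits` (and usually `safe`).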

Re: [slurm-users] slurmd startup problem

2021-08-16 Thread Heckes, Frank
Setting --disable-frontend as an option for configure, either in slurm.spec or on the configure command line, ‘solved’ the issue. Is this a known bug? From: slurm-users On Behalf Of Heckes, Frank Sent: Monday, 16 August 2021 08:24 To: Slurm User Community List Subject: [slurm-users] slurmd startup …
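For reference, the workaround described above amounts to the following when building from source (the prefix and sysconfdir are illustrative; only --disable-frontend is the actual fix):

```shell
./configure --prefix=/usr --sysconfdir=/etc/slurm --disable-frontend
make -j && sudo make install
```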

[slurm-users] slurmd startup problem

2021-08-15 Thread Heckes, Frank
Hi all, I’m using Slurm version 20.02.01. After an OS update to SLES 15.2 on a subset of nodes, the daemon compiled in the corresponding SLES 15.2 build environment fails to start with the message: # slurmd -D -v -f /etc/slurm/slurm.conf slurmd: fatal: PrologFlags=alloc not supported …

Re: [slurm-users] derived counters

2021-04-13 Thread Heckes, Frank
… https://open.xdmod.org/9.0/index.html). The nifty reporting tool has many features to make it easier for us to report on the cluster usage. Hadrian On Tue, Apr 13, 2021 at 8:0…

Re: [slurm-users] derived counters

2021-04-13 Thread Heckes, Frank
Hello Ole, > >> -Original Message- > >>>* (average) queue length for a certain partition > > I wonder what exactly your question means? Maybe the number of jobs or > CPUs in the Pending state? Maybe relative to the number of CPUs in the > partition? > This results from a mgmt. …

Re: [slurm-users] derived counters

2021-04-12 Thread Heckes, Frank
- > From: slurm-users On Behalf Of > Ole Holm Nielsen > Sent: Monday, 12 April 2021 08:19 > To: slurm-users@lists.schedmd.com > Subject: Re: [slurm-users] derived counters > > On 4/11/21 6:17 PM, Heckes, Frank wrote: > > Sorry, if this has been asked and answered before. …

[slurm-users] derived counters

2021-04-11 Thread Heckes, Frank
Hi all, sorry if this has been asked and answered before. Has someone created a script or SQL query, or can someone provide a combination of command-line flags, to create a ‘report’ for: * partition utilization * (average) wait time for jobs sent to a certain partition * (average) q…
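Partition utilization itself is covered by `sreport cluster utilization`; for the average-wait part, here is a small self-contained sketch that averages (start − submit) times from squeue-style `submit|start` timestamp pairs. The sample data and field layout are invented for illustration, not verified Slurm output:

```python
from datetime import datetime

# Hypothetical sample in the style of `squeue -o '%V|%S'` output:
# submit time | start time, one job per line.
SAMPLE = """\
2021-04-11T10:00:00|2021-04-11T10:30:00
2021-04-11T11:00:00|2021-04-11T12:00:00
2021-04-11T09:15:00|2021-04-11T09:45:00
"""

FMT = "%Y-%m-%dT%H:%M:%S"

def average_wait_seconds(text):
    """Average (start - submit) over all job lines, in seconds."""
    waits = []
    for line in text.splitlines():
        if not line.strip():
            continue
        submit_s, start_s = line.split("|")
        submit = datetime.strptime(submit_s, FMT)
        start = datetime.strptime(start_s, FMT)
        waits.append((start - submit).total_seconds())
    return sum(waits) / len(waits) if waits else 0.0

if __name__ == "__main__":
    # Waits here are 1800 s, 3600 s, 1800 s -> average 2400.0 s
    print(average_wait_seconds(SAMPLE))
```

The same idea works against sacct output (Submit and Start fields) for completed jobs.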

[slurm-users] Use nodes exclusive and shared simultaneously

2021-03-05 Thread Heckes, Frank
Hi all, sorry if this has been asked and answered before. Resulting from a user/owner requirement, I need to set up a subset of the nodes in a partition to still be usable as a common shared resource, while keeping these nodes available with the smallest possible latency in case the owner wants to run a…
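One common pattern for this requirement is two overlapping partitions with preemption, so owner jobs can reclaim the shared nodes quickly; a slurm.conf sketch (all node names, group names, and sizes are hypothetical):

```
# Global preemption policy:
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE
# The same nodes appear in both partitions; the owner partition's higher
# PriorityTier lets its jobs preempt jobs in the shared partition:
NodeName=owner[01-04] CPUs=64 State=UNKNOWN
PartitionName=shared Nodes=owner[01-04] PriorityTier=1 Default=YES
PartitionName=owner  Nodes=owner[01-04] PriorityTier=10 AllowGroups=ownergrp
```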

[slurm-users] slurm_pam_adapt & configless - set-up

2020-12-02 Thread Heckes, Frank
Hello all, sorry if this has been asked and/or answered before; I couldn’t find a posting related to my problem. I’m using Slurm 20.02.01 with a configless set-up for all login and compute nodes. I set up Slurm PAM on a test node following the instructions at https://slurm.schedmd.com/…
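Assuming the module in question is pam_slurm_adopt (the subject's "slurm_pam_adapt" looks like a typo), the PAM side is a single line; a sketch of the /etc/pam.d/sshd fragment:

```
# Allow SSH only for users with a running job on this node, and adopt the
# session into that job's cgroup:
account    required     pam_slurm_adopt.so
```

Note that the module must be able to read the Slurm configuration, which may need extra care in a configless set-up where no static /etc/slurm/slurm.conf exists on the node.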