[slurm-users] Re: Changing account names in sacctmgr

2025-06-05 Thread Bjørn-Helge Mevik via slurm-users
"Burian, John via slurm-users" writes: > My understanding is that in the absence of > an --account option, sbatch/salloc assumes the account is the user’s > primary POSIX group. Are you sure about that? I've never heard of such defaulting. My understanding is that without --account, sbatch/sal

[slurm-users] Re: Restrict and prioritize usage of certain nodes according to accounts

2025-05-21 Thread Bjørn-Helge Mevik via slurm-users
"thomas.hartmann--- via slurm-users" writes: > I have three sets of accounts (each can have child accounts): > 1. "General" accounts: These are allowed to use all physical nodes. > 2. "ForProfit" accounts: These absolutely must not use the project_A_* nodes > 3. "Project_A" accounts: Their jobs s

[slurm-users] Re: Slurm webhooks

2025-04-22 Thread Bjørn-Helge Mevik via slurm-users
Davide DelVento via slurm-users writes: > I've gotten a request to have Slurm notify users for the typical email > things (job started, completed, failed, etc) with a REST API instead of > email. This would allow notifications in MS Teams, Slack, or log stuff in > some internal websites and thing

[slurm-users] Re: Slurm upgrade using Debian packages

2025-03-15 Thread Bjørn-Helge Mevik via slurm-users
Ole Holm Nielsen via slurm-users writes: > Hi Bjørn-Helge, > > On 3/7/25 08:59, Bjørn-Helge Mevik via slurm-users wrote: >> My 2¢: >> If upgrading the deb packages does *not* restart the services, then >> you >> can just upgrade all the slurm packages on the con

[slurm-users] Re: Slurm upgrade using Debian packages

2025-03-07 Thread Bjørn-Helge Mevik via slurm-users
My 2¢: If upgrading the deb packages does *not* restart the services, then you can just upgrade all the slurm packages on the controller, then restart slurmdbd first and slurmctld afterwards. (This is how I do upgrades with rpms.) If upgrading *does* restart the services, then you'd have to stop
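A rough sketch of the restart order described above, for the case where the package upgrade does not restart the daemons itself. The unit names `slurmdbd` and `slurmctld` are the ones the Slurm packages normally install; the `run` helper and the `DRY_RUN` switch are hypothetical additions for illustration and are not part of Slurm.

```shell
#!/bin/sh
# Sketch only: restart the controller daemons in the order given above.
# DRY_RUN and run() are illustrative helpers, not Slurm features.

# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

restart_controller_daemons() {
    # Accounting daemon first, controller afterwards.
    run systemctl restart slurmdbd &&
        run systemctl restart slurmctld
}

restart_controller_daemons
```

With the default `DRY_RUN=1` the script only prints the two commands; setting `DRY_RUN=0` would actually run them, which assumes a systemd-based controller host and root privileges.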

[slurm-users] Re: Assistance with Node Restrictions and Priority for Users in Floating Partition

2025-02-03 Thread Bjørn-Helge Mevik via slurm-users
Manisha Yadav writes: > Could you please confirm if my setup is correct, or if any modifications are > required on my end? I don't see anything wrong with the part of the setup that you've shown. Have you checked with `sprio -l -j <jobid>` whether the jobs get the extra qos priority? If not, perhaps

[slurm-users] Re: Assistance with Node Restrictions and Priority for Users in Floating Partition

2025-01-27 Thread Bjørn-Helge Mevik via slurm-users
Manisha Yadav via slurm-users writes: > To achieve this, I attempted to use QoS by creating a floating > partition with some of the nodes and configuring a QoS with > priority. I also set a limit with GrpTRES=gres/gpu=24, given that each > node has 8 GPUs, and there are 3 nodes in total. If ther

[slurm-users] Re: The hostname resolution case sensitive

2024-11-07 Thread Bjørn-Helge Mevik via slurm-users
Ole Holm Nielsen via slurm-users writes: > Is Slurm's NodeName case sensitivity a bug or a feature? Preventing people from using UPPERCASE hostnames, usernames, group names etc. is IMNSHO a feature. :D -- B/H signature.asc Description: PGP signature -- slurm-users mailing list -- slurm-use

[slurm-users] Re: The hostname resolution case sensitive

2024-11-07 Thread Bjørn-Helge Mevik via slurm-users
Bill via slurm-users writes: > I want to confirm that the hostname resolution is case sensitive in SLURM ? That should be easy enough to test: $ sbatch -A nnk -t 10 --mem-per-cpu=100 --wrap='sleep 60' --nodelist=c3-1 Submitted batch job 13088180 $ sbatch -A nnk -t 10 --mem-per-cpu=100

[slurm-users] Re: I need to limit the number of jobs per user per partition

2024-10-28 Thread Bjørn-Helge Mevik via slurm-users
Have you set the AccountingStorageEnforce parameter in slurm.conf? I believe that it should be set to at least "limits", but do check "man slurm.conf" to be sure. -- Regards, Bjørn-Helge Mevik, dr. scient, Department for Research Computing, University of Oslo
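One way to check the setting mentioned above is to read it straight out of the configuration file. This is a minimal sketch: the function name `show_enforce` and the default path `/etc/slurm/slurm.conf` are assumptions, so pass the path your site actually uses.

```shell
#!/bin/sh
# Sketch only: print the AccountingStorageEnforce value from a slurm.conf file.
# The default path is an assumption; pass your own as the first argument.
show_enforce() {
    conf="${1:-/etc/slurm/slurm.conf}"
    # Parameter names in slurm.conf are case-insensitive; lines that start
    # with '#' are comments and do not match the pattern.
    grep -i '^[[:space:]]*AccountingStorageEnforce=' "$conf" |
        tail -n 1 |                      # keep only the last matching line
        cut -d= -f2- |                   # keep the value after the first '='
        sed 's/[[:space:]]*#.*$//'       # drop a trailing comment, if any
}
```

Calling `show_enforce /path/to/slurm.conf` prints the configured value, or nothing if the parameter is not set in that file. On a running cluster, `scontrol show config | grep AccountingStorageEnforce` shows what the controller is actually using.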

[slurm-users] Re: GPU Accounting

2024-10-03 Thread Bjørn-Helge Mevik via slurm-users
Emyr James via slurm-users writes: > I have this set in slurm.conf > > AccountingStorageTRES=gres/gpu I believe you need to list all types of GPUs (including MIGs) that you have configured on the nodes, in addition to the general "gres/gpu". For instance, on one of our clusters, we have A
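To illustrate the shape of such a setting, here is a small sketch that builds an AccountingStorageTRES line from the generic `gres/gpu` entry plus one `gres/gpu:<type>` entry per GPU type. The helper name `tres_line` and the type names `typeA` and `typeB` are placeholders, not the types from the cluster mentioned above; the real names would have to match the GPU types configured on your nodes.

```shell
#!/bin/sh
# Sketch only: build an AccountingStorageTRES line that lists the generic
# gres/gpu entry plus one gres/gpu:<type> entry per configured GPU type.
# The type names passed in below are placeholders for illustration.
tres_line() {
    line="AccountingStorageTRES=gres/gpu"
    for gpu_type in "$@"; do
        line="$line,gres/gpu:$gpu_type"
    done
    echo "$line"
}

tres_line typeA typeB
```

The printed line is meant to be pasted into slurm.conf after replacing the placeholder type names.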

[slurm-users] Re: A note on updating Slurm from 23.02 to 24.05 & multi-cluster

2024-09-26 Thread Bjørn-Helge Mevik via slurm-users
Ward Poelmans via slurm-users writes: > We hit a snag when updating our clusters from Slurm 23.02 to > 24.05. After updating the slurmdbd, our multi cluster setup was broken > until everything was updated to 24.05. We had not anticipated this. When you say "everything", do you mean all the slurm

[slurm-users] Re: Detailed locations for SLUG'24

2024-09-10 Thread Bjørn-Helge Mevik via slurm-users
Bjørn-Helge Mevik via slurm-users writes: > Dear all SLUG attendees! > > The information about which buildings/addresses the SLUG reception and > presentations are to be held is not very visible on > the https://slug24.splashthat.com. There is a map there with all loc

[slurm-users] Detailed locations for SLUG'24

2024-09-09 Thread Bjørn-Helge Mevik via slurm-users
Dear all SLUG attendees! The information about which buildings/addresses the SLUG reception and presentations are to be held is not very visible on the https://slug24.splashthat.com. There is a map there with all locations (https://www.google.com/maps/d/u/0/edit?mid=1bcGaTiW0TNB5noQsjQ3ulctzKuqlG

[slurm-users] Re: Unable to run sequential jobs simultaneously on the same node

2024-08-19 Thread Bjørn-Helge Mevik via slurm-users
Brian Andrus via slurm-users writes: > IIRC, slurm parses the batch file as options until it hits the first > non-comment line, which includes blank lines. Blank lines do not stop sbatch from parsing the file. (But commands do.) -- B/H
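A small batch script makes the point above concrete. This is only an illustrative sketch; the option values are arbitrary, and the `msg` variable exists just to give the script a first command.

```shell
#!/bin/bash
#SBATCH --time=10

#SBATCH --mem-per-cpu=100
# The blank line between the two directives above does not end option
# parsing, so sbatch applies both of them.

msg="first command reached"
echo "$msg"

#SBATCH --ntasks=4
# The directive above comes after a command, so sbatch does not parse it.
```

When run as a job, the shell treats every `#SBATCH` line as an ordinary comment; only sbatch reads them, and only up to the first command.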

[slurm-users] Re: Slurm sacct ResvCPURAW invalid field in version 24.12.5

2024-07-29 Thread Bjørn-Helge Mevik via slurm-users
Perhaps PlannedCPURAW? -- Regards, Bjørn-Helge Mevik, dr. scient, Department for Research Computing, University of Oslo

[slurm-users] Re: Unsupported RPC version by slurmctld 19.05.3 from client slurmd 22.05.11

2024-06-17 Thread Bjørn-Helge Mevik via slurm-users
Paul Edmon via slurm-users writes: > https://slurm.schedmd.com/upgrades.html#compatibility_window > > Looks like no. You have to be within 2 major releases. Also, server must be newer than client. -- B/H

[slurm-users] Re: Performance Discrepancy between Slurm and Direct mpirun for VASP Jobs.

2024-05-26 Thread Bjørn-Helge Mevik via slurm-users
Ole Holm Nielsen via slurm-users writes: > Whether or not to enable Hyper-Threading (HT) on your compute nodes > depends entirely on the properties of applications that you wish to > run on the nodes. Some applications are faster without HT, others are > faster with HT. When HT is enabled, the

[slurm-users] Re: scrontab question

2024-05-07 Thread Bjørn-Helge Mevik via slurm-users
Sandor via slurm-users writes: > I am working out the details of scrontab. My initial testing is giving me > an unsolvable question If you have an unsolvable problem, you don't have a problem, you have a fact of life. :) > Within scrontab editor I have the following example from the slurm > d

[slurm-users] Re: Convergence of Kube and Slurm?

2024-05-06 Thread Bjørn-Helge Mevik via slurm-users
Tim Wickberg via slurm-users writes: > [1] Slinky is not an acronym (neither is Slurm [2]), but loosely > stands for "Slurm in Kubernetes". And not at all inspired by Slinky Dog in Toy Story, I guess. :D -- Cheers, Bjørn-Helge Mevik, dr. scient, Department for Research Computing, University of

[slurm-users] Re: Munge log-file fills up the file system to 100%

2024-04-17 Thread Bjørn-Helge Mevik via slurm-users
Jeffrey T Frey via slurm-users writes: >> AFAIK, the fs.file-max limit is a node-wide limit, whereas "ulimit -n" >> is per user. > > The ulimit is a frontend to rusage limits, which are per-process restrictions > (not per-user). You are right; I sit corrected. :) (Except for number of procs an

[slurm-users] Re: Munge log-file fills up the file system to 100%

2024-04-16 Thread Bjørn-Helge Mevik via slurm-users
Ole Holm Nielsen writes: > Hi Bjørn-Helge, > > That sounds interesting, but which limit might affect the kernel's > fs.file-max? For example, a user already has a narrow limit: > > ulimit -n > 1024 AFAIK, the fs.file-max limit is a node-wide limit, whereas "ulimit -n" is per user. Now that I t

[slurm-users] Re: Munge log-file fills up the file system to 100%

2024-04-16 Thread Bjørn-Helge Mevik via slurm-users
Ole Holm Nielsen via slurm-users writes: > Therefore I believe that the root cause of the present issue is user > applications opening a lot of files on our 96-core nodes, and we need > to increase fs.file-max. You could also set a limit per user, for instance in /etc/security/limits.d/. Then u

[slurm-users] Re: Increasing SlurmdTimeout beyond 300 Seconds

2024-02-12 Thread Bjørn-Helge Mevik via slurm-users
We've been running one cluster with SlurmdTimeout = 1200 sec for a couple of years now, and I haven't seen any problems due to that. -- Regards, Bjørn-Helge Mevik, dr. scient, Department for Research Computing, University of Oslo

[slurm-users] Re: Starting a job after a file is created in previous job (dependency looking for solution)

2024-02-06 Thread Bjørn-Helge Mevik via slurm-users
Amjad Syed via slurm-users writes: > I need to submit a sequence of up to 400 jobs where the even jobs depend on > the preceding odd job to finish and every odd job depends on the presence > of a file generated by the preceding even job (availability of the file for > the first of those 400 jobs

[slurm-users] Re: Why is Slurm 20 the latest RPM in RHEL 8/Fedora repo?

2024-01-31 Thread Bjørn-Helge Mevik via slurm-users
This isn't answering your question, but I strongly suggest you build Slurm from source. You can use the provided slurm.spec file to make rpms (we do) or use "configure + make". Apart from being able to upgrade whenever a new version is out (especially important for security!), you can tailor the