[slurm-users] The issue in the distribution of job

2024-08-09 Thread Sundaram Kumaran via slurm-users
Dear All, May I have your suggestions on an issue I am facing? When a job is launched using "salloc -N4 --mem 4000 -p active", I find that it runs on one compute node while the other 3 machines sit free; the job is not distributed evenly. May I have your suggestions? I do squeue /scon

[slurm-users] Re: The issue in the distribution of job

2024-08-09 Thread Renfro, Michael via slurm-users
It may be difficult to narrow down the problem without knowing what commands you're running inside the salloc session. For example, if it's a pure OpenMP program, it can't use more than one node. From: Sundaram Kumaran via slurm-users Sent: Friday, August 9, 2024

[slurm-users] Print Slurm Stats on Login

2024-08-09 Thread Paul Edmon via slurm-users
We are working to make our users more aware of their usage. One of the ideas we came up with was to have some basic usage stats printed at login (usage over past day, fairshare, job efficiency, etc). Does anyone have any scripts or methods that they use to do this? Before baking my own I was

[slurm-users] Re: Print Slurm Stats on Login

2024-08-09 Thread Jeffrey T Frey via slurm-users
You'd have to do this within e.g. the system's bashrc infrastructure. The simplest idea would be to add to e.g. /etc/profile.d/zzz-slurmstats.sh and have some canned commands/scripts running. That does introduce load to the system and Slurm on every login, though, and slows the startup of logi

[slurm-users] Re: Print Slurm Stats on Login

2024-08-09 Thread Paul Edmon via slurm-users
Yeah, I was contemplating doing that so I didn't have a dependency on the scheduler being up or down or busy. What I was more curious about is if anyone had any prebaked scripts for that. -Paul Edmon- On 8/9/2024 12:04 PM, Jeffrey T Frey wrote: You'd have to do this within e.g. the system's
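A minimal sketch of the profile.d idea discussed above, assuming a periodic job (for example cron) has already written one summary file per user, so that login never queries the scheduler. The truncated messages do not spell out this design; the function name show_slurm_stats, the cache directory /var/cache/slurmstats, and the one-file-per-user layout are all hypothetical.

```shell
# Hypothetical /etc/profile.d/zzz-slurmstats.sh
# Assumption: some periodic job writes <cache_dir>/<user>.txt for each user.
# The directory and file naming below are illustrative, not a Slurm convention.

# Body runs in a subshell ( ... ) so no variables leak into the login shell.
show_slurm_stats() (
    dir="${1:-/var/cache/slurmstats}"
    user="${2:-$(id -un)}"
    file="$dir/$user.txt"
    # Print the cached summary if it exists; never call Slurm at login.
    if [ -r "$file" ]; then
        cat "$file"
    fi
)

# Only print for interactive shells so scp/rsync sessions are unaffected.
case "$-" in
    *i*) show_slurm_stats ;;
esac
```

Reading a pre-generated file keeps login fast and avoids adding load on the Slurm controller for every new shell, at the cost of the numbers being as stale as the last refresh.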

[slurm-users] Re: Print Slurm Stats on Login

2024-08-09 Thread Reid, Andrew C.E. (Fed) via slurm-users
Maybe a heavier lift than you had in mind, but check out xdmod, open.xdmod.org. It was developed by the NSF as part of the now-shuttered XSEDE program, and is useful for both system and user monitoring. -- A. On Fri, Aug 09, 2024 at 12:12:08PM -0400, Paul Edmon via slurm-u

[slurm-users] Re: Print Slurm Stats on Login

2024-08-09 Thread Paul Edmon via slurm-users
Yup, we have that installed already. It's been very beneficial for overall monitoring. -Paul Edmon- On 8/9/2024 12:27 PM, Reid, Andrew C.E. (Fed) wrote: Maybe a heavier lift than you had in mind, but check out xdmod, open.xdmod.org. It was developed by the NSF as part of the now-shutte

[slurm-users] Annoying canonical question about converting SLURM_JOB_NODELIST to a host list for mpirun

2024-08-09 Thread Jeffrey Layton via slurm-users
Good afternoon, I know this question has been asked a million times, but what is the canonical way to convert the list of nodes for a job that is contained in a Slurm variable, I use SLURM_JOB_NODELIST, to a host list appropriate for mpirun in OpenMPI (perhaps MPICH as well)? Before anyone says,

[slurm-users] Re: Annoying canonical question about converting SLURM_JOB_NODELIST to a host list for mpirun

2024-08-09 Thread Paul Edmon via slurm-users
As I recall I think OpenMPI needs a list that has an entry on each line, rather than one separated by a space. See: [root@holy7c26401 ~]# echo $SLURM_JOB_NODELIST holy7c[26401-26405] [root@holy7c26401 ~]# scontrol show hostnames $SLURM_JOB_NODELIST holy7c26401 holy7c26402 holy7c26403 holy7c26404
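As a sketch of how the expansion shown above might be used: scontrol show hostnames prints one hostname per line, which can be redirected to a file as-is or joined with commas. The helper name hosts_to_csv and the file name hostfile.txt are made up for illustration, and the commented commands only work inside a Slurm allocation with scontrol on PATH.

```shell
# Join one-hostname-per-line input into a single comma-separated line.
# hosts_to_csv is a hypothetical helper name, not part of Slurm or Open MPI.
hosts_to_csv() {
    paste -s -d, -
}

# Inside a job allocation (assumes scontrol is available):
#   scontrol show hostnames "$SLURM_JOB_NODELIST" > hostfile.txt
#   scontrol show hostnames "$SLURM_JOB_NODELIST" | hosts_to_csv
```

The one-per-line form suits a hostfile, while the comma-separated form suits launchers that take a host list on the command line; which one a given MPI build expects should be checked against its own documentation.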

[slurm-users] Re: Annoying canonical question about converting SLURM_JOB_NODELIST to a host list for mpirun

2024-08-09 Thread Hermann Schwärzler via slurm-users
Hi Paul, On 8/9/24 18:45, Paul Edmon via slurm-users wrote: As I recall I think OpenMPI needs a list that has an entry on each line, rather than one separated by a space. See: [root@holy7c26401 ~]# echo $SLURM_JOB_NODELIST holy7c[26401-26405] [root@holy7c26401 ~]# scontrol show hostnames $SLUR

[slurm-users] FairShare if there's only one account?

2024-08-09 Thread Drucker, Daniel via slurm-users
Simple question: Does FairShare still work if every user is under one account? E.g.: $ sacctmgr show assoc format=Account,User Account User -- -- root root root mic mic asmith mic bsmith mic csmith mic djones

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Renfro, Michael via slurm-users
I don’t have any 21.08 systems to verify with, but that’s how I remember it. Use “sshare -a -A mic” to verify. You should see both a RawShares and a NormShares column for each user. By default they’ll all have the same value, but they can be adjusted if needed. From: Drucker, Daniel via slurm-u

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Drucker, Daniel via slurm-users
Looks like this: $ sshare -a -A mic AccountUser RawShares NormSharesRawUsage EffectvUsage FairShare -- -- --- --- - -- mic1200.99173655524598 1.00

[slurm-users] Jobs distribution over CPUs

2024-08-09 Thread Rafał Lalik via slurm-users
Hi, I have a very simple computing farm on a single PC with AMD Ryzen 7950X (2x16 cores). I have configured my slurm to use up to 25 CPUs: NodeName=palmer CPUs=25 RealMemory=4 State=UNKNOWN # Boards=1 SocketsPerBoard=1 CoresPerSocket=16 ThreadsPerCore=2 PartitionName=main Nodes=ALL Default

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Renfro, Michael via slurm-users
The format has changed a bit, since none of our RawShares column is ‘parent’. But you can test this to be certain. If your cluster already has jobs pending, have bsmith (who has zero usage) and csmith (who has a lot of usage, relatively) each submit several jobs into the pending queue. Alternat

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Drucker, Daniel via slurm-users
I got the opposite result. When I submitted a job as bsmith, they got a lower priority (the number was smaller) than the job submitted as csmith. bsmith (who has never submitted a job before) got a priority of 98387 (which is 100000 times the 0.983871 FairShare), whereas csmith (who is already ru
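The quoted priority can be checked with a line of arithmetic. This is a sketch that assumes the fairshare weight is 100000, a value inferred from the numbers in the message (98387 / 0.983871 is about 100000) rather than stated in it. It also ignores the age and other factors that Slurm's multifactor plugin adds on top. The function name fairshare_priority is hypothetical.

```shell
# Fairshare contribution to job priority: weight * fairshare factor,
# truncated to an integer. Other priority factors are deliberately ignored.
fairshare_priority() {
    # $1 = fairshare weight (assumed), $2 = FairShare factor between 0 and 1
    awk -v w="$1" -v f="$2" 'BEGIN { printf "%d\n", w * f }'
}

fairshare_priority 100000 0.983871   # prints 98387
```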

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Paul Raines via slurm-users
This depends on how you have assigned fairshare in sacctmgr when creating the accounts and users. At our site we want fairshare only on accounts and not users, just like you are seeing, so we create accounts with sacctmgr -i add account $acct Description="$descr" \ fairshare=200 GrpJ

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Drucker, Daniel via slurm-users
Hi Paul from over at mclean.harvard.edu! I have never added any users using sacctmgr - I've always just had everyone I guess automatically join the default account, mic. Are you saying that is what is causing my problem? I'm confused I guess because I would have expec

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Fulcomer, Samuel via slurm-users
I don't think fairshare use is updated until jobs finish... On Fri, Aug 9, 2024 at 5:59 PM Drucker, Daniel via slurm-users < slurm-users@lists.schedmd.com> wrote: > Hi Paul from over at mclean.harvard.edu! > > I have never added *any* users using sacctmgr - I've always just had > everyone I guess

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Drucker, Daniel via slurm-users
Well, let's say user A has completed a million jobs in the last few days as well, and user A has never submitted any before. On Aug 9, 2024, at 6:03 PM, Fulcomer, Samuel wrote: I don't think fairshare use is updated until jobs finish... On Fri, Aug 9, 202

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Drucker, Daniel via slurm-users
Er, user B has never. On Aug 9, 2024, at 6:08 PM, Daniel M. Drucker wrote: Well, let's say user A has completed a million jobs in the last few days as well, and user A has never submitted any before. On Aug 9, 2024, at 6:03 PM, Fulcomer, Samuel wrote:

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Paul Raines via slurm-users
I have never used Slurm where I have not added users explicitly first, so I am not sure what happens in that case. But from your sshare output it certainly seems it defaults to fairshare=parent. Try modifying the users with sacctmgr modify user $username fairshare=200 and then run sshare -a -A

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Drucker, Daniel via slurm-users
NormShares changes to '1' for any user I modify like that. Everyone else has 0.991736. The "FairShare" column does not change. > On Aug 9, 2024, at 6:35 PM, Paul Raines wrote: > > I have never used Slurm where I have not added users explicitly first so I > am not sure what happens in that case

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Fulcomer, Samuel via slurm-users
Yes, well, in that case, it should work as you desire, modulo your slurm.conf settings. What are the relevant lines in yours? On Fri, Aug 9, 2024 at 6:09 PM Drucker, Daniel wrote: > Er, user B has never. > > On Aug 9, 2024, at 6:08 PM, Daniel M. Drucker > wrote: > > Well, let's say user A has c

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Drucker, Daniel via slurm-users
PriorityType=priority/multifactor PriorityFavorSmall=YES PriorityWeightAge=5 PriorityWeightFairshare=10 PriorityWeightJobSize=0 PriorityWeightQOS=0 In 21.08.8. On Aug 9, 2024, at 8:36 PM, Fulcomer, Samuel wrote: Yes, well, in that case, it should

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Fulcomer, Samuel via slurm-users
...and what are the top 10-15 lines in your share output?... On Fri, Aug 9, 2024 at 9:07 PM Drucker, Daniel wrote: > PriorityType=priority/multifactor > PriorityFavorSmall=YES > PriorityWeightAge=5 > PriorityWeightFairshare=10 > PriorityWeightJobSize=0 > PriorityWeightQOS=0 > > In 21.08.

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Fulcomer, Samuel via slurm-users
"sshare", not share And note that the high PriorityWeightAge may be complicating things. We set it to 0. With it set so high, it allows users to gain priority by flooding the queue if you allow high numbers of job submissions and they age up in priority while they're waiting to run. On Fr

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Drucker, Daniel via slurm-users
> On Aug 9, 2024, at 9:15 PM, Fulcomer, Samuel > wrote: > ...and what are the top 10-15 lines in your share output?... See the 4:10PM message in this thread.

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Drucker, Daniel via slurm-users
On Aug 9, 2024, at 9:21 PM, Fulcomer, Samuel wrote: > And note that the high PriorityWeightAge may be complicating things. We set > it to 0. With it set so high, it allows users to gain priority by flooding > the queue if you allow high numbers of job submissions and they age up in > priority w

[slurm-users] Re: FairShare if there's only one account?

2024-08-09 Thread Fulcomer, Samuel via slurm-users
For users with a parent account of "mic", I'd expect the RawShares to be listed as "1", not "parent". What's the "sprio" output for two jobs of users A and B, and which of them hasn't run any jobs? Also, the first 15 lines of output for "sshare" (no arguments) would be useful for me. On Fri, A