Re: [slurm-users] slurm-users Digest, Vol 67, Issue 20

2023-05-17 Thread Sridhar R
> …n't affect any jobs User A has that aren't tied to the coordinated account.
> * With User A's jobs held, then User B's jobs would be next to run.
> * If the coordinator was particularly impatient, he could scancel User A's currently running jobs so that Us

Re: [slurm-users] From an initial installation cannot start slurmctld with a slurmdbd running

2023-05-17 Thread Christopher Samuel
Hi Lawrence, On 5/17/23 3:26 pm, Sorrillo, Lawrence wrote: Here is the error I get: slurmctld: fatal: Can not recover assoc_usage state, incompatible version, got 9728 need >= 8704 <= 9216, The slurm version is:  20.11.9 That error seems to appear when slurmctld is loading usage data from

[slurm-users] From an initial installation cannot start slurmctld with a slurmdbd running

2023-05-17 Thread Sorrillo, Lawrence
Here is the error I get: slurmctld: fatal: Can not recover assoc_usage state, incompatible version, got 9728 need >= 8704 <= 9216, The slurm version is: 20.11.9 MariaDB version: mariadb-server-10.3.32-2.module+el8 Thanks.
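
The three numbers in that error look like packed version codes. A minimal sketch for reading them, assuming (this is not stated in the thread) that each value is packed as (major << 8) | minor; the function name is made up for illustration. Whatever the packing, the message itself says the saved state carries version 9728 while this slurmctld accepts only 8704 through 9216, so the state file appears to come from a newer release than the 20.11.9 binary reading it.

```python
def decode_protocol_version(value: int) -> tuple[int, int]:
    """Split a packed version code into (major_byte, minor_byte).

    Assumption: the code is packed as (major << 8) | minor.
    """
    return value >> 8, value & 0xFF


# Numbers copied from the error message in the post above.
found, oldest_ok, newest_ok = 9728, 8704, 9216

for label, value in [
    ("found in state file", found),
    ("oldest accepted", oldest_ok),
    ("newest accepted", newest_ok),
]:
    major, minor = decode_protocol_version(value)
    print(f"{label}: {value} -> major byte {major}, minor byte {minor}")

# The state is rejected because its version lies above the accepted range.
state_is_too_new = found > newest_ok
print("state file newer than this slurmctld accepts:", state_is_too_new)
```

Comparing the decoded major bytes makes it easier to see how far apart the version that wrote the state and the version trying to read it are.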

Re: [slurm-users] On the ability of coordinators

2023-05-17 Thread Groner, Rob
I thought about that, and on a GENERAL level I think that would work. But I was thinking more along the lines of the scenario I had described where the coordinator wanted control of what jobs from their account would run next, right now. So, it would be changing the priority of jobs already as

Re: [slurm-users] On the ability of coordinators

2023-05-17 Thread Renfro, Michael
If there’s a fairshare component to job priorities, and there’s a share assigned to each user under the account, wouldn’t the light user’s jobs move ahead of any of the heavy user’s pending jobs automatically?

Re: [slurm-users] On the ability of coordinators

2023-05-17 Thread Groner, Rob
Ya, I found they had the power to hold jobs just by experimentation. Maybe it will turn out I had something misconfigured and coordinators don't have that ability either. I hope that's not the case, since being able to hold jobs in their account gives them some usefulness. My interest in this

Re: [slurm-users] On the ability of coordinators

2023-05-17 Thread Brian Andrus
Coordinator permissions from the man pages: coordinator A special privileged user, usually an account manager, that can add users or sub-accounts to the account they are coordinator over. This should be a trusted person since they can change limits on account and user associations,

Re: [slurm-users] On the ability of coordinators

2023-05-17 Thread Groner, Rob
I'm not sure what you mean by "if they have the permissions". I'm talking about someone who is specifically designated as "coordinator" of an account in slurm. With that designation, and no other admin level changes, I'm not aware that they can directly change the priority of jobs associated w

Re: [slurm-users] On the ability of coordinators

2023-05-17 Thread Brian Andrus
If they have the permissions, you can just raise the priority of user B's jobs to be higher than whatever A's currently are. Then they will run next. That will work if you are able to wait for some jobs to finish and you can 'skip the line' for the priority jobs. If you need to preempt runni

Re: [slurm-users] Slurmdbd High Availability

2023-05-17 Thread Shaghuf Rahman
Thanks Ole for your input. I'm looking for the best-fit solution, so I have a quick question related to slurmctld backup as well. I tested the read/write speed on our NAS storage and local HDD; it turns out the speed on the local HDD is much higher than on NAS storage. The r/w speed on NAS Storage is 250mb/s

[slurm-users] On the ability of coordinators

2023-05-17 Thread Groner, Rob
I was asked to see if coordinators could do anything in this scenario:
* Within the account that they coordinated, User A submitted 1000s of jobs and left for the day.
* Within the same account, User B wanted to run a few jobs really quickly. Once submitted, his jobs were of course behi

Re: [slurm-users] Troubles with cgroups

2023-05-17 Thread Hermann Schwärzler
Hi everybody, I would like to give you a quick update on this problem (hanging systems when swapping due to cgroup memory-limits is happening): We had opened a case with RedHat's customer support. After some to and fro they could reproduce the problem. Last week they told us to upgrade to ve