Re: [slurm-users] sbatch - accept jobs above limits

2022-02-08 Thread Alexander Block
Hi Mike, I'm discussing a similar case with SchedMD right now (ticket 13309). It seems that Slurm does not allow submitting jobs that request features or configuration that are not available at the moment of submission. Cheers, Alexander On 08.02.2022 at 23:26, z1... wrote:

Re: [slurm-users] sbatch - accept jobs above limits

2022-02-08 Thread Stephen Cousins
I can duplicate this error word for word by submitting a job asking for 150 GB of memory when the nodes in that partition have a maximum of 128 GB. Take a look at the memory values in your node specifications and in your job script or command line. Maybe there is a typo. On Tue, Feb 8, 2022, 8:40 P
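
The comparison described above (requested memory against the largest node in the partition) can be sketched as follows. This is a minimal illustration, not Slurm's own logic: `mem_to_mb` and `request_fits` are hypothetical helpers, and the unit handling assumes sizes written like "150gb" or "128GB", with megabytes as the default unit.

```python
import re

# Multipliers to megabytes; megabytes are assumed as the default unit.
_UNITS_MB = {"k": 1 / 1024, "m": 1, "g": 1024, "t": 1024 * 1024}


def mem_to_mb(text: str) -> float:
    """Convert a size such as '150gb', '128GB' or '4000' to megabytes (hypothetical helper)."""
    match = re.fullmatch(r"\s*(\d+)\s*([kmgt]?)b?\s*", text.lower())
    if match is None:
        raise ValueError(f"unrecognised memory value: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNITS_MB[unit or "m"]


def request_fits(requested: str, node_max: str) -> bool:
    """True if the requested memory does not exceed the largest node's memory."""
    return mem_to_mb(requested) <= mem_to_mb(node_max)


# The case from the message: 150 GB requested, nodes top out at 128 GB.
fits = request_fits("150gb", "128GB")
print("request fits on a node:", fits)
```

A check like this only mirrors the arithmetic; the authoritative values are the node definitions in slurm.conf and the job's actual request.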

Re: [slurm-users] sbatch - accept jobs above limits

2022-02-08 Thread Ryan Novosielski
I’m not 100% certain that this affects this situation, but there’s a slurm.conf setting called EnforcePartLimits that you might want to change.

Re: [slurm-users] sbatch - accept jobs above limits

2022-02-08 Thread Stephen Cousins
What I'm saying is that the job might not be able to run in that partition. Ever. The job might be asking for more resources than the partition can provide. Maybe I'm wrong but it would help to know what the partition definition is, along with what resources the nodes in that partition have specifi

Re: [slurm-users] sbatch - accept jobs above limits

2022-02-08 Thread Christopher Samuel
On 2/8/22 2:26 pm, z1...@arcor.de wrote: These jobs should be accepted, if a suitable node will be active soon. For example, these jobs could be in PartitionConfig. From memory if you submit jobs with the `--hold` option then you should find they are successfully accepted - I've used that in
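
The `--hold` suggestion above can be sketched as a pair of command lines built in Python. This only assembles the arguments and does not run anything; `held_submit_command` and `release_command` are hypothetical helper names, `job.sh` and `--mem=150G` are placeholder values, and whether a held job is actually accepted in this situation is the poster's recollection, not something confirmed here.

```python
import shlex


def held_submit_command(script, extra_args=()):
    """Build an sbatch command that submits the job in a held state."""
    return ["sbatch", "--hold", *extra_args, script]


def release_command(job_id):
    """Build the scontrol command that releases a held job."""
    return ["scontrol", "release", str(job_id)]


# Placeholder script name and memory request, for illustration only.
cmd = held_submit_command("job.sh", ["--mem=150G"])
print(shlex.join(cmd))
print(shlex.join(release_command(12345)))
```

The held job would then be released with `scontrol release` once a suitable node is available.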

Re: [slurm-users] sbatch - accept jobs above limits

2022-02-08 Thread z148x
Yes, the partition does not meet the requirements now. The job should still be submitted and wait until requirements are available. On 09.02.22 00:11, Stephen Cousins wrote: > I think this message comes up when there are no nodes in that partition > have the resources capable to meet the require

Re: [slurm-users] sbatch - accept jobs above limits

2022-02-08 Thread Stephen Cousins
I think this message comes up when no nodes in that partition have the resources to meet the requirements. Can you show what the partition definition is in slurm.conf along with what the job is asking for? On Tue, Feb 8, 2022, 5:25 PM wrote: > > Dear all, > > sbatch jobs are im

[slurm-users] slurmrestd with RS256 tokens

2022-02-08 Thread John Yost
Has anyone got this to work? I have HS256 working fine, but when I try RS256 I get an error that the token is missing the kid field. This is the decoded token: {'exp': 1644350831, 'iat': 1644343631, 'sub': 'slurm', 'kid': 'grm', 'alg': 'RS256'} I can see in the code where the error is being throw
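
One thing worth checking with a token like the one above is where `kid` actually lives. In the JWS format (RFC 7515), `kid` and `alg` are header parameters, while `exp`, `iat` and `sub` are payload claims. The sketch below decodes both parts without verifying the signature, using only the standard library. It is an assumption that this distinction is relevant to the error reported here, since the message is cut off; the sample token is built locally with a placeholder signature purely for illustration.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(raw: bytes) -> str:
    """Encode bytes as unpadded base64url, as used in JWTs."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_unverified(token: str):
    """Return (header, payload) of a JWT without checking the signature."""
    header_b64, payload_b64, _signature = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(payload_b64))


# Illustrative token: values taken from the message, signature is a placeholder.
header = {"alg": "RS256", "typ": "JWT", "kid": "grm"}
payload = {"exp": 1644350831, "iat": 1644343631, "sub": "slurm"}
token = ".".join([
    b64url_encode(json.dumps(header).encode()),
    b64url_encode(json.dumps(payload).encode()),
    "signature-placeholder",
])

decoded_header, decoded_payload = decode_unverified(token)
print("header: ", decoded_header)
print("payload:", decoded_payload)
```

Running `decode_unverified` on the real token shows whether `kid` ended up in the header or in the payload.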

[slurm-users] JobComp file not rotating

2022-02-08 Thread Stuart Barkley
I am using JobCompType=jobcomp/filetxt and JobCompLoc=/var/log/slurm/jobcomp.log. This file is not being reopened on SIGUSR2, which makes rotating the log file difficult. slurmctld(8) says SIGUSR2 should be used for logrotate but only mentions the main slurmctld log file. Using SIGHUP does reope

Re: [slurm-users] Is this a known error?

2022-02-08 Thread Nicolas Greneche
Hi, I had the same issue. It was just because I had an older slurmctld somewhere with the node set to drain. Even though the node was drained in the old slurmctld, it still tried to connect to slurmd. On 08/12/2021 at 18:03, Sean McGrath wrote: Hi Bjørn-Helge, Thanks for that. On Wed, Dec 08, 2021 at