Hi Mike,
I'm discussing a similar case with SchedMD right now (ticket 13309), but it
seems that it is not possible in Slurm to submit jobs that request features or
a configuration that is not available at the moment of submission.
Cheers,
Alexander
On 08.02.2022 at 23:26, z1... wrote:
I can reproduce this error word for word by submitting a job that asks for
150 GB of memory when the nodes in that partition have a maximum of 128 GB.
Take a look at the memory values in your node specifications and in your
job script or on the command line. Maybe there is a typo.
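If it helps, here is a quick way to see both sides of that comparison (the
partition name is a placeholder):

    # memory configured per node in the partition (RealMemory, in MB)
    sinfo -p <partition> -o "%N %m"
    # versus what the job requests in the script or on the command line, e.g.
    #SBATCH --mem=150G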
On Tue, Feb 8, 2022, 8:40 P
I’m not 100% certain that this affects this situation, but there’s a slurm.conf
setting called EnforcePartLimits that you might want to change.
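For reference, a minimal slurm.conf sketch of that setting (whether it covers
this particular rejection is exactly the part I'm not certain about):

    EnforcePartLimits=NO   # jobs exceeding partition limits are accepted and stay queued
    # ALL or ANY instead reject such jobs at submission time; see slurm.conf(5)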
--
Ryan Novosielski
What I'm saying is that the job might not be able to run in that partition.
Ever. The job might be asking for more resources than the partition can
provide. Maybe I'm wrong, but it would help to know what the partition
definition is, along with what resources the nodes in that partition have
specified.
On 2/8/22 2:26 pm, z1...@arcor.de wrote:
These jobs should be accepted if a suitable node will become active soon.
For example, these jobs could sit pending with the reason PartitionConfig.
From memory, if you submit jobs with the `--hold` option then you should
find they are successfully accepted - I've used that in
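If it is useful, the workflow that suggestion implies looks roughly like this
(script name and job id are placeholders):

    sbatch --hold job.sh       # job is accepted but starts in a held state
    scontrol release <jobid>   # release it once the required nodes/features exist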
Yes, the partition does not meet the requirements right now. The job should
still be accepted at submission and then wait until the requirements are
available.
On 09.02.22 00:11, Stephen Cousins wrote:
> I think this message comes up when no nodes in that partition have the
> resources to meet the requirements.
I think this message comes up when no nodes in that partition have the
resources to meet the requirements. Can you show what the partition
definition is in slurm.conf, along with what the job is asking for?
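For instance (partition and node names are placeholders):

    scontrol show partition <partition>   # partition limits and node list
    scontrol show node <nodename>         # CPUs, RealMemory, Features per node
    # plus the #SBATCH lines (or sbatch options) the failing job uses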
On Tue, Feb 8, 2022, 5:25 PM wrote:
>
> Dear all,
>
> sbatch jobs are im
Has anyone got this to work? I have HS256 working fine, but when I try
RS256 I get an error that the token is missing the kid field.
This is the decoded token:
{'exp': 1644350831, 'iat': 1644343631, 'sub': 'slurm', 'kid': 'grm', 'alg':
'RS256'}
I can see in the code where the error is being thrown
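In case it helps: as far as I can tell, the JWKS-based verification looks for
kid in the JOSE header rather than in the claims, and in the decoded output
above it appears alongside the claims, so it is worth checking whether it
actually ended up in the header. Here is a rough sketch of minting an RS256
token with kid in the header using openssl (the private key path and lifetime
are assumptions; the kid value is taken from the token above):

    # rough sketch: build an RS256 token with "kid" in the JOSE header
    b64url() { openssl base64 -A | tr '+/' '-_' | tr -d '='; }
    now=$(date +%s)
    header=$(printf '{"alg":"RS256","typ":"JWT","kid":"grm"}' | b64url)
    claims=$(printf '{"sub":"slurm","iat":%s,"exp":%s}' "$now" "$((now + 7200))" | b64url)
    sig=$(printf '%s.%s' "$header" "$claims" \
          | openssl dgst -sha256 -sign /etc/slurm/jwt_rs256.key -binary | b64url)
    export SLURM_JWT="$header.$claims.$sig"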
I am using:
JobCompType=jobcomp/filetxt
JobCompLoc=/var/log/slurm/jobcomp.log
This file is not being reopened on SIGUSR2, which makes rotating the
log file difficult. slurmctld(8) says SIGUSR2 should be used for
logrotate but only mentions the main slurmctld log file.
Using SIGHUP does reopen
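For what it's worth, one way to sidestep the reopen-on-signal question for
that file is a logrotate stanza with copytruncate (the rotation schedule here
is just an example):

    /var/log/slurm/jobcomp.log {
        weekly
        rotate 8
        missingok
        notifempty
        compress
        # copytruncate rotates in place, so no signal to slurmctld is needed
        copytruncate
    }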
Hi,
I had the same issue. It was just because I had an older slurmctld running
somewhere with the node set to drain. Even though the node was drained in the
old slurmctld, it still tries to connect to slurmd.
On 08/12/2021 at 18:03, Sean McGrath wrote:
Hi Bjørn-Helge,
Thanks for that.
On Wed, Dec 08, 2021 at