Hello,
Does anyone know if there is any documentation about the NVIDIA IMEX plugin for
Slurm 24.05?
It is not even in the man page for slurm.conf, though it is mentioned in the release notes.
Best regards,
Taras
--
slurm-users mailing list -- slurm-users@lists.schedmd.com
Hello,
In the past it was recommended to reconfigure the Slurm daemons in the logrotate
script; sending a signal was, I believe, also the way to go. But recently I
retested manual log rotation and I see that removal of the log file (for
slurmctld, slurmdbd or slurmd) does not affect the logging of the daemon
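As a sketch of what such a logrotate setup can look like (the log path is an assumption; the slurm.conf man page documents SIGUSR2 as the signal that makes the daemons reopen their log files):

```
/var/log/slurm/*.log {
    weekly
    missingok
    notifempty
    compress
    delaycompress
    # SIGUSR2 tells the Slurm daemons to reopen their log files
    postrotate
        pkill -x --signal SIGUSR2 slurmctld
        pkill -x --signal SIGUSR2 slurmd
        pkill -x --signal SIGUSR2 slurmdbd
    endscript
}
```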
Slurm versions 23.02.6 and 22.05.10 are now
available (CVE-2023-41914)
Taras Shapovalov writes:
> Are the older versions affected as well?
Yes, all older versions are affected.
--
B/H
Are the older versions affected as well?
Best regards,
Taras
From: slurm-users on behalf of Tim
Wickberg
Sent: Thursday, October 12, 2023 00:01
To: slurm-annou...@schedmd.com ;
slurm-us...@schedmd.com
Subject: [slurm-users] Slurm versions 23.02.6 and 22.05.10
Hey,
I noticed a weird behavior of Slurm 21 and 22. When the following conditions
are satisfied, Slurm implicitly sets the job memory request equal to the
RealMemory of some node (perhaps the first node that satisfies the job's other
requests, but this is not documented, or I could not find it in the documentation
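One way to sidestep the implicit behavior while it is unclear is to request memory explicitly; the options below are standard sbatch/slurm.conf options, the values are only illustrative:

```
# request 2 GB per node explicitly at submission time
sbatch --mem=2G job.sh

# or set a cluster-wide default in slurm.conf
DefMemPerNode=2048
```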
Hi Marcus,
This may depend on ConstrainDevices in cgroups.conf. I guess it is set to
"no" in your case.
Best regards,
Taras
On Tue, Jun 23, 2020 at 4:02 PM Marcus Wagner
wrote:
> Hi Kota,
>
> thanks for the hint.
>
> Yet, I'm still a little bit astonished, as if I remember right,
> CUDA_VISIBLE_DEVICES
Hey Robert,
Ask Bright support; they will help you figure out what is going on there.
Best regards,
Taras
On Tue, Feb 11, 2020 at 8:26 PM Robert Kudyba wrote:
> This is still happening. Nodes are being drained after a kill task failed.
> Could this be related to https://bugs.schedmd.com/sho
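Until the root cause is found, drained nodes can be returned to service manually; a sketch (the node name is illustrative):

```
# show why the node was drained
scontrol show node node001 | grep -i reason

# return it to service
scontrol update NodeName=node001 State=RESUME
```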
Hey guys,
Do you know if there is a way to build Slurm with the datawarp plugin on a
regular RHEL7 machine without the Cray environment (without DataWarp installed)?
Best regards,
Taras
from your
configuration now.
The error message suggests that we "consider" this somehow, but I don't see
how we should consider it.
Best regards,
Taras
On Wed, Nov 6, 2019 at 5:30 AM Chris Samuel wrote:
> On 5/11/19 6:36 am, Taras Shapovalov wrote:
>
> > Since Slurm 19.0
Hey guys,
Since Slurm 19.05.3 we get an error message that FastSchedule is
deprecated. But I cannot find in the documentation what the alternative to
FastSchedule=0 is. Do you know how we can get the same behavior without using
that option since 19.05.3?
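For what it's worth, my reading of the 19.05 release notes is that SlurmdParameters=config_overrides replaces FastSchedule=2, while FastSchedule=0 (trust the hardware configuration detected by slurmd) has no direct slurm.conf equivalent; a sketch to verify against the release notes:

```
# slurm.conf (19.05+): documented replacement for FastSchedule=2
SlurmdParameters=config_overrides
```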
Best regards,
Taras
Hey guys,
Do I understand correctly that Slurm 19 is not compatible with RHEL 8? It is
not in the list: https://slurm.schedmd.com/platforms.html
Has anyone successfully built Slurm 19 on RHEL 8 (or CentOS 8)?
Best regards,
Taras
Hi Dave,
I can confirm that CoreSpecCount cannot be reset to 0 once it is set >0
(at least for FastSchedule>0). As a workaround for this bug you can try to
stop slurmctld, remove the node_state file, and start slurmctld again.
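The workaround spelled out as commands (the path is an assumption; StateSaveLocation in slurm.conf gives the actual directory, and note that removing state files discards saved node state):

```
systemctl stop slurmctld
# node_state lives under StateSaveLocation, e.g. /var/spool/slurmctld
rm /var/spool/slurmctld/node_state
systemctl start slurmctld
```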
Best regards,
Taras
On Fri, Aug 9, 2019 at 11:54 PM Guertin, David S.
w
Hey guys,
When a job's max time is exceeded, Slurm tries to kill the job and fails:
[2019-03-15T09:44:03.589] sched: _slurm_rpc_allocate_resources JobId=1325
NodeList=rn003 usec=355
[2019-03-15T09:44:03.928] prolog_running_decr: Configuration for JobID=1325
is complete
[2019-03-15T09:45:12.739
Thank you, guys,
Let's wait for 17.11.8. Any estimate of the release date?
Best regards,
Taras
On Wed, Jul 11, 2018 at 12:11 AM Kilian Cavalotti <
kilian.cavalotti.w...@gmail.com> wrote:
> On Tue, Jul 10, 2018 at 10:34 AM, Taras Shapovalov
> wrote:
> > I noticed the
Hey guys,
After we upgraded to 17.11.7, on some clusters all jobs are killed with
these messages:
slurmstepd: error: Job 374 exceeded memory limit (1308 > 1024), being
killed
slurmstepd: error: Exceeded job memory limit
slurmstepd: error: *** JOB 374 ON node002 CANCELLED AT
2018-06-28T0
Hey guys,
We always use the default value for SlurmUser, but now we realize that we
don't really understand why it is the user slurm rather than root. Sometimes
it is useful to run SlurmctldProlog as root, but then slurmctld will also run
as root. Other workload managers are OK with running their control daemons