Hi,
I ran into this recently after upgrading from 16.05.10 to 17.11.7 and couldn’t
run any jobs on any partitions. The only way I got around this was to set this
flag on all “NodeName” definitions in slurm.conf: RealMemory=foo
where foo is the total memory of the node in MB. I believe the documen
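
A minimal sketch of how the RealMemory value mentioned above could be derived, assuming a Linux node where /proc/meminfo reports MemTotal in kB. The helper names, the sample figure, and the node name "node001" are hypothetical and only for illustration; `slurmd -C` on the node itself prints the hardware configuration Slurm detects and is probably the more reliable source.

```python
import re


def real_memory_mb(meminfo_text: str) -> int:
    """Return MemTotal from /proc/meminfo-style text, floored to whole MB."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal not found in input")
    # /proc/meminfo reports kB; slurm.conf expects RealMemory in MB.
    return int(match.group(1)) // 1024


def node_line(name: str, mem_mb: int) -> str:
    """Format a NodeName fragment for slurm.conf (other options omitted)."""
    return f"NodeName={name} RealMemory={mem_mb}"


# Hypothetical sample input; on a real node, read /proc/meminfo instead.
sample = "MemTotal:       196608000 kB\nMemFree:        1000000 kB\n"
print(node_line("node001", real_memory_mb(sample)))
```

Rounding down keeps the configured value at or below what the node actually reports, which should avoid the node being flagged for having less memory than configured.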
Hi,
Hopefully this isn't an obvious fix I'm missing. We have a large number of KNL
nodes that can get rebooted when their memory or cluster modes are changed by
users. I never heard any complaints when running Slurm v16.05.10, but I've
seen a number of issues since our upgrade a couple months
John
On 10/10/18, 4:08 PM, "Roberts, John E." wrote:
Hi,
Hopefully this isn't an obvious fix I'm missing. We have a large number of
KNL nodes that can get rebooted when their memory or cluster modes are changed
by users. I never heard any complaints when run
In my experience, TmpFS in slurm.conf hasn't been honored since at least
v16.05.10; I noticed this myself when I initially configured Slurm. Like
the user below, we also just set this elsewhere.
Thanks!
John
From: slurm-users on behalf of
Shenglong Wang
Reply-To: Slurm User Comm
Hi,
I'm not sure of the best way to solve this and I don't see any obvious things I
can set in the configuration. Please let me know if I'm missing something.
I have several partitions in Slurm (16.05). I also have many accounts with
users tied to them and all of the accounts have a CPU hour li
Hi,
The documentation is a little unclear to me, so I was wondering how to do a
complete backup and restore of Slurm for testing and/or disaster recovery.
I'm looking to upgrade Slurm from 16.05.10 to the latest and I'm not sure all
of what should go. I stood up some VMs to test this upgrade and m
Hi,
I'm testing the newest version of Slurm and I'm seeing an issue when using the
newer billing TRES to charge for cpu time on a partition. I've seen that
billing should be used now instead of cpu in order to properly use the
"TRESBillingWeights" option on a partition.
In my test case, I gav
want.
On Fri, Apr 27, 2018 at 11:21 AM, Roberts, John E.
<jerobe...@anl.gov> wrote:
Hi,
I'm testing the newest version of Slurm and I'm seeing an issue when using the
newer billing TRES to charge for cpu time on a partition. I've seen that
billing should be used
Hi,
Unfortunately that can't be a solution in my running production environment for
a number of reasons. I did consider it (
Thanks!
-John
On 4/30/18, 2:40 AM, "slurm-users on behalf of Bjørn-Helge Mevik"
wrote:
"Roberts, John E." writes:
> So no
Hi,
Seeing this after an upgrade today. I now can't get any jobs to run. Things
were fine before the upgrade. Any ideas?
slurmstepd: error: Job 535721 exceeded memory limit (1160 > 1024), being
killed
slurmstepd: error: Exceeded job memory limit
ulimit shows:
$ u
Renfro, Michael"
wrote:
Anything in particular set for DefMemPerCPU in your slurm.conf?
> On Jun 11, 2018, at 3:50 PM, Roberts, John E. wrote:
>
> Hi,
>
> Seeing this after an upgrade today. I now can't get any jobs to run. Thing
n
On 6/11/18, 4:12 PM, "Roberts, John E." wrote:
Nothing that I'd assume is incorrect:
DefMemPerNode = UNLIMITED
MaxMemPerNode = UNLIMITED
MemLimitEnforce = Yes
PropagateResourceLimitsExcept = MEMLOCK
CPU vars aren't