On Wed, 2017-10-18 at 09:36:42 -0600, Michael Di Domenico wrote:
>
> is there any way, after a job starts, to determine why the scheduler
> chose the series of nodes it did?
>
> for some reason on an empty cluster when i spin up a large job it's
> staggering the allocation across a seemingly rando
On Thursday, 19 October 2017 4:43:13 PM AEDT Nadav Toledo wrote:
> so adding manually only works if I don't restart slurmctld...
That usually points to a communication problem for slurmdbd trying to tell the
slurmctld about these changes via an RPC.
What does this say?
sacctmgr list clusters
slurmdbd and slurmctld are on the same machine...
sacctmgr list clusters format=cluster,controlhost,controlport
   Cluster     ControlHost  ControlPort
---------- --------------- ------------
imageproc+       127.0.0.1         6817
It's still at the beta stage for testing, so
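
The registration check above can also be scripted instead of read by eye. This is a minimal sketch, not something posted in the thread: it assumes sacctmgr's -n (no header) and -P (pipe-delimited) options, and it assumes that a cluster whose slurmctld has not registered with slurmdbd shows an empty ControlHost or a ControlPort of 0. The function name unregistered_clusters is hypothetical.

```shell
# Read "cluster|controlhost|controlport" lines on stdin and print the
# names of clusters that look unregistered (empty host or port 0).
unregistered_clusters() {
    awk -F'|' 'NF && ($2 == "" || $3 == 0) { print $1 }'
}

# Intended use on the slurmdbd host (requires sacctmgr):
#   sacctmgr -n -P list clusters format=cluster,controlhost,controlport \
#       | unregistered_clusters
```

For the output quoted above (127.0.0.1 and 6817) the filter prints nothing, which would suggest that registration itself is not the problem.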
Are they both running as the same user? (They need to, in order to trust
each other's messages)
On Oct 19, 2017 01:12, "Nadav Toledo" wrote:
> slurmdbd and slurmctld are on the same machine...
>
> sacctmgr list clusters format=cluster,controlhost,controlport
> Cluster ControlHost ControlPort
>
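
The same-user question can be answered from the process table rather than from memory. The following is a rough sketch under stated assumptions: a Linux host whose ps accepts POSIX-style -o user= -o comm= output, and daemons whose command names are exactly slurmctld and slurmdbd. The helper names daemon_user and same_daemon_user are hypothetical and not part of Slurm.

```shell
# Print the owner of the first running process whose command name is $1.
daemon_user() {
    ps -e -o user= -o comm= 2>/dev/null |
        awk -v name="$1" '$2 == name { print $1; exit }'
}

# Report whether two daemons run as the same user.
# Returns 0 if same, 1 if different, 2 if either one is not running.
same_daemon_user() {
    a=$(daemon_user "$1")
    b=$(daemon_user "$2")
    if [ -z "$a" ] || [ -z "$b" ]; then
        printf 'not running: %s=%s %s=%s\n' "$1" "$a" "$2" "$b"
        return 2
    fi
    if [ "$a" = "$b" ]; then
        printf 'same user: %s\n' "$a"
    else
        printf 'different users: %s=%s %s=%s\n' "$1" "$a" "$2" "$b"
        return 1
    fi
}

# Intended use: same_daemon_user slurmctld slurmdbd
```

The result is worth comparing with the SlurmUser setting in both slurm.conf and slurmdbd.conf, since starting a daemon through sudo does not by itself say which user it ends up running as.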
Re: [slurm-dev] Re: Qos limits associations and AD auth
yeah, slurmuser=slurmduser=root
On 19/10/2017 11:17, Douglas Jacobsen wrote:
Are they both running as the same user? (They need to, in order to trust
each other's messages)
On Oct 19, 2017 01:12, "Nadav Toledo" <1nadavtol...@cs.techn
Re: [slurm-dev] Re: Qos limits associations and AD auth
also running both slurmctld and slurmdbd with sudo
perhaps I should "sudo su -" first?
On 19/10/2017 11:17, Douglas Jacobsen wrote:
Are they both running as the same user? (They need to, in order to trust
each other's messages)
O
Re: [slurm-dev] Re: Qos limits associations and AD auth
I found a hint, running sacctmgr show problem:
Cluster Account User Problem
-- -- --
domain_name
a
On Thursday, 19 October 2017 7:41:37 PM AEDT Nadav Toledo wrote:
> running : id -u domain_name\\username , does return its uid
So your system is not finding users as just "username", but instead only as
domain_name\\username which is probably not ideal.
You probably want to see if you can find
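
The lookup difference described above can be reproduced with a small test of both name forms. This is only a sketch under assumptions: it relies on id -u consulting the same name service (for example sssd or winbind) that Slurm uses, and the helper names user_resolves and check_user_forms are hypothetical.

```shell
# Succeed if the name service can resolve the given user name.
user_resolves() {
    id -u "$1" >/dev/null 2>&1
}

# Try the bare name first, then the domain-qualified form.
# $1 = user name, $2 = domain name
check_user_forms() {
    for name in "$1" "$2\\$1"; do
        if user_resolves "$name"; then
            printf 'resolves: %s\n' "$name"
        else
            printf 'does not resolve: %s\n' "$name"
        fi
    done
}

# Intended use: check_user_forms username domain_name
```

If only the domain-qualified form resolves, the user names stored in the accounting associations would presumably not match what the system reports.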
Thank you for your help Chris!
sacctmgr list config | fgrep Purge
PurgeEventAfter   = 11 days
PurgeJobAfter     = 61 days
PurgeResvAfter    = 11 days
PurgeStepAfter    = 11 days
PurgeSuspendAfter = 11 days
[2017-10-18T09:04:52.098] slurmdbd version 15.08.3 started
[2017
Hi Benjamin,
On 18 October 2017 at 15:13, Benjamin Redling
wrote:
oversubscribe fka. shared
Thanks, I'll try that, so, Shared=YES in the partition config.
> Search in the archive documentation for 15.08:
> https://slurm.schedmd.com/archive/
Great, thank you. It seems that --share is the old
Slurm version 17.02.8 contains about 42 bug fixes developed over the
past two months.
Slurm downloads are available from https://www.schedmd.com/downloads.php
Details about the changes are listed below.
* Changes in Slurm 17.02.8
==
-- Add 'slurmdbd:' to the account
Many thanks to all the attendees, and especially to all those who
presented at the Slurm User Group 2017 meeting in Berkeley. Thank you to
NERSC as well for hosting, and I hope to see many of you at SLUG18 at
CIEMAT in Madrid, Spain.
PDFs of the presentations are online at
http://slurm.schedm
On 19 October 2017 at 20:37, Chris Samuel wrote:
>
> On Thursday, 19 October 2017 7:41:37 PM AEDT Nadav Toledo wrote:
>
> > running : id -u domain_name\\username , does return its uid
>
> So your system is not finding users as just "username", but instead only as
> domain_name\\username which is