It very much rang a bell!
I think there is also an scontrol command that you can use to show the
actual running config (probably “show config”), which will include the
defaults for anything you haven’t specified in the config file.
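For example (the grep pattern here is only an illustration):

    # dump the config slurmctld is actually running with and pick out the accounting bits
    scontrol show config | grep -i accountingstorage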
Sent from my iPhone
Well, you nailed it.
Honestly a little surprised it was working to begin with.
In the DBD conf
> -#DbdPort=7031
> +DbdPort=7031
And then in the slurm.conf
> -#AccountingStoragePort=3306
> +AccountingStoragePort=7031
I’m not sure how my slurm.conf showed the 3306 mysql port commented out.
I did
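For reference, a sketch of how the two files are usually expected to line
up (port numbers taken from the diff above; hostnames are placeholders):

    # slurmdbd.conf (on the slurmdbd host)
    DbdPort=7031            # port slurmdbd listens on for slurmctld connections
    StorageType=accounting_storage/mysql
    StorageHost=localhost   # placeholder: wherever MySQL/MariaDB runs
    StoragePort=3306        # the SQL port belongs here, not in slurm.conf

    # slurm.conf (on the controller)
    AccountingStorageType=accounting_storage/slurmdbd
    AccountingStorageHost=dbd.example.org   # placeholder
    AccountingStoragePort=7031              # must match DbdPort above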
Apologies for not having more concrete information available as I reply,
but I figured a quick hint might be better than nothing.
Have a look at how the various daemons communicate with one another. This
sounds to me like a firewall issue between, perhaps, the slurmctld and the
host where the slurmdbd runs.
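A quick way to test that theory (the hostname and port below are
placeholders based on this thread):

    # from the slurmctld host: can we reach the slurmdbd port at all?
    # (bash's /dev/tcp trick avoids depending on nc/telnet being installed)
    timeout 3 bash -c 'cat < /dev/null > /dev/tcp/dbd.example.org/7031' && echo reachable

    # on the slurmdbd host: is slurmdbd actually listening?
    ss -tlnp | grep 7031

    # CentOS 7 with firewalld: what does the firewall currently allow?
    firewall-cmd --list-all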
Hoping this is an easy answer.
My mysql instance somehow corrupted itself, and I’m having to purge and start
over.
This is ok, because the data in there isn’t too valuable, and we aren’t making
use of associations or anything like that yet (no AccountingStorageEnforce).
That said, I’ve decided
I think the problem might be that you are not requesting memory, so by default,
all memory on a node is allocated to the job and "cons_res" will not schedule a
second job on that node. That comes up quite often.
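A minimal illustration (the 2G / 2048 MB figures are arbitrary placeholders):

    #SBATCH --ntasks=12
    #SBATCH --mem-per-cpu=2G   # request memory explicitly so the rest of the node stays schedulable

    # or set a cluster/partition-wide default in slurm.conf:
    # DefMemPerCPU=2048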
Gareth
-----Original Message-----
From: slurm-users On Behalf Of Guillaume De Nayer
Good afternoon All,
In the spirit of scheduler ecumenicalism, I would like to invite folks
on this list to the (hybrid) European HTCondor workshop this October.
While the bulk of presentations will focus on dHTC workflows, the
HTCondor community does have a number of ongoing collaborations with
On 06/15/2022 05:25 PM, Ward Poelmans wrote:
> Hi Guillaume,
>
> On 15/06/2022 16:59, Guillaume De Nayer wrote:
>>
>> Perhaps I misunderstand the Slurm documentation...
>>
>> I thought that the --exclusive option used in combination with sbatch
>> will reserve the whole node (40 cores) for the job
Hi Guillaume,
On 15/06/2022 16:59, Guillaume De Nayer wrote:
Perhaps I misunderstand the Slurm documentation...
I thought that the --exclusive option used in combination with sbatch
will reserve the whole node (40 cores) for the job (submitted with
sbatch). This part is working fine. I can c
On 06/15/2022 03:53 PM, Frank Lenaerts wrote:
> On Wed, Jun 15, 2022 at 02:20:56PM +0200, Guillaume De Nayer wrote:
>> One colleague has to run 20,000 jobs on this machine. Every job starts
>> his program with mpirun on 12 cores. The standard Slurm behavior means
>> that the node which runs this jo
On Wed, Jun 15, 2022 at 02:20:56PM +0200, Guillaume De Nayer wrote:
> One colleague has to run 20,000 jobs on this machine. Every job starts
> his program with mpirun on 12 cores. The standard Slurm behavior means
> that the node which runs this job is blocked (and 28 cores are idle).
> The small c
On Wed, Jun 15, 2022 at 02:20:56PM +0200, Guillaume De Nayer wrote:
> In order to solve this problem I'm trying to start some subtasks with
> srun inside a batch job (without mpirun for now):
>
> #!/bin/bash
> #SBATCH --job-name=test_multi_prog_srun
> #SBATCH --nodes=1
> #SBATCH --partition=short
On 06/15/2022 02:48 PM, Tina Friedrich wrote:
> Hi Guillaume,
>
Hi Tina,
> in that example you wouldn't need the 'srun' to run more than one task,
> I think.
>
You are correct. To start a program like sleep I could simply run:
sleep 20s &
sleep 30s &
wait
However, my objective is to use mpirun.
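For reference, a sketch of one common pattern: several backgrounded srun
steps inside a single allocation, with srun standing in for mpirun as the
launcher (./mpi_app, the task counts and memory figures are placeholders,
and --exact assumes Slurm 21.08 or newer with srun-capable MPI):

    #!/bin/bash
    #SBATCH --job-name=multi_step
    #SBATCH --nodes=1
    #SBATCH --ntasks=24
    #SBATCH --cpus-per-task=1
    #SBATCH --mem-per-cpu=2G

    # Two 12-task steps run side by side in the same allocation.
    # --exact restricts each step to the CPUs it asked for, and the
    # per-step memory request keeps one step from grabbing it all.
    srun --exact -n 12 --mem-per-cpu=2G ./mpi_app &
    srun --exact -n 12 --mem-per-cpu=2G ./mpi_app &
    wait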
Hi Guillaume,
in that example you wouldn't need the 'srun' to run more than one task,
I think.
I'm not 100% sure, but to me it sounds like you're currently assigning
whole nodes to jobs rather than cores (i.e. have
'SelectType=select/linear' and no OverSubscribe) and find that to be
wasteful
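If that is indeed the case, a minimal slurm.conf sketch of the per-core
alternative looks something like this (illustrative only; changing
SelectType requires restarting the Slurm daemons):

    # slurm.conf
    SelectType=select/cons_tres          # schedule individual cores/memory, not whole nodes
    SelectTypeParameters=CR_Core_Memory  # treat cores and memory as consumable resources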
Dear all,
I'm new on this list. I am responsible for several small clusters at our
chair.
I set up slurm 21.08.8-2 on a small cluster (CentOS 7) with 8 nodes:
NodeName=node0[1-8] CPUs=40 Boards=1 SocketsPerBoard=2 CoresPerSocket=20
ThreadsPerCore=1
One colleague has to run 20,000 jobs on this machine