Hello everyone,
Does anyone know of a way to get the number of idle GPUs per node or for
the whole cluster?
sinfo -o %G gives the total amount of GRES for each node.
Is there a way to get the idle amount, as you can for CPUs
(%C)?
Perhaps if one use
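
One possible approach to the question above, as a rough sketch only: subtract the GPUs in use from the configured GPUs for each node. It assumes one-line node records carrying NodeName=, Gres=gpu[:type]:N and GresUsed=gpu[:type]:M fields, as some Slurm versions print with scontrol -d -o show node. Field names and formats vary between versions, so the patterns may need adjusting, and only a single GPU entry per node is handled. The function name idle_gpus is made up for illustration.

```shell
# Sketch: report idle GPUs per node and in total from node records on stdin.
# ASSUMPTION: each record has NodeName=..., Gres=gpu[:type]:N and
# GresUsed=gpu[:type]:M(IDX:...) fields; adjust if your version differs.
idle_gpus() {
  awk '
    function count(field,    n, parts) {
      sub(/\(.*/, "", field)        # drop a trailing "(IDX:...)" part
      n = split(field, parts, ":")
      return parts[n] + 0           # last colon-separated item is the count
    }
    {
      name = ""; total = 0; used = 0
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^NodeName=/)         name  = substr($i, 10)
        else if ($i ~ /^Gres=gpu/)     total = count(substr($i, 6))
        else if ($i ~ /^GresUsed=gpu/) used  = count(substr($i, 10))
      }
      if (name != "") { print name, total - used; sum += total - used }
    }
    END { print "TOTAL", sum + 0 }
  '
}
```

On a live cluster this might be fed with: scontrol -d -o show node | idle_gpus
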
Many thanks Matthieu!
Andy
On 02/12/2018 06:42 PM, Matthieu Hautreux wrote:
Hi,
your login node may have a heavy load while starting such a large
number of independent sruns.
This may induce issues not seen under normal load, like partial
read/write on sockets, triggering bugs in slurm, f
Hi,
your login node may have a heavy load while starting such a large number of
independent sruns.
This may induce issues not seen under normal load, like partial read/write
on sockets, triggering bugs in slurm, for functions not properly protected
against such events.
Quickly looking at the sou
We have a user who wants to run multiple instances of a single process
job across a cluster, using a loop like
-
for N in $nodelist; do
    srun -w $N program &
done
wait
-
This works up to a thousand nodes or so (jobs are allocated by node
here), but as the number of jobs submitted i
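
If the failures do come from the burst of simultaneous srun launches on the login node, as suggested in the reply, one workaround to try is to stagger the launches. This is a minimal sketch, not a confirmed fix: launch_staggered, its batch and pause arguments, and the run_on wrapper in the usage note are illustrative names, not Slurm features.

```shell
# Sketch: start one background command per node in $nodelist, pausing
# after every BATCH launches so they do not all start at the same moment.
# Usage: launch_staggered BATCH PAUSE_SECONDS COMMAND [ARGS...]
# COMMAND is called with the node name appended as its last argument.
launch_staggered() {
  batch=$1; pause=$2; shift 2
  count=0
  for node in $nodelist; do
    "$@" "$node" &
    count=$((count + 1))
    # Pause once a full batch has been started.
    if [ $((count % batch)) -eq 0 ]; then
      sleep "$pause"
    fi
  done
  wait   # block until every background command has finished
}
```

With the loop from the post, one might define run_on() { srun -w "$1" program; } and then call launch_staggered 100 1 run_on, which starts 100 sruns, sleeps one second, and continues.
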
We recently brought a new cluster online with the desire to federate it with
our existing cluster. See the full story here:
https://bugs.schedmd.com/show_bug.cgi?id=4512
There are some fairly large limitations to federation, the biggest of which
(for us anyway) was:
> The current implementati
On 2018-02-12 11:37, Fabien ELOY wrote:
Hello,
I am trying to set priority ... but it doesn't work !
If I type sudo srun --priority=X, it's OK. But if I use my "standard"
user it's not OK (priority calculated by slurm).
I do not have a database used with SLURM.
In my slurm.conf, "SlurmUs
Fabien ELOY writes:
>> 2018-02-12 11:51 GMT+01:00 Loris Bennett :
>>
>> Hi Fabien,
>>
>> Fabien ELOY writes:
>>
>> > Hello,
>> >
>> > I am trying to set priority ... but it doesn't work !
>> >
>> > If I type sudo srun --priority=X, it's OK. But if I use my "standard"
>> user it's not OK
Hi Loris,
Thank you for your reply.
SLURM jobs are submitted by a Java application and there is only one SLURM
user.
Should we use another plugin (not the multifactor plugin)? Is there a way to
fix user rights?
Below my slurm.conf ("anonymized") :
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
Sl
Hi Fabien,
Fabien ELOY writes:
> Hello,
>
> I am trying to set priority ... but it doesn't work !
>
> If I type sudo srun --priority=X, it's OK. But if I use my "standard" user
> it's not OK (priority calculated by slurm).
>
> I do not have a database used with SLURM.
>
> In my slurm.conf, "Slu
Hello,
I am trying to set priority ... but it doesn't work !
If I type sudo srun --priority=X, it's OK. But if I use my "standard" user
it's not OK (priority calculated by slurm).
I do not have a database used with SLURM.
In my slurm.conf, "SlurmUser=slurm" and my server has 2 users in the sam
Hi all,
I was wondering if any of you can share your insights regarding
federations. What unexpected caveats have you encountered?
We have about 15 "small" clusters (due to political and
technical reasons), and most users have access to more than one
cluster. Federation seems like a g
11 matches