Hello Veronique,
What is the value of innodb_buffer_pool_size in my.cnf? (assuming
you're using MariaDB)
Don't hesitate to set it to a few GB, ideally a little more than the
size of your DB, if you have enough memory on the server. This improves
the overall performance of the database.
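As an illustration, a minimal sketch of the relevant my.cnf section; the
4G figure is only an example and should be sized to your database and the
memory available on the server:

    [mysqld]
    # buffer pool a little larger than the database itself (example value)
    innodb_buffer_pool_size = 4G

Restart MariaDB after changing this so the new setting takes effect.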
A workaround is to pre-configure future nodes and mark them as down - then when
you add them you can just mark them as up.
(see the DownNodes parameter)
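A rough slurm.conf sketch of that idea; the node names and hardware values
below are hypothetical:

    # hardware that is not installed yet, defined up front but kept down
    NodeName=compute[005-008] CPUs=16 RealMemory=64000 State=UNKNOWN
    DownNodes=compute[005-008] State=DOWN Reason="not yet installed"

    # when a node actually arrives, bring it into service with:
    #   scontrol update NodeName=compute005 State=RESUME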
Hope this helps!
Merlin
--
Merlin Hartley
Computer Officer
MRC Mitochondrial Biology Unit
Cambridge, CB2 0XY
United Kingdom
> On 22 Oct 2017,
Hello SLURM aficionados,
I would like to know if it's possible to restrict node/partition
utilization depending on the resources users ask for, and to restrict the
resources users can request. For example, I would like each user to be able
to request only a certain amount of RAM for their jobs and not to specify
other resources.
I have added nodes to an existing partition several times using the same
procedure which you describe, and no bad side effects have been noticed.
This is a very normal kind of operation in a cluster, where hardware
may be added or retired from time to time, while the cluster of course
continues to run.
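As an illustration, adding a node usually just means extending the relevant
lines in slurm.conf (node names and sizes below are hypothetical):

    # compute004 added to the existing node definition and to the partition
    NodeName=compute[001-004] CPUs=16 RealMemory=64000 State=UNKNOWN
    PartitionName=batch Nodes=compute[001-004] Default=YES MaxTime=INFINITE State=UP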
Ole Holm Nielsen writes:
> I have added nodes to an existing partition several times using the same
> procedure which you describe, and no bad side effects have been noticed. This
> is a very normal kind of operation in a cluster, where hardware may be added
> or retired from time to time, while the cluster of course continues to run.
Hi Jin,
Your slurmctld.log says "Node compute004 appears to have a different
slurm.conf than the slurmctld" etc. This happens if slurm.conf was not
copied correctly to the nodes. Please correct this potential error.
Also, please specify which version of Slurm you're running.
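For example, one way to check both things from the head node (the paths
and node name here are assumptions, adjust them to your setup):

    # compare the controller's slurm.conf with a node's copy
    md5sum /etc/slurm/slurm.conf
    ssh compute004 md5sum /etc/slurm/slurm.conf

    # report the Slurm version
    sinfo --version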
/Ole
Hi
Thanks everyone for your response. I have also tested my setup to remove
nodes from the cluster, and the same thing happens.
*To answer some of the previous questions.*
"Node compute004 appears to have a different slurm.conf than the slurmctld"
error comes up when I replace slurm.conf in all t
Hi Jin,
I think that I always do your steps 3,4 in the opposite order: Restart
slurmctld, then slurmd on nodes:
> 3. Restart the slurmd on all nodes
> 4. Restart the slurmctld
Since you run a very old Slurm 15.08, perhaps you should upgrade 15.08
-> 16.05 -> 17.02. Soon there will be a 17.11 release as well.
The reason for restarting slurmctld before slurmd on the nodes is Moe
Jette's advice in
http://thread.gmane.org/gmane.comp.distributed.slurm.devel/3039
I would recommend (a command sketch follows this list):
1. Stop slurmctld
2. Update slurm.conf on all nodes
3. Restart slurmctld
4. Start slurmd on the new nodes
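A shell sketch of those four steps, assuming systemd units; the node names
and config path are placeholders for your own:

    # 1. stop the controller
    systemctl stop slurmctld

    # 2. push the updated slurm.conf to every node
    for n in compute001 compute002 compute003 compute004; do
        scp /etc/slurm/slurm.conf $n:/etc/slurm/slurm.conf
    done

    # 3. start the controller again
    systemctl start slurmctld

    # 4. start slurmd on the new nodes only
    ssh compute004 systemctl start slurmd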
/Ole
On 10/23/2017
There is an earlier thread related to this:
https://groups.google.com/forum/#!searchin/slurm-devel/gres$20gpu$20oversubscribe%7Csort:date/slurm-devel/WPmkNPedKeM/r7EDvX7jujgJ
On Sat, Oct 21, 2017 at 10:58 PM, Chaofeng Zhang
wrote:
> CUDA supports it; the GPU is in shared mode by default, so we can have more than one job on the same GPU.
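For reference, the compute mode the quoted message refers to can be
inspected and changed with nvidia-smi (the device index 0 is just an
example):

    # show the current compute mode of GPU 0
    nvidia-smi -i 0 --query-gpu=compute_mode --format=csv

    # "Default" lets several processes share the GPU;
    # "Exclusive_Process" restricts it to one process at a time
    nvidia-smi -i 0 -c EXCLUSIVE_PROCESS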
The deeper I dig into the select/cons_res plugin, the more of a mess it
appears to be: inconsistencies with the documentation, etc.
The primary issue seems to be that the select/cons_res node selection does
not honour "--ntasks-per-node" et al. By default, the algorithm selects
"--nodes=N" nodes ...
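For context, a hedged example of the kind of submission being discussed,
where the task placement flags should constrain node selection (job.sh is a
placeholder):

    # ask for 2 nodes with exactly 8 tasks on each, 2 GB per CPU
    sbatch --nodes=2 --ntasks-per-node=8 --mem-per-cpu=2G job.sh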