Re: [slurm-users] RPM build error - accounting_storage_mysql.so

2019-11-12 Thread Ole Holm Nielsen
Hi Daniel, Thanks for sharing your insights! I have updated my Wiki page https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#install-mariadb-database now. /Ole On 11/12/19 8:52 AM, Daniel Letai wrote: On 11/12/19 9:34 AM, Ole Holm Nielsen wrote: On 11/11/19 10:14 PM, Daniel Letai wrote

Re: [slurm-users] RPM build error - accounting_storage_mysql.so

2019-11-12 Thread William Brown
Thanks to all for the useful suggestions. I guess the main point is to read the updated MariaDB instructions; I had looked at the Release Notes, but what seems to have changed is the packaging, so that MariaDB-shared is now required if using -lmariadb in the build. I can see that it is likely also p

Re: [slurm-users] oom-kill events for no good reason

2019-11-12 Thread David Baker
Hello, Thank you all for your useful replies. I double-checked that the oom-killer "fires" at the end of every job on our cluster. As you mention, this isn't significant and not something to be concerned about. Best regards, David From: slurm-users on behalf of

[slurm-users] Replace SGE by Slurm on running cluster

2019-11-12 Thread Nguyen Dai Quy
Hi list, We have a small HPC Linux cluster (CentOS 7, xCAT,...) with 8 nodes, currently running SGE. We would like to replace SGE with Slurm. Do you have any experience with this kind of work? Thank you,

[slurm-users] Profiling plugin for prometheus

2019-11-12 Thread Alexandre Larouche
Hey there, I'm just letting you know that I've worked on a profiling plugin for Prometheus. The code is hosted on GitHub: https://github.com/Quoding/acct_gather_profile_prometheus . It is not thoroughly tested and the tests I've done were on Slurm 17. I developed it to use in another project but

[slurm-users] Combining Preemption and cons_res

2019-11-12 Thread Anthony Ruth
Hello, I recently changed our slurm.conf file to allow for job preemption. While making this change, I also chose to use select/cons_res to try and understand how preemption would interact with our future upgrade which will include GPUs. The code we will run on the GPUs can only use a single C

Re: [slurm-users] Replace SGE by Slurm on running cluster

2019-11-12 Thread William Brown
In my last role we moved from SGE to Slurm. However we did this by using VMs for all the control, login, slurmDBD and MariaDB nodes, so it was easy enough to build a Slurm cluster up to the point where it needed compute nodes. We then removed compute nodes in groups from SGE, reinstalled w

Re: [slurm-users] Replace SGE by Slurm on running cluster

2019-11-12 Thread Nguyen Dai Quy
On Tue, Nov 12, 2019 at 9:26 PM William Brown wrote: > In my last role we moved from SGE to Slurm. > > > > However we did this by using VMs for all the control, login, slurmDBD and > MariaDB nodes, so it was easy enough to build a Slurm cluster up to the > point where it needed compute nodes. W