Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-22 Thread Prentice Bisbal
This is the first place I've had regularly scheduled maintenance, too, and boy does it make life easier. In most of my previous jobs, it was a small enough environment that it wasn't necessary. On 3/22/19 1:57 PM, Christopher Samuel wrote: On 3/22/19 10:31 AM, Prentice Bisbal wrote: Most HP

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-22 Thread Christopher Samuel
On 3/22/19 10:31 AM, Prentice Bisbal wrote: Most HPC centers have scheduled downtime on a regular basis. That hasn't been my experience before now; where I've worked in Australia, we scheduled maintenance only when we absolutely had to, but it could be delayed if there were critica

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-22 Thread Prentice Bisbal
Rafael, Most HPC centers have scheduled downtime on a regular basis. Typically it's one day a month, but I know that at Argonne National Lab, which is a DOE Leadership Computing Facility that houses some of the largest supercomputers in the world for use by a large number of scientists, they

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-22 Thread Frava
Hi all, I don't think it's that easy to keep SLURM up to date in a cluster of more than 3k nodes with a lot of users. I mean, that cluster is only a little more than 2 years old and my submission today got the JOBID 12711473; the queue has 9769 jobs (squeue | wc -l). In two years there were only

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-21 Thread Prentice Bisbal
On 3/21/19 4:40 PM, Reuti wrote: On 21.03.2019 at 16:26, Prentice Bisbal wrote: On 3/20/19 1:58 PM, Christopher Samuel wrote: On 3/20/19 4:20 AM, Frava wrote: Hi Chris, thank you for the reply. The team that manages that cluster is not very fond of upgrading SLURM, which I understand. A

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-21 Thread Prentice Bisbal
Prentice Bisbal Lead Software Engineer Princeton Plasma Physics Laboratory http://www.pppl.gov On 3/21/19 12:21 PM, Loris Bennett wrote: Hi Ryan, Ryan Novosielski writes: On Mar 21, 2019, at 11:26 AM, Prentice Bisbal wrote: On 3/20/19 1:58 PM, Christopher Samuel wrote: On 3/20/19 4:20 AM

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-21 Thread Prentice Bisbal
On 3/21/19 11:49 AM, Ryan Novosielski wrote: On Mar 21, 2019, at 11:26 AM, Prentice Bisbal wrote: On 3/20/19 1:58 PM, Christopher Samuel wrote: On 3/20/19 4:20 AM, Frava wrote: Hi Chris, thank you for the reply. The team that manages that cluster is not very fond of upgrading SLURM, which I

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-21 Thread Daniel Letai
Hi Loris, On 3/21/19 6:21 PM, Loris Bennett wrote: Chris, maybe you should look at EasyBuild (https://easybuild.readthedocs.io/en/latest/). That way you can install all the dependencies (such as zlib) as modules and be pretty much independent of

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-21 Thread Reuti
> On 21.03.2019 at 16:26, Prentice Bisbal wrote: > > > On 3/20/19 1:58 PM, Christopher Samuel wrote: >> On 3/20/19 4:20 AM, Frava wrote: >> >>> Hi Chris, thank you for the reply. >>> The team that manages that cluster is not very fond of upgrading SLURM, >>> which I understand. > > As a sy

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-21 Thread Goetz, Patrick G
There are 2 kinds of system admins: can do and can't do. You're a can do; his are can't do. On 3/21/19 10:26 AM, Prentice Bisbal wrote: > > On 3/20/19 1:58 PM, Christopher Samuel wrote: >> On 3/20/19 4:20 AM, Frava wrote: >> >>> Hi Chris, thank you for the reply. >>> The team that manages that

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-21 Thread Christopher Samuel
On 3/21/19 9:21 AM, Loris Bennett wrote: Chris, maybe you should look at EasyBuild (https://easybuild.readthedocs.io/en/latest/). That way you can install all the dependencies (such as zlib) as modules and be pretty much independent of the ancient packages your distro may provide (other softwar

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-21 Thread Loris Bennett
Hi Ryan, Ryan Novosielski writes: >> On Mar 21, 2019, at 11:26 AM, Prentice Bisbal wrote: >> On 3/20/19 1:58 PM, Christopher Samuel wrote: >>> On 3/20/19 4:20 AM, Frava wrote: >>> Hi Chris, thank you for the reply. The team that manages that cluster is not very fond of upgrading SLUR

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-21 Thread Ryan Novosielski
> On Mar 21, 2019, at 11:26 AM, Prentice Bisbal wrote: > On 3/20/19 1:58 PM, Christopher Samuel wrote: >> On 3/20/19 4:20 AM, Frava wrote: >> >>> Hi Chris, thank you for the reply. >>> The team that manages that cluster is not very fond of upgrading SLURM, >>> which I understand. > > As a syste

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-21 Thread Prentice Bisbal
On 3/20/19 1:58 PM, Christopher Samuel wrote: On 3/20/19 4:20 AM, Frava wrote: Hi Chris, thank you for the reply. The team that manages that cluster is not very fond of upgrading SLURM, which I understand. As a system admin who manages clusters myself, I don't understand this. Our job is

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-20 Thread Christopher Samuel
On 3/20/19 4:20 AM, Frava wrote: Hi Chris, thank you for the reply. The team that manages that cluster is not very fond of upgrading SLURM, which I understand. Do be aware that Slurm 17.11 will stop being maintained once 19.05 is released in May. So basically my heterogeneous job that only

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-20 Thread Frava
Hi Chris, thank you for the reply. The team that manages that cluster is not very fond of upgrading SLURM, which I understand. So basically my heterogeneous job that only has one step is considered to have multiple steps, and that's a bug in SLURM 17.11.12? On Wed, 20 Mar 2019 at 07:02, Chris Sa

Re: [slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-19 Thread Chris Samuel
On Tuesday, 19 March 2019 2:03:27 PM PDT Frava wrote: > I'm struggling to get a heterogeneous job to run... > The SLURM version installed on the cluster is 17.11.12 Your Slurm is too old for this to work; you'll need to upgrade to 18.08. I believe you can enable them with "enable_hetero_steps" o

[slurm-users] SLURM heterogeneous jobs, a little help needed plz

2019-03-19 Thread Frava
Hi all, I'm struggling to get a heterogeneous job to run... The SLURM version installed on the cluster is 17.11.12. Here are the SBATCH file parameters of the job: #!/bin/bash #SBATCH --threads-per-core=1 #SBATCH --