This is the first place I've had regularly scheduled maintenance, too,
and boy does it make life easier. In most of my previous jobs, it was a
small enough environment that it wasn't necessary.
On 3/22/19 1:57 PM, Christopher Samuel wrote:
On 3/22/19 10:31 AM, Prentice Bisbal wrote:
Most HPC centers have scheduled downtime on a regular basis.
That's not my experience before now; where I've worked in Australia we
scheduled maintenance only when we absolutely had to, but even those
could be delayed if there were critical
Rafael,
Most HPC centers have scheduled downtime on a regular basis. Typically
it's one day a month, but I know that at Argonne National Lab, which
is a DOE Leadership Computing Facility that houses some of the largest
supercomputers in the world for use by a large number of scientists,
they
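For reference, a regular maintenance window like that is usually blocked out in Slurm with a system reservation. A minimal sketch (the reservation name, start time and duration below are only placeholders):

# Reserve every node for a monthly maintenance window
scontrol create reservation reservationname=maint_2019_04 \
    starttime=2019-04-02T08:00:00 duration=12:00:00 \
    users=root flags=maint,ignore_jobs nodes=ALL

# Verify it before the window starts
scontrol show reservation maint_2019_04

Jobs that cannot finish before the reservation starts then simply stay pending instead of being killed mid-run.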
Hi all,
I think it's not that easy to keep SLURM up to date on a cluster of more
than 3k nodes with a lot of users. I mean, that cluster is only a little
more than 2 years old and my submission today got JOBID 12711473; the
queue has 9769 jobs (squeue | wc -l). In two years there were only
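Small aside: squeue prints a header line, so piping it straight into wc over-counts by one. The -h/--noheader flag avoids that, and -t narrows the count to particular states:

# count queued and running jobs without the header line
squeue -h | wc -l

# count only pending jobs
squeue -h -t PENDING | wc -l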
On 3/21/19 4:40 PM, Reuti wrote:
On 21.03.2019 at 16:26, Prentice Bisbal wrote:
On 3/20/19 1:58 PM, Christopher Samuel wrote:
On 3/20/19 4:20 AM, Frava wrote:
Hi Chris, thank you for the reply.
The team that manages that cluster is not very fond of upgrading SLURM, which I
understand.
A
Prentice Bisbal
Lead Software Engineer
Princeton Plasma Physics Laboratory
http://www.pppl.gov
On 3/21/19 12:21 PM, Loris Bennett wrote:
Hi Ryan,
Ryan Novosielski writes:
On Mar 21, 2019, at 11:26 AM, Prentice Bisbal wrote:
On 3/20/19 1:58 PM, Christopher Samuel wrote:
On 3/20/19 4:20 AM
On 3/21/19 11:49 AM, Ryan Novosielski wrote:
On Mar 21, 2019, at 11:26 AM, Prentice Bisbal wrote:
On 3/20/19 1:58 PM, Christopher Samuel wrote:
On 3/20/19 4:20 AM, Frava wrote:
Hi Chris, thank you for the reply.
The team that manages that cluster is not very fond of upgrading SLURM, which I
Hi Loris,
On 3/21/19 6:21 PM, Loris Bennett wrote:
Chris, maybe
you should look at EasyBuild
(https://easybuild.readthedocs.io/en/latest/). That way you can install
all the dependencies (such as zlib) as modules and be pretty much
independent of
> On 21.03.2019 at 16:26, Prentice Bisbal wrote:
>
>
> On 3/20/19 1:58 PM, Christopher Samuel wrote:
>> On 3/20/19 4:20 AM, Frava wrote:
>>
>>> Hi Chris, thank you for the reply.
>>> The team that manages that cluster is not very fond of upgrading SLURM,
>>> which I understand.
>
> As a sy
There are two kinds of system admins: "can do" and "can't do". You're a
"can do"; his are "can't do".
On 3/21/19 10:26 AM, Prentice Bisbal wrote:
>
> On 3/20/19 1:58 PM, Christopher Samuel wrote:
>> On 3/20/19 4:20 AM, Frava wrote:
>>
>>> Hi Chris, thank you for the reply.
>>> The team that manages that
On 3/21/19 9:21 AM, Loris Bennett wrote:
Chris, maybe you should look at EasyBuild
(https://easybuild.readthedocs.io/en/latest/). That way you can install
all the dependencies (such as zlib) as modules and be pretty much
independent of the ancient packages your distro may provide (other
software
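In case it helps, a minimal sketch of that EasyBuild route (the easyconfig name, install prefix and final module name below are examples and depend on the toolchain you build with):

# build zlib and its dependency chain as modules
eb zlib-1.2.11.eb --robot

# EasyBuild's default prefix is $HOME/.local/easybuild
module use $HOME/.local/easybuild/modules/all
module load zlib/1.2.11    # exact module name varies with the toolchain

After that, anything you build by hand can be compiled against the module's headers and libraries instead of the distro's ancient packages.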
Hi Ryan,
Ryan Novosielski writes:
>> On Mar 21, 2019, at 11:26 AM, Prentice Bisbal wrote:
>> On 3/20/19 1:58 PM, Christopher Samuel wrote:
>>> On 3/20/19 4:20 AM, Frava wrote:
>>>
Hi Chris, thank you for the reply.
The team that manages that cluster is not very fond of upgrading SLURM
> On Mar 21, 2019, at 11:26 AM, Prentice Bisbal wrote:
> On 3/20/19 1:58 PM, Christopher Samuel wrote:
>> On 3/20/19 4:20 AM, Frava wrote:
>>
>>> Hi Chris, thank you for the reply.
>>> The team that manages that cluster is not very fond of upgrading SLURM,
>>> which I understand.
>
> As a syste
On 3/20/19 1:58 PM, Christopher Samuel wrote:
On 3/20/19 4:20 AM, Frava wrote:
Hi Chris, thank you for the reply.
The team that manages that cluster is not very fond of upgrading
SLURM, which I understand.
As a system admin who manages clusters myself, I don't understand this.
Our job is
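For what it's worth, the upgrade itself is less scary than it sounds as long as the documented order is respected. A rough sketch of the usual sequence (package handling is site-specific, so those steps are left as comments):

# 0. Back up StateSaveLocation and the slurmdbd accounting database first.

# 1. Upgrade slurmdbd before anything else
systemctl stop slurmdbd
# ... install the new slurmdbd package here ...
systemctl start slurmdbd

# 2. Then the controller
systemctl stop slurmctld
# ... install the new slurmctld package here ...
systemctl start slurmctld

# 3. Finally slurmd on the compute nodes, which can be done as a rolling
#    restart; slurmd and slurmctld may differ by at most two major
#    releases during the transition.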
On 3/20/19 4:20 AM, Frava wrote:
Hi Chris, thank you for the reply.
The team that manages that cluster is not very fond of upgrading SLURM,
which I understand.
Do be aware that Slurm 17.11 will stop being maintained once 19.05 is
released in May.
So basically my heterogeneous job that only
Hi Chris, thank you for the reply.
The team that manages that cluster is not very fond of upgrading SLURM,
which I understand.
So basically my heterogeneous job, which only has one step, is considered
to have multiple steps, and that's a bug in SLURM 17.11.12?
On Wed, 20 Mar 2019 at 07:02, Chris Samuel wrote:
On Tuesday, 19 March 2019 2:03:27 PM PDT Frava wrote:
> I'm struggling to get a heterogeneous job to run...
> The SLURM version installed on the cluster is 17.11.12
Your Slurm is too old for this to work, you'll need to upgrade to 18.08.
I believe you can enable them with "enable_hetero_steps" o
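For the archives: if memory serves, on 18.08 that switch goes into SchedulerParameters in slurm.conf (please check the heterogeneous jobs page for your exact release before relying on this):

# slurm.conf -- append to any SchedulerParameters you already have
SchedulerParameters=enable_hetero_steps

# then push the change out
scontrol reconfigure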
Hi all,
I'm struggling to get a heterogeneous job to run...
The SLURM version installed on the cluster is 17.11.12
Here are the SBATCH file parameters of the job :
#!/bin/bash
#SBATCH --threads-per-core=1
#SBATCH --
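The script above was cut off, so purely as an illustration, here is the shape a two-component heterogeneous batch job takes on 17.11/18.08, with made-up resources and program names rather than the real ones from this job (newer releases renamed "packjob"/"--pack-group" to "hetjob"/"--het-group"):

#!/bin/bash
#SBATCH --threads-per-core=1
# component 0
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH packjob
# component 1
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=1

# a single step spanning both components (the colon separates them);
# on 18.08 this also needs SchedulerParameters=enable_hetero_steps
srun ./master : ./worker

# alternatively, one step per component:
# srun --pack-group=0 ./master &
# srun --pack-group=1 ./worker &
# wait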