Hi David,
I haven't had time to look into your current problem, but inline I have
some comments about the general approach.
David Baker writes:
> Hello,
>
> Could someone please give me some advice on setting up the fairshare
> in a cluster. I don't think the present setup is wildly incorrect,
> however either my understanding of the setup is wrong or something is
> misconfigured.
On 6/6/19 12:01 PM, Kilian Cavalotti wrote:
> Levi did already.
Aha, race condition between searching bugzilla and writing the email. ;-)
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA
On Thu, Jun 6, 2019 at 11:16 AM Christopher Samuel wrote:
> Sounds like a good reason to file a bug.
Levi did already. Everybody can vote at
https://bugs.schedmd.com/show_bug.cgi?id=7191 :)
Cheers,
--
Kilian
On 6/6/19 10:21 AM, Levi Morrison wrote:
> This means all OpenMPI programs that end up calling `srun` on Slurm
> 19.05 will fail.
Sounds like a good reason to file a bug. We're not on 19.05 yet so
we're not affected (yet) but this may cause us some pain when we get to
that point (though at leas
Slurm 19.05 removed support for `--cpu_bind`, which is what all released
versions of OpenMPI are using when they call into srun. This issue was
fixed 24 days ago in [OpenMPI's git repo][1].
This means all OpenMPI programs that end up calling `srun` on Slurm
19.05 will fail.
This enormous amo
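Concretely, the breakage is just the spelling of one option; a minimal illustration (the application name and task count below are placeholders, not from the thread):

```shell
# OpenMPI's launcher emits the old underscore spelling, dropped in 19.05:
srun --cpu_bind=cores -n 4 ./a.out    # fails on Slurm 19.05

# Slurm 19.05 only accepts the dash spelling:
srun --cpu-bind=cores -n 4 ./a.out
```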
Hello,
Could someone please give me some advice on setting up the fairshare in a
cluster. I don't think the present setup is wildly incorrect, however either my
understanding of the setup is wrong or something is misconfigured.
When we set a new user up on the cluster and they haven't used any
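The new-user case is worth checking against the formula itself: the classic multifactor fair-share factor is F = 2^(-U/S), with U the association's decayed effective usage and S its normalized shares, so a user with no usage should start at the maximum factor of 1. A quick arithmetic check (the share numbers are invented for illustration):

```shell
# New user, zero usage against a 10% normalized share: factor is the maximum, 1
awk 'BEGIN { print 2^(-0.0/0.1) }'
# User who has consumed exactly their share: factor halves to 0.5
awk 'BEGIN { print 2^(-0.1/0.1) }'
```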
Anyone know what would happen to running jobs if we switch to cgroups? We
missed getting this set when we had a general cluster shutdown and want to get
it set but do have running jobs at the moment. Thanks
Deborah Crocker, PhD
Systems Engineer III
Office of Information Technology
The University
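Not an answer to the running-jobs question, but for reference the switch itself is these settings (parameter names from the slurm.conf and cgroup.conf man pages; a minimal sketch, not a complete config):

```shell
# slurm.conf: track and bind tasks via cgroups
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup

# cgroup.conf: enforce the limits jobs requested
ConstrainCores=yes
ConstrainRAMSpace=yes
```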
Hello,
We have tried to compile it in two ways. Initially we compiled it with
pmix as follows:
rpmbuild -ta slurm-19.05.0.tar.bz2 --define '_with_pmix --with-pmix=/opt/pmix/3.1.2/'
But we have also tried compiling it without pmix:
rpmbuild -ta slurm-19.05.0.tar.bz2
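Whichever way it was built, the installed binaries will report which PMI flavours they actually support, which is a quick way to check the result:

```shell
# Lists the MPI plugin types this Slurm build offers (e.g. none, pmi2, pmix)
srun --mpi=list
```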
How did you compile SLURM? Did you add the contribs/pmi and/or contribs/pmi2
plugins to the install? Or did you use PMIx?
Sean
--
Sean Crosby
Senior DevOpsHPC Engineer and HPC Team Lead | Research Platform Services
Research Computing | CoEPP | School of Physics
University of Melbourne
On Thu,
Hello,
Yes, we have recompiled OpenMPI with integration with SLURM 19.05 but
the problem remains.
We have also tried to recompile OpenMPI without integration with SLURM.
In this case executions fail with srun, but with mpirun it continues to
work in SLURM 18.08 and fails in 19.05 in the same
Hi Andrés,
Did you recompile OpenMPI after updating to SLURM 19.05?
Sean
--
Sean Crosby
Senior DevOpsHPC Engineer and HPC Team Lead | Research Platform Services
Research Computing | CoEPP | School of Physics
University of Melbourne
On Thu, 6 Jun 2019 at 20:11, Andrés Marín Díaz wrote:
Hi all,
I have installed udocker on our HPC infrastructure. I've been testing and I
found a very strange behaviour regarding memory consumption. If I launch a
job with a memory reservation equal to or higher than 4GB, any udocker container
can bypass this limit and use all the memory available in the n
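One hypothesis worth checking: if memory is only enforced by the polling accounting plugin, processes can outrun the sampler, whereas cgroup enforcement is done by the kernel. The relevant cgroup.conf lines would be (a sketch; parameter names from the cgroup.conf man page):

```shell
# cgroup.conf: have the kernel, not a poller, enforce the job's memory request
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
AllowedRAMSpace=100    # cap at 100% of the requested memory
```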
Hi,
I'm having problems trying to remove a wckey associated with a user account.
According to the documentation it should be simply a case of 'sacctmgr del user
wckey=' but when I try it doesn't seem to like it.
An example for a user called user1 ...
# sacctmgr add user user1 wckey=test1
WCKe
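For comparison, the add/remove pair as the sacctmgr man page describes it (user1/test1 are the example names from above; treat the delete form as nominal, since that is exactly what seems not to work here):

```shell
sacctmgr add user user1 wckey=test1
sacctmgr delete user user1 wckey=test1
```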
Thank you very much for the help; here is some updated information.
- If we use Intel MPI (IMPI) mpirun it works correctly.
- If we use mpirun without using the scheduler it works correctly.
- If we use srun with software compiled with OpenMPI it works correctly.
- If we use SLURM 18.08.6 it works corre