Hi there,
We've started looking at moving to the openmpi 1.8 branch from 1.6 on our
CentOS6/Son of Grid Engine cluster and noticed an unexpected difference
when binding multiple cores to each rank.
Has openmpi's definition of 'slot' changed between 1.6 and 1.8? It used
to mean ranks, but now it ...
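For illustration, the 1.8-style way of asking for several cores per
rank is the PE modifier; a sketch, with the rank count, cores-per-rank
and binary name all invented:

  $ mpirun -np 4 --map-by socket:PE=4 --bind-to core ./a.out
  # 4 ranks, each bound to 4 cores of one socket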
Hi,
While commissioning a new cluster, I wanted to run HPL across the whole
thing using openmpi 2.0.1.
I couldn't get it to start on more than 129 hosts under Son of Gridengine
(128 remote plus the localhost running the mpirun command). openmpi would
sit there, waiting for all the orted's to ...
... another. That of course assumes that qrsh is in the same location
on all nodes.
I've tested that it is possible to qrsh from the head node of a job to a slave
node and then on to
another slave node by this method.
William
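The hop test William describes can be sketched like this from inside a
running job (node names invented; -inherit is the flag Open MPI's
gridengine support passes to qrsh):

  $ qrsh -inherit node001 qrsh -inherit node002 hostname

If the tree-based startup itself is the suspect, the orteds can all be
launched directly from the mpirun node instead:

  $ mpirun --mca plm_rsh_no_tree_spawn 1 ...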
On Jan 17, 2017, at 9:37 AM, Mark Dixon wrote:
Hi,
While commissioning a new cluster, ...
Hi,
Just tried upgrading from 2.0.1 to 2.0.2 and I'm getting error messages
that look like openmpi is using ssh to login to remote nodes instead of
qrsh (see below). Has anyone else noticed gridengine integration being
broken, or am I being dumb?
I built with "./configure
--prefix=/apps/dev ...
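A quick way to see which launch agent a given build actually selects
(a sketch; any trivial payload command will do):

  $ mpirun --mca plm_base_verbose 10 -np 2 hostname
  # the verbose output shows whether qrsh or ssh/rsh is being used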
On Fri, 3 Feb 2017, Reuti wrote:
...
SGE on its own is not configured to use SSH? (I mean the entries in
`qconf -sconf` for rsh_command and rsh_daemon.)
...
Nope, everything left as the default:
$ qconf -sconf | grep _command
qlogin_command builtin
rlogin_command builtin
On Fri, 3 Feb 2017, r...@open-mpi.org wrote:
I do see a diff between 2.0.1 and 2.0.2 that might have a related
impact. The way we handled the MCA param that specifies the launch agent
(ssh, rsh, or whatever) was modified, and I don’t think the change is
correct. It basically says that we don’t ...
On Mon, 6 Feb 2017, Mark Dixon wrote:
...
Ah-ha! "-mca plm_rsh_agent foo" fixes it!
Thanks very much - presumably I can stick that in the system-wide
openmpi-mca-params.conf for now.
...
Except if I do that, it means running ompi outside of the SGE environment
no longer works :(
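One possible workaround, not from the thread itself: set the parameter
per job instead of system-wide, e.g. by exporting the equivalent
environment variable from inside SGE job scripts only:

  # in the SGE job script, before calling mpirun; "foo" is the
  # placeholder value that triggered the working code path above
  export OMPI_MCA_plm_rsh_agent=foo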
Hi,
When combining OpenMPI 2.0.2 with OpenMP, I'm interested in launching a
number of ranks and allocating a number of cores to each rank. Using
"-map-by socket:PE=N", switching to "-map-by node:PE=N" if I want
to allocate more than a single socket to a rank, seems to do what I want.
Except for ...
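Concretely, the pattern being described looks something like this
(rank counts, core counts and binary name are invented):

  $ OMP_NUM_THREADS=7  mpirun -np 4 --map-by socket:PE=7  ./hybrid
  $ OMP_NUM_THREADS=14 mpirun -np 2 --map-by node:PE=14   ./hybrid
  # the second form lets a rank's cores span more than one socket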
On Wed, 15 Feb 2017, r...@open-mpi.org wrote:
Ah, yes - I know what the problem is. We weren’t expecting a PE value of
1 - the logic is looking expressly for values > 1 as we hadn’t
anticipated this use-case.
Is it a sensible use-case, or am I crazy?
I can make that change. I’m off to a work...
Hi,
We have some users who would like to try out openmpi MPI_THREAD_MULTIPLE
support on our InfiniBand cluster. I am wondering if we should enable it
on our production cluster-wide version, or install it as a separate "here
be dragons" copy.
I seem to recall openmpi folk cautioning that MPI_THREAD_MULTIPLE ...
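If it does end up as a separate "here be dragons" copy, the build
difference is a single configure switch (the prefix is invented;
--enable-mpi-thread-multiple is the relevant option in the 1.10/2.0
series):

  $ ./configure --prefix=/apps/test/openmpi-mt \
                --enable-mpi-thread-multiple
  $ make -j 8 && make install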
On Fri, 17 Feb 2017, r...@open-mpi.org wrote:
Mark - this is now available in master. Will look at what might be
required to bring it to 2.0
Thanks Ralph,
To be honest, since you've given me an alternative, there's no rush from
my point of view.
The logic's embedded in a script and it's be...
On Fri, 17 Feb 2017, r...@open-mpi.org wrote:
Depends on the version, but if you are using something in the v2.x
range, you should be okay with just one installed version
Thanks Ralph.
How good is MPI_THREAD_MULTIPLE support these days and how far up the
wishlist is it, please?
We don't ge...
Hi,
I'm still trying to figure out how to express the core binding I want to
openmpi 2.x via the --map-by option. Can anyone help, please?
I bet I'm being dumb, but it's proving tricky to achieve the following
aims (most important first):
1) Maximise memory bandwidth usage (e.g. load balancing ...)
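Whatever policy turns out to express those aims, --report-bindings
shows what any candidate --map-by actually does; a sketch with
invented values:

  $ mpirun -np 8 --map-by socket:PE=2 --report-bindings true
  # each rank reports its binding mask, e.g. [B/B/./.][./././.]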
On Fri, 3 Mar 2017, Paul Kapinos wrote:
...
Note that on the 1.10.x series (even on 1.10.6), enabling
MPI_THREAD_MULTIPLE leads to a (silent) shutdown of the InfiniBand
fabric for that application => SLOW!
2.x versions (tested: 2.0.1) handle MPI_THREAD_MULTIPLE on InfiniBand
the right way, ...
... will it make it into a release, please?
Thanks,
Mark
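A crude check for that silent InfiniBand shutdown on a given build
(a sketch; ./a.out stands for any MPI_THREAD_MULTIPLE test code):

  $ mpirun -np 2 --mca btl_base_verbose 100 ./a.out 2>&1 | grep -i openib
  # if the openib BTL never appears, traffic has fallen back elsewhere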
On Mon, 29 Nov 2010, Jeff Squyres wrote:
There's work going on right now to update the ROMIO in the OMPI v1.5
series. We hope to include it in v1.5.2.
Cheers Jeff :)
Mark
... disabled by default, please?
I'm a bit puzzled, as this default seems in conflict with the whole
"Law of Least Astonishment" thing. Have I missed some disaster that's
going to happen?
Thanks,
Mark
... in trunk, but seem to require you to at least ask for
"--enable-opal-multi-threads".
Are we supposed to be able to use MPI_THREAD_FUNNELED by default or not?
Best wishes,
Mark
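For what it's worth, the thread level a build was compiled with can be
read straight out of ompi_info:

  $ ompi_info | grep -i thread
  # expect a line like: Thread support: posix (MPI_THREAD_MULTIPLE: no, ...)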
... in a failed attempt to get rid of the messages:
$ cat /etc/modprobe.d/libmlx4_local.conf
options mlx4_core log_num_mtt=24 log_mtts_per_seg=3 log_num_srq=20
Any thoughts?
Thanks,
Mark
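For reference, the usual formula for how much memory those mlx4
settings let the HCA register is
2^log_num_mtt * 2^log_mtts_per_seg * page_size; with the values above
and 4 KiB pages:

  $ echo $(( (1 << 24) * (1 << 3) * 4096 / (1 << 30) ))   # GiB
  512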
... knem?
Thanks,
Mark
All the best,
Mark
... not see any answer to this in the FAQ or list archives.
I've attached files showing the output of configure and my environment to
this message.
Is this expected?
Thanks,
Mark
... -lpthread
NOTICE: Invoking /apps/compilers/sunstudio/12_200709/1/sunstudio12/bin/f90
-f77 -ftrap=%none conftestf.f conftest.o -o conftest -lnsl -lutil -lm -lpthread
conftestf.f:
MAIN fpthread:
conftest.o: In function `pthreadtest_':
conftest.c:(.text+0x41): undefined reference to ...
Cheers,
Mark
On Tue, 29 Jul 2008, Jeff Squyres wrote:
On Jul 29, 2008, at 6:52 AM, Mark Dixon wrote:
FWIW: I compile with PGI 7.1.4 regularly on RHEL4U4 and don't see this
problem. It would be interesting to see the config.log's from these
builds to see the actual details of what ...
Hi,
We're intermittently seeing messages (below) about failing to register
memory with openmpi 2.0.2 on centos7 / Mellanox FDR Connect-X 3 and the
vanilla IB stack as shipped by centos.
We're not using any mlx4_core module tweaks at the moment. On earlier
machines we used to set registered memory ...
Thanks Ralph, will do.
Cheers,
Mark
On Wed, 18 Oct 2017, r...@open-mpi.org wrote:
Put “oob=tcp” in your default MCA param file
On Oct 18, 2017, at 9:00 AM, Mark Dixon wrote:
Hi,
We're intermittently seeing messages (below) about failing to register memory
with openmpi 2.0.2 on centos7 ...
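The "default MCA param file" is $prefix/etc/openmpi-mca-params.conf
under whatever prefix this Open MPI was installed to; a sketch with a
placeholder path:

  $ echo "oob = tcp" >> <prefix>/etc/openmpi-mca-params.conf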
... could not find endpoint with port: 1, lid: 69, msg_type: 100
On Thu, 19 Oct 2017, Mark Dixon wrote:
Thanks Ralph, will do.
Cheers,
Mark
On Wed, 18 Oct 2017, r...@open-mpi.org wrote:
Put “oob=tcp” in your default MCA param file
On Oct 18, 2017, at 9:00 AM, Mark Dixon wrote:
Hi,
We're ...
Hi,
I’ve built parallel HDF5 1.8.21 against OpenMPI 4.0.1 on CentOS 7 and a
Lustre 2.12 filesystem using the OS-provided GCC 4.8.5 and am trying to
run the testsuite. I’m failing the testphdf5 test: could anyone help,
please?
I’ve successfully used the same method to pass tests when building HDF5 ...
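For anyone reproducing, the rough recipe being described (prefix and
rank count invented; --enable-parallel and RUNPARALLEL are the
standard parallel-HDF5 build knobs):

  $ CC=mpicc ./configure --enable-parallel --prefix=/apps/hdf5/1.8.21
  $ make -j 8 && make check RUNPARALLEL="mpirun -np 6"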
Hi all,
I'm confused about how openmpi supports mpi-io on Lustre these days, and
am hoping that someone can help.
Back in the openmpi 2.0.0 release notes, it said that OMPIO is the default
MPI-IO implementation on everything apart from Lustre, where ROMIO is
used. Those release notes are pre...
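Whatever the default, the selection can be inspected and forced by
hand; a sketch (the ROMIO component name varies by release, e.g.
romio314 in the 2.x series, romio321 in 4.0.x):

  $ ompi_info | grep "MCA io"             # list built MPI-IO components
  $ mpirun --mca io ompio    -np 4 ./app  # force OMPIO
  $ mpirun --mca io romio321 -np 4 ./app  # force ROMIO on 4.0.x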
There was a bug fix in the Open
MPI to ROMIO integration layer sometime in the 4.0 series that fixed a
datatype problem, which caused some problems in the HDF5 tests. You
might be hitting that problem.
Thanks
Edgar
-Original Message-
From: users On Behalf Of Mark Dixon via users
Sent: Monday, November ...
Hi Edgar,
Pity, that would have been nice! But thanks for looking.
Checking through the ompi github issues, I now realise I logged exactly
the same issue over a year ago (completely forgot - I've moved jobs since
then), including a script to reproduce the issue on a Lustre system.
Unfortunately ...
On Wed, 25 Nov 2020, Dave Love via users wrote:
The perf test says romio performs a bit better. Also -- from overall
time -- it's faster on IMB-IO (which I haven't looked at in detail, and
ran with suboptimal striping).
I take that back. I can't reproduce a significant difference for total I...
-Original Message-
From: users On Behalf Of Mark Dixon via users
Sent: Thursday, November 26, 2020 9:38 AM
To: Dave Love via users
Cc: Mark Dixon ; Dave Love
Subject: Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?
On Wed, 25 Nov 2020, Dave Love via users wrote:
The perf test says romio ...
On Fri, 27 Nov 2020, Dave Love wrote:
...
It's less dramatic in the case I ran, but there's clearly something
badly wrong which needs profiling. It's probably useful to know how
many ranks that's with, and whether it's the default striping. (I
assume with default ompio fs parameters.)
Hi Dave,
Hi Mark,
Thanks so much for this - yes, applying that pull request against ompi
4.0.5 allows hdf5 1.10.7's parallel tests to pass on our Lustre
filesystem.
I'll certainly be applying it on our local clusters!
Best wishes,
Mark
On Tue, 1 Dec 2020, Mark Allen via users wrote:
At least for t...
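For anyone in the same position, applying such a fix to a release
tarball looks roughly like this (the PR number isn't captured above,
so <PR> stays a placeholder):

  $ curl -LO https://github.com/open-mpi/ompi/pull/<PR>.diff
  $ cd openmpi-4.0.5 && patch -p1 < ../<PR>.diff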