Kahnbein Kai via users wrote:
[External Email]
Hey Tony,
It works without the -parallel flag; all four CPUs are at 100% and
running fine.
Best regards
Kai
On 05.01.21 at 20:36, Tony Ladd via users wrote:
Just run the executable without mpirun and the -parallel flag.
On 1/2/21 11:39 PM,
How can I solve this problem?
Best regards
Kai
--
Tony Ladd
Chemical Engineering Department
University of Florida
Gainesville, Florida 32611-6005
Tony
On 8/25/20 10:42 AM, Jeff Squyres (jsquyres) wrote:
[External Email]
On Aug 24, 2020, at 9:44 PM, Tony Ladd wrote:
I appreciate your help (and John's as well). At this point I don't think this is an
OMPI problem - my mistake. I think the communication with RDMA is somehow
f [older] hardware
(including UCX support on that hardware). But be aware that openib is
definitely going away; it is wholly being replaced by UCX. It may be that your
only option is to stick with older software stacks in these hardware
environments.
On Aug 23, 2020, at 9:46 PM, Tony Ladd via users wrote:
nfo on one node
ibdiagnet on one node
On Sun, 23 Aug 2020 at 05:02, Tony Ladd via users
<users@lists.open-mpi.org> wrote:
Hi Jeff
I installed ucx as you suggested. But I can't get even the
simplest code
(ucp_client_server) to work across the network. I can comp
Jeff Squyres (jsquyres) wrote:
[External Email]
Tony --
Have you tried compiling Open MPI with UCX support? This is Mellanox
(NVIDIA)'s preferred mechanism for InfiniBand support these days -- the openib
BTL is legacy.
You can run: mpirun --mca pml ucx ...
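Not from the original thread, but as a quick sanity check of the UCX path, a
minimal MPI program along these lines can confirm that ranks really span both
nodes and that a simple reduction completes; the host names used to launch it
below are placeholders.

/* ucx_check.c - minimal sanity check (illustrative, not from this thread).
 * Prints each rank's host so you can confirm processes span the nodes,
 * then does a small reduction to exercise the transport. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len, sum = 0;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &len);

    /* sum of ranks 0..size-1 should be size*(size-1)/2 */
    MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    printf("rank %d of %d on %s\n", rank, size, host);
    if (rank == 0)
        printf("reduce gave %d (expect %d)\n", sum, size * (size - 1) / 2);

    MPI_Finalize();
    return 0;
}

Compile with mpicc ucx_check.c -o ucx_check and launch with the option quoted
above, e.g. mpirun --mca pml ucx -np 4 --host node1:2,node2:2 ./ucx_check
(node1/node2 are hypothetical host names).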
On Aug 19, 2020, at 12:46 PM,
One other update. I compiled OpenMPI-4.0.4. The outcome was the same, but
there is no mention of ibv_obj this time.
Tony
--
Tony Ladd
Chemical Engineering Department
University of Florida
Gainesville, Florida 32611-6005
USA
Email: tladd-"(AT)"-che.ufl.edu
Web: http://ladd.che.ufl.edu
ed on f34 but it did not help.
Tony
--
Tony Ladd
Chemical Engineering Department
University of Florida
Gainesville, Florida 32611-6005
USA
Email: tladd-"(AT)"-che.ufl.edu
Web: http://ladd.che.ufl.edu
Tel: (352)-392-6509
FAX: (352)-392-9514
foam:root(ib)> ibv_devi
x diagnostics.
I did notice that the Mellanox card has the PCI address 81:00.0 on the
server but 03:00.0 on the client. Not sure of the significance of this.
Any help anyone can offer would be much appreciated. I am stuck.
Thanks
Tony
--
Tony Ladd
Chemical Engineering Department
Univ
://ladd.che.ufl.edu/research/beoclus/beoclus.htm
Tony Ladd
Chemical Engineering
University of Florida
George
I found the info I think you were referring to. Thanks. I then experimented
essentially randomly with different algorithms for allreduce. But the issue
with really bad performance for certain message sizes persisted with v1.1.
The good news is that the upgrade to 1.2 fixed my worst problem
there is a pronounced slowdown, roughly a factor of 10,
which seems too much. Any idea what's going on?
Tony
---
Tony Ladd
Chemical Engineering
University of Florida
PO Box 116005
Gainesville, FL 32611-6005
Tel: 352-392-6509
FAX: 352-392-9513
Email: tl...@che.ufl.edu
Web: http://ladd.che.ufl.edu
ssage size might have
some effect.
Thanks
Tony
---
Tony Ladd
Chemical Engineering
University of Florida
PO Box 116005
Gainesville, FL 32611-6005
Tel: 352-392-6509
FAX: 352-392-9513
Email: tl...@che.ufl.edu
Web: http://ladd.che.ufl.edu
1) I think OpenMPI does not use optimal algorithms for collectives. But
neither does LAM. For example, the MPI_Allreduce time scales as log_2 N, where
N is the number of processors. MPICH uses optimized collectives and its
MPI_Allreduce is essentially independent of N. Unfortunately MPICH has never
had a
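A rough way to see that scaling is to time a loop of MPI_Allreduce calls at a
fixed message size and rerun at different process counts. The sketch below is
illustrative only (not the benchmark discussed here); NELEM and NITER are
arbitrary choices.

/* allreduce_time.c - illustrative timing loop, not the benchmark from
 * this thread. Reports the mean MPI_Allreduce time at a fixed message
 * size so runs at different rank counts N can be compared. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NELEM 4096   /* doubles per call; change to probe other sizes */
#define NITER 1000   /* timed iterations */

int main(int argc, char **argv)
{
    int rank, size;
    double *in, *out, t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    in  = malloc(NELEM * sizeof(double));
    out = malloc(NELEM * sizeof(double));
    for (int i = 0; i < NELEM; i++) in[i] = (double)rank;

    /* one warm-up call, then time NITER calls */
    MPI_Allreduce(in, out, NELEM, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (int i = 0; i < NITER; i++)
        MPI_Allreduce(in, out, NELEM, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("N=%d  mean MPI_Allreduce time = %.3f us\n",
               size, 1e6 * (t1 - t0) / NITER);

    free(in);
    free(out);
    MPI_Finalize();
    return 0;
}

A mean time that grows roughly like log_2 N points to tree-based algorithms;
a flat curve would match the N-independent behaviour described for MPICH.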
Durga
I guess we have strayed a bit from the original post. My personal opinion is
that a number of codes can run in HPC-like mode over Gigabit Ethernet, not
just the trivially parallelizable ones. The hardware components are one key:
PCI-X and a low-latency NIC (the Intel PRO/1000 is 6.6 microsecs vs
on 48 nodes quite reliably, but there are still many
issues to address. GAMMA is very much a research tool; there are a number of
features(?) which would hinder it being used in an HPC environment.
Basically Giuseppe needs help with development. Any volunteers?
Tony
---
level, and perhaps beyond.
Tony
---
Tony Ladd
Professor, Chemical Engineering
University of Florida
PO Box 116005
Gainesville, FL 32611-6005
Tel: 352-392-6509
FAX: 352-392-9513
Email: tl...@che.ufl.edu
Web: http://ladd.che.ufl.edu
buffers. NUM_SYNC is the number of sequential barrier
calls it uses to determine the mean barrier call time. You can also switch
the various tests on and off, which can be useful for debugging.
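For anyone without the benchmark source to hand, the barrier-timing idea is
roughly the following (a sketch only; NUM_SYNC here is a stand-in for the
benchmark's own setting):

/* barrier_sync.c - illustrative sketch of averaging NUM_SYNC sequential
 * MPI_Barrier calls to estimate the mean barrier call time. */
#include <mpi.h>
#include <stdio.h>

#define NUM_SYNC 100

int main(int argc, char **argv)
{
    int rank;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);          /* align ranks before timing */
    t0 = MPI_Wtime();
    for (int i = 0; i < NUM_SYNC; i++)
        MPI_Barrier(MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("mean barrier time over %d calls: %.3f us\n",
               NUM_SYNC, 1e6 * (t1 - t0) / NUM_SYNC);

    MPI_Finalize();
    return 0;
}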
Tony
---
Tony Ladd
Professor, Chemical Engineering
University of Flo
in a third it uses MPI_Allreduce.
Finally, the TCP driver in Open MPI seems not nearly as good as the one in
LAM. I got higher throughput with far fewer dropouts with LAM.
Tony
---
Tony Ladd
Professor, Chemical Engineering
University of Florida
PO Box 116005
Gainesvill