Re: [OMPI users] MPI Exit Code:1 on an OpenFoam application

2021-01-10 Thread Tony Ladd via users
Kahnbein Kai via users wrote: [External Email] Hey Tony, it works without the -parallel flag, all four CPUs are at 100% and running fine. Best regards Kai On 05.01.21 at 20:36, Tony Ladd via users wrote: Just run the executable without mpirun and the -parallel flag. On 1/2/21 11:39 PM,

Re: [OMPI users] MPI Exit Code:1 on an OpenFoam application

2021-01-05 Thread Tony Ladd via users
How can I solve this problem? Best regards Kai -- Tony Ladd Chemical Engineering Department University of Florida Gainesville, Florid

Re: [OMPI users] Problem in starting openmpi job - no output just hangs - SOLVED

2020-09-01 Thread Tony Ladd via users
. Tony On 8/25/20 10:42 AM, Jeff Squyres (jsquyres) wrote: [External Email] On Aug 24, 2020, at 9:44 PM, Tony Ladd wrote: I appreciate your help (and John's as well). At this point I don't think it is an OMPI problem - my mistake. I think the communication with RDMA is somehow

Re: [OMPI users] Problem in starting openmpi job - no output just hangs

2020-08-24 Thread Tony Ladd via users
f [older] hardware (including UCX support on that hardware). But be aware that openib is definitely going away; it is wholly being replaced by UCX. It may be that your only option is to stick with older software stacks in these hardware environments. On Aug 23, 2020, at 9:46 PM, Tony Ladd vi

Re: [OMPI users] Problem in starting openmpi job - no output just hangs

2020-08-23 Thread Tony Ladd via users
nfo on one node ibdiagnet on one node On Sun, 23 Aug 2020 at 05:02, Tony Ladd via users <users@lists.open-mpi.org> wrote: Hi Jeff, I installed UCX as you suggested. But I can't get even the simplest code (ucp_client_server) to work across the network. I can comp

Re: [OMPI users] Problem in starting openmpi job - no output just hangs

2020-08-22 Thread Tony Ladd via users
Jeff Squyres (jsquyres) wrote: [External Email] Tony -- Have you tried compiling Open MPI with UCX support? This is Mellanox (NVIDIA's) preferred mechanism for InfiniBand support these days -- the openib BTL is legacy. You can run: mpirun --mca pml ucx ... On Aug 19, 2020, at 12:46 PM,
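A minimal way to verify that a rebuilt Open MPI really selects the UCX PML is to run a tiny MPI program with the suggested flags. The sketch below is illustrative only and is not from the original thread; the file name mpi_check.c and the process count are arbitrary.

/* mpi_check.c - pass a token around a ring so every link is exercised once.
 * Hypothetical build/run, forcing the UCX PML as suggested above:
 *   mpicc mpi_check.c -o mpi_check
 *   mpirun --mca pml ucx -np 4 ./mpi_check
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, token = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size > 1) {
        if (rank == 0) {
            /* Start the token and wait for it to come back around. */
            token = 42;
            MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&token, 1, MPI_INT, size - 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(&token, 1, MPI_INT, (rank + 1) % size, 0, MPI_COMM_WORLD);
        }
    }

    printf("rank %d of %d saw token %d\n", rank, size, token);
    MPI_Finalize();
    return 0;
}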

Re: [OMPI users] Problem in starting openmpi job - no output just hangs

2020-08-19 Thread Tony Ladd via users
One other update. I compiled OpenMPI-4.0.4. The outcome was the same but there is no mention of ibv_obj this time. Tony -- Tony Ladd Chemical Engineering Department University of Florida Gainesville, Florida 32611-6005 USA Email: tladd-"(AT)"-che.ufl.edu Web: http://ladd.che.uf

Re: [OMPI users] Problem in starting openmpi job - no output just hangs

2020-08-17 Thread Tony Ladd via users
ed on f34 but it did not help. Tony -- Tony Ladd Chemical Engineering Department University of Florida Gainesville, Florida 32611-6005 USA Email: tladd-"(AT)"-che.ufl.edu Web: http://ladd.che.ufl.edu Tel: (352)-392-6509 FAX: (352)-392-9514 foam:root(ib)> ibv_devi

[OMPI users] Problem in starting openmpi job - no output just hangs

2020-08-17 Thread Tony Ladd via users
x diagnostics. I did notice that the Mellanox card has the PCI address 81:00.0 on the server but 03:00.0 on the client. Not sure of the significance of this. Any help anyone can offer would be much appreciated. I am stuck. Thanks Tony -- Tony Ladd Chemical Engineering Department Univ

[OMPI users] Parallel application performance tests

2006-11-28 Thread Tony Ladd
http://ladd.che.ufl.edu/research/beoclus/beoclus.htm Tony Ladd Chemical Engineering University of Florida

[OMPI users] OMPI collectives

2006-11-02 Thread Tony Ladd
George, I found the info I think you were referring to. Thanks. I then experimented essentially randomly with different algorithms for allreduce. But the issue with really bad performance for certain message sizes persisted with v1.1. The good news is that the upgrade to 1.2 fixed my worst problem

[OMPI users] OMPI Collectives

2006-10-28 Thread Tony Ladd
there is a pronounced slowdown, roughly a factor of 10, which seems too much. Any idea what's going on? Tony --- Tony Ladd Chemical Engineering University of Florida PO Box 116005 Gainesville, FL 32611-6005 Tel: 352-392-6509 FAX: 352-392-9513 Email: tl...@che.ufl.edu Web

[OMPI users] OMPI Collectives

2006-10-27 Thread Tony Ladd
ssage size might have some effect. Thanks Tony --- Tony Ladd Chemical Engineering University of Florida PO Box 116005 Gainesville, FL 32611-6005 Tel: 352-392-6509 FAX: 352-392-9513 Email: tl...@che.ufl.edu Web: http://ladd.che.ufl.edu

[OMPI users] OMPI collectives

2006-10-26 Thread Tony Ladd
1) I think OpenMPI does not use optimal algorithms for collectives. But neither does LAM. For example, the MPI_Allreduce time scales as log_2 N, where N is the number of processors. MPICH uses optimized collectives and the MPI_Allreduce is essentially independent of N. Unfortunately MPICH has never had a
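The scaling claim is easy to check empirically with a timed allreduce loop. The sketch below is not the poster's code; the buffer length NWORDS and repetition count NREPS are arbitrary illustrative values. Run it at several processor counts N and compare the per-call times.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NWORDS 4096   /* message size in doubles; arbitrary for illustration */
#define NREPS  100    /* repetitions to average over */

int main(int argc, char **argv)
{
    double *in, *out, t0, t1;
    int rank, size, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    in  = malloc(NWORDS * sizeof(double));
    out = malloc(NWORDS * sizeof(double));
    for (i = 0; i < NWORDS; i++) in[i] = (double) rank;

    MPI_Barrier(MPI_COMM_WORLD);          /* synchronize before timing */
    t0 = MPI_Wtime();
    for (i = 0; i < NREPS; i++)
        MPI_Allreduce(in, out, NWORDS, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("N = %d procs: %.3f us per MPI_Allreduce\n",
               size, 1e6 * (t1 - t0) / NREPS);

    free(in); free(out);
    MPI_Finalize();
    return 0;
}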

[OMPI users] Dual Gigabit Ethernet Support

2006-10-24 Thread Tony Ladd
Durga, I guess we have strayed a bit from the original post. My personal opinion is that a number of codes can run in HPC-like mode over Gigabit Ethernet, not just the trivially parallelizable. The hardware components are one key; PCI-X, low hardware latency NIC (Intel PRO 1000 is 6.6 microsecs vs

[OMPI users] Dual Gigabit ethernet support

2006-10-24 Thread Tony Ladd
on 48 nodes quite reliably but there are still many issues to address. GAMMA is very much a research tool; there are a number of features(?) which would hinder its use in an HPC environment. Basically Giuseppe needs help with development. Any volunteers? Tony ---

[OMPI users] dual Gigabit ethernet support

2006-10-23 Thread Tony Ladd
level, and perhaps beyond. Tony --- Tony Ladd Professor, Chemical Engineering University of Florida PO Box 116005 Gainesville, FL 32611-6005 Tel: 352-392-6509 FAX: 352-392-9513 Email: tl...@che.ufl.edu Web: http://ladd.che.ufl.edu

Re: [OMPI users] mca_btl_tcp_frag_send: writev failed with errno=110

2006-06-30 Thread Tony Ladd
buffers. NUM_SYNC is the number of sequential barrier calls it uses to determine the mean barrier call time. You can also switch the various tests on and off, which can be useful for debugging. Tony --- Tony Ladd Professor, Chemical Engineering University of Flo
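The benchmark itself is not reproduced in the archive, so the fragment below is only a guess at how a NUM_SYNC barrier-timing loop of this kind might look; the value 1000 is a placeholder, not the benchmark's default.

#include <mpi.h>
#include <stdio.h>

#define NUM_SYNC 1000   /* placeholder; the benchmark takes this as a parameter */

int main(int argc, char **argv)
{
    int rank, i;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Time NUM_SYNC consecutive barriers and report the mean per call. */
    t0 = MPI_Wtime();
    for (i = 0; i < NUM_SYNC; i++)
        MPI_Barrier(MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("mean barrier time: %.3f us over %d calls\n",
               1e6 * (t1 - t0) / NUM_SYNC, NUM_SYNC);

    MPI_Finalize();
    return 0;
}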

[OMPI users] mca_btl_tcp_frag_send: writev failed with errno=110

2006-06-17 Thread Tony Ladd
in a third it uses MPI_Allreduce. Finally, the TCP driver in OpenMPI seems not nearly as good as the one in LAM. I got higher throughput with far fewer dropouts with LAM. Tony --- Tony Ladd Professor, Chemical Engineering University of Florida PO Box 116005 Gainesvill