Re: [OMPI users] Issue with shared memory arrays in Fortran

2020-08-24 Thread Gilles Gouaillardet via users
Patrick, Thanks for the report and the reproducer. I was able to confirm the issue with python and Fortran, but: - I can only reproduce it with pml/ucx (read: --mca pml ob1 --mca btl tcp,self works fine); - I can only reproduce it with bcast algorithms 8 and 9. As a workaround, you can keep using u
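
As a concrete illustration of the workaround and of how bcast algorithms are usually pinned for testing (a sketch only; the binary name and rank count are placeholders, and the coll_tuned parameters assume Open MPI's tuned collective component):

    # avoid pml/ucx entirely, as reported to work fine
    mpirun --mca pml ob1 --mca btl tcp,self -np 4 ./reproducer

    # or keep the default pml but force a bcast algorithm other than 8 or 9
    mpirun --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_bcast_algorithm 1 -np 4 ./reproducer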

Re: [OMPI users] Problem in starting openmpi job - no output just hangs

2020-08-24 Thread Tony Ladd via users
Hi Jeff, I appreciate your help (and John's as well). At this point I don't think it is an OMPI problem - my mistake. I think the communication with RDMA is somehow disabled (perhaps it's the verbs layer - I am not very knowledgeable about this). It used to work like a dream but Mellanox has appare

Re: [OMPI users] Problem in starting openmpi job - no output just hangs

2020-08-24 Thread Jeff Squyres (jsquyres) via users
I'm afraid I don't have many better answers for you. I can't quite tell from your machines, but are you running IMB-MPI1 Sendrecv *on a single node* with `--mca btl openib,self`? I don't remember offhand, but I didn't think that openib was supposed to do loopback communication. E.g., if both M
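
For readers following the thread, the invocation in question would look roughly like this (a sketch; the benchmark path and rank count are placeholders), launched so that both ranks land on the same host, which is exactly where openib's lack of loopback support would matter:

    # run on a single node: both ranks share the host, so all
    # rank-to-rank traffic is loopback
    mpirun -np 2 --mca btl openib,self ./IMB-MPI1 Sendrecv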

[OMPI users] Issue with shared memory arrays in Fortran

2020-08-24 Thread Patrick McNally via users
I apologize in advance for the size of the example source and probably the length of the email, but this has been a pain to track down. Our application uses System V style shared memory pretty extensively, and we have recently found that in certain circumstances, OpenMPI appears to provide ranks with
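
For readers unfamiliar with the pattern being described, the sketch below shows System V style shared memory used from Fortran MPI ranks (this is not Patrick's reproducer; the key, buffer size, and single-node assumption are hypothetical, and error handling plus shmdt/shmctl cleanup are omitted for brevity):

    program sysv_shm_sketch
      ! Minimal sketch: ranks on one node attach to the same System V
      ! segment, rank 0 fills it, the others read it after a barrier.
      use, intrinsic :: iso_c_binding
      use mpi
      implicit none

      interface
        function shmget(key, sz, flags) bind(C, name="shmget")
          import :: c_int, c_size_t
          integer(c_int), value    :: key, flags
          integer(c_size_t), value :: sz
          integer(c_int)           :: shmget
        end function
        function shmat(id, addr, flags) bind(C, name="shmat")
          import :: c_int, c_ptr
          integer(c_int), value :: id, flags
          type(c_ptr), value    :: addr
          type(c_ptr)           :: shmat
        end function
      end interface

      integer, parameter      :: n = 1024
      integer                 :: ierr, rank
      integer(c_int)          :: shmid
      type(c_ptr)             :: p
      real(c_double), pointer :: buf(:)

      call MPI_Init(ierr)
      call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

      ! Hypothetical fixed key; on Linux, octal 1666 = IPC_CREAT | 0666.
      ! Whichever rank gets there first creates the segment, the rest attach.
      shmid = shmget(int(z'BEEF', c_int), int(n * 8, c_size_t), int(o'1666', c_int))
      p = shmat(shmid, c_null_ptr, 0_c_int)
      call c_f_pointer(p, buf, [n])

      if (rank == 0) buf = 42.0_c_double     ! one writer
      call MPI_Barrier(MPI_COMM_WORLD, ierr)
      print *, 'rank', rank, 'sees', buf(1)  ! readers on the same node see 42.0

      call MPI_Finalize(ierr)
    end program sysv_shm_sketch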