Patrick,
Thanks for the report and the reproducer.
I was able to confirm the issue with Python and Fortran, but
- I can only reproduce it with pml/ucx (read: --mca pml ob1 --mca btl
tcp,self works fine)
- I can only reproduce it with bcast algorithms 8 and 9
As a workaround, you can keep using ucx and force a different bcast algorithm.
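For example (just a sketch - this assumes the broadcast is handled by the
default coll/tuned component, and the reproducer name is only a placeholder),
something along these lines pins the algorithm:

    mpirun --mca pml ucx \
           --mca coll_tuned_use_dynamic_rules true \
           --mca coll_tuned_bcast_algorithm 1 \
           python ./reproducer.py

Setting the value to 0 lets Open MPI pick automatically; any value other
than 8 or 9 should steer clear of the affected code path.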
Hi Jeff,
I appreciate your help (and John's as well). At this point I don't think
this is an OMPI problem - my mistake. I think the communication with RDMA
is somehow disabled (perhaps it's the verbs layer - I am not very
knowledgeable about this). It used to work like a dream, but Mellanox has
appare
I'm afraid I don't have many better answers for you.
I can't quite tell from your setup, but are you running IMB-MPI1 Sendrecv
*on a single node* with `--mca btl openib,self`?
I don't remember offhand, but I didn't think that openib was supposed to do
loopback communication. E.g., if both M
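If it is a single node, one comparison that might be telling (just a sketch -
the shared-memory BTL is called vader in the 3.x/4.x series and sm in older
releases, and the path to IMB-MPI1 is a placeholder):

    mpirun -np 2 --mca btl vader,self ./IMB-MPI1 Sendrecv
    mpirun -np 2 --mca btl openib,self ./IMB-MPI1 Sendrecv

If the first behaves and the second doesn't, that would at least be
consistent with openib not handling same-node loopback.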
I apologize in advance for the size of the example source and probably the
length of the email, but this has been a pain to track down.
Our application uses System V style shared memory pretty extensively, and we
have recently found that in certain circumstances, OpenMPI appears to
provide ranks with