Thank you very much for the response. I have to admit that I'm much more
in the developer camp than the admin camp and am not terribly familiar with
installing and configuring OpenMPI myself. At least one of the systems
does not appear to use ucx, but both are using mxm. I'm attaching the
output
Patrick,
Thanks for the report and the reproducer.
I was able to confirm the issue with Python and Fortran, but:
- I can only reproduce it with pml/ucx (that is,
  --mca pml ob1 --mca btl tcp,self works fine)
- I can only reproduce it with bcast algorithms 8 and 9
As a workaround, you can keep using u
I apologize in advance for the size of the example source and probably the
length of the email, but this has been a pain to track down.
Our application uses System V style shared memory pretty extensively, and
we have recently found that in certain circumstances, OpenMPI appears to
provide ranks with