[AMD Official Use Only - General]
UCX will disqualify itself unless it finds CUDA, ROCm, or an InfiniBand network
to use. To allow UCX to run on a regular shared-memory job without GPUs or IB,
you have to set the UCX_TLS environment variable to explicitly allow UCX to run
over shm, e.g.:
mpirun -x UCX_T
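
As a rough sketch of what such a setting can look like: the transport list
"shm,self", the process count, and ./my_app below are assumptions for
illustration and are not taken from the message above. The script only builds
and prints the launch line, so it can be tried without an MPI installation.

```shell
# Assumption: "shm" selects UCX's shared-memory transports and "self" the
# loopback transport; `ucx_info -d` lists what a given UCX build offers.
export UCX_TLS=shm,self

# Hypothetical launch line: the process count and ./my_app are placeholders.
# "-x UCX_TLS" asks Open MPI's mpirun to export the variable to all ranks.
launch="mpirun -np 4 -x UCX_TLS --mca pml ucx ./my_app"
echo "$launch"
```

Exporting the variable first and passing only its name to -x keeps the
transport list in one place if the launch line is reused in a job script.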
There was work done in ompio in that direction, but the code wasn’t actually
committed into the main repository. It probably still exists in a branch
somewhere. If you are interested, please ping me directly and I can put you in
contact with the person that wrote the code and to clarify the
I can also offer to help if there are any questions regarding the ompio code,
but I do not have the bandwidth/resources to do that myself, and more
importantly, I do not have a platform to test the new component.
Edgar
From: users On Behalf Of Jeff Squyres
(js
Hi,
There are a few things that you could test to see whether they make a difference.
1. Try changing the number of aggregators used in collective I/O (assuming
that the code uses collective I/O). For example, you could set it to the number
of nodes used (the algorithm determining the number