Hi all,

I am trying to write some simple OpenMPI code that communicates using RMA 
windows. I am running OpenMPI 4.1.2, UCX 1.12.1, and libfabric 1.11.0, all from 
the Ubuntu 22.04 package repos.

My issue is that calls to MPI_Win_allocate fail and I'm not sure why. I call it 
like so: `MPI_Win_allocate(64, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &mem, &win)`, 
where `mem` is an uninitialized void pointer and `win` is an uninitialized 
MPI_Win object. This call happens after successful calls to `MPI_Init` and 
`MPI_Barrier`.
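
For reference, a stripped-down version of what I am running looks roughly like 
this (simplified from the real code, so names are illustrative):

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    MPI_Barrier(MPI_COMM_WORLD);

    void    *mem = NULL;   /* filled in by MPI_Win_allocate */
    MPI_Win  win;

    /* 64-byte window, displacement unit of 1 byte */
    MPI_Win_allocate(64, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &mem, &win);

    /* we never get this far on multi-node runs */
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```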

The call to `MPI_Win_allocate` always fails with MPI_ERR_WIN on multi-node 
runs, but succeeds when run on a single node. Examining the source 
(ompi/mpi/c/win_allocate.c:81), I can see that the more specific errors returned 
from the lower layers are collapsed into this single error code before they 
are returned to the user, so it is hard to tell exactly where the allocation is 
failing. MPI_ERR_WIN is also returned if the provided `MPI_Win` pointer is 
null, but I have verified that this is not the case.
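
For what it's worth, the error class is only visible because I switch the error 
handler to return instead of abort; the relevant excerpt (building on the sketch 
above, `<stdio.h>` assumed) is:

```c
/* Return error codes instead of aborting, so the class/string can be inspected */
MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

int rc = MPI_Win_allocate(64, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &mem, &win);
if (rc != MPI_SUCCESS) {
    char msg[MPI_MAX_ERROR_STRING];
    int  len, eclass;
    MPI_Error_class(rc, &eclass);       /* MPI_ERR_WIN in my case */
    MPI_Error_string(rc, msg, &len);
    fprintf(stderr, "MPI_Win_allocate failed: %s\n", msg);
}
```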

My goal is to run this on a RoCE cluster using PML/UCX, but I don't think those 
layers are the issue, because the error occurs with both the RoCE 
configuration and the default `mpirun` configuration. Also, I am able to 
successfully run MPI code without RMA windows (just collective operations) on 
the same cluster, using both the RoCE config and the default configuration.
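
For context, the RoCE launch line is along these lines (the device name is 
specific to my cluster and the binary name is just a placeholder, but it gives 
the idea):

```
mpirun --mca pml ucx --mca osc ucx \
       -x UCX_NET_DEVICES=mlx5_0:1 \
       ./rma_test
```

The "default configuration" I mention is just plain `mpirun` with none of these 
flags or environment variables set.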

Anyway, I am mainly looking for advice on how to debug this issue (although if 
you can see the cause of the problem, that would be great too). The debug output 
(`-v -x UCX_LOG_LEVEL=info`) doesn't contain anything useful. The only other way 
I can see to debug this further would be to build and deploy a debug build of 
OpenMPI, which I would rather avoid if possible. Any suggestions?

Thanks,
George


