Ralph Castain <r...@open-mpi.org> writes:

> This usually indicates that the remote process is using a different OMPI
> version. You might check to ensure that the paths on the remote nodes are
> correct.

That seems to be quite a common problem, with non-obvious failure modes. Would it not be possible to have a mechanism that checks the consistency of the components and aborts with a clear message? I haven't thought it through, but it seems that some combination of OOB messages, library versioning (at least with ELF), and environment variables might do it.
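
To make the environment-variable part of the idea concrete, here is a rough sketch in Python rather than in OMPI's own C. It assumes the launcher exports the version it was built as and that each remote process compares that against its own version at startup. The variable name LAUNCHER_EXPECTED_VERSION, the function check_version and the exception VersionMismatch are all made up for illustration; none of them is an existing Open MPI setting or API.

```python
import os

# Hypothetical variable name; not an actual Open MPI setting.
VERSION_ENV = "LAUNCHER_EXPECTED_VERSION"


class VersionMismatch(RuntimeError):
    """Raised when the local version differs from the launcher's."""


def check_version(local_version, environ=None, hostname="unknown-host"):
    """Compare this process's version with the one the launcher exported.

    Returns None on a match, or when the launcher exported nothing.
    Raises VersionMismatch with an explicit message otherwise.
    """
    environ = os.environ if environ is None else environ
    expected = environ.get(VERSION_ENV)
    if expected is None:
        # The launcher did not advertise a version, so there is
        # nothing to compare against.
        return
    if expected != local_version:
        raise VersionMismatch(
            f"{hostname}: launcher expects version {expected} but this "
            f"process is version {local_version}; check PATH and "
            f"LD_LIBRARY_PATH on this node"
        )
```

A remote process would call something like check_version("1.8.4", hostname="node02") before doing anything else (the version string and host name here are only placeholders), so that a mismatch is reported by name instead of showing up later as an obscure failure. An environment variable only helps if the launcher actually propagates it to the remote node, which is presumably why it would need to be combined with an OOB handshake and library versioning.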