There are a couple of things you’d need to resolve before worrying about code:
* IIRC, there is a separate ORTE daemon in each Docker container since OMPI thinks these are separate nodes. So you’ll first need to find some way those daemons can “discover” that they are on the same physical node. Is there something in the container environment that could be used for this purpose?

* Once the daemons can determine they are on a shared node, then you have to be able to create a shared memory backing file that can be accessed from within any of the containers. In other words, one of the procs in one of the containers is going to have to create the backing file, and then pass the filename to the other procs on that physical node. Then those other procs need to be able to open that file from within their container.

Are those doable in Docker?

Note that Singularity doesn’t have these issues because it only abstracts the file system, and so every container “sees” that it is on the same node (and the ORTE daemon sits outside the container). This is why we push people in that direction for HPC with containers.

Ralph

> On Mar 25, 2017, at 8:07 AM, Jordi Guitart <jordi.guit...@bsc.es> wrote:
>
> Hi,
>
> I don't have previous expertise on the source code of OpenMPI, so I don't
> have a clear idea of the needed changes to implement this feature. This
> probably requires some preliminary brainstorming to decide the most
> appropriate way to inform OpenMPI that underlying nodes can share memory even
> if they have different IP addresses.
>
>
> On 24/03/2017 20:10, Jeff Squyres (jsquyres) wrote:
>> On Mar 24, 2017, at 6:41 AM, Jordi Guitart <jordi.guit...@bsc.es> wrote:
>>> Docker containers have different IP addresses, indeed, so now we know why
>>> it does not work. I think that this could be a nice feature for OpenMPI, so
>>> I'll probably issue a request for it ;-)
>> Cool.
>>
>> I don't think any of the current developers in the Open MPI community are
>> actively working with Docker (several are working with Singularity). Would
>> this be a feature you'd be willing to submit a patch for?
>>
>
>
> http://bsc.es/disclaimer
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
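
For illustration only, here is a minimal Python sketch of the two steps listed at the top of this message: finding an identifier that is the same in every container on one physical node, and creating a backing file that another process can then open and map. It is not Open MPI code and does not touch ORTE. It rests on two assumptions that would need checking in a real Docker setup: that /proc/sys/kernel/random/boot_id is readable inside each container and is identical for containers sharing a host kernel, and that a host directory is bind-mounted into each container at the same path (for example with `docker run -v`). The function names are hypothetical.

```python
import mmap
import os
import tempfile

# Assumption: containers on one host share the kernel, so this file reads the same in each.
BOOT_ID_PATH = "/proc/sys/kernel/random/boot_id"


def physical_node_id(path=BOOT_ID_PATH):
    """Return an identifier for the host kernel, or None if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def create_backing_file(shared_dir, size):
    """Create a zero-filled backing file of `size` bytes in shared_dir; return its path."""
    fd, path = tempfile.mkstemp(prefix="shmem_backing_", dir=shared_dir)
    try:
        os.ftruncate(fd, size)
    finally:
        os.close(fd)
    return path


def attach(path):
    """Open an existing backing file and map all of it read/write (shared mapping)."""
    with open(path, "r+b") as f:
        return mmap.mmap(f.fileno(), 0)


if __name__ == "__main__":
    # Stand-in for a directory bind-mounted into every container on the node.
    with tempfile.TemporaryDirectory() as shared_dir:
        path = create_backing_file(shared_dir, 4096)  # done by one proc
        writer = attach(path)
        reader = attach(path)                         # done by the other procs
        writer[:5] = b"hello"
        print(physical_node_id(), bytes(reader[:5]))
        writer.close()
        reader.close()
```

In this sketch the filename returned by create_backing_file is what would have to be passed to the other procs on the node, and two daemons would treat themselves as co-located only if physical_node_id returns the same non-None value for both.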