[OMPI users] How it the rank determined (Open MPI and Podman)
I did a quick test to see if I can use Podman in combination with Open MPI:

[test@test1 ~]$ mpirun --hostfile ~/hosts podman run quay.io/adrianreber/mpi-test /home/mpi/hello

 Hello, world (1 procs total)
    --> Process # 0 of 1 is alive. ->789b8fb622ef

 Hello, world (1 procs total)
    --> Process # 0 of 1 is alive. ->749eb4e1c01a

The test program (hello) is taken from
https://raw.githubusercontent.com/openhpc/ohpc/obs/OpenHPC_1.3.8_Factory/tests/mpi/hello.c

The problem with this is that each process thinks it is process 0 of 1 instead of

 Hello, world (2 procs total)
    --> Process # 1 of 2 is alive. ->test1
    --> Process # 0 of 2 is alive. ->test2

My question is: how is the rank determined? What resources do I need to have in my container to correctly determine the rank?

This is Podman 1.4.2 and Open MPI 4.0.1.

		Adrian
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
Re: [OMPI users] How it the rank determined (Open MPI and Podman)
Adrian,

the MPI application relies on some environment variables (they typically start with OMPI_ and PMIX_).

The MPI application internally uses a PMIx client that must be able to contact a PMIx server located on the same host (the server is included in mpirun and in the orted daemon(s) spawned on the remote hosts).

If Podman isolates the app inside the container (e.g. /home/mpi/hello) from the outside world (e.g. mpirun/orted), that won't be an easy ride.

Cheers,

Gilles

On 7/11/2019 4:35 PM, Adrian Reber via users wrote:
> My question is how is the rank determined? What resources do I need to have
> in my container to correctly determine the rank.
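To check whether those variables actually reach the process inside the container, a small diagnostic along the following lines could be run in place of the MPI program. This is only a sketch: it assumes Python is available in the container image, the helper name launcher_view is made up for illustration, and OMPI_COMM_WORLD_RANK / OMPI_COMM_WORLD_SIZE are the variables Open MPI's mpirun normally sets for the processes it launches. It does not talk to the PMIx server; it only reports what is visible in the environment.

```python
import os


def launcher_view(environ):
    """Summarise the launcher-provided variables visible in `environ`.

    launcher_view is a hypothetical helper for this sketch, not part of
    Open MPI or PMIx.
    """
    return {
        # Set by mpirun for each launched process; None if not visible.
        "rank": environ.get("OMPI_COMM_WORLD_RANK"),
        "size": environ.get("OMPI_COMM_WORLD_SIZE"),
        # Names only: the PMIx client reads its connection details from these.
        "pmix_vars": sorted(n for n in environ if n.startswith("PMIX_")),
    }


if __name__ == "__main__":
    print(launcher_view(os.environ))
```

If rank and size come back as None and the PMIX_ list is empty, the container did not inherit the launcher's environment, which would be consistent with every process reporting itself as 0 of 1.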
Re: [OMPI users] How it the rank determined (Open MPI and Podman)
Gilles,

thanks for pointing out the environment variables. I quickly created a wrapper which tells Podman to re-export all OMPI_ and PMIX_ variables (grep "\(PMIX\|OMPI\)"). Now it works:

$ mpirun --hostfile ~/hosts ./wrapper -v /tmp:/tmp --userns=keep-id --net=host mpi-test /home/mpi/hello

 Hello, world (2 procs total)
    --> Process # 0 of 2 is alive. ->test1
    --> Process # 1 of 2 is alive. ->test2

I need to tell Podman to mount /tmp from the host into the container. As I am running rootless, I also need to tell Podman to use the same user ID in the container as outside (so that the Open MPI files in /tmp can be shared), and I am also running without a network namespace.

So this is now with the full Podman-provided isolation except for the network namespace. Thanks for your help!

		Adrian

On Thu, Jul 11, 2019 at 04:47:21PM +0900, Gilles Gouaillardet via users wrote:
> the MPI application relies on some environment variables (they typically
> start with OMPI_ and PMIX_).
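The wrapper itself was not posted to the list. A minimal sketch of what such a wrapper might look like is shown below, written in Python purely for illustration. It assumes that forwarding each variable as `--env NAME=VALUE` to `podman run` is sufficient, and it matches on the OMPI_ and PMIX_ prefixes, which is slightly stricter than the grep pattern quoted above. The function names are made up for this sketch.

```python
#!/usr/bin/env python3
import os
import sys

# Prefixes named in the thread; this is an assumption, other variables
# may matter in other setups.
MPI_ENV_PREFIXES = ("OMPI_", "PMIX_")


def mpi_env_flags(environ):
    """Return ['--env', 'NAME=VALUE', ...] for every OMPI_/PMIX_ variable."""
    flags = []
    for name in sorted(environ):
        if name.startswith(MPI_ENV_PREFIXES):
            flags += ["--env", f"{name}={environ[name]}"]
    return flags


def build_podman_command(environ, wrapper_args):
    """Build the 'podman run' argv: forwarded variables, then the caller's arguments."""
    return ["podman", "run"] + mpi_env_flags(environ) + list(wrapper_args)


if __name__ == "__main__" and len(sys.argv) > 1:
    command = build_podman_command(os.environ, sys.argv[1:])
    # Replace this process with podman so no extra Python process
    # stays between the launcher and the container.
    os.execvp(command[0], command)
```

Saved as an executable file named wrapper, it would be invoked exactly as in the mpirun command above, with the Podman options (-v /tmp:/tmp --userns=keep-id --net=host), the image name, and the program path passed through unchanged.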
Re: [OMPI users] How it the rank determined (Open MPI and Podman)
Not really a relevant reply, however Nomad has task drivers for Docker and Singularity:
https://www.hashicorp.com/blog/singularity-and-hashicorp-nomad-a-perfect-fit

I'm not sure whether it would be easier to set up an MPI environment with Nomad, though.

On Thu, 11 Jul 2019 at 11:08, Adrian Reber via users <users@lists.open-mpi.org> wrote:
> thanks for pointing out the environment variables. I quickly created a
> wrapper which tells Podman to re-export all OMPI_ and PMIX_ variables
> (grep "\(PMIX\|OMPI\)"). Now it works: