Hello,

I am currently trying to get MPI working in docker containers across
networked virtual machines. The problem is that I don't get any output
from a dockerized MPI program (a tutorial matrix-multiply program):
inspecting the nodes (docker containers), each process sits at ~98% CPU
usage, while mpirun produces no output and nothing complains or finishes.
Running the same application on my laptop (with a local mpirun call)
takes less than a second.

Running a "hello world" program which just printed its rank seems to work
fine with this docker setup, with mpirun receiving the stdout from the
printf calls. However, that example never involved any inter-node
communication.
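
For what it's worth, the stdout path can also be exercised with a trivial
non-MPI command launched the same way on the VMs; the echo of
OMPI_COMM_WORLD_RANK below is only a stand-in for the hello world's printf
of its rank:

     # One trivial process per VM over ssh; each just reports its hostname and
     # the rank Open MPI handed it via the environment. No MPI communication is
     # involved, so this only exercises launch and stdout forwarding.
     mpirun -H 10.3.2.41,10.3.2.42,10.3.2.43 \
         bash -c 'echo "$(hostname): rank ${OMPI_COMM_WORLD_RANK}"'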

Much of our setup is based on the idea that environment variables are
what pass information from mpirun to the node processes (launched over
ssh), but I have had no exposure to MPI prior to this, so perhaps that
assumption is wrong.
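
As far as I can tell, this can be checked directly by replacing the
application with a plain env dump; something like the following prints the
OMPI_* variables that each launched process receives, which is what our
script (further down) copies into the container:

     # Dump the Open MPI variables that mpirun sets for each launched process;
     # these are the variables /root/docker-machine-script.sh captures into
     # /root/env.txt before starting the container.
     mpirun -H 10.3.2.41,10.3.2.42,10.3.2.43 bash -c 'env | grep ^OMPI_ | sort'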

A (relatively) quick description of the system:

 - We have 3 virtual machines interconnected by an InfiniBand network
(however, in the hope of keeping things simple, we currently use the
TCP/IP layer rather than native InfiniBand).

 - There is no resource manager (e.g. Slurm) running; everything goes
over ssh.

 - Each VM runs CentOS 7, has Open MPI 1.6.4 installed, and loads the MPI
environment module via .bashrc.

 - Docker is installed on each VM.

 - We have a container based on Ubuntu 14.04, with Open MPI 1.6.5
installed, which runs an MPI-based matrix_multiply program on startup.
The ompi_info output from inside the container is:

   > ompi_info -v ompi full --parsable

     package:Open MPI buildd@allspice Distribution
     ompi:version:full:1.6.5
     ompi:version:svn:r28673
     ompi:version:release_date:Jun 26, 2013
     orte:version:full:1.6.5
     orte:version:svn:r28673
     orte:version:release_date:Jun 26, 2013
     opal:version:full:1.6.5
     opal:version:svn:r28673
     opal:version:release_date:Jun 26, 2013
     mpi-api:version:full:2.1
     ident:1.6.5

 - To start the MPI program(s), on one of the VMs we call mpirun with:

     mpirun --mca btl_tcp_port_min_v4 7950 --mca btl_tcp_port_range_v4 50 \
            --mca btl tcp,self --mca mpi_base_verbose 30 \
            -H 10.3.2.41,10.3.2.42,10.3.2.43 /root/docker-machine-script.sh

 - This then runs our shell script on each VM; the script collects the
MPI-related environment variables, filters out the host's own MPI module
variables, and starts the docker container with the remaining variables.
The shell script is:

#!/bin/bash

# Capture the MPI-related variables that mpirun (via its ssh-launched daemon)
# has set for this process, dropping the variables that belong to the host's
# MPI environment module.
env | grep MPI | grep -v 'MPI_INCLUDE\|MPI_PYTHON\|MPI_LIB\|MPI_BIN\|MPI_COMPILER\|MPI_SYSCONFIG\|MPI_SUFFIX\|MPI_MAN\|MPI_HOME\|MPI_FORTRAN_MOD_DIR' > /root/env.txt

# Start the container with those variables, publishing the TCP port range that
# was given to mpirun via btl_tcp_port_min_v4/btl_tcp_port_range_v4.
docker run --privileged --env-file /root/env.txt -p 7950-8000:7950-8000 mrmagooey/mpi-mm


 - Finally, the container runs the MPI application (the matrix multiply)
using those environment variables.

 - Each container can ssh into the other containers without a password (a
quick sketch of this check is included below).
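
For reference, here is the kind of check that last point refers to. The
addresses are examples; the ssh line is what we actually verified, while
the nc loop is only a sketch of how the published btl_tcp ports could be
probed (assuming netcat is available in the container image):

#!/bin/bash

# Connectivity check from inside one container towards a peer container/VM
# (PEER is an example address from our host list).
PEER=10.3.2.42

# Passwordless ssh should succeed without prompting for anything.
ssh -o BatchMode=yes -o ConnectTimeout=5 root@${PEER} 'echo "ssh to $(hostname) ok"'

# The published TCP range given to btl_tcp_port_min_v4/btl_tcp_port_range_v4
# should accept connections while the MPI job is running and listening.
for port in $(seq 7950 8000); do
    nc -z -w 2 ${PEER} ${port} && echo "port ${port} on ${PEER} is open"
done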

It is a rather complicated setup, but the real application that we will
eventually run is a pain to compile from scratch, so Docker is a very
appealing solution.

Attached is the "ompi_info —all” and “ifconfig” called on one of the host
vm’s (as per the forum support instructions). Also attached is the result
of /usr/bin/printenv on one of the containers.

Thank you,

Peter
ifconfig output from one of the host VMs:

docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 2044
        inet 172.17.42.1  netmask 255.255.0.0  broadcast 0.0.0.0
        inet6 fe80::5484:7aff:fefe:9799  prefixlen 64  scopeid 0x20<link>
        ether 56:84:7a:fe:97:99  txqueuelen 0  (Ethernet)
        RX packets 289381  bytes 17144659 (16.3 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 287659  bytes 18990126 (18.1 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 2044
        inet 10.3.2.41  netmask 255.255.0.0  broadcast 10.3.255.255
        inet6 fe80::fa16:3e00:bd:d05  prefixlen 64  scopeid 0x20<link>
Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
        infiniband 80:00:0A:90:FD:2D:39:5A:00:00:00:00:00:00:00:00:00:00:00:00  txqueuelen 256  (InfiniBand)
        RX packets 2278868  bytes 2930153450 (2.7 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2112100  bytes 2680850199 (2.4 GiB)
        TX errors 0  dropped 4 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 0  (Local Loopback)
        RX packets 12  bytes 3872 (3.7 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 12  bytes 3872 (3.7 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

veth3fc6c93: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 2044
        inet6 fe80::ecb1:dcff:fefc:7d9f  prefixlen 64  scopeid 0x20<link>
        ether ee:b1:dc:fc:7d:9f  txqueuelen 0  (Ethernet)
        RX packets 23  bytes 2145 (2.0 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 18  bytes 5658 (5.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


/usr/bin/printenv output from one of the containers:

OMPI_COMM_WORLD_RANK=0
HOSTNAME=64e742bab4de
OMPI_MCA_btl_tcp_port_min_v4=7950
OMPI_MCA_orte_ess_num_procs=3
OMPI_MCA_orte_ess_jobid=4179623937
OMPI_MCA_orte_num_nodes=3
OMPI_MCA_orte_hnp_uri=4179623936.0;tcp://10.3.2.43:39999;tcp://172.17.42.1:39999
OMPI_COMM_WORLD_LOCAL_RANK=0
OMPI_MCA_mpi_base_verbose=30
OMPI_MCA_orte_cpu_type=GenuineIntel
LD_LIBRARY_PATH=/usr/lib/openmpi/:/usr/include/openmpi/
LS_COLORS=
OMPI_UNIVERSE_SIZE=3
OMPI_MCA_mpi_yield_when_idle=0
OMPI_MCA_orte_num_restarts=0
OMPI_MCA_orte_precondition_transports=bac424fcf9d03a7f-7580429f9e9823ab
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
OMPI_COMM_WORLD_LOCAL_SIZE=1
PWD=/
OMPI_COMM_WORLD_SIZE=3
OMPI_MCA_orte_cpu_model=Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
OMPI_MCA_orte_daemonize=1
OMPI_MCA_orte_app_num=0
SHLVL=1
HOME=/root
OMPI_MCA_btl_tcp_port_range_v4=50
OMPI_MCA_orte_ess_node_rank=0
OMPI_MCA_orte_ess_vpid=0
OMPI_MCA_ess=env
OMPI_MCA_shmem_RUNTIME_QUERY_hint=mmap
LESSOPEN=| /usr/bin/lesspipe %s
OMPI_COMM_WORLD_NODE_RANK=0
LESSCLOSE=/usr/bin/lesspipe %s %s
OMPI_MCA_plm=rsh
OMPI_MCA_btl=tcp,self
OMPI_MCA_orte_local_daemon_uri=4179623936.1;tcp://10.3.2.41:59170;tcp://172.17.42.1:59170
_=/usr/bin/env
