Thanks Gilles - I appreciate all the detail. Ahh, that's great that Open MPI now supports specifying an ssh port simply through the hostfile. That'll make things a little simpler when I have that use case in the future.
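For anyone reading this in the archives on a release that doesn't yet have the hostfile support from https://github.com/open-mpi/ompi/issues/2224, my understanding is that you can also pass extra arguments through to ssh via the rsh launcher's MCA parameter. This is an untested sketch on my part, with the port and hosts taken from my example below:

mpirun --mca plm_rsh_args "-p 32777" -np 2 -H localhost,docker2 mpi_hello_world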
Oh, of course - that makes sense that Open MPI requires TCP ports too, rather than just port 22 for the ssh'ing. Thanks for all the details on what's going on under the hood, as well as how to force a static port range. Reading more about Docker, yes, my issues all boiled down to network connectivity.

In case this helps other Docker newbies out there, one easy way to get the containers to all be able to see each other is to create an overlay network:

docker network create --driver overlay --subnet 10.0.9.0/24 mpi-net

Then you need your containers running on this network. You could set this up with an external key-value store such as Consul, but with Docker Swarm you can simply deploy services onto this network. So, what I did is create a swarm on a manager node:

docker swarm init --advertise-addr <MANAGER-IP>

This prints out a command for the workers to execute that looks something like:

docker swarm join --token <TOKEN> <MANAGER-IP>:<PORT>

Run this on all the worker nodes so that they join the swarm. Then deploy the Docker image as a service on all of the nodes (the manager plus all workers), publishing the container's port 22 to host port 32777, by running this command on the manager node:

docker service create -p "32777:22" --name mpi-test --mode global --network mpi-net mpi-image-name

Note that I needed to do a 'docker pull mpi-image-name' on all the instances first to get the image on there (there's probably a built-in way to do this with Docker Swarm that I'm missing). As mentioned in my original post, my Docker container starts up an OpenSSH server, so at this point that server is running in a container on every instance. You can ssh in:

ssh mpirun@172.17.0.1 -p 32777 -i ~/id_rsa

You can then use a special nslookup to find the IP address of each container:

nslookup tasks.mpi-test

This will list the address of each one:

Server:         127.0.0.11
Address:        127.0.0.11#53

Non-authoritative answer:
Name:   tasks.mpi-test
Address: 10.0.9.3
Name:   tasks.mpi-test
Address: 10.0.9.4
Name:   tasks.mpi-test
Address: 10.0.9.5

These addresses are directly accessible without any SSH tunneling or other funny business - you can create a hosts file listing 10.0.9.3, 10.0.9.4, and 10.0.9.5 and then simply perform a normal mpirun:

mpirun --hostfile hosts.txt -np 3 mpi_hello_world

This fits the conops I'm looking for, with the mpirun performed inside the container. Of course, the Docker image could be made smarter to do the nslookup'ing and mpirun'ing automatically rather than requiring you to ssh in for these steps.

I have seen from https://www.percona.com/blog/2016/08/03/testing-docker-multi-host-network-performance/ that overlay networks don't offer great performance, so I want to investigate Calico or some other network plugin, but that's a Docker issue rather than an MPI one.

Similar to the ssh-less MPI post you linked to, it would be cool if Open MPI just supported this natively, but it's only two lines in a Dockerfile for me to put in an OpenSSH server, and it's easy enough to automate the nslookup steps (sketched below), so it's not that big of a deal. I also don't think that post covers the use case I'm after, as you'd still need to be able to ssh between the hosts somehow. I am interested in Singularity, so thanks for the pointer, but I am tied to Docker specifically for some of the work I'm doing currently.

Many thanks.
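Since I mentioned automating the nslookup'ing and mpirun'ing, here's a rough sketch of the wrapper I have in mind, to be run inside one of the containers. It's untested as written; the awk filtering assumes the nslookup output format shown above, and the service name, hosts file, and binary are the ones from my example:

#!/bin/sh
# Resolve every task IP for the mpi-test service into a hosts file.
# The second field of each "Address" line is an IP; the "#" test skips
# the DNS server's own "127.0.0.11#53" line.
nslookup tasks.mpi-test | awk '/^Address/ && $2 !~ /#/ {print $2}' > hosts.txt
# Launch one rank per container.
NP=$(wc -l < hosts.txt)
mpirun --hostfile hosts.txt -np "$NP" mpi_hello_world

And for completeness, the "two lines in a Dockerfile" for the OpenSSH server amount to something like this - an Ubuntu-based sketch that leaves out the user and key setup, so adjust for your base image:

# Install sshd; Ubuntu's sshd wants /var/run/sshd to exist.
RUN apt-get update && apt-get install -y openssh-server && mkdir -p /var/run/sshd
# Run sshd in the foreground as the container's main process.
CMD ["/usr/sbin/sshd", "-D"]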
-Adam

On Sun, Dec 25, 2016 at 7:49 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:

> Adam,
>
> there are several things here.
>
> With an up-to-date master, you can specify an alternate ssh port via a hostfile;
> see https://github.com/open-mpi/ompi/issues/2224
>
> Open MPI requires more than just ssh:
> - remote nodes (orted) need to call back to mpirun (oob/tcp)
> - nodes (MPI tasks) need to be able to connect to each other (btl/tcp)
>
> Regarding oob/tcp, your mpirun command line will basically do, under the hood:
> ssh docker2 orted <docker1 ip> <docker1 oob/tcp port>
>
> Then each task will use a port for btl/tcp, and tasks might directly
> connect to each other with the docker IP and this port.
>
> By default, these two ports are dynamic, but you can use static ports
> (or a port range) via MCA parameters:
> mpirun --mca oob_tcp_static_ipv4_ports xxx --mca btl_tcp_port_min_v4 yyy --mca btl_tcp_port_range_v4 zzz
>
> That does not change the fact that ssh tunneling works with host
> addresses, while Open MPI will (internally) use docker addresses.
>
> I'd rather suggest you try to:
> - enable IP connectivity between your containers (eventually running on
>   different hosts)
> - assuming you need (some) network isolation, use static ports, and
>   update your firewall to allow full TCP/IP connectivity on these ports
>   and on port 22 (ssh)
>
> You can also refer to https://github.com/open-mpi/ompi/issues/1511,
> where yet another way to use docker was discussed.
>
> Last but not least, if you want to use containers but you are not tied to
> docker, you can consider http://singularity.lbl.gov/
> (as far as Open MPI is concerned, native support is expected for Open MPI 2.1)
>
> Cheers,
>
> Gilles
>
> On 12/26/2016 6:11 AM, Adam Sylvester wrote:
>
> I'm trying to use OpenMPI 1.10.4 to communicate between two Docker
> containers running on two different physical machines. Docker doesn't have
> much to do with my question (unless someone has a suggestion for a better
> way to do what I'm trying to :o) )... each Docker container is running an
> OpenSSH server which shows up as 172.17.0.1 on the physical hosts:
>
> $ ifconfig docker0
> docker0   Link encap:Ethernet  HWaddr 02:42:8E:07:05:A0
>           inet addr:172.17.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
>           inet6 addr: fe80::42:8eff:fe07:5a0/64  Scope:Link
>
> The Docker container's ssh port is published on the physical host as port
> 32768.
>
> The Docker container has a user 'mpirun' which I have public/private ssh
> keys set up for.
>
> Let's call the physical hosts host1 and host2; each host is running a
> Docker container I'll refer to as docker1 and docker2 respectively. So,
> this means I can...
>
> 1. ssh from host1 into docker1:
> ssh mpirun@172.17.0.1 -i ssh/id_rsa -p 32768
>
> 2. Set up an ssh tunnel from inside docker1, through host2, into docker2,
> on local port 4334 (ec2-user is the login to host2):
> ssh -f -N -q -o "TCPKeepAlive yes" -o "ServerAliveInterval 60" -L 4334:172.17.0.1:32768 -l ec2-user host2
>
> 3. Update my ~/.ssh/config file to name this host 'docker2':
> StrictHostKeyChecking no
> Host docker2
>     HostName 127.0.0.1
>     Port 4334
>     User mpirun
>
> 4. I can now do 'ssh docker2' and ssh into it without issues.
>
> Here's where I get stuck.
> I'd read that OpenMPI's mpirun didn't support
> ssh'ing on a non-standard port, so I thought I could just do step 3 above
> and then list the hosts when I run mpirun from docker1:
>
> mpirun --prefix /usr/local -n 2 -H localhost,docker2 /home/mpirun/mpi_hello_world
>
> However, I get:
>
> [3524ae84a26b:00197] [[55635,0],1] tcp_peer_send_blocking: send() to
> socket 9 failed: Broken pipe (32)
> --------------------------------------------------------------------------
> ORTE was unable to reliably start one or more daemons.
> This usually is caused by:
>
> * not finding the required libraries and/or binaries on
>   one or more nodes. Please check your PATH and LD_LIBRARY_PATH
>   settings, or configure OMPI with --enable-orterun-prefix-by-default
>
> * lack of authority to execute on one or more specified nodes.
>   Please verify your allocation and authorities.
>
> * the inability to write startup files into /tmp
>   (--tmpdir/orte_tmpdir_base).
>   Please check with your sys admin to determine the correct location to use.
>
> * compilation of the orted with dynamic libraries when static are required
>   (e.g., on Cray). Please check your configure cmd line and consider using
>   one of the contrib/platform definitions for your system type.
>
> * an inability to create a connection back to mpirun due to a
>   lack of common network interfaces and/or no route found between
>   them. Please check network connectivity (including firewalls
>   and network routing requirements).
> --------------------------------------------------------------------------
>
> I'm guessing that something's going wrong when docker2 tries to
> communicate back to docker1. However, I'm not sure what additional
> tunneling to set up to support this. My understanding of ssh tunnels is
> relatively basic... I can of course create a tunnel on docker2 back to
> docker1, but I don't know how ssh/mpi will "find" it. I've read a bit about
> reverse ssh tunneling, but it's not clear enough to me what it does for me
> to apply it here.
>
> Any help is much appreciated!
> -Adam
>
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users