Hi! Thank you all for your replies, Jeff, Gilles and rhc.
Thank you Jeff and rhc for clarifying some of Open MPI's internals to me.

>> FWIW: we never send interface names to other hosts - just dot addresses
>
> Should have clarified - when you specify an interface name for the MCA
> param, then it is the interface name that is transferred as that is the
> value of the MCA param. However, once we determine our address, we only
> transfer dot addresses between ourselves

If only dot addresses are sent to the other hosts, then why doesn't Open MPI use the default route, the way `ip route get <other host IP>` does, instead of choosing a seemingly random interface? Is this expected behaviour? Can it be changed?
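To illustrate what I mean by the default route: the kernel already knows which interface reaches each peer. For example, on compute01 I would expect something like this (illustrative output only, based on the addresses in the table further down; the exact format may vary):

$ ip route get 10.0.0.227
10.0.0.227 dev ens8 src 10.0.0.228
$ ip route get 192.168.100.105
192.168.100.105 dev ens3 src 192.168.100.104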
Sorry, as Gilles pointed out I forgot to mention which Open MPI version I was using: Open MPI 3.0.0 with gcc 7.3.0 from OpenHPC, on CentOS 7.5.

> mpirun --mca oob_tcp_if_exclude 192.168.100.0/24 ...

I cannot just exclude that interface, because after that I want to add another computer that is on a different network. And this is where things get messy :( I cannot simply include and exclude networks, because I have different machines on different networks.

This is what I want to achieve:

          compute01            compute02           compute03
ens3      192.168.100.104/24   10.0.0.227/24       192.168.100.105/24
ens8      10.0.0.228/24        172.21.1.128/24     ---
ens9      172.21.1.155/24      ---                 ---

So I'm on compute01, using MPI_Comm_spawn to spawn another process on compute02 and compute03. I see the same behaviour with both MPI_Comm_spawn and with:

`mpirun -n 3 -host compute01,compute02,compute03 hostname`

Then when I add the MCA parameter I get this:

`mpirun --oversubscribe --allow-run-as-root -n 3 --mca oob_tcp_if_include 10.0.0.0/24,192.168.100.0/24 -host compute01,compute02,compute03 hostname`

WARNING: An invalid value was given for oob_tcp_if_include. This value will be ignored.
...
Message: Did not find interface matching this subnet
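If I understand that warning correctly, I assume it appears because the include list cannot be matched on every host: judging from the table above, compute02 has no interface in 192.168.100.0/24 and compute03 has none in 10.0.0.0/24. In fact compute02 and compute03 do not share any subnet at all, so no single include list can cover every pair of hosts. A quick way to check which of the listed subnets exist on a given node (illustrative output for compute02, taken from the table above):

$ ip -o -4 addr show | awk '{print $2, $4}'
lo 127.0.0.1/8
ens3 10.0.0.227/24
ens8 172.21.1.128/24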
This would all work if Open MPI used the system's routing table, as `ip route` does.

Best regards,
Carlos.

On Sat, Jun 23, 2018 at 12:27 AM, r...@open-mpi.org <r...@open-mpi.org> wrote:

> On Jun 22, 2018, at 8:25 PM, r...@open-mpi.org wrote:
>
> On Jun 22, 2018, at 7:31 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
>
> Carlos,
>
> By any chance, could
>
> mpirun --mca oob_tcp_if_exclude 192.168.100.0/24 ...
>
> work for you?
>
> Which Open MPI version are you running?
>
> IIRC, subnets are internally translated to interfaces, so that might be an
> issue if the translation is made on the first host, and then the interface
> name is sent to the other hosts.
>
> FWIW: we never send interface names to other hosts - just dot addresses
>
> Should have clarified - when you specify an interface name for the MCA
> param, then it is the interface name that is transferred as that is the
> value of the MCA param. However, once we determine our address, we only
> transfer dot addresses between ourselves
>
> Cheers,
>
> Gilles
>
> On Saturday, June 23, 2018, carlos aguni <aguni...@gmail.com> wrote:
>
>> Hi all,
>>
>> I'm trying to run a code on 2 machines that each have at least 2 network
>> interfaces. So I have them as described below:
>>
>>           compute01            compute02
>> ens3      192.168.100.104/24   10.0.0.227/24
>> ens8      10.0.0.228/24        172.21.1.128/24
>> ens9      172.21.1.155/24      ---
>>
>> The issue is: when I execute `mpirun -n 2 -host compute01,compute02 hostname`
>> on them, I do get the correct output, but only after a very long delay.
>>
>> What I've read so far is that Open MPI performs a greedy algorithm on each
>> interface that times out if it doesn't find the desired IP.
>> Then I saw here (https://www.open-mpi.org/faq/?category=tcp#tcp-selection)
>> that I can run commands like:
>> `$ mpirun -n 2 --mca oob_tcp_if_include 10.0.0.0/24 -n 2 -host compute01,compute02 hostname`
>> But this configuration doesn't reach the other host(s).
>> In the end I sometimes get the same timeout.
>>
>> So is there a way to let it use the system's default route?
>>
>> Regards,
>> Carlos.

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users