Hello,

I want to force Open MPI to use TCP, and specifically a particular subnet.
Unfortunately, I can't manage to do that.

Here is what I tried:

$BIN/mpirun --mca pml ob1 --mca btl tcp,self --mca ptl_tcp_remote_connections 1 \
    --mca btl_tcp_if_include '10.233.0.0/19' -np 4 --oversubscribe -H ib1n,ib2n \
    bash -c 'echo $PMIX_SERVER_URI2'

The expected result would be a list of IP addresses in the 10.233.0.0/19
subnet, but instead I get this:

2659516416.2;tcp4://127.0.0.1:46777
2659516416.2;tcp4://127.0.0.1:46777
2659516416.1;tcp4://127.0.0.1:45055
2659516416.1;tcp4://127.0.0.1:45055
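
To make the mismatch explicit, here is a minimal bash sketch that pulls the address out of each URI and tests it against the subnet. The helper names (ip_to_int, ip_in_cidr, uri_ip) are hypothetical, and the URI layout "<nspace>.<rank>;tcp4://<ip>:<port>" is only assumed from the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of Open MPI or PMIx.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed if IPv4 address $1 lies inside CIDR block $2 (e.g. 10.233.0.0/19).
ip_in_cidr() {
    local ip=$1 net=${2%/*} bits=${2#*/}
    local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    (( ( $(ip_to_int "$ip") & mask ) == ( $(ip_to_int "$net") & mask ) ))
}

# Extract the IPv4 address from a URI assumed to look like
# "<nspace>.<rank>;tcp4://<ip>:<port>".
uri_ip() {
    local rest=${1#*tcp4://}
    echo "${rest%%:*}"
}
```

For example, `ip_in_cidr "$(uri_ip '2659516416.2;tcp4://127.0.0.1:46777')" 10.233.0.0/19` fails for every line printed above, whereas `ip_in_cidr 10.233.0.82 10.233.0.0/19` succeeds.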

Could you help me debug this problem?

The nodes do have IP addresses in the desired subnet:

$BIN/mpirun --mca pml ob1 --mca btl tcp,self --mca ptl_tcp_remote_connections 1 \
    --mca btl_tcp_if_include '10.233.0.0/19' -np 4 --oversubscribe -H ib1n,ib2n \
    ip addr show dev br0

This returns a set of bridge descriptions that look like:

9: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 94:de:80:ba:37:e4 brd ff:ff:ff:ff:ff:ff
    inet 141.76.49.17/26 brd 141.76.49.63 scope global br0
       valid_lft forever preferred_lft forever
    inet 10.233.0.82/19 scope global br0
       valid_lft forever preferred_lft forever
    inet6 2002:8d4c:3001:48:40de:80ff:feba:37e4/64 scope global deprecated mngtmpaddr dynamic
       valid_lft 59528sec preferred_lft 0sec
    inet6 fe80::96de:80ff:feba:37e4/64 scope link tentative dadfailed 
       valid_lft forever preferred_lft forever
<the three others are similar>

What is more puzzling is that if I attach a debugger at
opal/mca/pmix/pmix3x/pmix/src/mca/ptl/tcp/ptl_tcp_components.c around line 500,
I see that mca_ptl_tcp_component.remote_connections is false. This suggests
that the component parameters I set on the command line are being ignored.

-- 
Regards,
Maksym Planeta


_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
