Another test is to swap the hostnames. If the barrier-only test fails, that can hint at a firewall.
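For reference, a barrier-only test of that sort could be as small as the following sketch (illustrative code, not taken from the thread; the loop count of 10 is arbitrary):

#include <stdio.h>
#include <mpi.h>

/* barrier_only.c - exercise only MPI_Barrier, no explicit point-to-point traffic */
int main(int argc, char *argv[])
{
    int rank, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (i = 0; i < 10; i++) {
        /* each iteration forces another exchange between the two ranks */
        MPI_Barrier(MPI_COMM_WORLD);
        if (rank == 0)
            printf("barrier %d completed\n", i);
    }
    MPI_Finalize();
    return 0;
}

Running it with the same mpirun options as mpitest, once with the hostfile as it is and once with the two hostnames swapped, helps separate connection-setup problems (for example a firewall on one of the nodes) from problems in the send/recv path itself.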
Cheers,
Gilles

Gilles Gouaillardet <gil...@rist.or.jp> wrote:

sudo make uninstall
will not remove modules that are no longer built.
sudo rm -rf /usr/local/lib/openmpi
is safe, though.

I confirm I did not see any issue on a system with two networks.

Cheers,
Gilles

On 4/18/2016 2:53 PM, dpchoudh . wrote:

Hello Gilles

I did a

sudo make uninstall

followed by a

sudo make install

on both nodes. But that did not make a difference. I will try your tarball build suggestion a bit later.

What I find a bit strange is that only I seem to be getting into this issue. What could I be doing wrong? Or am I discovering an obscure bug?

Thanks
Durga

1% of the executables have 99% of CPU privilege!
Userspace code! Unite!! Occupy the kernel!!!

On Mon, Apr 18, 2016 at 1:21 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:

So you might want to
rm -rf /usr/local/lib/openmpi
and run
make install
again, just to make sure old stuff does not get in the way.

Cheers,
Gilles

On 4/18/2016 2:12 PM, dpchoudh . wrote:

Hello Gilles

Thank you very much for your feedback. You are right that my original stack trace was from code that was several weeks behind, but updating it just now did not seem to make a difference. I am copying the stack from the latest code below.

On the master node:

(gdb) bt
#0  0x00007fc0524cbb7d in poll () from /lib64/libc.so.6
#1  0x00007fc051e53116 in poll_dispatch (base=0x1aabbe0, tv=0x7fff29fcb240) at poll.c:165
#2  0x00007fc051e4adb0 in opal_libevent2022_event_base_loop (base=0x1aabbe0, flags=2) at event.c:1630
#3  0x00007fc051de9a00 in opal_progress () at runtime/opal_progress.c:171
#4  0x00007fc04ce46b0b in opal_condition_wait (c=0x7fc052d3cde0 <ompi_request_cond>, m=0x7fc052d3cd60 <ompi_request_lock>) at ../../../../opal/threads/condition.h:76
#5  0x00007fc04ce46cec in ompi_request_wait_completion (req=0x1b7b580) at ../../../../ompi/request/request.h:383
#6  0x00007fc04ce48d4f in mca_pml_ob1_send (buf=0x7fff29fcb480, count=4, datatype=0x601080 <ompi_mpi_char>, dst=1, tag=1, sendmode=MCA_PML_BASE_SEND_STANDARD, comm=0x601280 <ompi_mpi_comm_world>) at pml_ob1_isend.c:259
#7  0x00007fc052a62d73 in PMPI_Send (buf=0x7fff29fcb480, count=4, type=0x601080 <ompi_mpi_char>, dest=1, tag=1, comm=0x601280 <ompi_mpi_comm_world>) at psend.c:78
#8  0x0000000000400afa in main (argc=1, argv=0x7fff29fcb5e8) at mpitest.c:19
(gdb)

And on the non-master node:

(gdb) bt
#0  0x00007fad2c32148d in nanosleep () from /lib64/libc.so.6
#1  0x00007fad2c352014 in usleep () from /lib64/libc.so.6
#2  0x00007fad296412de in OPAL_PMIX_PMIX120_PMIx_Fence (procs=0x0, nprocs=0, info=0x0, ninfo=0) at src/client/pmix_client_fence.c:100
#3  0x00007fad2960e1a6 in pmix120_fence (procs=0x0, collect_data=0) at pmix120_client.c:258
#4  0x00007fad2c89b2da in ompi_mpi_finalize () at runtime/ompi_mpi_finalize.c:242
#5  0x00007fad2c8c5849 in PMPI_Finalize () at pfinalize.c:47
#6  0x0000000000400958 in main (argc=1, argv=0x7fff163879c8) at mpitest.c:30
(gdb)

And my configuration was done as follows:

$ ./configure --enable-debug --enable-debug-symbols

I double-checked to ensure that there is no older installation of Open MPI getting mixed up with the master branch:

sudo yum list installed | grep -i mpi

shows nothing on both nodes, and pmap -p <pid> shows that all the libraries are coming from /usr/local/lib, which seems to be correct. I am also quite sure about the firewall issue (that there is none).
I will try out your suggestion on installing from a tarball and see how it goes.

Thanks
Durga

1% of the executables have 99% of CPU privilege!
Userspace code! Unite!! Occupy the kernel!!!

On Mon, Apr 18, 2016 at 12:47 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:

Here is your stack trace:

#6  0x00007f72a0d09cd5 in mca_pml_ob1_send (buf=0x7fff81057db0, count=4, datatype=0x601080 <ompi_mpi_char>, dst=1, tag=1, sendmode=MCA_PML_BASE_SEND_STANDARD, comm=0x601280 <ompi_mpi_comm_world>)

at line 251. That would be line 259 in the current master, and this file was updated 21 days ago, which suggests your master is not quite up to date.

Even if the message is sent eagerly, the ob1 PML does use an internal request that it will wait for.

By the way, did you configure with --enable-mpi-thread-multiple?
Did you configure with --enable-mpirun-prefix-by-default?
Did you configure with --disable-dlopen?

First, I'd recommend you download a tarball from https://www.open-mpi.org/nightly/master, run configure && make && make install into a new install dir, and check whether the issue is still there or not. There could be some side effects if some old modules were not removed and/or if you are not using the modules you expect. (When it hangs, you can pmap <pid> and check that the paths of the Open MPI libraries are the ones you expect.)

What if you do not send/recv but invoke MPI_Barrier multiple times?
What if you send/recv a one-byte message instead?
Did you double-check that there is no firewall running on your nodes?

Cheers,
Gilles
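To make the one-byte suggestion concrete, a variant of the test with a single-byte payload could look like the sketch below (illustrative code, not from the thread):

#include <stdio.h>
#include <mpi.h>

/* onebyte.c - same send/recv pattern as mpitest.c, but with a one-byte payload */
int main(int argc, char *argv[])
{
    int rank;
    char c = 'x';

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        MPI_Send(&c, 1, MPI_CHAR, 1, 1, MPI_COMM_WORLD);
        printf("rank 0 sent one byte\n");
    } else if (rank == 1) {
        MPI_Recv(&c, 1, MPI_CHAR, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received '%c'\n", c);
    }
    MPI_Finalize();
    return 0;
}

Run it with the same mpirun options as mpitest; together with the barrier-only sketch near the top of the thread, it helps tell whether the hang depends on the message size or on point-to-point traffic at all.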
On 4/18/2016 1:06 PM, dpchoudh . wrote:

Thank you for your suggestion, Ralph. But it did not make any difference.

Let me say that my code is about a week stale. I just did a git pull and am building it right now. The build takes quite a bit of time, so I avoid doing that unless there is a reason. But what I am trying out is the most basic functionality, so I'd think a week or so of lag would not make a difference.

Does the stack trace suggest something to you? It seems that the send hangs; but a 4-byte send should be sent eagerly.

Best regards
'Durga

1% of the executables have 99% of CPU privilege!
Userspace code! Unite!! Occupy the kernel!!!

On Sun, Apr 17, 2016 at 11:55 PM, Ralph Castain <r...@open-mpi.org> wrote:

Try adding -mca oob_tcp_if_include eno1 to your cmd line and see if that makes a difference.

On Apr 17, 2016, at 8:43 PM, dpchoudh . <dpcho...@gmail.com> wrote:

Hello Gilles and all

I am sorry to be bugging the developers, but this issue seems to be nagging me, and I am surprised it does not seem to affect anybody else. But then again, I am using the master branch, and most users are probably using a released version.

This time I am using a totally different cluster. This one has NO verbs-capable interface; just two Ethernet interfaces (one of which has no IP address and hence is unusable) plus one proprietary interface that currently supports only IP traffic. The two IP interfaces (Ethernet and proprietary) are on different IP subnets.

My test program is as follows:

#include <stdio.h>
#include <string.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    char host[128];
    int n;

    MPI_Init(&argc, &argv);
    MPI_Get_processor_name(host, &n);
    printf("Hello from %s\n", host);
    MPI_Comm_size(MPI_COMM_WORLD, &n);
    printf("The world has %d nodes\n", n);
    MPI_Comm_rank(MPI_COMM_WORLD, &n);
    printf("My rank is %d\n", n);
//#if 0
    if (n == 0)
    {
        strcpy(host, "ha!");
        MPI_Send(host, strlen(host) + 1, MPI_CHAR, 1, 1, MPI_COMM_WORLD);
        printf("sent %s\n", host);
    }
    else
    {
        //int len = strlen(host) + 1;
        bzero(host, 128);
        MPI_Recv(host, 4, MPI_CHAR, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Received %s from rank 0\n", host);
    }
//#endif
    MPI_Finalize();
    return 0;
}

This program, when run between two nodes, hangs. The command was:

[durga@b-1 ~]$ mpirun -np 2 -hostfile ~/hostfile -mca btl self,tcp -mca pml ob1 -mca btl_tcp_if_include eno1 ./mpitest

And the hang is with the following output (eno1 is one of the GigE interfaces, which takes OOB traffic as well):

Hello from b-1
The world has 2 nodes
My rank is 0
Hello from b-2
The world has 2 nodes
My rank is 1

Note that if I uncomment the #if 0 / #endif pair (i.e. comment out the MPI_Send()/MPI_Recv() part), the program runs to completion. Also note that the printfs following MPI_Send()/MPI_Recv() do not show up on the console.

Upon attaching gdb, the stack trace from the master node is as follows:

Missing separate debuginfos, use: debuginfo-install glibc-2.17-78.el7.x86_64 libpciaccess-0.13.4-2.el7.x86_64
(gdb) bt
#0  0x00007f72a533eb7d in poll () from /lib64/libc.so.6
#1  0x00007f72a4cb7146 in poll_dispatch (base=0xee33d0, tv=0x7fff81057b70) at poll.c:165
#2  0x00007f72a4caede0 in opal_libevent2022_event_base_loop (base=0xee33d0, flags=2) at event.c:1630
#3  0x00007f72a4c4e692 in opal_progress () at runtime/opal_progress.c:171
#4  0x00007f72a0d07ac1 in opal_condition_wait (c=0x7f72a5bb1e00 <ompi_request_cond>, m=0x7f72a5bb1d80 <ompi_request_lock>) at ../../../../opal/threads/condition.h:76
#5  0x00007f72a0d07ca2 in ompi_request_wait_completion (req=0x113eb80) at ../../../../ompi/request/request.h:383
#6  0x00007f72a0d09cd5 in mca_pml_ob1_send (buf=0x7fff81057db0, count=4, datatype=0x601080 <ompi_mpi_char>, dst=1, tag=1, sendmode=MCA_PML_BASE_SEND_STANDARD, comm=0x601280 <ompi_mpi_comm_world>) at pml_ob1_isend.c:251
#7  0x00007f72a58d6be3 in PMPI_Send (buf=0x7fff81057db0, count=4, type=0x601080 <ompi_mpi_char>, dest=1, tag=1, comm=0x601280 <ompi_mpi_comm_world>) at psend.c:78
#8  0x0000000000400afa in main (argc=1, argv=0x7fff81057f18) at mpitest.c:19
(gdb)

And the backtrace on the non-master node is:

(gdb) bt
#0  0x00007ff3b377e48d in nanosleep () from /lib64/libc.so.6
#1  0x00007ff3b37af014 in usleep () from /lib64/libc.so.6
#2  0x00007ff3b0c922de in OPAL_PMIX_PMIX120_PMIx_Fence (procs=0x0, nprocs=0, info=0x0, ninfo=0) at src/client/pmix_client_fence.c:100
#3  0x00007ff3b0c5f1a6 in pmix120_fence (procs=0x0, collect_data=0) at pmix120_client.c:258
#4  0x00007ff3b3cf8f4b in ompi_mpi_finalize () at runtime/ompi_mpi_finalize.c:242
#5  0x00007ff3b3d23295 in PMPI_Finalize () at pfinalize.c:47
#6  0x0000000000400958 in main (argc=1, argv=0x7fff785e8788) at mpitest.c:30
(gdb)

The hostfile is as follows:

[durga@b-1 ~]$ cat hostfile
10.4.70.10 slots=1
10.4.70.11 slots=1
#10.4.70.12 slots=1

And the ifconfig output from the master node is as follows (the other node is similar; all the IP interfaces are in their respective subnets):
[durga@b-1 ~]$ ifconfig
eno1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.4.70.10  netmask 255.255.255.0  broadcast 10.4.70.255
        inet6 fe80::21e:c9ff:fefe:13df  prefixlen 64  scopeid 0x20<link>
        ether 00:1e:c9:fe:13:df  txqueuelen 1000  (Ethernet)
        RX packets 48215  bytes 27842846 (26.5 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 52746  bytes 7817568 (7.4 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
        device interrupt 16

eno2: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 00:1e:c9:fe:13:e0  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
        device interrupt 17

lf0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 2016
        inet 192.168.1.2  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::3002:ff:fe33:3333  prefixlen 64  scopeid 0x20<link>
        ether 32:02:00:33:33:33  txqueuelen 1000  (Ethernet)
        RX packets 10  bytes 512 (512.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 22  bytes 1536 (1.5 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 0  (Local Loopback)
        RX packets 26  bytes 1378 (1.3 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 26  bytes 1378 (1.3 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

Please help me with this. I am stuck with the TCP transport, which is the most basic of all transports.

Thanks in advance
Durga

1% of the executables have 99% of CPU privilege!
Userspace code! Unite!! Occupy the kernel!!!

On Tue, Apr 12, 2016 at 9:32 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:

This is quite unlikely, and FWIW, your test program works for me.

I suggest you check that your three TCP networks are usable, for example:

$ mpirun -np 2 -hostfile ~/hostfile -mca btl self,tcp -mca pml ob1 --mca btl_tcp_if_include xxx ./mpitest

where xxx is an interface name or list of interface names:

eth0
eth1
ib0
eth0,eth1
eth0,ib0
...
eth0,eth1,ib0

and see where the problem starts occurring.

By the way, are your three interfaces in three different subnets? Is routing required between two interfaces of the same type?

Cheers,
Gilles

On 4/13/2016 7:15 AM, dpchoudh . wrote:

Hi all

I have reported this issue before, but then had brushed it off as something that was caused by my modifications to the source tree. It looks like that is not the case.

Just now, I did the following:

1. Cloned a fresh copy from master.

2. Configured with the following flags, then built and installed it on my two-node "cluster":

--enable-debug --enable-debug-symbols --disable-dlopen

3. Compiled the following program, mpitest.c, with these flags: -g3 -Wall -Wextra

4. Ran it like this:

[durga@smallMPI ~]$ mpirun -np 2 -hostfile ~/hostfile -mca btl self,tcp -mca pml ob1 ./mpitest

With this, the code hangs at MPI_Barrier() on both nodes, after generating the following output:

Hello world from processor smallMPI, rank 0 out of 2 processors
Hello world from processor bigMPI, rank 1 out of 2 processors
smallMPI sent haha!
bigMPI received haha!
<Hangs until killed by ^C>

Attaching to the hung process at one node gives the following backtrace:

(gdb) bt
#0  0x00007f55b0f41c3d in poll () from /lib64/libc.so.6
#1  0x00007f55b03ccde6 in poll_dispatch (base=0x70e7b0, tv=0x7ffd1bb551c0) at poll.c:165
#2  0x00007f55b03c4a90 in opal_libevent2022_event_base_loop (base=0x70e7b0, flags=2) at event.c:1630
#3  0x00007f55b02f0144 in opal_progress () at runtime/opal_progress.c:171
#4  0x00007f55b14b4d8b in opal_condition_wait (c=0x7f55b19fec40 <ompi_request_cond>, m=0x7f55b19febc0 <ompi_request_lock>) at ../opal/threads/condition.h:76
#5  0x00007f55b14b531b in ompi_request_default_wait_all (count=2, requests=0x7ffd1bb55370, statuses=0x7ffd1bb55340) at request/req_wait.c:287
#6  0x00007f55b157a225 in ompi_coll_base_sendrecv_zero (dest=1, stag=-16, source=1, rtag=-16, comm=0x601280 <ompi_mpi_comm_world>) at base/coll_base_barrier.c:63
#7  0x00007f55b157a92a in ompi_coll_base_barrier_intra_two_procs (comm=0x601280 <ompi_mpi_comm_world>, module=0x7c2630) at base/coll_base_barrier.c:308
#8  0x00007f55b15aafec in ompi_coll_tuned_barrier_intra_dec_fixed (comm=0x601280 <ompi_mpi_comm_world>, module=0x7c2630) at coll_tuned_decision_fixed.c:196
#9  0x00007f55b14d36fd in PMPI_Barrier (comm=0x601280 <ompi_mpi_comm_world>) at pbarrier.c:63
#10 0x0000000000400b0b in main (argc=1, argv=0x7ffd1bb55658) at mpitest.c:26
(gdb)

Thinking that this might be a bug in the tuned collectives, since that is what the stack shows, I ran the program like this (basically adding the ^tuned part):

[durga@smallMPI ~]$ mpirun -np 2 -hostfile ~/hostfile -mca btl self,tcp -mca pml ob1 -mca coll ^tuned ./mpitest

It still hangs, but now with a different stack trace:

(gdb) bt
#0  0x00007f910d38ac3d in poll () from /lib64/libc.so.6
#1  0x00007f910c815de6 in poll_dispatch (base=0x1a317b0, tv=0x7fff43ee3610) at poll.c:165
#2  0x00007f910c80da90 in opal_libevent2022_event_base_loop (base=0x1a317b0, flags=2) at event.c:1630
#3  0x00007f910c739144 in opal_progress () at runtime/opal_progress.c:171
#4  0x00007f910db130f7 in opal_condition_wait (c=0x7f910de47c40 <ompi_request_cond>, m=0x7f910de47bc0 <ompi_request_lock>) at ../../../../opal/threads/condition.h:76
#5  0x00007f910db132d8 in ompi_request_wait_completion (req=0x1b07680) at ../../../../ompi/request/request.h:383
#6  0x00007f910db1533b in mca_pml_ob1_send (buf=0x0, count=0, datatype=0x7f910de1e340 <ompi_mpi_byte>, dst=1, tag=-16, sendmode=MCA_PML_BASE_SEND_STANDARD, comm=0x601280 <ompi_mpi_comm_world>) at pml_ob1_isend.c:259
#7  0x00007f910d9c3b38 in ompi_coll_base_barrier_intra_basic_linear (comm=0x601280 <ompi_mpi_comm_world>, module=0x1b092c0) at base/coll_base_barrier.c:368
#8  0x00007f910d91c6fd in PMPI_Barrier (comm=0x601280 <ompi_mpi_comm_world>) at pbarrier.c:63
#9  0x0000000000400b0b in main (argc=1, argv=0x7fff43ee3a58) at mpitest.c:26
(gdb)

The mpitest.c program is as follows:

#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char** argv)
{
    int world_size, world_rank, name_len;
    char hostname[MPI_MAX_PROCESSOR_NAME], buf[8];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Get_processor_name(hostname, &name_len);
    printf("Hello world from processor %s, rank %d out of %d processors\n", hostname, world_rank, world_size);
    if (world_rank == 1)
    {
        MPI_Recv(buf, 6, MPI_CHAR, 0, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("%s received %s\n", hostname, buf);
    }
    else
    {
        strcpy(buf, "haha!");
        MPI_Send(buf, 6, MPI_CHAR, 1, 99, MPI_COMM_WORLD);
        printf("%s sent %s\n", hostname, buf);
    }
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}

The hostfile is as follows:

10.10.10.10 slots=1
10.10.10.11 slots=1

The two nodes are connected by three physical and three logical networks:

Physical: Gigabit Ethernet, 10G iWARP, 20G InfiniBand
Logical: IP (all three), PSM (QLogic InfiniBand), verbs (iWARP and InfiniBand)

Please note again that this is a fresh, brand new clone.

Is this a bug (perhaps a side effect of --disable-dlopen) or something I am doing wrong?

Thanks
Durga

We learn from history that we never learn from history.

_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2016/04/28949.php
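One further experiment, prompted by the question above of whether a 4-byte send goes out eagerly: send one message that is certainly below the TCP eager limit and one that is certainly above it, so the eager and rendezvous paths are exercised separately. This is only an illustrative sketch (the 4 MiB size is arbitrary, and the actual eager limit should be checked with ompi_info), not code from the thread:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

/* eager_vs_rendezvous.c - one tiny message and one large message between rank 0 and rank 1 */
int main(int argc, char *argv[])
{
    const int big = 4 * 1024 * 1024;   /* assumed to be well above the TCP eager limit */
    char small = 'x';
    char *large = malloc(big);
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (large == NULL)
        MPI_Abort(MPI_COMM_WORLD, 1);

    if (rank == 0) {
        MPI_Send(&small, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        printf("small (eager-sized) send completed\n");
        MPI_Send(large, big, MPI_CHAR, 1, 1, MPI_COMM_WORLD);
        printf("large (rendezvous-sized) send completed\n");
    } else if (rank == 1) {
        MPI_Recv(&small, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(large, big, MPI_CHAR, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("both receives completed\n");
    }

    free(large);
    MPI_Finalize();
    return 0;
}

If the small message completes but the large one hangs (or vice versa), that narrows down which part of the ob1/TCP path is getting stuck.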