As far as I am concerned, I would consider that a bug:

since the link is down, the PSM component should simply disqualify itself.

I will follow up on this on the devel ML.
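
Until that change lands, a possible workaround, based on the MCA component-selection mechanism rather than a confirmed fix, is to exclude the PSM MTL explicitly so the cm PML cannot select it while the link is down:

mpirun -np 2 -mca mtl ^psm -mca btl self,sm ./mpi_hello_master_slave

Note that, per scenario 4 below, forcing the ob1 PML still ran into a separate barrier hang, so this would only sidestep the PSM connect hang.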


Cheers,


Gilles


On 4/25/2016 10:36 AM, dpchoudh . wrote:
Hello Gilles

Thank you for finding the bug; it was not there in the original code; I added it while trying to 'simplify' the code.

With the bug fixed, the code now runs in the last scenario, but it still hangs with the following command line (even after updating to the latest git tree, rebuilding, and reinstalling):

mpirun -np 2 -mca btl self,sm ./mpi_hello_master_slave

and the stack is still as before:

(gdb) bt
#0  0x00007f4e4bd60117 in sched_yield () from /lib64/libc.so.6
#1  0x00007f4e4ba3d875 in amsh_ep_connreq_wrap () from /lib64/libpsm_infinipath.so.1
#2  0x00007f4e4ba3e254 in amsh_ep_connect () from /lib64/libpsm_infinipath.so.1
#3  0x00007f4e4ba470df in psm_ep_connect () from /lib64/libpsm_infinipath.so.1
#4  0x00007f4e4c4c8975 in ompi_mtl_psm_add_procs (mtl=0x7f4e4c846500 <ompi_mtl_psm>, nprocs=2, procs=0x23bb420)
    at mtl_psm.c:312
#5  0x00007f4e4c52ef6b in mca_pml_cm_add_procs (procs=0x23bb420, nprocs=2) at pml_cm.c:134
#6  0x00007f4e4c2e7d0f in ompi_mpi_init (argc=1, argv=0x7fffe930f9b8, requested=0, provided=0x7fffe930f78c)
    at runtime/ompi_mpi_init.c:770
#7  0x00007f4e4c324aff in PMPI_Init (argc=0x7fffe930f7bc, argv=0x7fffe930f7b0) at pinit.c:66
#8  0x000000000040101f in main (argc=1, argv=0x7fffe930f9b8) at mpi_hello_master_slave.c:94

As you can see, OMPI is trying to communicate over the PSM link, even though the link is down and PSM is not mentioned in the arguments to mpirun. (There is not even more than one node mentioned in the arguments.)
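
For context, -mca btl self,sm only restricts the BTL framework used by the ob1 PML; PSM is reached through the cm PML and the MTL framework, which is selected independently. A diagnostic sketch (the verbosity levels here are arbitrary) to see which MTLs are built and which PML/MTL actually get selected:

ompi_info | grep -i mtl
mpirun -np 2 -mca pml_base_verbose 10 -mca mtl_base_verbose 10 -mca btl self,sm ./mpi_hello_master_slave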

Is this the expected behaviour or is it a bug?

Thanks
Durga

1% of the executables have 99% of CPU privilege!
Userspace code! Unite!! Occupy the kernel!!!

On Sun, Apr 24, 2016 at 8:12 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:

    Two comments:

    - the program is incorrect: slave() should MPI_Recv(...,
      MPI_ANY_TAG, ...) (see the sketch after these comments)

    - current master uses pmix114, and your traces mention pmix120,
      so either your master is out of sync or pmix120 is an old module
      that was not manually removed.
      FWIW, once in a while I run
      rm -rf /.../ompi_install_dir/lib/openmpi
      to get rid of the removed modules.
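
    A minimal sketch of what the corrected receive in slave() might look
    like. The 255-character buffer and master rank 0 follow the backtraces
    above; the tag used by the master and the surrounding program are
    assumptions, since the attached program is not reproduced here:

        /* hypothetical reconstruction of the fix: receive with MPI_ANY_TAG */
        #include <mpi.h>
        #include <stdio.h>

        static void slave(void)
        {
            char msg[256];   /* room for the 255-char payload seen in the backtrace */
            MPI_Status status;

            /* accept whatever tag the master attached to the message */
            MPI_Recv(msg, 255, MPI_CHAR, 0 /* master rank, per the backtrace */,
                     MPI_ANY_TAG, MPI_COMM_WORLD, &status);
            printf("slave received tag %d from rank %d: %s\n",
                   status.MPI_TAG, status.MPI_SOURCE, msg);
        }

        int main(int argc, char **argv)
        {
            int rank;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            if (rank == 0) {
                char greeting[] = "hello from the master";
                /* the tag value (3) is arbitrary; MPI_ANY_TAG on the
                   receiving side is what makes slave() tolerant of it */
                MPI_Send(greeting, (int)sizeof(greeting), MPI_CHAR, 1, 3,
                         MPI_COMM_WORLD);
            } else {
                slave();
            }
            MPI_Finalize();
            return 0;
        }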

    Cheers,

    Gilles


    On 4/25/2016 7:34 AM, dpchoudh . wrote:
    Hello all

    Attached is a simple MPI program (a modified version of a similar
    program that was posted by another user). This program, when run
    on a single-node machine, hangs most of the time, as follows (in
    all cases, the OS was CentOS 7):

    Scenario 1: OMPI v1.10, single-socket quad-core machine, with a
    Chelsio T3 card (link down) and GigE (link up)

    mpirun -np 2 <progname>
    Backtraces of the two spawned processes are as follows:

    (gdb) bt
    #0  0x00007f6471647aba in mca_btl_vader_component_progress () at
    btl_vader_component.c:708
    #1  0x00007f6475c6722a in opal_progress () at
    runtime/opal_progress.c:187
    #2  0x00007f64767b7685 in opal_condition_wait (c=<optimized out>,
    m=<optimized out>)
        at ../opal/threads/condition.h:78
    #3  ompi_request_default_wait_all (count=2,
    requests=0x7ffd1d921530, statuses=0x7ffd1d921540)
        at request/req_wait.c:281
    #4  0x00007f64709dd591 in ompi_coll_tuned_sendrecv_zero
    (stag=-16, rtag=-16,
        comm=<optimized out>, source=1, dest=1) at
    coll_tuned_barrier.c:78
    #5  ompi_coll_tuned_barrier_intra_two_procs (comm=0x6022c0
    <ompi_mpi_comm_world>,
        module=<optimized out>) at coll_tuned_barrier.c:324
    #6  0x00007f64767c92e6 in PMPI_Barrier (comm=0x6022c0
    <ompi_mpi_comm_world>) at pbarrier.c:70
    #7  0x00000000004010bd in main (argc=1, argv=0x7ffd1d9217d8) at
    mpi_hello_master_slave.c:115
    (gdb)


    (gdb) bt
    #0  mca_pml_ob1_progress () at pml_ob1_progress.c:45
    #1  0x00007feeae7dc22a in opal_progress () at
    runtime/opal_progress.c:187
    #2  0x00007feea9e125c5 in opal_condition_wait (c=<optimized out>,
    m=<optimized out>)
        at ../../../../opal/threads/condition.h:78
    #3  ompi_request_wait_completion (req=0xe55200) at
    ../../../../ompi/request/request.h:381
    #4  mca_pml_ob1_recv (addr=<optimized out>, count=255,
    datatype=<optimized out>,
        src=<optimized out>, tag=<optimized out>, comm=<optimized
    out>, status=0x7fff4a618000)
        at pml_ob1_irecv.c:118
    #5  0x00007feeaf35068f in PMPI_Recv (buf=0x7fff4a618020, count=255,
        type=0x6020c0 <ompi_mpi_char>, source=<optimized out>,
    tag=<optimized out>,
        comm=0x6022c0 <ompi_mpi_comm_world>, status=0x7fff4a618000)
    at precv.c:78
    #6  0x0000000000400d49 in slave () at mpi_hello_master_slave.c:67
    #7  0x00000000004010b3 in main (argc=1, argv=0x7fff4a6184d8) at
    mpi_hello_master_slave.c:113
    (gdb)


    Scenario 2:
    Dual-socket hex-core machine with QLogic IB, Chelsio iWARP, and
    Fibre Channel (all links down) and GigE (link up), Open MPI
    compiled from the master branch. It crashes as follows:

    [durga@smallMPI Desktop]$ mpirun -np 2 ./mpi_hello_master_slave

    mpi_hello_master_slave:39570 terminated with signal 11 at PC=20
    SP=7ffd438c00b8. Backtrace:

    mpi_hello_master_slave:39571 terminated with signal 11 at PC=20
    SP=7ffee5903e08. Backtrace:
    -------------------------------------------------------
    Primary job  terminated normally, but 1 process returned
    a non-zero exit code. Per user-direction, the job has been aborted.
    -------------------------------------------------------
    --------------------------------------------------------------------------
    mpirun noticed that process rank 0 with PID 0 on node smallMPI
    exited on signal 11 (Segmentation fault).
    --------------------------------------------------------------------------

    Scenario 3:
    Exactly the same as scenario 2, but with a more explicit command
    line:

    [durga@smallMPI Desktop]$ mpirun -np 2 -mca btl self,sm
    ./mpi_hello_master_slave
    This hangs with the following backtraces (one per process):

    (gdb) bt
    #0  0x00007ff6639f049d in nanosleep () from /lib64/libc.so.6
    #1  0x00007ff663a210d4 in usleep () from /lib64/libc.so.6
    #2  0x00007ff662f72796 in OPAL_PMIX_PMIX120_PMIx_Fence
    (procs=0x0, nprocs=0, info=0x0, ninfo=0)
        at src/client/pmix_client_fence.c:100
    #3  0x00007ff662f4f0bc in pmix120_fence (procs=0x0,
    collect_data=0) at pmix120_client.c:255
    #4  0x00007ff663f941af in ompi_mpi_init (argc=1,
    argv=0x7ffc18c9afd8, requested=0, provided=0x7ffc18c9adac)
        at runtime/ompi_mpi_init.c:813
    #5  0x00007ff663fc9c33 in PMPI_Init (argc=0x7ffc18c9addc,
    argv=0x7ffc18c9add0) at pinit.c:66
    #6  0x000000000040101f in main (argc=1, argv=0x7ffc18c9afd8) at
    mpi_hello_master_slave.c:94
    (gdb) q

    (gdb) bt
    #0  0x00007f5af7646117 in sched_yield () from /lib64/libc.so.6
    #1  0x00007f5af7323875 in amsh_ep_connreq_wrap () from
    /lib64/libpsm_infinipath.so.1
    #2  0x00007f5af7324254 in amsh_ep_connect () from
    /lib64/libpsm_infinipath.so.1
    #3  0x00007f5af732d0df in psm_ep_connect () from
    /lib64/libpsm_infinipath.so.1
    #4  0x00007f5af7d94a69 in ompi_mtl_psm_add_procs
    (mtl=0x7f5af80f8500 <ompi_mtl_psm>, nprocs=2, procs=0xf53e60)
        at mtl_psm.c:312
    #5  0x00007f5af7df3630 in mca_pml_cm_add_procs (procs=0xf53e60,
    nprocs=2) at pml_cm.c:134
    #6  0x00007f5af7bcc0d1 in ompi_mpi_init (argc=1,
    argv=0x7ffc485a2f98, requested=0, provided=0x7ffc485a2d6c)
        at runtime/ompi_mpi_init.c:777
    #7  0x00007f5af7c01c33 in PMPI_Init (argc=0x7ffc485a2d9c,
    argv=0x7ffc485a2d90) at pinit.c:66
    #8  0x000000000040101f in main (argc=1, argv=0x7ffc485a2f98) at
    mpi_hello_master_slave.c:94

    This seems to suggest that it is trying to connect over PSM even
    though the link was down and PSM was not mentioned on the command
    line. Is this behavior expected?


    Scenario 4:
    Exactly the same as scenario 3, but with an even more explicit
    command line:

    [durga@smallMPI Desktop]$ mpirun -np 2 -mca btl self,sm -mca pml
    ob1 ./mpi_hello_master_slave

    This hangs towards the end, after printing the output (as opposed
    to scenario 3, which hangs at the connection setup stage without
    printing anything).

    Process 0 of 2 running on host smallMPI


    Now 1 slave tasks are sending greetings.

    Process 1 of 2 running on host smallMPI
    Greetings from task 1:
      message type:        3
      msg length:          141 characters
      message:
        hostname:          smallMPI
        operating system:  Linux
        release: 3.10.0-327.13.1.el7.x86_64
        processor:         x86_64


    Backtraces of the two processes are as follows:

    (gdb) bt
    #0  opal_timer_base_get_usec_clock_gettime () at
    timer_linux_component.c:180
    #1  0x00007f10f46e50e4 in opal_progress () at
    runtime/opal_progress.c:161
    #2  0x00007f10f58a9d8b in opal_condition_wait (c=0x7f10f5df3c40
    <ompi_request_cond>,
        m=0x7f10f5df3bc0 <ompi_request_lock>) at
    ../opal/threads/condition.h:76
    #3  0x00007f10f58aa31b in ompi_request_default_wait_all (count=2,
    requests=0x7ffe7edd5a80,
        statuses=0x7ffe7edd5a50) at request/req_wait.c:287
    #4  0x00007f10f596f225 in ompi_coll_base_sendrecv_zero (dest=1,
    stag=-16, source=1, rtag=-16,
        comm=0x6022c0 <ompi_mpi_comm_world>) at
    base/coll_base_barrier.c:63
    #5  0x00007f10f596f92a in ompi_coll_base_barrier_intra_two_procs
    (comm=0x6022c0 <ompi_mpi_comm_world>,
        module=0xd5a7f0) at base/coll_base_barrier.c:308
    #6  0x00007f10f599ffec in ompi_coll_tuned_barrier_intra_dec_fixed
    (comm=0x6022c0 <ompi_mpi_comm_world>,
        module=0xd5a7f0) at coll_tuned_decision_fixed.c:196
    #7  0x00007f10f58c86fd in PMPI_Barrier (comm=0x6022c0
    <ompi_mpi_comm_world>) at pbarrier.c:63
    #8  0x00000000004010bd in main (argc=1, argv=0x7ffe7edd5d48) at
    mpi_hello_master_slave.c:115


    (gdb) bt
    #0  0x00007fffe9d6a988 in clock_gettime ()
    #1  0x00007f704bf64edd in clock_gettime () from /lib64/libc.so.6
    #2  0x00007f704b4deea5 in opal_timer_base_get_usec_clock_gettime
    () at timer_linux_component.c:183
    #3  0x00007f704b2f50e4 in opal_progress () at
    runtime/opal_progress.c:161
    #4  0x00007f704c6cc39c in opal_condition_wait (c=0x7f704ca03c40
    <ompi_request_cond>,
        m=0x7f704ca03bc0 <ompi_request_lock>) at
    ../../../../opal/threads/condition.h:76
    #5  0x00007f704c6cc560 in ompi_request_wait_completion
    (req=0x165e580) at ../../../../ompi/request/request.h:383
    #6  0x00007f704c6cd724 in mca_pml_ob1_recv (addr=0x7fffe9cafa10,
    count=255, datatype=0x6020c0 <ompi_mpi_char>,
        src=0, tag=1, comm=0x6022c0 <ompi_mpi_comm_world>,
    status=0x7fffe9caf9f0) at pml_ob1_irecv.c:123
    #7  0x00007f704c4ff434 in PMPI_Recv (buf=0x7fffe9cafa10,
    count=255, type=0x6020c0 <ompi_mpi_char>, source=0,
        tag=1, comm=0x6022c0 <ompi_mpi_comm_world>,
    status=0x7fffe9caf9f0) at precv.c:79
    #8  0x0000000000400d49 in slave () at mpi_hello_master_slave.c:67
    #9  0x00000000004010b3 in main (argc=1, argv=0x7fffe9cafec8) at
    mpi_hello_master_slave.c:113
    (gdb) q

    I am going to try the tarball shortly, but hopefully someone can
    glean some insight from this information. BTW, the code was
    compiled with the following flags:

    -Wall -Wextra -g3 -O0

    Let me reiterate that NO network communication was involved in any
    of these experiments; they were all single-node, shared-memory (sm
    btl) jobs.

    Thanks
    Durga



    1% of the executables have 99% of CPU privilege!
    Userspace code! Unite!! Occupy the kernel!!!






