Thanks for the tests!

What was fixed in Open MPI is the handling of a disconnected InfiniPath port.

Restoring the signal handlers when libinfinipath.so is unloaded (when
mca_mtl_psm.so is unloaded, from our point of view) can only be fixed within
libinfinipath.so. It might have already been fixed in the latest OFED
versions, but I am not sure about that ...
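
To illustrate the kind of fix I mean (just a sketch of the general pattern,
not the actual libinfinipath code; every name below is made up): the library
constructor should remember the previous disposition and the destructor
should put it back, so nothing keeps pointing into code that is about to be
unmapped:

#include <signal.h>
#include <string.h>
#include <unistd.h>

/* disposition of SIGSEGV as found before this library changed it */
static struct sigaction saved_sigsegv;

static void example_handler(int sig)
{
    /* a real library would dump its backtrace here */
    _exit(128 + sig);
}

__attribute__((constructor))
static void lib_install_handlers(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = example_handler;
    sigemptyset(&sa.sa_mask);
    /* save whatever handler was installed before us */
    sigaction(SIGSEGV, &sa, &saved_sigsegv);
}

__attribute__((destructor))
static void lib_restore_handlers(void)
{
    /* restore the previous handler when the library is unloaded */
    sigaction(SIGSEGV, &saved_sigsegv, NULL);
}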

Cheers,

Gilles

On Thursday, May 12, 2016, dpchoudh . <dpcho...@gmail.com> wrote:

> <quote>
>
> If you configure with --disable-dlopen, then libinfinipath.so is slurped
> and hence the infinipath signal handler is always set, even if you disable
> the psm mtl or choose to only use the ob1 pml.
> if you do not configure with --disable-dlopen, then the infinipath signal
> handler is set when mca_mtl_psm.so is loaded. and it is not loaded if it is
> disabled or if only ob1 is used.
> </quote>
>
> Aah, I see. But you said that this was recently fixed, right? (I mean, the
> signal handlers are now uninstalled if PSM is unloaded). I do have the
> latest from master.
>
> I tried your patches, and *both* of them fix the crash. In case it is
> useful, I am attaching the console output after applying each patch (the
> output from the app proper is omitted).
>
> <From patch 1>
> [durga@smallMPI ~]$ mpirun -np 2  ./mpitest
> --------------------------------------------------------------------------
> WARNING: There is at least non-excluded one OpenFabrics device found,
> but there are no active ports detected (or Open MPI was unable to use
> them).  This is most certainly not what you wanted.  Check your
> cables, subnet manager configuration, etc.  The openib BTL will be
> ignored for this job.
>
>   Local host: smallMPI
> --------------------------------------------------------------------------
> smallMPI.26487PSM found 0 available contexts on InfiniPath device(s).
> (err=21)
> smallMPI.26488PSM found 0 available contexts on InfiniPath device(s).
> (err=21)
>
>
> <From patch 2>
>
> [durga@smallMPI ~]$ mpirun -np 2  ./mpitest
> --------------------------------------------------------------------------
> WARNING: There is at least non-excluded one OpenFabrics device found,
> but there are no active ports detected (or Open MPI was unable to use
> them).  This is most certainly not what you wanted.  Check your
> cables, subnet manager configuration, etc.  The openib BTL will be
> ignored for this job.
>
>   Local host: smallMPI
> --------------------------------------------------------------------------
> smallMPI.7486PSM found 0 available contexts on InfiniPath device(s).
> (err=21)
> smallMPI.7487PSM found 0 available contexts on InfiniPath device(s).
> (err=21)
>
>
> The surgeon general advises you to eat right, exercise regularly and quit
> ageing.
>
> On Thu, May 12, 2016 at 4:29 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>
>> If you configure with --disable-dlopen, then libinfinipath.so is slurped
>> and hence the infinipath signal handler is always set, even if you disable
>> the psm mtl or choose to only use the ob1 pml.
>>
>> if you do not configure with --disable-dlopen, then the infinipath signal
>> handler is set when mca_mtl_psm.so is loaded. and it is not loaded if it is
>> disabled or if only ob1 is used.
>>
>> it seems some verbs destructors are called twice here.
>>
>> can you please give the attached patches a try?
>>
>> /* they are exclusive, i.e. you should only apply one at a time */
>>
>>
>> Cheers,
>>
>>
>> Gilles
>> On 5/12/2016 4:54 PM, dpchoudh . wrote:
>>
>> Hello Gilles
>>
>> I am not sure if I understand you correctly, but let me answer based on
>> what I think you mean:
>>
>> <quote>
>> the infinipath signal handler only dumps the stack (into a .btr file, yeah!)
>> so if your application crashes without it, you should examine the core
>> file and see what is going wrong.
>> </quote>
>>
>> If this is true, then there is a bug in OMPI proper, since it is crashing
>> inside MPI_Init(). Here is the stack:
>>
>> (gdb) bt
>> #0  0x00007ff3104ac7d8 in main_arena () from /lib64/libc.so.6
>> #1  0x00007ff30f6869ac in device_destruct (device=0x1284b30) at
>> btl_openib_component.c:985
>> #2  0x00007ff30f6820ae in opal_obj_run_destructors (object=0x1284b30) at
>> ../../../../opal/class/opal_object.h:460
>> #3  0x00007ff30f689d3c in init_one_device (btl_list=0x7fff96c3a200,
>> ib_dev=0x12843f0) at btl_openib_component.c:2255
>> #4  0x00007ff30f68b800 in btl_openib_component_init
>> (num_btl_modules=0x7fff96c3a330, enable_progress_threads=true,
>>     enable_mpi_threads=false) at btl_openib_component.c:2752
>> #5  0x00007ff30f648971 in mca_btl_base_select
>> (enable_progress_threads=true, enable_mpi_threads=false) at
>> base/btl_base_select.c:110
>> #6  0x00007ff3108100a0 in mca_bml_r2_component_init
>> (priority=0x7fff96c3a3fc, enable_progress_threads=true,
>> enable_mpi_threads=false)
>>     at bml_r2_component.c:86
>> #7  0x00007ff31080d033 in mca_bml_base_init
>> (enable_progress_threads=true, enable_mpi_threads=false) at
>> base/bml_base_init.c:74
>> #8  0x00007ff310754675 in ompi_mpi_init (argc=1, argv=0x7fff96c3a7b8,
>> requested=0, provided=0x7fff96c3a56c)
>>     at runtime/ompi_mpi_init.c:590
>> #9  0x00007ff3107918b7 in PMPI_Init (argc=0x7fff96c3a5ac,
>> argv=0x7fff96c3a5a0) at pinit.c:66
>> #10 0x0000000000400aa0 in main (argc=1, argv=0x7fff96c3a7b8) at
>> mpitest.c:17
>>
>> As you can see, the crash happens inside the verbs library and the
>> following gets printed to the console:
>>
>> [durga@smallMPI ~]$ mpirun -np 2 ./mpitest
>> [smallMPI:05754] *** Process received signal ***
>> [smallMPI:05754] Signal: Segmentation fault (11)
>> [smallMPI:05754] Signal code: Invalid permissions (2)
>> [smallMPI:05754] Failing at address: 0x7ff3104ac7d8
>>
>> That sort of tells me that perhaps the signal handler does more than
>> simply print the stack; it might be manipulating page permissions (since I
>> see a different behaviour when PSM signal handlers are enabled).
>>
>> The MPI app that I am running is a simple program and it runs fine with
>> the workaround you mention.
>>
>> <quote>
>> note the infinipath signal handler is set in the constructor of
>> libinfinipath.so,
>> and used *not* to be removed in the destructor.
>> that means that if the signal handler is invoked *after* the pml MTL
>> is unloaded, a crash will likely occur because the psm signal handler
>> is likely pointing to unmapped memory.
>> </quote>
>>
>> But during normal operation, this should not be an issue, right? The
>> signal handler, even if it points to unmapped memory, is being invoked in
>> response to an exception that will kill the process anyway. The only side
>> effect of this I see is that the stack will be misleading. In any case, I
>> am compiling with --disable-dlopen set, so my understanding is that since
>> all the components are slurped into one giant .so file, the memory will not
>> be unmapped.
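>>
>> (As an aside, here is a quick way I can think of to check which shared
>> object the currently installed SIGSEGV handler lives in, i.e. whether it
>> could ever become a dangling pointer; this is only my own little sketch,
>> not Open MPI code, and may need -ldl when linking:)
>>
>> #define _GNU_SOURCE
>> #include <dlfcn.h>
>> #include <signal.h>
>> #include <stdio.h>
>>
>> int main(void)
>> {
>>     struct sigaction sa;
>>     Dl_info info;
>>     /* query the current SIGSEGV disposition without changing it */
>>     if (sigaction(SIGSEGV, NULL, &sa) != 0)
>>         return 1;
>>     void *h = (sa.sa_flags & SA_SIGINFO) ? (void *)sa.sa_sigaction
>>                                          : (void *)sa.sa_handler;
>>     if (h == (void *)SIG_DFL || h == (void *)SIG_IGN)
>>         printf("SIGSEGV: default or ignored\n");
>>     else if (dladdr(h, &info) && info.dli_fname)
>>         /* dli_fname is the shared object containing the handler */
>>         printf("SIGSEGV handler %p is in %s\n", h, info.dli_fname);
>>     return 0;
>> }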
>>
>> <quote>
>> on top of that, there used to be a bug if some PSM device is detected
>> but with no link (e.g. crash)
>> with the latest ompi master, this bug should be fixed (e.g. no crash)
>> this means the PSM mtl should disqualify itself if there is no link on
>> any of the PSM ports, so, unless your infinipath library is fixed or
>> you configure'd with --disable-dlopen, you will run into trouble if
>> the ipath signal handler is invoked.
>>
>> can you confirm you have the latest master and there is no link on
>> your ipath device ?
>>
>> what does
>> grep ACTIVE /sys/class/infiniband/qib*/ports/*/state
>> return?
>> </quote>
>>
>> I confirm that I have the latest from master (by running 'git pull').
>> Also, I have a single QLogic card with a single port and here is the output:
>> [durga@smallMPI ~]$ cat /sys/class/infiniband/qib0/ports/1/state
>> 1: DOWN
>>
>> <quote>
>> if you did not configure with --disable-dlopen *and* you do not need
>> the psm mtl, you can
>> mpirun --mca mtl ^psm ...
>> or if you do not need any mtl at all
>> mpirun --mca pml ob1 ...
>> should be enough
>> </quote>
>>
>> I did configure with --disable-dlopen, but why does that make a
>> difference? This is the part that I don't understand.
>> And yes, I do have a reasonable workaround now, but I am passing on my
>> observations so that if there is a bug, the developers can fix it, or if I
>> am doing something wrong, then they can correct me.
>>
>> The surgeon general advises you to eat right, exercise regularly and quit
>> ageing.
>>
>> On Thu, May 12, 2016 at 12:38 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
>>
>>> Durga,
>>>
>>> the infinipath signal handler only dumps the stack (into a .btr file, yeah!)
>>> so if your application crashes without it, you should examine the core
>>> file and see what is going wrong.
>>>
>>> note the infinipath signal handler is set in the constructor of
>>> libinfinipath.so,
>>> and used *not* to be removed in the destructor.
>>> that means that if the signal handler is invoked *after* the pml MTL
>>> is unloaded, a crash will likely occur because the psm signal handler
>>> is likely pointing to unmapped memory.
>>>
>>> on top of that, there used to be a bug if some PSM device is detected
>>> but with no link (e.g. crash)
>>> with the latest ompi master, this bug should be fixed (e.g. no crash)
>>> this means the PSM mtl should disqualify itself if there is no link on
>>> any of the PSM ports, so, unless your infinipath library is fixed or
>>> you configure'd with --disable-dlopen, you will run into trouble if
>>> the ipath signal handler is invoked.
>>>
>>> can you confirm you have the latest master and there is no link on
>>> your ipath device ?
>>>
>>> what does
>>> grep ACTIVE /sys/class/infiniband/qib*/ports/*/state
>>> return?
>>>
>>> if you did not configure with --disable-dlopen *and* you do not need
>>> the psm mtl, you can
>>> mpirun --mca mtl ^psm ...
>>> or if you do not need any mtl at all
>>> mpirun --mca pml ob1 ...
>>> should be enough
>>>
>>> Cheers,
>>>
>>> Gilles
>>>
>>> commit 4d026e223ce717345712e669d26f78ed49082df6
>>> Merge: f8facb1 4071719
>>> Author: rhc54 <r...@open-mpi.org>
>>> Date:   Wed May 11 17:43:17 2016 -0700
>>>
>>>     Merge pull request #1661 from matcabral/master
>>>
>>>     PSM and PSM2 MTLs to detect drivers and link
>>>
>>>
>>> On Thu, May 12, 2016 at 12:42 PM, dpchoudh . <dpcho...@gmail.com> wrote:
>>> > Sorry for belabouring this, but this (hopefully final!) piece of
>>> > information might be of interest to the developers:
>>> >
>>> > There must be a reason why PSM is installing its signal handlers; often
>>> > this is done to modify the permission of a page in response to a SEGV
>>> > and attempt access again. By disabling the handlers, I am preventing the
>>> > library from doing that, and here is what it tells me:
>>> >
>>> > [durga@smallMPI ~]$ mpirun -np 2  ./mpitest
>>> > [smallMPI:20496] *** Process received signal ***
>>> > [smallMPI:20496] Signal: Segmentation fault (11)
>>> > [smallMPI:20496] Signal code: Invalid permissions (2)
>>> > [smallMPI:20496] Failing at address: 0x7f0b2fdb57d8
>>> > [smallMPI:20496] [ 0] /lib64/libpthread.so.0(+0xf100)[0x7f0b2fdcb100]
>>> > [smallMPI:20496] [ 1] /lib64/libc.so.6(+0x3ba7d8)[0x7f0b2fdb57d8]
>>> > [smallMPI:20496] *** End of error message ***
>>> > [smallMPI:20497] *** Process received signal ***
>>> > [smallMPI:20497] Signal: Segmentation fault (11)
>>> > [smallMPI:20497] Signal code: Invalid permissions (2)
>>> > [smallMPI:20497] Failing at address: 0x7fbfe2b387d8
>>> > [smallMPI:20497] [ 0] /lib64/libpthread.so.0(+0xf100)[0x7fbfe2b4e100]
>>> > [smallMPI:20497] [ 1] /lib64/libc.so.6(+0x3ba7d8)[0x7fbfe2b387d8]
>>> > [smallMPI:20497] *** End of error message ***
>>> > -------------------------------------------------------
>>> > Primary job  terminated normally, but 1 process returned
>>> > a non-zero exit code. Per user-direction, the job has been aborted.
>>> > -------------------------------------------------------
>>> >
>>> > However, even without disabling it, it crashes anyway, as follows:
>>> >
>>> > unset IPATH_NO_BACKTRACE
>>> >
>>> > [durga@smallMPI ~]$ mpirun -np 2  ./mpitest
>>> >
>>> > mpitest:22009 terminated with signal 11 at PC=7f908bb2a7d8
>>> SP=7ffebb4ee5b8.
>>> > Backtrace:
>>> > /lib64/libc.so.6(+0x3ba7d8)[0x7f908bb2a7d8]
>>> >
>>> > mpitest:22010 terminated with signal 11 at PC=7f7a2caa67d8
>>> SP=7ffd73dec3e8.
>>> > Backtrace:
>>> > /lib64/libc.so.6(+0x3ba7d8)[0x7f7a2caa67d8]
>>> >
>>> > The PC is at a different location but I do not have any more information
>>> > without a core file.
>>> >
>>> > It seems like some interaction between OMPI and the PSM library is
>>> > incorrect. I'll let the developers figure it out :-)
>>> >
>>> >
>>> > Thanks
>>> > Durga
>>> >
>>> >
>>> >
>>> >
>>> > The surgeon general advises you to eat right, exercise regularly and
>>> quit
>>> > ageing.
>>> >
>>> > On Wed, May 11, 2016 at 11:23 PM, dpchoudh . <dpcho...@gmail.com> wrote:
>>> >>
>>> >> Hello Gilles
>>> >>
>>> >> Mystery solved! In fact, this one line is exactly what was needed!! It
>>> >> turns out the OMPI signal handlers are irrelevant (i.e. they don't make
>>> >> any difference when this env variable is set).
>>> >>
>>> >> This explains:
>>> >>
>>> >> 1. The difference in the behaviour in the two clusters (one has PSM, the
>>> >> other does not)
>>> >> 2. Why you couldn't find where in the OMPI code the .btr files are being
>>> >> generated (looks like they are being generated in the PSM library)
>>> >>
>>> >> And, now that I can get a core file (finally!), I can present the
>>> >> backtrace where the crash in MPI_Init() is happening. It is as follows:
>>> >>
>>> >> #0  0x00007f1c114977d8 in main_arena () from /lib64/libc.so.6
>>> >> #1  0x00007f1c106719ac in device_destruct (device=0x1c85b70) at
>>> >> btl_openib_component.c:985
>>> >> #2  0x00007f1c1066d0ae in opal_obj_run_destructors (object=0x1c85b70)
>>> at
>>> >> ../../../../opal/class/opal_object.h:460
>>> >> #3  0x00007f1c10674d3c in init_one_device (btl_list=0x7ffd00dada50,
>>> >> ib_dev=0x1c85430) at btl_openib_component.c:2255
>>> >> #4  0x00007f1c10676800 in btl_openib_component_init
>>> >> (num_btl_modules=0x7ffd00dadb80, enable_progress_threads=true,
>>> >> enable_mpi_threads=false)
>>> >>     at btl_openib_component.c:2752
>>> >> #5  0x00007f1c10633971 in mca_btl_base_select
>>> >> (enable_progress_threads=true, enable_mpi_threads=false) at
>>> >> base/btl_base_select.c:110
>>> >> #6  0x00007f1c117fb0a0 in mca_bml_r2_component_init
>>> >> (priority=0x7ffd00dadc4c, enable_progress_threads=true,
>>> >> enable_mpi_threads=false)
>>> >>     at bml_r2_component.c:86
>>> >> #7  0x00007f1c117f8033 in mca_bml_base_init
>>> (enable_progress_threads=true,
>>> >> enable_mpi_threads=false) at base/bml_base_init.c:74
>>> >> #8  0x00007f1c1173f675 in ompi_mpi_init (argc=1, argv=0x7ffd00dae008,
>>> >> requested=0, provided=0x7ffd00daddbc) at runtime/ompi_mpi_init.c:590
>>> >> #9  0x00007f1c1177c8b7 in PMPI_Init (argc=0x7ffd00daddfc,
>>> >> argv=0x7ffd00daddf0) at pinit.c:66
>>> >> #10 0x0000000000400aa0 in main (argc=1, argv=0x7ffd00dae008) at
>>> >> mpitest.c:17
>>> >>
>>> >> This is with the absolute latest code from master.
>>> >>
>>> >> Thanks everyone for their help.
>>> >>
>>> >> Durga
>>> >>
>>> >> The surgeon general advises you to eat right, exercise regularly and
>>> quit
>>> >> ageing.
>>> >>
>>> >> On Wed, May 11, 2016 at 10:55 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>>> >>>
>>> >>> Note the psm library sets its own signal handler, possibly after the
>>> >>> OpenMPI one.
>>> >>>
>>> >>> that can be disabled by
>>> >>>
>>> >>> export IPATH_NO_BACKTRACE=1
>>> >>>
>>> >>> Cheers,
>>> >>>
>>> >>> Gilles
>>> >>>
>>> >>>
>>> >>> On 5/12/2016 11:34 AM, dpchoudh . wrote:
>>> >>>
>>> >>> Hello Gilles
>>> >>>
>>> >>> Thank you for your continued support. With your help, I have a better
>>> >>> understanding of what is happening. Here are the details.
>>> >>>
>>> >>> 1. Yes, I am sure that ulimit -c is 'unlimited' (and for the test in
>>> >>> question, I am running it on a single node, so there are no other
>>> nodes)
>>> >>>
>>> >>> 2. The command I am running is possibly the simplest MPI command:
>>> >>> mpirun -np 2 <program>
>>> >>>
>>> >>> It looks to me, after running your test code, that what is crashing is
>>> >>> MPI_Init() itself. The output from your code (I called it 'btrtest') is
>>> >>> as follows:
>>> >>>
>>> >>> [durga@smallMPI ~]$ mpirun -np 2 ./btrtest
>>> >>> before MPI_Init : -1 -1
>>> >>> before MPI_Init : -1 -1
>>> >>>
>>> >>> btrtest:7275 terminated with signal 11 at PC=7f401f49e7d8
>>> >>> SP=7ffec47e7578.  Backtrace:
>>> >>> /lib64/libc.so.6(+0x3ba7d8)[0x7f401f49e7d8]
>>> >>>
>>> >>> btrtest:7274 terminated with signal 11 at PC=7f1ba21897d8
>>> >>> SP=7ffc516ac8d8.  Backtrace:
>>> >>> /lib64/libc.so.6(+0x3ba7d8)[0x7f1ba21897d8]
>>> >>> -------------------------------------------------------
>>> >>> Primary job  terminated normally, but 1 process returned
>>> >>> a non-zero exit code. Per user-direction, the job has been aborted.
>>> >>> -------------------------------------------------------
>>> >>>
>>> >>>
>>> --------------------------------------------------------------------------
>>> >>> mpirun detected that one or more processes exited with non-zero
>>> status,
>>> >>> thus causing
>>> >>> the job to be terminated. The first process to do so was:
>>> >>>
>>> >>>   Process name: [[7936,1],1]
>>> >>>   Exit code:    1
>>> >>>
>>> >>>
>>> --------------------------------------------------------------------------
>>> >>>
>>> >>> So obviously the code does not make it past MPI_Init().
>>> >>>
>>> >>> This is an issue that I have been observing for quite a while in
>>> >>> different forms and have reported on the forum a few times also. Let me
>>> >>> elaborate:
>>> >>>
>>> >>> Both my 'well-behaving' and crashing clusters run CentOS 7 (the crashing
>>> >>> one has the latest updates, the well-behaving one does not as I am not
>>> >>> allowed to apply updates on that). They both have OMPI, from the master
>>> >>> branch, compiled from source. Both consist of 64-bit Dell servers,
>>> >>> although not identical models (I doubt that matters).
>>> >>>
>>> >>> The only significant difference between the two is this:
>>> >>>
>>> >>> The well-behaved one (if it does core dump, that is because there is a
>>> >>> bug in the MPI app) has very simple network hardware: two different NICs
>>> >>> (one Broadcom GbE, one proprietary NIC that is currently being exposed
>>> >>> as an IP interface). There is no RDMA capability there at all.
>>> >>>
>>> >>> The crashing one has 4 different NICs:
>>> >>> 1. Broadcom GbE
>>> >>> 2. Chelsio T3 based 10Gb iWARP NIC
>>> >>> 3. QLogic 20Gb InfiniBand (PSM capable)
>>> >>> 4. LSI Logic Fibre Channel
>>> >>>
>>> >>> In this situation, WITH ALL BUT THE GbE LINK DOWN (the GbE connects the
>>> >>> machine to the WAN link), it seems just the presence of these NICs
>>> >>> matters.
>>> >>>
>>> >>> Here are the various commands and outputs:
>>> >>>
>>> >>> [durga@smallMPI ~]$ mpirun -np 2 ./btrtest
>>> >>> before MPI_Init : -1 -1
>>> >>> before MPI_Init : -1 -1
>>> >>>
>>> >>> btrtest:10032 terminated with signal 11 at PC=7f6897c197d8
>>> >>> SP=7ffcae2b2ef8.  Backtrace:
>>> >>> /lib64/libc.so.6(+0x3ba7d8)[0x7f6897c197d8]
>>> >>>
>>> >>> btrtest:10033 terminated with signal 11 at PC=7fb035c3e7d8
>>> >>> SP=7ffe61a92088.  Backtrace:
>>> >>> /lib64/libc.so.6(+0x3ba7d8)[0x7fb035c3e7d8]
>>> >>> -------------------------------------------------------
>>> >>> Primary job  terminated normally, but 1 process returned
>>> >>> a non-zero exit code. Per user-direction, the job has been aborted.
>>> >>> -------------------------------------------------------
>>> >>>
>>> >>>
>>> --------------------------------------------------------------------------
>>> >>> mpirun detected that one or more processes exited with non-zero
>>> status,
>>> >>> thus causing
>>> >>> the job to be terminated. The first process to do so was:
>>> >>>
>>> >>>   Process name: [[9294,1],0]
>>> >>>   Exit code:    1
>>> >>>
>>> >>>
>>> --------------------------------------------------------------------------
>>> >>>
>>> >>> [durga@smallMPI ~]$ mpirun -np 2 -mca pml ob1 ./btrtest
>>> >>> before MPI_Init : -1 -1
>>> >>> before MPI_Init : -1 -1
>>> >>>
>>> >>> btrtest:10076 terminated with signal 11 at PC=7fa92d20b7d8
>>> >>> SP=7ffebb106028.  Backtrace:
>>> >>> /lib64/libc.so.6(+0x3ba7d8)[0x7fa92d20b7d8]
>>> >>>
>>> >>> btrtest:10077 terminated with signal 11 at PC=7f5012fa57d8
>>> >>> SP=7ffea4f4fdf8.  Backtrace:
>>> >>> /lib64/libc.so.6(+0x3ba7d8)[0x7f5012fa57d8]
>>> >>> -------------------------------------------------------
>>> >>> Primary job  terminated normally, but 1 process returned
>>> >>> a non-zero exit code. Per user-direction, the job has been aborted.
>>> >>> -------------------------------------------------------
>>> >>>
>>> >>>
>>> --------------------------------------------------------------------------
>>> >>> mpirun detected that one or more processes exited with non-zero
>>> status,
>>> >>> thus causing
>>> >>> the job to be terminated. The first process to do so was:
>>> >>>
>>> >>>   Process name: [[9266,1],0]
>>> >>>   Exit code:    1
>>> >>>
>>> >>>
>>> --------------------------------------------------------------------------
>>> >>>
>>> >>> [durga@smallMPI ~]$ mpirun -np 2 -mca pml ob1 -mca btl self,sm
>>> ./btrtest
>>> >>> before MPI_Init : -1 -1
>>> >>> before MPI_Init : -1 -1
>>> >>>
>>> >>> btrtest:10198 terminated with signal 11 at PC=400829 SP=7ffe6e148870.
>>> >>> Backtrace:
>>> >>>
>>> >>> btrtest:10197 terminated with signal 11 at PC=400829 SP=7ffe87be6cd0.
>>> >>> Backtrace:
>>> >>> ./btrtest[0x400829]
>>> >>> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f9473bbeb15]
>>> >>> ./btrtest[0x4006d9]
>>> >>> ./btrtest[0x400829]
>>> >>> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fdfe2d8eb15]
>>> >>> ./btrtest[0x4006d9]
>>> >>> after MPI_Init : -1 -1
>>> >>> after MPI_Init : -1 -1
>>> >>> -------------------------------------------------------
>>> >>> Primary job  terminated normally, but 1 process returned
>>> >>> a non-zero exit code. Per user-direction, the job has been aborted.
>>> >>> -------------------------------------------------------
>>> >>>
>>> >>>
>>> --------------------------------------------------------------------------
>>> >>> mpirun detected that one or more processes exited with non-zero
>>> status,
>>> >>> thus causing
>>> >>> the job to be terminated. The first process to do so was:
>>> >>>
>>> >>>   Process name: [[9384,1],1]
>>> >>>   Exit code:    1
>>> >>>
>>> >>>
>>> --------------------------------------------------------------------------
>>> >>>
>>> >>>
>>> >>> [durga@smallMPI ~]$ ulimit -a
>>> >>> core file size          (blocks, -c) unlimited
>>> >>> data seg size           (kbytes, -d) unlimited
>>> >>> scheduling priority             (-e) 0
>>> >>> file size               (blocks, -f) unlimited
>>> >>> pending signals                 (-i) 216524
>>> >>> max locked memory       (kbytes, -l) unlimited
>>> >>> max memory size         (kbytes, -m) unlimited
>>> >>> open files                      (-n) 1024
>>> >>> pipe size            (512 bytes, -p) 8
>>> >>> POSIX message queues     (bytes, -q) 819200
>>> >>> real-time priority              (-r) 0
>>> >>> stack size              (kbytes, -s) 8192
>>> >>> cpu time               (seconds, -t) unlimited
>>> >>> max user processes              (-u) 4096
>>> >>> virtual memory          (kbytes, -v) unlimited
>>> >>> file locks                      (-x) unlimited
>>> >>> [durga@smallMPI ~]$
>>> >>>
>>> >>>
>>> >>> I do realize that my setup is very unusual (I am a quasi-developer of
>>> >>> MPI whereas most other folks on this list are likely end-users), but
>>> >>> somehow just disabling this 'execinfo' MCA component would allow me to
>>> >>> make progress (and also find out why/where MPI_Init() is crashing!). Is
>>> >>> there any way I can do that?
>>> >>>
>>> >>> Thank you
>>> >>> Durga
>>> >>>
>>> >>> The surgeon general advises you to eat right, exercise regularly and
>>> quit
>>> >>> ageing.
>>> >>>
>>> >>> On Wed, May 11, 2016 at 8:58 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>>> >>>>
>>> >>>> Are you sure ulimit -c unlimited is *really* applied on all hosts?
>>> >>>>
>>> >>>>
>>> >>>> can you please run the simple program below and confirm that?
>>> >>>>
>>> >>>>
>>> >>>> Cheers,
>>> >>>>
>>> >>>>
>>> >>>> Gilles
>>> >>>>
>>> >>>>
>>> >>>> #include <sys/time.h>
>>> >>>> #include <sys/resource.h>
>>> >>>> #include <poll.h>
>>> >>>> #include <stdio.h>
>>> >>>> #include <mpi.h>
>>> >>>>
>>> >>>> int main(int argc, char *argv[]) {
>>> >>>>     struct rlimit rlim;
>>> >>>>     char * c = (char *)0;
>>> >>>>     /* print the core file size limit before and after MPI_Init */
>>> >>>>     getrlimit(RLIMIT_CORE, &rlim);
>>> >>>>     printf ("before MPI_Init : %lld %lld\n",
>>> >>>>             (long long)rlim.rlim_cur, (long long)rlim.rlim_max);
>>> >>>>     MPI_Init(&argc, &argv);
>>> >>>>     getrlimit(RLIMIT_CORE, &rlim);
>>> >>>>     printf ("after MPI_Init : %lld %lld\n",
>>> >>>>             (long long)rlim.rlim_cur, (long long)rlim.rlim_max);
>>> >>>>     /* deliberate null write so a core (or .btr) should be produced */
>>> >>>>     *c = 0;
>>> >>>>     MPI_Finalize();
>>> >>>>     return 0;
>>> >>>> }
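>>> >>>>
>>> >>>> (assuming you saved this as btrtest.c, it can be built with the usual
>>> >>>> Open MPI wrapper, e.g. mpicc btrtest.c -o btrtest)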
>>> >>>>
>>> >>>>
>>> >>>> On 5/12/2016 4:22 AM, dpchoudh . wrote:
>>> >>>>
>>> >>>> Hello Gilles
>>> >>>>
>>> >>>> Thank you for the advice. However, that did not seem to make any
>>> >>>> difference. Here is what I did (on the cluster that generates .btr
>>> >>>> files for core dumps):
>>> >>>>
>>> >>>> [durga@smallMPI git]$ ompi_info --all | grep opal_signal
>>> >>>>            MCA opal base: parameter "opal_signal" (current value:
>>> >>>> "6,7,8,11", data source: default, level: 3 user/all, type: string)
>>> >>>> [durga@smallMPI git]$
>>> >>>>
>>> >>>>
>>> >>>> According to <bits/signum.h>, signals 6, 7, 8, 11 are these:
>>> >>>>
>>> >>>> #define    SIGABRT        6    /* Abort (ANSI).  */
>>> >>>> #define    SIGBUS        7    /* BUS error (4.2 BSD).  */
>>> >>>> #define    SIGFPE        8    /* Floating-point exception (ANSI).  */
>>> >>>> #define    SIGSEGV        11    /* Segmentation violation (ANSI).  */
>>> >>>>
>>> >>>> And thus I added the following just after MPI_Init()
>>> >>>>
>>> >>>>     MPI_Init(&argc, &argv);
>>> >>>>     signal(SIGABRT, SIG_DFL);
>>> >>>>     signal(SIGBUS, SIG_DFL);
>>> >>>>     signal(SIGFPE, SIG_DFL);
>>> >>>>     signal(SIGSEGV, SIG_DFL);
>>> >>>>     signal(SIGTERM, SIG_DFL);
>>> >>>>
>>> >>>> (I added the 'SIGTERM' part later, just in case it would make a
>>> >>>> difference; i didn't)
>>> >>>>
>>> >>>> The resulting code still generates .btr files instead of core files.
>>> >>>>
>>> >>>> It looks like the 'execinfo' MCA component is being used as the
>>> >>>> backtrace mechanism:
>>> >>>>
>>> >>>> [durga@smallMPI git]$ ompi_info | grep backtrace
>>> >>>>            MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0,
>>> Component
>>> >>>> v3.0.0)
>>> >>>>
>>> >>>> However, I could not find any way to choose 'none' instead of
>>> 'execinfo'
>>> >>>>
>>> >>>> And the strange thing is, on the cluster where regular core dump is
>>> >>>> happening, the output of
>>> >>>> $ ompi_info | grep backtrace
>>> >>>> is identical to the above. (Which kind of makes sense because they
>>> >>>> were created from the same source with the same configure options.)
>>> >>>>
>>> >>>> Sorry to harp on this, but without a core file it is hard to debug
>>> >>>> the application (e.g. examine stack variables).
>>> >>>>
>>> >>>> Thank you
>>> >>>> Durga
>>> >>>>
>>> >>>>
>>> >>>> The surgeon general advises you to eat right, exercise regularly and
>>> >>>> quit ageing.
>>> >>>>
>>> >>>> On Wed, May 11, 2016 at 3:37 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
>>> >>>>>
>>> >>>>> Durga,
>>> >>>>>
>>> >>>>> you might wanna try to restore the signal handler for other signals
>>> >>>>> as well (SIGSEGV, SIGBUS, ...)
>>> >>>>> ompi_info --all | grep opal_signal
>>> >>>>> does list the signals for which you should restore the handler
>>> >>>>>
>>> >>>>>
>>> >>>>> only one backtrace component is built (out of several candidates :
>>> >>>>> execinfo, none, printstack)
>>> >>>>> nm -l libopen-pal.so | grep backtrace
>>> >>>>> will hint you which component was built
>>> >>>>>
>>> >>>>> your two similar distros might have different backtrace components
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>> Gus,
>>> >>>>>
>>> >>>>> btr is a plain text file with a backtrace "a la" gdb
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>> Nathan,
>>> >>>>>
>>> >>>>> i did a 'grep btr' and could not find anything :-(
>>> >>>>> opal_backtrace_buffer and opal_backtrace_print are only used with
>>> >>>>> stderr.
>>> >>>>> so i am puzzled who creates the tracefile name and where ...
>>> >>>>> also, no stack is printed by default unless opal_abort_print_stack
>>> >>>>> is true
>>> >>>>>
>>> >>>>> Cheers,
>>> >>>>>
>>> >>>>> Gilles
>>> >>>>>
>>> >>>>>
>>> >>>>> On Wed, May 11, 2016 at 3:43 PM, dpchoudh . <dpcho...@gmail.com> wrote:
>>> >>>>> > Hello Nathan
>>> >>>>> >
>>> >>>>> > Thank you for your response. Could you please be more specific?
>>> >>>>> > Adding the
>>> >>>>> > following after MPI_Init() does not seem to make a difference.
>>> >>>>> >
>>> >>>>> >     MPI_Init(&argc, &argv);
>>> >>>>> >   signal(SIGABRT, SIG_DFL);
>>> >>>>> >   signal(SIGTERM, SIG_DFL);
>>> >>>>> >
>>> >>>>> > I also find it puzzling that a nearly identical OMPI distro
>>> >>>>> > running on a different machine shows different behaviour.
>>> >>>>> >
>>> >>>>> > Best regards
>>> >>>>> > Durga
>>> >>>>> >
>>> >>>>> > The surgeon general advises you to eat right, exercise regularly
>>> and
>>> >>>>> > quit
>>> >>>>> > ageing.
>>> >>>>> >
>>> >>>>> > On Tue, May 10, 2016 at 10:02 AM, Hjelm, Nathan Thomas <hje...@lanl.gov> wrote:
>>> >>>>> >>
>>> >>>>> >> btr files are indeed created by open mpi's backtrace mechanism. I
>>> >>>>> >> think we should revisit it at some point but for now the only
>>> >>>>> >> effective way I have found to prevent it is to restore the default
>>> >>>>> >> signal handlers after MPI_Init.
>>> >>>>> >>
>>> >>>>> >> Excuse the quoting style. Good sucks.
>>> >>>>> >>
>>> >>>>> >>
>>> >>>>> >> ________________________________________
>>> >>>>> >> From: users on behalf of dpchoudh .
>>> >>>>> >> Sent: Monday, May 09, 2016 2:59:37 PM
>>> >>>>> >> To: Open MPI Users
>>> >>>>> >> Subject: Re: [OMPI users] No core dump in some cases
>>> >>>>> >>
>>> >>>>> >> Hi Gus
>>> >>>>> >>
>>> >>>>> >> Thanks for your suggestion. But I am not using any resource manager
>>> >>>>> >> (i.e. I am launching mpirun from the bash shell). In fact, both of
>>> >>>>> >> the two clusters I talked about run CentOS 7 and I launch the job
>>> >>>>> >> the same way on both of these, yet one of them creates standard
>>> >>>>> >> core files and the other creates the '.btr' files. Strange thing
>>> >>>>> >> is, I could not find anything on the .btr (= Backtrace?) files on
>>> >>>>> >> Google, which is why I asked on this forum.
>>> >>>>> >>
>>> >>>>> >> Best regards
>>> >>>>> >> Durga
>>> >>>>> >>
>>> >>>>> >> The surgeon general advises you to eat right, exercise
>>> regularly and
>>> >>>>> >> quit
>>> >>>>> >> ageing.
>>> >>>>> >>
>>> >>>>> >> On Mon, May 9, 2016 at 12:04 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>>> >>>>> >> Hi Durga
>>> >>>>> >>
>>> >>>>> >> Just in case ...
>>> >>>>> >> If you're using a resource manager to start the jobs (Torque, etc),
>>> >>>>> >> you need to have them set the limits (for coredump size, stacksize,
>>> >>>>> >> locked memory size, etc).
>>> >>>>> >> This way the jobs will inherit the limits from the
>>> >>>>> >> resource manager daemon.
>>> >>>>> >> On Torque (which I use) I do this in the pbs_mom daemon
>>> >>>>> >> init script (I am still before the systemd era, that lovely POS).
>>> >>>>> >> And set the hard/soft limits in /etc/security/limits.conf as well.
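>>> >>>>> >> For example, something along these lines in limits.conf (just a
>>> >>>>> >> sketch, adjust to your site's policy):
>>> >>>>> >>
>>> >>>>> >> *    soft    core    unlimited
>>> >>>>> >> *    hard    core    unlimited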
>>> >>>>> >>
>>> >>>>> >> I hope this helps,
>>> >>>>> >> Gus Correa
>>> >>>>> >>
>>> >>>>> >> On 05/07/2016 12:27 PM, Jeff Squyres (jsquyres) wrote:
>>> >>>>> >> I'm afraid I don't know what a .btr file is -- that is not
>>> >>>>> >> something that is controlled by Open MPI.
>>> >>>>> >>
>>> >>>>> >> You might want to look into your OS settings to see if it has some
>>> >>>>> >> kind of alternate corefile mechanism...?
>>> >>>>> >>
>>> >>>>> >>
>>> >>>>> >> On May 6, 2016, at 8:58 PM, dpchoudh . <dpcho...@gmail.com> wrote:
>>> >>>>> >>
>>> >>>>> >> Hello all
>>> >>>>> >>
>>> >>>>> >> I run MPI jobs (for test purposes only) on two different
>>> >>>>> >> 'clusters'. Both 'clusters' have two nodes only, connected
>>> >>>>> >> back-to-back. The two are very similar, but not identical, both
>>> >>>>> >> software- and hardware-wise.
>>> >>>>> >>
>>> >>>>> >> Both have ulimit -c set to unlimited. However, only one of the two
>>> >>>>> >> creates core files when an MPI job crashes. The other creates a
>>> >>>>> >> text file named something like
>>> >>>>> >>
>>> >>>>> >>
>>> >>>>> >>
>>> >>>>> >> <program_name_that_crashed>.80s-<a-number-that-looks-like-a-PID>,<hostname-where-the-crash-happened>.btr
>>> >>>>> >>
>>> >>>>> >> I'd much prefer a core file because that allows me to debug with a
>>> >>>>> >> lot more options than a static text file with addresses. How do I
>>> >>>>> >> get a core file in all situations? I am using MPI source from the
>>> >>>>> >> master branch.
>>> >>>>> >>
>>> >>>>> >> Thanks in advance
>>> >>>>> >> Durga
>>> >>>>> >>
>>> >>>>> >> The surgeon general advises you to eat right, exercise regularly
>>> >>>>> >> and quit ageing.