Thanks for the tests!

What was fixed in Open MPI is the handling of a disconnected InfiniPath port.

Restoring the signal handlers when libinfinipath.so is unloaded (from our point 
of view, when mca_mtl_psm.so is unloaded) can only be fixed within 
libinfinipath.so itself.
It might already have been fixed in the latest OFED versions, but I am not sure 
about that.
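
For what it is worth, here is a minimal sketch (illustrative only, not the 
actual PSM code) of the kind of fix needed inside libinfinipath.so: save 
whatever handler was installed before the library is loaded, and put it back 
in the library destructor.

#include <signal.h>

/* illustrative names; only the save/restore pattern is the point */
static struct sigaction previous_segv_action;

static void ipath_backtrace_handler(int sig)
{
    /* dump a backtrace into a .btr file, then re-raise the signal ... */
}

__attribute__((constructor))
static void ipath_install_handler(void)
{
    struct sigaction sa;
    sa.sa_handler = ipath_backtrace_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    /* remember the handler that was in place before this library loaded */
    sigaction(SIGSEGV, &sa, &previous_segv_action);
}

__attribute__((destructor))
static void ipath_restore_handler(void)
{
    /* without this, dlclose() leaves SIGSEGV pointing into unmapped code */
    sigaction(SIGSEGV, &previous_segv_action, NULL);
}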

Cheers,

Gilles

"dpchoudh ." <dpcho...@gmail.com> wrote:
><quote>
>
>If you configure with --disable-dlopen, then libinfinipath.so is slurped in, 
>and hence the infinipath signal handler is always set, even if you disable the 
>psm mtl or choose to only use the ob1 pml.
>
>If you do not configure with --disable-dlopen, then the infinipath signal 
>handler is set when mca_mtl_psm.so is loaded, and it is not loaded if it is 
>disabled or if only ob1 is used.
>
></quote>
>
>Aah, I see. But you said that this was recently fixed, right? (I mean, the 
>signal handlers are now uninstalled if PSM is unloaded). I do have the latest 
>from master.
>
>I ran your patches, and *both* of them fix the crash. In case it is useful, I 
>am attaching the console output after applying each patch (the output from the 
>app proper is omitted).
>
><From patch 1>
>[durga@smallMPI ~]$ mpirun -np 2  ./mpitest
>--------------------------------------------------------------------------
>WARNING: There is at least non-excluded one OpenFabrics device found,
>but there are no active ports detected (or Open MPI was unable to use
>them).  This is most certainly not what you wanted.  Check your
>cables, subnet manager configuration, etc.  The openib BTL will be
>ignored for this job.
>
>  Local host: smallMPI
>--------------------------------------------------------------------------
>smallMPI.26487PSM found 0 available contexts on InfiniPath device(s). (err=21)
>smallMPI.26488PSM found 0 available contexts on InfiniPath device(s). (err=21)
>
>
><From patch 2>
>
>
>[durga@smallMPI ~]$ mpirun -np 2  ./mpitest
>--------------------------------------------------------------------------
>WARNING: There is at least non-excluded one OpenFabrics device found,
>but there are no active ports detected (or Open MPI was unable to use
>them).  This is most certainly not what you wanted.  Check your
>cables, subnet manager configuration, etc.  The openib BTL will be
>ignored for this job.
>
>  Local host: smallMPI
>--------------------------------------------------------------------------
>smallMPI.7486PSM found 0 available contexts on InfiniPath device(s). (err=21)
>smallMPI.7487PSM found 0 available contexts on InfiniPath device(s). (err=21)
>
>
>The surgeon general advises you to eat right, exercise regularly and quit 
>ageing.
>
>
>On Thu, May 12, 2016 at 4:29 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>
>If you configure with --disable-dlopen, then libinfinipath.so is slurped in, 
>and hence the infinipath signal handler is always set, even if you disable the 
>psm mtl or choose to only use the ob1 pml.
>
>If you do not configure with --disable-dlopen, then the infinipath signal 
>handler is set when mca_mtl_psm.so is loaded, and it is not loaded if it is 
>disabled or if only ob1 is used.
>
>
>It seems some verbs destructors are called twice here.
>
>Can you please give the attached patches a try?
>
>/* they are mutually exclusive, i.e. you should only apply one at a time */
>
>
>Cheers,
>
>
>Gilles
>
>On 5/12/2016 4:54 PM, dpchoudh . wrote:
>
>Hello Gilles
>
>I am not sure if I understand you correctly, but let me answer based on what I 
>think you mean:
>
><quote>
>The infinipath signal handler only dumps the stack (into a .btr file, yeah!),
>so if your application crashes without it, you should examine the core
>file and see what is going wrong.
>
></quote>
>
>If this is true, then there is a bug in OMPI proper, since it is crashing 
>inside MPI_Init(). Here is the stack:
>
>(gdb) bt
>#0  0x00007ff3104ac7d8 in main_arena () from /lib64/libc.so.6
>#1  0x00007ff30f6869ac in device_destruct (device=0x1284b30) at 
>btl_openib_component.c:985
>#2  0x00007ff30f6820ae in opal_obj_run_destructors (object=0x1284b30) at 
>../../../../opal/class/opal_object.h:460
>#3  0x00007ff30f689d3c in init_one_device (btl_list=0x7fff96c3a200, 
>ib_dev=0x12843f0) at btl_openib_component.c:2255
>#4  0x00007ff30f68b800 in btl_openib_component_init 
>(num_btl_modules=0x7fff96c3a330, enable_progress_threads=true, 
>    enable_mpi_threads=false) at btl_openib_component.c:2752
>#5  0x00007ff30f648971 in mca_btl_base_select (enable_progress_threads=true, 
>enable_mpi_threads=false) at base/btl_base_select.c:110
>#6  0x00007ff3108100a0 in mca_bml_r2_component_init (priority=0x7fff96c3a3fc, 
>enable_progress_threads=true, enable_mpi_threads=false)
>    at bml_r2_component.c:86
>#7  0x00007ff31080d033 in mca_bml_base_init (enable_progress_threads=true, 
>enable_mpi_threads=false) at base/bml_base_init.c:74
>#8  0x00007ff310754675 in ompi_mpi_init (argc=1, argv=0x7fff96c3a7b8, 
>requested=0, provided=0x7fff96c3a56c)
>    at runtime/ompi_mpi_init.c:590
>#9  0x00007ff3107918b7 in PMPI_Init (argc=0x7fff96c3a5ac, argv=0x7fff96c3a5a0) 
>at pinit.c:66
>#10 0x0000000000400aa0 in main (argc=1, argv=0x7fff96c3a7b8) at mpitest.c:17
>
>As you can see, the crash happens inside the verbs library and the following 
>gets printed to the console:
>
>[durga@smallMPI ~]$ mpirun -np 2 ./mpitest
>[smallMPI:05754] *** Process received signal ***
>[smallMPI:05754] Signal: Segmentation fault (11)
>[smallMPI:05754] Signal code: Invalid permissions (2)
>[smallMPI:05754] Failing at address: 0x7ff3104ac7d8
>
>That sort of tells me that perhaps the signal handler does more than simply 
>print the stack; it might be manipulating page permissions (since I see a 
>different behaviour when the PSM signal handlers are enabled).
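>
>Purely to illustrate the kind of mechanism I have in mind (this is only a 
>sketch, not anything taken from PSM), such a "fix the page and retry" SIGSEGV 
>handler would look roughly like this:
>
>#include <signal.h>
>#include <stdint.h>
>#include <sys/mman.h>
>#include <unistd.h>
>
>/* hypothetical handler: make the faulting page writable and return, so the
> * kernel re-executes the faulting instruction */
>static void fixup_handler(int sig, siginfo_t *info, void *ctx)
>{
>    uintptr_t pagesize = (uintptr_t)sysconf(_SC_PAGESIZE);
>    void *page = (void *)((uintptr_t)info->si_addr & ~(pagesize - 1));
>    if (mprotect(page, pagesize, PROT_READ | PROT_WRITE) != 0)
>        _exit(1);               /* could not fix it up, give up */
>}
>
>static void install_fixup_handler(void)
>{
>    struct sigaction sa;
>    sa.sa_sigaction = fixup_handler;
>    sigemptyset(&sa.sa_mask);
>    sa.sa_flags = SA_SIGINFO;
>    sigaction(SIGSEGV, &sa, NULL);
>}
>
>Again, I am not claiming PSM actually does this; it is just the sort of 
>behaviour that would explain why things look different when its handlers are 
>disabled.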
>
>The MPI app that I am running is a simple program and it runs fine with the 
>workaround you mention.
>
>
><quote>
>Note the infinipath signal handler is set in the constructor of
>libinfinipath.so,
>and it used *not* to be removed in the destructor.
>That means that if the signal handler is invoked *after* the PSM MTL
>is unloaded, a crash will likely occur, because the psm signal handler
>is then likely pointing to unmapped memory.
>
></quote>
>
>But during normal operation, this should not be an issue, right? The signal 
>handler, even if it points to unmapped memory, is being invoked in response to 
>an exception that will kill the process anyway. The only side effect of this 
>that I see is that the stack trace will be misleading. In any case, I am 
>compiling with --disable-dlopen set, so my understanding is that, since all the 
>components are slurped into one giant .so file, the memory will not be unmapped.
>
><quote>
>On top of that, there used to be a bug when a PSM device is detected
>but has no link (i.e. a crash).
>With the latest ompi master, this bug should be fixed (i.e. no crash):
>the PSM mtl should disqualify itself if there is no link on
>any of the PSM ports. So, unless your infinipath library is fixed or
>you configured with --disable-dlopen, you will run into trouble if
>the ipath signal handler is invoked.
>
>Can you confirm that you have the latest master and that there is no link on
>your ipath device?
>
>What does
>grep ACTIVE /sys/class/infiniband/qib*/ports/*/state
>return?
>
></quote>
>
>I confirm that I have the latest from master (by running 'git pull'). Also, I 
>have a single QLogic card with a single port, and here is the output:
>[durga@smallMPI ~]$ cat /sys/class/infiniband/qib0/ports/1/state 
>1: DOWN
>
><quote>
>If you did not configure with --disable-dlopen *and* you do not need
>the psm mtl, then
>mpirun --mca mtl ^psm ...
>or, if you do not need any mtl at all,
>mpirun --mca pml ob1 ...
>should be enough.
>
></quote>
>
>I did configure with --disable-dlopen, but why does that make a difference? 
>This is the part that I don't understand.
>
>And yes, I do have a reasonable workaround now, but I am passing on my 
>observations so that if there is a bug, the developers can fix it, or if I am 
>doing something wrong, then they can correct me.
>
>
>The surgeon general advises you to eat right, exercise regularly and quit 
>ageing.
>
>
>On Thu, May 12, 2016 at 12:38 AM, Gilles Gouaillardet 
><gilles.gouaillar...@gmail.com> wrote:
>
>Durga,
>
>The infinipath signal handler only dumps the stack (into a .btr file, yeah!),
>so if your application crashes without it, you should examine the core
>file and see what is going wrong.
>
>Note the infinipath signal handler is set in the constructor of
>libinfinipath.so,
>and it used *not* to be removed in the destructor.
>That means that if the signal handler is invoked *after* the PSM MTL
>is unloaded, a crash will likely occur, because the psm signal handler
>is then likely pointing to unmapped memory.
>
>On top of that, there used to be a bug when a PSM device is detected
>but has no link (i.e. a crash).
>With the latest ompi master, this bug should be fixed (i.e. no crash):
>the PSM mtl should disqualify itself if there is no link on
>any of the PSM ports. So, unless your infinipath library is fixed or
>you configured with --disable-dlopen, you will run into trouble if
>the ipath signal handler is invoked.
>
>Can you confirm that you have the latest master and that there is no link on
>your ipath device?
>
>What does
>grep ACTIVE /sys/class/infiniband/qib*/ports/*/state
>return?
>
>If you did not configure with --disable-dlopen *and* you do not need
>the psm mtl, then
>mpirun --mca mtl ^psm ...
>or, if you do not need any mtl at all,
>mpirun --mca pml ob1 ...
>should be enough.
>
>Cheers,
>
>Gilles
>
>commit 4d026e223ce717345712e669d26f78ed49082df6
>Merge: f8facb1 4071719
>Author: rhc54 <r...@open-mpi.org>
>Date:   Wed May 11 17:43:17 2016 -0700
>
>    Merge pull request #1661 from matcabral/master
>
>    PSM and PSM2 MTLs to detect drivers and link
>
>
>On Thu, May 12, 2016 at 12:42 PM, dpchoudh . <dpcho...@gmail.com> wrote:
>> Sorry for belabouring this, but this (hopefully final!) piece of
>> information might be of interest to the developers:
>>
>> There must be a reason why PSM is installing its signal handlers; often this
>> is done to modify the permissions of a page in response to a SEGV and attempt
>> the access again. By disabling the handlers, I am preventing the library from
>> doing that, and here is what it tells me:
>>
>> [durga@smallMPI ~]$ mpirun -np 2  ./mpitest
>> [smallMPI:20496] *** Process received signal ***
>> [smallMPI:20496] Signal: Segmentation fault (11)
>> [smallMPI:20496] Signal code: Invalid permissions (2)
>> [smallMPI:20496] Failing at address: 0x7f0b2fdb57d8
>> [smallMPI:20496] [ 0] /lib64/libpthread.so.0(+0xf100)[0x7f0b2fdcb100]
>> [smallMPI:20496] [ 1] /lib64/libc.so.6(+0x3ba7d8)[0x7f0b2fdb57d8]
>> [smallMPI:20496] *** End of error message ***
>> [smallMPI:20497] *** Process received signal ***
>> [smallMPI:20497] Signal: Segmentation fault (11)
>> [smallMPI:20497] Signal code: Invalid permissions (2)
>> [smallMPI:20497] Failing at address: 0x7fbfe2b387d8
>> [smallMPI:20497] [ 0] /lib64/libpthread.so.0(+0xf100)[0x7fbfe2b4e100]
>> [smallMPI:20497] [ 1] /lib64/libc.so.6(+0x3ba7d8)[0x7fbfe2b387d8]
>> [smallMPI:20497] *** End of error message ***
>> -------------------------------------------------------
>> Primary job  terminated normally, but 1 process returned
>> a non-zero exit code. Per user-direction, the job has been aborted.
>> -------------------------------------------------------
>>
>> However, even without disabling it, it crashes anyway, as follows:
>>
>> unset IPATH_NO_BACKTRACE
>>
>> [durga@smallMPI ~]$ mpirun -np 2  ./mpitest
>>
>> mpitest:22009 terminated with signal 11 at PC=7f908bb2a7d8 SP=7ffebb4ee5b8.
>> Backtrace:
>> /lib64/libc.so.6(+0x3ba7d8)[0x7f908bb2a7d8]
>>
>> mpitest:22010 terminated with signal 11 at PC=7f7a2caa67d8 SP=7ffd73dec3e8.
>> Backtrace:
>> /lib64/libc.so.6(+0x3ba7d8)[0x7f7a2caa67d8]
>>
>> The PC is at a different location, but I do not have any more information
>> without a core file.
>>
>> It seems like some interaction between OMPI and the PSM library is incorrect.
>> I'll let the developers figure it out :-)
>>
>>
>> Thanks
>> Durga
>>
>>
>>
>>
>> The surgeon general advises you to eat right, exercise regularly and quit
>> ageing.
>>
>> On Wed, May 11, 2016 at 11:23 PM, dpchoudh . <dpcho...@gmail.com> wrote:
>>>
>>> Hello Gilles
>>>
>>> Mystery solved! In fact, this one line is exactly what was needed!! It
>>> turns out the OMPI signal handlers are irrelevant (i.e. they don't make any
>>> difference when this env variable is set).
>>>
>>> This explains:
>>>
>>> 1. The difference in behaviour between the two clusters (one has PSM, the
>>> other does not)
>>> 2. Why you couldn't find where in the OMPI code the .btr files are being
>>> generated (it looks like they are being generated in the PSM library)
>>>
>>> And, now that I can get a core file (finally!), I can present the backtrace
>>> where the crash in MPI_Init() is happening. It is as follows:
>>>
>>> #0  0x00007f1c114977d8 in main_arena () from /lib64/libc.so.6
>>> #1  0x00007f1c106719ac in device_destruct (device=0x1c85b70) at
>>> btl_openib_component.c:985
>>> #2  0x00007f1c1066d0ae in opal_obj_run_destructors (object=0x1c85b70) at
>>> ../../../../opal/class/opal_object.h:460
>>> #3  0x00007f1c10674d3c in init_one_device (btl_list=0x7ffd00dada50,
>>> ib_dev=0x1c85430) at btl_openib_component.c:2255
>>> #4  0x00007f1c10676800 in btl_openib_component_init
>>> (num_btl_modules=0x7ffd00dadb80, enable_progress_threads=true,
>>> enable_mpi_threads=false)
>>>     at btl_openib_component.c:2752
>>> #5  0x00007f1c10633971 in mca_btl_base_select
>>> (enable_progress_threads=true, enable_mpi_threads=false) at
>>> base/btl_base_select.c:110
>>> #6  0x00007f1c117fb0a0 in mca_bml_r2_component_init
>>> (priority=0x7ffd00dadc4c, enable_progress_threads=true,
>>> enable_mpi_threads=false)
>>>     at bml_r2_component.c:86
>>> #7  0x00007f1c117f8033 in mca_bml_base_init (enable_progress_threads=true,
>>> enable_mpi_threads=false) at base/bml_base_init.c:74
>>> #8  0x00007f1c1173f675 in ompi_mpi_init (argc=1, argv=0x7ffd00dae008,
>>> requested=0, provided=0x7ffd00daddbc) at runtime/ompi_mpi_init.c:590
>>> #9  0x00007f1c1177c8b7 in PMPI_Init (argc=0x7ffd00daddfc,
>>> argv=0x7ffd00daddf0) at pinit.c:66
>>> #10 0x0000000000400aa0 in main (argc=1, argv=0x7ffd00dae008) at
>>> mpitest.c:17
>>>
>>> This is with the absolute latest code from master.
>>>
>>> Thanks everyone for their help.
>>>
>>> Durga
>>>
>>> The surgeon general advises you to eat right, exercise regularly and quit
>>> ageing.
>>>
>>> On Wed, May 11, 2016 at 10:55 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>>>>
>>>> Note the psm library sets its own signal handler, possibly after the
>>>> Open MPI one.
>>>>
>>>> That can be disabled by
>>>>
>>>> export IPATH_NO_BACKTRACE=1
>>>>
>>>> Cheers,
>>>>
>>>> Gilles
>>>>
>>>>
>>>> On 5/12/2016 11:34 AM, dpchoudh . wrote:
>>>>
>>>> Hello Gilles
>>>>
>>>> Thank you for your continued support. With your help, I have a better
>>>> understanding of what is happening. Here are the details.
>>>>
>>>> 1. Yes, I am sure that ulimit -c is 'unlimited' (and for the test in
>>>> question, I am running it on a single node, so there are no other nodes)
>>>>
>>>> 2. The command I am running is possibly the simplest MPI command:
>>>> mpirun -np 2 <program>
>>>>
>>>> It looks to me, after running your test code, that what is crashing is
>>>> MPI_Init() itself. The output from your code (I called it 'btrtest') is as
>>>> follows:
>>>>
>>>> [durga@smallMPI ~]$ mpirun -np 2 ./btrtest
>>>> before MPI_Init : -1 -1
>>>> before MPI_Init : -1 -1
>>>>
>>>> btrtest:7275 terminated with signal 11 at PC=7f401f49e7d8
>>>> SP=7ffec47e7578.  Backtrace:
>>>> /lib64/libc.so.6(+0x3ba7d8)[0x7f401f49e7d8]
>>>>
>>>> btrtest:7274 terminated with signal 11 at PC=7f1ba21897d8
>>>> SP=7ffc516ac8d8.  Backtrace:
>>>> /lib64/libc.so.6(+0x3ba7d8)[0x7f1ba21897d8]
>>>> -------------------------------------------------------
>>>> Primary job  terminated normally, but 1 process returned
>>>> a non-zero exit code. Per user-direction, the job has been aborted.
>>>> -------------------------------------------------------
>>>>
>>>> --------------------------------------------------------------------------
>>>> mpirun detected that one or more processes exited with non-zero status,
>>>> thus causing
>>>> the job to be terminated. The first process to do so was:
>>>>
>>>>   Process name: [[7936,1],1]
>>>>   Exit code:    1
>>>>
>>>> --------------------------------------------------------------------------
>>>>
>>>> So obviously the code does not make it past MPI_Init().
>>>>
>>>> This is an issue that I have been observing for quite a while in
>>>> different forms, and I have reported it on the forum a few times as well.
>>>> Let me elaborate:
>>>>
>>>> Both my 'well-behaving' and crashing clusters run CentOS 7 (the crashing
>>>> one has the latest updates; the well-behaving one does not, as I am not
>>>> allowed to apply updates on it). Both have OMPI, from the master
>>>> branch, compiled from source. Both consist of 64-bit Dell servers,
>>>> although not identical models (I doubt that matters).
>>>>
>>>> The only significant difference between the two is this:
>>>>
>>>> The well-behaved one (if it does core dump, that is because there is a
>>>> bug in the MPI app) has very simple network hardware: two different NICs
>>>> (one Broadcom GbE, one proprietary NIC that is currently being exposed as
>>>> an IP interface). There is no RDMA capability there at all.
>>>>
>>>> The crashing one has 4 different NICs:
>>>> 1. Broadcom GbE
>>>> 2. Chelsio T3-based 10Gb iWARP NIC
>>>> 3. QLogic 20Gb InfiniBand (PSM capable)
>>>> 4. LSI Logic Fibre Channel
>>>>
>>>> In this situation, WITH ALL BUT THE GbE LINK DOWN (the GbE connects the
>>>> machine to the WAN link), it seems that just the presence of these NICs
>>>> matters.
>>>>
>>>> Here are the various commands and outputs:
>>>>
>>>> [durga@smallMPI ~]$ mpirun -np 2 ./btrtest
>>>> before MPI_Init : -1 -1
>>>> before MPI_Init : -1 -1
>>>>
>>>> btrtest:10032 terminated with signal 11 at PC=7f6897c197d8
>>>> SP=7ffcae2b2ef8.  Backtrace:
>>>> /lib64/libc.so.6(+0x3ba7d8)[0x7f6897c197d8]
>>>>
>>>> btrtest:10033 terminated with signal 11 at PC=7fb035c3e7d8
>>>> SP=7ffe61a92088.  Backtrace:
>>>> /lib64/libc.so.6(+0x3ba7d8)[0x7fb035c3e7d8]
>>>> -------------------------------------------------------
>>>> Primary job  terminated normally, but 1 process returned
>>>> a non-zero exit code. Per user-direction, the job has been aborted.
>>>> -------------------------------------------------------
>>>>
>>>> --------------------------------------------------------------------------
>>>> mpirun detected that one or more processes exited with non-zero status,
>>>> thus causing
>>>> the job to be terminated. The first process to do so was:
>>>>
>>>>   Process name: [[9294,1],0]
>>>>   Exit code:    1
>>>>
>>>> --------------------------------------------------------------------------
>>>>
>>>> [durga@smallMPI ~]$ mpirun -np 2 -mca pml ob1 ./btrtest
>>>> before MPI_Init : -1 -1
>>>> before MPI_Init : -1 -1
>>>>
>>>> btrtest:10076 terminated with signal 11 at PC=7fa92d20b7d8
>>>> SP=7ffebb106028.  Backtrace:
>>>> /lib64/libc.so.6(+0x3ba7d8)[0x7fa92d20b7d8]
>>>>
>>>> btrtest:10077 terminated with signal 11 at PC=7f5012fa57d8
>>>> SP=7ffea4f4fdf8.  Backtrace:
>>>> /lib64/libc.so.6(+0x3ba7d8)[0x7f5012fa57d8]
>>>> -------------------------------------------------------
>>>> Primary job  terminated normally, but 1 process returned
>>>> a non-zero exit code. Per user-direction, the job has been aborted.
>>>> -------------------------------------------------------
>>>>
>>>> --------------------------------------------------------------------------
>>>> mpirun detected that one or more processes exited with non-zero status,
>>>> thus causing
>>>> the job to be terminated. The first process to do so was:
>>>>
>>>>   Process name: [[9266,1],0]
>>>>   Exit code:    1
>>>>
>>>> --------------------------------------------------------------------------
>>>>
>>>> [durga@smallMPI ~]$ mpirun -np 2 -mca pml ob1 -mca btl self,sm ./btrtest
>>>> before MPI_Init : -1 -1
>>>> before MPI_Init : -1 -1
>>>>
>>>> btrtest:10198 terminated with signal 11 at PC=400829 SP=7ffe6e148870.
>>>> Backtrace:
>>>>
>>>> btrtest:10197 terminated with signal 11 at PC=400829 SP=7ffe87be6cd0.
>>>> Backtrace:
>>>> ./btrtest[0x400829]
>>>> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f9473bbeb15]
>>>> ./btrtest[0x4006d9]
>>>> ./btrtest[0x400829]
>>>> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fdfe2d8eb15]
>>>> ./btrtest[0x4006d9]
>>>> after MPI_Init : -1 -1
>>>> after MPI_Init : -1 -1
>>>> -------------------------------------------------------
>>>> Primary job  terminated normally, but 1 process returned
>>>> a non-zero exit code. Per user-direction, the job has been aborted.
>>>> -------------------------------------------------------
>>>>
>>>> --------------------------------------------------------------------------
>>>> mpirun detected that one or more processes exited with non-zero status,
>>>> thus causing
>>>> the job to be terminated. The first process to do so was:
>>>>
>>>>   Process name: [[9384,1],1]
>>>>   Exit code:    1
>>>>
>>>> --------------------------------------------------------------------------
>>>>
>>>>
>>>> [durga@smallMPI ~]$ ulimit -a
>>>> core file size          (blocks, -c) unlimited
>>>> data seg size           (kbytes, -d) unlimited
>>>> scheduling priority             (-e) 0
>>>> file size               (blocks, -f) unlimited
>>>> pending signals                 (-i) 216524
>>>> max locked memory       (kbytes, -l) unlimited
>>>> max memory size         (kbytes, -m) unlimited
>>>> open files                      (-n) 1024
>>>> pipe size            (512 bytes, -p) 8
>>>> POSIX message queues     (bytes, -q) 819200
>>>> real-time priority              (-r) 0
>>>> stack size              (kbytes, -s) 8192
>>>> cpu time               (seconds, -t) unlimited
>>>> max user processes              (-u) 4096
>>>> virtual memory          (kbytes, -v) unlimited
>>>> file locks                      (-x) unlimited
>>>> [durga@smallMPI ~]$
>>>>
>>>>
>>>> I do realize that my setup is very unusual (I am a quasi-developer of MPI,
>>>> whereas most other folks on this list are likely end-users), but somehow
>>>> just disabling this 'execinfo' MCA component would allow me to make
>>>> progress (and also find out why/where MPI_Init() is crashing!). Is there
>>>> any way I can do that?
>>>>
>>>> Thank you
>>>> Durga
>>>>
>>>> The surgeon general advises you to eat right, exercise regularly and quit
>>>> ageing.
>>>>
>>>> On Wed, May 11, 2016 at 8:58 PM, Gilles Gouaillardet <gil...@rist.or.jp>
>>>> wrote:
>>>>>
>>>>> Are you sure ulimit -c unlimited is *really* applied on all hosts?
>>>>>
>>>>> Can you please run the simple program below and confirm that?
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>>
>>>>> Gilles
>>>>>
>>>>>
>>>>> #include <sys/time.h>
>>>>> #include <sys/resource.h>
>>>>> #include <stdio.h>
>>>>> #include <mpi.h>
>>>>>
>>>>> int main(int argc, char *argv[]) {
>>>>>     struct rlimit rlim;
>>>>>     char * c = (char *)0;
>>>>>     /* core file size limit before MPI_Init() */
>>>>>     getrlimit(RLIMIT_CORE, &rlim);
>>>>>     printf ("before MPI_Init : %lld %lld\n",
>>>>>             (long long)rlim.rlim_cur, (long long)rlim.rlim_max);
>>>>>     MPI_Init(&argc, &argv);
>>>>>     /* core file size limit after MPI_Init() */
>>>>>     getrlimit(RLIMIT_CORE, &rlim);
>>>>>     printf ("after MPI_Init : %lld %lld\n",
>>>>>             (long long)rlim.rlim_cur, (long long)rlim.rlim_max);
>>>>>     /* deliberately crash so we can see whether a core file is produced */
>>>>>     *c = 0;
>>>>>     MPI_Finalize();
>>>>>     return 0;
>>>>> }
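>>>>>
>>>>> (Assuming you save it as btrtest.c, it can be built and run with something
>>>>> like:
>>>>> mpicc btrtest.c -o btrtest
>>>>> mpirun -np 2 ./btrtest
>>>>> The deliberate NULL write after MPI_Init() should then produce either a
>>>>> core file or a .btr file, depending on which SIGSEGV handler ends up
>>>>> installed.)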
>>>>>
>>>>>
>>>>> On 5/12/2016 4:22 AM, dpchoudh . wrote:
>>>>>
>>>>> Hello Gilles
>>>>>
>>>>> Thank you for the advice. However, that did not seem to make any
>>>>> difference. Here is what I did (on the cluster that generates .btr files 
>>>>> for
>>>>> core dumps):
>>>>>
>>>>> [durga@smallMPI git]$ ompi_info --all | grep opal_signal
>>>>>            MCA opal base: parameter "opal_signal" (current value:
>>>>> "6,7,8,11", data source: default, level: 3 user/all, type: string)
>>>>> [durga@smallMPI git]$
>>>>>
>>>>>
>>>>> According to <bits/signum.h>, signals 6, 7, 8 and 11 are these:
>>>>>
>>>>> #define    SIGABRT        6    /* Abort (ANSI).  */
>>>>> #define    SIGBUS        7    /* BUS error (4.2 BSD).  */
>>>>> #define    SIGFPE        8    /* Floating-point exception (ANSI).  */
>>>>> #define    SIGSEGV        11    /* Segmentation violation (ANSI).  */
>>>>>
>>>>> And thus I added the following just after MPI_Init():
>>>>>
>>>>>     MPI_Init(&argc, &argv);
>>>>>     signal(SIGABRT, SIG_DFL);
>>>>>     signal(SIGBUS, SIG_DFL);
>>>>>     signal(SIGFPE, SIG_DFL);
>>>>>     signal(SIGSEGV, SIG_DFL);
>>>>>     signal(SIGTERM, SIG_DFL);
>>>>>
>>>>> (I added the 'SIGTERM' part later, just in case it would make a
>>>>> difference; it didn't.)
>>>>>
>>>>> The resulting code still generates .btr files instead of core files.
>>>>>
>>>>> It looks like the 'execinfo' MCA component is being used as the
>>>>> backtrace mechanism:
>>>>>
>>>>> [durga@smallMPI git]$ ompi_info | grep backtrace
>>>>>            MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component
>>>>> v3.0.0)
>>>>>
>>>>> However, I could not find any way to choose 'none' instead of 'execinfo'.
>>>>>
>>>>> And the strange thing is, on the cluster where a regular core dump is
>>>>> happening, the output of
>>>>> $ ompi_info | grep backtrace
>>>>> is identical to the above. (Which kind of makes sense because they were
>>>>> created from the same source with the same configure options.)
>>>>>
>>>>> Sorry to harp on this, but without a core file it is hard to debug the
>>>>> application (e.g. examine stack variables).
>>>>>
>>>>> Thank you
>>>>> Durga
>>>>>
>>>>>
>>>>> The surgeon general advises you to eat right, exercise regularly and
>>>>> quit ageing.
>>>>>
>>>>> On Wed, May 11, 2016 at 3:37 AM, Gilles Gouaillardet
>>>>> <gilles.gouaillar...@gmail.com> wrote:
>>>>>>
>>>>>> Durga,
>>>>>>
>>>>>> you might wanna try to restore the signal handlers for the other signals
>>>>>> as well (SIGSEGV, SIGBUS, ...);
>>>>>> ompi_info --all | grep opal_signal
>>>>>> lists the signals whose handlers you should restore.
>>>>>>
>>>>>> Only one backtrace component is built (out of several candidates:
>>>>>> execinfo, none, printstack);
>>>>>> nm -l libopen-pal.so | grep backtrace
>>>>>> will hint at which component was built.
>>>>>>
>>>>>> Your two similar distros might have different backtrace components.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Gus,
>>>>>>
>>>>>> A .btr file is a plain text file with a backtrace "a la" gdb.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Nathan,
>>>>>>
>>>>>> I did a 'grep btr' and could not find anything :-(
>>>>>> opal_backtrace_buffer and opal_backtrace_print are only used with
>>>>>> stderr,
>>>>>> so I am puzzled about who creates the trace file name, and where ...
>>>>>> Also, no stack is printed by default unless opal_abort_print_stack is
>>>>>> true.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Gilles
>>>>>>
>>>>>>
>>>>>> On Wed, May 11, 2016 at 3:43 PM, dpchoudh . <dpcho...@gmail.com> wrote:
>>>>>> > Hello Nathan
>>>>>> >
>>>>>> > Thank you for your response. Could you please be more specific?
>>>>>> > Adding the
>>>>>> > following after MPI_Init() does not seem to make a difference.
>>>>>> >
>>>>>> >     MPI_Init(&argc, &argv);
>>>>>> >     signal(SIGABRT, SIG_DFL);
>>>>>> >     signal(SIGTERM, SIG_DFL);
>>>>>> >
>>>>>> > I also find it puzzling that a nearly identical OMPI distro running
>>>>>> > on a different machine shows different behaviour.
>>>>>> >
>>>>>> > Best regards
>>>>>> > Durga
>>>>>> >
>>>>>> > The surgeon general advises you to eat right, exercise regularly and
>>>>>> > quit
>>>>>> > ageing.
>>>>>> >
>>>>>> > On Tue, May 10, 2016 at 10:02 AM, Hjelm, Nathan Thomas
>>>>>> > <hje...@lanl.gov>
>>>>>> > wrote:
>>>>>> >>
>>>>>> >> .btr files are indeed created by Open MPI's backtrace mechanism. I
>>>>>> >> think we should revisit it at some point, but for now the only
>>>>>> >> effective way I have found to prevent it is to restore the default
>>>>>> >> signal handlers after MPI_Init.
>>>>>> >>
>>>>>> >> Excuse the quoting style. Good sucks.
>>>>>> >>
>>>>>> >>
>>>>>> >> ________________________________________
>>>>>> >> From: users on behalf of dpchoudh .
>>>>>> >> Sent: Monday, May 09, 2016 2:59:37 PM
>>>>>> >> To: Open MPI Users
>>>>>> >> Subject: Re: [OMPI users] No core dump in some cases
>>>>>> >>
>>>>>> >> Hi Gus
>>>>>> >>
>>>>>> >> Thanks for your suggestion. But I am not using any resource manager
>>>>>> >> (i.e. I am launching mpirun from the bash shell). In fact, both of the
>>>>>> >> clusters I talked about run CentOS 7 and I launch the job the same way
>>>>>> >> on both of them, yet one of them creates standard core files and the
>>>>>> >> other creates the '.btr' files. The strange thing is, I could not find
>>>>>> >> anything on the .btr (= backtrace?) files on Google, which is why I
>>>>>> >> asked on this forum.
>>>>>> >>
>>>>>> >> Best regards
>>>>>> >> Durga
>>>>>> >>
>>>>>> >> The surgeon general advises you to eat right, exercise regularly and
>>>>>> >> quit
>>>>>> >> ageing.
>>>>>> >>
>>>>>> >> On Mon, May 9, 2016 at 12:04 PM, Gus Correa
>>>>>> >> <g...@ldeo.columbia.edu<mailto:g...@ldeo.columbia.edu>> wrote:
>>>>>> >> Hi Durga
>>>>>> >>
>>>>>> >> Just in case ...
>>>>>> >> If you're using a resource manager to start the jobs (Torque, etc.),
>>>>>> >> you need to have it set the limits (for coredump size, stacksize,
>>>>>> >> locked memory size, etc.).
>>>>>> >> This way the jobs will inherit the limits from the
>>>>>> >> resource manager daemon.
>>>>>> >> On Torque (which I use) I do this in the pbs_mom daemon
>>>>>> >> init script (I am still before the systemd era, that lovely POS).
>>>>>> >> And set the hard/soft limits in /etc/security/limits.conf as well.
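>>>>>> >>
>>>>>> >> For example (just a sketch, assuming you want the core limit raised
>>>>>> >> for every user), the limits.conf entries would look something like:
>>>>>> >> *       soft    core    unlimited
>>>>>> >> *       hard    core    unlimited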
>>>>>> >>
>>>>>> >> I hope this helps,
>>>>>> >> Gus Correa
>>>>>> >>
>>>>>> >> On 05/07/2016 12:27 PM, Jeff Squyres (jsquyres) wrote:
>>>>>> >> I'm afraid I don't know what a .btr file is -- that is not something
>>>>>> >> that
>>>>>> >> is controlled by Open MPI.
>>>>>> >>
>>>>>> >> You might want to look into your OS settings to see if it has some
>>>>>> >> kind of
>>>>>> >> alternate corefile mechanism...?
>>>>>> >>
>>>>>> >>
>>>>>> >> On May 6, 2016, at 8:58 PM, dpchoudh .
>>>>>> >> <dpcho...@gmail.com<mailto:dpcho...@gmail.com>> wrote:
>>>>>> >>
>>>>>> >> Hello all
>>>>>> >>
>>>>>> >> I run MPI jobs (for test purposes only) on two different 'clusters'.
>>>>>> >> Both 'clusters' have only two nodes, connected back-to-back. The two
>>>>>> >> are very similar, but not identical, both software- and hardware-wise.
>>>>>> >>
>>>>>> >> Both have ulimit -c set to unlimited. However, only one of the two
>>>>>> >> creates
>>>>>> >> core files when an MPI job crashes. The other creates a text file
>>>>>> >> named
>>>>>> >> something like
>>>>>> >>
>>>>>> >>
>>>>>> >> <program_name_that_crashed>.80s-<a-number-that-looks-like-a-PID>,<hostname-where-the-crash-happened>.btr
>>>>>> >>
>>>>>> >> I'd much prefer a core file because that allows me to debug with a
>>>>>> >> lot
>>>>>> >> more options than a static text file with addresses. How do I get a
>>>>>> >> core
>>>>>> >> file in all situations? I am using MPI source from the master
>>>>>> >> branch.
>>>>>> >>
>>>>>> >> Thanks in advance
>>>>>> >> Durga
>>>>>> >>
>>>>>> >> The surgeon general advises you to eat right, exercise regularly and
>>>>>> >> quit
>>>>>> >> ageing.