Thanks for the tests! What was fixed in Open MPI is the handling of a disconnected InfiniPath port.
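Conceptually, that check amounts to looking at the qib port state under sysfs before bringing PSM up, and disqualifying the MTL when no port is ACTIVE. The sketch below only illustrates the idea and is not the actual Open MPI code; the helper name any_infinipath_port_active and the parsing are mine, but it reads the same files as the "grep ACTIVE" command quoted further down in this thread:

/* Illustration only -- not the actual Open MPI implementation. */
#include <glob.h>
#include <stdio.h>
#include <string.h>

static int any_infinipath_port_active(void)
{
    glob_t g;
    size_t i;
    int active = 0;

    /* the same sysfs files the "grep ACTIVE" command in this thread looks at */
    if (glob("/sys/class/infiniband/qib*/ports/*/state", 0, NULL, &g) != 0) {
        return 0;   /* no qib device present at all */
    }
    for (i = 0; i < g.gl_pathc && !active; i++) {
        char line[64] = "";
        FILE *f = fopen(g.gl_pathv[i], "r");
        if (f == NULL) {
            continue;
        }
        if (fgets(line, sizeof(line), f) != NULL && strstr(line, "ACTIVE") != NULL) {
            active = 1;   /* the state file reads e.g. "4: ACTIVE" when the link is up */
        }
        fclose(f);
    }
    globfree(&g);
    return active;
}

int main(void)
{
    printf("InfiniPath link up: %s\n", any_infinipath_port_active() ? "yes" : "no");
    return 0;
}

With the output quoted further down ("1: DOWN" in /sys/class/infiniband/qib0/ports/1/state), such a check finds no active port, so the PSM MTL can disqualify itself instead of crashing later in MPI_Init.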
Restoring the signal handlers when libinfinipath.so is unloaded (which, from our point of view, happens when mca_mtl_psm.so is unloaded) can only be fixed within libinfinipath.so itself. It might already have been fixed in the latest OFED versions, but I am not sure about that ... Cheers, Gilles
On Thursday, May 12, 2016, dpchoudh . <dpcho...@gmail.com> wrote: > <quote> > > If you configure with --disable-dlopen, then libinfinipath.so is slurped > and hence the infinipath signal handler is always set, even if you disable > the psm mtl or choose to only use the ob1 pml. > if you do not configure with --disable-dlopen, then the infinipath signal > handler is set when mca_mtl_psm.so is loaded. and it is not loaded if it is > disabled or if only ob1 is used. > </quote> > > Aah, I see. But you said that this was recently fixed, right? (I mean, the > signal handlers are now uninstalled if PSM is unloaded). I do have the > latest from master. > > I ran your patches, and *both* of them fix the crash. In case it is > useful, I am attaching the console output after applying the patch (the > output from the app proper is omitted.) > > <From patch 1> > [durga@smallMPI ~]$ mpirun -np 2 ./mpitest > -------------------------------------------------------------------------- > WARNING: There is at least non-excluded one OpenFabrics device found, > but there are no active ports detected (or Open MPI was unable to use > them). This is most certainly not what you wanted. Check your > cables, subnet manager configuration, etc. The openib BTL will be > ignored for this job. > > Local host: smallMPI > -------------------------------------------------------------------------- > smallMPI.26487PSM found 0 available contexts on InfiniPath device(s). > (err=21) > smallMPI.26488PSM found 0 available contexts on InfiniPath device(s). > (err=21) > > > <From patch 2> > > [durga@smallMPI ~]$ mpirun -np 2 ./mpitest > -------------------------------------------------------------------------- > WARNING: There is at least non-excluded one OpenFabrics device found, > but there are no active ports detected (or Open MPI was unable to use > them). This is most certainly not what you wanted. Check your > cables, subnet manager configuration, etc. The openib BTL will be > ignored for this job. > > Local host: smallMPI > -------------------------------------------------------------------------- > smallMPI.7486PSM found 0 available contexts on InfiniPath device(s). > (err=21) > smallMPI.7487PSM found 0 available contexts on InfiniPath device(s). > (err=21) > > > The surgeon general advises you to eat right, exercise regularly and quit > ageing. > > On Thu, May 12, 2016 at 4:29 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: > >> If you configure with --disable-dlopen, then libinfinipath.so is slurped >> and hence the infinipath signal handler is always set, even if you disable >> the psm mtl or choose to only use the ob1 pml. >> >> if you do not configure with --disable-dlopen, then the infinipath signal >> handler is set when mca_mtl_psm.so is loaded. and it is not loaded if it is >> disabled or if only ob1 is used. >> >> it seems some verbs destructors are called twice here. >> >> can you please give the attached patches a try ? >> >> /* they are exclusive, e.g. you should only apply one at a time */ >> >> >> Cheers, >> >> >> Gilles >> On 5/12/2016 4:54 PM, dpchoudh .
wrote: >> >> Hello Gilles >> >> I am not sure if I understand you correctly, but let me answer based on >> what I think you mean: >> >> <quote> >> the infinipath signal handler only dump the stack (into a .btr file, yeah >> !) >> so if your application crashes without it, you should examine the core >> file and see what is going wrong. >> </quote> >> >> If this is true, then there is a bug in OMPI proper, since it is crashing >> inside MPI_Init(). Here is the stack: >> >> (gdb) bt >> #0 0x00007ff3104ac7d8 in main_arena () from /lib64/libc.so.6 >> #1 0x00007ff30f6869ac in device_destruct (device=0x1284b30) at >> btl_openib_component.c:985 >> #2 0x00007ff30f6820ae in opal_obj_run_destructors (object=0x1284b30) at >> ../../../../opal/class/opal_object.h:460 >> #3 0x00007ff30f689d3c in init_one_device (btl_list=0x7fff96c3a200, >> ib_dev=0x12843f0) at btl_openib_component.c:2255 >> #4 0x00007ff30f68b800 in btl_openib_component_init >> (num_btl_modules=0x7fff96c3a330, enable_progress_threads=true, >> enable_mpi_threads=false) at btl_openib_component.c:2752 >> #5 0x00007ff30f648971 in mca_btl_base_select >> (enable_progress_threads=true, enable_mpi_threads=false) at >> base/btl_base_select.c:110 >> #6 0x00007ff3108100a0 in mca_bml_r2_component_init >> (priority=0x7fff96c3a3fc, enable_progress_threads=true, >> enable_mpi_threads=false) >> at bml_r2_component.c:86 >> #7 0x00007ff31080d033 in mca_bml_base_init >> (enable_progress_threads=true, enable_mpi_threads=false) at >> base/bml_base_init.c:74 >> #8 0x00007ff310754675 in ompi_mpi_init (argc=1, argv=0x7fff96c3a7b8, >> requested=0, provided=0x7fff96c3a56c) >> at runtime/ompi_mpi_init.c:590 >> #9 0x00007ff3107918b7 in PMPI_Init (argc=0x7fff96c3a5ac, >> argv=0x7fff96c3a5a0) at pinit.c:66 >> #10 0x0000000000400aa0 in main (argc=1, argv=0x7fff96c3a7b8) at >> mpitest.c:17 >> >> As you can see, the crash happens inside the verbs library and the >> following gets printed to the console: >> >> [durga@smallMPI ~]$ mpirun -np 2 ./mpitest >> [smallMPI:05754] *** Process received signal *** >> [smallMPI:05754] Signal: Segmentation fault (11) >> [smallMPI:05754] Signal code: Invalid permissions (2) >> [smallMPI:05754] Failing at address: 0x7ff3104ac7d8 >> >> That sort of tells me the perhaps the signal handler does more than >> simply prints the stack; it might be manipulating page permissions (since I >> see a different behaviour when PSM signal handlers are enabled. >> >> The MPI app that I am running is a simple program and it runs fine with >> the work around you mention. >> >> <quote> >> note the infinipath signal handler is set in the constructor of >> libinfinipath.so, >> and used *not* to be removed in the destructor. >> that means that if the signal handler is invoked *after* the pml MTL >> is unloaded, a crash will likely occur because the psm signal handler >> is likely pointing to unmapped memory. >> </quote> >> >> But during normal operation, this should not be an issue, right? The >> signal handler, even if it points to unmapped memory, is being invoked in >> response to an exception that will kill the process anyway. The only side >> effect of this I see is that the stack will be misleading. In any case, I >> am compiling with --disable-dlopen set, so my understanding is that since >> all the components are slurped onto one giant .so file, the memory will not >> be unmapped. >> >> <quote> >> on top of that, there used to be a bug if some PSM device is detected >> but with no link (e.g. 
crash) >> with the latest ompi master, this bug should be fixed (e.g. no crash) >> this means the PSM mtl should disqualify itself if there is no link on >> any of the PSM ports, so, unless your infinipath library is fixed or >> you configure'd with --disable-dlopen, you will run into trouble if >> the ipath signal handler is invoked. >> >> can you confirm you have the latest master and there is no link on >> your ipath device ? >> >> what does >> grep ACTIVE /sys/class/infiniband/qib*/ports/*/state >> returns ? >> </quote> >> >> I confirm that I have the latest from master (by running 'git pull'). >> Also, I have a single Qlogic card with a single port and here is the output: >> [durga@smallMPI ~]$ cat /sys/class/infiniband/qib0/ports/1/state >> 1: DOWN >> >> <quote> >> if you did not configure with --disable-dlopen *and* you do not need >> the psm mtl, you can >> mpirun --mca mtl ^psm ... >> or if you do not need any mtl at all >> mpirun --mca pml ob1 ... >> should be enough >> </quote> >> >> I did configure with --disable-dlopen, but why does that make a >> difference? This is the part that I don't understand. >> And yes, I do have a reasonable work around now, but I am passing on my >> observations so that if there is a bug, the developers can fix it, or if I >> am doing something wrong, then they can correct me. >> >> The surgeon general advises you to eat right, exercise regularly and quit >> ageing. >> >> On Thu, May 12, 2016 at 12:38 AM, Gilles Gouaillardet < >> <javascript:_e(%7B%7D,'cvml','gilles.gouaillar...@gmail.com');> >> gilles.gouaillar...@gmail.com >> <javascript:_e(%7B%7D,'cvml','gilles.gouaillar...@gmail.com');>> wrote: >> >>> Durga, >>> >>> the infinipath signal handler only dump the stack (into a .btr file, >>> yeah !) >>> so if your application crashes without it, you should examine the core >>> file and see what is going wrong. >>> >>> note the infinipath signal handler is set in the constructor of >>> libinfinipath.so, >>> and used *not* to be removed in the destructor. >>> that means that if the signal handler is invoked *after* the pml MTL >>> is unloaded, a crash will likely occur because the psm signal handler >>> is likely pointing to unmapped memory. >>> >>> on top of that, there used to be a bug if some PSM device is detected >>> but with no link (e.g. crash) >>> with the latest ompi master, this bug should be fixed (e.g. no crash) >>> this means the PSM mtl should disqualify itself if there is no link on >>> any of the PSM ports, so, unless your infinipath library is fixed or >>> you configure'd with --disable-dlopen, you will run into trouble if >>> the ipath signal handler is invoked. >>> >>> can you confirm you have the latest master and there is no link on >>> your ipath device ? >>> >>> what does >>> grep ACTIVE /sys/class/infiniband/qib*/ports/*/state >>> returns ? >>> >>> if you did not configure with --disable-dlopen *and* you do not need >>> the psm mtl, you can >>> mpirun --mca mtl ^psm ... >>> or if you do not need any mtl at all >>> mpirun --mca pml ob1 ... >>> should be enough >>> >>> Cheers, >>> >>> Gilles >>> >>> commit 4d026e223ce717345712e669d26f78ed49082df6 >>> Merge: f8facb1 4071719 >>> Author: rhc54 <r...@open-mpi.org >>> <javascript:_e(%7B%7D,'cvml','r...@open-mpi.org');>> >>> Date: Wed May 11 17:43:17 2016 -0700 >>> >>> Merge pull request #1661 from matcabral/master >>> >>> PSM and PSM2 MTLs to detect drivers and link >>> >>> >>> On Thu, May 12, 2016 at 12:42 PM, dpchoudh . 
< >>> <javascript:_e(%7B%7D,'cvml','dpcho...@gmail.com');>dpcho...@gmail.com >>> <javascript:_e(%7B%7D,'cvml','dpcho...@gmail.com');>> wrote: >>> > Sorry for belabouring on this, but this (hopefully final!) piece of >>> > information might be of interest to the developers: >>> > >>> > There must be a reason why PSM is installing its signal handlers; >>> often this >>> > is done to modify the permission of a page in response to a SEGV and >>> attempt >>> > access again. By disabling the handlers, I am preventing the library >>> from >>> > doing that, and here is what it tells me: >>> > >>> > [durga@smallMPI ~]$ mpirun -np 2 ./mpitest >>> > [smallMPI:20496] *** Process received signal *** >>> > [smallMPI:20496] Signal: Segmentation fault (11) >>> > [smallMPI:20496] Signal code: Invalid permissions (2) >>> > [smallMPI:20496] Failing at address: 0x7f0b2fdb57d8 >>> > [smallMPI:20496] [ 0] /lib64/libpthread.so.0(+0xf100)[0x7f0b2fdcb100] >>> > [smallMPI:20496] [ 1] /lib64/libc.so.6(+0x3ba7d8)[0x7f0b2fdb57d8] >>> > [smallMPI:20496] *** End of error message *** >>> > [smallMPI:20497] *** Process received signal *** >>> > [smallMPI:20497] Signal: Segmentation fault (11) >>> > [smallMPI:20497] Signal code: Invalid permissions (2) >>> > [smallMPI:20497] Failing at address: 0x7fbfe2b387d8 >>> > [smallMPI:20497] [ 0] /lib64/libpthread.so.0(+0xf100)[0x7fbfe2b4e100] >>> > [smallMPI:20497] [ 1] /lib64/libc.so.6(+0x3ba7d8)[0x7fbfe2b387d8] >>> > [smallMPI:20497] *** End of error message *** >>> > ------------------------------------------------------- >>> > Primary job terminated normally, but 1 process returned >>> > a non-zero exit code. Per user-direction, the job has been aborted. >>> > ------------------------------------------------------- >>> > >>> > However, even without disabling it, it crashes anyway, as follows: >>> > >>> > unset IPATH_NO_BACKTRACE >>> > >>> > [durga@smallMPI ~]$ mpirun -np 2 ./mpitest >>> > >>> > mpitest:22009 terminated with signal 11 at PC=7f908bb2a7d8 >>> SP=7ffebb4ee5b8. >>> > Backtrace: >>> > /lib64/libc.so.6(+0x3ba7d8)[0x7f908bb2a7d8] >>> > >>> > mpitest:22010 terminated with signal 11 at PC=7f7a2caa67d8 >>> SP=7ffd73dec3e8. >>> > Backtrace: >>> > /lib64/libc.so.6(+0x3ba7d8)[0x7f7a2caa67d8] >>> > >>> > The PC is at a different location but I do not have any more >>> information >>> > without a core file. >>> > >>> > It seems like some interaction between OMPI and PSM library is >>> incorrect. >>> > I'll let the developers figure it out :-) >>> > >>> > >>> > Thanks >>> > Durga >>> > >>> > >>> > >>> > >>> > The surgeon general advises you to eat right, exercise regularly and >>> quit >>> > ageing. >>> > >>> > On Wed, May 11, 2016 at 11:23 PM, dpchoudh . <dpcho...@gmail.com >>> <javascript:_e(%7B%7D,'cvml','dpcho...@gmail.com');>> wrote: >>> >> >>> >> Hello Gilles >>> >> >>> >> Mystery solved! In fact, this one line is exactly what was needed!! It >>> >> turns out the OMPI signal handlers are irrelevant. (i.e. don't make >>> any >>> >> difference when this env variable is set) >>> >> >>> >> This explains: >>> >> >>> >> 1. The difference in the behaviour in the two clusters (one has PSM, >>> the >>> >> other does not) >>> >> 2. Why you couldn't find where in OMPI code the .btr files are being >>> >> generated (looks like they are being generated in PSM library) >>> >> >>> >> And, now that I can get a core file (finally!), I can present the back >>> >> trace where the crash in MPI_Init() is happening. 
It is as follows: >>> >> >>> >> #0 0x00007f1c114977d8 in main_arena () from /lib64/libc.so.6 >>> >> #1 0x00007f1c106719ac in device_destruct (device=0x1c85b70) at >>> >> btl_openib_component.c:985 >>> >> #2 0x00007f1c1066d0ae in opal_obj_run_destructors (object=0x1c85b70) >>> at >>> >> ../../../../opal/class/opal_object.h:460 >>> >> #3 0x00007f1c10674d3c in init_one_device (btl_list=0x7ffd00dada50, >>> >> ib_dev=0x1c85430) at btl_openib_component.c:2255 >>> >> #4 0x00007f1c10676800 in btl_openib_component_init >>> >> (num_btl_modules=0x7ffd00dadb80, enable_progress_threads=true, >>> >> enable_mpi_threads=false) >>> >> at btl_openib_component.c:2752 >>> >> #5 0x00007f1c10633971 in mca_btl_base_select >>> >> (enable_progress_threads=true, enable_mpi_threads=false) at >>> >> base/btl_base_select.c:110 >>> >> #6 0x00007f1c117fb0a0 in mca_bml_r2_component_init >>> >> (priority=0x7ffd00dadc4c, enable_progress_threads=true, >>> >> enable_mpi_threads=false) >>> >> at bml_r2_component.c:86 >>> >> #7 0x00007f1c117f8033 in mca_bml_base_init >>> (enable_progress_threads=true, >>> >> enable_mpi_threads=false) at base/bml_base_init.c:74 >>> >> #8 0x00007f1c1173f675 in ompi_mpi_init (argc=1, argv=0x7ffd00dae008, >>> >> requested=0, provided=0x7ffd00daddbc) at runtime/ompi_mpi_init.c:590 >>> >> #9 0x00007f1c1177c8b7 in PMPI_Init (argc=0x7ffd00daddfc, >>> >> argv=0x7ffd00daddf0) at pinit.c:66 >>> >> #10 0x0000000000400aa0 in main (argc=1, argv=0x7ffd00dae008) at >>> >> mpitest.c:17 >>> >> >>> >> This is with the absolute latest code from master. >>> >> >>> >> Thanks everyone for their help. >>> >> >>> >> Durga >>> >> >>> >> The surgeon general advises you to eat right, exercise regularly and >>> quit >>> >> ageing. >>> >> >>> >> On Wed, May 11, 2016 at 10:55 PM, Gilles Gouaillardet < >>> gil...@rist.or.jp <javascript:_e(%7B%7D,'cvml','gil...@rist.or.jp');>> >>> >>> >> wrote: >>> >>> >>> >>> Note the psm library sets its own signal handler, possibly after the >>> >>> OpenMPI one. >>> >>> >>> >>> that can be disabled by >>> >>> >>> >>> export IPATH_NO_BACKTRACE=1 >>> >>> >>> >>> Cheers, >>> >>> >>> >>> Gilles >>> >>> >>> >>> >>> >>> On 5/12/2016 11:34 AM, dpchoudh . wrote: >>> >>> >>> >>> Hello Gilles >>> >>> >>> >>> Thank you for your continued support. With your help, I have a better >>> >>> understanding of what is happening. Here are the details. >>> >>> >>> >>> 1. Yes, I am sure that ulimit -c is 'unlimited' (and for the test in >>> >>> question, I am running it on a single node, so there are no other >>> nodes) >>> >>> >>> >>> 2. The command I am running is possibly the simplest MPI command: >>> >>> mpirun -np 2 <program> >>> >>> >>> >>> It looks to me, after running your test code, that what is crashing >>> is >>> >>> MPI_Init() itself. The output from your code (I called it 'btrtest') >>> is as >>> >>> follows: >>> >>> >>> >>> [durga@smallMPI ~]$ mpirun -np 2 ./btrtest >>> >>> before MPI_Init : -1 -1 >>> >>> before MPI_Init : -1 -1 >>> >>> >>> >>> btrtest:7275 terminated with signal 11 at PC=7f401f49e7d8 >>> >>> SP=7ffec47e7578. Backtrace: >>> >>> /lib64/libc.so.6(+0x3ba7d8)[0x7f401f49e7d8] >>> >>> >>> >>> btrtest:7274 terminated with signal 11 at PC=7f1ba21897d8 >>> >>> SP=7ffc516ac8d8. Backtrace: >>> >>> /lib64/libc.so.6(+0x3ba7d8)[0x7f1ba21897d8] >>> >>> ------------------------------------------------------- >>> >>> Primary job terminated normally, but 1 process returned >>> >>> a non-zero exit code. Per user-direction, the job has been aborted. 
>>> >>> ------------------------------------------------------- >>> >>> >>> >>> >>> -------------------------------------------------------------------------- >>> >>> mpirun detected that one or more processes exited with non-zero >>> status, >>> >>> thus causing >>> >>> the job to be terminated. The first process to do so was: >>> >>> >>> >>> Process name: [[7936,1],1] >>> >>> Exit code: 1 >>> >>> >>> >>> >>> -------------------------------------------------------------------------- >>> >>> >>> >>> So obviously the code does not make it past MPI_Init() >>> >>> >>> >>> This is an issue that I have been observing for quite a while in >>> >>> different forms and have reported on the forum a few times also. Let >>> me >>> >>> elaborate: >>> >>> >>> >>> Both my 'well-behaving' and crashing clusters run CentOS 7 (the >>> crashing >>> >>> one has the latest updates, the well-behaving one does not as I am >>> not >>> >>> allowed to apply updates on that). They both have OMPI, from the >>> master >>> >>> branch, compiled from the source. Both consist of 64 bit Dell >>> servers, >>> >>> although not identical models (I doubt if that matters) >>> >>> >>> >>> The only significant difference between the two is this: >>> >>> >>> >>> The well behaved one (if it does core dump, that is because there is >>> a >>> >>> bug in the MPI app) has very simple network hardware: two different >>> NICs >>> >>> (one Broadcom GbE, one proprietary NIC that is currently being >>> exposed as an >>> >>> IP interface). There is no RDMA capability there at all. >>> >>> >>> >>> The crashing one have 4 different NICs: >>> >>> 1. Broadcom GbE >>> >>> 2. Chelsio T3 based 10Gb iWARP NIC >>> >>> 3. QLogic 20Gb Infiniband (PSM capable) >>> >>> 4. LSI logic Fibre channel >>> >>> >>> >>> In this situation, WITH ALL BUT THE GbE LINK DOWN (the GbE connects >>> the >>> >>> machine to the WAN link), it seems just the presence of these NICs >>> matter. >>> >>> >>> >>> Here are the various commands and outputs: >>> >>> >>> >>> [durga@smallMPI ~]$ mpirun -np 2 ./btrtest >>> >>> before MPI_Init : -1 -1 >>> >>> before MPI_Init : -1 -1 >>> >>> >>> >>> btrtest:10032 terminated with signal 11 at PC=7f6897c197d8 >>> >>> SP=7ffcae2b2ef8. Backtrace: >>> >>> /lib64/libc.so.6(+0x3ba7d8)[0x7f6897c197d8] >>> >>> >>> >>> btrtest:10033 terminated with signal 11 at PC=7fb035c3e7d8 >>> >>> SP=7ffe61a92088. Backtrace: >>> >>> /lib64/libc.so.6(+0x3ba7d8)[0x7fb035c3e7d8] >>> >>> ------------------------------------------------------- >>> >>> Primary job terminated normally, but 1 process returned >>> >>> a non-zero exit code. Per user-direction, the job has been aborted. >>> >>> ------------------------------------------------------- >>> >>> >>> >>> >>> -------------------------------------------------------------------------- >>> >>> mpirun detected that one or more processes exited with non-zero >>> status, >>> >>> thus causing >>> >>> the job to be terminated. The first process to do so was: >>> >>> >>> >>> Process name: [[9294,1],0] >>> >>> Exit code: 1 >>> >>> >>> >>> >>> -------------------------------------------------------------------------- >>> >>> >>> >>> [durga@smallMPI ~]$ mpirun -np 2 -mca pml ob1 ./btrtest >>> >>> before MPI_Init : -1 -1 >>> >>> before MPI_Init : -1 -1 >>> >>> >>> >>> btrtest:10076 terminated with signal 11 at PC=7fa92d20b7d8 >>> >>> SP=7ffebb106028. Backtrace: >>> >>> /lib64/libc.so.6(+0x3ba7d8)[0x7fa92d20b7d8] >>> >>> >>> >>> btrtest:10077 terminated with signal 11 at PC=7f5012fa57d8 >>> >>> SP=7ffea4f4fdf8. 
Backtrace: >>> >>> /lib64/libc.so.6(+0x3ba7d8)[0x7f5012fa57d8] >>> >>> ------------------------------------------------------- >>> >>> Primary job terminated normally, but 1 process returned >>> >>> a non-zero exit code. Per user-direction, the job has been aborted. >>> >>> ------------------------------------------------------- >>> >>> >>> >>> >>> -------------------------------------------------------------------------- >>> >>> mpirun detected that one or more processes exited with non-zero >>> status, >>> >>> thus causing >>> >>> the job to be terminated. The first process to do so was: >>> >>> >>> >>> Process name: [[9266,1],0] >>> >>> Exit code: 1 >>> >>> >>> >>> >>> -------------------------------------------------------------------------- >>> >>> >>> >>> [durga@smallMPI ~]$ mpirun -np 2 -mca pml ob1 -mca btl self,sm >>> ./btrtest >>> >>> before MPI_Init : -1 -1 >>> >>> before MPI_Init : -1 -1 >>> >>> >>> >>> btrtest:10198 terminated with signal 11 at PC=400829 SP=7ffe6e148870. >>> >>> Backtrace: >>> >>> >>> >>> btrtest:10197 terminated with signal 11 at PC=400829 SP=7ffe87be6cd0. >>> >>> Backtrace: >>> >>> ./btrtest[0x400829] >>> >>> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f9473bbeb15] >>> >>> ./btrtest[0x4006d9] >>> >>> ./btrtest[0x400829] >>> >>> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fdfe2d8eb15] >>> >>> ./btrtest[0x4006d9] >>> >>> after MPI_Init : -1 -1 >>> >>> after MPI_Init : -1 -1 >>> >>> ------------------------------------------------------- >>> >>> Primary job terminated normally, but 1 process returned >>> >>> a non-zero exit code. Per user-direction, the job has been aborted. >>> >>> ------------------------------------------------------- >>> >>> >>> >>> >>> -------------------------------------------------------------------------- >>> >>> mpirun detected that one or more processes exited with non-zero >>> status, >>> >>> thus causing >>> >>> the job to be terminated. The first process to do so was: >>> >>> >>> >>> Process name: [[9384,1],1] >>> >>> Exit code: 1 >>> >>> >>> >>> >>> -------------------------------------------------------------------------- >>> >>> >>> >>> >>> >>> [durga@smallMPI ~]$ ulimit -a >>> >>> core file size (blocks, -c) unlimited >>> >>> data seg size (kbytes, -d) unlimited >>> >>> scheduling priority (-e) 0 >>> >>> file size (blocks, -f) unlimited >>> >>> pending signals (-i) 216524 >>> >>> max locked memory (kbytes, -l) unlimited >>> >>> max memory size (kbytes, -m) unlimited >>> >>> open files (-n) 1024 >>> >>> pipe size (512 bytes, -p) 8 >>> >>> POSIX message queues (bytes, -q) 819200 >>> >>> real-time priority (-r) 0 >>> >>> stack size (kbytes, -s) 8192 >>> >>> cpu time (seconds, -t) unlimited >>> >>> max user processes (-u) 4096 >>> >>> virtual memory (kbytes, -v) unlimited >>> >>> file locks (-x) unlimited >>> >>> [durga@smallMPI ~]$ >>> >>> >>> >>> >>> >>> I do realize that my setup is very unusual (I am a quasi-developer >>> of MPI >>> >>> whereas most other folks in this list are likely end-users), but >>> somehow >>> >>> just disabling this 'execinfo' MCA would allow me to make progress >>> (and also >>> >>> find out why/where MPI_Init() is crashing!). Is there any way I can >>> do that? >>> >>> >>> >>> Thank you >>> >>> Durga >>> >>> >>> >>> The surgeon general advises you to eat right, exercise regularly and >>> quit >>> >>> ageing. 
>>> >>> >>> >>> On Wed, May 11, 2016 at 8:58 PM, Gilles Gouaillardet < >>> gil...@rist.or.jp <javascript:_e(%7B%7D,'cvml','gil...@rist.or.jp');>> >>> >>> wrote: >>> >>>> >>> >>>> Are you sure ulimit -c unlimited is *really* applied on all hosts >>> >>>> >>> >>>> >>> >>>> can you please run the simple program below and confirm that ? >>> >>>> >>> >>>> >>> >>>> Cheers, >>> >>>> >>> >>>> >>> >>>> Gilles >>> >>>> >>> >>>> >>> >>>> #include <sys/time.h> >>> >>>> #include <sys/resource.h> >>> >>>> #include <poll.h> >>> >>>> #include <stdio.h> >>> >>>> >>> >>>> int main(int argc, char *argv[]) { >>> >>>> struct rlimit rlim; >>> >>>> char * c = (char *)0; >>> >>>> getrlimit(RLIMIT_CORE, &rlim); >>> >>>> printf ("before MPI_Init : %d %d\n", rlim.rlim_cur, >>> rlim.rlim_max); >>> >>>> MPI_Init(&argc, &argv); >>> >>>> getrlimit(RLIMIT_CORE, &rlim); >>> >>>> printf ("after MPI_Init : %d %d\n", rlim.rlim_cur, >>> rlim.rlim_max); >>> >>>> *c = 0; >>> >>>> MPI_Finalize(); >>> >>>> return 0; >>> >>>> } >>> >>>> >>> >>>> >>> >>>> On 5/12/2016 4:22 AM, dpchoudh . wrote: >>> >>>> >>> >>>> Hello Gilles >>> >>>> >>> >>>> Thank you for the advice. However, that did not seem to make any >>> >>>> difference. Here is what I did (on the cluster that generates .btr >>> files for >>> >>>> core dumps): >>> >>>> >>> >>>> [durga@smallMPI git]$ ompi_info --all | grep opal_signal >>> >>>> MCA opal base: parameter "opal_signal" (current value: >>> >>>> "6,7,8,11", data source: default, level: 3 user/all, type: string) >>> >>>> [durga@smallMPI git]$ >>> >>>> >>> >>>> >>> >>>> According to <bits/signum.h>, signals 6.7,8,11 are this: >>> >>>> >>> >>>> #define SIGABRT 6 /* Abort (ANSI). */ >>> >>>> #define SIGBUS 7 /* BUS error (4.2 BSD). */ >>> >>>> #define SIGFPE 8 /* Floating-point exception (ANSI). >>> */ >>> >>>> #define SIGSEGV 11 /* Segmentation violation (ANSI). >>> */ >>> >>>> >>> >>>> And thus I added the following just after MPI_Init() >>> >>>> >>> >>>> MPI_Init(&argc, &argv); >>> >>>> signal(SIGABRT, SIG_DFL); >>> >>>> signal(SIGBUS, SIG_DFL); >>> >>>> signal(SIGFPE, SIG_DFL); >>> >>>> signal(SIGSEGV, SIG_DFL); >>> >>>> signal(SIGTERM, SIG_DFL); >>> >>>> >>> >>>> (I added the 'SIGTERM' part later, just in case it would make a >>> >>>> difference; i didn't) >>> >>>> >>> >>>> The resulting code still generates .btr files instead of core files. >>> >>>> >>> >>>> It looks like the 'execinfo' MCA component is being used as the >>> >>>> backtrace mechanism: >>> >>>> >>> >>>> [durga@smallMPI git]$ ompi_info | grep backtrace >>> >>>> MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, >>> Component >>> >>>> v3.0.0) >>> >>>> >>> >>>> However, I could not find any way to choose 'none' instead of >>> 'execinfo' >>> >>>> >>> >>>> And the strange thing is, on the cluster where regular core dump is >>> >>>> happening, the output of >>> >>>> $ ompi_info | grep backtrace >>> >>>> is identical to the above. (Which kind of makes sense because they >>> were >>> >>>> created from the same source with the same configure options.) >>> >>>> >>> >>>> Sorry to harp on this, but without a core file it is hard to debug >>> the >>> >>>> application (e.g. examine stack variables). >>> >>>> >>> >>>> Thank you >>> >>>> Durga >>> >>>> >>> >>>> >>> >>>> The surgeon general advises you to eat right, exercise regularly and >>> >>>> quit ageing. 
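(The two test snippets quoted above are easiest to try as a single self-contained program. The sketch below simply merges Gilles' rlimit check with the handler-restoration calls, adding the mpi.h include the quoted version needs; the file name btrtest.c is only a suggestion, and the signal numbers are the opal_signal default "6,7,8,11" shown above.)

/* btrtest.c: print RLIMIT_CORE around MPI_Init, restore the default handlers
 * for the signals listed by opal_signal, then segfault on purpose. */
#include <sys/time.h>
#include <sys/resource.h>
#include <signal.h>
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    struct rlimit rlim;
    char *c = (char *)0;

    getrlimit(RLIMIT_CORE, &rlim);
    printf("before MPI_Init : %d %d\n", (int)rlim.rlim_cur, (int)rlim.rlim_max);

    MPI_Init(&argc, &argv);

    signal(SIGABRT, SIG_DFL);   /*  6 */
    signal(SIGBUS,  SIG_DFL);   /*  7 */
    signal(SIGFPE,  SIG_DFL);   /*  8 */
    signal(SIGSEGV, SIG_DFL);   /* 11 */

    getrlimit(RLIMIT_CORE, &rlim);
    printf("after MPI_Init : %d %d\n", (int)rlim.rlim_cur, (int)rlim.rlim_max);

    *c = 0;                     /* deliberate SIGSEGV */

    MPI_Finalize();
    return 0;
}

Build with mpicc btrtest.c -o btrtest and run it exactly as in the transcripts above (mpirun -np 2 ./btrtest), with and without IPATH_NO_BACKTRACE=1 exported, to see which handler is actually producing the .btr file rather than a core dump.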
>>> >>>> >>> >>>> On Wed, May 11, 2016 at 3:37 AM, Gilles Gouaillardet >>> >>>> <gilles.gouaillar...@gmail.com >>> <javascript:_e(%7B%7D,'cvml','gilles.gouaillar...@gmail.com');>> wrote: >>> >>>>> >>> >>>>> Durga, >>> >>>>> >>> >>>>> you might wanna try to restore the signal handler for other >>> signals as >>> >>>>> well >>> >>>>> (SIGSEGV, SIGBUS, ...) >>> >>>>> ompi_info --all | grep opal_signal >>> >>>>> does list the signal you should restore the handler >>> >>>>> >>> >>>>> >>> >>>>> only one backtrace component is built (out of several candidates : >>> >>>>> execinfo, none, printstack) >>> >>>>> nm -l libopen-pal.so | grep backtrace >>> >>>>> will hint you which component was built >>> >>>>> >>> >>>>> your two similar distros might have different backtrace component >>> >>>>> >>> >>>>> >>> >>>>> >>> >>>>> Gus, >>> >>>>> >>> >>>>> btr is a plain text file with a back trace "ala" gdb >>> >>>>> >>> >>>>> >>> >>>>> >>> >>>>> Nathan, >>> >>>>> >>> >>>>> i did a 'grep btr' and could not find anything :-( >>> >>>>> opal_backtrace_buffer and opal_backtrace_print are only used with >>> >>>>> stderr. >>> >>>>> so i am puzzled who creates the tracefile name and where ... >>> >>>>> also, no stack is printed by default unless opal_abort_print_stack >>> is >>> >>>>> true >>> >>>>> >>> >>>>> Cheers, >>> >>>>> >>> >>>>> Gilles >>> >>>>> >>> >>>>> >>> >>>>> On Wed, May 11, 2016 at 3:43 PM, dpchoudh . <dpcho...@gmail.com >>> <javascript:_e(%7B%7D,'cvml','dpcho...@gmail.com');>> wrote: >>> >>>>> > Hello Nathan >>> >>>>> > >>> >>>>> > Thank you for your response. Could you please be more specific? >>> >>>>> > Adding the >>> >>>>> > following after MPI_Init() does not seem to make a difference. >>> >>>>> > >>> >>>>> > MPI_Init(&argc, &argv); >>> >>>>> > signal(SIGABRT, SIG_DFL); >>> >>>>> > signal(SIGTERM, SIG_DFL); >>> >>>>> > >>> >>>>> > I also find it puzzling that nearly identical OMPI distro >>> running on >>> >>>>> > a >>> >>>>> > different machine shows different behaviour. >>> >>>>> > >>> >>>>> > Best regards >>> >>>>> > Durga >>> >>>>> > >>> >>>>> > The surgeon general advises you to eat right, exercise regularly >>> and >>> >>>>> > quit >>> >>>>> > ageing. >>> >>>>> > >>> >>>>> > On Tue, May 10, 2016 at 10:02 AM, Hjelm, Nathan Thomas >>> >>>>> > <hje...@lanl.gov >>> <javascript:_e(%7B%7D,'cvml','hje...@lanl.gov');>> >>> >>>>> > wrote: >>> >>>>> >> >>> >>>>> >> btr files are indeed created by open mpi's backtrace mechanism. >>> I >>> >>>>> >> think we >>> >>>>> >> should revisit it at some point but for now the only effective >>> way i >>> >>>>> >> have >>> >>>>> >> found to prevent it is to restore the default signal handlers >>> after >>> >>>>> >> MPI_Init. >>> >>>>> >> >>> >>>>> >> Excuse the quoting style. Good sucks. >>> >>>>> >> >>> >>>>> >> >>> >>>>> >> ________________________________________ >>> >>>>> >> From: users on behalf of dpchoudh . >>> >>>>> >> Sent: Monday, May 09, 2016 2:59:37 PM >>> >>>>> >> To: Open MPI Users >>> >>>>> >> Subject: Re: [OMPI users] No core dump in some cases >>> >>>>> >> >>> >>>>> >> Hi Gus >>> >>>>> >> >>> >>>>> >> Thanks for your suggestion. But I am not using any resource >>> manager >>> >>>>> >> (i.e. >>> >>>>> >> I am launching mpirun from the bash shell.). In fact, both of >>> the >>> >>>>> >> two >>> >>>>> >> clusters I talked about run CentOS 7 and I launch the job the >>> same >>> >>>>> >> way on >>> >>>>> >> both of these, yet one of them creates standard core files and >>> the >>> >>>>> >> other >>> >>>>> >> creates the 'btr; files. 
Strange thing is, I could not find >>> anything >>> >>>>> >> on the >>> >>>>> >> .btr (= Backtrace?) files on Google, which is any I asked on >>> this >>> >>>>> >> forum. >>> >>>>> >> >>> >>>>> >> Best regards >>> >>>>> >> Durga >>> >>>>> >> >>> >>>>> >> The surgeon general advises you to eat right, exercise >>> regularly and >>> >>>>> >> quit >>> >>>>> >> ageing. >>> >>>>> >> >>> >>>>> >> On Mon, May 9, 2016 at 12:04 PM, Gus Correa >>> >>>>> >> < <javascript:_e(%7B%7D,'cvml','g...@ldeo.columbia.edu');> >>> g...@ldeo.columbia.edu >>> <javascript:_e(%7B%7D,'cvml','g...@ldeo.columbia.edu');><mailto: >>> <javascript:_e(%7B%7D,'cvml','g...@ldeo.columbia.edu');> >>> g...@ldeo.columbia.edu >>> <javascript:_e(%7B%7D,'cvml','g...@ldeo.columbia.edu');>>> wrote: >>> >>>>> >> Hi Durga >>> >>>>> >> >>> >>>>> >> Just in case ... >>> >>>>> >> If you're using a resource manager to start the jobs (Torque, >>> etc), >>> >>>>> >> you need to have them set the limits (for coredump size, >>> stacksize, >>> >>>>> >> locked >>> >>>>> >> memory size, etc). >>> >>>>> >> This way the jobs will inherit the limits from the >>> >>>>> >> resource manager daemon. >>> >>>>> >> On Torque (which I use) I do this on the pbs_mom daemon >>> >>>>> >> init script (I am still before the systemd era, that lovely >>> POS). >>> >>>>> >> And set the hard/soft limits on /etc/security/limits.conf as >>> well. >>> >>>>> >> >>> >>>>> >> I hope this helps, >>> >>>>> >> Gus Correa >>> >>>>> >> >>> >>>>> >> On 05/07/2016 12:27 PM, Jeff Squyres (jsquyres) wrote: >>> >>>>> >> I'm afraid I don't know what a .btr file is -- that is not >>> something >>> >>>>> >> that >>> >>>>> >> is controlled by Open MPI. >>> >>>>> >> >>> >>>>> >> You might want to look into your OS settings to see if it has >>> some >>> >>>>> >> kind of >>> >>>>> >> alternate corefile mechanism...? >>> >>>>> >> >>> >>>>> >> >>> >>>>> >> On May 6, 2016, at 8:58 PM, dpchoudh . >>> >>>>> >> < <javascript:_e(%7B%7D,'cvml','dpcho...@gmail.com');> >>> dpcho...@gmail.com <javascript:_e(%7B%7D,'cvml','dpcho...@gmail.com');> >>> <mailto: <javascript:_e(%7B%7D,'cvml','dpcho...@gmail.com');> >>> dpcho...@gmail.com <javascript:_e(%7B%7D,'cvml','dpcho...@gmail.com');>>> >>> wrote: >>> >>>>> >> >>> >>>>> >> Hello all >>> >>>>> >> >>> >>>>> >> I run MPI jobs (for test purpose only) on two different >>> 'clusters'. >>> >>>>> >> Both >>> >>>>> >> 'clusters' have two nodes only, connected back-to-back. The two >>> are >>> >>>>> >> very >>> >>>>> >> similar, but not identical, both software and hardware wise. >>> >>>>> >> >>> >>>>> >> Both have ulimit -c set to unlimited. However, only one of the >>> two >>> >>>>> >> creates >>> >>>>> >> core files when an MPI job crashes. The other creates a text >>> file >>> >>>>> >> named >>> >>>>> >> something like >>> >>>>> >> >>> >>>>> >> >>> >>>>> >> >>> <program_name_that_crashed>.80s-<a-number-that-looks-like-a-PID>,<hostname-where-the-crash-happened>.btr >>> >>>>> >> >>> >>>>> >> I'd much prefer a core file because that allows me to debug >>> with a >>> >>>>> >> lot >>> >>>>> >> more options than a static text file with addresses. How do I >>> get a >>> >>>>> >> core >>> >>>>> >> file in all situations? I am using MPI source from the master >>> >>>>> >> branch. >>> >>>>> >> >>> >>>>> >> Thanks in advance >>> >>>>> >> Durga >>> >>>>> >> >>> >>>>> >> The surgeon general advises you to eat right, exercise >>> regularly and >>> >>>>> >> quit >>> >>>>> >> ageing. 