This is a known problem - I committed the fix for PSM with a link down just today.
> On May 11, 2016, at 7:34 PM, dpchoudh . <dpcho...@gmail.com> wrote:
>
> Hello Gilles
>
> Thank you for your continued support. With your help, I have a better
> understanding of what is happening. Here are the details.
>
> 1. Yes, I am sure that ulimit -c is 'unlimited' (and for the test in
> question, I am running it on a single node, so there are no other nodes)
>
> 2. The command I am running is possibly the simplest MPI command:
> mpirun -np 2 <program>
>
> It looks to me, after running your test code, that what is crashing is
> MPI_Init() itself. The output from your code (I called it 'btrtest') is
> as follows:
>
> [durga@smallMPI ~]$ mpirun -np 2 ./btrtest
> before MPI_Init : -1 -1
> before MPI_Init : -1 -1
>
> btrtest:7275 terminated with signal 11 at PC=7f401f49e7d8 SP=7ffec47e7578.
> Backtrace:
> /lib64/libc.so.6(+0x3ba7d8)[0x7f401f49e7d8]
>
> btrtest:7274 terminated with signal 11 at PC=7f1ba21897d8 SP=7ffc516ac8d8.
> Backtrace:
> /lib64/libc.so.6(+0x3ba7d8)[0x7f1ba21897d8]
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status,
> thus causing the job to be terminated. The first process to do so was:
>
>   Process name: [[7936,1],1]
>   Exit code:    1
> --------------------------------------------------------------------------
>
> So obviously the code does not make it past MPI_Init().
>
> This is an issue that I have been observing for quite a while in
> different forms and have reported on the forum a few times also. Let me
> elaborate:
>
> Both my 'well-behaving' and crashing clusters run CentOS 7 (the crashing
> one has the latest updates; the well-behaving one does not, as I am not
> allowed to apply updates on that). Both have OMPI, from the master
> branch, compiled from the source. Both consist of 64-bit Dell servers,
> although not identical models (I doubt that matters).
>
> The only significant difference between the two is this:
>
> The well-behaved one (if it does core dump, that is because there is a
> bug in the MPI app) has very simple network hardware: two different NICs
> (one Broadcom GbE, one proprietary NIC that is currently being exposed
> as an IP interface). There is no RDMA capability there at all.
>
> The crashing one has 4 different NICs:
> 1. Broadcom GbE
> 2. Chelsio T3 based 10Gb iWARP NIC
> 3. QLogic 20Gb InfiniBand (PSM capable)
> 4. LSI Logic Fibre Channel
>
> In this situation, WITH ALL BUT THE GbE LINK DOWN (the GbE connects the
> machine to the WAN link), it seems just the presence of these NICs
> matters.
>
> Here are the various commands and outputs:
>
> [durga@smallMPI ~]$ mpirun -np 2 ./btrtest
> before MPI_Init : -1 -1
> before MPI_Init : -1 -1
>
> btrtest:10032 terminated with signal 11 at PC=7f6897c197d8 SP=7ffcae2b2ef8.
> Backtrace:
> /lib64/libc.so.6(+0x3ba7d8)[0x7f6897c197d8]
>
> btrtest:10033 terminated with signal 11 at PC=7fb035c3e7d8 SP=7ffe61a92088.
> Backtrace:
> /lib64/libc.so.6(+0x3ba7d8)[0x7fb035c3e7d8]
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status,
> thus causing the job to be terminated. The first process to do so was:
>
>   Process name: [[9294,1],0]
>   Exit code:    1
> --------------------------------------------------------------------------
>
> [durga@smallMPI ~]$ mpirun -np 2 -mca pml ob1 ./btrtest
> before MPI_Init : -1 -1
> before MPI_Init : -1 -1
>
> btrtest:10076 terminated with signal 11 at PC=7fa92d20b7d8 SP=7ffebb106028.
> Backtrace:
> /lib64/libc.so.6(+0x3ba7d8)[0x7fa92d20b7d8]
>
> btrtest:10077 terminated with signal 11 at PC=7f5012fa57d8 SP=7ffea4f4fdf8.
> Backtrace:
> /lib64/libc.so.6(+0x3ba7d8)[0x7f5012fa57d8]
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status,
> thus causing the job to be terminated. The first process to do so was:
>
>   Process name: [[9266,1],0]
>   Exit code:    1
> --------------------------------------------------------------------------
>
> [durga@smallMPI ~]$ mpirun -np 2 -mca pml ob1 -mca btl self,sm ./btrtest
> before MPI_Init : -1 -1
> before MPI_Init : -1 -1
>
> btrtest:10198 terminated with signal 11 at PC=400829 SP=7ffe6e148870.
> Backtrace:
>
> btrtest:10197 terminated with signal 11 at PC=400829 SP=7ffe87be6cd0.
> Backtrace:
> ./btrtest[0x400829]
> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f9473bbeb15]
> ./btrtest[0x4006d9]
> ./btrtest[0x400829]
> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fdfe2d8eb15]
> ./btrtest[0x4006d9]
> after MPI_Init : -1 -1
> after MPI_Init : -1 -1
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status,
> thus causing the job to be terminated. The first process to do so was:
>
>   Process name: [[9384,1],1]
>   Exit code:    1
> --------------------------------------------------------------------------
>
> [durga@smallMPI ~]$ ulimit -a
> core file size          (blocks, -c) unlimited
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 216524
> max locked memory       (kbytes, -l) unlimited
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 1024
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 4096
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
> [durga@smallMPI ~]$
>
> I do realize that my setup is very unusual (I am a quasi-developer of
> MPI, whereas most other folks on this list are likely end-users), but
> somehow just disabling this 'execinfo' MCA component would allow me to
> make progress (and also find out why/where MPI_Init() is crashing!). Is
> there any way I can do that?
>
> Thank you
> Durga
>
> The surgeon general advises you to eat right, exercise regularly and
> quit ageing.
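[Editor's sketch, not from the thread. Two notes on the runs above. First,
with -mca pml ob1 -mca btl self,sm the 'after MPI_Init' lines do print and
the crash PC is inside btrtest itself, so that run makes it through
MPI_Init() and dies at the deliberate *c = 0 in Gilles' test program; only
the runs that can touch the other interconnects die inside MPI_Init().
Second, on disabling the handler: the opal_signal MCA parameter quoted
later in this thread holds the list of signals Open MPI registers its
backtrace handler for, so clearing that list at launch might keep SIGSEGV
from being intercepted at all. Untested; whether an empty value is
accepted is an assumption, so verify with ompi_info:

    # Assumption: opal_signal accepts an empty list. If it does, Open MPI
    # registers no handlers for signals 6,7,8,11 and the kernel's normal
    # core-dump path runs instead of the execinfo backtrace.
    mpirun -np 2 -mca opal_signal "" ./btrtest

    # Check how the parameter was actually parsed:
    ompi_info --all | grep opal_signal
]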
> On Wed, May 11, 2016 at 8:58 PM, Gilles Gouaillardet
> <gil...@rist.or.jp> wrote:
>
> Are you sure ulimit -c unlimited is *really* applied on all hosts?
>
> Can you please run the simple program below and confirm that?
>
> Cheers,
>
> Gilles
>
> #include <sys/time.h>
> #include <sys/resource.h>
> #include <stdio.h>
> #include <mpi.h>
>
> int main(int argc, char *argv[]) {
>     struct rlimit rlim;
>     char *c = (char *)0;
>     getrlimit(RLIMIT_CORE, &rlim);
>     printf("before MPI_Init : %lld %lld\n",
>            (long long)rlim.rlim_cur, (long long)rlim.rlim_max);
>     MPI_Init(&argc, &argv);
>     getrlimit(RLIMIT_CORE, &rlim);
>     printf("after MPI_Init : %lld %lld\n",
>            (long long)rlim.rlim_cur, (long long)rlim.rlim_max);
>     *c = 0;   /* deliberate SIGSEGV, to test whether a core is dumped */
>     MPI_Finalize();
>     return 0;
> }
>
> On 5/12/2016 4:22 AM, dpchoudh . wrote:
>> Hello Gilles
>>
>> Thank you for the advice. However, that did not seem to make any
>> difference. Here is what I did (on the cluster that generates .btr
>> files for core dumps):
>>
>> [durga@smallMPI git]$ ompi_info --all | grep opal_signal
>>           MCA opal base: parameter "opal_signal" (current value:
>>           "6,7,8,11", data source: default, level: 3 user/all, type: string)
>> [durga@smallMPI git]$
>>
>> According to <bits/signum.h>, signals 6, 7, 8, 11 are these:
>>
>> #define SIGABRT  6  /* Abort (ANSI). */
>> #define SIGBUS   7  /* BUS error (4.2 BSD). */
>> #define SIGFPE   8  /* Floating-point exception (ANSI). */
>> #define SIGSEGV 11  /* Segmentation violation (ANSI). */
>>
>> And thus I added the following just after MPI_Init():
>>
>> MPI_Init(&argc, &argv);
>> signal(SIGABRT, SIG_DFL);
>> signal(SIGBUS, SIG_DFL);
>> signal(SIGFPE, SIG_DFL);
>> signal(SIGSEGV, SIG_DFL);
>> signal(SIGTERM, SIG_DFL);
>>
>> (I added the SIGTERM part later, just in case it would make a
>> difference; it didn't.)
>>
>> The resulting code still generates .btr files instead of core files.
>>
>> It looks like the 'execinfo' MCA component is being used as the
>> backtrace mechanism:
>>
>> [durga@smallMPI git]$ ompi_info | grep backtrace
>>           MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v3.0.0)
>>
>> However, I could not find any way to choose 'none' instead of
>> 'execinfo'.
>>
>> And the strange thing is, on the cluster where regular core dumps
>> happen, the output of
>> $ ompi_info | grep backtrace
>> is identical to the above. (Which kind of makes sense, because they
>> were built from the same source with the same configure options.)
>>
>> Sorry to harp on this, but without a core file it is hard to debug the
>> application (e.g. examine stack variables).
>>
>> Thank you
>> Durga
>>
>> The surgeon general advises you to eat right, exercise regularly and
>> quit ageing.
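[Editor's sketch, not from the thread. One way to see why the
signal(..., SIG_DFL) resets above appear to have no effect is to query
what is actually installed for SIGSEGV after MPI_Init() and again after
the resets; if a handler is still (or again) reported, something
re-registered it. Standard POSIX only; the function name is made up for
illustration:

    #include <signal.h>
    #include <stdio.h>

    /* Query-only sigaction call (act == NULL): reports the current
     * disposition of SIGSEGV without changing it. */
    static void report_sigsegv_disposition(void)
    {
        struct sigaction sa;
        if (sigaction(SIGSEGV, NULL, &sa) != 0) {
            perror("sigaction");
            return;
        }
        if (sa.sa_flags & SA_SIGINFO)
            printf("SIGSEGV: SA_SIGINFO handler installed\n");
        else if (sa.sa_handler == SIG_DFL)
            printf("SIGSEGV: default disposition (kernel core-dump path)\n");
        else if (sa.sa_handler == SIG_IGN)
            printf("SIGSEGV: ignored\n");
        else
            printf("SIGSEGV: plain handler installed\n");
    }

    /* Call right after MPI_Init() and again after the SIG_DFL resets. */
]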
>> On Wed, May 11, 2016 at 3:37 AM, Gilles Gouaillardet
>> <gilles.gouaillar...@gmail.com> wrote:
>>
>> Durga,
>>
>> you might wanna try to restore the signal handler for other signals as
>> well (SIGSEGV, SIGBUS, ...)
>>
>> ompi_info --all | grep opal_signal
>> does list the signals whose handlers you should restore.
>>
>> Only one backtrace component is built (out of several candidates:
>> execinfo, none, printstack).
>> nm -l libopen-pal.so | grep backtrace
>> will hint at which component was built.
>>
>> Your two similar distros might have different backtrace components
>> built.
>>
>> Gus,
>>
>> btr is a plain text file with a backtrace a la gdb.
>>
>> Nathan,
>>
>> I did a 'grep btr' and could not find anything :-(
>> opal_backtrace_buffer and opal_backtrace_print are only used with
>> stderr, so I am puzzled who creates the tracefile name, and where ...
>> Also, no stack is printed by default unless opal_abort_print_stack is
>> true.
>>
>> Cheers,
>>
>> Gilles
>>
>> On Wed, May 11, 2016 at 3:43 PM, dpchoudh . <dpcho...@gmail.com> wrote:
>> > Hello Nathan
>> >
>> > Thank you for your response. Could you please be more specific?
>> > Adding the following after MPI_Init() does not seem to make a
>> > difference.
>> >
>> > MPI_Init(&argc, &argv);
>> > signal(SIGABRT, SIG_DFL);
>> > signal(SIGTERM, SIG_DFL);
>> >
>> > I also find it puzzling that a nearly identical OMPI distro running
>> > on a different machine shows different behaviour.
>> >
>> > Best regards
>> > Durga
>> >
>> > The surgeon general advises you to eat right, exercise regularly and
>> > quit ageing.
>> >
>> > On Tue, May 10, 2016 at 10:02 AM, Hjelm, Nathan Thomas
>> > <hje...@lanl.gov> wrote:
>> >>
>> >> btr files are indeed created by Open MPI's backtrace mechanism. I
>> >> think we should revisit it at some point, but for now the only
>> >> effective way I have found to prevent it is to restore the default
>> >> signal handlers after MPI_Init.
>> >>
>> >> Excuse the quoting style. Good sucks.
>> >>
>> >> ________________________________________
>> >> From: users on behalf of dpchoudh .
>> >> Sent: Monday, May 09, 2016 2:59:37 PM
>> >> To: Open MPI Users
>> >> Subject: Re: [OMPI users] No core dump in some cases
>> >>
>> >> Hi Gus
>> >>
>> >> Thanks for your suggestion. But I am not using any resource manager
>> >> (i.e. I am launching mpirun from the bash shell). In fact, both of
>> >> the two clusters I talked about run CentOS 7 and I launch the job
>> >> the same way on both of these, yet one of them creates standard
>> >> core files and the other creates the '.btr' files. Strange thing
>> >> is, I could not find anything about the .btr (= Backtrace?) files
>> >> on Google, which is why I asked on this forum.
>> >>
>> >> Best regards
>> >> Durga
>> >>
>> >> The surgeon general advises you to eat right, exercise regularly
>> >> and quit ageing.
>> >>
>> >> On Mon, May 9, 2016 at 12:04 PM, Gus Correa
>> >> <g...@ldeo.columbia.edu> wrote:
>> >> Hi Durga
>> >>
>> >> Just in case ...
>> >> If you're using a resource manager to start the jobs (Torque, etc.),
>> >> you need to have them set the limits (for coredump size, stacksize,
>> >> locked memory size, etc.).
>> >> This way the jobs will inherit the limits from the resource manager
>> >> daemon.
>> >> On Torque (which I use) I do this in the pbs_mom daemon init script
>> >> (I am still before the systemd era, that lovely POS).
>> >> And set the hard/soft limits in /etc/security/limits.conf as well.
>> >>
>> >> I hope this helps,
>> >> Gus Correa
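[Editor's sketch, not from the thread. The /etc/security/limits.conf
entries Gus mentions would look something like the following; the values
are illustrative assumptions, applied by PAM at login on most Linux
systems:

    # /etc/security/limits.conf -- illustrative entries for Gus's advice
    # <domain>  <type>  <item>     <value>
    *           soft    core       unlimited
    *           hard    core       unlimited
    *           soft    memlock    unlimited
    *           hard    memlock    unlimited
]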
>> >> On 05/07/2016 12:27 PM, Jeff Squyres (jsquyres) wrote:
>> >> I'm afraid I don't know what a .btr file is -- that is not
>> >> something that is controlled by Open MPI.
>> >>
>> >> You might want to look into your OS settings to see if it has some
>> >> kind of alternate corefile mechanism...?
>> >>
>> >> On May 6, 2016, at 8:58 PM, dpchoudh . <dpcho...@gmail.com> wrote:
>> >>
>> >> Hello all
>> >>
>> >> I run MPI jobs (for test purposes only) on two different
>> >> 'clusters'. Both 'clusters' have two nodes only, connected
>> >> back-to-back. The two are very similar, but not identical, both
>> >> software- and hardware-wise.
>> >>
>> >> Both have ulimit -c set to unlimited. However, only one of the two
>> >> creates core files when an MPI job crashes. The other creates a
>> >> text file named something like
>> >>
>> >> <program_name_that_crashed>.80s-<a-number-that-looks-like-a-PID>,<hostname-where-the-crash-happened>.btr
>> >>
>> >> I'd much prefer a core file, because that allows me to debug with a
>> >> lot more options than a static text file with addresses. How do I
>> >> get a core file in all situations? I am using MPI source from the
>> >> master branch.
>> >>
>> >> Thanks in advance
>> >> Durga
>> >>
>> >> The surgeon general advises you to eat right, exercise regularly
>> >> and quit ageing.
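[Editor's sketch, not from the thread. Jeff's "alternate corefile
mechanism" can be checked directly: on stock CentOS 7 the abrt service
typically installs a piped core handler, which would explain cores not
landing on disk (though not, by itself, the .btr naming):

    # If this prints a pattern starting with '|', the kernel pipes crashes
    # to that program (abrt on stock CentOS 7) instead of writing a core
    # file into the working directory:
    cat /proc/sys/kernel/core_pattern

    # Quick test: restore plain core files (as root; not persistent
    # across reboots):
    echo core > /proc/sys/kernel/core_pattern
]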