Hello Nathan

Thank you for your response. Could you please be more specific? Adding the
following after MPI_Init() does not seem to make a difference.

    MPI_Init(&argc, &argv);

    signal(SIGABRT, SIG_DFL);
    signal(SIGTERM, SIG_DFL);
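
For reference, here is a minimal sketch of a broader reset than the two
signal() calls above. It rests on an assumption not confirmed in this
thread: that the backtrace handler is attached to the signal that actually
kills the process (a segmentation fault raises SIGSEGV, not SIGABRT or
SIGTERM). The signal list and the helper name
restore_default_crash_handlers() are hypothetical choices for illustration.

```c
#include <signal.h>
#include <stddef.h>

/* Assumed set of crash signals; the set Open MPI really intercepts
 * may differ, so adjust as needed. */
static const int crash_signals[] = { SIGSEGV, SIGBUS, SIGFPE, SIGILL, SIGABRT };

/* Hypothetical helper: reset each listed signal to its default action.
 * Returns the number of signals that could not be reset. */
static int restore_default_crash_handlers(void)
{
    int failures = 0;
    size_t n = sizeof(crash_signals) / sizeof(crash_signals[0]);

    for (size_t i = 0; i < n; i++) {
        if (signal(crash_signals[i], SIG_DFL) == SIG_ERR)
            failures++;
    }
    return failures;
}
```

The helper would be called once, immediately after MPI_Init(&argc, &argv)
returns, so that any handlers installed during initialization are replaced.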

I also find it puzzling that a nearly identical OMPI distro running on a
different machine shows different behaviour.

Best regards
Durga

The surgeon general advises you to eat right, exercise regularly and quit
ageing.

On Tue, May 10, 2016 at 10:02 AM, Hjelm, Nathan Thomas <hje...@lanl.gov>
wrote:

> btr files are indeed created by Open MPI's backtrace mechanism. I think we
> should revisit it at some point, but for now the only effective way I have
> found to prevent it is to restore the default signal handlers after
> MPI_Init.
>
> Excuse the quoting style. Good sucks.
>
>
> ________________________________________
> From: users on behalf of dpchoudh .
> Sent: Monday, May 09, 2016 2:59:37 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] No core dump in some cases
>
> Hi Gus
>
> Thanks for your suggestion. But I am not using any resource manager (i.e.
> I am launching mpirun from the bash shell.). In fact, both of the two
> clusters I talked about run CentOS 7 and I launch the job the same way on
> both of these, yet one of them creates standard core files and the other
> creates the 'btr' files. The strange thing is, I could not find anything on
> .btr (= Backtrace?) files on Google, which is why I asked on this forum.
>
> Best regards
> Durga
>
> The surgeon general advises you to eat right, exercise regularly and quit
> ageing.
>
> On Mon, May 9, 2016 at 12:04 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:
> Hi Durga
>
> Just in case ...
> If you're using a resource manager to start the jobs (Torque, etc),
> you need to have them set the limits (for coredump size, stacksize, locked
> memory size, etc).
> This way the jobs will inherit the limits from the
> resource manager daemon.
> On Torque (which I use) I do this on the pbs_mom daemon
> init script (I am still before the systemd era, that lovely POS).
> And set the hard/soft limits on /etc/security/limits.conf as well.
>
> I hope this helps,
> Gus Correa
>
> On 05/07/2016 12:27 PM, Jeff Squyres (jsquyres) wrote:
> I'm afraid I don't know what a .btr file is -- that is not something that
> is controlled by Open MPI.
>
> You might want to look into your OS settings to see if it has some kind of
> alternate corefile mechanism...?
>
>
> On May 6, 2016, at 8:58 PM, dpchoudh . <dpcho...@gmail.com> wrote:
>
> Hello all
>
> I run MPI jobs (for test purpose only) on two different 'clusters'. Both
> 'clusters' have two nodes only, connected back-to-back. The two are very
> similar, but not identical, both software and hardware wise.
>
> Both have ulimit -c set to unlimited. However, only one of the two creates
> core files when an MPI job crashes. The other creates a text file named
> something like
>
> <program_name_that_crashed>.80s-<a-number-that-looks-like-a-PID>,<hostname-where-the-crash-happened>.btr
>
> I'd much prefer a core file because that allows me to debug with a lot
> more options than a static text file with addresses. How do I get a core
> file in all situations? I am using MPI source from the master branch.
>
> Thanks in advance
> Durga
>
> The surgeon general advises you to eat right, exercise regularly and quit
> ageing.
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2016/05/29124.php
>
