On 05/09/2016 04:59 PM, dpchoudh . wrote:
Hi Gus

Thanks for your suggestion, but I am not using any resource manager
(i.e. I am launching mpirun from the bash shell). In fact, both of the
two clusters I talked about run CentOS 7 and I launch the job the same
way on both of these, yet one of them creates standard core files and
the other creates the '.btr' files. The strange thing is, I could not find
anything on the .btr (= Backtrace?) files on Google, which is why I
asked on this forum.
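
Since both clusters are launched the same way, one way to narrow down the
difference is to print, on each node, the two OS-level settings that usually
decide what happens to a crashing process: the core size limit and
/proc/sys/kernel/core_pattern. The following is only a minimal sketch,
assuming Linux and Python 3; the function names are made up for illustration,
and nothing in it is specific to Open MPI or known to explain the .btr files.

```python
import resource
from pathlib import Path

CORE_PATTERN = Path("/proc/sys/kernel/core_pattern")  # Linux-specific location


def describe_limit(value):
    """Render an rlimit value; RLIM_INFINITY is what 'ulimit -c unlimited' sets."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def describe_core_pattern(pattern):
    """Say whether the kernel writes a core file or pipes the core to a helper."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        # A leading '|' means the kernel hands the core to this program
        # instead of writing a file itself.
        words = pattern[1:].split()
        return "piped to helper: " + (words[0] if words else "(none given)")
    return "written to file: " + pattern


def core_dump_settings():
    """Collect the core-dump settings seen by the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    try:
        pattern = describe_core_pattern(CORE_PATTERN.read_text())
    except OSError:
        pattern = "unknown (core_pattern not readable)"
    return {
        "core soft limit": describe_limit(soft),
        "core hard limit": describe_limit(hard),
        "core_pattern": pattern,
    }


if __name__ == "__main__":
    for name, value in core_dump_settings().items():
        print(f"{name}: {value}")
```

Running the script under mpirun on each host, rather than from the login
shell, would show the limits the MPI processes actually inherit, which may
differ from what "ulimit -c" reports interactively.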

Best regards
Durga

The surgeon general advises you to eat right, exercise regularly and
quit ageing.

Hi Durga

My search turned up something, but, weirdly enough, related to databases.
Maybe the same file extension is used for two different things?
Does "file *.btr" tell you anything?
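
If the file utility is not at hand, roughly the same check can be sketched
in Python: look at the first bytes to tell an ELF core dump from plain text.
This is a rough approximation of what "file" reports, not a replacement for
it, and classify_dump is a hypothetical helper name.

```python
import struct


def classify_dump(path):
    """Roughly classify a crash artifact as ELF core, other ELF, text, or binary."""
    with open(path, "rb") as fh:
        head = fh.read(4096)

    if head[:4] == b"\x7fELF" and len(head) >= 18:
        # e_ident[5] (EI_DATA) gives the byte order: 1 = little, 2 = big endian.
        order = "<" if head[5] == 1 else ">"
        # e_type is the 16-bit field right after the 16-byte e_ident.
        (e_type,) = struct.unpack_from(order + "H", head, 16)
        # ET_CORE == 4 marks a core dump.
        return "ELF core file" if e_type == 4 else "ELF file (not a core)"

    if not head:
        return "empty"
    # Common heuristic: text files do not contain NUL bytes.
    return "binary data" if b"\0" in head else "text"
```

On a real core file this should report "ELF core file"; on a backtrace
written as plain text it should report "text".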

Databases:

http://cs.pervasive.com/forums/p/14533/50237.aspx

... more databases ...

http://www.openthefile.net/extension/btr

... binary tree indexes ...

http://www.velocityreviews.com/threads/index-btr-file-in-windows-xp-help-please.307459/

... and a catalog of butterflies!  :)

http://filext.com/file-extension/BTR
http://review-tech.appspot.com/btr-file.html

Oh well ...

... and finally a previous incarnation of an OpenMPI 1.6.5 question similar to yours (where .btr stands for backtrace):

http://stackoverflow.com/questions/25275450/cause-all-processes-running-under-openmpi-to-dump-core

Could this be due to an (unlikely) mix of OpenMPI 1.10 with 1.6.5?

Gus Correa


On Mon, May 9, 2016 at 12:04 PM, Gus Correa <g...@ldeo.columbia.edu
<mailto:g...@ldeo.columbia.edu>> wrote:

    Hi Durga

    Just in case ...
    If you're using a resource manager to start the jobs (Torque, etc),
    you need to have them set the limits (for coredump size, stacksize,
    locked memory size, etc).
    This way the jobs will inherit the limits from the
    resource manager daemon.
    On Torque (which I use) I do this on the pbs_mom daemon
    init script (I am still before the systemd era, that lovely POS).
    And set the hard/soft limits on /etc/security/limits.conf as well.

    I hope this helps,
    Gus Correa

    On 05/07/2016 12:27 PM, Jeff Squyres (jsquyres) wrote:

        I'm afraid I don't know what a .btr file is -- that is not
        something that is controlled by Open MPI.

        You might want to look into your OS settings to see if it has
        some kind of alternate corefile mechanism...?


            On May 6, 2016, at 8:58 PM, dpchoudh . <dpcho...@gmail.com
            <mailto:dpcho...@gmail.com>> wrote:

            Hello all

            I run MPI jobs (for test purpose only) on two different
            'clusters'. Both 'clusters' have two nodes only, connected
            back-to-back. The two are very similar, but not identical,
            both software and hardware wise.

            Both have ulimit -c set to unlimited. However, only one of
            the two creates core files when an MPI job crashes. The
            other creates a text file named something like
            
            <program_name_that_crashed>.80s-<a-number-that-looks-like-a-PID>,<hostname-where-the-crash-happened>.btr

            I'd much prefer a core file because that allows me to debug
            with a lot more options than a static text file with
            addresses. How do I get a core file in all situations? I am
            using MPI source from the master branch.

            Thanks in advance
            Durga

            The surgeon general advises you to eat right, exercise
            regularly and quit ageing.
            _______________________________________________
            users mailing list
            us...@open-mpi.org <mailto:us...@open-mpi.org>
            Subscription:
            https://www.open-mpi.org/mailman/listinfo.cgi/users
            Link to this post:
            http://www.open-mpi.org/community/lists/users/2016/05/29124.php




    _______________________________________________
    users mailing list
    us...@open-mpi.org <mailto:us...@open-mpi.org>
    Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
    Link to this post:
    http://www.open-mpi.org/community/lists/users/2016/05/29141.php




_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2016/05/29145.php

