Nate,

I could get rid of the problem by not using the psm mtl.
The infinipath library (used by the psm mtl) sets some signal handlers that conflict with the JVM.
That can be seen by running
mpirun -np 1 java -Xcheck:jni MPITestBroke data/

So instead of running
mpirun -np 1 java MPITestBroke data/
please run
mpirun --mca mtl ^psm -np 1 java MPITestBroke data/

That solved the issue for me.
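
If the launch is scripted from Java, the two command lines above could be assembled by a small helper like the one below. This is only a sketch: `MpirunCommand` and `build` are made-up names, not part of Open MPI, and the only flags used are the ones already quoted in this message (`--mca mtl ^psm` and `-Xcheck:jni`).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Open MPI): assembles the mpirun command
// line from this message so the PSM exclusion can be toggled in one place.
final class MpirunCommand {
    /**
     * Builds the argument list for launching a Java class under mpirun.
     * excludePsm adds "--mca mtl ^psm"; checkJni adds the JVM flag "-Xcheck:jni".
     */
    static List<String> build(int np, String mainClass, String dataDir,
                              boolean excludePsm, boolean checkJni) {
        List<String> cmd = new ArrayList<>();
        cmd.add("mpirun");
        if (excludePsm) {
            cmd.add("--mca");
            cmd.add("mtl");
            cmd.add("^psm");
        }
        cmd.add("-np");
        cmd.add(Integer.toString(np));
        cmd.add("java");
        if (checkJni) {
            cmd.add("-Xcheck:jni");
        }
        cmd.add(mainClass);
        cmd.add(dataDir);
        return cmd;
    }

    public static void main(String[] args) {
        // Print the workaround command for the test program from this thread.
        System.out.println(String.join(" ", build(1, "MPITestBroke", "data/", true, false)));
    }
}
```

The list can be handed to `new ProcessBuilder(cmd)`, which starts the program without going through a shell, so the `^` in `^psm` needs no quoting.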

Cheers,

Gilles

On 8/13/2015 9:19 AM, Nate Chambers wrote:
*I appreciate you trying to help! I put the Java and its compiled .class file on Dropbox. The directory contains the .java and .class files, as well as a data/ directory:*

http://www.dropbox.com/sh/pds5c5wecfpb2wk/AAAcz17UTDQErmrUqp2SPjpqa?dl=0

*You can run it with and without MPI:*

>  java MPITestBroke data/
>  mpirun -np 1 java MPITestBroke data/

*Attached is a text file of what I see when I run it with mpirun and your debug flag. Lots of debug lines.*

Nate





On Wed, Aug 12, 2015 at 11:09 AM, Howard Pritchard <hpprit...@gmail.com <mailto:hpprit...@gmail.com>> wrote:

    Hi Nate,

    Sorry for the delay in getting back to you.

    We're somewhat stuck on how to help you, but here are two suggestions.

    Could you add the following to your launch command line

    --mca odls_base_verbose 100

    so we can see exactly what arguments are being fed to java when
    launching your app.

    Also, if you could put your MPITestBroke.class file somewhere (like
    Google Drive) where we could get it and try to run locally or at
    NERSC, that might help us narrow down the problem. Better yet, if
    you have the class or jar file for the entire app plus some data
    sets, we could try that out as well.

    All the config outputs, etc. you've sent so far indicate a correct
    installation of Open MPI.

    Howard


    On Aug 6, 2015 1:54 PM, "Nate Chambers" <ncham...@usna.edu
    <mailto:ncham...@usna.edu>> wrote:

        Howard,

        I tried the nightly build openmpi-dev-2223-g731cfe3 and it
        still segfaults as before. I must admit I am new to MPI, so is
        it possible I'm just configuring or running incorrectly? Let
        me list my steps for you, and maybe something will jump out?
        Also attached is my config.log.


        CONFIGURE
        ./configure --prefix=<install-dir> --enable-mpi-java CC=gcc

        MAKE
        make all install

        RUN
        <install-dir>/mpirun -np 1 java MPITestBroke twitter/


        DEFAULT JAVA AND GCC

        $ java -version
        java version "1.7.0_21"
        Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
        Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)

        $ gcc --v
        Using built-in specs.
        Target: x86_64-redhat-linux
        Configured with: ../configure --prefix=/usr
        --mandir=/usr/share/man --infodir=/usr/share/info
        --with-bugurl=http://bugzilla.redhat.com/bugzilla
        --enable-bootstrap --enable-shared --enable-threads=posix
        --enable-checking=release --with-system-zlib
        --enable-__cxa_atexit --disable-libunwind-exceptions
        --enable-gnu-unique-object
        --enable-languages=c,c++,objc,obj-c++,java,fortran,ada
        --enable-java-awt=gtk --disable-dssi
        --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
        --enable-libgcj-multifile --enable-java-maintainer-mode
        --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
        --disable-libjava-multilib --with-ppl --with-cloog
        --with-tune=generic --with-arch_32=i686
        --build=x86_64-redhat-linux
        Thread model: posix
        gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)





        On Thu, Aug 6, 2015 at 7:58 AM, Howard Pritchard
        <hpprit...@gmail.com <mailto:hpprit...@gmail.com>> wrote:

            Hi Nate,

            We're trying this out on a Mac running Mavericks and a
            Cray XC system. The Mac has Java 8
            while the Cray XC has Java 7.

            We could not get the code to run just using the java
            launch command, although we noticed if you add

            catch(NoClassDefFoundError e) {
                System.out.println("Not using MPI, it's out to lunch for now");
            }

            as one of the catches after the try for firing up MPI, you
            can get further.
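
A self-contained way to picture that extra catch is sketched below. The names `OptionalMpi` and `tryInit` are hypothetical, and the `Runnable` merely stands in for the program's real MPI start-up call, so nothing here depends on the Open MPI jar being present.

```java
// Hypothetical wrapper: the Runnable stands in for the real MPI start-up call
// (for example a lambda that calls MPI.init(args) and handles whatever checked
// exception the binding declares).
final class OptionalMpi {
    /** Runs the MPI start-up; returns false if the MPI classes cannot be loaded. */
    static boolean tryInit(Runnable mpiInit) {
        try {
            mpiInit.run();
            return true;
        } catch (NoClassDefFoundError e) {
            // Typically raised when the MPI classes are not on the classpath,
            // which is presumably the case under a plain "java" launch.
            System.out.println("Not using MPI: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulate the missing-class case without needing any MPI installation.
        boolean usingMpi = tryInit(() -> { throw new NoClassDefFoundError("mpi/MPI"); });
        System.out.println("usingMpi = " + usingMpi);
    }
}
```

Returning a flag lets the rest of the program skip its MPI calls when the start-up did not happen.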

            Instead we tried on the two systems using

            mpirun -np 1 java MPITestBroke tweets repeat.txt

            and, you guessed it, we can't reproduce the error, at
            least using master.

            Would you mind getting a copy of the nightly master
            build from

            http://www.open-mpi.org/nightly/master/

            installing that version, and giving it a try?

            If that works, then I'd suggest using master (or v2.0) for
            now.

            Howard




            2015-08-05 14:41 GMT-06:00 Nate Chambers
            <ncham...@usna.edu <mailto:ncham...@usna.edu>>:

                Howard,

                Thanks for looking at all this. Adding System.gc() did
                not cause it to segfault. The segfault still comes
                much later in the processing.

                I was able to reduce my code to a single test file
                without other dependencies. It is attached. This code
                simply opens a text file and reads its lines, one by
                one. Once finished, it closes and opens the same file
                and reads the lines again. On my system, it does this
                about 4 times until the segfault fires. Obviously this
                code makes no sense, but it's based on our actual code
                that reads millions of lines of data and does various
                processing to it.
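
For readers without the attachment, the read loop described above might look roughly like this. It is an illustrative stand-in, not the attached MPITestBroke source: the class and method names are invented, it takes a single file rather than a directory, and the MPI start-up call is left out so it runs on any JVM.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;

// Illustrative stand-in for the attached test file (names are hypothetical).
final class RereadLoop {
    /** Opens the file, reads it line by line, closes it; repeats `passes` times. Returns total lines read. */
    static long readRepeatedly(Path file, int passes) {
        long total = 0;
        for (int pass = 0; pass < passes; pass++) {
            try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                while (in.readLine() != null) {
                    total++;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return total;
    }

    /** Writes a temp file holding the same line `count` times, like the input described above. */
    static Path sameLineFile(String line, int count) {
        try {
            Path file = Files.createTempFile("tweets", ".txt");
            Files.write(file, Collections.nCopies(count, line), StandardCharsets.UTF_8);
            file.toFile().deleteOnExit();
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // In the real test, the MPI start-up call would come before this loop.
        Path input = sameLineFile("the same line over and over", 1000);
        System.out.println(readRepeatedly(input, 10) + " lines read");
    }
}
```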

                Attached is a tweets.tgz file that you can uncompress
                to have an input directory. The text file is just the
                same line over and over again. Run it as:

                *java MPITestBroke tweets/*


                Nate





                On Wed, Aug 5, 2015 at 8:29 AM, Howard Pritchard
                <hpprit...@gmail.com <mailto:hpprit...@gmail.com>> wrote:

                    Hi Nate,

                    Sorry for the delay in getting back. Thanks for
                    the sanity check. You may have a point about the
                    args string to MPI.init: there's nothing Open MPI
                    needs from it, but that is a difference with your
                    use case, since your app has an argument.

                    Would you mind adding a

                    System.gc()

                    call immediately after MPI.init call and see if
                    the gc blows up with a segfault?

                    Also, it may be interesting to add -verbose:jni
                    to your command line.

                    We'll do some experiments here with the init
                    string arg.

                    Is your app open source where we could download it
                    and try to reproduce the problem locally?

                    thanks,

                    Howard


                    2015-08-04 18:52 GMT-06:00 Nate Chambers
                    <ncham...@usna.edu <mailto:ncham...@usna.edu>>:

                        Sanity checks pass. Both Hello.java and Ring.java
                        run correctly and produce the expected output.

                        Does MPI.init(args) expect anything from those
                        command-line args?


                        Nate


                        On Tue, Aug 4, 2015 at 12:26 PM, Howard
                        Pritchard <hpprit...@gmail.com
                        <mailto:hpprit...@gmail.com>> wrote:

                            Hello Nate,

                            As a sanity check of your installation,
                            could you try to compile the
                            examples/*.java codes using the mpijavac
                            you've installed and see that those run
                            correctly?
                            I'm mainly interested in Hello.java
                            and Ring.java.

                            Howard







                            2015-08-04 14:34 GMT-06:00 Nate Chambers
                            <ncham...@usna.edu
                            <mailto:ncham...@usna.edu>>:

                                Sure, I reran the configure with
                                CC=gcc and then make install. I think
                                that's the proper way to do it.
                                Attached is my config log. The
                                behavior when running our code appears
                                to be the same. The output is the same
                                error I pasted in my email above. It
                                occurs when calling MPI.init().

                                I'm not great at debugging this sort
                                of stuff, but happy to try things out
                                if you need me to.

                                Nate


                                On Tue, Aug 4, 2015 at 5:09 AM, Howard
                                Pritchard <hpprit...@gmail.com
                                <mailto:hpprit...@gmail.com>> wrote:

                                    Hello Nate,

                                    As a first step to addressing
                                    this, could you please try using
                                    gcc rather than the Intel
                                    compilers to build Open MPI?

                                    We've been doing a lot of work
                                    recently on the java bindings,
                                    etc. but have never tried using
                                    any compilers other
                                    than gcc when working with the
                                    java bindings.

                                    Thanks,

                                    Howard


                                    2015-08-03 17:36 GMT-06:00 Nate
                                    Chambers <ncham...@usna.edu
                                    <mailto:ncham...@usna.edu>>:

                                        We've been struggling with
                                        this error for a while, so
                                        hoping someone more
                                        knowledgeable can help!

                                        Our java MPI code exits with a
                                        segfault during its normal
                                        operation, *but the segfault
                                        occurs before our code ever
                                        uses MPI functionality like
                                        sending/receiving.* We've
                                        removed all message calls and
                                        any use of MPI.COMM_WORLD from
                                        the code. The segfault occurs
                                        if we call MPI.init(args) in
                                        our code, and does not if we
                                        comment that line out. Further
                                        vexing us, the crash doesn't
                                        happen at the point of the
                                        MPI.init call, but later on in
                                        the program. I don't have an
                                        easy-to-run example here
                                        because our non-MPI code is so
                                        large and complicated. We have
                                        run simpler test programs with
                                        MPI and the segfault does not
                                        occur.

                                        We have isolated the line
                                        where the segfault occurs.
                                        However, if we comment that
                                        out, the program will run
                                        longer, but then segfault at a
                                        seemingly arbitrary (but
                                        repeatable) point later on in
                                        the code. Does
                                        anyone have tips on how to
                                        debug this? We have tried
                                        several flags with mpirun, but
                                        no good clues.

                                        We have also tried several MPI
                                        versions, including stable
                                        1.8.7 and the most recent 1.8.8rc1


                                        ATTACHED
                                        - config.log from installation
                                        - output from `ompi_info -all`


                                        OUTPUT FROM RUNNING

                                        > mpirun -np 2 java -mx4g FeaturizeDay datadir/ days.txt
                                        ...
                                        some normal output from our code
                                        ...
                                        --------------------------------------------------------------------------
                                        mpirun noticed that process rank 0 with PID 29646 on node r9n69 exited on signal 11 (Segmentation fault).
                                        --------------------------------------------------------------------------




                                        
                                        _______________________________________________
                                        users mailing list
                                        us...@open-mpi.org <mailto:us...@open-mpi.org>
                                        Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
                                        Link to this post: http://www.open-mpi.org/community/lists/users/2015/08/27386.php



                                    
