You made the best of it... thanks a lot!

Without MPI it runs.
Just adding MPI.Init() causes the crash!
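
To be concrete, even a stripped-down program like this crashes here (a sketch; the class name is made up):

import mpi.*;

public class InitOnly {
    public static void main(String[] args) throws MPIException {
        // With this single call the program crashes; without MPI it runs fine.
        MPI.Init(args);
        MPI.Finalize();
    }
}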

Maybe I installed something wrong...

install the newest automake, autoconf, m4, and libtool, in the right order and with the same prefix
check out ompi
autogen
configure with the same prefix, pointing to the same JDK I use later
make
make install

I will test some different configurations of ./configure...


On 07/08/2016 01:40 PM, Gilles Gouaillardet wrote:
I am running out of ideas ...

what if you do not run within slurm?
what if you do not use '-cp executor.jar'?
or what if you configure without --disable-dlopen --disable-mca-dso?

if you
mpirun -np 1 ...
then MPI_Bcast and MPI_Barrier are basically no-ops, so it is really weird that your program is still crashing. Another test is to comment out MPI_Bcast and MPI_Barrier and try again with -np 1.
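
For the -np 1 test, the rank-0 loop of your posted TestSendBigFiles would reduce to something like this (just a sketch):

// rank 0 loop with the collectives commented out:
for (int i = 0; i < 1000; i++) {
    saveMem[r.nextInt(saveMem.length)]++;
    log("i = " + i);
    log("object hash = " + hashcode(saveMem));
    // MPI.COMM_WORLD.bcast(lengthData, 1, MPI.INT, 0);
    // MPI.COMM_WORLD.barrier();
    // MPI.COMM_WORLD.bcast(saveMem, lengthData[0], MPI.BYTE, 0);
    // MPI.COMM_WORLD.barrier();
}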

Cheers,

Gilles

On Friday, July 8, 2016, Gundram Leifert <gundram.leif...@uni-rostock.de> wrote:

    In all cases, the same error.
    This is my code:

    salloc -n 3
    export IPATH_NO_BACKTRACE
    ulimit -s 10240
    mpirun -np 3 java -cp executor.jar de.uros.citlab.executor.test.TestSendBigFiles2


    The process also crashes with 1 or 2 cores.


    On 07/08/2016 12:32 PM, Gilles Gouaillardet wrote:
    you can try
    export IPATH_NO_BACKTRACE
    before invoking mpirun (that should not be needed though)

    Another test is to
    ulimit -s 10240
    before invoking mpirun.

    BTW, do you use mpirun or srun?

    Can you reproduce the crash with 1 or 2 tasks?

    Cheers,

    Gilles

    On Friday, July 8, 2016, Gundram Leifert
    <gundram.leif...@uni-rostock.de> wrote:

        Hello,

        configure:
        ./configure --enable-mpi-java --with-jdk-dir=/home/gl069/bin/jdk1.7.0_25 --disable-dlopen --disable-mca-dso


        1 node with 3 cores. I use SLURM to allocate one node. I
        changed --mem, but it has no effect.
        salloc -n 3


        core file size          (blocks, -c) 0
        data seg size           (kbytes, -d) unlimited
        scheduling priority             (-e) 0
        file size               (blocks, -f) unlimited
        pending signals                 (-i) 256564
        max locked memory       (kbytes, -l) unlimited
        max memory size         (kbytes, -m) unlimited
        open files                      (-n) 100000
        pipe size            (512 bytes, -p) 8
        POSIX message queues     (bytes, -q) 819200
        real-time priority              (-r) 0
        stack size              (kbytes, -s) unlimited
        cpu time               (seconds, -t) unlimited
        max user processes              (-u) 4096
        virtual memory          (kbytes, -v) unlimited
        file locks                      (-x) unlimited

        uname -a
        Linux titan01.service 3.10.0-327.13.1.el7.x86_64 #1 SMP Thu Mar 31 16:04:38 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

        cat /etc/system-release
        CentOS Linux release 7.2.1511 (Core)

        What else do you need?

        Cheers, Gundram

        On 07/07/2016 10:05 AM, Gilles Gouaillardet wrote:

        Gundram,


        Can you please provide more information on your environment:

        - configure command line

        - OS

        - memory available

        - ulimit -a

        - number of nodes

        - number of tasks used

        - interconnect used (if any)

        - batch manager (if any)


        Cheers,


        Gilles

        On 7/7/2016 4:17 PM, Gundram Leifert wrote:
        Hello Gilles,

        I tried your code and it crashes after 3-15 iterations (see
        (1)). It is always the same error (only the "94" varies).

        Meanwhile I think Java and MPI use the same memory, because
        when I delete the hash call, the program sometimes runs more
        than 9k iterations.
        When it crashes, it is at different lines (see (2) and (3)).
        The crashes also occur on rank 0.

        ##### (1) #####
        # Problematic frame:
        # J 94 C2 de.uros.citlab.executor.test.TestSendBigFiles2.hashcode([BI)I (42 bytes) @ 0x00002b03242dc9c4 [0x00002b03242dc860+0x164]

        ##### (2) #####
        # Problematic frame:
        # V  [libjvm.so+0x68d0f6]  JavaCallWrapper::JavaCallWrapper(methodHandle, Handle, JavaValue*, Thread*)+0xb6

        ##### (3) #####
        # Problematic frame:
        # V  [libjvm.so+0x4183bf]  ThreadInVMfromNative::ThreadInVMfromNative(JavaThread*)+0x4f

        Any more idea?

        On 07/07/2016 03:00 AM, Gilles Gouaillardet wrote:

        Gundram,


        FWIW, I cannot reproduce the issue on my box:

        - centos 7

        - java version "1.8.0_71"
          Java(TM) SE Runtime Environment (build 1.8.0_71-b15)
          Java HotSpot(TM) 64-Bit Server VM (build 25.71-b15,
        mixed mode)


        I noticed that on non-zero ranks, saveMem is allocated at
        each iteration.
        Ideally, the garbage collector takes care of that, so it
        should not be an issue.

        Would you mind giving the attached file a try?
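
        The idea on the non-zero ranks is something like this (just a
        sketch of the change, not necessarily what the attached file
        does):

        // Reuse the receive buffer across iterations; grow it only when
        // the broadcast length exceeds the current capacity.
        byte[] saveMem = new byte[0];
        while (true) {
            int[] lengthData = new int[1];
            MPI.COMM_WORLD.bcast(lengthData, 1, MPI.INT, 0);
            if (lengthData[0] == 0) {
                break;
            }
            MPI.COMM_WORLD.barrier();
            if (saveMem.length < lengthData[0]) {
                saveMem = new byte[lengthData[0]]; // allocate only when needed
            }
            MPI.COMM_WORLD.bcast(saveMem, lengthData[0], MPI.BYTE, 0);
            MPI.COMM_WORLD.barrier();
        }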

        Cheers,

        Gilles

        On 7/7/2016 7:41 AM, Gilles Gouaillardet wrote:
        I will have a look at it today.

        How did you configure Open MPI?

        Cheers,

        Gilles

        On Thursday, July 7, 2016, Gundram Leifert
        <gundram.leif...@uni-rostock.de
        <javascript:_e(%7B%7D,'cvml','gundram.leif...@uni-rostock.de');>>
        wrote:

            Hello Gilles,

            thank you for your hints! I made 3 changes; unfortunately
            the same error occurs:

            update ompi:
            commit ae8444682f0a7aa158caea08800542ce9874455e
            Author: Ralph Castain <r...@open-mpi.org>
            Date:   Tue Jul 5 20:07:16 2016 -0700

            update java:
            java version "1.8.0_92"
            Java(TM) SE Runtime Environment (build 1.8.0_92-b14)
            Java HotSpot(TM) Server VM (build 25.92-b14, mixed mode)

            delete the hashcode lines.

            Now I get this error message 100% of the time, after a
            varying number of iterations (15-300):

             0/ 3:length = 100000000
             0/ 3:bcast length done (length = 100000000)
             1/ 3:bcast length done (length = 100000000)
             2/ 3:bcast length done (length = 100000000)
            #
            # A fatal error has been detected by the Java Runtime Environment:
            #
            #  SIGSEGV (0xb) at pc=0x00002b3d022fcd24, pid=16578, tid=0x00002b3d29716700
            #
            # JRE version: Java(TM) SE Runtime Environment (8.0_92-b14) (build 1.8.0_92-b14)
            # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.92-b14 mixed mode linux-amd64 compressed oops)
            # Problematic frame:
            # V  [libjvm.so+0x414d24]  ciEnv::get_field_by_index(ciInstanceKlass*, int)+0x94
            #
            # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
            #
            # An error report file with more information is saved as:
            # /home/gl069/ompi/bin/executor/hs_err_pid16578.log
            #
            # Compiler replay data is saved as:
            # /home/gl069/ompi/bin/executor/replay_pid16578.log
            #
            # If you would like to submit a bug report, please visit:
            # http://bugreport.java.com/bugreport/crash.jsp
            #
            [titan01:16578] *** Process received signal ***
            [titan01:16578] Signal: Aborted (6)
            [titan01:16578] Signal code:  (-6)
            [titan01:16578] [ 0] /usr/lib64/libpthread.so.0(+0xf100)[0x2b3d01500100]
            [titan01:16578] [ 1] /usr/lib64/libc.so.6(gsignal+0x37)[0x2b3d01b5c5f7]
            [titan01:16578] [ 2] /usr/lib64/libc.so.6(abort+0x148)[0x2b3d01b5dce8]
            [titan01:16578] [ 3] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x91e605)[0x2b3d02806605]
            [titan01:16578] [ 4] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0xabda63)[0x2b3d029a5a63]
            [titan01:16578] [ 5] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(JVM_handle_linux_signal+0x14f)[0x2b3d0280be2f]
            [titan01:16578] [ 6] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x91a5c3)[0x2b3d028025c3]
            [titan01:16578] [ 7] /usr/lib64/libc.so.6(+0x35670)[0x2b3d01b5c670]
            [titan01:16578] [ 8] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x414d24)[0x2b3d022fcd24]
            [titan01:16578] [ 9] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x43c5ae)[0x2b3d023245ae]
            [titan01:16578] [10] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x369ade)[0x2b3d02251ade]
            [titan01:16578] [11] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x36eda0)[0x2b3d02256da0]
            [titan01:16578] [12] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x37091b)[0x2b3d0225891b]
            [titan01:16578] [13] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x3712b6)[0x2b3d022592b6]
            [titan01:16578] [14] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x36d2cf)[0x2b3d022552cf]
            [titan01:16578] [15] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x36e412)[0x2b3d02256412]
            [titan01:16578] [16] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x36ed8d)[0x2b3d02256d8d]
            [titan01:16578] [17] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x37091b)[0x2b3d0225891b]
            [titan01:16578] [18] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x3712b6)[0x2b3d022592b6]
            [titan01:16578] [19] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x36d2cf)[0x2b3d022552cf]
            [titan01:16578] [20] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x36e412)[0x2b3d02256412]
            [titan01:16578] [21] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x36ed8d)[0x2b3d02256d8d]
            [titan01:16578] [22] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x3708c2)[0x2b3d022588c2]
            [titan01:16578] [23] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x3724e7)[0x2b3d0225a4e7]
            [titan01:16578] [24] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x37a817)[0x2b3d02262817]
            [titan01:16578] [25] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x37a92f)[0x2b3d0226292f]
            [titan01:16578] [26] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x358edb)[0x2b3d02240edb]
            [titan01:16578] [27] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x35929e)[0x2b3d0224129e]
            [titan01:16578] [28] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x3593ce)[0x2b3d022413ce]
            [titan01:16578] [29] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x35973e)[0x2b3d0224173e]
            [titan01:16578] *** End of error message ***
            -------------------------------------------------------
            Primary job terminated normally, but 1 process returned
            a non-zero exit code. Per user-direction, the job has been aborted.
            -------------------------------------------------------
            --------------------------------------------------------------------------
            mpirun noticed that process rank 2 with PID 0 on node titan01 exited on signal 6 (Aborted).
            --------------------------------------------------------------------------

            I don't know whether it is a problem of Java or Open MPI,
            but in recent years Java has worked without problems on
            my machine...

            Thank you for your tips in advance!
            Gundram

            On 07/06/2016 03:10 PM, Gilles Gouaillardet wrote:
            Note that a race condition in MPI_Init was fixed
            yesterday in master.
            Can you please update your Open MPI and try again?

            Hopefully the hang will disappear.

            Can you reproduce the crash with a simpler (and ideally
            deterministic) version of your program? The crash occurs
            in hashcode, which makes little sense to me. Can you also
            update your JDK?
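
            For a deterministic run, you could, for example, seed the
            Random (a sketch; the seed value is arbitrary):

            // A fixed seed makes every run mutate the same byte positions
            // in the same order, so a crash should reproduce at the same
            // iteration.
            Random r = new Random(42L);
            r.nextBytes(saveMem);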

            Cheers,

            Gilles

            On Wednesday, July 6, 2016, Gundram Leifert
            <gundram.leif...@uni-rostock.de> wrote:

                Hello Jason,

                thanks for your response! I think it is another
                problem. I try to send 100 MB of bytes, so there are
                not many iterations (between 10 and 30). I realized
                that executing this code can result in 3 different
                errors:

                1. Most often, the posted error message occurs.

                2. In <10% of the cases I have a livelock. I can see
                3 Java processes, one with 200% and two with 100% CPU
                utilization. After ~15 minutes without new output,
                this error occurs:


                [thread 47499823949568 also had an error]
                # A fatal error has been detected by the Java Runtime Environment:
                #
                #  Internal Error (safepoint.cpp:317), pid=24256, tid=47500347131648
                #  guarantee(PageArmed == 0) failed: invariant
                #
                # JRE version: 7.0_25-b15
                # Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode linux-amd64 compressed oops)
                # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
                #
                # An error report file with more information is saved as:
                # /home/gl069/ompi/bin/executor/hs_err_pid24256.log
                #
                # If you would like to submit a bug report, please visit:
                # http://bugreport.sun.com/bugreport/crash.jsp
                #
                [titan01:24256] *** Process received signal ***
                [titan01:24256] Signal: Aborted (6)
                [titan01:24256] Signal code:  (-6)
                [titan01:24256] [ 0] /usr/lib64/libpthread.so.0(+0xf100)[0x2b336a324100]
                [titan01:24256] [ 1] /usr/lib64/libc.so.6(gsignal+0x37)[0x2b336a9815f7]
                [titan01:24256] [ 2] /usr/lib64/libc.so.6(abort+0x148)[0x2b336a982ce8]
                [titan01:24256] [ 3] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x742ac5)[0x2b336b44fac5]
                [titan01:24256] [ 4] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x8a2137)[0x2b336b5af137]
                [titan01:24256] [ 5] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x407262)[0x2b336b114262]
                [titan01:24256] [ 6] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x7c6c34)[0x2b336b4d3c34]
                [titan01:24256] [ 7] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x8a9c17)[0x2b336b5b6c17]
                [titan01:24256] [ 8] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x8aa2c0)[0x2b336b5b72c0]
                [titan01:24256] [ 9] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x744270)[0x2b336b451270]
                [titan01:24256] [10] /usr/lib64/libpthread.so.0(+0x7dc5)[0x2b336a31cdc5]
                [titan01:24256] [11] /usr/lib64/libc.so.6(clone+0x6d)[0x2b336aa4228d]
                [titan01:24256] *** End of error message ***
                -------------------------------------------------------
                Primary job terminated normally, but 1 process returned
                a non-zero exit code. Per user-direction, the job has been aborted.
                -------------------------------------------------------
                --------------------------------------------------------------------------
                mpirun noticed that process rank 0 with PID 0 on node titan01 exited on signal 6 (Aborted).
                --------------------------------------------------------------------------


                3. In <10% of the cases I have a deadlock during
                MPI.Init. It stays there for more than 15 minutes
                without returning an error message...

                Can I enable some debug flags to see what happens on
                the C / Open MPI side?

                Thanks in advance for your help!
                Gundram Leifert


                On 07/05/2016 06:05 PM, Jason Maldonis wrote:
                After reading your thread, it looks like it may be
                related to an issue I had a few weeks ago (I'm a
                novice though). Maybe my thread will be of help:
                https://www.open-mpi.org/community/lists/users/2016/06/29425.php


                When you say "After a specific number of
                repetitions the process either hangs up or
                returns with a SIGSEGV."  does you mean that a
                single call hangs, or that at some point during
                the for loop a call hangs? If you mean the
                latter, then it might relate to my issue.
                Otherwise my thread probably won't be helpful.

                Jason Maldonis
                Research Assistant of Professor Paul Voyles
                Materials Science Grad Student
                University of Wisconsin, Madison
                1509 University Ave, Rm M142
                Madison, WI 53706
                maldo...@wisc.edu
                608-295-5532

                On Tue, Jul 5, 2016 at 9:58 AM, Gundram Leifert
                <gundram.leif...@uni-rostock.de> wrote:

                    Hello,

                    I try to send many byte arrays via broadcast.
                    After a specific number of repetitions, the
                    process either hangs up or returns with a
                    SIGSEGV. Can anyone help me solve the problem?

                    ########## The code:

                    import java.util.Random;
                    import mpi.*;

                    public class TestSendBigFiles {

                        public static void log(String msg) {
                            try {
                                System.err.println(String.format("%2d/%2d:%s", MPI.COMM_WORLD.getRank(), MPI.COMM_WORLD.getSize(), msg));
                            } catch (MPIException ex) {
                                System.err.println(String.format("%2s/%2s:%s", "?", "?", msg));
                            }
                        }

                        private static int hashcode(byte[] bytearray) {
                            if (bytearray == null) {
                                return 0;
                            }
                            int hash = 39;
                            for (int i = 0; i < bytearray.length; i++) {
                                byte b = bytearray[i];
                                hash = hash * 7 + (int) b;
                            }
                            return hash;
                        }

                        public static void main(String args[]) throws MPIException {
                            log("start main");
                            MPI.Init(args);
                            try {
                                log("initialized done");
                                byte[] saveMem = new byte[100000000];
                                MPI.COMM_WORLD.barrier();
                                Random r = new Random();
                                r.nextBytes(saveMem);
                                if (MPI.COMM_WORLD.getRank() == 0) {
                                    for (int i = 0; i < 1000; i++) {
                                        saveMem[r.nextInt(saveMem.length)]++;
                                        log("i = " + i);
                                        int[] lengthData = new int[]{saveMem.length};
                                        log("object hash = " + hashcode(saveMem));
                                        log("length = " + lengthData[0]);
                                        MPI.COMM_WORLD.bcast(lengthData, 1, MPI.INT, 0);
                                        log("bcast length done (length = " + lengthData[0] + ")");
                                        MPI.COMM_WORLD.barrier();
                                        MPI.COMM_WORLD.bcast(saveMem, lengthData[0], MPI.BYTE, 0);
                                        log("bcast data done");
                                        MPI.COMM_WORLD.barrier();
                                    }
                                    MPI.COMM_WORLD.bcast(new int[]{0}, 1, MPI.INT, 0);
                                } else {
                                    while (true) {
                                        int[] lengthData = new int[1];
                                        MPI.COMM_WORLD.bcast(lengthData, 1, MPI.INT, 0);
                                        log("bcast length done (length = " + lengthData[0] + ")");
                                        if (lengthData[0] == 0) {
                                            break;
                                        }
                                        MPI.COMM_WORLD.barrier();
                                        saveMem = new byte[lengthData[0]];
                                        MPI.COMM_WORLD.bcast(saveMem, saveMem.length, MPI.BYTE, 0);
                                        log("bcast data done");
                                        MPI.COMM_WORLD.barrier();
                                        log("object hash = " + hashcode(saveMem));
                                    }
                                }
                                MPI.COMM_WORLD.barrier();
                            } catch (MPIException ex) {
                                System.out.println("caught error." + ex);
                                log(ex.getMessage());
                            } catch (RuntimeException ex) {
                                System.out.println("caught error." + ex);
                                log(ex.getMessage());
                            } finally {
                                MPI.Finalize();
                            }

                        }

                    }


                    ############ The Error (if it does not just
                    hang up):

                    #
                    # A fatal error has been detected by the Java Runtime Environment:
                    #
                    #  SIGSEGV (0xb) at pc=0x00002b7e9c86e3a1, pid=1172, tid=47822674495232
                    #
                    #
                    # A fatal error has been detected by the Java Runtime Environment:
                    # JRE version: 7.0_25-b15
                    # Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode linux-amd64 compressed oops)
                    # Problematic frame:
                    # #
                    #  SIGSEGV (0xb) at pc=0x00002af69c0693a1, pid=1173, tid=47238546896640
                    #
                    # JRE version: 7.0_25-b15
                    J  de.uros.citlab.executor.test.TestSendBigFiles.hashcode([B)I
                    #
                    # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
                    #
                    # Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode linux-amd64 compressed oops)
                    # Problematic frame:
                    # J  de.uros.citlab.executor.test.TestSendBigFiles.hashcode([B)I
                    #
                    # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
                    #
                    # An error report file with more information is saved as:
                    # /home/gl069/ompi/bin/executor/hs_err_pid1172.log
                    # An error report file with more information is saved as:
                    # /home/gl069/ompi/bin/executor/hs_err_pid1173.log
                    #
                    # If you would like to submit a bug report, please visit:
                    # http://bugreport.sun.com/bugreport/crash.jsp
                    #
                    #
                    # If you would like to submit a bug report, please visit:
                    # http://bugreport.sun.com/bugreport/crash.jsp
                    #
                    [titan01:01172] *** Process received signal ***
                    [titan01:01172] Signal: Aborted (6)
                    [titan01:01172] Signal code:  (-6)
                    [titan01:01173] *** Process received signal ***
                    [titan01:01173] Signal: Aborted (6)
                    [titan01:01173] Signal code:  (-6)
                    [titan01:01172] [ 0] /usr/lib64/libpthread.so.0(+0xf100)[0x2b7e9596a100]
                    [titan01:01172] [ 1] /usr/lib64/libc.so.6(gsignal+0x37)[0x2b7e95fc75f7]
                    [titan01:01172] [ 2] /usr/lib64/libc.so.6(abort+0x148)[0x2b7e95fc8ce8]
                    [titan01:01172] [ 3] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x742ac5)[0x2b7e96a95ac5]
                    [titan01:01172] [ 4] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x8a2137)[0x2b7e96bf5137]
                    [titan01:01172] [ 5] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(JVM_handle_linux_signal+0x140)[0x2b7e96a995e0]
                    [titan01:01172] [ 6] [titan01:01173] [ 0] /usr/lib64/libpthread.so.0(+0xf100)[0x2af694ded100]
                    [titan01:01173] [ 1] /usr/lib64/libc.so.6(+0x35670)[0x2b7e95fc7670]
                    [titan01:01172] [ 7] [0x2b7e9c86e3a1]
                    [titan01:01172] *** End of error message ***
                    /usr/lib64/libc.so.6(gsignal+0x37)[0x2af69544a5f7]
                    [titan01:01173] [ 2] /usr/lib64/libc.so.6(abort+0x148)[0x2af69544bce8]
                    [titan01:01173] [ 3] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x742ac5)[0x2af695f18ac5]
                    [titan01:01173] [ 4] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x8a2137)[0x2af696078137]
                    [titan01:01173] [ 5] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(JVM_handle_linux_signal+0x140)[0x2af695f1c5e0]
                    [titan01:01173] [ 6] /usr/lib64/libc.so.6(+0x35670)[0x2af69544a670]
                    [titan01:01173] [ 7] [0x2af69c0693a1]
                    [titan01:01173] *** End of error message ***
                    -------------------------------------------------------
                    Primary job terminated normally, but 1 process returned
                    a non-zero exit code. Per user-direction, the job has been aborted.
                    -------------------------------------------------------
                    --------------------------------------------------------------------------
                    mpirun noticed that process rank 1 with PID 0 on node titan01 exited on signal 6 (Aborted).
                    --------------------------------------------------------------------------


                    ########CONFIGURATION:
                    I used the ompi master sources from github:
                    commit 267821f0dd405b5f4370017a287d9a49f92e734a
                    Author: Gilles Gouaillardet <gil...@rist.or.jp>
                    Date:   Tue Jul 5 13:47:50 2016 +0900

                    ./configure --enable-mpi-java --with-jdk-dir=/home/gl069/bin/jdk1.7.0_25 --disable-dlopen --disable-mca-dso

                    Thanks a lot for your help!
                    Gundram

_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2016/07/29603.php
