Hello Gilles, Howard,

I configured without --disable-dlopen: same error.

I tested these classes on another cluster and: IT WORKS!

So it is a problem with the cluster configuration. Thank you all very much for all your help! When the admin has solved the problem, I will let you know what he changed.

Cheers, Gundram

On 07/08/2016 04:19 PM, Howard Pritchard wrote:
Hi Gundram

Could you configure without the --disable-dlopen option and retry?

Howard

On Friday, July 8, 2016, Gilles Gouaillardet wrote:

    the JVM sets its own signal handlers, and it is important that
    Open MPI does not override them.
    This is what previously happened with PSM (InfiniPath), but that
    has since been solved.
    You might be linking against a third-party library that hijacks
    the signal handlers and causes the crash
    (which would explain why I cannot reproduce the issue).

    The master branch has a revamped memory patcher (compared to v2.x
    or v1.10), and that could have some bad interactions with the JVM,
    so you might also give v2.x a try.

    Cheers,

    Gilles

    On Friday, July 8, 2016, Gundram Leifert
    <gundram.leif...@uni-rostock.de> wrote:

        You made the best of it... thanks a lot!

        Without MPI it runs.
        Just adding MPI.Init() causes the crash!

        Maybe I installed something wrong...

        install the newest automake, autoconf, m4, libtool in the
        right order and with the same prefix
        check out ompi
        run autogen
        configure with the same prefix, pointing to the same JDK I use later
        make
        make install

        I will test some different configurations of ./configure...


        On 07/08/2016 01:40 PM, Gilles Gouaillardet wrote:
        I am running out of ideas ...

        what if you do not run within slurm?
        what if you do not use '-cp executor.jar'?
        or what if you configure without --disable-dlopen --disable-mca-dso?

        if you
        mpirun -np 1 ...
        then MPI_Bcast and MPI_Barrier are basically no-ops, so it is
        really weird that your program is still crashing. Another test
        is to comment out MPI_Bcast and MPI_Barrier and try again with
        -np 1.
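
        For example, such a reduced test could look like the sketch
        below (the class name InitOnly is an arbitrary choice; it uses
        only calls already shown in this thread):

        import mpi.*;

        public class InitOnly {
            public static void main(String[] args) throws MPIException {
                // only init/finalize: no bcast, no barrier
                MPI.Init(args);
                System.err.println("rank " + MPI.COMM_WORLD.getRank()
                        + " of " + MPI.COMM_WORLD.getSize() + " is up");
                MPI.Finalize();
            }
        }

        if even this crashes with -np 1, the problem is in MPI_Init
        (or in its interaction with the JVM) and not in the collectives.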

        Cheers,

        Gilles

        On Friday, July 8, 2016, Gundram Leifert
        <gundram.leif...@uni-rostock.de> wrote:

            In all cases the same error.
            This is what I run:

            salloc -n 3
            export IPATH_NO_BACKTRACE
            ulimit -s 10240
            mpirun -np 3 java -cp executor.jar de.uros.citlab.executor.test.TestSendBigFiles2


            Also with 1 or 2 cores, the process crashes.


            On 07/08/2016 12:32 PM, Gilles Gouaillardet wrote:
            you can try
            export IPATH_NO_BACKTRACE
            before invoking mpirun (that should not be needed, though)

            another test is to
            ulimit -s 10240
            before invoking mpirun.

            btw, do you use mpirun or srun?

            can you reproduce the crash with 1 or 2 tasks?

            Cheers,

            Gilles

            On Friday, July 8, 2016, Gundram Leifert
            <gundram.leif...@uni-rostock.de> wrote:

                Hello,

                configure:
                ./configure --enable-mpi-java --with-jdk-dir=/home/gl069/bin/jdk1.7.0_25 --disable-dlopen --disable-mca-dso


                1 node with 3 cores. I use SLURM to allocate one
                node. I changed --mem, but it has no effect.
                salloc -n 3


                core file size          (blocks, -c) 0
                data seg size           (kbytes, -d) unlimited
                scheduling priority             (-e) 0
                file size               (blocks, -f) unlimited
                pending signals                 (-i) 256564
                max locked memory       (kbytes, -l) unlimited
                max memory size         (kbytes, -m) unlimited
                open files                      (-n) 100000
                pipe size            (512 bytes, -p) 8
                POSIX message queues     (bytes, -q) 819200
                real-time priority              (-r) 0
                stack size              (kbytes, -s) unlimited
                cpu time               (seconds, -t) unlimited
                max user processes              (-u) 4096
                virtual memory          (kbytes, -v) unlimited
                file locks                      (-x) unlimited

                uname -a
                Linux titan01.service 3.10.0-327.13.1.el7.x86_64 #1
                SMP Thu Mar 31 16:04:38 UTC 2016 x86_64 x86_64
                x86_64 GNU/Linux

                cat /etc/system-release
                CentOS Linux release 7.2.1511 (Core)

                what else do you need?

                Cheers, Gundram

                On 07/07/2016 10:05 AM, Gilles Gouaillardet wrote:

                Gundram,


                can you please provide more information on your
                environment:

                - configure command line

                - OS

                - memory available

                - ulimit -a

                - number of nodes

                - number of tasks used

                - interconnect used (if any)

                - batch manager (if any)


                Cheers,


                Gilles

                On 7/7/2016 4:17 PM, Gundram Leifert wrote:
                Hello Gilles,

                I tried your code and it crashes after 3-15
                iterations (see (1)). It is always the same error
                (only the "94" varies).

                Meanwhile I think Java and MPI use the same memory,
                because when I delete the hash call, the program
                sometimes runs more than 9k iterations.
                When it crashes, it is at different lines (see
                (2) and (3)). The crashes also occur on rank 0.
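
                One variant, instead of deleting the hash completely,
                would be the sketch below (assuming the library call
                java.util.Arrays.hashCode is acceptable here):

                import java.util.Arrays;

                // library hash instead of the hand-written loop, so the
                // JIT-compiled hashcode([BI)I frame disappears:
                log("object hash = " + Arrays.hashCode(saveMem));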

                ##### (1)#####
                # Problematic frame:
                # J 94 C2
                de.uros.citlab.executor.test.TestSendBigFiles2.hashcode([BI)I
                (42 bytes) @ 0x00002b03242dc9c4
                [0x00002b03242dc860+0x164]

                #####(2)#####
                # Problematic frame:
                # V  [libjvm.so+0x68d0f6]
                JavaCallWrapper::JavaCallWrapper(methodHandle,
                Handle, JavaValue*, Thread*)+0xb6

                #####(3)#####
                # Problematic frame:
                # V  [libjvm.so+0x4183bf]
                ThreadInVMfromNative::ThreadInVMfromNative(JavaThread*)+0x4f

                Any more ideas?

                On 07/07/2016 03:00 AM, Gilles Gouaillardet wrote:

                Gundram,


                FWIW, I cannot reproduce the issue on my box:

                - centos 7

                - java version "1.8.0_71"
                  Java(TM) SE Runtime Environment (build
                1.8.0_71-b15)
                  Java HotSpot(TM) 64-Bit Server VM (build
                25.71-b15, mixed mode)


                I noticed that on non-zero ranks, saveMem is
                allocated at each iteration.
                Ideally, the garbage collector takes care of
                that, and it should not be an issue.
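
                For reference, reusing the receive buffer on the
                non-zero ranks could look like the sketch below,
                replacing the receiver loop in TestSendBigFiles
                (it assumes the buffer may stay larger than the
                announced length):

                byte[] saveMem = new byte[0];
                while (true) {
                    int[] lengthData = new int[1];
                    MPI.COMM_WORLD.bcast(lengthData, 1, MPI.INT, 0);
                    if (lengthData[0] == 0) {
                        break;
                    }
                    if (saveMem.length < lengthData[0]) {
                        // allocate only when the buffer must grow
                        saveMem = new byte[lengthData[0]];
                    }
                    MPI.COMM_WORLD.barrier();
                    MPI.COMM_WORLD.bcast(saveMem, lengthData[0], MPI.BYTE, 0);
                    MPI.COMM_WORLD.barrier();
                }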

                would you mind giving the attached file a try?

                Cheers,

                Gilles

                On 7/7/2016 7:41 AM, Gilles Gouaillardet wrote:
                I will have a look at it today

                how did you configure Open MPI?

                Cheers,

                Gilles

                On Thursday, July 7, 2016, Gundram Leifert
                <gundram.leif...@uni-rostock.de> wrote:

                    Hello Gilles,

                    thank you for your hints! I made 3 changes;
                    unfortunately the same error occurs:

                    update ompi:
                    commit ae8444682f0a7aa158caea08800542ce9874455e
                    Author: Ralph Castain <r...@open-mpi.org>
                    Date:   Tue Jul 5 20:07:16 2016 -0700

                    update java:
                    java version "1.8.0_92"
                    Java(TM) SE Runtime Environment (build
                    1.8.0_92-b14)
                    Java HotSpot(TM) Server VM (build 25.92-b14,
                    mixed mode)

                    delete hashcode-lines.

                    Now I get this error message 100% of the time,
                    after a varying number of iterations (15-300):

                     0/ 3:length = 100000000
                     0/ 3:bcast length done (length = 100000000)
                     1/ 3:bcast length done (length = 100000000)
                     2/ 3:bcast length done (length = 100000000)
                    #
                    # A fatal error has been detected by the Java Runtime Environment:
                    #
                    #  SIGSEGV (0xb) at pc=0x00002b3d022fcd24, pid=16578, tid=0x00002b3d29716700
                    #
                    # JRE version: Java(TM) SE Runtime Environment (8.0_92-b14) (build 1.8.0_92-b14)
                    # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.92-b14 mixed mode linux-amd64 compressed oops)
                    # Problematic frame:
                    # V [libjvm.so+0x414d24] ciEnv::get_field_by_index(ciInstanceKlass*, int)+0x94
                    #
                    # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
                    #
                    # An error report file with more information is saved as:
                    # /home/gl069/ompi/bin/executor/hs_err_pid16578.log
                    #
                    # Compiler replay data is saved as:
                    # /home/gl069/ompi/bin/executor/replay_pid16578.log
                    #
                    # If you would like to submit a bug report, please visit:
                    # http://bugreport.java.com/bugreport/crash.jsp
                    #
                    [titan01:16578] *** Process received signal ***
                    [titan01:16578] Signal: Aborted (6)
                    [titan01:16578] Signal code:  (-6)
                    [titan01:16578] [ 0] /usr/lib64/libpthread.so.0(+0xf100)[0x2b3d01500100]
                    [titan01:16578] [ 1] /usr/lib64/libc.so.6(gsignal+0x37)[0x2b3d01b5c5f7]
                    [titan01:16578] [ 2] /usr/lib64/libc.so.6(abort+0x148)[0x2b3d01b5dce8]
                    [titan01:16578] [ 3] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x91e605)[0x2b3d02806605]
                    [titan01:16578] [ 4] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0xabda63)[0x2b3d029a5a63]
                    [titan01:16578] [ 5] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(JVM_handle_linux_signal+0x14f)[0x2b3d0280be2f]
                    [titan01:16578] [ 6] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x91a5c3)[0x2b3d028025c3]
                    [titan01:16578] [ 7] /usr/lib64/libc.so.6(+0x35670)[0x2b3d01b5c670]
                    [titan01:16578] [ 8] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x414d24)[0x2b3d022fcd24]
                    [titan01:16578] [ 9] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x43c5ae)[0x2b3d023245ae]
                    [titan01:16578] [10] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x369ade)[0x2b3d02251ade]
                    [titan01:16578] [11] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x36eda0)[0x2b3d02256da0]
                    [titan01:16578] [12] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x37091b)[0x2b3d0225891b]
                    [titan01:16578] [13] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x3712b6)[0x2b3d022592b6]
                    [titan01:16578] [14] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x36d2cf)[0x2b3d022552cf]
                    [titan01:16578] [15] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x36e412)[0x2b3d02256412]
                    [titan01:16578] [16] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x36ed8d)[0x2b3d02256d8d]
                    [titan01:16578] [17] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x37091b)[0x2b3d0225891b]
                    [titan01:16578] [18] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x3712b6)[0x2b3d022592b6]
                    [titan01:16578] [19] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x36d2cf)[0x2b3d022552cf]
                    [titan01:16578] [20] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x36e412)[0x2b3d02256412]
                    [titan01:16578] [21] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x36ed8d)[0x2b3d02256d8d]
                    [titan01:16578] [22] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x3708c2)[0x2b3d022588c2]
                    [titan01:16578] [23] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x3724e7)[0x2b3d0225a4e7]
                    [titan01:16578] [24] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x37a817)[0x2b3d02262817]
                    [titan01:16578] [25] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x37a92f)[0x2b3d0226292f]
                    [titan01:16578] [26] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x358edb)[0x2b3d02240edb]
                    [titan01:16578] [27] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x35929e)[0x2b3d0224129e]
                    [titan01:16578] [28] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x3593ce)[0x2b3d022413ce]
                    [titan01:16578] [29] /home/gl069/bin/jdk1.8.0_92/jre/lib/amd64/server/libjvm.so(+0x35973e)[0x2b3d0224173e]
                    [titan01:16578] *** End of error message ***
                    -------------------------------------------------------
                    Primary job  terminated normally, but 1 process returned
                    a non-zero exit code. Per user-direction, the job has been aborted.
                    -------------------------------------------------------
                    --------------------------------------------------------------------------
                    mpirun noticed that process rank 2 with PID 0 on node titan01 exited on signal 6 (Aborted).
                    --------------------------------------------------------------------------

                    I don't know if it is a problem of Java or
                    OMPI, but in recent years Java has worked
                    without problems on my machine...

                    Thank you for your tips in advance!
                    Gundram

                    On 07/06/2016 03:10 PM, Gilles Gouaillardet wrote:
                    Note a race condition in MPI_Init was
                    fixed yesterday in the master.
                    Can you please update your Open MPI and
                    try again?

                    Hopefully the hang will disappear.

                    Can you reproduce the crash with a simpler
                    (and ideally deterministic) version of
                    your program?
                    The crash occurs in hashcode, and that
                    makes little sense to me. Can you also
                    update your JDK?
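
                    For instance, a deterministic variant could
                    simply fix the seed of the random generator,
                    replacing "Random r = new Random();" (the
                    value 42 is an arbitrary choice):

                    Random r = new Random(42L); // same byte stream on every run
                    r.nextBytes(saveMem);       // rank 0's payload is reproducible

                    that way, a crash at iteration N should
                    reproduce at the same iteration.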

                    Cheers,

                    Gilles

                    On Wednesday, July 6, 2016, Gundram Leifert
                    <gundram.leif...@uni-rostock.de> wrote:

                        Hello Jason,

                        thanks for your response! I think it is
                        another problem. I try to send 100 MB byte
                        arrays, so there are not many iterations
                        (between 10 and 30). I realized that the
                        execution of this code can result in 3
                        different errors:

                        1. Most often, the posted error message
                        occurs.

                        2. In <10% of the cases I get a livelock.
                        I can see 3 Java processes, one with 200%
                        and two with 100% processor utilization.
                        After ~15 minutes without new output,
                        this error occurs.


                        [thread 47499823949568 also had an error]
                        # A fatal error has been detected by the Java Runtime Environment:
                        #
                        #  Internal Error (safepoint.cpp:317), pid=24256, tid=47500347131648
                        # guarantee(PageArmed == 0) failed: invariant
                        #
                        # JRE version: 7.0_25-b15
                        # Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode linux-amd64 compressed oops)
                        # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
                        #
                        # An error report file with more information is saved as:
                        # /home/gl069/ompi/bin/executor/hs_err_pid24256.log
                        #
                        # If you would like to submit a bug report, please visit:
                        # http://bugreport.sun.com/bugreport/crash.jsp
                        #
                        [titan01:24256] *** Process received signal ***
                        [titan01:24256] Signal: Aborted (6)
                        [titan01:24256] Signal code: (-6)
                        [titan01:24256] [ 0] /usr/lib64/libpthread.so.0(+0xf100)[0x2b336a324100]
                        [titan01:24256] [ 1] /usr/lib64/libc.so.6(gsignal+0x37)[0x2b336a9815f7]
                        [titan01:24256] [ 2] /usr/lib64/libc.so.6(abort+0x148)[0x2b336a982ce8]
                        [titan01:24256] [ 3] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x742ac5)[0x2b336b44fac5]
                        [titan01:24256] [ 4] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x8a2137)[0x2b336b5af137]
                        [titan01:24256] [ 5] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x407262)[0x2b336b114262]
                        [titan01:24256] [ 6] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x7c6c34)[0x2b336b4d3c34]
                        [titan01:24256] [ 7] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x8a9c17)[0x2b336b5b6c17]
                        [titan01:24256] [ 8] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x8aa2c0)[0x2b336b5b72c0]
                        [titan01:24256] [ 9] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x744270)[0x2b336b451270]
                        [titan01:24256] [10] /usr/lib64/libpthread.so.0(+0x7dc5)[0x2b336a31cdc5]
                        [titan01:24256] [11] /usr/lib64/libc.so.6(clone+0x6d)[0x2b336aa4228d]
                        [titan01:24256] *** End of error message ***
                        -------------------------------------------------------
                        Primary job terminated normally, but 1 process returned
                        a non-zero exit code. Per user-direction, the job has been aborted.
                        -------------------------------------------------------
                        --------------------------------------------------------------------------
                        mpirun noticed that process rank 0 with PID 0 on node titan01 exited on signal 6 (Aborted).
                        --------------------------------------------------------------------------


                        3. In <10% of the cases I get a deadlock
                        during MPI.Init. It stays there for more
                        than 15 minutes without returning an
                        error message...

                        Can I enable some debug flags to see
                        what happens on the C / Open MPI side?

                        Thanks in advance for your help!
                        Gundram Leifert


                        On 07/05/2016 06:05 PM, Jason Maldonis wrote:
                        After reading your thread, it looks like
                        it may be related to an issue I had a
                        few weeks ago (I'm a novice though).
                        Maybe my thread will be of help:
                        https://www.open-mpi.org/community/lists/users/2016/06/29425.php


                        When you say "After a specific number
                        of repetitions the process either
                        hangs up or returns with a SIGSEGV.",
                        do you mean that a single call
                        hangs, or that at some point during
                        the for loop a call hangs? If you mean
                        the latter, then it might relate to my
                        issue. Otherwise my thread probably
                        won't be helpful.

                        Jason Maldonis
                        Research Assistant of Professor Paul
                        Voyles
                        Materials Science Grad Student
                        University of Wisconsin, Madison
                        1509 University Ave, Rm M142
                        Madison, WI 53706
                        maldo...@wisc.edu
                        608-295-5532

                        On Tue, Jul 5, 2016 at 9:58 AM, Gundram Leifert
                        <gundram.leif...@uni-rostock.de> wrote:

                            Hello,

                            I try to send many byte arrays via
                            broadcast. After a specific number
                            of repetitions the process either
                            hangs up or returns with a SIGSEGV.
                            Can anyone help me solve the problem?

                            ########## The code:

                            import java.util.Random;
                            import mpi.*;

                            public class TestSendBigFiles {

                                public static void log(String msg) {
                                    try {
                                        System.err.println(String.format("%2d/%2d:%s",
                                                MPI.COMM_WORLD.getRank(),
                                                MPI.COMM_WORLD.getSize(), msg));
                                    } catch (MPIException ex) {
                                        System.err.println(String.format("%2s/%2s:%s",
                                                "?", "?", msg));
                                    }
                                }

                                private static int hashcode(byte[] bytearray) {
                                    if (bytearray == null) {
                                        return 0;
                                    }
                                    int hash = 39;
                                    for (int i = 0; i < bytearray.length; i++) {
                                        byte b = bytearray[i];
                                        hash = hash * 7 + (int) b;
                                    }
                                    return hash;
                                }

                                public static void main(String args[]) throws MPIException {
                                    log("start main");
                                    MPI.Init(args);
                                    try {
                                        log("initialized done");
                                        byte[] saveMem = new byte[100000000];
                                        MPI.COMM_WORLD.barrier();
                                        Random r = new Random();
                                        r.nextBytes(saveMem);
                                        if (MPI.COMM_WORLD.getRank() == 0) {
                                            for (int i = 0; i < 1000; i++) {
                                                saveMem[r.nextInt(saveMem.length)]++;
                                                log("i = " + i);
                                                int[] lengthData = new int[]{saveMem.length};
                                                log("object hash = " + hashcode(saveMem));
                                                log("length = " + lengthData[0]);
                                                MPI.COMM_WORLD.bcast(lengthData, 1, MPI.INT, 0);
                                                log("bcast length done (length = " + lengthData[0] + ")");
                                                MPI.COMM_WORLD.barrier();
                                                MPI.COMM_WORLD.bcast(saveMem, lengthData[0], MPI.BYTE, 0);
                                                log("bcast data done");
                                                MPI.COMM_WORLD.barrier();
                                            }
                                            MPI.COMM_WORLD.bcast(new int[]{0}, 1, MPI.INT, 0);
                                        } else {
                                            while (true) {
                                                int[] lengthData = new int[1];
                                                MPI.COMM_WORLD.bcast(lengthData, 1, MPI.INT, 0);
                                                log("bcast length done (length = " + lengthData[0] + ")");
                                                if (lengthData[0] == 0) {
                                                    break;
                                                }
                                                MPI.COMM_WORLD.barrier();
                                                saveMem = new byte[lengthData[0]];
                                                MPI.COMM_WORLD.bcast(saveMem, saveMem.length, MPI.BYTE, 0);
                                                log("bcast data done");
                                                MPI.COMM_WORLD.barrier();
                                                log("object hash = " + hashcode(saveMem));
                                            }
                                        }
                                        MPI.COMM_WORLD.barrier();
                                    } catch (MPIException ex) {
                                        System.out.println("caught error." + ex);
                                        log(ex.getMessage());
                                    } catch (RuntimeException ex) {
                                        System.out.println("caught error." + ex);
                                        log(ex.getMessage());
                                    } finally {
                                        MPI.Finalize();
                                    }
                                }
                            }


                            ############ The Error (if it does not just hang up):

                            #
                            # A fatal error has been detected by the Java Runtime Environment:
                            #
                            #  SIGSEGV (0xb) at pc=0x00002b7e9c86e3a1, pid=1172, tid=47822674495232
                            #
                            #
                            # A fatal error has been detected by the Java Runtime Environment:
                            # JRE version: 7.0_25-b15
                            # Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode linux-amd64 compressed oops)
                            # Problematic frame:
                            # #
                            #  SIGSEGV (0xb) at pc=0x00002af69c0693a1, pid=1173, tid=47238546896640
                            #
                            # JRE version: 7.0_25-b15
                            J  de.uros.citlab.executor.test.TestSendBigFiles.hashcode([B)I
                            #
                            # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
                            #
                            # Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode linux-amd64 compressed oops)
                            # Problematic frame:
                            # J  de.uros.citlab.executor.test.TestSendBigFiles.hashcode([B)I
                            #
                            # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
                            #
                            # An error report file with more information is saved as:
                            # /home/gl069/ompi/bin/executor/hs_err_pid1172.log
                            # An error report file with more information is saved as:
                            # /home/gl069/ompi/bin/executor/hs_err_pid1173.log
                            #
                            # If you would like to submit a bug report, please visit:
                            # http://bugreport.sun.com/bugreport/crash.jsp
                            #
                            #
                            # If you would like to submit a bug report, please visit:
                            # http://bugreport.sun.com/bugreport/crash.jsp
                            #
                            [titan01:01172] *** Process received signal ***
                            [titan01:01172] Signal: Aborted (6)
                            [titan01:01172] Signal code: (-6)
                            [titan01:01173] *** Process received signal ***
                            [titan01:01173] Signal: Aborted (6)
                            [titan01:01173] Signal code: (-6)
                            [titan01:01172] [ 0] /usr/lib64/libpthread.so.0(+0xf100)[0x2b7e9596a100]
                            [titan01:01172] [ 1] /usr/lib64/libc.so.6(gsignal+0x37)[0x2b7e95fc75f7]
                            [titan01:01172] [ 2] /usr/lib64/libc.so.6(abort+0x148)[0x2b7e95fc8ce8]
                            [titan01:01172] [ 3] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x742ac5)[0x2b7e96a95ac5]
                            [titan01:01172] [ 4] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x8a2137)[0x2b7e96bf5137]
                            [titan01:01172] [ 5] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(JVM_handle_linux_signal+0x140)[0x2b7e96a995e0]
                            [titan01:01172] [ 6] [titan01:01173] [ 0] /usr/lib64/libpthread.so.0(+0xf100)[0x2af694ded100]
                            [titan01:01173] [ 1] /usr/lib64/libc.so.6(+0x35670)[0x2b7e95fc7670]
                            [titan01:01172] [ 7] [0x2b7e9c86e3a1]
                            [titan01:01172] *** End of error message ***
                            /usr/lib64/libc.so.6(gsignal+0x37)[0x2af69544a5f7]
                            [titan01:01173] [ 2] /usr/lib64/libc.so.6(abort+0x148)[0x2af69544bce8]
                            [titan01:01173] [ 3] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x742ac5)[0x2af695f18ac5]
                            [titan01:01173] [ 4] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x8a2137)[0x2af696078137]
                            [titan01:01173] [ 5] /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(JVM_handle_linux_signal+0x140)[0x2af695f1c5e0]
                            [titan01:01173] [ 6] /usr/lib64/libc.so.6(+0x35670)[0x2af69544a670]
                            [titan01:01173] [ 7] [0x2af69c0693a1]
                            [titan01:01173] *** End of error message ***
                            -------------------------------------------------------
                            Primary job terminated normally, but 1 process returned
                            a non-zero exit code. Per user-direction, the job has been aborted.
                            -------------------------------------------------------
                            --------------------------------------------------------------------------
                            mpirun noticed that process rank 1 with PID 0 on node titan01 exited on signal 6 (Aborted).
                            --------------------------------------------------------------------------


                            ######## CONFIGURATION:

                            I used the ompi master sources from github:
                            commit 267821f0dd405b5f4370017a287d9a49f92e734a
                            Author: Gilles Gouaillardet <gil...@rist.or.jp>
                            Date:   Tue Jul 5 13:47:50 2016 +0900

                            ./configure --enable-mpi-java --with-jdk-dir=/home/gl069/bin/jdk1.7.0_25 --disable-dlopen --disable-mca-dso

                            Thanks a lot for your help!
                            Gundram
