Hi Gilles,

I don't know what happened, but the pmix files are no longer available,
although they were definitely available when I answered Ralph's email.
The remaining files also have different timestamps and sizes now. This
is an extract from my email to Ralph for Solaris SPARC.

-rwxr-xr-x 1 root root     977 Apr 19 19:49 mca_plm_rsh.la
-rwxr-xr-x 1 root root  153280 Apr 19 19:49 mca_plm_rsh.so
-rwxr-xr-x 1 root root    1007 Apr 19 19:47 mca_pmix_pmix112.la
-rwxr-xr-x 1 root root 1400512 Apr 19 19:47 mca_pmix_pmix112.so
-rwxr-xr-x 1 root root     971 Apr 19 19:52 mca_pml_cm.la
-rwxr-xr-x 1 root root  342440 Apr 19 19:52 mca_pml_cm.so

Now I get the following output for the same directory.

-rwxr-xr-x 1 root root     976 Apr 19 19:58 mca_plm_rsh.la
-rwxr-xr-x 1 root root  319816 Apr 19 19:58 mca_plm_rsh.so
-rwxr-xr-x 1 root root     970 Apr 19 20:00 mca_pml_cm.la
-rwxr-xr-x 1 root root 1507440 Apr 19 20:00 mca_pml_cm.so

I'll try to find out what happened next week when I'm back in
my office.
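
When I do, I can also fingerprint the libraries so that any further
change is easy to prove. A sketch with the POSIX cksum utility
(available on both Solaris and Linux; the path is the install prefix
from my gcc build below and may need adjusting):

cd /usr/local/openmpi-2.0.0_64_gcc/lib64/openmpi
# record size and CRC of the suspect components
cksum mca_plm_rsh.so mca_pml_cm.so > /tmp/mca.cksum
# later: recompute and compare against the recorded values
cksum mca_plm_rsh.so mca_pml_cm.so | diff - /tmp/mca.cksum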


Kind regards

Siegmar





On 23.04.16 at 02:12, Gilles Gouaillardet wrote:
Siegmar,

I will try to reproduce this on my Solaris 11 x86_64 VM.

In the meantime, can you please double-check that mca_pmix_pmix112.so
is a 64-bit library?
(e.g., confirm "-m64" was correctly passed to pmix)
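
For example (a sketch; the install path below is taken from your
configure command earlier in this thread, and the config.log location
assumes the embedded pmix build tree):

file /usr/local/openmpi-2.0.0_64_gcc/lib64/openmpi/mca_pmix_pmix112.so
# a 64-bit build should report "ELF 64-bit MSB" on SPARC (LSB on x86_64)
grep -- '-m64' opal/mca/pmix/pmix112/pmix/config.log | head
# run from the top of the build tree; checks whether -m64 reached pmix's configure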

Cheers,

Gilles

On Friday, April 22, 2016, Siegmar Gross
<siegmar.gr...@informatik.hs-fulda.de> wrote:

    Hi Ralph,

    I've already used "--enable-debug". "SYSTEM_ENV" is "SunOS" or
    "Linux" and "MACHINE_ENV" is "sparc" or "x86_64".

    mkdir openmpi-v2.x-dev-1280-gc110ae8-${SYSTEM_ENV}.${MACHINE_ENV}.64_gcc
    cd openmpi-v2.x-dev-1280-gc110ae8-${SYSTEM_ENV}.${MACHINE_ENV}.64_gcc

    ../openmpi-v2.x-dev-1280-gc110ae8/configure \
      --prefix=/usr/local/openmpi-2.0.0_64_gcc \
      --libdir=/usr/local/openmpi-2.0.0_64_gcc/lib64 \
      --with-jdk-bindir=/usr/local/jdk1.8.0/bin \
      --with-jdk-headers=/usr/local/jdk1.8.0/include \
      JAVA_HOME=/usr/local/jdk1.8.0 \
      LDFLAGS="-m64" CC="gcc" CXX="g++" FC="gfortran" \
      CFLAGS="-m64" CXXFLAGS="-m64" FCFLAGS="-m64" \
      CPP="cpp" CXXCPP="cpp" \
      --enable-mpi-cxx \
      --enable-cxx-exceptions \
      --enable-mpi-java \
      --enable-heterogeneous \
      --enable-mpi-thread-multiple \
      --with-hwloc=internal \
      --without-verbs \
      --with-wrapper-cflags="-std=c11 -m64" \
      --with-wrapper-cxxflags="-m64" \
      --with-wrapper-fcflags="-m64" \
      --enable-debug \
      |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_gcc


    mkdir openmpi-v2.x-dev-1280-gc110ae8-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc
    cd openmpi-v2.x-dev-1280-gc110ae8-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc

    ../openmpi-v2.x-dev-1280-gc110ae8/configure \
      --prefix=/usr/local/openmpi-2.0.0_64_cc \
      --libdir=/usr/local/openmpi-2.0.0_64_cc/lib64 \
      --with-jdk-bindir=/usr/local/jdk1.8.0/bin \
      --with-jdk-headers=/usr/local/jdk1.8.0/include \
      JAVA_HOME=/usr/local/jdk1.8.0 \
      LDFLAGS="-m64" CC="cc" CXX="CC" FC="f95" \
      CFLAGS="-m64" CXXFLAGS="-m64 -library=stlport4" FCFLAGS="-m64" \
      CPP="cpp" CXXCPP="cpp" \
      --enable-mpi-cxx \
      --enable-cxx-exceptions \
      --enable-mpi-java \
      --enable-heterogeneous \
      --enable-mpi-thread-multiple \
      --with-hwloc=internal \
      --without-verbs \
      --with-wrapper-cflags="-m64" \
      --with-wrapper-cxxflags="-m64 -library=stlport4" \
      --with-wrapper-fcflags="-m64" \
      --with-wrapper-ldflags="" \
      --enable-debug \
      |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc


    Kind regards

    Siegmar

    On 21.04.2016 at 18:18, Ralph Castain wrote:

        Can you please rebuild OMPI with "--enable-debug" in the configure
        command? It will let us see more error output.


            On Apr 21, 2016, at 8:52 AM, Siegmar Gross
            <siegmar.gr...@informatik.hs-fulda.de> wrote:

            Hi Ralph,

            I don't see any additional information.

            tyr hello_1 108 mpiexec -np 4 --host tyr,sunpc1,linpc1,ruester -mca mca_base_component_show_load_errors 1 hello_1_mpi
            [tyr.informatik.hs-fulda.de:06211] [[48741,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c at line 638
            --------------------------------------------------------------------------
            It looks like orte_init failed for some reason; your parallel process is
            likely to abort.  There are many reasons that a parallel process can
            fail during orte_init; some of which are due to configuration or
            environment problems.  This failure appears to be an internal failure;
            here's some additional information (which may only be relevant to an
            Open MPI developer):

             opal_pmix_base_select failed
             --> Returned value Not found (-13) instead of ORTE_SUCCESS
            --------------------------------------------------------------------------


            tyr hello_1 109 mpiexec -np 4 --host tyr,sunpc1,linpc1,ruester -mca mca_base_component_show_load_errors 1 -mca pmix_base_verbose 10 -mca pmix_server_verbose 5 hello_1_mpi
            [tyr.informatik.hs-fulda.de:06212] mca: base: components_register: registering framework pmix components
            [tyr.informatik.hs-fulda.de:06212] mca: base: components_open: opening pmix components
            [tyr.informatik.hs-fulda.de:06212] mca:base:select: Auto-selecting pmix components
            [tyr.informatik.hs-fulda.de:06212] mca:base:select:( pmix) No component selected!
            [tyr.informatik.hs-fulda.de:06212] [[48738,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c at line 638
            --------------------------------------------------------------------------
            It looks like orte_init failed for some reason; your parallel process is
            likely to abort.  There are many reasons that a parallel process can
            fail during orte_init; some of which are due to configuration or
            environment problems.  This failure appears to be an internal failure;
            here's some additional information (which may only be relevant to an
            Open MPI developer):

             opal_pmix_base_select failed
             --> Returned value Not found (-13) instead of ORTE_SUCCESS
            --------------------------------------------------------------------------
            tyr hello_1 110


            Kind regards

            Siegmar


            On 21.04.2016 at 17:24, Ralph Castain wrote:

                Hmmm…it looks like you built the right components, but
                they are not being picked up. Can you run your mpiexec
                command again, adding “-mca mca_base_component_show_load_errors 1”
                to the command line?


                    On Apr 21, 2016, at 8:16 AM, Siegmar Gross
                    <siegmar.gr...@informatik.hs-fulda.de> wrote:

                    Hi Ralph,

                    I have attached ompi_info output for both compilers
                    from my
                    sparc machine and the listings for both compilers
                    from the
                    <prefix>/lib/openmpi directories. Hopefully that
                    helps to
                    find the problem.

                    hermes tmp 3 tar zvft openmpi-2.x_info.tar.gz
                    -rw-r--r-- root/root     10969 2016-04-21 17:06 ompi_info_SunOS_sparc_cc.txt
                    -rw-r--r-- root/root     11044 2016-04-21 17:06 ompi_info_SunOS_sparc_gcc.txt
                    -rw-r--r-- root/root     71252 2016-04-21 17:02 lib64_openmpi.txt
                    hermes tmp 4


                    Kind regards and thank you very much once more for
                    your help

                    Siegmar


                    On 21.04.2016 at 15:54, Ralph Castain wrote:

                        Odd - it would appear that none of the pmix
                        components built? Can you send along the output
                        from ompi_info, or just a listing of the files in
                        the <prefix>/lib/openmpi directory?
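
                        For example (a sketch; substitute your actual
                        install prefix for the illustrative path below):

                        ompi_info | grep -i pmix
                        ls -l /usr/local/openmpi-2.0.0_64_gcc/lib64/openmpi | grep pmix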


                            On Apr 21, 2016, at 1:27 AM, Siegmar Gross
                            <siegmar.gr...@informatik.hs-fulda.de> wrote:

                            Hi Ralph,

                            On 21.04.2016 at 00:18, Ralph Castain wrote:

                                Could you please rerun these tests and
                                add “-mca pmix_base_verbose 10
                                -mca pmix_server_verbose 5” to your
                                command line? I need to see why the
                                pmix components failed.



                            tyr spawn 111 mpiexec -np 1 --host tyr,sunpc1,linpc1,ruester -mca pmix_base_verbose 10 -mca pmix_server_verbose 5 spawn_multiple_master
                            [tyr.informatik.hs-fulda.de:26652] mca: base: components_register: registering framework pmix components
                            [tyr.informatik.hs-fulda.de:26652] mca: base: components_open: opening pmix components
                            [tyr.informatik.hs-fulda.de:26652] mca:base:select: Auto-selecting pmix components
                            [tyr.informatik.hs-fulda.de:26652] mca:base:select:( pmix) No component selected!
                            [tyr.informatik.hs-fulda.de:26652] [[52794,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c at line 638
                            --------------------------------------------------------------------------
                            It looks like orte_init failed for some reason; your parallel process is
                            likely to abort.  There are many reasons that a parallel process can
                            fail during orte_init; some of which are due to configuration or
                            environment problems.  This failure appears to be an internal failure;
                            here's some additional information (which may only be relevant to an
                            Open MPI developer):

                            opal_pmix_base_select failed
                            --> Returned value Not found (-13) instead of ORTE_SUCCESS
                            --------------------------------------------------------------------------
                            tyr spawn 112




                            tyr hello_1 116 mpiexec -np 1 --host tyr,sunpc1,linpc1,ruester -mca pmix_base_verbose 10 -mca pmix_server_verbose 5 hello_1_mpi
                            [tyr.informatik.hs-fulda.de:27261] mca: base: components_register: registering framework pmix components
                            [tyr.informatik.hs-fulda.de:27261] mca: base: components_open: opening pmix components
                            [tyr.informatik.hs-fulda.de:27261] mca:base:select: Auto-selecting pmix components
                            [tyr.informatik.hs-fulda.de:27261] mca:base:select:( pmix) No component selected!
                            [tyr.informatik.hs-fulda.de:27261] [[52315,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c at line 638
                            --------------------------------------------------------------------------
                            It looks like orte_init failed for some reason; your parallel process is
                            likely to abort.  There are many reasons that a parallel process can
                            fail during orte_init; some of which are due to configuration or
                            environment problems.  This failure appears to be an internal failure;
                            here's some additional information (which may only be relevant to an
                            Open MPI developer):

                            opal_pmix_base_select failed
                            --> Returned value Not found (-13) instead of ORTE_SUCCESS
                            --------------------------------------------------------------------------
                            tyr hello_1 117



                            Thank you very much for your help.


                            Kind regards

                            Siegmar




                                Thanks
                                Ralph

                                    On Apr 20, 2016, at 10:12 AM, Siegmar Gross
                                    <siegmar.gr...@informatik.hs-fulda.de> wrote:

                                    Hi,

                                    I have built openmpi-v2.x-dev-1280-gc110ae8 on my machines
                                    (Solaris 10 SPARC, Solaris 10 x86_64, and openSUSE Linux
                                    12.1 x86_64) with gcc-5.1.0 and Sun C 5.13. Unfortunately
                                    I get runtime errors for some programs.


                                    Sun C 5.13:
                                    ===========

                                    For all my test programs I get the same error on Solaris
                                    SPARC and Solaris x86_64, while the programs work fine
                                    on Linux.

                                    tyr hello_1 115 mpiexec -np 2 hello_1_mpi
                                    [tyr.informatik.hs-fulda.de:22373] [[61763,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-v2.x-dev-1280-gc110ae8/orte/mca/ess/hnp/ess_hnp_module.c at line 638
                                    --------------------------------------------------------------------------
                                    It looks like orte_init failed for some reason; your parallel process is
                                    likely to abort.  There are many reasons that a parallel process can
                                    fail during orte_init; some of which are due to configuration or
                                    environment problems.  This failure appears to be an internal failure;
                                    here's some additional information (which may only be relevant to an
                                    Open MPI developer):

                                    opal_pmix_base_select failed
                                    --> Returned value Not found (-13) instead of ORTE_SUCCESS
                                    --------------------------------------------------------------------------
                                    tyr hello_1 116




                                    GCC-5.1.0:
                                    ==========

                                    tyr spawn 121 mpiexec -np 1 --host tyr,sunpc1,linpc1,ruester spawn_multiple_master

                                    Parent process 0 running on tyr.informatik.hs-fulda.de
                                    I create 3 slave processes.

                                    [tyr.informatik.hs-fulda.de:25366] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1280-gc110ae8/opal/mca/pmix/pmix112/pmix/src/server/pmix_server_ops.c at line 829
                                    [tyr.informatik.hs-fulda.de:25366] PMIX ERROR: UNPACK-PAST-END in file ../../../../../../openmpi-v2.x-dev-1280-gc110ae8/opal/mca/pmix/pmix112/pmix/src/server/pmix_server.c at line 2176
                                    [tyr:25377] *** An error occurred in MPI_Comm_spawn_multiple
                                    [tyr:25377] *** reported by process [3308257281,0]
                                    [tyr:25377] *** on communicator MPI_COMM_WORLD
                                    [tyr:25377] *** MPI_ERR_SPAWN: could not spawn processes
                                    [tyr:25377] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
                                    [tyr:25377] ***    and potentially your MPI job)
                                    tyr spawn 122


                                    I would be grateful if somebody could fix these problems.
                                    Thank you very much in advance for any help.


                                    Kind regards

                                    Siegmar
                                    
                                    <hello_1_mpi.c><spawn_multiple_master.c>









                    
                    <openmpi-2.x_info.tar.gz>










