Apparently Solaris 10 does not provide strnlen. We should add a check for it to our configure script and provide a replacement where it is missing.
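For illustration, a replacement could be as small as the sketch below. This is only a sketch under assumptions: the name compat_strnlen is hypothetical (not an existing Open MPI or PMIx symbol), and it presumes a configure check such as Autoconf's AC_CHECK_FUNCS([strnlen]), which defines HAVE_STRNLEN when the function exists, so that the fallback is only used on platforms like Solaris 10.

```c
#include <stddef.h>

/*
 * Hypothetical fallback for platforms without strnlen (e.g. Solaris 10).
 * The name compat_strnlen is an assumption made for this example.
 *
 * Returns the number of characters in s before the terminating '\0',
 * but never examines more than maxlen characters.
 */
size_t compat_strnlen(const char *s, size_t maxlen)
{
    size_t len = 0;

    /* Stop at the first NUL byte or after maxlen characters,
     * whichever comes first. */
    while (len < maxlen && s[len] != '\0') {
        ++len;
    }
    return len;
}
```

Callers would then use the system strnlen when HAVE_STRNLEN is defined and fall back to this function otherwise; how that is wired into the build is left open here.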
George.

On Wed, Jun 8, 2016 at 4:30 PM, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:

> Hi,
>
> I have built openmpi-dev-4221-gb707d13 on my machines (Solaris 10
> Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with
> gcc-5.1.0 and Sun C 5.13. Unfortunately I get an error for a small
> program.
>
> tyr hello_1 109 ompi_info | grep -e "OPAL repo revision:" -e "C compiler absolute:"
> OPAL repo revision: dev-4221-gb707d13
> C compiler absolute: /usr/local/gcc-5.1.0/bin/gcc
>
> tyr hello_1 110 mpiexec -np 4 --host tyr,sunpc1,linpc1,tyr hello_1_mpi
> ld.so.1: orted: fatal: relocation error: file /usr/local/openmpi-master_64_gcc/lib64/openmpi/mca_pmix_pmix114.so: symbol strnlen: referenced symbol not found
> --------------------------------------------------------------------------
> ORTE has lost communication with its daemon located on node:
>
> hostname: sunpc1
>
> This is usually due to either a failure of the TCP network
> connection to the node, or possibly an internal failure of
> the daemon itself. We cannot recover from this failure, and
> therefore will terminate the job.
> --------------------------------------------------------------------------
>
> I get the same error, if I login on a Solaris x86_64 machine and only use
> that machine.
>
> sunpc1 fd1026 101 mpiexec -np 2 --host sunpc1,sunpc1 hello_1_mpi
> ld.so.1: orterun: fatal: relocation error: file /usr/local/openmpi-master_64_gcc/lib64/openmpi/mca_pmix_pmix114.so: symbol strnlen: referenced symbol not found
> Killed
> sunpc1 fd1026 102
>
> tyr hello_1 111 /usr/local/gdb-7.6.1_64_gcc/bin/gdb mpiexec
> GNU gdb (GDB) 7.6.1
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "sparc-sun-solaris2.10".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /export2/prog/SunOS_sparc/openmpi-master_64_gcc/bin/orterun...done.
> (gdb) set args -np 4 --host tyr,sunpc1,linpc1,tyr hello_1_mpi
> (gdb) r
> Starting program: /usr/local/openmpi-master_64_gcc/bin/mpiexec -np 4 --host tyr,sunpc1,linpc1,tyr hello_1_mpi
> [Thread debugging using libthread_db enabled]
> [New Thread 1 (LWP 1)]
> [New LWP 2 ]
> [New LWP 3 ]
> [New LWP 4 ]
> [New LWP 5 ]
> ld.so.1: orted: fatal: relocation error: file /usr/local/openmpi-master_64_gcc/lib64/openmpi/mca_pmix_pmix114.so: symbol strnlen: referenced symbol not found
> --------------------------------------------------------------------------
> ORTE has lost communication with its daemon located on node:
>
> hostname: sunpc1
>
> This is usually due to either a failure of the TCP network
> connection to the node, or possibly an internal failure of
> the daemon itself. We cannot recover from this failure, and
> therefore will terminate the job.
> --------------------------------------------------------------------------
> [LWP 5 exited]
> [New Thread 5 ]
> [LWP 4 exited]
> [New Thread 4 ]
> [LWP 3 exited]
> [New Thread 3 ]
> [Switching to Thread 1 (LWP 1)]
> sol_thread_fetch_registers: td_ta_map_id2thr: no thread can be found to satisfy query
> (gdb) Killed
>
> (gdb) bt
> #0 0xffffffff7f6173d0 in rtld_db_dlactivity () from /usr/lib/sparcv9/ld.so.1
> #1 0xffffffff7f6175a8 in rd_event () from /usr/lib/sparcv9/ld.so.1
> #2 0xffffffff7f618950 in lm_delete () from /usr/lib/sparcv9/ld.so.1
> #3 0xffffffff7f6226bc in remove_so () from /usr/lib/sparcv9/ld.so.1
> #4 0xffffffff7f624574 in remove_hdl () from /usr/lib/sparcv9/ld.so.1
> #5 0xffffffff7f61d97c in dlclose_core () from /usr/lib/sparcv9/ld.so.1
> #6 0xffffffff7f61d9d4 in dlclose_intn () from /usr/lib/sparcv9/ld.so.1
> #7 0xffffffff7f61db0c in dlclose () from /usr/lib/sparcv9/ld.so.1
> #8 0xffffffff7ece8d30 in dlopen_close (handle=0x1001a8350)
>     at ../../../../../openmpi-dev-4221-gb707d13/opal/mca/dl/dlopen/dl_dlopen_module.c:148
> #9 0xffffffff7ece8464 in opal_dl_close (handle=0x1001a8350)
>     at ../../../../openmpi-dev-4221-gb707d13/opal/mca/dl/base/dl_base_fns.c:53
> #10 0xffffffff7ecab1c0 in mca_base_component_repository_release_internal (ri=0x1001406d0)
>     at ../../../../openmpi-dev-4221-gb707d13/opal/mca/base/mca_base_component_repository.c:280
> #11 0xffffffff7ecab338 in mca_base_component_repository_release (
>     component=0xffffffff799a70c0 <mca_pmix_pmix114_component>)
>     at ../../../../openmpi-dev-4221-gb707d13/opal/mca/base/mca_base_component_repository.c:317
> #12 0xffffffff7ecad0d8 in mca_base_component_unload (
>     component=0xffffffff799a70c0 <mca_pmix_pmix114_component>, output_id=-1)
>     at ../../../../openmpi-dev-4221-gb707d13/opal/mca/base/mca_base_components_close.c:46
> #13 0xffffffff7ecad170 in mca_base_component_close (
>     component=0xffffffff799a70c0 <mca_pmix_pmix114_component>, output_id=-1)
>     at ../../../../openmpi-dev-4221-gb707d13/opal/mca/base/mca_base_components_close.c:59
> #14 0xffffffff7ecad240 in mca_base_components_close (output_id=-1,
>     components=0xffffffff7ee9f558 <opal_pmix_base_framework+80>, skip=0x0)
>     at ../../../../openmpi-dev-4221-gb707d13/opal/mca/base/mca_base_components_close.c:85
> #15 0xffffffff7ecad1b0 in mca_base_framework_components_close (
>     framework=0xffffffff7ee9f508 <opal_pmix_base_framework>, skip=0x0)
>     at ../../../../openmpi-dev-4221-gb707d13/opal/mca/base/mca_base_components_close.c:65
> #16 0xffffffff7ed4921c in opal_pmix_base_frame_close ()
>     at ../../../../openmpi-dev-4221-gb707d13/opal/mca/pmix/base/pmix_base_frame.c:57
> #17 0xffffffff7ecc3418 in mca_base_framework_close (
>     framework=0xffffffff7ee9f508 <opal_pmix_base_framework>)
>     at ../../../../openmpi-dev-4221-gb707d13/opal/mca/base/mca_base_framework.c:214
> #18 0xffffffff7c20782c in rte_finalize ()
>     at ../../../../../openmpi-dev-4221-gb707d13/orte/mca/ess/hnp/ess_hnp_module.c:795
> #19 0xffffffff7ef39e20 in orte_finalize ()
>     at ../../openmpi-dev-4221-gb707d13/orte/runtime/orte_finalize.c:73
> #20 0x0000000100002d08 in orterun (argc=6, argv=0xffffffff7fffdf88)
>     at ../../../../openmpi-dev-4221-gb707d13/orte/tools/orterun/orterun.c:293
> #21 0x0000000100001928 in main (argc=6, argv=0xffffffff7fffdf88)
>     at ../../../../openmpi-dev-4221-gb707d13/orte/tools/orterun/main.c:13
> (gdb) q
> A debugging session is active.
>
> Inferior 1 [process 27925 ] will be killed.
>
> Quit anyway? (y or n) y
> Quitting: sol_thread_fetch_registers: td_ta_map_id2thr: no thread can be found to satisfy query
> tyr hello_1 112
>
> tyr hello_1 112 mpiexec -np 4 --host tyr,linpc1,linpc1,tyr hello_1_mpi
> ld.so.1: orterun: fatal: relocation error: file /usr/local/openmpi-master_64_gcc/lib64/openmpi/mca_pmix_pmix114.so: symbol strnlen: referenced symbol not found
> Killed
> tyr hello_1 113 Speicherschutzverletzung
> [linpc1:25689] *** Process received signal ***
> [linpc1:25689] Signal: Segmentation fault (11)
> [linpc1:25689] Signal code: Address not mapped (1)
> [linpc1:25689] Failing at address: 0x7f721f828aa1
>
> tyr hello_1 113
>
> I would be grateful if somebody can fix the problem. Please let me
> know, if you need more information. Thank you very much for any help
> in advance.
>
> Kind regards
>
> Siegmar
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: http://www.open-mpi.org/community/lists/users/2016/06/29405.php