I think there is some relevant discussion here:
https://github.com/open-mpi/ompi/issues/1569

It looks like Gilles had (at least at one point) a fix for master for the
--enable-heterogeneous case, but I don't know if it was ever committed.
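
For reference: the spawn_master source isn't posted in this thread, but from
the output below (parent on loki spawning 4 slaves, then an OPAL timeout
inside MPI_Comm_spawn) a minimal reproducer would look roughly like the
following sketch. The file name, slave binary name, and slave count are
assumptions inferred from the printed output, not Siegmar's actual code:

  /* spawn_master.c - minimal MPI_Comm_spawn test (sketch, assumed code) */
  #include <stdio.h>
  #include <mpi.h>

  #define NUM_SLAVES 4   /* matches "I create 4 slave processes" below */

  int main(int argc, char **argv)
  {
      int rank, len;
      char name[MPI_MAX_PROCESSOR_NAME];
      MPI_Comm intercomm;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Get_processor_name(name, &len);
      printf("Parent process %d running on %s\n", rank, name);
      printf("  I create %d slave processes\n", NUM_SLAVES);

      /* "spawn_slave" is a placeholder for the worker binary; the
         timeout below is reported from inside this call. */
      MPI_Comm_spawn("spawn_slave", MPI_ARGV_NULL, NUM_SLAVES,
                     MPI_INFO_NULL, 0, MPI_COMM_WORLD, &intercomm,
                     MPI_ERRCODES_IGNORE);

      MPI_Comm_disconnect(&intercomm);  /* slaves disconnect their parent comm */
      MPI_Finalize();
      return 0;
  }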

> On Jan 9, 2017, at 8:23 AM, Howard Pritchard <hpprit...@gmail.com> wrote:
> 
> Hi Siegmar,
> 
> You are using some configure parameters that I wasn't trying, and they may have some impact.
> I'll give it a try with those parameters.
> 
> This should be enough info for now,
> 
> Thanks,
> 
> Howard
> 
> 
> 2017-01-09 0:59 GMT-07:00 Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de>:
> Hi Howard,
> 
> I use the following commands to build and install the package.
> ${SYSTEM_ENV} is "Linux" and ${MACHINE_ENV} is "x86_64" for my
> Linux machine.
> 
> mkdir openmpi-2.0.2rc3-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc
> cd openmpi-2.0.2rc3-${SYSTEM_ENV}.${MACHINE_ENV}.64_cc
> 
> ../openmpi-2.0.2rc3/configure \
>   --prefix=/usr/local/openmpi-2.0.2_64_cc \
>   --libdir=/usr/local/openmpi-2.0.2_64_cc/lib64 \
>   --with-jdk-bindir=/usr/local/jdk1.8.0_66/bin \
>   --with-jdk-headers=/usr/local/jdk1.8.0_66/include \
>   JAVA_HOME=/usr/local/jdk1.8.0_66 \
>   LDFLAGS="-m64 -mt -Wl,-z -Wl,noexecstack" CC="cc" CXX="CC" FC="f95" \
>   CFLAGS="-m64 -mt" CXXFLAGS="-m64" FCFLAGS="-m64" \
>   CPP="cpp" CXXCPP="cpp" \
>   --enable-mpi-cxx \
>   --enable-mpi-cxx-bindings \
>   --enable-cxx-exceptions \
>   --enable-mpi-java \
>   --enable-heterogeneous \
>   --enable-mpi-thread-multiple \
>   --with-hwloc=internal \
>   --without-verbs \
>   --with-wrapper-cflags="-m64 -mt" \
>   --with-wrapper-cxxflags="-m64" \
>   --with-wrapper-fcflags="-m64" \
>   --with-wrapper-ldflags="-mt" \
>   --enable-debug \
>   |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc
> 
> make |& tee log.make.$SYSTEM_ENV.$MACHINE_ENV.64_cc
> rm -r /usr/local/openmpi-2.0.2_64_cc.old
> mv /usr/local/openmpi-2.0.2_64_cc /usr/local/openmpi-2.0.2_64_cc.old
> make install |& tee log.make-install.$SYSTEM_ENV.$MACHINE_ENV.64_cc
> make check |& tee log.make-check.$SYSTEM_ENV.$MACHINE_ENV.64_cc
> 
> 
> I get a different error when I run the program under gdb.
> 
> loki spawn 118 gdb /usr/local/openmpi-2.0.2_64_cc/bin/mpiexec
> GNU gdb (GDB; SUSE Linux Enterprise 12) 7.11.1
> Copyright (C) 2016 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-suse-linux".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://bugs.opensuse.org/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from /usr/local/openmpi-2.0.2_64_cc/bin/mpiexec...done.
> (gdb) r -np 1 --host loki --slot-list 0:0-5,1:0-5 spawn_master
> Starting program: /usr/local/openmpi-2.0.2_64_cc/bin/mpiexec -np 1 --host 
> loki --slot-list 0:0-5,1:0-5 spawn_master
> Missing separate debuginfos, use: zypper install 
> glibc-debuginfo-2.24-2.3.x86_64
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> [New Thread 0x7ffff3b97700 (LWP 13582)]
> [New Thread 0x7ffff18a4700 (LWP 13583)]
> [New Thread 0x7ffff10a3700 (LWP 13584)]
> [New Thread 0x7fffebbba700 (LWP 13585)]
> Detaching after fork from child process 13586.
> 
> Parent process 0 running on loki
>   I create 4 slave processes
> 
> Detaching after fork from child process 13589.
> Detaching after fork from child process 13590.
> Detaching after fork from child process 13591.
> [loki:13586] OPAL ERROR: Timeout in file 
> ../../../../openmpi-2.0.2rc3/opal/mca/pmix/base/pmix_base_fns.c at line 193
> [loki:13586] *** An error occurred in MPI_Comm_spawn
> [loki:13586] *** reported by process [2873294849,0]
> [loki:13586] *** on communicator MPI_COMM_WORLD
> [loki:13586] *** MPI_ERR_UNKNOWN: unknown error
> [loki:13586] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will 
> now abort,
> [loki:13586] ***    and potentially your MPI job)
> [Thread 0x7fffebbba700 (LWP 13585) exited]
> [Thread 0x7ffff10a3700 (LWP 13584) exited]
> [Thread 0x7ffff18a4700 (LWP 13583) exited]
> [Thread 0x7ffff3b97700 (LWP 13582) exited]
> [Inferior 1 (process 13567) exited with code 016]
> Missing separate debuginfos, use: zypper install 
> libpciaccess0-debuginfo-0.13.2-5.1.x86_64 
> libudev1-debuginfo-210-116.3.3.x86_64
> (gdb) bt
> No stack.
> (gdb)
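
(A note on the gdb session above: gdb follows the parent after a fork by
default, which is why it prints "Detaching after fork from child process
13586". The MPI_Comm_spawn error is then raised in a process gdb is no
longer attached to, so "bt" reports "No stack." once mpiexec itself has
exited. To follow the failing child instead, the standard gdb fork-following
settings should help, e.g.:

  (gdb) set detach-on-fork off
  (gdb) set follow-fork-mode child
  (gdb) r -np 1 --host loki --slot-list 0:0-5,1:0-5 spawn_master

Alternatively, one can attach a second gdb to the child PID shown in the
error messages.)
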
> 
> Do you need anything else?
> 
> 
> Kind regards
> 
> Siegmar
> 
> On 08.01.2017 at 17:02, Howard Pritchard wrote:
> Hi Siegmar,
> 
> Could you post the configure options you used when building 2.0.2rc3?
> Maybe that will help in trying to reproduce the segfault you are observing.
> 
> Howard
> 
> 
> 2017-01-07 2:30 GMT-07:00 Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de>:
> 
>     Hi,
> 
>     I have installed openmpi-2.0.2rc3 on my "SUSE Linux Enterprise
>     Server 12 (x86_64)" with Sun C 5.14 and gcc-6.3.0. Unfortunately,
>     I still get the same error that I reported for rc2.
> 
>     I would be grateful if somebody could fix the problem before
>     the final version is released. Thank you very much in advance
>     for any help.
> 
> 
>     Kind regards
> 
>     Siegmar

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
