Hi everybody,

we've got some problems on our cluster with openmpi versions 1.2 and upward.

The following setup works:

  openmpi-1.2b3: SLES 9 SP3 with gcc/g++ 4.1.1 and PGI f95 6.1-1

The following two setups give a SIGSEGV in mpiexec (stack trace below):

  openmpi-1.2:   SLES 9 SP3 with gcc/g++ 4.1.1 and PGI f95 6.1-1
  openmpi-1.2.1: SLES 9 SP3 with gcc/g++ 4.1.1 and PGI f95 6.1-1

All have been compiled with

  export F77=pgf95
  export FC=pgf95
  ./configure --prefix=/sw/sles9-x64/voltaire/openmpi-1.2b3-pgi \
      --enable-pretty-print-stacktrace \
      --with-libnuma=/usr \
      --with-mvapi=/usr \
      --with-mvapi-libdir=/usr/lib64

(with the prefix changed accordingly, of course).

The stack trace:

Starting program: /scratch/work/system/sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/bin/mpiexec -host tornado1 --prefix=$MPIROOT -v -np 8 `pwd`/osu_bw
[Thread debugging using libthread_db enabled]
[New Thread 182906198784 (LWP 30805)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 182906198784 (LWP 30805)]
0x0000002a957f1b5b in _int_free () from /sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/lib/libopen-pal.so.0
(gdb) where
#0  0x0000002a957f1b5b in _int_free () from /sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/lib/libopen-pal.so.0
#1  0x0000002a957f1e7d in free () from /sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/lib/libopen-pal.so.0
#2  0x0000002a95563b72 in __tls_get_addr () from /lib64/ld-linux-x86-64.so.2
#3  0x0000002a95fb51ec in __libc_dl_error_tsd () from /lib64/tls/libc.so.6
#4  0x0000002a95dba6ec in __pthread_initialize_minimal_internal () from /lib64/tls/libpthread.so.0
#5  0x0000002a95dba419 in call_initialize_minimal () from /lib64/tls/libpthread.so.0
#6  0x0000002a95ec9000 in ?? ()
#7  0x0000002a95db9fe9 in _init () from /lib64/tls/libpthread.so.0
#8  0x0000007fbfffe7c0 in ?? ()
#9  0x0000002a9556168d in call_init () from /lib64/ld-linux-x86-64.so.2
#10 0x0000002a9556179b in _dl_init_internal () from /lib64/ld-linux-x86-64.so.2
#11 0x0000002a95fb39ac in dl_open_worker () from /lib64/tls/libc.so.6
#12 0x0000002a955612de in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#13 0x0000002a95fb3160 in _dl_open () from /lib64/tls/libc.so.6
#14 0x0000002a959413b5 in dlopen_doit () from /lib64/libdl.so.2
#15 0x0000002a955612de in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#16 0x0000002a959416fa in _dlerror_run () from /lib64/libdl.so.2
#17 0x0000002a95941362 in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#18 0x0000002a957db2ee in vm_open () from /sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/lib/libopen-pal.so.0
#19 0x0000002a957d9645 in tryall_dlopen () from /sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/lib/libopen-pal.so.0
#20 0x0000002a957d981e in tryall_dlopen_module () from /sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/lib/libopen-pal.so.0
#21 0x0000002a957daab1 in try_dlopen () from /sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/lib/libopen-pal.so.0
#22 0x0000002a957dacd6 in lt_dlopenext () from /sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/lib/libopen-pal.so.0
#23 0x0000002a957e04f5 in open_component () from /sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/lib/libopen-pal.so.0
#24 0x0000002a957e0f60 in mca_base_component_find () from /sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/lib/libopen-pal.so.0
#25 0x0000002a957e189c in mca_base_components_open () from /sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/lib/libopen-pal.so.0
#26 0x0000002a956a6119 in orte_rds_base_open () from /sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/lib/libopen-rte.so.0
#27 0x0000002a95681d18 in orte_init_stage1 () from /sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/lib/libopen-rte.so.0
#28 0x0000002a95684eba in orte_system_init () from /sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/lib/libopen-rte.so.0
#29 0x0000002a9568179d in orte_init () from /sw/sles9-x64/voltaire/openmpi-1.2.1-pgi/lib/libopen-rte.so.0
#30 0x0000000000402a3a in orterun (argc=8, argv=0x7fbfffe778) at orterun.c:374
#31 0x00000000004028d3 in main (argc=8, argv=0x7fbfffe778) at main.c:13
(gdb) quit

In case access to our cluster could help, we would be happy to provide an account.

Cheerio,
Luis
-- 
             \\\\\\
             (-0^0-)
--------------------------oOO--(_)--OOo-----------------------------
 Luis Kornblueh                         Tel. : +49-40-41173289
 Max-Planck-Institute for Meteorology   Fax. : +49-40-41173298
 Bundesstr. 53
 D-20146 Hamburg                        Email: luis.kornbl...@zmaw.de
 Federal Republic of Germany