Hello, everyone,
We get "Segmentation fault" errors when running "mpiexec" with the
"-loadbalance" option (details below). However, we have no problems
when using "-bynode" or "-nolocal". We tried Open MPI builds made with
both the Intel compiler and GCC 4.1.2, and the same type of error
appears in both.
Here is the error message we got:
mpiexec -n 4 --loadbalance ./a.out
[n265:00912] *** Process received signal ***
[n265:00912] Signal: Segmentation fault (11)
[n265:00912] Signal code: Address not mapped (1)
[n265:00912] Failing at address: 0x50
[n265:00912] [ 0] /lib64/libpthread.so.0 [0x3e0820eb10]
[n265:00912] [ 1] /u/local/intel/11.1/openmpi/1.4.2/lib/libopen-rte.so.0(orte_util_encode_pidmap+0xcf) [0x2b9344c3f7ff]
[n265:00912] [ 2] /u/local/intel/11.1/openmpi/1.4.2/lib/libopen-rte.so.0(orte_odls_base_default_get_add_procs_data+0x3b8) [0x2b9344c5f2a8]
[n265:00912] [ 3] /u/local/intel/11.1/openmpi/1.4.2/lib/libopen-rte.so.0(orte_plm_base_launch_apps+0xd7) [0x2b9344c70b97]
[n265:00912] [ 4] /u/local/intel/11.1/openmpi/1.4.2/lib/libopen-rte.so.0 [0x2b9344c77171]
[n265:00912] [ 5] mpiexec [0x404c27]
[n265:00912] [ 6] mpiexec [0x403e38]
[n265:00912] [ 7] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3e0761d994]
[n265:00912] [ 8] mpiexec [0x403d69]
[n265:00912] *** End of error message ***
Segmentation fault
[n96:22288] [[57019,0],3] routed:binomial: Connection to lifeline [[57019,0],0] lost
Below is the info for our system:
+ OS: CentOS 5.5
+ Open MPI 1.4.2 configuration:
    ./configure --prefix=/u/local/intel/11.1/openmpi/1.4.2 \
        --with-openib=/usr --enable-static \
        CC=icc CXX=icpc F77=ifort FC=ifort --with-sge
  (The GCC build was configured similarly.)
Any ideas would be appreciated. Many thanks in advance.
Best wishes,
Qiyang