Hello, everyone,

We get "Segmentation fault" errors when running "mpiexec" with the "-loadbalance" option (details below). However, we have no problems when using "-bynode" or "-nolocal". We tried builds with both the Intel compiler and GCC 4.1.2, and the same type of error appears with both.

Here is the error message we got:

mpiexec -n 4 --loadbalance ./a.out

[n265:00912] *** Process received signal ***
[n265:00912] Signal: Segmentation fault (11)
[n265:00912] Signal code: Address not mapped (1)
[n265:00912] Failing at address: 0x50
[n265:00912] [ 0] /lib64/libpthread.so.0 [0x3e0820eb10]
[n265:00912] [ 1] /u/local/intel/11.1/openmpi/1.4.2/lib/libopen-rte.so.0(orte_util_encode_pidmap+0xcf) [0x2b9344c3f7ff]
[n265:00912] [ 2] /u/local/intel/11.1/openmpi/1.4.2/lib/libopen-rte.so.0(orte_odls_base_default_get_add_procs_data+0x3b8) [0x2b9344c5f2a8]
[n265:00912] [ 3] /u/local/intel/11.1/openmpi/1.4.2/lib/libopen-rte.so.0(orte_plm_base_launch_apps+0xd7) [0x2b9344c70b97]
[n265:00912] [ 4] /u/local/intel/11.1/openmpi/1.4.2/lib/libopen-rte.so.0 [0x2b9344c77171]
[n265:00912] [ 5] mpiexec [0x404c27]
[n265:00912] [ 6] mpiexec [0x403e38]
[n265:00912] [ 7] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3e0761d994]
[n265:00912] [ 8] mpiexec [0x403d69]
[n265:00912] *** End of error message ***
Segmentation fault
[n96:22288] [[57019,0],3] routed:binomial: Connection to lifeline [[57019,0],0] lost


Below is the info for our system:

+ OS: CentOS 5.5
+ OpenMPI 1.4.2 configuration:
./configure --prefix=/u/local/intel/11.1/openmpi/1.4.2  \
--with-openib=/usr --enable-static    \
CC=icc CXX=icpc F77=ifort FC=ifort  --with-sge
(The GCC build was configured similarly.)


Any ideas would be appreciated. Many thanks in advance.

Best wishes,
Qiyang

