Dear list members,

I am using Open MPI 1.3.3 with OFED on an HP cluster running Red Hat Linux.
Occasionally (not always) I get a crash with the following message:

[hydra11:09312] *** Process received signal ***
[hydra11:09312] Signal: Segmentation fault (11)
[hydra11:09312] Signal code: Address not mapped (1)
[hydra11:09312] Failing at address: 0xffffffffab5f30a8
[hydra11:09312] [ 0] /lib64/libpthread.so.0 [0x3c1400e4c0]
[hydra11:09312] [ 1] /home/ipl/openmpi-1.3.3/platforms/hp/lib/libmpi.so.0(MPI_Isend+0x93) [0x2af1be45a3e3]
[hydra11:09312] [ 2] ./flow(MP_SendReal+0x60) [0x6bc993]
[hydra11:09312] [ 3] ./flow(SendRealsAlongFaceWithOffset_3D+0x4ab) [0x68ba19]
[hydra11:09312] [ 4] ./flow(MP_SendVertexArrayBlock+0x23d) [0x6891e1]
[hydra11:09312] [ 5] ./flow(MB_CommAllVertex+0x65) [0x6848ba]
[hydra11:09312] [ 6] ./flow(MB_SetupVertexArray+0xd5) [0x68c837]
[hydra11:09312] [ 7] ./flow(MB_SetupGrid+0xa8) [0x68be51]
[hydra11:09312] [ 8] ./flow(SetGrid+0x58) [0x446224]
[hydra11:09312] [ 9] ./flow(main+0x148) [0x43b728]
[hydra11:09312] [10] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3c1341d974]
[hydra11:09312] [11] ./flow(__gxx_personality_v0+0xd9) [0x429b19]
[hydra11:09312] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 6 with PID 9312 on node hydra11 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

The crash does not always appear; sometimes the application runs fine. However, it seems to occur especially when I run on more than one node. I have searched the Open MPI archive and found many error messages of the same kind, but none from version 1.3.3 and none of direct relevance. I would really appreciate comments on this.
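One detail in the trace that may be worth a look: the failing address 0xffffffffab5f30a8 has its upper 32 bits all set and bit 31 set, which is the bit pattern a 64-bit address takes on after passing through a signed 32-bit integer. Whether that is what happens here is only a guess. The sketch below reproduces the pattern in isolation; the helper names and the sample address 0x2af1ab5f30a8 are hypothetical and are not taken from flow or from Open MPI.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: simulate a 64-bit address being stored in a signed
 * 32-bit integer and widened back to 64 bits. The narrowing conversion is
 * implementation-defined in C, but behaves as two's-complement wraparound
 * on gcc/x86_64. */
uint64_t through_int32(uint64_t addr)
{
    int32_t narrow = (int32_t)(uint32_t)addr;  /* keep only the low 32 bits */
    return (uint64_t)(int64_t)narrow;          /* sign-extend back to 64 bits */
}

/* Returns 1 if bits 63..31 are all ones, i.e. the value looks like a
 * sign-extended negative 32-bit quantity. */
int looks_sign_extended(uint64_t addr)
{
    return (addr >> 31) == 0x1ffffffffULL;
}

#ifdef DEMO_MAIN
int main(void)
{
    uint64_t high = 0x00002af1ab5f30a8ULL;  /* illustrative address above 4 GB */
    uint64_t low  = 0x00000000006bc993ULL;  /* illustrative address below 2 GB */

    printf("%#llx -> %#llx\n", (unsigned long long)high,
           (unsigned long long)through_int32(high));
    printf("%#llx -> %#llx\n", (unsigned long long)low,
           (unsigned long long)through_int32(low));
    return 0;
}
#endif
```

Compiled with -DDEMO_MAIN, the first address comes back as 0xffffffffab5f30a8 while the second is unchanged. If something like this were involved, a run would fail only when the affected value happened to lie above 2 GB, which would fit a crash that shows up only some of the time.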
Below is the information requested on the Open MPI web site.

Config.log: attached (config.zip)

Open MPI was configured with a prefix and with the path to openib, and with the following compiler flags:

  setenv CC gcc
  setenv CFLAGS '-O'
  setenv CXX g++
  setenv CXXFLAGS '-O'
  setenv F77 'gfortran'
  setenv FFLAGS '-O'

ompi_info -all: attached

The application (named flow) was launched on hydra11 by

  nohup mpirun -H hydra11,hydra12 -np 8 ./flow caseC.in &

PATH and LD_LIBRARY_PATH on hydra11 and hydra12:

  PATH=/home/ipl/openmpi-1.3.3/platforms/hp/bin
  LD_LIBRARY_PATH=/home/ipl/openmpi-1.3.3/platforms/hp/lib

OpenFabrics version: 1.4
Linux: X86_64-redhat-linux/3.4.6
ibv_devinfo, hydra11: attached
ibv_devinfo, hydra12: attached
ifconfig, hydra11: attached
ifconfig, hydra12: attached
ulimit -l (hydra11): 6000000
ulimit -l (hydra12): unlimited

Furthermore, I have not specified any MCA parameters.

The application I am running (named flow) is linked from Fortran, C and C++ libraries as follows:

  /home/ipl/openmpi-1.3.3/platforms/hp/bin/mpicc -DMP -DNS3_ARCH_LINUX -DLAPACK -I/home/ipl/ns3/engine/include_forLinux -I/home/ipl/openmpi-1.3.3/platforms/hp/include -c -o user_small_3D.o user_small_3D.c
  rm -f flow
  /home/ipl/openmpi-1.3.3/platforms/hp/bin/mpicxx -o flow user_small_3D.o -L/home/ipl/ns3/engine/lib_forLinux -lns3main -lns3pars -lns3util -lns3vofl -lns3turb -lns3solv -lns3mesh -lns3diff -lns3grid -lns3line -lns3data -lns3base -lfitpack -lillusolve -lfftpack_small -lfenton -lns3air -lns3dens -lns3poro -lns3sedi -llapack_small -lblas_small -lm -lgfortran /home/ipl/ns3/engine/lib_Tecplot_forLinux/tecio64.a

Please let me know if you need more info!
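The two nodes report different locked-memory limits (6000000 on hydra11, unlimited on hydra12). Those values come from an interactive shell, and a process started remotely by mpirun does not necessarily inherit the same limit. A minimal sketch, assuming Linux with glibc, of how a process could report the limit it actually runs under; the function name memlock_limit_string is hypothetical, and the division by 1024 assumes the kilobyte units that bash's ulimit -l uses.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: write the soft RLIMIT_MEMLOCK of the calling process
 * into buf, either as "unlimited" or as a number of kilobytes.
 * Returns 0 on success, -1 if getrlimit() fails. */
int memlock_limit_string(char *buf, size_t len)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;

    if (rl.rlim_cur == RLIM_INFINITY)
        snprintf(buf, len, "unlimited");
    else
        snprintf(buf, len, "%llu", (unsigned long long)(rl.rlim_cur / 1024));
    return 0;
}

#ifdef DEMO_MAIN
int main(void)
{
    char buf[32];

    if (memlock_limit_string(buf, sizeof buf) != 0) {
        perror("getrlimit");
        return 1;
    }
    printf("max locked memory (kbytes): %s\n", buf);
    return 0;
}
#endif
```

Built with -DDEMO_MAIN and launched through the same mpirun -H hydra11,hydra12 command as flow, this would print one line per process, so the limit in effect on each node can be compared with the ulimit -l values listed above.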
Thanks in advance,
Iris Lohmann

Iris Pernille Lohmann
MSc, PhD
Ports & Offshore Technology (POT)
DHI
Agern Allé 5
DK-2970 Hørsholm
Denmark
Tel: +45 4516 9200
Direct: 45169427
i...@dhigroup.com
www.dhigroup.com
WATER * ENVIRONMENT * HEALTH
<<attachment: config.zip>>
<<attachment: ompi_info_all.zip>>
ibv_devinfo_hydra11.out
ibv_devinfo_hydra12.out
ifconfig_hydra11.out
ifconfig_hydra12.out