Oh, by the way, here is the segfault:

[m4b-1-8:11481] *** Process received signal ***
[m4b-1-8:11481] Signal: Segmentation fault (11)
[m4b-1-8:11481] Signal code: Address not mapped (1)
[m4b-1-8:11481] Failing at address: 0x2b91c69eed
[m4b-1-8:11483] [ 0] /lib64/libpthread.so.0 [0x33e8c0de70]
[m4b-1-8:11483] [ 1] /fslhome/dhansen7/openmpi/lib/libmpi.so.0 [0x2aaaaabea7c0]
[m4b-1-8:11483] [ 2] /fslhome/dhansen7/openmpi/lib/libmpi.so.0 [0x2aaaaabea675]
[m4b-1-8:11483] [ 3] /fslhome/dhansen7/openmpi/lib/libmpi.so.0(mca_pml_ob1_send+0x2da) [0x2aaaaabeaf55]
[m4b-1-8:11483] [ 4] /fslhome/dhansen7/openmpi/lib/libmpi.so.0(MPI_Send+0x28e) [0x2aaaaab52c5a]
[m4b-1-8:11483] [ 5] /fslhome/dhansen7/compute/for_DanielHansen/replica_mpi_marylou2/Openmpi_md_twham(twham_init+0x708) [0x42a8a8]
[m4b-1-8:11483] [ 6] /fslhome/dhansen7/compute/for_DanielHansen/replica_mpi_marylou2/Openmpi_md_twham(repexch+0x73c) [0x425d5c]
[m4b-1-8:11483] [ 7] /fslhome/dhansen7/compute/for_DanielHansen/replica_mpi_marylou2/Openmpi_md_twham(main+0x855) [0x4133a5]
[m4b-1-8:11483] [ 8] /lib64/libc.so.6(__libc_start_main+0xf4) [0x33e841d8a4]
[m4b-1-8:11483] [ 9] /fslhome/dhansen7/compute/for_DanielHansen/replica_mpi_marylou2/Openmpi_md_twham [0x4040b9]
[m4b-1-8:11483] *** End of error message ***
On Fri, Oct 3, 2008 at 3:20 PM, Daniel Hansen <dhan...@byu.net> wrote:
> I have been testing some code against openmpi lately that consistently
> crashes during certain MPI function calls. The code itself does not seem
> to be the problem, as it runs just fine against mpich. I have tested it
> against openmpi 1.2.5, 1.2.6, and 1.2.7, and they all exhibit the same
> problem. Also, the problem only occurs in openmpi when running more than
> 16 processes. I have posted this stack trace to the list before, but I
> am submitting it now as a potential bug report. I need some help
> debugging it and finding out exactly what is going on in openmpi when
> the segfault occurs. Are there any suggestions on how best to do this?
> Is there an easy way to attach gdb to one of the processes? I have
> already compiled openmpi with debugging, memory profiling, etc. How can
> I best take advantage of these features?
>
> Thanks,
> Daniel Hansen
> Systems Administrator
> BYU Fulton Supercomputing Lab
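
For reference, one commonly suggested way to attach gdb to a single MPI rank
(this is a minimal sketch, not the actual Openmpi_md_twham code; the
wait_for_debugger helper is made up for illustration) is to have each rank
print its host and PID right after MPI_Init and then spin until a debugger
clears a flag:

#include <stdio.h>
#include <unistd.h>
#include <mpi.h>

/* Each rank announces where it is running, then spins until a debugger
 * clears "holding".  On the node in question, attach with
 *     gdb -p <pid>
 * then "set var holding = 0" and "continue" to let that rank proceed. */
static void wait_for_debugger(int rank)
{
    volatile int holding = 1;
    char host[256];
    gethostname(host, sizeof(host));
    printf("rank %d: pid %d on %s waiting for gdb\n",
           rank, (int)getpid(), host);
    fflush(stdout);
    while (holding)
        sleep(1);
}

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    wait_for_debugger(rank);  /* remove once the crash is understood */
    /* ... rest of the program, e.g. the calls leading into MPI_Send ... */
    MPI_Finalize();
    return 0;
}

Since the crash only shows up with more than 16 processes, it would make
sense to gate the spin on an environment variable or a specific rank rather
than holding every process at once.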