Hi,

While debugging a mysterious crash in one of our codes, I was able to trace it down to a SIGSEGV that occurs with both OMPI 1.6 and 1.6.1. The offending code is in opal/mca/memory/linux/malloc.c. Please see the following gdb log.

(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
opal_memory_ptmalloc2_int_free (av=0x2fd0637, mem=0x203a746f74512000) at malloc.c:4385
4385        nextsize = chunksize(nextchunk);
(gdb) l
4380        Consolidate other non-mmapped chunks as they arrive.
4381      */
4382
4383      else if (!chunk_is_mmapped(p)) {
4384        nextchunk = chunk_at_offset(p, size);
4385        nextsize = chunksize(nextchunk);
4386        assert(nextsize > 0);
4387
4388        /* consolidate backward */
4389        if (!prev_inuse(p)) {
(gdb) bt
#0  opal_memory_ptmalloc2_int_free (av=0x2fd0637, mem=0x203a746f74512000) at malloc.c:4385
#1  0x00002ae6b18ea0c0 in opal_memory_ptmalloc2_free (mem=0x2fd0637) at malloc.c:3511
#2  0x00002ae6b18ea736 in opal_memory_linux_free_hook (__ptr=0x2fd0637, caller=0x203a746f74512000) at hooks.c:705
#3  0x0000000001412fcc in for_dealloc_allocatable ()
#4  0x00000000007767b1 in ALLOC::dealloc_d2 (array=@0x2fd0647, name=@0x6f6e6f69006f6e78, routine=Cannot access memory at address 0x0
) at alloc.F90:1357
#5  0x000000000082628c in M_LDAU::hubbard_term (scell=..., nua=@0xd5, na=@0xd5, isa=..., xa=..., indxua=..., maxnh=@0xcf4ff,
    maxnd=@0xcf4ff, lasto=..., iphorb=..., numd=..., listdptr=..., listd=..., numh=..., listhptr=..., listh=...,
    nspin=@0xcf4ff00000002, dscf=..., eldau=@0x0, deldau=@0x0, fa=..., stress=..., h=..., first=@0x0, last=@0x0) at ldau.F:752
#6  0x00000000006cd532 in M_SETUP_HAMILTONIAN::setup_hamiltonian (first=@0x0, last=@0x0, iscf=@0x2) at setup_hamiltonian.F:199
#7  0x000000000070e257 in M_SIESTA_FORCES::siesta_forces (istep=@0xf9a4d07000000000) at siesta_forces.F:90
#8  0x000000000070e475 in siesta () at siesta.F:23
#9  0x000000000045e47c in main ()

Can anybody shed some light here on what could be wrong?

Thanks,

Yong Qin