I'm afraid we have lost our checkpoint/restart support, so we probably won't be able to address this unless the person who maintained it happens to look in at some point. The only suggestion I can make is not to enable the thread options (--enable-ft-thread and --enable-opal-multi-threads in your configure line), as thread support is weak at best.
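As a rough sketch of that suggestion, the configure line from the report can be repeated with only the two thread-related flags removed. Everything else is copied unchanged from the original message; the CONFIGURE_ARGS variable name is just for illustration, and whether this actually avoids the segfault in mca_crcp_bkmrk.so has not been verified.

```shell
# Options from the original report, minus --enable-ft-thread and
# --enable-opal-multi-threads. CONFIGURE_ARGS is an illustrative name only.
CONFIGURE_ARGS="--enable-heterogeneous --enable-cxx-exceptions --enable-shared \
--enable-orterun-prefix-by-default --enable-mpi-f90 --with-mpi-f90-size=small \
--with-ft=cr --with-blcr=/opt/blcr --with-blcr-libdir=/opt/blcr/lib"

# Print the command for review before running it.
echo "./configure $CONFIGURE_ARGS"

# Then, from the top of the Open MPI 1.6 source tree, rebuild and reinstall:
#   ./configure $CONFIGURE_ARGS && make all install
```

After reinstalling, the application would need to be rebuilt against the new installation before retrying ompi-checkpoint --term.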
On Jan 4, 2013, at 4:34 PM, William Au <au_wai_ch...@hotmail.com> wrote:

> Hi,
>
> I encountered a core dump when using ompi-checkpoint --term pid.
>
> Here is the trace:
>
> [genova:01808] *** Process received signal ***
> [genova:01808] Signal: Segmentation fault (11)
> [genova:01808] Signal code: Address not mapped (1)
> [genova:01808] Failing at address: 0x90
> [genova:01808] [ 0] /lib64/libpthread.so.0 [0x3a78a0ebe0]
> [genova:01808] [ 1] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/openmpi/mca_crcp_bkmrk.so [0x2aaaaefe110b]
> [genova:01808] [ 2] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/openmpi/mca_crcp_bkmrk.so [0x2aaaaefe4952]
> [genova:01808] [ 3] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/openmpi/mca_crcp_bkmrk.so(ompi_crcp_bkmrk_pml_ft_event+0x74e) [0x2aaaaefe5b9e]
> [genova:01808] [ 4] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/openmpi/mca_pml_crcpw.so(mca_pml_crcpw_ft_event+0x59) [0x2aaaacc1eea9]
> [genova:01808] [ 5] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/libmpi.so.1(ompi_cr_coord+0xe0) [0x2b95b29a5690]
> [genova:01808] [ 6] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/libmpi.so.1(opal_cr_inc_core_prep+0xc) [0x2b95b2a6017c]
> [genova:01808] [ 7] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/openmpi/mca_snapc_full.so [0x2aaaab7d9d15]
> [genova:01808] [ 8] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/libmpi.so.1(opal_cr_test_if_checkpoint_ready+0x52) [0x2b95b2a60282]
> [genova:01808] [ 9] /import/cad-capex2/wa156553/openmpi-1.6_x86_64_i4/lib/libmpi.so.1 [0x2b95b2a60ec1]
> [genova:01808] [10] /lib64/libpthread.so.0 [0x3a78a0677d]
> [genova:01808] [11] /lib64/libc.so.6(clone+0x6d) [0x3a77ad3c1d]
> [genova:01808] *** End of error message ***
> [genova:01807] local) Error: Unable to read state from named pipe (/tmp/opal_cr_prog_write.1808). 0
> [genova:01807] [[8178,0],0] ORTE_ERROR_LOG: Error in file snapc_full_local.c at line 1602
> [genova:01807] local) Error: Unable to read state from named pipe (/tmp/opal_cr_prog_write.1810). 0
> [genova:01807] [[8178,0],0] ORTE_ERROR_LOG: Error in file snapc_full_local.c at line 1602
> [genova:01807] local) Error: Unable to read state from named pipe (/tmp/opal_cr_prog_write.1809). 0
> [genova:01807] [[8178,0],0] ORTE_ERROR_LOG: Error in file snapc_full_local.c at line 1602
>
> I configure with the following options:
>
> ./configure --enable-heterogeneous --enable-cxx-exceptions --enable-shared
> --enable-orterun-prefix-by-default --enable-mpi-f90 --with-mpi-f90-size=small
> --with-ft=cr --with-blcr=/opt/blcr --with-blcr-libdir=/opt/blcr/lib
> --enable-ft-thread --enable-opal-multi-threads
>
> I am using openmpi 1.6.
>
> Any idea where I should look?
>
> Thanks.
>
> Regards,
>
> William
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users