Just a wild guess, but I have seen similar error messages when accidentally calling H5Aclose on a dataset handle (which should be closed with H5Dclose instead). In other words, make sure you are calling the right H5?close on each handle.
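To illustrate the point: every HDF5 handle must be closed with the close call from the same interface that opened it, and in the reverse order of opening. A minimal sketch (the file, dataset, and attribute names are made up for the example):

```c
#include <hdf5.h>

int main(void)
{
    /* open file -> dataset -> attribute (hypothetical names) */
    hid_t file = H5Fopen("out.h5", H5F_ACC_RDWR, H5P_DEFAULT);
    hid_t dset = H5Dopen(file, "/data", H5P_DEFAULT);
    hid_t attr = H5Aopen(dset, "units", H5P_DEFAULT);

    /* ... read/write ... */

    /* close in reverse order, each with its matching H5?close */
    H5Aclose(attr);   /* attribute handle: H5Aclose, NOT H5Dclose */
    H5Dclose(dset);   /* dataset handle:   H5Dclose, NOT H5Aclose */
    H5Fclose(file);   /* file handle:      H5Fclose */
    return 0;
}
```

Mixing these up can leave dangling reference counts on the file, which then surface later as "decrementing file ID failed" errors at H5Fclose time.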
Best, G.

From: Hdf-forum [mailto:[email protected]] On Behalf Of Mehmet Belgin
Sent: Friday, November 15, 2013 4:15 PM
To: [email protected]
Subject: [Hdf-forum] HDF5 1.8.10 crash with Parallel MPB

Hi Everyone,

I am trying to help one of our researchers run MPB (http://ab-initio.mit.edu/wiki/index.php/MIT_Photonic_Bands), which uses HDF5, on our clusters. For OpenMPI-based compilations, the process just hangs. With the Mvapich2 stack, HDF5 fails with the error message copied below. This only happens when using multiple cores/nodes; the code works fine in a sequential run. I could not make sense of the error messages and blindly tested a few compiler/MPI combinations, but none seem to work. I will appreciate *any* suggestions for fixing or troubleshooting this problem!

Thanks in advance,
-Mehmet

==========================
HDF5-DIAG: Error detected in HDF5 (1.8.10-patch1) MPI-process 0:
  #000: H5F.c line 2058 in H5Fclose(): decrementing file ID failed
    major: Object atom
    minor: Unable to close file
  #001: H5I.c line 1479 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object atom
    minor: Unable to decrement reference count
  #002: H5F.c line 1835 in H5F_close(): can't close file
    major: File accessability
    minor: Unable to close file
  #003: H5F.c line 1997 in H5F_try_close(): problems closing file
    major: File accessability
    minor: Unable to close file
  #004: H5F.c line 1142 in H5F_dest(): low level truncate failed
    major: File accessability
    minor: Write failed
  #005: H5FD.c line 1897 in H5FD_truncate(): driver truncate request failed
    major: Virtual File Layer
    minor: Can't update object
  #006: H5FDmpio.c line 1984 in H5FD_mpio_truncate(): MPI_File_set_size failed
    major: Internal error (too specific to document in detail)
    minor: Some MPI function failed
  #007: H5FDmpio.c line 1984 in H5FD_mpio_truncate(): Invalid argument, error stack:
    MPI_FILE_SET_SIZE(74): Inconsistent arguments to collective routine
    major: Internal error (too specific to document in detail)
    minor: MPI Error String
CHECK failure on line 400 of matrixio.c: error closing HDF file
[cli_0]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0

HDF5-DIAG: Error detected in HDF5 (1.8.10-patch1) MPI-process 1:
  #000: H5F.c line 2058 in H5Fclose(): decrementing file ID failed
    major: Object atom
    minor: Unable to close file
  #001: H5I.c line 1479 in H5I_dec_app_ref(): can't decrement ID ref count
    major: Object atom
    minor: Unable to decrement reference count
  #002: H5F.c line 1835 in H5F_close(): can't close file
    major: File accessability
    minor: Unable to close file
  #003: H5F.c line 1997 in H5F_try_close(): problems closing file
    major: File accessability
    minor: Unable to close file
  #004: H5F.c line 1142 in H5F_dest(): low level truncate failed
    major: File accessability
    minor: Write failed
  #005: H5FD.c line 1897 in H5FD_truncate(): driver truncate request failed
    major: Virtual File Layer
    minor: Can't update object
  #006: H5FDmpio.c line 1984 in H5FD_mpio_truncate(): MPI_File_set_size failed
    major: Internal error (too specific to document in detail)
    minor: Some MPI function failed
  #007: H5FDmpio.c line 1984 in H5FD_mpio_truncate(): Invalid argument, error stack:
    MPI_FILE_SET_SIZE(74): Inconsistent arguments to collective routine
    major: Internal error (too specific to document in detail)
    minor: MPI Error String
CHECK failure on line 400 of matrixio.c: error closing HDF file
[cli_1]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1
[iw-h43-29:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
[iw-h43-29:mpi_rank_1][error_sighandler] Caught error: Segmentation fault (signal 11)
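One more observation on the trace itself: the failure happens inside H5FD_mpio_truncate(), i.e. in the collective MPI_File_set_size() that parallel HDF5 issues when the file is closed. With the MPIO driver, H5Fclose() is a collective operation, so every rank that opened the file must reach it together with a consistent view of the file; a rank that aborts early, skips the close, or (as suggested above) mis-closes a handle can leave the collective call with mismatched arguments, which matches the "Inconsistent arguments to collective routine" message. A minimal sketch of the required pattern, assuming a made-up output file name:

```c
#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* every rank opens the same file with an MPI-IO file access property list */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("out.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
    H5Pclose(fapl);

    /* ... collective dataset creation and writes on all ranks ... */

    /* H5Fclose is collective under the MPIO driver: ALL ranks must call it,
       and only after closing every dataset/attribute handle on this file,
       because it triggers the collective MPI_File_set_size seen in the trace */
    H5Fclose(file);

    MPI_Finalize();
    return 0;
}
```

It may be worth checking whether MPB's matrixio.c close path is reached by all ranks in your multi-node runs, and whether the OpenMPI hang is the same mismatch showing up as a stalled collective instead of an error.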
_______________________________________________ Hdf-forum is for HDF software users discussion. [email protected] http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
