In develop, H5MM_malloc() and H5MM_calloc() will throw an assert if size is zero. That assert should not be there, and the function docs even say that we return NULL when size is zero.
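To illustrate the documented contract (a zero-byte request yields NULL rather than tripping an assert), here is a minimal sketch of the intended behavior for an allocation wrapper. This is not the actual H5MM.c source; the wrapper name is made up for illustration.

    #include <stdlib.h>

    /* Hypothetical allocation wrapper illustrating the documented contract:
     * a zero-byte request returns NULL instead of asserting.
     * A sketch of the intended behavior, not the H5MM.c code. */
    static void *
    alloc_or_null(size_t size)
    {
        if (size == 0)
            return NULL;    /* documented behavior: no assert, just NULL */

        return malloc(size);
    }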
The bad lines are at 271 and 360 in H5MM.c if you want to try yanking those out and rebuilding.

Dana

On 11/9/17, 09:06, "Hdf-forum on behalf of Michael K. Edwards" <hdf-forum-boun...@lists.hdfgroup.org on behalf of m.k.edwa...@gmail.com> wrote:

Actually, it's not the H5Screate() that crashes; that has worked fine since HDF5 1.8.7. It's a zero-sized malloc somewhere inside the call to H5Dwrite(), possibly in the filter. I think this is close to resolution; I just have to get tools on it.

On Thu, Nov 9, 2017 at 8:47 AM, Michael K. Edwards <m.k.edwa...@gmail.com> wrote:
> Apparently this has been reported before as a problem with PETSc/HDF5 integration: https://lists.mcs.anl.gov/pipermail/petsc-users/2012-January/011980.html
>
> On Thu, Nov 9, 2017 at 8:37 AM, Michael K. Edwards <m.k.edwa...@gmail.com> wrote:
>> Thank you for the validation, and for the suggestion to use H5Sselect_none(). That is probably the right thing for the dataspace. I'm not quite sure what to do about the memspace, though; the comment is correct that we crash if any of the dimensions is zero.
>>
>> On Thu, Nov 9, 2017 at 8:34 AM, Jordan Henderson <jhender...@hdfgroup.org> wrote:
>>> It seems you're discovering the issues right as I'm typing this!
>>>
>>> I'm glad you were able to solve the hanging issue. I was starting to suspect a problem with the MPI implementation, but that is usually the last thing on the list after inspecting the code itself.
>>>
>>> As you've seen, PETSc appears to be creating a NULL dataspace for the ranks that are not contributing, instead of creating a scalar/simple dataspace on all ranks and calling H5Sselect_none() for those that don't participate. This most likely explains the assertion failure you saw in the non-filtered case, as the legacy code probably was not expecting to receive a NULL dataspace. On top of that, the NULL dataspace seems to cause the parallel operation to break collective mode, which is not allowed when filters are involved. I would need to do some research into why this happens before deciding whether it's more appropriate to modify HDF5 or to have PETSc not use NULL dataspaces.
>>>
>>> Avoiding deadlock from the final sort is an issue I have had to re-tackle a few times because of the code's complexity, but I will investigate using the chunk offset as a secondary sort key and see whether it runs into problems in any other cases. Ideally, the chunk redistribution might be updated in the future to involve all ranks in the operation instead of just rank 0, which would also allow improvements to the redistribution algorithm that may solve these problems, but for the time being this may be sufficient.
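To make the H5Sselect_none() suggestion above concrete, here is a hedged sketch of how a rank with nothing to write could still take part in a collective H5Dwrite() on a filtered dataset: keep real (non-NULL) file and memory dataspaces, select nothing in both, and use a one-element memory dataspace so no dimension is zero. The helper name write_my_part() and its arguments are illustrative, not PETSc's actual code.

    #include <hdf5.h>

    /* Hedged sketch (not PETSc's or HDF5's internals): write `count` doubles
     * from `data` into a 1-D dataset at offset `start`, collectively,
     * including on ranks where count == 0.  Requires a parallel HDF5 build. */
    static herr_t
    write_my_part(hid_t dset, hsize_t start, hsize_t count, const double *data)
    {
        double  dummy     = 0.0;                 /* placeholder buffer for empty ranks */
        hsize_t one       = 1;
        hid_t   filespace = H5Dget_space(dset);  /* the dataset's file dataspace */
        /* Real (non-NULL) memory dataspace; a one-element space avoids
         * zero-sized dimensions, and the selection below decides what is written. */
        hid_t   memspace  = H5Screate_simple(1, count > 0 ? &count : &one, NULL);
        hid_t   dxpl      = H5Pcreate(H5P_DATASET_XFER);
        herr_t  status;

        /* Writes to filtered datasets must stay collective in parallel HDF5. */
        H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

        if (count == 0) {
            /* Non-contributing rank: select nothing, but still call H5Dwrite(). */
            H5Sselect_none(filespace);
            H5Sselect_none(memspace);
        }
        else {
            H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &start, NULL, &count, NULL);
        }

        status = H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl,
                          count > 0 ? data : &dummy);

        H5Pclose(dxpl);
        H5Sclose(memspace);
        H5Sclose(filespace);
        return status;
    }

The key point, following Jordan's suggestion, is that every rank passes real dataspaces and reaches the collective H5Dwrite() call; an H5S_NULL dataspace on non-contributing ranks is what appears to break collective mode in the thread above.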