Dana,

would it then make sense for all outside filters to use these routines? Because 
of the way Parallel Compression works internally, it uses buffers allocated via 
the H5MM_ routines to collect and scatter data, which works fine for the 
internal filters like deflate, since they use those routines as well. However, 
some of the outside filters use the raw malloc/free routines, which causes 
issues, so I'm wondering whether having all outside filters use the H5_ 
routines is the cleanest solution.
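
For illustration only (this isn't the actual Parallel Compression code), here is 
roughly what that would look like in an outside filter callback, assuming the 
public H5allocate_memory()/H5free_memory() wrappers are the "H5_ routines" in 
question; the filter name and the pass-through body are made up:

    #include <string.h>
    #include "hdf5.h"

    /* Hypothetical outside-filter callback. The replacement buffer is
     * allocated with H5allocate_memory() and the incoming buffer released
     * with H5free_memory(), so the filter and the library (including the
     * H5MM_-allocated buffers used by Parallel Compression) share one
     * allocator instead of mixing in raw malloc()/free(). */
    static size_t
    my_filter(unsigned int flags, size_t cd_nelmts,
              const unsigned int cd_values[], size_t nbytes,
              size_t *buf_size, void **buf)
    {
        size_t out_nbytes = nbytes;   /* a real filter would resize this */
        void  *out_buf    = NULL;

        (void)flags; (void)cd_nelmts; (void)cd_values;

        if (NULL == (out_buf = H5allocate_memory(out_nbytes, 0)))
            return 0;                 /* returning 0 signals filter failure */

        /* A real filter would (de)compress here; this sketch just copies. */
        memcpy(out_buf, *buf, nbytes);

        /* Free the old buffer with the matching routine, not free(). */
        H5free_memory(*buf);

        *buf      = out_buf;
        *buf_size = out_nbytes;
        return out_nbytes;
    }

The point is just that the buffer handed back to the library is allocated and 
freed with the same allocator HDF5 itself uses, rather than with a raw 
malloc/free that may belong to a different C runtime.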


Michael,


Based on the "num_writers: 4" field, the NULL "receive_requests_array" and the 
fact that for the same chunk, rank 0 shows "original owner: 0, new owner: 0" 
and rank 3 shows "original owner: 3, new_owner: 0", it seems as though everyone 
IS interested in the chunk that rank 0 is now working on. However, I'm now more 
confident that either the messages failed to send at some point or rank 0 is 
having trouble finding them.


Since the unfiltered case won't hit this particular code path, I'm not 
surprised that it succeeds. If I had to make another guess based on this, I 
would be inclined to think that rank 0 is hanging on the MPI_Mprobe due to a 
mismatch in the "tag" field. I use the index of the chunk as the tag for the 
message in order to funnel specific messages to the correct rank for the 
correct chunk during the last part of the chunk redistribution, so if rank 0 
can't match the tag it of course won't find the message. Why that might be 
happening, I'm not entirely certain at the moment.
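
Just to sketch the pattern I'm describing (this isn't the library code itself, 
and the names below are made up), the send/receive pairing looks roughly like 
this, with the chunk index used as the MPI tag on both sides; if the tags ever 
disagree, the MPI_Mprobe simply never matches and the rank looks hung:

    #include <stdlib.h>
    #include <mpi.h>

    /* Illustration of the tag-matching pattern only; not the HDF5 code.
     * Each chunk-modification message is tagged with the chunk's index so
     * the new owner can pick out the message for exactly that chunk. */
    void exchange_chunk(int my_rank, int new_owner, int chunk_index,
                        const unsigned char *send_buf, int send_len)
    {
        if (my_rank != new_owner) {
            /* Sender side: the tag is the chunk index. */
            MPI_Send(send_buf, send_len, MPI_BYTE, new_owner,
                     chunk_index /* tag */, MPI_COMM_WORLD);
        }
        else {
            /* Receiver side: matched probe for that exact tag. If the
             * sender used a different tag value, this MPI_Mprobe never
             * matches and the rank appears to hang right here. */
            MPI_Message    msg;
            MPI_Status     status;
            int            count;
            unsigned char *recv_buf;

            MPI_Mprobe(MPI_ANY_SOURCE, chunk_index /* tag */,
                       MPI_COMM_WORLD, &msg, &status);
            MPI_Get_count(&status, MPI_BYTE, &count);

            recv_buf = malloc((size_t)count);
            MPI_Mrecv(recv_buf, count, MPI_BYTE, &msg, &status);

            /* ... apply the received chunk modifications ... */

            free(recv_buf);
        }
    }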