Thanks for your replies, Mark! So what I'd like to have happen is as follows (still using the case of a uint32_t dataset in memory).
* When the dataset has values >= 2^16-1 (== 65535) that are *not* equal to 2^32-1, the dataset will just be saved to disk as a bog-standard uint32_t dataset.

* When the dataset has ONLY values < 2^16-1, except perhaps for values equal to 2^32-1, the dataset will be saved to disk as a bog-standard uint16_t dataset, where any instance of 2^32-1 in RAM gets translated to an instance of 2^16-1 on disk.

That is, in the second case, later users who look at the HDF5 file will see a 16-bit dataset and may see values of 2^16-1 ... that's fine; there is no need for those users to see 2^32-1 instead. I do a pre-check over the data and switch on the result to determine which of these cases applies.

After some more research online, I found this document:

https://support.hdfgroup.org/HDF5/doc/Supplements/dtype_conversion/Conversion.html

and I see that it says that if no "conversion exception" callback is defined, then any conversion between integers is going to be a "hard conversion", which acts just as if one wrote

    out_type out = (out_type)in;

So I think I actually don't need to do anything special to take care of the conversion: the code "uint16_t out = (uint16_t)in;" translates 2^32-1 to 2^16-1 automatically, due to the modular properties of unsigned arithmetic.

Best regards,
Kevin

On Thu, Oct 12, 2017 at 11:18 AM, Miller, Mark C. <mille...@llnl.gov> wrote:
> You know... I think my response here is a bit confused.
>
> Here's what I am getting at: an HDF5 dataset has a datatype associated with
> it. That type is determined at the time the dataset is created. The question
> is whether you want such datasets to be seen as *always* having uint32_t,
> even if they are stored on disk as 16 bits, or whether you plan to have the
> dataset's type be determined by whether or not the data does indeed fit
> in 16 bits. Obviously, in the latter case the caller winds up having to
> take some special action to select the datatype upon creating the
> dataset to write to.
> In the former case, the caller just always creates what it
> thinks are 32-bit datasets and then writes the data from memory to those
> datasets and, magic happens, and only 16-bit data is stored in the file if
> indeed the data written all fits in 16 bits.
>
> Hope that makes a tad more sense.
>
> Mark
>
> "Miller, Mark C." wrote:
>
> Hmmm. Do you care whether the HDF5 dataset's type in the file shows as
> "uint32_t", for example? Or do you simply care that you are not wasting
> space storing an array of 16-bit values using a 32-bit type?
>
> If you DO NOT CARE about the HDF5 dataset's type, my first thought would be
> to handle this as a filter instead of a type_cb callback. Have you
> considered that?
>
> · https://support.hdfgroup.org/HDF5/doc/RM/RM_H5Z.html
> · https://support.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetFilter
> · https://support.hdfgroup.org/HDF5/doc/Advanced/DynamicallyLoadedFilters/
>
> The filter could handle *both* the type-size change and the special-value
> mapping, much like any "compression" filter would.
>
> Now, consumers of the dataset's data in memory would still only ever think
> that the type in memory was a 32-bit type, but the data stored on disk would
> be 16 bits.
>
> Now, if you really do want the dataset's type to be some 16-bit type, so that
> things like h5ls, h5dump, and H5Dget_type all return a known 16-bit type, then
> yeah, probably a custom type conversion is the way to go. But note that it
> will still appear to be a *custom* type to HDF5 and not a built-in 16-bit
> type. Also, I don't think type conversion can be handled as a 'plugin' in
> the same way filters are (HDF5 will not -- at least I don't think it will --
> load custom type conversion code from some plugin), so anyone reading that
> data would also need to have linked with your implementation of that type
> conversion callback.
>
> Hope that helps.
> Mark
>
> "Hdf-forum on behalf of Kevin B. McCarty" wrote:
>
> > Hi list,
> >
> > I am doing some work that should convert integer datasets
> > automatically from a larger integer type in memory to a smaller
> > integer type on disk, when appropriate.
> >
> > To give a concrete example: I might have code that converts a uint32_t
> > dataset in memory to a uint16_t dataset on disk if it turns out that
> > the values in the in-memory dataset all can be expressed losslessly in
> > 16 bits.
> >
> > The problem is that I wish to allow for the possibility of one
> > specific value that does *not* fit in 16 bits, which however I'd like
> > to translate to a suitable 16-bit replacement value on disk. That is:
> >
> >     if (memValue == (uint32_t)(-1))
> >         diskValue = (uint16_t)(-1); /* Quietly replace all instances of
> >                                        4294967295 in RAM with 65535 on disk */
> >
> > It seems clear that in order to effect this automatic replacement, I
> > need to write a callback to be given to H5Pset_type_conv_cb() that
> > will catch overflows and instead quietly translate the
> > out-of-range value to the desired replacement value. What I don't
> > understand is what code should go in the body of the callback function
> > to do this. (Feel free to assume that the only out-of-range value
> > that might occur will be the specific value I wish to translate.)
> >
> > I've not been able to find any examples online showing how to write such a
> > callback function. Advice would be greatly appreciated!
> >
> > Thanks in advance,
> >
> > --
> > Kevin B. McCarty
> > <kmcca...@gmail.com>
> >
> > _______________________________________________
> > Hdf-forum is for HDF software users discussion.
> > Hdf-forum@lists.hdfgroup.org
> > http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
> > Twitter: https://twitter.com/hdf5

--
Kevin B. McCarty
<kmcca...@gmail.com>