On Tue, Jun 22, 2021 at 12:27 PM Antoine Pitrou <anto...@python.org> wrote:
> On Mon, 21 Jun 2021 23:50:29 -0400
> Ying Zhou <yzhou7...@gmail.com> wrote:
> > Hi,
> >
> > In data people use there are often bounded numbers, mostly integers with
> > clear and fixed upper and lower bounds but also decimals and floats as well
> > e.g. test scores, numerous codes in older databases, max temperature of a
> > city, latitudes, longitudes, numerous IDs etc. I wonder whether we should
> > include such types in Arrow (and more importantly in Parquet & Avro where
> > size matters a lot more).
>
> You are expressing two separate concerns here:
> 1. expressing the semantics (and perhaps enforcing them, e.g. return an
> error when an addition gives a result out of bounds)
>

I wonder if DictionaryArray could be a foundation for such semantics. It
doesn't seem unreasonable to have a check that prevents you from adding
values outside the set accepted by the dictionary. That seems workable for
small domains such as test scores and temperatures, but probably not for
types with a much larger domain of valid values, such as coordinates and
floats in general.
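To make the idea concrete, here is a rough sketch in plain Python of a dictionary-encoded column whose dictionary is fixed up front and acts as the set of accepted values. It is only a toy model of the concept: the class name BoundedDictArray and its methods are made up for illustration and are not part of Arrow or its DictionaryArray API.

```python
class BoundedDictArray:
    """Toy dictionary-encoded column with a fixed set of accepted values.

    Hypothetical class for illustration only; it is not part of Arrow.
    """

    def __init__(self, dictionary):
        # The dictionary is fixed at construction and defines the valid domain.
        self.dictionary = list(dictionary)
        self._index_of = {value: i for i, value in enumerate(self.dictionary)}
        if len(self._index_of) != len(self.dictionary):
            raise ValueError("dictionary values must be unique")
        # Each stored element is an index into the dictionary.
        self.indices = []

    def append(self, value):
        # Reject anything outside the dictionary instead of growing it.
        try:
            self.indices.append(self._index_of[value])
        except KeyError:
            raise ValueError(f"{value!r} is not in the dictionary") from None

    def to_list(self):
        # Decode the indices back into the original values.
        return [self.dictionary[i] for i in self.indices]


scores = BoundedDictArray(range(0, 101))  # test scores from 0 to 100
scores.append(87)
scores.append(100)
try:
    scores.append(101)  # outside the accepted values
except ValueError as exc:
    rejected = str(exc)
```

The dictionary here needs one entry per valid value, which is why the approach fits test scores but not coordinates or floats in general.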