> Relaxing from {128,256} to {32,64,128,256} seems a low risk
> from an integration perspective, as implementations already need to read
> the bitwidth to select the appropriate physical representation (if they
> support it).
I think there are two reasons for having implementations first:

1. Lower risk of bugs in the implementation/spec.
2. A mechanism to ensure that there is some bootstrapped coverage in
   commonly used reference implementations.

I agree, 1 is fairly low-risk.

On Mon, Mar 7, 2022 at 11:11 AM Jorge Cardoso Leitão <jorgecarlei...@gmail.com> wrote:

> +1 adding 32 and 64 bit decimals.
>
> +0 to release it without integration tests - both IPC and the C data
> interface use a variable bit width to declare the appropriate size for
> decimal types. Relaxing from {128,256} to {32,64,128,256} seems a low risk
> from an integration perspective, as implementations already need to read
> the bitwidth to select the appropriate physical representation (if they
> support it).
>
> Best,
> Jorge
>
> On Mon, Mar 7, 2022, 11:41 Antoine Pitrou <anto...@python.org> wrote:
>
> > On 03/03/2022 at 18:05, Micah Kornfield wrote:
> > > I think this makes sense to add these. Typically when adding new types,
> > > we've waited on the official vote until there are two reference
> > > implementations demonstrating compatibility.
> >
> > You are right, I had forgotten about that. Though in this case, it
> > might be argued we are just relaxing the constraints on an existing type.
> >
> > What do others think?
> >
> > Regards
> >
> > Antoine.
> >
> > > On Thu, Mar 3, 2022 at 6:55 AM Antoine Pitrou <anto...@python.org> wrote:
> > >
> > > > Hello,
> > > >
> > > > Currently, the Arrow format specification restricts the bitwidth of
> > > > decimal numbers to either 128 or 256 bits.
> > > >
> > > > However, there is interest in allowing other bitwidths, at least 32 and
> > > > 64 bits for this proposal. A 64-bit (respectively 32-bit) decimal
> > > > datatype would allow for precisions of up to 18 digits (respectively 9
> > > > digits), which are sufficient for some applications that are mainly
> > > > looking for exact computation rather than sheer precision. Obviously,
> > > > smaller datatypes are cheaper to store in memory and cheaper to run
> > > > computations on.
> > > >
> > > > For example, the Spark documentation mentions that some decimal types
> > > > may fit in a Java int (32 bits) or long (64 bits):
> > > >
> > > > https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/types/DecimalType.html
> > > >
> > > > ... and a draft PR had even been filed for initial support in the C++
> > > > implementation (https://github.com/apache/arrow/pull/8578).
> > > >
> > > > I am therefore proposing that we relax the wording in the Arrow format
> > > > specification to also allow 32- and 64-bit decimal types.
> > > >
> > > > This is a preliminary discussion to gather opinions and potential
> > > > counter-arguments against this proposal. If no strong counter-argument
> > > > emerges, we will probably run a vote in a week or two.
> > > >
> > > > Best regards
> > > >
> > > > Antoine.
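
To make the precision bounds in Antoine's proposal concrete: a signed 32-bit
integer holds unscaled values up to 2,147,483,647, so every 9-digit value fits;
a signed 64-bit integer holds values up to 9,223,372,036,854,775,807, so every
18-digit value fits. Below is a minimal standalone C++ sketch (not Arrow library
code) that derives these bounds by counting whole decimal digits:

    #include <cstdint>
    #include <iostream>

    // Number of decimal digits d such that every d-digit unscaled value
    // fits in an integer whose maximum is max_value, i.e. floor(log10(max)).
    int MaxDecimalPrecision(int64_t max_value) {
      int digits = 0;
      while (max_value >= 10) {  // count how many times we can divide by 10
        max_value /= 10;
        ++digits;
      }
      return digits;
    }

    int main() {
      std::cout << "32-bit: " << MaxDecimalPrecision(INT32_MAX) << " digits\n";  // 9
      std::cout << "64-bit: " << MaxDecimalPrecision(INT64_MAX) << " digits\n";  // 18
      return 0;
    }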
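
On Jorge's integration point: the C data interface already carries the bitwidth
in the decimal format string, "d:precision,scale[,bitWidth]", with the bitwidth
defaulting to 128 when the third field is omitted. A hypothetical parsing sketch
(plain C++, not the actual Arrow reader) illustrating why accepting 32 and 64 is
a small change for an implementation that already reads the third field:

    #include <cstdio>

    // Sketch only: parse a C data interface decimal format string of the
    // form "d:precision,scale[,bitWidth]". Returns true on a valid parse.
    bool ParseDecimalFormat(const char* fmt, int* precision, int* scale,
                            int* bit_width) {
      *bit_width = 128;  // spec default when the third field is absent
      return std::sscanf(fmt, "d:%d,%d,%d", precision, scale, bit_width) >= 2;
    }

    int main() {
      int p, s, w;
      // Under the proposed relaxation, a 32-bit decimal would arrive as e.g.:
      if (ParseDecimalFormat("d:9,2,32", &p, &s, &w)) {
        std::printf("precision=%d scale=%d bits=%d\n", p, s, w);
      }
      return 0;
    }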