> Relaxing from {128,256} to {32,64,128,256} seems a low risk
> from an integration perspective, as implementations already need to read
> the bitwidth to select the appropriate physical representation (if they
> support it).
I think there are two reasons for having implementations first:

1. Lower risk of bugs in the implementation/spec.
2. A mechanism to ensure that there is some bootstrapped coverage in
   commonly used reference implementations.

I agree, 1 is fairly low-risk.

On Mon, Mar 7, 2022 at 11:11 AM Jorge Cardoso Leitão <jorgecarlei...@gmail.com> wrote:

> +1 adding 32 and 64 bit decimals.
>
> +0 to release it without integration tests - both IPC and the C data
> interface use a variable bit width to declare the appropriate size for
> decimal types. Relaxing from {128,256} to {32,64,128,256} seems a low risk
> from an integration perspective, as implementations already need to read
> the bitwidth to select the appropriate physical representation (if they
> support it).
>
> Best,
> Jorge
>
> On Mon, Mar 7, 2022, 11:41 Antoine Pitrou <anto...@python.org> wrote:
>
> > On 03/03/2022 at 18:05, Micah Kornfield wrote:
> > > I think this makes sense to add these. Typically when adding new types,
> > > we've waited on the official vote until there are two reference
> > > implementations demonstrating compatibility.
> >
> > You are right, I had forgotten about that. Though in this case, it
> > might be argued we are just relaxing the constraints on an existing type.
> >
> > What do others think?
> >
> > Regards
> >
> > Antoine.
> >
> > > On Thu, Mar 3, 2022 at 6:55 AM Antoine Pitrou <anto...@python.org> wrote:
> > >
> > > > Hello,
> > > >
> > > > Currently, the Arrow format specification restricts the bitwidth of
> > > > decimal numbers to either 128 or 256 bits.
> > > >
> > > > However, there is interest in allowing other bitwidths, at least 32 and
> > > > 64 bits for this proposal. A 64-bit (respectively 32-bit) decimal
> > > > datatype would allow for precisions of up to 18 digits (respectively 9
> > > > digits), which are sufficient for some applications that are mainly
> > > > looking for exact computation rather than sheer precision. Obviously,
> > > > smaller datatypes are cheaper to store in memory and cheaper to run
> > > > computations on.
> > > >
> > > > For example, the Spark documentation mentions that some decimal types
> > > > may fit in a Java int (32 bits) or long (64 bits):
> > > >
> > > > https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/types/DecimalType.html
> > > >
> > > > ... and a draft PR had even been filed for initial support in the C++
> > > > implementation (https://github.com/apache/arrow/pull/8578).
> > > >
> > > > I am therefore proposing that we relax the wording in the Arrow format
> > > > specification to also allow 32- and 64-bit decimal types.
> > > >
> > > > This is a preliminary discussion to gather opinions and potential
> > > > counter-arguments against this proposal. If no strong counter-argument
> > > > emerges, we will probably run a vote in a week or two.
> > > >
> > > > Best regards
> > > >
> > > > Antoine.
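
To make the precision bounds in Antoine's proposal concrete: a signed 32-bit
integer holds unscaled values up to 2,147,483,647, so every 9-digit value fits;
a signed 64-bit integer holds values up to 9,223,372,036,854,775,807, so every
18-digit value fits. Below is a minimal standalone C++ sketch (not Arrow library
code) that derives these bounds by counting whole decimal digits:

    #include <cstdint>
    #include <iostream>

    // Number of decimal digits d such that every d-digit unscaled value
    // fits in an integer whose maximum is max_value, i.e. floor(log10(max)).
    int MaxDecimalPrecision(int64_t max_value) {
      int digits = 0;
      while (max_value >= 10) {  // count how many times we can divide by 10
        max_value /= 10;
        ++digits;
      }
      return digits;
    }

    int main() {
      std::cout << "32-bit: " << MaxDecimalPrecision(INT32_MAX) << " digits\n";  // 9
      std::cout << "64-bit: " << MaxDecimalPrecision(INT64_MAX) << " digits\n";  // 18
      return 0;
    }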
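
On Jorge's integration point: the C data interface already carries the bitwidth
in the decimal format string, "d:precision,scale[,bitWidth]", with the bitwidth
defaulting to 128 when the third field is omitted. A hypothetical parsing sketch
(plain C++, not the actual Arrow reader) illustrating why accepting 32 and 64 is
a small change for an implementation that already reads the third field:

    #include <cstdio>

    // Sketch only: parse a C data interface decimal format string of the
    // form "d:precision,scale[,bitWidth]". Returns true on a valid parse.
    bool ParseDecimalFormat(const char* fmt, int* precision, int* scale,
                            int* bit_width) {
      *bit_width = 128;  // spec default when the third field is absent
      return std::sscanf(fmt, "d:%d,%d,%d", precision, scale, bit_width) >= 2;
    }

    int main() {
      int p, s, w;
      // Under the proposed relaxation, a 32-bit decimal would arrive as e.g.:
      if (ParseDecimalFormat("d:9,2,32", &p, &s, &w)) {
        std::printf("precision=%d scale=%d bits=%d\n", p, s, w);
      }
      return 0;
    }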