That's certainly an option, too.

On Tue, Jul 2, 2019 at 9:40 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> Hi Wes,
> Just a question: I'm OK going either way on this, but why not add a new
> variable-width decimal type and deprecate the old one instead of breaking
> forward compatibility?
>
> Thanks,
> Micah
>
> On Tuesday, July 2, 2019, Wes McKinney <wesmck...@gmail.com> wrote:
>
> > Note that if we do make this change as described, it will probably
> > need to be accompanied by a bump in the MetadataVersion (for
> > forward-compatibility reasons; otherwise old clients won't be able to
> > distinguish one decimal type from another). But that seems prudent
> > regardless, to force an upgrade to the stable 1.x.x series of releases.
> >
> > Are there any other opinions about this? I can bring a vote about it
> > and we can decide when to actually commit a patch based on the rest of
> > the 1.0.0 timeline.
> >
> > On Tue, Jun 11, 2019 at 11:29 AM Ravindra Pindikura <ravin...@dremio.com>
> > wrote:
> > >
> > > On Tue, Jun 11, 2019 at 2:48 AM Wes McKinney <wesmck...@gmail.com>
> > > wrote:
> > >
> > > > On the 1.0.0 protocol discussion, one item that we've skirted for some
> > > > time is other decimal sizes:
> > > >
> > > > https://issues.apache.org/jira/browse/ARROW-2009
> > > >
> > > > I understand this is a loaded subject since a deliberate decision was
> > > > made to remove types from the initial Java implementation of Arrow
> > > > that was forked from Apache Drill. However, it's a friction point that
> > > > has come up in a number of scenarios as many database and storage
> > > > systems have 32- and 64-bit variants for low precision decimal data.
> > > > As an example Apache Kudu [1] has all three types, and the Parquet
> > > > columnar format allows not only 32/64 bit storage but fixed size
> > > > binary (size a function of precision) and variable-length binary
> > > > encoding [2].
> > > >
> > > > One of the arguments against using these types in a computational
> > > > setting is that many mathematical operations will necessarily trigger
> > > > an up-promotion to a larger type. It's hard for us to predict how
> > > > people will use the Arrow format, though, and the current situation is
> > > > forcing an up-promotion regardless of how the format is being used,
> > > > even for simple data transport.
> > > >
> > > > In anticipation of long-term needs, I would suggest a possible
> > > > solution of:
> > > >
> > > > * Adding a bitWidth field to the Decimal table in Schema.fbs [3] with
> > > > a default value of 128
> > > >
> > >
> > > +1
> > >
> > >
> > > > * Constraining bit widths to 32, 64, and 128 bits for the time being
> > > > * Permitting storage of smaller-precision decimals in larger storage,
> > > > as we do now
> > > >
> > > > If this isn't deemed desirable by the community, decimal extension
> > > > types could be employed for serialization-free transport for smaller
> > > > decimals, but I view this as suboptimal.
> > > >
> > > > Interested in the thoughts of others.
> > > >
> > > > thanks
> > > > Wes
> > > >
> > > > [1]: https://github.com/apache/kudu/blob/master/src/kudu/common/common.proto#L55
> > > > [2]: https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal
> > > > [3]: https://github.com/apache/arrow/blob/master/format/Schema.fbs#L121
> > > >
> > >
> > >
> > > --
> > > Thanks and regards,
> > > Ravindra.
> >
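
For readers following the proposal quoted above (a bitWidth field defaulting to 128, widths limited to 32, 64, and 128, and smaller precisions still allowed in larger storage), the rules could be sketched roughly as below. This is only an illustration, assuming decimals are stored as signed two's-complement integers. The helper names (max_precision, smallest_bit_width, validate_decimal) are hypothetical and are not part of Arrow or Schema.fbs; only the three bit widths and the default of 128 come from the thread.

```python
# Illustrative sketch only; none of these names exist in Arrow.
ALLOWED_BIT_WIDTHS = (32, 64, 128)   # widths proposed in the thread
DEFAULT_BIT_WIDTH = 128              # proposed default for the new field


def max_precision(bit_width: int) -> int:
    """Largest digit count p such that every p-digit decimal fits in a
    signed two's-complement integer of the given width (assumption)."""
    if bit_width not in ALLOWED_BIT_WIDTHS:
        raise ValueError(f"unsupported decimal bit width: {bit_width}")
    limit = 2 ** (bit_width - 1) - 1
    p = 0
    # Grow p while the largest (p + 1)-digit value still fits.
    while 10 ** (p + 1) - 1 <= limit:
        p += 1
    return p


def smallest_bit_width(precision: int) -> int:
    """Narrowest allowed width that can hold the given precision."""
    for width in ALLOWED_BIT_WIDTHS:
        if precision <= max_precision(width):
            return width
    raise ValueError(f"precision {precision} exceeds 128-bit storage")


def validate_decimal(precision: int, bit_width: int = DEFAULT_BIT_WIDTH) -> int:
    """Accept a small precision in wide storage; reject the reverse."""
    if precision < 1:
        raise ValueError("precision must be positive")
    if precision > max_precision(bit_width):
        raise ValueError(
            f"precision {precision} does not fit in {bit_width} bits"
        )
    return bit_width
```

Under these assumptions, max_precision gives 9, 18, and 38 digits for 32, 64, and 128 bits. validate_decimal(5) returns 128, which mirrors the point about keeping small precisions in larger storage, while validate_decimal(10, 32) raises ValueError.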
