On Tue, Jun 11, 2019 at 2:48 AM Wes McKinney <wesmck...@gmail.com> wrote:
> On the 1.0.0 protocol discussion, one item that we've skirted for some
> time is other decimal sizes:
>
> https://issues.apache.org/jira/browse/ARROW-2009
>
> I understand this is a loaded subject since a deliberate decision was
> made to remove types from the initial Java implementation of Arrow
> that was forked from Apache Drill. However, it's a friction point that
> has come up in a number of scenarios as many database and storage
> systems have 32- and 64-bit variants for low precision decimal data.
> As an example Apache Kudu [1] has all three types, and the Parquet
> columnar format allows not only 32/64 bit storage but fixed size
> binary (size a function of precision) and variable-length binary
> encoding [2].
>
> One of the arguments against using these types in a computational
> setting is that many mathematical operations will necessarily trigger
> an up-promotion to a larger type. It's hard for us to predict how
> people will use the Arrow format, though, and the current situation is
> forcing an up-promotion regardless of how the format is being used,
> even for simple data transport
>
> In anticipation of long-term needs, I would suggest a possible solution of:
>
> * Adding bitWidth field to Decimal table in Schema.fbs [3] with
> default value of 128
>

+1

> * Constraining bit widths to 32, 64, and 128 bits for the time being
> * Permit storage of smaller precision decimals in larger storage like
> we have now
>
> If this isn't deemed desirable by the community, decimal extension
> types could be employed for serialization-free transport for smaller
> decimals, but I view this as suboptimal.
>
> Interested in the thoughts of others.
>
> thanks
> Wes
>
> [1]:
> https://github.com/apache/kudu/blob/master/src/kudu/common/common.proto#L55
> [2]:
> https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal
> [3]: https://github.com/apache/arrow/blob/master/format/Schema.fbs#L121
>

--
Thanks and regards,
Ravindra.