On Fri, Aug 14, 2020 at 11:17 PM Micah Kornfield <emkornfi...@gmail.com> wrote: > > Hi Jacques, > > Do we have a good definition of what is necessary to add a new data type? > > Adding a type but not pulling it through most of the code seems less than > > ideal since it means one part of Arrow doesn't work with another (providing > > a less optimal end-user experience). > > I think what I proposed below is a minimum viable integration plan (and > covers previously discussed requirements for new types). It demonstrates > interop between two reference implementations and allows conversion to/from > idiomatic language analogues. So it covers the basic goal of "arrow > interop". > > > For example, would this work include making Gandiva and all the kernels > > support this new data type where appropriate? > > Not initially. There needs to be a stepping stone to start supporting new > types. I don't think it is feasible to try to land all of this > functionality in one PR. I'll lend a hand at trying get support for > built-in compute after we get the first part done.
Since (I think?) there are other data types that Gandiva already does not support, trying to use decimal256 data with Gandiva would raise the same exception that it would raise with an unsupported type. Another option would be to insert an implicit cast to decimal128 as a stopgap. > Thanks, > Micah > > > > On Fri, Aug 14, 2020 at 5:08 PM Jacques Nadeau <jacq...@apache.org> wrote: > > > Do we have a good definition of what is necessary to add a new data type? > > Adding a type but not pulling it through most of the code seems less than > > ideal since it means one part of Arrow doesn't work with another (providing > > a less optimal end-user experience). > > > > For example, would this work include making Gandiva and all the kernels > > support this new data type where appropriate? > > > > On Wed, Aug 5, 2020 at 12:01 PM Wes McKinney <wesmck...@gmail.com> wrote: > > > > > Sounds fine to me. I guess one question is what needs to be formalized > > > in the Schema.fbs files or elsewhere in the columnar format > > > documentation (and we will need to hold an associated vote for that I > > > think) > > > > > > On Mon, Aug 3, 2020 at 11:30 PM Micah Kornfield <emkornfi...@gmail.com> > > > wrote: > > > > > > > > Given no objections, we'll go ahead and start implementing support for > > > 256-bit decimals. > > > > > > > > I'm considering setting up another branch to develop all the components > > > so they can be merged to master atomically. > > > > > > > > Thanks, > > > > Micah > > > > > > > > On Tue, Jul 28, 2020 at 6:39 AM Wes McKinney <wesmck...@gmail.com> > > > wrote: > > > >> > > > >> Generally this sounds fine to me. At some point it would be good to > > > >> add 32-bit and 64-bit decimal support but this can be done in the > > > >> future. > > > >> > > > >> On Tue, Jul 28, 2020 at 7:28 AM Fan Liya <liya.fa...@gmail.com> > > wrote: > > > >> > > > > >> > Hi Micah, > > > >> > > > > >> > Thanks for opening the discussion. > > > >> > I am aware of some scenarios where decimal requires more than 16 > > > bytes, so > > > >> > I think it would be beneficial to support this in Arrow. > > > >> > > > > >> > Best, > > > >> > Liya Fan > > > >> > > > > >> > > > > >> > On Tue, Jul 28, 2020 at 11:12 AM Micah Kornfield < > > > emkornfi...@gmail.com> > > > >> > wrote: > > > >> > > > > >> > > Hi Arrow Dev, > > > >> > > ZetaSQL (Google's open source standard SQL library) recently > > > introduced a > > > >> > > BigNumeric [1] type which requires a 256 bit width to properly > > > support it. > > > >> > > I'd like to add support (possibly in collaboration with some of my > > > >> > > colleagues) to add support for 256 bit width Decimals in Arrow to > > > support a > > > >> > > type corresponding to BigNumeric. > > > >> > > > > > >> > > In past discussions on this, I don't think we established a > > minimum > > > bar for > > > >> > > supporting additional bit-widths within Arrow. > > > >> > > > > > >> > > I'd like to propose the following requirements: > > > >> > > 1. A vote agreeing on adding support for a new bitwidth (we can > > > discuss > > > >> > > any objections here). > > > >> > > 2. Support in Java and C++ for integration tests verifying the > > > ability to > > > >> > > round-trip the value. > > > >> > > 3. Support in Java for conversion to/from BigDecimal [2] > > > >> > > 4. Support in Python converting to/from Decimal [3] > > > >> > > > > > >> > > Is there anything else that people feel like is a requirement for > > > basic > > > >> > > support of an additional bit width for Decimal's? > > > >> > > > > > >> > > Thanks, > > > >> > > Micah > > > >> > > > > > >> > > > > > >> > > [1] > > > >> > > > > > >> > > > > > > > https://github.com/google/zetasql/blob/1aefaa7c62fc7a50def879bb7c4225ec6974b7ef/zetasql/public/numeric_value.h#L486 > > > >> > > [2] > > > https://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html > > > >> > > [3] https://docs.python.org/3/library/decimal.html > > > >> > > > > > > >