hi Micah -- this makes sense to me
On Fri, Sep 25, 2020 at 10:56 PM Micah Kornfield <emkornfi...@gmail.com> wrote: > > The decimal256 branch now contains sufficient implementations in Java and C++ > to pass round trip integration tests. Some of python interop is missing (but > actively being worked on). > > I'll plan on creating PR to update the specification with a corresponding > vote over the next couple of days. > > One thing Antoine brought up was if it makes sense to try to merge the > contents of Decimal256 sooner rather than later to master to avoid > accumulating a much larger PR. > > Thoughts? > > Thanks, > Micah > > On Sat, Aug 15, 2020 at 8:48 AM Wes McKinney <wesmck...@gmail.com> wrote: >> >> On Fri, Aug 14, 2020 at 11:17 PM Micah Kornfield <emkornfi...@gmail.com> >> wrote: >> > >> > Hi Jacques, >> > >> > Do we have a good definition of what is necessary to add a new data type? >> > > Adding a type but not pulling it through most of the code seems less than >> > > ideal since it means one part of Arrow doesn't work with another >> > > (providing >> > > a less optimal end-user experience). >> > >> > I think what I proposed below is a minimum viable integration plan (and >> > covers previously discussed requirements for new types). It demonstrates >> > interop between two reference implementations and allows conversion to/from >> > idiomatic language analogues. So it covers the basic goal of "arrow >> > interop". >> > >> > >> > For example, would this work include making Gandiva and all the kernels >> > > support this new data type where appropriate? >> > >> > Not initially. There needs to be a stepping stone to start supporting new >> > types. I don't think it is feasible to try to land all of this >> > functionality in one PR. I'll lend a hand at trying get support for >> > built-in compute after we get the first part done. >> >> Since (I think?) there are other data types that Gandiva already does >> not support, trying to use decimal256 data with Gandiva would raise >> the same exception that it would raise with an unsupported type. >> Another option would be to insert an implicit cast to decimal128 as a >> stopgap. >> >> > Thanks, >> > Micah >> > >> > >> > >> > On Fri, Aug 14, 2020 at 5:08 PM Jacques Nadeau <jacq...@apache.org> wrote: >> > >> > > Do we have a good definition of what is necessary to add a new data type? >> > > Adding a type but not pulling it through most of the code seems less than >> > > ideal since it means one part of Arrow doesn't work with another >> > > (providing >> > > a less optimal end-user experience). >> > > >> > > For example, would this work include making Gandiva and all the kernels >> > > support this new data type where appropriate? >> > > >> > > On Wed, Aug 5, 2020 at 12:01 PM Wes McKinney <wesmck...@gmail.com> wrote: >> > > >> > > > Sounds fine to me. I guess one question is what needs to be formalized >> > > > in the Schema.fbs files or elsewhere in the columnar format >> > > > documentation (and we will need to hold an associated vote for that I >> > > > think) >> > > > >> > > > On Mon, Aug 3, 2020 at 11:30 PM Micah Kornfield <emkornfi...@gmail.com> >> > > > wrote: >> > > > > >> > > > > Given no objections, we'll go ahead and start implementing support >> > > > > for >> > > > 256-bit decimals. >> > > > > >> > > > > I'm considering setting up another branch to develop all the >> > > > > components >> > > > so they can be merged to master atomically. >> > > > > >> > > > > Thanks, >> > > > > Micah >> > > > > >> > > > > On Tue, Jul 28, 2020 at 6:39 AM Wes McKinney <wesmck...@gmail.com> >> > > > wrote: >> > > > >> >> > > > >> Generally this sounds fine to me. At some point it would be good to >> > > > >> add 32-bit and 64-bit decimal support but this can be done in the >> > > > >> future. >> > > > >> >> > > > >> On Tue, Jul 28, 2020 at 7:28 AM Fan Liya <liya.fa...@gmail.com> >> > > wrote: >> > > > >> > >> > > > >> > Hi Micah, >> > > > >> > >> > > > >> > Thanks for opening the discussion. >> > > > >> > I am aware of some scenarios where decimal requires more than 16 >> > > > bytes, so >> > > > >> > I think it would be beneficial to support this in Arrow. >> > > > >> > >> > > > >> > Best, >> > > > >> > Liya Fan >> > > > >> > >> > > > >> > >> > > > >> > On Tue, Jul 28, 2020 at 11:12 AM Micah Kornfield < >> > > > emkornfi...@gmail.com> >> > > > >> > wrote: >> > > > >> > >> > > > >> > > Hi Arrow Dev, >> > > > >> > > ZetaSQL (Google's open source standard SQL library) recently >> > > > introduced a >> > > > >> > > BigNumeric [1] type which requires a 256 bit width to properly >> > > > support it. >> > > > >> > > I'd like to add support (possibly in collaboration with some of >> > > > >> > > my >> > > > >> > > colleagues) to add support for 256 bit width Decimals in Arrow >> > > > >> > > to >> > > > support a >> > > > >> > > type corresponding to BigNumeric. >> > > > >> > > >> > > > >> > > In past discussions on this, I don't think we established a >> > > minimum >> > > > bar for >> > > > >> > > supporting additional bit-widths within Arrow. >> > > > >> > > >> > > > >> > > I'd like to propose the following requirements: >> > > > >> > > 1. A vote agreeing on adding support for a new bitwidth (we can >> > > > discuss >> > > > >> > > any objections here). >> > > > >> > > 2. Support in Java and C++ for integration tests verifying the >> > > > ability to >> > > > >> > > round-trip the value. >> > > > >> > > 3. Support in Java for conversion to/from BigDecimal [2] >> > > > >> > > 4. Support in Python converting to/from Decimal [3] >> > > > >> > > >> > > > >> > > Is there anything else that people feel like is a requirement >> > > > >> > > for >> > > > basic >> > > > >> > > support of an additional bit width for Decimal's? >> > > > >> > > >> > > > >> > > Thanks, >> > > > >> > > Micah >> > > > >> > > >> > > > >> > > >> > > > >> > > [1] >> > > > >> > > >> > > > >> > > >> > > > >> > > https://github.com/google/zetasql/blob/1aefaa7c62fc7a50def879bb7c4225ec6974b7ef/zetasql/public/numeric_value.h#L486 >> > > > >> > > [2] >> > > > https://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html >> > > > >> > > [3] https://docs.python.org/3/library/decimal.html >> > > > >> > > >> > > > >> > >