> While dedicated types are not strictly required, compute functions would
> be much easier to add for a first-class dedicated complex datatype
> rather than for an extension type.

@pitrou
This is perhaps a naive question (and admittedly, I'm not up to speed on my compute kernels), but why is this the case? For example, if we were adding a complex addition kernel, it seems we would be talking about:

    dest_scalar.real = scalar1.real + scalar2.real;
    dest_scalar.im = scalar1.im + scalar2.im;

vs.

    dest_scalar[0] = scalar1[0] + scalar2[0];
    dest_scalar[1] = scalar1[1] + scalar2[1];

On Thu, Jun 10, 2021 at 11:27 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> I'd be supportive of starting with this as a "canonical" extension
> type so that all implementations are not expected to support complex
> types -- this would encourage us to build sufficient integration e.g.
> with NumPy to get things working end-to-end with the on-wire
> representation being an extension type. We could certainly choose to
> treat the type as "first class" in the C++ library without it being
> "top level" in the Type union in Flatbuffers.
>
> I agree that the use cases are more specialized, and the fact that we
> haven't needed it until now (or at least, its absence suggests this)
> shows that this is the case.
>
> On Thu, Jun 10, 2021 at 4:17 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
> >
> > >
> > > I'm convinced now that first-class types seem to be the way to go and I'm
> > > happy to take this approach.
> >
> > I agree from an implementation effort it is simpler, but I'm still not
> > convinced that we should be adding this as a first class type. As noted in
> > the survey below it appears Complex numbers are not a core concept in many
> > general purpose coding languages and it doesn't appear to be a common type
> > in SQL systems either.
> >
> > The reason why I am being nit-picky here is I think that having a first
> > class type indicates that it should eventually be supported by all
> > reference implementations. A "well-known" extension type I think offers
> > fewer guarantees which makes it seem more suitable for niche types.
> >
> > > > I don't immediately see a Packed Struct type. Would this need to be
> > > > implemented?
> > > Not necessarily (*). But before thinking about implementation, this
> > > proposal must be accepted into the format.
> >
> >
> > Yes, this is a type that has been proposed in the past and I think handles
> > a lot of types not yet in Arrow but have been requested (e.g. IP
> > Addresses, Geo coordinates), etc.
> >
> > On Thu, Jun 10, 2021 at 1:06 AM Simon Perkins <simon.perk...@gmail.com> wrote:
> >
> > > On Wed, Jun 9, 2021 at 7:56 PM Antoine Pitrou <anto...@python.org> wrote:
> > >
> > > >
> > > > Le 09/06/2021 à 17:52, Micah Kornfield a écrit :
> > > > >
> > > > > Adding a new first-class type in Arrow requires working integration tests
> > > > > between C++ and Java libraries (once the idea is informally agreed upon)
> > > > > and then a final vote for approval. We haven't formalized extension types
> > > > > but I imagine a similar cross language requirement would be agreed upon.
> > > > > Implementation of computation wouldn't be required for adding a new type.
> > > > > Different language bindings have taken different approaches on how much
> > > > > additional computational elements are packaged in them.
> > > >
> > > > While dedicated types are not strictly required, compute functions would
> > > > be much easier to add for a first-class dedicated complex datatype
> > > > rather than for an extension type.
> > > >
> > > > Since complex numbers are quite common in some domains, and since they
> > > > are conceptually simple, IMHO it would make sense to add them to the
> > > > native Arrow datatypes (at least COMPLEX64 and COMPLEX128).
> > > >
> > >
> > > I'm convinced now that first-class types seem to be the way to go and I'm
> > > happy to take this approach.
> > > Regarding compute functions, it looks like the standard set of scalar
> > > arithmetic and reduction functionality
> > > is desirable for complex numbers:
> > > https://arrow.apache.org/docs/cpp/compute.html#
> > > Perhaps it would be better to split the addition of the Types and addition
> > > Compute functionality into separate PRs?
> > >
> > > Regarding the process for managing this PR, it sounds like a proposal must
> > > be voted on?
> > > i.e. is this proposal still in this phase
> > > http://arrow.apache.org/docs/developers/contributing.html#before-starting
> > > Regards
> > >
> > > Simon
> > >
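
To make the comparison at the top of this message concrete, here is a minimal standalone C++ sketch of the two addition variants. It does not use any Arrow APIs: `AddComplex` and `AddInterleaved` are hypothetical names, and the interleaved `[re0, im0, re1, im1, ...]` float layout is only my assumption about how a pair-based (extension-style) representation might store its values.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an Arrow API): element-wise addition where each
// value is a dedicated complex type with named real/imaginary parts.
std::vector<std::complex<float>> AddComplex(
    const std::vector<std::complex<float>>& a,
    const std::vector<std::complex<float>>& b) {
  assert(a.size() == b.size());
  std::vector<std::complex<float>> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    out[i] = a[i] + b[i];  // adds real and imaginary parts
  }
  return out;
}

// Hypothetical helper (not an Arrow API): the same addition over an
// assumed interleaved layout [re0, im0, re1, im1, ...] of plain floats.
std::vector<float> AddInterleaved(const std::vector<float>& a,
                                  const std::vector<float>& b) {
  assert(a.size() == b.size() && a.size() % 2 == 0);
  std::vector<float> out(a.size());
  for (std::size_t i = 0; i < a.size(); i += 2) {
    out[i] = a[i] + b[i];              // index 0 of the pair: real part
    out[i + 1] = a[i + 1] + b[i + 1];  // index 1 of the pair: imaginary part
  }
  return out;
}
```

For plain addition the two loops do the same arithmetic, which is what prompted my question. The sketch says nothing about how either representation would plug into the actual kernel registration or dispatch machinery.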