>
> How about "at least two native implementations" instead of
> "Java and C++"? Now, we have multiple native
> implementations:
>

I think we should have two complete implementations. I don't think having
one feature in C# and Go and another in JavaScript and Rust does justice to
the project goals. I think Java and C++ should always be complete. They are
the first two implementations. I believe they are the most complete and
broadly used/popular (C++ given Python & Pandas integration and Java via
Spark & Dremio). This is a compromise between setting a high barrier for
the creation of new features and making sure that we have validated things
across implementations.

> Are there specific changes to format/ that have been merged that you
> are concerned about that you feel need to be discussed separately?
> There have been some changes related to serializing tensor metadata
> that are clearly marked as experimental, and they also do not interact
> with the columnar format.


Several things we've introduced over time have suffered from this problem:
alignment changes, dictionary encoding, union behavior, interval behavior,
tensors, unsigned integers, etc. In each case we failed to make sure we had
integration tests. I've meant to send this email for months, but a couple
of recently proposed changes made me feel we should discuss this further.
