> How about "at least two native implementations" instead of
> "Java and C++"? Now, we have multiple native
> implementations:

I think we should have two complete implementations. I don't think having one feature in C# and Go and another in JavaScript and Rust does justice to the project goals. I think Java and C++ should always be complete. They are the first two implementations, and I believe they are the most complete and broadly used (C++ given Python & Pandas integration and Java via Spark & Dremio). This is a compromise between setting a high barrier for creation of new features and making sure that we have validated things across implementations.

> Are there specific changes to format/ that have been merged that you
> are concerned about that you feel need to be discussed separately?
> There have been some changes related to serializing tensor metadata
> that are clearly marked as experimental, and they also do not interact
> with the columnar format.

Several things we've introduced over time have suffered from this problem: alignment changes, dictionary encoding, union behavior, interval behavior, tensors, unsigned integers, etc. We've failed to make sure we have integration tests for them. I've meant to send this email for months, but saw a couple of recent proposed changes which made me feel like we should discuss further.