Hi folks,

This issue is probably the one true "blocker" for the 1.0.0 release. Ideally, all libraries should emit V5 MetadataVersion by default. How V4 is handled depends on the willingness to implement compatibility code:

* Since V4 is backwards compatible with V5 (except for unions), libraries can read V4 and either error if they receive unions or implement compatibility code that allows unions that do not have top-level nulls (this is what Antoine did on https://github.com/apache/arrow/pull/7664)
* Libraries (if they wish) may implement an opt-in "V4 compatibility mode" that permits writing messages that do not utilize any features / changes in 1.0.0 (e.g. writing unions is not allowed)
* Integration test executables ideally would be extended to allow the intended metadata version to be set either with a command line argument or environment variable.

If this work cannot be done, then as a last resort we should make sure that un-upgraded libraries (the at-risk libraries are Go and JavaScript since they are part of the integration tests) will successfully reject V5 data generated by other libraries.

- Wes

On Thu, Jul 2, 2020 at 5:50 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
> The vote carries with 6 binding +1 votes and 2 non-binding +1
>
> On Tue, Jun 30, 2020 at 4:03 PM Sutou Kouhei <k...@clear-code.com> wrote:
> >
> > +1 (binding)
> >
> > In <CAJPUwMCsMG5pga5h=+slr97ogkk52ecr8ti-9oftwlhvgf1...@mail.gmail.com>
> >   "[VOTE] Increment MetadataVersion in Schema.fbs from V4 to V5 for 1.0.0 release" on Mon, 29 Jun 2020 16:42:45 -0500,
> >   Wes McKinney <wesmck...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > As discussed on the mailing list [1], in order to demarcate the
> > > pre-1.0.0 and post-1.0.0 worlds, and to allow the
> > > forward-compatibility-protection changes we are making to actually
> > > work (i.e. so that libraries can recognize that they have received
> > > data with a feature that they do not support), I have proposed to
> > > increment the MetadataVersion from V4 to V5.
> > > Additionally, if the
> > > union validity bitmap changes are accepted, the MetadataVersion could
> > > be used to control whether unions are permitted to be serialized or
> > > not (with V4 -- used by v0.8.0 to v0.17.1, unions would not be
> > > permitted).
> > >
> > > Since there have been no backward incompatible changes to the Arrow
> > > format since 0.8.0, this would be no different, and (aside from the
> > > union issue) libraries supporting V5 are expected to accept BOTH V4
> > > and V5 so that backward compatibility is not broken, and any
> > > serialized data from prior versions of the Arrow libraries (0.8.0
> > > onward) will continue to be readable.
> > >
> > > Implementations are recommended, but not required, to provide an
> > > optional "V4 compatibility mode" for forward compatibility
> > > (serializing data from >= 1.0.0 that needs to be readable by older
> > > libraries, e.g. Spark deployments stuck on an older Java-Arrow
> > > version). In this compatibility mode, non-forward-compatible features
> > > added in 1.0.0 and beyond would not be permitted.
> > >
> > > A PR with the changes to Schema.fbs (possibly subject to some
> > > clarifying changes to the comments) is at [2].
> > >
> > > Once the PR is merged, it will be necessary for implementations to be
> > > updated and tested as appropriate at minimum to validate that backward
> > > compatibility is preserved (i.e. V4 IPC payloads are still readable --
> > > we have some in apache/arrow-testing and can add more as needed).
> > >
> > > The vote will be open for at least 72 hours.
> > >
> > > [ ] +1 Accept addition of MetadataVersion::V5 along with its general
> > > implications above
> > > [ ] +0
> > > [ ] -1 Do not accept because...
> > >
> > > [1]:
> > > https://lists.apache.org/thread.html/r856822cc366d944b3ecdf32c2ea9b1ad8fc9d12507baa2f2840a64b6%40%3Cdev.arrow.apache.org%3E
> > > [2]: https://github.com/apache/arrow/pull/7566
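
For illustration only, the reader-side and writer-side rules discussed in this thread (accept both V4 and V5, treat unions in V4 as a special case, and offer an opt-in "V4 compatibility mode" when writing) could be sketched roughly as below. This is a minimal sketch, not code from any Arrow library: the names `check_readable`, `choose_write_version`, `allow_v4_unions`, `v4_compat` and `UnsupportedMetadataError` are hypothetical, and the `allow_v4_unions` flag merely stands in for the real compatibility check (unions without top-level nulls), which is not implemented here.

```python
from enum import IntEnum


class MetadataVersion(IntEnum):
    # Illustrative mirror of the MetadataVersion enum order (V1 .. V5).
    V1 = 0
    V2 = 1
    V3 = 2
    V4 = 3
    V5 = 4


class UnsupportedMetadataError(ValueError):
    """Hypothetical error for messages a reader cannot safely interpret."""


def check_readable(version, has_union, allow_v4_unions=False):
    """Hypothetical reader-side check: accept V4 and V5, reject the rest.

    `allow_v4_unions` is an assumed switch standing in for compatibility
    code that would accept V4 unions without top-level nulls.
    """
    if version not in (MetadataVersion.V4, MetadataVersion.V5):
        raise UnsupportedMetadataError(f"unsupported MetadataVersion: {version!r}")
    if version == MetadataVersion.V4 and has_union and not allow_v4_unions:
        raise UnsupportedMetadataError("unions in V4 messages are not supported")
    return True


def choose_write_version(has_union, v4_compat=False):
    """Hypothetical writer-side choice: V5 by default, V4 only on opt-in."""
    if not v4_compat:
        return MetadataVersion.V5
    if has_union:
        # In V4 compatibility mode, writing unions is not allowed.
        raise UnsupportedMetadataError("cannot write unions in V4 compatibility mode")
    return MetadataVersion.V4
```

With these assumed helpers, `choose_write_version(has_union=False, v4_compat=True)` returns `MetadataVersion.V4`, while `check_readable(MetadataVersion.V4, has_union=True)` raises unless the compatibility switch is enabled. An un-upgraded reader would correspond to one whose accepted set stops at V4, so that V5 data is rejected rather than misread.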