>
> We don't have any test cases that have a future metadata version. I
> made a branch where I added V6 and wrote an IPC message, then found
> that I was unable to determine that it was out of bounds (presumably
> UBSAN would error, though, but we need a runtime error outside of
> ASAN/UBSAN).

To clarify I don't think UBSAN will error on the existing generated code on
future versions. I believe we had issues with parquet because the enums did
not have an explicit type (compare [1] to [2]) .  The version check needs
to be done in our code (comparing against MAX [3]).

Does that align with your expectations?  So we don't get this for free, but
I'm not sure I understand why this is difficult?

[1]
https://github.com/apache/arrow/blob/master/cpp/src/generated/parquet_types.h#L26
[2]
https://github.com/apache/arrow/blob/master/cpp/src/generated/Schema_generated.h#L91
[3]
https://github.com/apache/arrow/blob/master/cpp/src/generated/Schema_generated.h#L109

On Mon, Jul 13, 2020 at 2:34 PM Wes McKinney <wesmck...@gmail.com> wrote:

> On Mon, Jul 13, 2020 at 4:31 PM Micah Kornfield <emkornfi...@gmail.com>
> wrote:
> >
> > >
> > >
> > > That static cast is currently undefined behavior.
> >
> > Is ubsan reporting this?  When looking into the feature enum I tried to
> > understand if that was valid. At the time I read the C++ spec* if the
> enum
> > has an explicitly declared type, all values in that types range are
> > supported.
>
> We don't have any test cases that have a future metadata version. I
> made a branch where I added V6 and wrote an IPC message, then found
> that I was unable to determine that it was out of bounds (presumably
> UBSAN would error, though, but we need a runtime error outside of
> ASAN/UBSAN).
>
> > The generated enums provide  a "max" [1] value that should be comparable
> > against.
> > <
> https://github.com/apache/arrow/blob/master/cpp/src/generated/Schema_generated.h#L109
> >
> >
> >
> > * I am not a C++ lawyer
> >
> > [1]
> >
> https://github.com/apache/arrow/blob/master/cpp/src/generated/Schema_generated.h#L109
> >
> > On Mon, Jul 13, 2020 at 2:19 PM Wes McKinney <wesmck...@gmail.com>
> wrote:
> >
> > > I've discovered while working on ARROW-9399 that it is very difficult
> > > with the Flatbuffers API in C++ to detect a MetadataVersion [1] that
> > > is higher than the current version.
> > >
> > > For example, suppose that 3 or 4 years from now we move from version
> > > V5 to version V6. The generated Flatbuffers code looks like this
> > >
> > > org::apache::arrow::flatbuf::MetadataVersion version() const {
> > >   return
> > >
> static_cast<org::apache::arrow::flatbuf::MetadataVersion>(GetField<int16_t>(VT_VERSION,
> > > 0));
> > > }
> > >
> > > That static cast is currently undefined behavior.
> > >
> > > One way to deal with this would be to add placeholder future versions
> > > (e.g. V6 and V7).
> > >
> > > Another backward-and-forward-compatible option would be to return the
> > > version as a short (int16_t) rather than the enum value, which is
> > > subject to this brittleness.
> > >
> > > Either way unfortunately I think adding forward compatibility checks
> > > is out of scope for 1.0.0 and the risk is low since we don't
> > > anticipate bumping the version anytime soon.
> > >
> > > Thanks,
> > > Wes
> > >
> > > [1]: https://github.com/apache/arrow/blob/master/format/Schema.fbs#L22
> > >
>

Reply via email to