On Mon, Jan 20, 2020 at 12:20 PM Jacques Nadeau <jacq...@apache.org> wrote:
>
> >
> > I think what we have determined is that the changes that are being
> > discussed in this thread would not render any existing serialized
> > Flatbuffers unreadable, unless they are malformed / unable to be
> > read with the current libraries.
> >
>
> I think we need to separate two different things:
>
> Point 1: If all data is populated as we expect, changing from optional to
> required is a noop.
> Point 2: All current Arrow code fails to work in all cases where a field is
> not populated as expected.

I looked at the before/after when adding "(required)" to a field and
it appears the only change on the read path is the generated verifier
(which you have to explicitly invoke, and you can skip verification)

https://gist.github.com/wesm/f1a9e7492b0daee07ccef0566c3900a2

This is distinct from Protobuf (I think?) because protobuf verifies
the presence of required fields when parsing the protobuf. I assume
it's the same in other languages but we'll have to check to be sure

This means that if you _fail to invoke the verifier_, you can still
follow a null pointer, but applications that use the verifier will
stop there and not have to implement their own null checks.

>
> I think one needs to prove both points in order for this change to be a
> compatible change. I agree that point 1 is proven. I don't think point 2
> has been proven. In fact, I'm not sure how one could prove it(*). The bar
> for changing the format in a backwards incompatible way (assuming we can't
> prove point 2) should be high given how long the specification has been
> out. It doesn't feel like the benefits here outweigh the cost of changing
> in an incompatible way (especially given the subjective nature of optional
> vs. required).
>
> It's probably less of a concern for
> > an in-house protocol than for an open standard like Arrow where there
> > may be multiple third-party implementations around at some point.
> >
>
> This is subjective, just like the general argument around whether required
> or optional should be used in protobuf. My point in sharing was to (1)
> point out that the initial implementation choices weren't done without
> reason and (2) that we should avoid arguing that either direction is more
> technically sound (which seemed to be the direction the argument was
> taking).
>
> (*)  One could do an exhaustive analysis of every codepath. This would work
> for libraries in the Arrow project. However, the flatbuf definition is part
> of the external specification meaning that other codepaths likely exist
> that we could not evaluate.

Reply via email to