On Mon, Jan 20, 2020 at 12:20 PM Jacques Nadeau <jacq...@apache.org> wrote: > > > > > I think what we have determined is that the changes that are being > > discussed in this thread would not render any existing serialized > > Flatbuffers unreadable, unless they are malformed / unable to be > > read with the current libraries. > > > > I think we need to separate two different things: > > Point 1: If all data is populated as we expect, changing from optional to > required is a noop. > Point 2: All current Arrow code fails to work in all cases where a field is > not populated as expected.
I looked at the before/after when adding "(required)" to a field and it appears the only change on the read path is the generated verifier (which you have to explicitly invoke, and you can skip verification) https://gist.github.com/wesm/f1a9e7492b0daee07ccef0566c3900a2 This is distinct from Protobuf (I think?) because protobuf verifies the presence of required fields when parsing the protobuf. I assume it's the same in other languages but we'll have to check to be sure This means that if you _fail to invoke the verifier_, you can still follow a null pointer, but applications that use the verifier will stop there and not have to implement their own null checks. > > I think one needs to prove both points in order for this change to be a > compatible change. I agree that point 1 is proven. I don't think point 2 > has been proven. In fact, I'm not sure how one could prove it(*). The bar > for changing the format in a backwards incompatible way (assuming we can't > prove point 2) should be high given how long the specification has been > out. It doesn't feel like the benefits here outweigh the cost of changing > in an incompatible way (especially given the subjective nature of optional > vs. required). > > It's probably less of a concern for > > an in-house protocol than for an open standard like Arrow where there > > may be multiple third-party implementations around at some point. > > > > This is subjective, just like the general argument around whether required > or optional should be used in protobuf. My point in sharing was to (1) > point out that the initial implementation choices weren't done without > reason and (2) that we should avoid arguing that either direction is more > technically sound (which seemed to be the direction the argument was > taking). > > (*) One could do an exhaustive analysis of every codepath. This would work > for libraries in the Arrow project. However, the flatbuf definition is part > of the external specification meaning that other codepaths likely exist > that we could not evaluate.