Anybody?
On Tue, Oct 23, 2018 at 11:33 PM Raymie Stata <[email protected]> wrote:
>
> While working on AVRO-2090, I noticed what is either an implementation
> bug or a specification bug in schema resolution for enumerations.
>
> The relevant code is here: https://bit.ly/2q5tsIp. This code uses the
> reader's default symbol, if it exists, in the case where the writer's
> symbol is missing from the reader's enum.
>
> Let's think about this through an example. Let's say the reader
> defines just two symbols for an enum: "alpha" and "beta", with "alpha"
> as the default. Let's say that the writer had three symbols: "alpha",
> "beta", and "gamma". The way https://bit.ly/2q5tsIp is written, if
> the reader encounters a file containing the symbol "gamma", an error
> will NOT be thrown. Instead, the reader will be told that the actual
> symbol was "alpha".
>
> Note that the Avro specification says the following about matching
> enumerations: "if the writer's symbol is not present in the reader's
> enum, then an error is signalled." This would suggest that, in the
> example just described, an error should be thrown, rather than the
> value "alpha" being returned. So either the code is wrong, or the spec is
> wrong.
>
> On a related note, the current spec says nothing about a "default"
> property for enumerations. When should this property be used? As a
> "default default" for fields? (If so, this isn't happening.) As a
> value to be used in resolution, when the writer provides a symbol
> that is not (any longer) defined? (If so, this is happening in the
> code, but the spec needs an update.) And/or should it be used in
> other circumstances?
>
> I'm willing to update docs and/or code appropriately, but can someone
> indicate the intended semantics of "default" for enums?
>
> Raymie
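
To make the two candidate behaviors concrete, here is a minimal sketch of
the example from the message above. It does not use the Avro library; the
function names and the exception class are hypothetical and only model the
two readings (spec wording vs. the behavior described for the linked code).

```python
class EnumResolutionError(Exception):
    """Hypothetical error type: the writer's symbol has no match in the reader's enum."""


def resolve_per_spec(writer_symbol, reader_symbols):
    # Reading 1 (spec wording quoted above): a writer symbol that is not
    # present in the reader's enum is an error.
    if writer_symbol in reader_symbols:
        return writer_symbol
    raise EnumResolutionError(f"symbol {writer_symbol!r} not in reader's enum")


def resolve_with_default(writer_symbol, reader_symbols, reader_default=None):
    # Reading 2 (behavior described for the linked code): fall back to the
    # reader's default symbol, if one is defined, instead of raising.
    if writer_symbol in reader_symbols:
        return writer_symbol
    if reader_default is not None:
        return reader_default
    raise EnumResolutionError(f"symbol {writer_symbol!r} not in reader's enum")


reader_symbols = ["alpha", "beta"]

# The writer wrote "gamma", which the reader does not define.
print(resolve_with_default("gamma", reader_symbols, reader_default="alpha"))

try:
    resolve_per_spec("gamma", reader_symbols)
except EnumResolutionError as exc:
    print("error:", exc)
```

Under reading 2 the reader sees "alpha" for a value that was written as
"gamma"; under reading 1 the same input raises an error. Which of the two
is intended is exactly the open question.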