As with enums, I found what looks like a discrepancy between the spec
and the implementation of resolution, this time for the case that the
reader's schema is a union and the writer's is not, so we look for the
"first matching field."

The spec for this case (I call it the "reader-union case") is as
follows: "The first schema in the reader's union that matches the
writer's schema is recursively resolved against it.  If none match, an
error is signalled."

Note this says the FIRST schema that matches, not the BEST schema that
matches.  Notice also that it doesn't say what should happen if the
match is "imperfect," i.e., if it could "signal an error" on some (or
all) of its data.

Here is the code in question: https://bit.ly/2ONfl9B

*** Primitive types: on line 460 the code first looks for an _exact_
match of the schema.  Then, on line 492, it looks for a "promote-able
match".  Thus, consider these two schemas:

   Reader: ["long", "int"]
   Writer: "int"

The spec says that the "long" branch of the reader should be returned
(_FIRST_ match).  But the implementation will pick the "int" branch
(presumably the "BEST" match).  Which is right, the spec or the impl?

I think we should change the implementation to return the FIRST match:
it's perfectly functional, the code is simpler, and the spec is
simpler.  On the other hand, while it's hard to imagine a program out
there _depending_ on this behavior, strictly speaking this would not
be a backward-compatible change.  If we want "bug-for-bug"
compatibility, then the spec could be changed to say something like
"for primitive types, an exact match is preferred over a promote-able
one, but if an exact match isn't available, then the FIRST
promote-able one will be used."

(Note that the code doesn't pick the "best" promotable branch, e.g.,
in the case of reading an underlying "int", it does not prefer a
"long" over a "float", but rather takes whichever is first.  This is
why I prefer changing the implementation, because the implementation
is neither a "best" nor "first" match, but a mix of both.)


*** Record types: on lines 469-471, the the reader's and writer's
names "match exactly" (i.e.,
writer.getFullName().equals(reader.getFullName()), then the record is
considered a match, whether or not that match is "imperfect" (e.g.,
because the writer is missing a field for which the reader has no
default value).

On lines 473 to 490, if the reader's and writer's names match "kind
of" (i.e., if writer.getName().equals(reader.getName()), then it this
match is "considered", but under the following circumstances:

   STRUCTURE) If matching the fields of the two record schemas
generates certain kind of errors (e.g., if the writer is missing a
field without a default value in the reader, or if the reader tries to
"promote" a field that's a string in the writer to an int in the
reader), then this "kind of" match is rejected.  Note that no such
condition is put in place when the full-names match: if the names
match, then that is the branch that's taken no matter what deeper
errors might occur when matching the underlying record schema.

   PICK_LAST) If there is more than one "kind of" name match that
passes the STRUCTURE test, then the _LAST_ of these from the union is
picked, not the first.

In this case, I think we have no choice but to change the impl to
conform to the spec (which is to say), get rid of what the comments in
the code calls a "structure match."

I say this because doing anything based on the unqualified names of
schemas matching is unprecedented in the Avro semantics.  (If we do
this for records, why not for enum and fixed?)

And also, the "STRUCTURE" test that is being done is unprincipled.
For example, when matching two schemas, as mentioned above, this test
will fail if the the reader's schema for a field tries to "promote" a
string-valued field to an "int."  However, if the two schemas contain
a field (with the same name) that's also a record schema, and that
nested record-schema tries to "promote" a string-valued field to an
"int", then no error is flagged and this schema will pass the
STRUCTURE test.

While changing the impl in this case is also backward compatible
(fixing any bug in a library is "incompatible" to folks who depend on
that buggy behavior), anyone relying on this inconsistency in the
STRUCTURE test has a very fragile system on their hands.

Reply via email to