I mostly agree! I would be very, very surprised if someone ever wrote code where the correct behaviour relied on *failing* to resolve two schemas because the spec says so!
I also think the Java implementation is doing the right and convenient
thing. I've typically used new record names for new schemas (AddressV1,
AddressV2) and just expected it to work, and it does.

That being said -- in this case, there are no interoperability concerns
where schema evolution is involved, since the resolution rules are by
nature tied to one version of one implementation. If there's value in
this part of the spec, it's because it would be *great* if all
implementations did roughly the same thing and we could point users to
this documentation!

Does anyone know how closely the other implementations follow the spec
as written? If everyone is ignoring record names with direct
record->record evolution, it would definitely be appropriate to update
the spec!

I'll take a look at the other named schemas (fixed, enum) and see if
they're consistent in this behaviour.

As it is, the change to include "the same (unqualified) name" seems
misleading if it was done for legacy Java reasons AND the name is
ignored when comparing two records...

All my best, Ryan

On Mon, Aug 26, 2019 at 7:47 PM Doug Cutting <[email protected]> wrote:
>
> This may be an example of Postel's Law, where neither the implementation
> nor the spec are wrong. An implementation is allowed to accept more than
> the strictest interpretation of the spec. Within reason, we prefer
> that folks can read data rather than get an error when trying. (We also
> want them to be able to write data which can be read by the widest range of
> implementations.)
>
> Does the likelihood of harm in quietly accepting mismatched namespaces
> exceed convenience and back-compatibility here?
>
> Cheers,
>
> Doug
>
> On Mon, Aug 26, 2019 at 2:31 AM Ryan Skraba <[email protected]> wrote:
>
> > Hello! I've been going through some code that should be cleaned up if
> > https://issues.apache.org/jira/browse/AVRO-2492 is applied (removing
> > one of the deprecated record schema constructors).
> >
> > In the meantime, I have a question about names in general. I noticed
> > in the spec:
> >
> > https://avro.apache.org/docs/1.9.0/spec.html#Schema+Resolution
> >
> > <heavily snipped>
> > * To match, one of the following must hold:
> >   - both schemas are records with the same name
> > * If both are records;
> >   - <more criteria with respect to fields>
> >
> > In 1.9.1, "the same name" was changed to "the same (unqualified) name"
> > (AVRO-2400)
> >
> > For reading records, I have definitely observed that the reader and
> > writer schema can have different top-level record names and work
> > together successfully -- implying that the name isn't taken into
> > account at all.
> >
> > Is the spec wrong, or the implementation? Is this behaviour
> > consistent across named schemas? I seem to recall that when resolving
> > a record against a union, the name is *preferred* if available.
> >
> > All my best, Ryan
> >
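
To make the three readings discussed in this thread concrete, here is a minimal Python sketch of the "do these two record schemas match?" check. It is a toy model, not code from any Avro implementation: schemas are plain dicts, and the function names and the mode labels ("full", "unqualified", "ignore") are hypothetical, chosen only to stand for the 1.9.0 wording, the 1.9.1 wording, and the behaviour observed when reading records.

```python
# Toy model only: NOT Avro's resolution code. Schemas are plain dicts and
# the mode names are hypothetical labels for the readings in this thread.

def unqualified(name):
    """Return the part of a (possibly dotted) name after the last dot."""
    return name.rsplit(".", 1)[-1]


def full_name(schema):
    """Combine namespace and name, unless the name is already dotted."""
    name = schema["name"]
    namespace = schema.get("namespace")
    if "." in name or not namespace:
        return name
    return namespace + "." + name


def records_match(writer, reader, mode):
    """Decide whether two record schemas match under one reading of the rule."""
    if writer["type"] != "record" or reader["type"] != "record":
        return False
    if mode == "full":         # "the same name", read as the full name
        return full_name(writer) == full_name(reader)
    if mode == "unqualified":  # "the same (unqualified) name"
        return unqualified(writer["name"]) == unqualified(reader["name"])
    if mode == "ignore":       # name not taken into account at all
        return True
    raise ValueError("unknown mode: " + mode)


# Hypothetical schemas, with fields left empty because only names matter here.
v1 = {"type": "record", "name": "AddressV1", "namespace": "com.example", "fields": []}
v2 = {"type": "record", "name": "AddressV2", "namespace": "com.example", "fields": []}
moved = {"type": "record", "name": "AddressV1", "namespace": "org.example", "fields": []}

for mode in ("full", "unqualified", "ignore"):
    print(mode, records_match(v1, v2, mode), records_match(v1, moved, mode))
```

Under these assumptions, the AddressV1/AddressV2 pair only matches in the "ignore" reading, while a record that merely changes namespace matches in both the "unqualified" and "ignore" readings. Field-level criteria are deliberately left out of the sketch.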
