That's great, thanks! I thought this would probably have come up before. Have you written down your changes in a somewhat more formal specification document, by any chance?
cheers,
  rog.

On Mon, 6 Jan 2020, 18:50 zoly farkas, <zolyfar...@yahoo.com> wrote:

> I think there is consensus that this should be implemented, see
> [AVRO-1582] Json serialization of nullable fileds and fields with default
> values improvement. - ASF JIRA
> <https://issues.apache.org/jira/browse/AVRO-1582>
>
> Here is a live example to get some sample data in avro json:
> https://demo.spf4j.org/example/records/1?_Accept=application/avro%2Bjson
> and the "Natural"
> https://demo.spf4j.org/example/records/1?_Accept=application/json using
> the encoder suggested as implementation in the jira.
>
> Somebody needs to find the time to do the work to integrate this...
>
> --Z
>
> On Monday, January 6, 2020, 12:36:44 PM EST, roger peppe <
> rogpe...@gmail.com> wrote:
>
> Hi,
>
> The JSON encoding in the specification
> <https://avro.apache.org/docs/current/spec.html#json_encoding> includes
> an explicit type name for all kinds of object other than null. This means
> that a JSON-encoded Avro value with a union is very rarely directly
> compatible with normal JSON formats.
>
> For example, it's very common for a JSON-encoded value to allow a value
> that's either null or string. In Avro, that's trivially expressed as the
> union type ["null", "string"]. With conventional JSON, a string value
> "foo" would be encoded just as "foo", which is easily distinguished from
> null when decoding. However when using the Avro JSON format it must be
> encoded as {"string": "foo"}.
>
> This means that Avro JSON-encoded values don't interchange easily with
> other JSON-encoded values.
>
> AFAICS the main reason that the type name is always required in
> JSON-encoded unions is to avoid ambiguity. This particularly applies to
> record and map types, where it's not possible in general to tell which
> member of the union has been specified by looking at the data itself.
>
> However, that reasoning doesn't apply if all the members of the union can
> be distinguished from their JSON token type.
>
> I am considering using a JSON encoding that omits the type name when all
> the members of the union encode to distinct JSON token types (the JSON
> token types being: null, boolean, string, number, object and array).
>
> For example, JSON-encoded values using the Avro schema ["null", "string",
> "int"] would encode as the literal values themselves (e.g. null, "foo",
> 999), but JSON-encoded values using the Avro schema ["int", "double"]
> would require the type name because the JSON lexeme doesn't distinguish
> between different kinds of number.
>
> This would mean that it would be possible to represent a significant
> subset of "normal" JSON schemas with Avro. It seems to me that would
> potentially be very useful.
>
> Thoughts? Is this a really bad idea to be contemplating? :)
>
> cheers,
>   rog.
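
For illustration, the rule proposed in the original message (omit the type name only when every union member encodes to a distinct JSON token type) could be sketched roughly as follows in Python. This is only a sketch under stated assumptions, not the encoder from AVRO-1582 or any Avro library API: the names TOKEN_TYPE, avro_type, branch_name, can_omit_type_name and encode_union are all hypothetical, schemas are plain parsed-JSON values, the caller says which branch is in use, and the value is assumed to be already JSON-ready (nested values are not encoded recursively).

```python
import json

# JSON token type produced by each Avro type under the spec's JSON encoding.
TOKEN_TYPE = {
    "null": "null",
    "boolean": "boolean",
    "int": "number", "long": "number", "float": "number", "double": "number",
    "string": "string", "bytes": "string", "enum": "string", "fixed": "string",
    "record": "object", "map": "object",
    "array": "array",
}


def avro_type(schema):
    """Return the Avro type of a schema given as a name or a JSON object."""
    return schema["type"] if isinstance(schema, dict) else schema


def branch_name(schema):
    """Name used to tag a union branch: the declared name for named types,
    otherwise the type name (hypothetical helper for this sketch)."""
    if isinstance(schema, dict) and "name" in schema:
        return schema["name"]
    return avro_type(schema)


def can_omit_type_name(union):
    """True when all members of the union map to distinct JSON token types."""
    tokens = [TOKEN_TYPE[avro_type(member)] for member in union]
    return len(set(tokens)) == len(tokens)


def encode_union(union, branch, value):
    """Encode `value`, belonging to `branch` of `union`, as a JSON-ready object.

    Assumption: `value` is already JSON-ready; a real encoder would recurse.
    """
    if avro_type(branch) == "null":
        return None  # null is never wrapped, in either encoding
    if can_omit_type_name(union):
        return value  # proposed "natural" form: the literal value itself
    return {branch_name(branch): value}  # existing Avro form with type name


if __name__ == "__main__":
    print(json.dumps(encode_union(["null", "string", "int"], "string", "foo")))
    print(json.dumps(encode_union(["int", "double"], "int", 999)))
```

With this sketch, ["null", "string", "int"] encodes values as bare literals, while ["int", "double"] still falls back to the tagged form because both members are JSON numbers. A union containing both a map and a record would likewise keep the type name, since both encode as JSON objects.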