That's great, thanks! I thought this would probably have come up before.

Have you written down your changes in a somewhat more formal specification
document, by any chance?

  cheers,
    rog.


On Mon, 6 Jan 2020, 18:50 zoly farkas, <zolyfar...@yahoo.com> wrote:

> I think there is consensus that this should be implemented; see AVRO-1582,
> "Json serialization of nullable fileds and fields with default values
> improvement": <https://issues.apache.org/jira/browse/AVRO-1582>
>
> Here is a live example that returns sample data in Avro JSON:
> https://demo.spf4j.org/example/records/1?_Accept=application/avro%2Bjson
> and the same record in the "natural" JSON form:
> https://demo.spf4j.org/example/records/1?_Accept=application/json
> The latter uses the encoder suggested as the implementation in the JIRA issue.
>
> Somebody needs to find the time to do the work to integrate this...
>
> --Z
>
>
>
>
> On Monday, January 6, 2020, 12:36:44 PM EST, roger peppe <
> rogpe...@gmail.com> wrote:
>
>
> Hi,
>
> The JSON encoding in the specification
> <https://avro.apache.org/docs/current/spec.html#json_encoding> includes
> an explicit type name for every union value other than null. This means
> that a JSON-encoded Avro value containing a union is very rarely directly
> compatible with normal JSON formats.
>
> For example, it's very common for a JSON-encoded value to allow a value
> that's either null or string. In Avro, that's trivially expressed as the
> union type ["null", "string"]. With conventional JSON, a string value
> "foo" would be encoded just as "foo", which is easily distinguished from
> null when decoding. However, when using the Avro JSON format, it must be
> encoded as {"string": "foo"}.
>
> This means that Avro JSON-encoded values don't interchange easily with
> other JSON-encoded values.
>
> AFAICS the main reason that the type name is always required in
> JSON-encoded unions is to avoid ambiguity. This particularly applies to
> record and map types, where it's not possible in general to tell which
> member of the union has been specified by looking at the data itself.
>
> However, that reasoning doesn't apply if all the members of the union can
> be distinguished by their JSON token type.
>
> I am considering using a JSON encoding that omits the type name when all
> the members of the union encode to distinct JSON token types (the JSON
> token types being: null, boolean, string, number, object and array).
>
> For example, JSON-encoded values using the Avro schema ["null", "string",
> "int"] would encode as the literal values themselves (e.g. null, "foo",
> 999), but JSON-encoded values using the Avro schema ["int", "double"]
> would require the type name because the JSON lexeme doesn't distinguish
> between different kinds of number.
>
> This would mean that it would be possible to represent a significant
> subset of "normal" JSON schemas with Avro. It seems to me that would
> potentially be very useful.
>
> Thoughts? Is this a really bad idea to be contemplating? :)
>
>   cheers,
>     rog.
>
>
>

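For reference, the rule proposed in the quoted message (omit the type name only when every member of the union encodes to a distinct JSON token type) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the encoder from AVRO-1582 or any Avro library API: the names JSON_TOKEN, token_type, is_unambiguous and encode_union are made up for this sketch, references to named types are not resolved, and the value is assumed to be JSON-ready already.

```python
import json

# JSON token type produced by each Avro type in the spec's JSON encoding.
# Assumption: union members are primitive names or {"type": ...} objects;
# references to named types (e.g. "com.example.Foo") are not resolved here.
JSON_TOKEN = {
    "null": "null",
    "boolean": "boolean",
    "int": "number", "long": "number", "float": "number", "double": "number",
    "string": "string", "bytes": "string", "enum": "string", "fixed": "string",
    "record": "object", "map": "object",
    "array": "array",
}


def token_type(member):
    """Return the JSON token type that a union member encodes to."""
    kind = member["type"] if isinstance(member, dict) else member
    if kind not in JSON_TOKEN:
        raise ValueError(f"unsupported union member: {kind!r}")
    return JSON_TOKEN[kind]


def is_unambiguous(union):
    """True when no two members of the union share a JSON token type."""
    tokens = [token_type(m) for m in union]
    return len(set(tokens)) == len(tokens)


def encode_union(union, type_name, value):
    """Encode a union value whose selected branch is type_name.

    The type-name wrapper is dropped only when the JSON token alone
    identifies the branch; otherwise the spec's {"type": value} form is kept.
    """
    if type_name == "null" or is_unambiguous(union):
        return value
    return {type_name: value}


print(json.dumps(encode_union(["null", "string", "int"], "string", "foo")))  # "foo"
print(json.dumps(encode_union(["int", "double"], "int", 999)))  # {"int": 999}
```

With this rule, ["null", "string", "int"] yields bare values, while ["int", "double"] and ["string", "bytes"] keep the wrapper because their members share a token type.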