Thanks for chipping in, Zoltan and Sean. I do not plan to change the
current JSON encoder. My suggestion is to make the new encoding an option
that the user can set. The default would remain the current encoding, so
nothing should change when upgrading to a newer version of Avro.

Cheers, Fokko

On Wed, 8 Jan 2020 at 21:39, Sean Busbey <bus...@apache.org> wrote:

> I agree with Zoltan here. We have a really long history of maintaining
> compatibility for encoders.
>
> On Tue, Jan 7, 2020 at 10:06 AM Zoltan Farkas <zolyfar...@yahoo.com>
> wrote:
>
>> Fokko,
>>
>> I am not sure we should be changing the existing JSON encoder.
>> I think we should just add another encoder; devs can then use either one
>> based on their use case, and we stay backward compatible.
>>
>> We should maybe standardize the content types for them. I have seen
>> application/avro being used for binary; for JSON we could have
>> application/avro+json for the current format and application/avro.2+json
>> for the new format.
>>
>> At some point in the future we could deprecate the old one…
>>
>> —Z
>>
>>
>> On Jan 7, 2020, at 2:41 AM, Driesprong, Fokko <fo...@driesprong.frl>
>> wrote:
>>
>> I would be a great fan of this as well. This also bothered me. The tricky
>> part here is to see when to release this because it will break the existing
>> JSON structure. We could make this configurable as well.
>>
>> Cheers, Fokko
>>
>> On Mon, 6 Jan 2020 at 22:36, roger peppe <rogpe...@gmail.com> wrote:
>>
>>> That's great, thanks! I thought this would probably have come up before.
>>>
>>> Have you written down your changes in a somewhat more formal
>>> specification document, by any chance?
>>>
>>>   cheers,
>>>     rog.
>>>
>>>
>>> On Mon, 6 Jan 2020, 18:50 zoly farkas, <zolyfar...@yahoo.com> wrote:
>>>
>>>> I think there is consensus that this should be implemented, see [AVRO-1582]
>>>> Json serialization of nullable fileds and fields with default values
>>>> improvement. - ASF JIRA
>>>> <https://issues.apache.org/jira/browse/AVRO-1582>
>>>>
>>>>
>>>> Here is a live example that returns sample data in Avro JSON:
>>>> https://demo.spf4j.org/example/records/1?_Accept=application/avro%2Bjson
>>>> and the same record in "natural" JSON:
>>>> https://demo.spf4j.org/example/records/1?_Accept=application/json
>>>> which uses the encoder suggested as an implementation in the JIRA.
>>>>
>>>> Somebody needs to find the time to do the work to integrate this...
>>>>
>>>> --Z
>>>>
>>>>
>>>>
>>>>
>>>> On Monday, January 6, 2020, 12:36:44 PM EST, roger peppe <
>>>> rogpe...@gmail.com> wrote:
>>>>
>>>>
>>>> Hi,
>>>>
>>>> The JSON encoding in the specification
>>>> <https://avro.apache.org/docs/current/spec.html#json_encoding> includes
>>>> an explicit type name for all kinds of object other than null. This means
>>>> that a JSON-encoded Avro value with a union is very rarely directly
>>>> compatible with normal JSON formats.
>>>>
>>>> For example, it's very common for a JSON-encoded value to allow a value
>>>> that's either null or string. In Avro, that's trivially expressed as the
>>>> union type ["null", "string"]. With conventional JSON, a string value
>>>> "foo" would be encoded just as "foo", which is easily distinguished
>>>> from null when decoding. However when using the Avro JSON format it
>>>> must be encoded as {"string": "foo"}.
>>>>
>>>> This means that Avro JSON-encoded values don't interchange easily with
>>>> other JSON-encoded values.
>>>>
>>>> AFAICS the main reason that the type name is always required in
>>>> JSON-encoded unions is to avoid ambiguity. This particularly applies to
>>>> record and map types, where it's not possible in general to tell which
>>>> member of the union has been specified by looking at the data itself.
>>>>
>>>> However, that reasoning doesn't apply if all the members of the union
>>>> can be distinguished from their JSON token type.
>>>>
>>>> I am considering using a JSON encoding that omits the type name when
>>>> all the members of the union encode to distinct JSON token types (the JSON
>>>> token types being: null, boolean, string, number, object and array).
>>>>
>>>> For example, JSON-encoded values using the Avro schema ["null",
>>>> "string", "int"] would encode as the literal values themselves (e.g.
>>>> null, "foo", 999), but JSON-encoded values using the Avro schema ["int",
>>>> "double"] would require the type name because the JSON lexeme doesn't
>>>> distinguish between different kinds of number.
>>>>
>>>> This would mean that it would be possible to represent a significant
>>>> subset of "normal" JSON schemas with Avro. It seems to me that would
>>>> potentially be very useful.
>>>>
>>>> Thoughts? Is this a really bad idea to be contemplating? :)
>>>>
>>>>   cheers,
>>>>     rog.
>>>>
>>>>
>>>>
>>
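
For reference, here is a minimal sketch of the rule rog describes in the
original message: omit the type name only when every member of the union
encodes to a distinct JSON token type. It is an illustration under
assumptions, not code from Avro or from the AVRO-1582 patch. The names
JSON_TOKEN, is_unambiguous and encode_union are hypothetical, the value is
assumed to be JSON-encoded already, and named types are identified by their
bare "name" field (namespaces are ignored).

```python
# Hypothetical helper table: the JSON token type each Avro type encodes to.
# bytes, fixed and enum are strings in Avro's JSON encoding.
JSON_TOKEN = {
    "null": "null", "boolean": "boolean",
    "int": "number", "long": "number", "float": "number", "double": "number",
    "string": "string", "bytes": "string", "enum": "string", "fixed": "string",
    "array": "array", "map": "object", "record": "object",
}


def token_type(schema):
    """Return the JSON token type for a union member schema."""
    avro_type = schema["type"] if isinstance(schema, dict) else schema
    return JSON_TOKEN[avro_type]


def is_unambiguous(union):
    """True when no two union members share a JSON token type."""
    tokens = [token_type(member) for member in union]
    return len(set(tokens)) == len(tokens)


def branch_name(schema):
    """Type name used as the wrapper key (simplified: no namespaces)."""
    if isinstance(schema, dict):
        return schema.get("name", schema["type"])
    return schema


def encode_union(union, branch, value):
    """Encode `value`, which belongs to union[branch], as a JSON-ready object."""
    member = union[branch]
    if member == "null":
        return None              # null is never wrapped, in either encoding
    if is_unambiguous(union):
        return value             # proposed encoding: the bare value
    return {branch_name(member): value}  # current encoding: {"type": value}
```

With this sketch, encode_union(["null", "string", "int"], 1, "foo") yields
the bare string "foo", while encode_union(["int", "double"], 1, 1.5) still
yields {"double": 1.5}, because both members are JSON numbers.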
