Makes sense.

We have to agree on the scope of this implementation.

Right now the implementation I have in Java handles only the

union {null, [some type]} situation.

Are we ok with this for a start?
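
For reference, a minimal sketch (in Python rather than Java, and not the actual
implementation) of what the union {null, [some type]} case amounts to. The
function name encode_nullable and its parameters are made up for illustration.
The wrapped form follows the JSON encoding in the current spec; the bare form
is the proposed one.

```python
import json


def encode_nullable(value, branch_type, natural=True):
    """Return a JSON-ready object for a value of union {null, branch_type}.

    Hypothetical helper: with natural=False the non-null value is wrapped in
    a single-key object naming its type, as the current Avro JSON encoding
    does; with natural=True the value is written bare.
    """
    if value is None:
        return None  # null is written bare in both encodings
    if natural:
        return value
    return {branch_type: value}


print(json.dumps(encode_nullable("foo", "string", natural=False)))  # {"string": "foo"}
print(json.dumps(encode_nullable("foo", "string")))                 # "foo"
print(json.dumps(encode_nullable(None, "string")))                  # null
```

Decoding stays unambiguous here because null is its own JSON token, so a
decoder can tell the two branches apart without the type name.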

Beyond that, I see two more things to handle:

1) union {string, double} (although we have to specify the behavior for NaN
and positive and negative infinity); union {string, boolean}; …

2) Make decimal a first-class Avro type. The current logical-type approach is
not natural in JSON (see https://issues.apache.org/jira/browse/AVRO-2164).
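
To illustrate item 1), a rough sketch in Python under stated assumptions:
JSON_TOKEN, can_omit_type_name and encode_double_strict are hypothetical names,
the table covers only primitive types, and the token mapping follows the JSON
encoding table in the Avro spec. It also shows why NaN and the infinities need
a decision: strict JSON has no literal for them.

```python
import json

# JSON token type produced by each Avro primitive type. All four numeric
# types end up as JSON numbers, and bytes is written as a JSON string.
JSON_TOKEN = {
    "null": "null",
    "boolean": "boolean",
    "int": "number",
    "long": "number",
    "float": "number",
    "double": "number",
    "string": "string",
    "bytes": "string",
}


def can_omit_type_name(branches):
    """True if every branch of the union maps to a different JSON token type."""
    tokens = [JSON_TOKEN[b] for b in branches]
    return len(set(tokens)) == len(tokens)


def encode_double_strict(value):
    """Encode a double as strict JSON; raises ValueError for NaN/Infinity."""
    return json.dumps(value, allow_nan=False)


print(can_omit_type_name(["string", "double"]))   # True
print(can_omit_type_name(["string", "boolean"]))  # True
print(can_omit_type_name(["int", "double"]))      # False

try:
    encode_double_strict(float("nan"))
except ValueError:
    print("NaN has no strict JSON representation")
```

If NaN were instead written as the string "NaN", it would collide with the
string branch of union {string, double}, which is why the behavior has to be
specified up front.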

For 1.9.x, 2) is probably a non-starter.

Let me know.

—Z


> On Jan 14, 2020, at 12:09 PM, roger peppe <rogpe...@gmail.com> wrote:
> 
> 
> On Tue, 14 Jan 2020 at 15:00, Zoltan Farkas <zolyfar...@yahoo.com 
> <mailto:zolyfar...@yahoo.com>> wrote:
> I can go ahead and create a PR to add the Encoder/Decoder implementations.
> Let me know if anyone else plans to do that (to avoid wasting time).
> 
> Hi,
> 
> Before you do that, would it be possible to write a specification for exactly 
> what the conventions are and publish it somewhere? There are a bunch of edge 
> cases that could be done in different ways, I think.
> 
> That way people like me that don't use Java can implement the same spec. (and 
> also it's useful to know exactly what one is implementing before diving in 
> and writing the code :])
> 
>   cheers,
>     rog.
> 
> 
> thanks
> 
> —Z
> 
>> On Jan 9, 2020, at 3:51 AM, Driesprong, Fokko <fo...@driesprong.frl 
>> <mailto:fo...@driesprong.frl>> wrote:
>> 
>> Thanks for chipping in Zoltan and Sean. I did not plan to change the current 
>> JSON encoder. My initial suggestion would make this an option that the user 
>> can set. The default will be the current situation, so nothing should change 
>> when upgrading to a newer version of Avro.
>> 
>> Cheers, Fokko
>> 
>> Op wo 8 jan. 2020 om 21:39 schreef Sean Busbey <bus...@apache.org 
>> <mailto:bus...@apache.org>>:
>> I agree with Zoltan here. We have a really long history of maintaining 
>> compatibility for encoders.
>> 
>> On Tue, Jan 7, 2020 at 10:06 AM Zoltan Farkas <zolyfar...@yahoo.com 
>> <mailto:zolyfar...@yahoo.com>> wrote:
>> Fokko, 
>> 
>> I am not sure we should be changing the existing JSON encoder.
>> I think we should just add another encoder, so devs can use either one of 
>> them based on their use case and stay backward compatible.
>> 
>> We should maybe standardize the content types for them. I have seen 
>> application/avro being used for binary; for JSON we could have
>> application/avro+json for the current format and application/avro.2+json for 
>> the new format.
>> 
>> At some point in the future we could deprecate the old one…
>> 
>> —Z
>> 
>> 
>>> On Jan 7, 2020, at 2:41 AM, Driesprong, Fokko <fo...@driesprong.frl 
>>> <mailto:fo...@driesprong.frl>> wrote:
>>> 
>>> I would be a great fan of this as well. This also bothered me. The tricky 
>>> part here is to see when to release this because it will break the existing 
>>> JSON structure. We could make this configurable as well.
>>> 
>>> Cheers, Fokko
>>> 
>>> Op ma 6 jan. 2020 om 22:36 schreef roger peppe <rogpe...@gmail.com 
>>> <mailto:rogpe...@gmail.com>>:
>>> That's great, thanks! I thought this would probably have come up before.
>>> 
>>> Have you written down your changes in a somewhat more formal specification 
>>> document, by any chance?
>>> 
>>>   cheers,
>>>     rog.
>>> 
>>> 
>>> On Mon, 6 Jan 2020, 18:50 zoly farkas, <zolyfar...@yahoo.com 
>>> <mailto:zolyfar...@yahoo.com>> wrote:
>>> I think there is consensus that this should be implemented, see [AVRO-1582] 
>>> Json serialization of nullable fileds and fields with default values 
>>> improvement. - ASF JIRA <https://issues.apache.org/jira/browse/AVRO-1582>
>>> 
>>> 
>>> Here is a live example to get some sample data in Avro JSON: 
>>> https://demo.spf4j.org/example/records/1?_Accept=application/avro%2Bjson
>>> and the "natural" JSON:
>>> https://demo.spf4j.org/example/records/1?_Accept=application/json
>>> using the encoder suggested as the implementation in the JIRA.
>>> 
>>> Somebody needs to find the time to do the work to integrate this...
>>> 
>>> --Z
>>> 
>>> 
>>> 
>>> 
>>> On Monday, January 6, 2020, 12:36:44 PM EST, roger peppe 
>>> <rogpe...@gmail.com <mailto:rogpe...@gmail.com>> wrote:
>>> 
>>> 
>>> Hi,
>>> 
>>> The JSON encoding in the specification 
>>> <https://avro.apache.org/docs/current/spec.html#json_encoding> includes an 
>>> explicit type name for all kinds of object other than null. This means that 
>>> a JSON-encoded Avro value with a union is very rarely directly compatible 
>>> with normal JSON formats.
>>> 
>>> For example, it's very common for a JSON-encoded value to allow a value 
>>> that's either null or string. In Avro, that's trivially expressed as the 
>>> union type ["null", "string"]. With conventional JSON, a string value "foo" 
>>> would be encoded just as "foo", which is easily distinguished from null 
>>> when decoding. However when using the Avro JSON format it must be encoded 
>>> as {"string": "foo"}.
>>> 
>>> This means that Avro JSON-encoded values don't interchange easily with 
>>> other JSON-encoded values.
>>> 
>>> AFAICS the main reason that the type name is always required in 
>>> JSON-encoded unions is to avoid ambiguity. This particularly applies to 
>>> record and map types, where it's not possible in general to tell which 
>>> member of the union has been specified by looking at the data itself.
>>> 
>>> However, that reasoning doesn't apply if all the members of the union can 
>>> be distinguished from their JSON token type.
>>> 
>>> I am considering using a JSON encoding that omits the type name when all 
>>> the members of the union encode to distinct JSON token types (the JSON 
>>> token types being: null, boolean, string, number, object and array).
>>> 
>>> For example, JSON-encoded values using the Avro schema ["null", "string", 
>>> "int"] would encode as the literal values themselves (e.g. null, "foo", 
>>> 999), but JSON-encoded values using the Avro schema ["int", "double"] would 
>>> require the type name because the JSON lexeme doesn't distinguish between 
>>> different kinds of number.
>>> 
>>> This would mean that it would be possible to represent a significant subset 
>>> of "normal" JSON schemas with Avro. It seems to me that would potentially 
>>> be very useful.
>>> 
>>> Thoughts? Is this a really bad idea to be contemplating? :)
>>> 
>>>   cheers,
>>>     rog.
>>> 
>>> 
>> 
> 
