Makes sense. We have to agree on the scope of this implementation.
Right now the implementation I have in Java handles only the union {null, [some type]} case. Are we OK with this for a start?

Beyond that, I see two more things to handle:

1) union {string, double} (although we have to specify the behavior for NaN and for positive and negative infinity); union {string, boolean}; …
2) Make decimal an Avro first-class type. The current logical type approach is not natural in JSON (see https://issues.apache.org/jira/browse/AVRO-2164).

For 1.9.x, 2) is probably a non-starter. Let me know.

--Z

> On Jan 14, 2020, at 12:09 PM, roger peppe <rogpe...@gmail.com> wrote:
>
> On Tue, 14 Jan 2020 at 15:00, Zoltan Farkas <zolyfar...@yahoo.com> wrote:
> I can go ahead and create a PR to add the Encoder/Decoder implementations.
> Let me know if anyone else plans to do that (to avoid wasting time).
>
> Hi,
>
> Before you do that, would it be possible to write a specification for exactly
> what the conventions are and publish it somewhere? There are a bunch of edge
> cases that could be done in different ways, I think.
>
> That way people like me that don't use Java can implement the same spec. (and
> also it's useful to know exactly what one is implementing before diving in
> and writing the code :])
>
> cheers,
> rog.
>
> thanks
>
> --Z
>
>> On Jan 9, 2020, at 3:51 AM, Driesprong, Fokko <fo...@driesprong.frl> wrote:
>>
>> Thanks for chipping in Zoltan and Sean. I did not plan to change the current
>> JSON encoder. My initial suggestion would make this an option that the user
>> can set. The default will be the current situation, so nothing should change
>> when upgrading to a newer version of Avro.
>>
>> Cheers, Fokko
>>
>> On Wed, 8 Jan 2020 at 21:39, Sean Busbey <bus...@apache.org> wrote:
>> I agree with Zoltan here. We have a really long history of maintaining
>> compatibility for encoders.
>>
>> On Tue, Jan 7, 2020 at 10:06 AM Zoltan Farkas <zolyfar...@yahoo.com> wrote:
>> Fokko,
>>
>> I am not sure we should be changing the existing JSON encoder.
>> I think we should just add another encoder, and devs can use either one of
>> them based on their use case… and stay backward compatible.
>>
>> We should maybe standardize the content types for them… I have seen
>> application/avro being used for binary; we could have for JSON:
>> application/avro+json for the current format, application/avro.2+json for
>> the new format…
>>
>> At some point in the future we could deprecate the old one…
>>
>> --Z
>>
>>> On Jan 7, 2020, at 2:41 AM, Driesprong, Fokko <fo...@driesprong.frl> wrote:
>>>
>>> I would be a great fan of this as well. This also bothered me. The tricky
>>> part here is to see when to release this because it will break the existing
>>> JSON structure. We could make this configurable as well.
>>>
>>> Cheers, Fokko
>>>
>>> On Mon, 6 Jan 2020 at 22:36, roger peppe <rogpe...@gmail.com> wrote:
>>> That's great, thanks! I thought this would probably have come up before.
>>>
>>> Have you written down your changes in a somewhat more formal specification
>>> document, by any chance?
>>>
>>> cheers,
>>> rog.
>>>
>>> On Mon, 6 Jan 2020, 18:50 zoly farkas, <zolyfar...@yahoo.com> wrote:
>>> I think there is consensus that this should be implemented, see [AVRO-1582]
>>> Json serialization of nullable fileds and fields with default values
>>> improvement. - ASF JIRA <https://issues.apache.org/jira/browse/AVRO-1582>
>>>
>>> Here is a live example to get some sample data in Avro JSON:
>>> https://demo.spf4j.org/example/records/1?_Accept=application/avro%2Bjson
>>> and the "Natural"
>>> https://demo.spf4j.org/example/records/1?_Accept=application/json
>>> using the encoder suggested as implementation in the JIRA.
>>>
>>> Somebody needs to find the time to do the work to integrate this...
>>>
>>> --Z
>>>
>>> On Monday, January 6, 2020, 12:36:44 PM EST, roger peppe
>>> <rogpe...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> The JSON encoding in the specification
>>> <https://avro.apache.org/docs/current/spec.html#json_encoding> includes an
>>> explicit type name for all kinds of object other than null. This means that
>>> a JSON-encoded Avro value with a union is very rarely directly compatible
>>> with normal JSON formats.
>>>
>>> For example, it's very common for a JSON-encoded value to allow a value
>>> that's either null or string. In Avro, that's trivially expressed as the
>>> union type ["null", "string"]. With conventional JSON, a string value "foo"
>>> would be encoded just as "foo", which is easily distinguished from null
>>> when decoding. However when using the Avro JSON format it must be encoded
>>> as {"string": "foo"}.
>>>
>>> This means that Avro JSON-encoded values don't interchange easily with
>>> other JSON-encoded values.
>>>
>>> AFAICS the main reason that the type name is always required in
>>> JSON-encoded unions is to avoid ambiguity. This particularly applies to
>>> record and map types, where it's not possible in general to tell which
>>> member of the union has been specified by looking at the data itself.
>>>
>>> However, that reasoning doesn't apply if all the members of the union can
>>> be distinguished from their JSON token type.
>>>
>>> I am considering using a JSON encoding that omits the type name when all
>>> the members of the union encode to distinct JSON token types (the JSON
>>> token types being: null, boolean, string, number, object and array).
>>>
>>> For example, JSON-encoded values using the Avro schema ["null", "string",
>>> "int"] would encode as the literal values themselves (e.g. null, "foo",
>>> 999), but JSON-encoded values using the Avro schema ["int", "double"] would
>>> require the type name because the JSON lexeme doesn't distinguish between
>>> different kinds of number.
>>>
>>> This would mean that it would be possible to represent a significant subset
>>> of "normal" JSON schemas with Avro. It seems to me that would potentially
>>> be very useful.
>>>
>>> Thoughts? Is this a really bad idea to be contemplating? :)
>>>
>>> cheers,
>>> rog.
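
For anyone implementing this outside Java, here is a rough Python sketch of the rule rog describes in the original message: omit the type name only when every member of the union maps to a distinct JSON token type. It is not taken from any Avro implementation. The names TOKEN_TYPE, is_natural, encode_union and decode_union are hypothetical, and it assumes union branches are either primitive type names or inline schema objects with a "type" field. NaN and the infinities are deliberately left out, since their behavior still has to be specified.

```python
import json

# JSON token type produced by each Avro type under the spec's JSON encoding.
TOKEN_TYPE = {
    "null": "null", "boolean": "boolean",
    "int": "number", "long": "number", "float": "number", "double": "number",
    "string": "string", "bytes": "string", "enum": "string", "fixed": "string",
    "record": "object", "map": "object",
    "array": "array",
}


def branch_type(schema):
    """Avro type name of a union branch (assumes a name or an inline schema)."""
    return schema if isinstance(schema, str) else schema["type"]


def is_natural(union):
    """True if every branch maps to a different JSON token type."""
    tokens = [TOKEN_TYPE[branch_type(b)] for b in union]
    return len(set(tokens)) == len(tokens)


def token_of(value):
    """JSON token type of an already-parsed JSON value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # bool must be tested before int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    raise TypeError(f"not a JSON value: {value!r}")


def encode_union(union, branch, value):
    """Encode value for the chosen branch; wrap it only when ambiguous."""
    if branch == "null" or is_natural(union):
        return json.dumps(value)
    return json.dumps({branch: value})  # current Avro style: {"type": value}


def decode_union(union, text):
    """Return (branch name, value) for a JSON-encoded union value."""
    data = json.loads(text)
    if is_natural(union):
        token = token_of(data)
        for b in union:
            if TOKEN_TYPE[branch_type(b)] == token:
                return branch_type(b), data
        raise ValueError(f"no union branch accepts a JSON {token}")
    if data is None:
        return "null", None
    (name, value), = data.items()  # exactly one {"type": value} entry expected
    return name, value
```

With the schemas from the example above, ["null", "string", "int"] takes the unwrapped path and ["int", "double"] falls back to the wrapped form, because both of its branches are JSON numbers.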