On Wed, 15 Jan 2020 at 16:27, Zoltan Farkas <zolyfar...@yahoo.com> wrote:

> See comments in-line below:
>
> On Jan 15, 2020, at 3:42 AM, roger peppe <rogpe...@gmail.com> wrote:
>
> Oops, I left arrays out! Two other thoughts:
>
>
>    - I wonder if it might be worth hedging bets about logical types. It
>    would be nice if (for example) a `timestamp-micros` value could be encoded
>    as an RFC3339 string, so perhaps that should be allowed for, but maybe
>    that's a step too far.
>
> I think logical types should stay above the encoding/decoding…
> With timestamp-micros we could extend it to make it applicable to string
> and implement the converters, and then in json you would have something
> readable, but you would then have the same in binary and pay the
> readability cost there as well.
>

I'm not sure what you mean there. I wouldn't expect the Avro binary format
to be readable at all.
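
To make the RFC3339 idea concrete, here is a minimal sketch in Python of how a `timestamp-micros` value might map to and from a string in the JSON encoding only. The function names are made up for illustration, and it assumes UTC with a fixed six-digit fraction and a trailing "Z"; a real decoder would need to accept more RFC3339 variants.

```python
from datetime import datetime, timedelta, timezone

# timestamp-micros counts microseconds from the Unix epoch, in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_rfc3339(micros: int) -> str:
    """Render a timestamp-micros value as an RFC3339 string (UTC, 'Z' suffix)."""
    dt = EPOCH + timedelta(microseconds=micros)
    return dt.strftime("%Y-%m-%dT%H:%M:%S.%f") + "Z"


def rfc3339_to_micros(text: str) -> int:
    """Parse the exact string form produced above back into microseconds."""
    dt = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    dt = dt.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (dt - EPOCH) // timedelta(microseconds=1)
```

The binary encoding would still carry the plain long; only the JSON form would change.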

> I implemented special handling for decimal logical type in my
> encoder/decoder, but the best implementation I could do still feels like a
> hack...
>
>
>    - I wonder if there should be some indication of version so that you
>    know which JSON encoding version you're reading. Perhaps the Avro schema
>    could include a version field (maybe as part of a definition) so you know
>    which version of the spec to use when encoding/decoding. Then bet-hedging
>    wouldn't be quite as important.
>
> I think Schema needs to stay decoupled from the encoding. The same schema
> can be encoded in various ways (I have a csv encoder/decoder for example,
> https://demo.spf4j.org/example/records?_Accept=text/csv ).
> I think the right abstraction for what you are looking for is the Media
> Type(https://en.wikipedia.org/wiki/Media_type ),
> It would be helpful to “standardize” the media types for the avro
> encodings:
>

Yes, on reflection, I agree, even though not every possible medium has a
media type. For example, what if we're storing JSON data in a file? I guess
it would be up to us to store the type along with the data, as the registry
message wire format
<https://docs.confluent.io/current/schema-registry/serializer-formatter.html#wire-format>
does, for example by wrapping the entire value in another JSON object.
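
As a rough sketch of what that wrapping could look like for JSON stored in a file (Python; the `contentType` and `value` field names are invented here, not part of any spec):

```python
import json


def wrap(value, media_type: str) -> str:
    """Store the media type alongside the JSON-encoded value in one object."""
    return json.dumps({"contentType": media_type, "value": value})


def unwrap(text: str):
    """Return (media_type, value) from an envelope produced by wrap()."""
    envelope = json.loads(text)
    return envelope["contentType"], envelope["value"]
```

A reader would look at the media type first and only then decide how to interpret the inner value.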


> Here is what I mean, (with some examples where the same schema is served
> with different encodings):
>
> 1) Binary: “application/avro”
> https://demo.spf4j.org/example/records?_Accept=application/avro
> 2) Current Json: “application/avro+json”
> https://demo.spf4j.org/example/records?_Accept=application/avro-x%2Bjson
> <https://demo.spf4j.org/example/records?_Accept=application/avro+json>
> 3) New Json: “application/avro-x+json” ?
> https://demo.spf4j.org/example/records?_Accept=application/avro-x%2Bjson
> <https://demo.spf4j.org/example/records?_Accept=application/avro+json>
>

ISTM that "x" isn't a hugely descriptive qualifier there. How about
"application/avro+json.v2" ? Then it's clear what to do if we want to make
another version.
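
For illustration, a decoder could pick the encoding version straight from such a name. This is only a sketch in Python under the naming suggested above (which is a proposal, not a registered media type); the function name is hypothetical, and the unsuffixed name is assumed to mean version 1.

```python
def json_encoding_version(media_type: str) -> int:
    """Return the Avro JSON encoding version implied by a media type string."""
    # Drop any parameters (such as ";avsc=...") and normalize case.
    essence = media_type.split(";", 1)[0].strip().lower()
    base = "application/avro+json"
    if essence == base:
        return 1  # assumption: no suffix means the current encoding
    if essence.startswith(base + ".v"):
        # int() raises ValueError if the suffix is not a number.
        return int(essence[len(base) + 2:])
    raise ValueError("not an Avro JSON media type: " + media_type)
```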



> The media type including the Avro schema (as you can see in the response
> Content-Type in the headers above) can provide complete type information to
> be able to read an Avro object from a byte stream.
>
>
> application/avro-x+json;avsc="{\"type\":\"array\",\"items\":{\"$ref\":\"org.spf4j.demo:jaxrs-spf4j-demo-schema:0.8:b\"}}"
>
> In HTTP context this fits well with content negotiation, and a client can
> ask for a previous version like:
>
>
> https://demo.spf4j.org/example/records/1?_Accept=application/json;avsc=%22{\%22$ref\%22:\%22org.spf4j.demo:jaxrs-spf4j-demo-schema:0.4:b\%22}%22
> <https://demo.spf4j.org/example/records/1?_Accept=application/json;avsc=%22%7B%5C%22$ref%5C%22:%5C%22org.spf4j.demo:jaxrs-spf4j-demo-schema:0.4:b%5C%22%7D%22>
>
>

> Note on $ref: it is an extension to avsc I use to reference schemas from
> maven repos. (see
> https://github.com/zolyfarkas/jaxrs-spf4j-demo/wiki/AvroReferences if
> interested in more detail)
>

Interesting stuff. I like the idea of being able to get the server to check
the desired client encoding, although I'm somewhat wary of the potential
security implications of $ref with arbitrary URLs.

Apart from the issues you raised, does my description of the proposed
semantics seem reasonable? It could be slightly cleverer and avoid
type-name wrapping in more situations, but this seemed like a nice balance
between easy-to-explain and idiomatic-in-most-situations.

   cheers,
     rog.
