Hi, Yes, totally forgot about the statement. KIP-466 is updated.
Thank you so much John Roesler, Matthias J. Sax, Sophie Blee-Goldman for your valuable input! I hope I did not cause too much trouble :) I’ll start the vote now. Best, Daniyar Yeralin > On Jul 16, 2019, at 3:17 PM, John Roesler <j...@confluent.io> wrote: > > Hi Daniyar, > > Thanks for that update. I took a look, and I think this is in good shape. > > One note, the statement "New method public static <T> Serde<List<T>> > ListSerde() in org.apache.kafka.common.serialization.Serdes class > (infers list implementation and inner serde from config file)" is > still present in the KIP, although I do it is was removed from the PR. > > Once you remove that statement from the KIP, then I think this KIP is > ready to go up for a vote! Then, we can really review the PR in > earnest and get this thing merged. > > Thanks, > -john > > On Tue, Jul 16, 2019 at 2:05 PM Development <d...@yeralin.net> wrote: >> >> Hi, >> >> Pushed new changes under my PR: https://github.com/apache/kafka/pull/6592 >> <https://github.com/apache/kafka/pull/6592> >> >> Feel free to put any comments in there. >> >> Best, >> Daniyar Yeralin >> >>> On Jul 15, 2019, at 1:06 PM, Development <d...@yeralin.net> wrote: >>> >>> Hi John, >>> >>> I knew I was missing something. Yes, that makes sense now, I removed all >>> `listSerde()` methods, and left empty constructors instead. >>> >>> As per `CommonClientConfigs` I looked at the class, it doesn’t have any >>> properties related to serdes, and that bothers me a little. >>> >>> All properties like `default.key.serde` `default.windowed.key.serde.*` are >>> located in StreamsConfig. I don’t want to create a confusion. >>> What also doesn’t make sense to me is that `WindowedSerdes` and its >>> (de)serializers are not located in org.apache.kafka.common.serialization. I >>> guess it kind of makes sense since windowed serdes are only available for >>> kafka streams, not vice versa. >>> >>> If everyone is okay to put list properties in `CommonClientConfigs` class, >>> I’ll go ahead and do that then. >>> >>> Thank you for your input! >>> >>> Best, >>> Daniyar Yeralin >>> >>>> On Jul 15, 2019, at 11:45 AM, John Roesler <j...@confluent.io> wrote: >>>> >>>> Hi all, >>>> >>>> Regarding the placement, you might as well move the constants to >>>> `org.apache.kafka.clients.CommonClientConfigs`, so that the constants and >>>> the configs and the code are in the same module. >>>> >>>> Regarding the constructor... What Matthias said is correct: The serde, >>>> serializer, and deserializer all need to have zero-arg constructors so >>>> they can be instantiated reflectively by Kafka. However, the factory >>>> method you proposed "New method public static <T> Serde<List<T>> >>>> ListSerde()" is not a constructor, and is not required. It would be used >>>> purely from the Java interface, but has the drawbacks I listed above. This >>>> method, not the constructor, is what I proposed to remove from the KIP. >>>> >>>> Thanks, >>>> -John >>>> >>>> >>>> On Mon, Jul 15, 2019 at 10:15 AM Development <d...@yeralin.net >>>> <mailto:d...@yeralin.net>> wrote: >>>> One problem though. >>>> >>>> Since WindowedSerde (Windowed(De)Serializer) are so similar, I’m trying to >>>> mimic the implementation of my ListSerde accordingly. >>>> >>>> I created couple constants under StreamsConfig: >>>> >>>> >>>> >>>> And trying to do similar construct: >>>> final String propertyName = isKey ? >>>> StreamsConfig.DEFAULT_LIST_KEY_SERDE_INNER_CLASS : >>>> StreamsConfig.DEFAULT_LIST_VALUE_SERDE_INNER_CLASS; >>>> But then found out that StreamsConfig is not accessible from >>>> org.apache.kafka.common.serialization package while window serde >>>> (de)serializers are located under org.apache.kafka.streams.kstream package. >>>> >>>> What should I do? Should I move my classes under >>>> org.apache.kafka.streams.kstream package instead? >>>> >>>>> On Jul 15, 2019, at 10:45 AM, Development <d...@yeralin.net >>>>> <mailto:d...@yeralin.net>> wrote: >>>>> >>>>> Hi Matthias, >>>>> >>>>> Thank you for your input. >>>>> >>>>> I updated the KIP, made it a little more readable. >>>>> >>>>> I think the configuration parameters strategy is finalized then. >>>>> >>>>> Do you have any other questions/concerns regarding this KIP? >>>>> >>>>> Meanwhile I’ll start doing appropriate code changes, and commit them >>>>> under my PR. >>>>> >>>>> Best, >>>>> Daniyar Yeralin >>>>> >>>>>> On Jul 11, 2019, at 2:44 PM, Matthias J. Sax <matth...@confluent.io >>>>>> <mailto:matth...@confluent.io>> wrote: >>>>>> >>>>>> Daniyar, >>>>>> >>>>>> thanks for the update to the KIP. It's in really good shape and well >>>>>> written. >>>>>> >>>>>> About the default constructor question: >>>>>> >>>>>> All Serdes/Serializer/Deserializer classes need a default constructor to >>>>>> create them easily via reflections when specifies in a config. I >>>>>> understand that it is not super user friendly, but all existing code >>>>>> works this way. Hence, it seems best to stick with the established >>>>>> pattern. >>>>>> >>>>>> We have a similar issue with `TimeWindowedSerde` and >>>>>> `SessionWindowedSerde`, and I just recently did a PR to improve user >>>>>> experience that address the exact issue John raised. (cf >>>>>> https://github.com/apache/kafka/pull/7067 >>>>>> <https://github.com/apache/kafka/pull/7067>) >>>>>> >>>>>> Note, that if a user would instantiate the Serde manually, the user >>>>>> would also need to call `configure()` to setup the inner serdes. Kafka >>>>>> Streams would not setup those automatically and one might most likely >>>>>> end-up with an NPE. >>>>>> >>>>>> >>>>>> Coming back the KIP, and the parameter names. `WindowedSerdes` are >>>>>> similar to `ListSerde` as they wrap another Serde. For `WindowedSerdes`, >>>>>> we use the following parameter names: >>>>>> >>>>>> - default.windowed.key.serde.inner >>>>>> - default.windowed.value.serde.inner >>>>>> >>>>>> >>>>>> It might be good to align the naming pattern. I would also suggest to >>>>>> use `type` instead of `impl`? >>>>>> >>>>>> >>>>>> default.key.list.serde.impl -> default.list.key.serde.type >>>>>> default.value.list.serde.impl -> default.list.value.serde.type >>>>>> default.key.list.serde.element -> default.list.key.serde.inner >>>>>> default.value.list.serde.element -> default.list.value.serde.inner >>>>>> >>>>>> >>>>>> >>>>>> -Matthias >>>>>> >>>>>> >>>>>> On 7/10/19 8:52 AM, Development wrote: >>>>>>> Hi John, >>>>>>> >>>>>>> Yes, I do agree. That totally makes sense. The only thing is that it >>>>>>> goes against what Matthias suggested earlier: >>>>>>> "I think that ... `ListSerde` should have an default constructor and it >>>>>>> should be possible to pass in the `Class listClass` information via a >>>>>>> configuration. Otherwise, KafkaStreams cannot use it as default serde.” >>>>>>> >>>>>>> What do you think about that? I hope I’m not confusing anything. >>>>>>> >>>>>>> Best, >>>>>>> Daniyar Yeralin >>>>>>> >>>>>>>> On Jul 9, 2019, at 5:56 PM, John Roesler <j...@confluent.io >>>>>>>> <mailto:j...@confluent.io>> wrote: >>>>>>>> >>>>>>>> Ah, my apologies, I must have just overlooked it. Thanks for the >>>>>>>> update, too. >>>>>>>> >>>>>>>> Just one more super-small question, do we need this variant: >>>>>>>> >>>>>>>>> New method public static <T> Serde<List<T>> ListSerde() in >>>>>>>>> org.apache.kafka.common.serialization.Serdes class (infers list >>>>>>>>> implementation and inner serde from config file) >>>>>>>> >>>>>>>> It seems like this situation implies my config file is already set up >>>>>>>> for the list serde, so passing this serde (e.g., in Produced) would >>>>>>>> have the same effect as not specifying it. >>>>>>>> >>>>>>>> I guess that it could be the case that you have the >>>>>>>> `default.key/value.serde` set to something else, like StringSerde, but >>>>>>>> you still have the `default.key/value.list.serde.impl/element` set. >>>>>>>> This seems like it would result in more confusion than convenience, so >>>>>>>> my gut instinct is maybe we shouldn't introduce the `ListSerde()` >>>>>>>> variant until people actually request it later on. >>>>>>>> >>>>>>>> Thus, we'd just stick with fully config-driven or fully >>>>>>>> source-code-driven, not half/half. >>>>>>>> >>>>>>>> What do you think? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> -John >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Jul 9, 2019 at 9:58 AM Development <d...@yeralin.net >>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>> <mailto:d...@yeralin.net>>> wrote: >>>>>>>>> >>>>>>>>> Hi John, >>>>>>>>> >>>>>>>>> I hope everyone had a great long weekend. >>>>>>>>> >>>>>>>>> Regarding Java interfaces, I may not understand you correctly, but I >>>>>>>>> think I already listed them: >>>>>>>>> >>>>>>>>> So for Produced, you would use it in the following fashion, for >>>>>>>>> example: Produced.keySerde(Serdes.ListSerde(ArrayList.class, >>>>>>>>> Serdes.Integer())) >>>>>>>>> >>>>>>>>> I also updated the KIP, and added a section “Serialization Strategy” >>>>>>>>> where I describe our logic of conditional serialization based on the >>>>>>>>> type of an inner serde. >>>>>>>>> >>>>>>>>> Thank you! >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Daniyar Yeralin >>>>>>>>> >>>>>>>>> On Jun 26, 2019, at 11:44 AM, John Roesler <j...@confluent.io >>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io >>>>>>>>> <mailto:j...@confluent.io>>> wrote: >>>>>>>>> >>>>>>>>> Thanks for the update, Daniyar! >>>>>>>>> >>>>>>>>> In addition to specifying the config interface, can you also specify >>>>>>>>> the Java interface? Namely, if I need to pass an instance of this >>>>>>>>> serde in to the DSL directly, as in Produced, Materialized, etc., what >>>>>>>>> constructor(s) would I have available? Likewise with the Serializer >>>>>>>>> and Deserailizer. I don't think you need to specify the implementation >>>>>>>>> logic, since we've already discussed it here. >>>>>>>>> >>>>>>>>> If you also want to specify the serialized format of the data records >>>>>>>>> in the KIP, it could be useful documentation, as well as letting us >>>>>>>>> verify the schema for forward/backward compatibility concerns, etc. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> John >>>>>>>>> >>>>>>>>> On Wed, Jun 26, 2019 at 10:33 AM Development <d...@yeralin.net >>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>> <mailto:d...@yeralin.net>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> Hey, >>>>>>>>> >>>>>>>>> Finally made updates to the KIP: >>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization >>>>>>>>> >>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization> >>>>>>>>> >>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization >>>>>>>>> >>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization>> >>>>>>>>> >>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization >>>>>>>>> >>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization> >>>>>>>>> >>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization >>>>>>>>> >>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization>>> >>>>>>>>> Sorry for the delay :) >>>>>>>>> >>>>>>>>> Thank You! >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Daniyar Yeralin >>>>>>>>> >>>>>>>>> On Jun 22, 2019, at 12:49 AM, Matthias J. Sax <matth...@confluent.io >>>>>>>>> <mailto:matth...@confluent.io> <mailto:matth...@confluent.io >>>>>>>>> <mailto:matth...@confluent.io>>> wrote: >>>>>>>>> >>>>>>>>> Yes, something like this. I did not think about good configuration >>>>>>>>> parameter names yet. I am also not sure if I understand all proposed >>>>>>>>> configs atm. But all configs should be listed and explained in the KIP >>>>>>>>> anyway, and we can discuss further after you have updated the KIP (I >>>>>>>>> can >>>>>>>>> ask more detailed question if I have any). >>>>>>>>> >>>>>>>>> >>>>>>>>> -Matthias >>>>>>>>> >>>>>>>>> On 6/21/19 2:05 PM, Development wrote: >>>>>>>>> >>>>>>>>> Yes, you are right. ByteSerializer is not what I need to have in a >>>>>>>>> list >>>>>>>>> of primitives. >>>>>>>>> >>>>>>>>> As for the default constructor and configurability, just want to make >>>>>>>>> sure. Is this what you have on your mind? >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Daniyar Yeralin >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Jun 21, 2019, at 2:51 PM, Matthias J. Sax <matth...@confluent.io >>>>>>>>> <mailto:matth...@confluent.io> <mailto:matth...@confluent.io >>>>>>>>> <mailto:matth...@confluent.io>> >>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> >>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>>> wrote: >>>>>>>>> >>>>>>>>> Thanks for the update! >>>>>>>>> >>>>>>>>> I think that `ListDeserializer`, `ListSerializer`, and `ListSerde` >>>>>>>>> should have an default constructor and it should be possible to pass >>>>>>>>> in >>>>>>>>> the `Class listClass` information via a configuration. Otherwise, >>>>>>>>> KafkaStreams cannot use it as default serde. >>>>>>>>> >>>>>>>>> >>>>>>>>> For the primitive serializers: `BytesSerializer` is not primitive >>>>>>>>> IMHO, >>>>>>>>> as is it for `byte[]` with variable length -- it's for arrays, not for >>>>>>>>> single `byte` (note, that `Bytes` is a Kafka class wrapping `byte[]`). >>>>>>>>> >>>>>>>>> >>>>>>>>> For tests, we can comment on the PR. No need to do this in the KIP >>>>>>>>> discussion. >>>>>>>>> >>>>>>>>> >>>>>>>>> Can you also update the KIP? >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -Matthias >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 6/21/19 11:29 AM, Development wrote: >>>>>>>>> >>>>>>>>> I made and pushed necessary commits, so we could review the final >>>>>>>>> version under PR https://github.com/apache/kafka/pull/6592 >>>>>>>>> <https://github.com/apache/kafka/pull/6592> >>>>>>>>> <https://github.com/apache/kafka/pull/6592 >>>>>>>>> <https://github.com/apache/kafka/pull/6592>> >>>>>>>>> >>>>>>>>> I also need some advice on writing tests for this new serde. So far I >>>>>>>>> only have two test cases (roundtrip and empty payload), I’m not sure >>>>>>>>> if it is enough. >>>>>>>>> >>>>>>>>> Thank y’all for your help in this KIP :) >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Daniyar Yeralin >>>>>>>>> >>>>>>>>> >>>>>>>>> On Jun 21, 2019, at 1:44 PM, John Roesler <j...@confluent.io >>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io >>>>>>>>> <mailto:j...@confluent.io>> >>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> >>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>> wrote: >>>>>>>>> >>>>>>>>> Hey Daniyar, >>>>>>>>> >>>>>>>>> Looks good to me! Thanks for considering it. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> -John >>>>>>>>> >>>>>>>>> On Fri, Jun 21, 2019 at 9:04 AM Development <d...@yeralin.net >>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>> <mailto:d...@yeralin.net>> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>> wrote: >>>>>>>>> Hey John and Matthias, >>>>>>>>> >>>>>>>>> Yes, now I see it all. I’m storing lots of redundant information. >>>>>>>>> Here is my final idea. Yes, now a user should pass a list type. I >>>>>>>>> realized that’s the type is not really needed in ListSerializer, but >>>>>>>>> only in ListDeserializer: >>>>>>>>> >>>>>>>>> >>>>>>>>> In ListSerializer we will start storing sizes only if serializer is >>>>>>>>> not a primitive serializer: >>>>>>>>> >>>>>>>>> >>>>>>>>> Then, in deserializer, we persist passed list type, so that during >>>>>>>>> deserialization we could create an instance of it with predefined >>>>>>>>> listSize for better performance. >>>>>>>>> We also try to locate a primitiveSize based on passed deserializer. >>>>>>>>> If it is not there, then primitiveSize will be null. Which means >>>>>>>>> that each entry’s size was encoded individually. >>>>>>>>> >>>>>>>>> >>>>>>>>> This looks much cleaner and more concise. >>>>>>>>> >>>>>>>>> What do you think? >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Daniyar Yeralin >>>>>>>>> >>>>>>>>> On Jun 20, 2019, at 5:45 PM, Matthias J. Sax <matth...@confluent.io >>>>>>>>> <mailto:matth...@confluent.io> <mailto:matth...@confluent.io >>>>>>>>> <mailto:matth...@confluent.io>> >>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> >>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>> >>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> >>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>>> wrote: >>>>>>>>> >>>>>>>>> For encoding the list-type: I see John's point about re-encoding the >>>>>>>>> list-type redundantly. However, I also don't like the idea that the >>>>>>>>> Deserializer returns a fixed type... >>>>>>>>> >>>>>>>>> Maybe it's best allow users to specify the target list type on >>>>>>>>> deserialization via config? >>>>>>>>> >>>>>>>>> Similar for the primitive types: I don't think we need to encode the >>>>>>>>> type size, but users could specify the type on the deserializer (via a >>>>>>>>> config again)? >>>>>>>>> >>>>>>>>> >>>>>>>>> About generics: nesting could be arbitrarily deep. Hence, I doubt >>>>>>>>> we can >>>>>>>>> support this and a cast will be necessary at some point in the user >>>>>>>>> code. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -Matthias >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 6/20/19 1:21 PM, John Roesler wrote: >>>>>>>>> >>>>>>>>> Hey Daniyar, >>>>>>>>> >>>>>>>>> Thanks for looking at it! >>>>>>>>> >>>>>>>>> Something like your screenshot is more along the lines of what I was >>>>>>>>> thinking. Sorry, but I didn't follow what you mean, how would that not >>>>>>>>> be "vanilla java"? >>>>>>>>> >>>>>>>>> Unfortunately the deserializer needs more information, though. For >>>>>>>>> example, what if the inner type is a Map<String,String>? The serde >>>>>>>>> could >>>>>>>>> only be used to produce a LinkedList<Map>, thus, we'd still need an >>>>>>>>> inner serde, like you have in the KIP (Serde<T> innerSerde). >>>>>>>>> >>>>>>>>> Something more like Serde<LinkedList<MyRecord>> = Serdes.listSerde( >>>>>>>>> /**list type**/ LinkedList.class, >>>>>>>>> /**inner serde**/ new MyRecordSerde() >>>>>>>>> ) >>>>>>>>> >>>>>>>>> And in configuration, it's something like: >>>>>>>>> default.key.serde: org...ListSerde >>>>>>>>> default.key.list.serde.type: java.util.LinkedList >>>>>>>>> default.key.list.serde.inner: com.mycompany.MyRecordSerde >>>>>>>>> >>>>>>>>> >>>>>>>>> What do you think? >>>>>>>>> Thanks, >>>>>>>>> -John >>>>>>>>> >>>>>>>>> On Thu, Jun 20, 2019 at 2:46 PM Development <d...@yeralin.net >>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>> <mailto:d...@yeralin.net>> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote: >>>>>>>>> >>>>>>>>> Hey John, >>>>>>>>> >>>>>>>>> I gave read about TypeReference. It could work for the list serde. >>>>>>>>> However, it is not directly >>>>>>>>> supported: >>>>>>>>> https://github.com/FasterXML/jackson-databind/issues/1490 >>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490> >>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 >>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490>> >>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 >>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490> >>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 >>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490>>> >>>>>>>>> The only way is to pass an actual class object into the constructor, >>>>>>>>> something like: >>>>>>>>> >>>>>>>>> It could be an option, but not a pretty one. What do you think of my >>>>>>>>> approach to use vanilla java and canonical class name? (As described >>>>>>>>> previously) >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Daniyar Yeralin >>>>>>>>> >>>>>>>>> On Jun 20, 2019, at 2:45 PM, Development <d...@yeralin.net >>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>> <mailto:d...@yeralin.net>> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote: >>>>>>>>> >>>>>>>>> Hi John, >>>>>>>>> >>>>>>>>> Thank you for your input! Yes, my idea looks a little bit over >>>>>>>>> engineered :) >>>>>>>>> >>>>>>>>> I also wanted to see a feedback from Mathias as well since he gave >>>>>>>>> me an idea about storing fixed/variable size entries. >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Daniyar Yeralin >>>>>>>>> >>>>>>>>> On Jun 18, 2019, at 6:06 PM, John Roesler <j...@confluent.io >>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io >>>>>>>>> <mailto:j...@confluent.io>> >>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> >>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>> >>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> >>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>> >>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> >>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>> >>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> >>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>>> wrote: >>>>>>>>> >>>>>>>>> Hi Daniyar, >>>>>>>>> >>>>>>>>> That's a very clever solution! >>>>>>>>> >>>>>>>>> One observation is that, now, this is what we might call a >>>>>>>>> polymorphic >>>>>>>>> serde. That is, you're detecting the actual concrete type and then >>>>>>>>> promising to produce the exact same concrete type on read. >>>>>>>>> There are >>>>>>>>> some inherent problems with this approach, which in general >>>>>>>>> require >>>>>>>>> some kind of schema registry (not necessarily Schema >>>>>>>>> Registry, just >>>>>>>>> any registry for schemas) to solve. >>>>>>>>> >>>>>>>>> Notice that every serialized record has quite a bit of duplicated >>>>>>>>> information: the concrete type as well as a byte to indicate >>>>>>>>> whether >>>>>>>>> the value type is a fixed size, and, if so, an integer to >>>>>>>>> indicate the >>>>>>>>> actual size. These constitute a schema, of sorts, because they >>>>>>>>> tell us >>>>>>>>> later how exactly to deserialize the data. Unfortunately, this >>>>>>>>> information is completely redundant. In all likelihood, the >>>>>>>>> information will be exactly the same for every record in the >>>>>>>>> topic. >>>>>>>>> This problem is essentially the core motivation for serializations >>>>>>>>> like Avro: to move the schema outside of the serialization >>>>>>>>> itself, so >>>>>>>>> that the records won't contain so much redundant information. >>>>>>>>> >>>>>>>>> In this light, I'm wondering if it makes sense to go back to >>>>>>>>> something >>>>>>>>> like what you had earlier in which you don't support perfectly >>>>>>>>> preserving the concrete type for _this_ serde, but instead just >>>>>>>>> support deserializing to _some_ List. Then, you could defer full, >>>>>>>>> perfect, type preservation to serdes that have an external >>>>>>>>> system in >>>>>>>>> which to register their type information. >>>>>>>>> >>>>>>>>> There does exist an alternative, if we really do want to >>>>>>>>> preserve the >>>>>>>>> concrete type (which does seem kind of nice). You can add a >>>>>>>>> configuration option specifically for the serde to configure >>>>>>>>> what the >>>>>>>>> list type will be, and maybe what the element type is, as well. >>>>>>>>> >>>>>>>>> As far as "related work" goes, you might be interested to take >>>>>>>>> a look >>>>>>>>> at how Jackson can be configured to deserialize into a specific, >>>>>>>>> arbitrarily nested, generically parameterized class structure. >>>>>>>>> Specifically, you might find >>>>>>>>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html >>>>>>>>> >>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html> >>>>>>>>> >>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html >>>>>>>>> >>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>> >>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html >>>>>>>>> >>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html> >>>>>>>>> >>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html >>>>>>>>> >>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>>> >>>>>>>>> interesting. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> -John >>>>>>>>> >>>>>>>>> On Mon, Jun 17, 2019 at 12:38 PM Development <d...@yeralin.net >>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>> <mailto:d...@yeralin.net>> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> bump >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >>