Hi Matthias, I agree, ConsumerConfig did not seem like a right place for these configurations. I’ll put them in ProducerConfig, ConsumerConfig, and StreamsConfig.
However, I have a question. What should I do in "configure(Map<String, ?> configs, boolean isKey)” methods? Which configurations should I try to locate? I was comparing my (de)serializer implementations with SessionWindows(De)serializer classes, and they use StreamsConfig class to get either StreamsConfig.DEFAULT_WINDOWED_KEY_SERDE_INNER_CLASS : StreamsConfig.DEFAULT_WINDOWED_VALUE_SERDE_INNER_CLASS In my case, as I mentioned earlier, StreamsConfig class is not accessible from org.apache.kafka.common.serialization package. So, I can’t utilize it. Any suggestions here? Best, Daniyar Yeralin > On Jul 18, 2019, at 8:46 PM, Matthias J. Sax <matth...@confluent.io> wrote: > > Thanks! > > One minor question about the configs. The KIP adds three classes, a > Serializer, a Deserializer, and a Serde. > > Hence, would it make sense to add the corresponding configs to > `ConsumerConfig`, `ProducerConfig`, and `StreamsConfig` using slightly > different names each time? > > > Somethin like this: > > ProducerConfig: > > list.key/value.serializer.type > list.key/value.serializer.inner > > ConsumerConfig: > > list.key/value.deserializer.type > list.key/value.deserializer.inner > > StreamsConfig: > > default.list.key/value.serde.type > default.list.key/value.serde.inner > > > Adding `d.l.k/v.serde.t/i` to `CommonClientConfigs does not sound right > to me. Also note, that it seems better to avoid the `default.` prefix > for consumers and producers because there is only one Serializer or > Deserializer anyway. Only for Streams, there are multiple and > StreamsConfig specifies the default one of an operator does not > overwrite it. > > Thoughts? > > > Also, the KIP should explicitly mention to what classed certain configs > are added. Atm, the KIP only list parameter names, but does not state > where those are added. > > > -Matthias > > > > > > On 7/16/19 1:11 PM, Development wrote: >> Hi, >> >> Yes, totally forgot about the statement. KIP-466 is updated. >> >> Thank you so much John Roesler, Matthias J. Sax, Sophie Blee-Goldman for >> your valuable input! >> >> I hope I did not cause too much trouble :) >> >> I’ll start the vote now. >> >> Best, >> Daniyar Yeralin >> >>> On Jul 16, 2019, at 3:17 PM, John Roesler <j...@confluent.io> wrote: >>> >>> Hi Daniyar, >>> >>> Thanks for that update. I took a look, and I think this is in good shape. >>> >>> One note, the statement "New method public static <T> Serde<List<T>> >>> ListSerde() in org.apache.kafka.common.serialization.Serdes class >>> (infers list implementation and inner serde from config file)" is >>> still present in the KIP, although I do it is was removed from the PR. >>> >>> Once you remove that statement from the KIP, then I think this KIP is >>> ready to go up for a vote! Then, we can really review the PR in >>> earnest and get this thing merged. >>> >>> Thanks, >>> -john >>> >>> On Tue, Jul 16, 2019 at 2:05 PM Development <d...@yeralin.net> wrote: >>>> >>>> Hi, >>>> >>>> Pushed new changes under my PR: https://github.com/apache/kafka/pull/6592 >>>> <https://github.com/apache/kafka/pull/6592> >>>> >>>> Feel free to put any comments in there. >>>> >>>> Best, >>>> Daniyar Yeralin >>>> >>>>> On Jul 15, 2019, at 1:06 PM, Development <d...@yeralin.net> wrote: >>>>> >>>>> Hi John, >>>>> >>>>> I knew I was missing something. Yes, that makes sense now, I removed all >>>>> `listSerde()` methods, and left empty constructors instead. >>>>> >>>>> As per `CommonClientConfigs` I looked at the class, it doesn’t have any >>>>> properties related to serdes, and that bothers me a little. >>>>> >>>>> All properties like `default.key.serde` `default.windowed.key.serde.*` >>>>> are located in StreamsConfig. I don’t want to create a confusion. >>>>> What also doesn’t make sense to me is that `WindowedSerdes` and its >>>>> (de)serializers are not located in org.apache.kafka.common.serialization. >>>>> I guess it kind of makes sense since windowed serdes are only available >>>>> for kafka streams, not vice versa. >>>>> >>>>> If everyone is okay to put list properties in `CommonClientConfigs` >>>>> class, I’ll go ahead and do that then. >>>>> >>>>> Thank you for your input! >>>>> >>>>> Best, >>>>> Daniyar Yeralin >>>>> >>>>>> On Jul 15, 2019, at 11:45 AM, John Roesler <j...@confluent.io> wrote: >>>>>> >>>>>> Hi all, >>>>>> >>>>>> Regarding the placement, you might as well move the constants to >>>>>> `org.apache.kafka.clients.CommonClientConfigs`, so that the constants >>>>>> and the configs and the code are in the same module. >>>>>> >>>>>> Regarding the constructor... What Matthias said is correct: The serde, >>>>>> serializer, and deserializer all need to have zero-arg constructors so >>>>>> they can be instantiated reflectively by Kafka. However, the factory >>>>>> method you proposed "New method public static <T> Serde<List<T>> >>>>>> ListSerde()" is not a constructor, and is not required. It would be used >>>>>> purely from the Java interface, but has the drawbacks I listed above. >>>>>> This method, not the constructor, is what I proposed to remove from the >>>>>> KIP. >>>>>> >>>>>> Thanks, >>>>>> -John >>>>>> >>>>>> >>>>>> On Mon, Jul 15, 2019 at 10:15 AM Development <d...@yeralin.net >>>>>> <mailto:d...@yeralin.net>> wrote: >>>>>> One problem though. >>>>>> >>>>>> Since WindowedSerde (Windowed(De)Serializer) are so similar, I’m trying >>>>>> to mimic the implementation of my ListSerde accordingly. >>>>>> >>>>>> I created couple constants under StreamsConfig: >>>>>> >>>>>> >>>>>> >>>>>> And trying to do similar construct: >>>>>> final String propertyName = isKey ? >>>>>> StreamsConfig.DEFAULT_LIST_KEY_SERDE_INNER_CLASS : >>>>>> StreamsConfig.DEFAULT_LIST_VALUE_SERDE_INNER_CLASS; >>>>>> But then found out that StreamsConfig is not accessible from >>>>>> org.apache.kafka.common.serialization package while window serde >>>>>> (de)serializers are located under org.apache.kafka.streams.kstream >>>>>> package. >>>>>> >>>>>> What should I do? Should I move my classes under >>>>>> org.apache.kafka.streams.kstream package instead? >>>>>> >>>>>>> On Jul 15, 2019, at 10:45 AM, Development <d...@yeralin.net >>>>>>> <mailto:d...@yeralin.net>> wrote: >>>>>>> >>>>>>> Hi Matthias, >>>>>>> >>>>>>> Thank you for your input. >>>>>>> >>>>>>> I updated the KIP, made it a little more readable. >>>>>>> >>>>>>> I think the configuration parameters strategy is finalized then. >>>>>>> >>>>>>> Do you have any other questions/concerns regarding this KIP? >>>>>>> >>>>>>> Meanwhile I’ll start doing appropriate code changes, and commit them >>>>>>> under my PR. >>>>>>> >>>>>>> Best, >>>>>>> Daniyar Yeralin >>>>>>> >>>>>>>> On Jul 11, 2019, at 2:44 PM, Matthias J. Sax <matth...@confluent.io >>>>>>>> <mailto:matth...@confluent.io>> wrote: >>>>>>>> >>>>>>>> Daniyar, >>>>>>>> >>>>>>>> thanks for the update to the KIP. It's in really good shape and well >>>>>>>> written. >>>>>>>> >>>>>>>> About the default constructor question: >>>>>>>> >>>>>>>> All Serdes/Serializer/Deserializer classes need a default constructor >>>>>>>> to >>>>>>>> create them easily via reflections when specifies in a config. I >>>>>>>> understand that it is not super user friendly, but all existing code >>>>>>>> works this way. Hence, it seems best to stick with the established >>>>>>>> pattern. >>>>>>>> >>>>>>>> We have a similar issue with `TimeWindowedSerde` and >>>>>>>> `SessionWindowedSerde`, and I just recently did a PR to improve user >>>>>>>> experience that address the exact issue John raised. (cf >>>>>>>> https://github.com/apache/kafka/pull/7067 >>>>>>>> <https://github.com/apache/kafka/pull/7067>) >>>>>>>> >>>>>>>> Note, that if a user would instantiate the Serde manually, the user >>>>>>>> would also need to call `configure()` to setup the inner serdes. Kafka >>>>>>>> Streams would not setup those automatically and one might most likely >>>>>>>> end-up with an NPE. >>>>>>>> >>>>>>>> >>>>>>>> Coming back the KIP, and the parameter names. `WindowedSerdes` are >>>>>>>> similar to `ListSerde` as they wrap another Serde. For >>>>>>>> `WindowedSerdes`, >>>>>>>> we use the following parameter names: >>>>>>>> >>>>>>>> - default.windowed.key.serde.inner >>>>>>>> - default.windowed.value.serde.inner >>>>>>>> >>>>>>>> >>>>>>>> It might be good to align the naming pattern. I would also suggest to >>>>>>>> use `type` instead of `impl`? >>>>>>>> >>>>>>>> >>>>>>>> default.key.list.serde.impl -> default.list.key.serde.type >>>>>>>> default.value.list.serde.impl -> default.list.value.serde.type >>>>>>>> default.key.list.serde.element -> default.list.key.serde.inner >>>>>>>> default.value.list.serde.element -> default.list.value.serde.inner >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -Matthias >>>>>>>> >>>>>>>> >>>>>>>> On 7/10/19 8:52 AM, Development wrote: >>>>>>>>> Hi John, >>>>>>>>> >>>>>>>>> Yes, I do agree. That totally makes sense. The only thing is that it >>>>>>>>> goes against what Matthias suggested earlier: >>>>>>>>> "I think that ... `ListSerde` should have an default constructor and >>>>>>>>> it should be possible to pass in the `Class listClass` information >>>>>>>>> via a configuration. Otherwise, KafkaStreams cannot use it as default >>>>>>>>> serde.” >>>>>>>>> >>>>>>>>> What do you think about that? I hope I’m not confusing anything. >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Daniyar Yeralin >>>>>>>>> >>>>>>>>>> On Jul 9, 2019, at 5:56 PM, John Roesler <j...@confluent.io >>>>>>>>>> <mailto:j...@confluent.io>> wrote: >>>>>>>>>> >>>>>>>>>> Ah, my apologies, I must have just overlooked it. Thanks for the >>>>>>>>>> update, too. >>>>>>>>>> >>>>>>>>>> Just one more super-small question, do we need this variant: >>>>>>>>>> >>>>>>>>>>> New method public static <T> Serde<List<T>> ListSerde() in >>>>>>>>>>> org.apache.kafka.common.serialization.Serdes class (infers list >>>>>>>>>>> implementation and inner serde from config file) >>>>>>>>>> >>>>>>>>>> It seems like this situation implies my config file is already set >>>>>>>>>> up for the list serde, so passing this serde (e.g., in Produced) >>>>>>>>>> would have the same effect as not specifying it. >>>>>>>>>> >>>>>>>>>> I guess that it could be the case that you have the >>>>>>>>>> `default.key/value.serde` set to something else, like StringSerde, >>>>>>>>>> but you still have the `default.key/value.list.serde.impl/element` >>>>>>>>>> set. This seems like it would result in more confusion than >>>>>>>>>> convenience, so my gut instinct is maybe we shouldn't introduce the >>>>>>>>>> `ListSerde()` variant until people actually request it later on. >>>>>>>>>> >>>>>>>>>> Thus, we'd just stick with fully config-driven or fully >>>>>>>>>> source-code-driven, not half/half. >>>>>>>>>> >>>>>>>>>> What do you think? >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> -John >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Jul 9, 2019 at 9:58 AM Development <d...@yeralin.net >>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>>> <mailto:d...@yeralin.net>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi John, >>>>>>>>>>> >>>>>>>>>>> I hope everyone had a great long weekend. >>>>>>>>>>> >>>>>>>>>>> Regarding Java interfaces, I may not understand you correctly, but >>>>>>>>>>> I think I already listed them: >>>>>>>>>>> >>>>>>>>>>> So for Produced, you would use it in the following fashion, for >>>>>>>>>>> example: Produced.keySerde(Serdes.ListSerde(ArrayList.class, >>>>>>>>>>> Serdes.Integer())) >>>>>>>>>>> >>>>>>>>>>> I also updated the KIP, and added a section “Serialization >>>>>>>>>>> Strategy” where I describe our logic of conditional serialization >>>>>>>>>>> based on the type of an inner serde. >>>>>>>>>>> >>>>>>>>>>> Thank you! >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> Daniyar Yeralin >>>>>>>>>>> >>>>>>>>>>> On Jun 26, 2019, at 11:44 AM, John Roesler <j...@confluent.io >>>>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io >>>>>>>>>>> <mailto:j...@confluent.io>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Thanks for the update, Daniyar! >>>>>>>>>>> >>>>>>>>>>> In addition to specifying the config interface, can you also specify >>>>>>>>>>> the Java interface? Namely, if I need to pass an instance of this >>>>>>>>>>> serde in to the DSL directly, as in Produced, Materialized, etc., >>>>>>>>>>> what >>>>>>>>>>> constructor(s) would I have available? Likewise with the Serializer >>>>>>>>>>> and Deserailizer. I don't think you need to specify the >>>>>>>>>>> implementation >>>>>>>>>>> logic, since we've already discussed it here. >>>>>>>>>>> >>>>>>>>>>> If you also want to specify the serialized format of the data >>>>>>>>>>> records >>>>>>>>>>> in the KIP, it could be useful documentation, as well as letting us >>>>>>>>>>> verify the schema for forward/backward compatibility concerns, etc. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> John >>>>>>>>>>> >>>>>>>>>>> On Wed, Jun 26, 2019 at 10:33 AM Development <d...@yeralin.net >>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>>>> <mailto:d...@yeralin.net>>> wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Hey, >>>>>>>>>>> >>>>>>>>>>> Finally made updates to the KIP: >>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization >>>>>>>>>>> >>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization> >>>>>>>>>>> >>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization >>>>>>>>>>> >>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization>> >>>>>>>>>>> >>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization >>>>>>>>>>> >>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization> >>>>>>>>>>> >>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization >>>>>>>>>>> >>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization>>> >>>>>>>>>>> Sorry for the delay :) >>>>>>>>>>> >>>>>>>>>>> Thank You! >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> Daniyar Yeralin >>>>>>>>>>> >>>>>>>>>>> On Jun 22, 2019, at 12:49 AM, Matthias J. Sax >>>>>>>>>>> <matth...@confluent.io <mailto:matth...@confluent.io> >>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Yes, something like this. I did not think about good configuration >>>>>>>>>>> parameter names yet. I am also not sure if I understand all proposed >>>>>>>>>>> configs atm. But all configs should be listed and explained in the >>>>>>>>>>> KIP >>>>>>>>>>> anyway, and we can discuss further after you have updated the KIP >>>>>>>>>>> (I can >>>>>>>>>>> ask more detailed question if I have any). >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -Matthias >>>>>>>>>>> >>>>>>>>>>> On 6/21/19 2:05 PM, Development wrote: >>>>>>>>>>> >>>>>>>>>>> Yes, you are right. ByteSerializer is not what I need to have in a >>>>>>>>>>> list >>>>>>>>>>> of primitives. >>>>>>>>>>> >>>>>>>>>>> As for the default constructor and configurability, just want to >>>>>>>>>>> make >>>>>>>>>>> sure. Is this what you have on your mind? >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> Daniyar Yeralin >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Jun 21, 2019, at 2:51 PM, Matthias J. Sax <matth...@confluent.io >>>>>>>>>>> <mailto:matth...@confluent.io> <mailto:matth...@confluent.io >>>>>>>>>>> <mailto:matth...@confluent.io>> >>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> >>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>>> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Thanks for the update! >>>>>>>>>>> >>>>>>>>>>> I think that `ListDeserializer`, `ListSerializer`, and `ListSerde` >>>>>>>>>>> should have an default constructor and it should be possible to >>>>>>>>>>> pass in >>>>>>>>>>> the `Class listClass` information via a configuration. Otherwise, >>>>>>>>>>> KafkaStreams cannot use it as default serde. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> For the primitive serializers: `BytesSerializer` is not primitive >>>>>>>>>>> IMHO, >>>>>>>>>>> as is it for `byte[]` with variable length -- it's for arrays, not >>>>>>>>>>> for >>>>>>>>>>> single `byte` (note, that `Bytes` is a Kafka class wrapping >>>>>>>>>>> `byte[]`). >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> For tests, we can comment on the PR. No need to do this in the KIP >>>>>>>>>>> discussion. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Can you also update the KIP? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -Matthias >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 6/21/19 11:29 AM, Development wrote: >>>>>>>>>>> >>>>>>>>>>> I made and pushed necessary commits, so we could review the final >>>>>>>>>>> version under PR https://github.com/apache/kafka/pull/6592 >>>>>>>>>>> <https://github.com/apache/kafka/pull/6592> >>>>>>>>>>> <https://github.com/apache/kafka/pull/6592 >>>>>>>>>>> <https://github.com/apache/kafka/pull/6592>> >>>>>>>>>>> >>>>>>>>>>> I also need some advice on writing tests for this new serde. So far >>>>>>>>>>> I >>>>>>>>>>> only have two test cases (roundtrip and empty payload), I’m not sure >>>>>>>>>>> if it is enough. >>>>>>>>>>> >>>>>>>>>>> Thank y’all for your help in this KIP :) >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> Daniyar Yeralin >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Jun 21, 2019, at 1:44 PM, John Roesler <j...@confluent.io >>>>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io >>>>>>>>>>> <mailto:j...@confluent.io>> >>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> >>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Hey Daniyar, >>>>>>>>>>> >>>>>>>>>>> Looks good to me! Thanks for considering it. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> -John >>>>>>>>>>> >>>>>>>>>>> On Fri, Jun 21, 2019 at 9:04 AM Development <d...@yeralin.net >>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>>>> <mailto:d...@yeralin.net>> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>> wrote: >>>>>>>>>>> Hey John and Matthias, >>>>>>>>>>> >>>>>>>>>>> Yes, now I see it all. I’m storing lots of redundant information. >>>>>>>>>>> Here is my final idea. Yes, now a user should pass a list type. I >>>>>>>>>>> realized that’s the type is not really needed in ListSerializer, but >>>>>>>>>>> only in ListDeserializer: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> In ListSerializer we will start storing sizes only if serializer is >>>>>>>>>>> not a primitive serializer: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Then, in deserializer, we persist passed list type, so that during >>>>>>>>>>> deserialization we could create an instance of it with predefined >>>>>>>>>>> listSize for better performance. >>>>>>>>>>> We also try to locate a primitiveSize based on passed deserializer. >>>>>>>>>>> If it is not there, then primitiveSize will be null. Which means >>>>>>>>>>> that each entry’s size was encoded individually. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> This looks much cleaner and more concise. >>>>>>>>>>> >>>>>>>>>>> What do you think? >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> Daniyar Yeralin >>>>>>>>>>> >>>>>>>>>>> On Jun 20, 2019, at 5:45 PM, Matthias J. Sax <matth...@confluent.io >>>>>>>>>>> <mailto:matth...@confluent.io> <mailto:matth...@confluent.io >>>>>>>>>>> <mailto:matth...@confluent.io>> >>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> >>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>> >>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> >>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>>> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> For encoding the list-type: I see John's point about re-encoding the >>>>>>>>>>> list-type redundantly. However, I also don't like the idea that the >>>>>>>>>>> Deserializer returns a fixed type... >>>>>>>>>>> >>>>>>>>>>> Maybe it's best allow users to specify the target list type on >>>>>>>>>>> deserialization via config? >>>>>>>>>>> >>>>>>>>>>> Similar for the primitive types: I don't think we need to encode the >>>>>>>>>>> type size, but users could specify the type on the deserializer >>>>>>>>>>> (via a >>>>>>>>>>> config again)? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> About generics: nesting could be arbitrarily deep. Hence, I doubt >>>>>>>>>>> we can >>>>>>>>>>> support this and a cast will be necessary at some point in the user >>>>>>>>>>> code. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -Matthias >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 6/20/19 1:21 PM, John Roesler wrote: >>>>>>>>>>> >>>>>>>>>>> Hey Daniyar, >>>>>>>>>>> >>>>>>>>>>> Thanks for looking at it! >>>>>>>>>>> >>>>>>>>>>> Something like your screenshot is more along the lines of what I was >>>>>>>>>>> thinking. Sorry, but I didn't follow what you mean, how would that >>>>>>>>>>> not >>>>>>>>>>> be "vanilla java"? >>>>>>>>>>> >>>>>>>>>>> Unfortunately the deserializer needs more information, though. For >>>>>>>>>>> example, what if the inner type is a Map<String,String>? The serde >>>>>>>>>>> could >>>>>>>>>>> only be used to produce a LinkedList<Map>, thus, we'd still need an >>>>>>>>>>> inner serde, like you have in the KIP (Serde<T> innerSerde). >>>>>>>>>>> >>>>>>>>>>> Something more like Serde<LinkedList<MyRecord>> = Serdes.listSerde( >>>>>>>>>>> /**list type**/ LinkedList.class, >>>>>>>>>>> /**inner serde**/ new MyRecordSerde() >>>>>>>>>>> ) >>>>>>>>>>> >>>>>>>>>>> And in configuration, it's something like: >>>>>>>>>>> default.key.serde: org...ListSerde >>>>>>>>>>> default.key.list.serde.type: java.util.LinkedList >>>>>>>>>>> default.key.list.serde.inner: com.mycompany.MyRecordSerde >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> What do you think? >>>>>>>>>>> Thanks, >>>>>>>>>>> -John >>>>>>>>>>> >>>>>>>>>>> On Thu, Jun 20, 2019 at 2:46 PM Development <d...@yeralin.net >>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>>>> <mailto:d...@yeralin.net>> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Hey John, >>>>>>>>>>> >>>>>>>>>>> I gave read about TypeReference. It could work for the list serde. >>>>>>>>>>> However, it is not directly >>>>>>>>>>> supported: >>>>>>>>>>> https://github.com/FasterXML/jackson-databind/issues/1490 >>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490> >>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 >>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490>> >>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 >>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490> >>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 >>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490>>> >>>>>>>>>>> The only way is to pass an actual class object into the constructor, >>>>>>>>>>> something like: >>>>>>>>>>> >>>>>>>>>>> It could be an option, but not a pretty one. What do you think of my >>>>>>>>>>> approach to use vanilla java and canonical class name? (As described >>>>>>>>>>> previously) >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> Daniyar Yeralin >>>>>>>>>>> >>>>>>>>>>> On Jun 20, 2019, at 2:45 PM, Development <d...@yeralin.net >>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>>>> <mailto:d...@yeralin.net>> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi John, >>>>>>>>>>> >>>>>>>>>>> Thank you for your input! Yes, my idea looks a little bit over >>>>>>>>>>> engineered :) >>>>>>>>>>> >>>>>>>>>>> I also wanted to see a feedback from Mathias as well since he gave >>>>>>>>>>> me an idea about storing fixed/variable size entries. >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> Daniyar Yeralin >>>>>>>>>>> >>>>>>>>>>> On Jun 18, 2019, at 6:06 PM, John Roesler <j...@confluent.io >>>>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io >>>>>>>>>>> <mailto:j...@confluent.io>> >>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> >>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>> >>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> >>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>> >>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> >>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>> >>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> >>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi Daniyar, >>>>>>>>>>> >>>>>>>>>>> That's a very clever solution! >>>>>>>>>>> >>>>>>>>>>> One observation is that, now, this is what we might call a >>>>>>>>>>> polymorphic >>>>>>>>>>> serde. That is, you're detecting the actual concrete type and then >>>>>>>>>>> promising to produce the exact same concrete type on read. >>>>>>>>>>> There are >>>>>>>>>>> some inherent problems with this approach, which in general >>>>>>>>>>> require >>>>>>>>>>> some kind of schema registry (not necessarily Schema >>>>>>>>>>> Registry, just >>>>>>>>>>> any registry for schemas) to solve. >>>>>>>>>>> >>>>>>>>>>> Notice that every serialized record has quite a bit of duplicated >>>>>>>>>>> information: the concrete type as well as a byte to indicate >>>>>>>>>>> whether >>>>>>>>>>> the value type is a fixed size, and, if so, an integer to >>>>>>>>>>> indicate the >>>>>>>>>>> actual size. These constitute a schema, of sorts, because they >>>>>>>>>>> tell us >>>>>>>>>>> later how exactly to deserialize the data. Unfortunately, this >>>>>>>>>>> information is completely redundant. In all likelihood, the >>>>>>>>>>> information will be exactly the same for every record in the >>>>>>>>>>> topic. >>>>>>>>>>> This problem is essentially the core motivation for serializations >>>>>>>>>>> like Avro: to move the schema outside of the serialization >>>>>>>>>>> itself, so >>>>>>>>>>> that the records won't contain so much redundant information. >>>>>>>>>>> >>>>>>>>>>> In this light, I'm wondering if it makes sense to go back to >>>>>>>>>>> something >>>>>>>>>>> like what you had earlier in which you don't support perfectly >>>>>>>>>>> preserving the concrete type for _this_ serde, but instead just >>>>>>>>>>> support deserializing to _some_ List. Then, you could defer full, >>>>>>>>>>> perfect, type preservation to serdes that have an external >>>>>>>>>>> system in >>>>>>>>>>> which to register their type information. >>>>>>>>>>> >>>>>>>>>>> There does exist an alternative, if we really do want to >>>>>>>>>>> preserve the >>>>>>>>>>> concrete type (which does seem kind of nice). You can add a >>>>>>>>>>> configuration option specifically for the serde to configure >>>>>>>>>>> what the >>>>>>>>>>> list type will be, and maybe what the element type is, as well. >>>>>>>>>>> >>>>>>>>>>> As far as "related work" goes, you might be interested to take >>>>>>>>>>> a look >>>>>>>>>>> at how Jackson can be configured to deserialize into a specific, >>>>>>>>>>> arbitrarily nested, generically parameterized class structure. >>>>>>>>>>> Specifically, you might find >>>>>>>>>>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html >>>>>>>>>>> >>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html> >>>>>>>>>>> >>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html >>>>>>>>>>> >>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>> >>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html >>>>>>>>>>> >>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html> >>>>>>>>>>> >>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html >>>>>>>>>>> >>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>>> >>>>>>>>>>> interesting. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> -John >>>>>>>>>>> >>>>>>>>>>> On Mon, Jun 17, 2019 at 12:38 PM Development <d...@yeralin.net >>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>>>> <mailto:d...@yeralin.net>> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> bump >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >> >