Good point. I guess the simplest solution is, to actually add
>> default.list.key/value.serde.type >> default.list.key/value.serde.inner to both `CommonClientConfigs` and `StreamsConfig`. It's not super clean, but I think it's the best we can do. Thoughts? -Matthias On 7/19/19 1:23 PM, Development wrote: > Hi Matthias, > > I agree, ConsumerConfig did not seem like a right place for these > configurations. > I’ll put them in ProducerConfig, ConsumerConfig, and StreamsConfig. > > However, I have a question. What should I do in "configure(Map<String, ?> > configs, boolean isKey)” methods? Which configurations should I try to > locate? I was comparing my (de)serializer implementations with > SessionWindows(De)serializer classes, and they use StreamsConfig class to get > either StreamsConfig.DEFAULT_WINDOWED_KEY_SERDE_INNER_CLASS : > StreamsConfig.DEFAULT_WINDOWED_VALUE_SERDE_INNER_CLASS > > In my case, as I mentioned earlier, StreamsConfig class is not accessible > from org.apache.kafka.common.serialization package. So, I can’t utilize it. > Any suggestions here? > > Best, > Daniyar Yeralin > > >> On Jul 18, 2019, at 8:46 PM, Matthias J. Sax <matth...@confluent.io> wrote: >> >> Thanks! >> >> One minor question about the configs. The KIP adds three classes, a >> Serializer, a Deserializer, and a Serde. >> >> Hence, would it make sense to add the corresponding configs to >> `ConsumerConfig`, `ProducerConfig`, and `StreamsConfig` using slightly >> different names each time? >> >> >> Somethin like this: >> >> ProducerConfig: >> >> list.key/value.serializer.type >> list.key/value.serializer.inner >> >> ConsumerConfig: >> >> list.key/value.deserializer.type >> list.key/value.deserializer.inner >> >> StreamsConfig: >> >> default.list.key/value.serde.type >> default.list.key/value.serde.inner >> >> >> Adding `d.l.k/v.serde.t/i` to `CommonClientConfigs does not sound right >> to me. Also note, that it seems better to avoid the `default.` prefix >> for consumers and producers because there is only one Serializer or >> Deserializer anyway. Only for Streams, there are multiple and >> StreamsConfig specifies the default one of an operator does not >> overwrite it. >> >> Thoughts? >> >> >> Also, the KIP should explicitly mention to what classed certain configs >> are added. Atm, the KIP only list parameter names, but does not state >> where those are added. >> >> >> -Matthias >> >> >> >> >> >> On 7/16/19 1:11 PM, Development wrote: >>> Hi, >>> >>> Yes, totally forgot about the statement. KIP-466 is updated. >>> >>> Thank you so much John Roesler, Matthias J. Sax, Sophie Blee-Goldman for >>> your valuable input! >>> >>> I hope I did not cause too much trouble :) >>> >>> I’ll start the vote now. >>> >>> Best, >>> Daniyar Yeralin >>> >>>> On Jul 16, 2019, at 3:17 PM, John Roesler <j...@confluent.io> wrote: >>>> >>>> Hi Daniyar, >>>> >>>> Thanks for that update. I took a look, and I think this is in good shape. >>>> >>>> One note, the statement "New method public static <T> Serde<List<T>> >>>> ListSerde() in org.apache.kafka.common.serialization.Serdes class >>>> (infers list implementation and inner serde from config file)" is >>>> still present in the KIP, although I do it is was removed from the PR. >>>> >>>> Once you remove that statement from the KIP, then I think this KIP is >>>> ready to go up for a vote! Then, we can really review the PR in >>>> earnest and get this thing merged. >>>> >>>> Thanks, >>>> -john >>>> >>>> On Tue, Jul 16, 2019 at 2:05 PM Development <d...@yeralin.net> wrote: >>>>> >>>>> Hi, >>>>> >>>>> Pushed new changes under my PR: https://github.com/apache/kafka/pull/6592 >>>>> <https://github.com/apache/kafka/pull/6592> >>>>> >>>>> Feel free to put any comments in there. >>>>> >>>>> Best, >>>>> Daniyar Yeralin >>>>> >>>>>> On Jul 15, 2019, at 1:06 PM, Development <d...@yeralin.net> wrote: >>>>>> >>>>>> Hi John, >>>>>> >>>>>> I knew I was missing something. Yes, that makes sense now, I removed all >>>>>> `listSerde()` methods, and left empty constructors instead. >>>>>> >>>>>> As per `CommonClientConfigs` I looked at the class, it doesn’t have any >>>>>> properties related to serdes, and that bothers me a little. >>>>>> >>>>>> All properties like `default.key.serde` `default.windowed.key.serde.*` >>>>>> are located in StreamsConfig. I don’t want to create a confusion. >>>>>> What also doesn’t make sense to me is that `WindowedSerdes` and its >>>>>> (de)serializers are not located in >>>>>> org.apache.kafka.common.serialization. I guess it kind of makes sense >>>>>> since windowed serdes are only available for kafka streams, not vice >>>>>> versa. >>>>>> >>>>>> If everyone is okay to put list properties in `CommonClientConfigs` >>>>>> class, I’ll go ahead and do that then. >>>>>> >>>>>> Thank you for your input! >>>>>> >>>>>> Best, >>>>>> Daniyar Yeralin >>>>>> >>>>>>> On Jul 15, 2019, at 11:45 AM, John Roesler <j...@confluent.io> wrote: >>>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> Regarding the placement, you might as well move the constants to >>>>>>> `org.apache.kafka.clients.CommonClientConfigs`, so that the constants >>>>>>> and the configs and the code are in the same module. >>>>>>> >>>>>>> Regarding the constructor... What Matthias said is correct: The serde, >>>>>>> serializer, and deserializer all need to have zero-arg constructors so >>>>>>> they can be instantiated reflectively by Kafka. However, the factory >>>>>>> method you proposed "New method public static <T> Serde<List<T>> >>>>>>> ListSerde()" is not a constructor, and is not required. It would be >>>>>>> used purely from the Java interface, but has the drawbacks I listed >>>>>>> above. This method, not the constructor, is what I proposed to remove >>>>>>> from the KIP. >>>>>>> >>>>>>> Thanks, >>>>>>> -John >>>>>>> >>>>>>> >>>>>>> On Mon, Jul 15, 2019 at 10:15 AM Development <d...@yeralin.net >>>>>>> <mailto:d...@yeralin.net>> wrote: >>>>>>> One problem though. >>>>>>> >>>>>>> Since WindowedSerde (Windowed(De)Serializer) are so similar, I’m trying >>>>>>> to mimic the implementation of my ListSerde accordingly. >>>>>>> >>>>>>> I created couple constants under StreamsConfig: >>>>>>> >>>>>>> >>>>>>> >>>>>>> And trying to do similar construct: >>>>>>> final String propertyName = isKey ? >>>>>>> StreamsConfig.DEFAULT_LIST_KEY_SERDE_INNER_CLASS : >>>>>>> StreamsConfig.DEFAULT_LIST_VALUE_SERDE_INNER_CLASS; >>>>>>> But then found out that StreamsConfig is not accessible from >>>>>>> org.apache.kafka.common.serialization package while window serde >>>>>>> (de)serializers are located under org.apache.kafka.streams.kstream >>>>>>> package. >>>>>>> >>>>>>> What should I do? Should I move my classes under >>>>>>> org.apache.kafka.streams.kstream package instead? >>>>>>> >>>>>>>> On Jul 15, 2019, at 10:45 AM, Development <d...@yeralin.net >>>>>>>> <mailto:d...@yeralin.net>> wrote: >>>>>>>> >>>>>>>> Hi Matthias, >>>>>>>> >>>>>>>> Thank you for your input. >>>>>>>> >>>>>>>> I updated the KIP, made it a little more readable. >>>>>>>> >>>>>>>> I think the configuration parameters strategy is finalized then. >>>>>>>> >>>>>>>> Do you have any other questions/concerns regarding this KIP? >>>>>>>> >>>>>>>> Meanwhile I’ll start doing appropriate code changes, and commit them >>>>>>>> under my PR. >>>>>>>> >>>>>>>> Best, >>>>>>>> Daniyar Yeralin >>>>>>>> >>>>>>>>> On Jul 11, 2019, at 2:44 PM, Matthias J. Sax <matth...@confluent.io >>>>>>>>> <mailto:matth...@confluent.io>> wrote: >>>>>>>>> >>>>>>>>> Daniyar, >>>>>>>>> >>>>>>>>> thanks for the update to the KIP. It's in really good shape and well >>>>>>>>> written. >>>>>>>>> >>>>>>>>> About the default constructor question: >>>>>>>>> >>>>>>>>> All Serdes/Serializer/Deserializer classes need a default constructor >>>>>>>>> to >>>>>>>>> create them easily via reflections when specifies in a config. I >>>>>>>>> understand that it is not super user friendly, but all existing code >>>>>>>>> works this way. Hence, it seems best to stick with the established >>>>>>>>> pattern. >>>>>>>>> >>>>>>>>> We have a similar issue with `TimeWindowedSerde` and >>>>>>>>> `SessionWindowedSerde`, and I just recently did a PR to improve user >>>>>>>>> experience that address the exact issue John raised. (cf >>>>>>>>> https://github.com/apache/kafka/pull/7067 >>>>>>>>> <https://github.com/apache/kafka/pull/7067>) >>>>>>>>> >>>>>>>>> Note, that if a user would instantiate the Serde manually, the user >>>>>>>>> would also need to call `configure()` to setup the inner serdes. Kafka >>>>>>>>> Streams would not setup those automatically and one might most likely >>>>>>>>> end-up with an NPE. >>>>>>>>> >>>>>>>>> >>>>>>>>> Coming back the KIP, and the parameter names. `WindowedSerdes` are >>>>>>>>> similar to `ListSerde` as they wrap another Serde. For >>>>>>>>> `WindowedSerdes`, >>>>>>>>> we use the following parameter names: >>>>>>>>> >>>>>>>>> - default.windowed.key.serde.inner >>>>>>>>> - default.windowed.value.serde.inner >>>>>>>>> >>>>>>>>> >>>>>>>>> It might be good to align the naming pattern. I would also suggest to >>>>>>>>> use `type` instead of `impl`? >>>>>>>>> >>>>>>>>> >>>>>>>>> default.key.list.serde.impl -> default.list.key.serde.type >>>>>>>>> default.value.list.serde.impl -> default.list.value.serde.type >>>>>>>>> default.key.list.serde.element -> default.list.key.serde.inner >>>>>>>>> default.value.list.serde.element -> default.list.value.serde.inner >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -Matthias >>>>>>>>> >>>>>>>>> >>>>>>>>> On 7/10/19 8:52 AM, Development wrote: >>>>>>>>>> Hi John, >>>>>>>>>> >>>>>>>>>> Yes, I do agree. That totally makes sense. The only thing is that it >>>>>>>>>> goes against what Matthias suggested earlier: >>>>>>>>>> "I think that ... `ListSerde` should have an default constructor and >>>>>>>>>> it should be possible to pass in the `Class listClass` information >>>>>>>>>> via a configuration. Otherwise, KafkaStreams cannot use it as >>>>>>>>>> default serde.” >>>>>>>>>> >>>>>>>>>> What do you think about that? I hope I’m not confusing anything. >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> Daniyar Yeralin >>>>>>>>>> >>>>>>>>>>> On Jul 9, 2019, at 5:56 PM, John Roesler <j...@confluent.io >>>>>>>>>>> <mailto:j...@confluent.io>> wrote: >>>>>>>>>>> >>>>>>>>>>> Ah, my apologies, I must have just overlooked it. Thanks for the >>>>>>>>>>> update, too. >>>>>>>>>>> >>>>>>>>>>> Just one more super-small question, do we need this variant: >>>>>>>>>>> >>>>>>>>>>>> New method public static <T> Serde<List<T>> ListSerde() in >>>>>>>>>>>> org.apache.kafka.common.serialization.Serdes class (infers list >>>>>>>>>>>> implementation and inner serde from config file) >>>>>>>>>>> >>>>>>>>>>> It seems like this situation implies my config file is already set >>>>>>>>>>> up for the list serde, so passing this serde (e.g., in Produced) >>>>>>>>>>> would have the same effect as not specifying it. >>>>>>>>>>> >>>>>>>>>>> I guess that it could be the case that you have the >>>>>>>>>>> `default.key/value.serde` set to something else, like StringSerde, >>>>>>>>>>> but you still have the `default.key/value.list.serde.impl/element` >>>>>>>>>>> set. This seems like it would result in more confusion than >>>>>>>>>>> convenience, so my gut instinct is maybe we shouldn't introduce the >>>>>>>>>>> `ListSerde()` variant until people actually request it later on. >>>>>>>>>>> >>>>>>>>>>> Thus, we'd just stick with fully config-driven or fully >>>>>>>>>>> source-code-driven, not half/half. >>>>>>>>>>> >>>>>>>>>>> What do you think? >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> -John >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Jul 9, 2019 at 9:58 AM Development <d...@yeralin.net >>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>>>> <mailto:d...@yeralin.net>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi John, >>>>>>>>>>>> >>>>>>>>>>>> I hope everyone had a great long weekend. >>>>>>>>>>>> >>>>>>>>>>>> Regarding Java interfaces, I may not understand you correctly, but >>>>>>>>>>>> I think I already listed them: >>>>>>>>>>>> >>>>>>>>>>>> So for Produced, you would use it in the following fashion, for >>>>>>>>>>>> example: Produced.keySerde(Serdes.ListSerde(ArrayList.class, >>>>>>>>>>>> Serdes.Integer())) >>>>>>>>>>>> >>>>>>>>>>>> I also updated the KIP, and added a section “Serialization >>>>>>>>>>>> Strategy” where I describe our logic of conditional serialization >>>>>>>>>>>> based on the type of an inner serde. >>>>>>>>>>>> >>>>>>>>>>>> Thank you! >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> Daniyar Yeralin >>>>>>>>>>>> >>>>>>>>>>>> On Jun 26, 2019, at 11:44 AM, John Roesler <j...@confluent.io >>>>>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io >>>>>>>>>>>> <mailto:j...@confluent.io>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Thanks for the update, Daniyar! >>>>>>>>>>>> >>>>>>>>>>>> In addition to specifying the config interface, can you also >>>>>>>>>>>> specify >>>>>>>>>>>> the Java interface? Namely, if I need to pass an instance of this >>>>>>>>>>>> serde in to the DSL directly, as in Produced, Materialized, etc., >>>>>>>>>>>> what >>>>>>>>>>>> constructor(s) would I have available? Likewise with the Serializer >>>>>>>>>>>> and Deserailizer. I don't think you need to specify the >>>>>>>>>>>> implementation >>>>>>>>>>>> logic, since we've already discussed it here. >>>>>>>>>>>> >>>>>>>>>>>> If you also want to specify the serialized format of the data >>>>>>>>>>>> records >>>>>>>>>>>> in the KIP, it could be useful documentation, as well as letting us >>>>>>>>>>>> verify the schema for forward/backward compatibility concerns, etc. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> John >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Jun 26, 2019 at 10:33 AM Development <d...@yeralin.net >>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>>>>> <mailto:d...@yeralin.net>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Hey, >>>>>>>>>>>> >>>>>>>>>>>> Finally made updates to the KIP: >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization >>>>>>>>>>>> >>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization> >>>>>>>>>>>> >>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization >>>>>>>>>>>> >>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization>> >>>>>>>>>>>> >>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization >>>>>>>>>>>> >>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization> >>>>>>>>>>>> >>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization >>>>>>>>>>>> >>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization>>> >>>>>>>>>>>> Sorry for the delay :) >>>>>>>>>>>> >>>>>>>>>>>> Thank You! >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> Daniyar Yeralin >>>>>>>>>>>> >>>>>>>>>>>> On Jun 22, 2019, at 12:49 AM, Matthias J. Sax >>>>>>>>>>>> <matth...@confluent.io <mailto:matth...@confluent.io> >>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Yes, something like this. I did not think about good configuration >>>>>>>>>>>> parameter names yet. I am also not sure if I understand all >>>>>>>>>>>> proposed >>>>>>>>>>>> configs atm. But all configs should be listed and explained in the >>>>>>>>>>>> KIP >>>>>>>>>>>> anyway, and we can discuss further after you have updated the KIP >>>>>>>>>>>> (I can >>>>>>>>>>>> ask more detailed question if I have any). >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -Matthias >>>>>>>>>>>> >>>>>>>>>>>> On 6/21/19 2:05 PM, Development wrote: >>>>>>>>>>>> >>>>>>>>>>>> Yes, you are right. ByteSerializer is not what I need to have in a >>>>>>>>>>>> list >>>>>>>>>>>> of primitives. >>>>>>>>>>>> >>>>>>>>>>>> As for the default constructor and configurability, just want to >>>>>>>>>>>> make >>>>>>>>>>>> sure. Is this what you have on your mind? >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> Daniyar Yeralin >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Jun 21, 2019, at 2:51 PM, Matthias J. Sax >>>>>>>>>>>> <matth...@confluent.io <mailto:matth...@confluent.io> >>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>> >>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> >>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Thanks for the update! >>>>>>>>>>>> >>>>>>>>>>>> I think that `ListDeserializer`, `ListSerializer`, and `ListSerde` >>>>>>>>>>>> should have an default constructor and it should be possible to >>>>>>>>>>>> pass in >>>>>>>>>>>> the `Class listClass` information via a configuration. Otherwise, >>>>>>>>>>>> KafkaStreams cannot use it as default serde. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> For the primitive serializers: `BytesSerializer` is not primitive >>>>>>>>>>>> IMHO, >>>>>>>>>>>> as is it for `byte[]` with variable length -- it's for arrays, not >>>>>>>>>>>> for >>>>>>>>>>>> single `byte` (note, that `Bytes` is a Kafka class wrapping >>>>>>>>>>>> `byte[]`). >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> For tests, we can comment on the PR. No need to do this in the KIP >>>>>>>>>>>> discussion. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Can you also update the KIP? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -Matthias >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 6/21/19 11:29 AM, Development wrote: >>>>>>>>>>>> >>>>>>>>>>>> I made and pushed necessary commits, so we could review the final >>>>>>>>>>>> version under PR https://github.com/apache/kafka/pull/6592 >>>>>>>>>>>> <https://github.com/apache/kafka/pull/6592> >>>>>>>>>>>> <https://github.com/apache/kafka/pull/6592 >>>>>>>>>>>> <https://github.com/apache/kafka/pull/6592>> >>>>>>>>>>>> >>>>>>>>>>>> I also need some advice on writing tests for this new serde. So >>>>>>>>>>>> far I >>>>>>>>>>>> only have two test cases (roundtrip and empty payload), I’m not >>>>>>>>>>>> sure >>>>>>>>>>>> if it is enough. >>>>>>>>>>>> >>>>>>>>>>>> Thank y’all for your help in this KIP :) >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> Daniyar Yeralin >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Jun 21, 2019, at 1:44 PM, John Roesler <j...@confluent.io >>>>>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io >>>>>>>>>>>> <mailto:j...@confluent.io>> >>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> >>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hey Daniyar, >>>>>>>>>>>> >>>>>>>>>>>> Looks good to me! Thanks for considering it. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> -John >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Jun 21, 2019 at 9:04 AM Development <d...@yeralin.net >>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>>>>> <mailto:d...@yeralin.net>> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>> wrote: >>>>>>>>>>>> Hey John and Matthias, >>>>>>>>>>>> >>>>>>>>>>>> Yes, now I see it all. I’m storing lots of redundant information. >>>>>>>>>>>> Here is my final idea. Yes, now a user should pass a list type. I >>>>>>>>>>>> realized that’s the type is not really needed in ListSerializer, >>>>>>>>>>>> but >>>>>>>>>>>> only in ListDeserializer: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> In ListSerializer we will start storing sizes only if serializer is >>>>>>>>>>>> not a primitive serializer: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Then, in deserializer, we persist passed list type, so that during >>>>>>>>>>>> deserialization we could create an instance of it with predefined >>>>>>>>>>>> listSize for better performance. >>>>>>>>>>>> We also try to locate a primitiveSize based on passed deserializer. >>>>>>>>>>>> If it is not there, then primitiveSize will be null. Which means >>>>>>>>>>>> that each entry’s size was encoded individually. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> This looks much cleaner and more concise. >>>>>>>>>>>> >>>>>>>>>>>> What do you think? >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> Daniyar Yeralin >>>>>>>>>>>> >>>>>>>>>>>> On Jun 20, 2019, at 5:45 PM, Matthias J. Sax >>>>>>>>>>>> <matth...@confluent.io <mailto:matth...@confluent.io> >>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>> >>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> >>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>> >>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> >>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> For encoding the list-type: I see John's point about re-encoding >>>>>>>>>>>> the >>>>>>>>>>>> list-type redundantly. However, I also don't like the idea that the >>>>>>>>>>>> Deserializer returns a fixed type... >>>>>>>>>>>> >>>>>>>>>>>> Maybe it's best allow users to specify the target list type on >>>>>>>>>>>> deserialization via config? >>>>>>>>>>>> >>>>>>>>>>>> Similar for the primitive types: I don't think we need to encode >>>>>>>>>>>> the >>>>>>>>>>>> type size, but users could specify the type on the deserializer >>>>>>>>>>>> (via a >>>>>>>>>>>> config again)? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> About generics: nesting could be arbitrarily deep. Hence, I doubt >>>>>>>>>>>> we can >>>>>>>>>>>> support this and a cast will be necessary at some point in the user >>>>>>>>>>>> code. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -Matthias >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 6/20/19 1:21 PM, John Roesler wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hey Daniyar, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for looking at it! >>>>>>>>>>>> >>>>>>>>>>>> Something like your screenshot is more along the lines of what I >>>>>>>>>>>> was >>>>>>>>>>>> thinking. Sorry, but I didn't follow what you mean, how would that >>>>>>>>>>>> not >>>>>>>>>>>> be "vanilla java"? >>>>>>>>>>>> >>>>>>>>>>>> Unfortunately the deserializer needs more information, though. For >>>>>>>>>>>> example, what if the inner type is a Map<String,String>? The serde >>>>>>>>>>>> could >>>>>>>>>>>> only be used to produce a LinkedList<Map>, thus, we'd still need an >>>>>>>>>>>> inner serde, like you have in the KIP (Serde<T> innerSerde). >>>>>>>>>>>> >>>>>>>>>>>> Something more like Serde<LinkedList<MyRecord>> = Serdes.listSerde( >>>>>>>>>>>> /**list type**/ LinkedList.class, >>>>>>>>>>>> /**inner serde**/ new MyRecordSerde() >>>>>>>>>>>> ) >>>>>>>>>>>> >>>>>>>>>>>> And in configuration, it's something like: >>>>>>>>>>>> default.key.serde: org...ListSerde >>>>>>>>>>>> default.key.list.serde.type: java.util.LinkedList >>>>>>>>>>>> default.key.list.serde.inner: com.mycompany.MyRecordSerde >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> What do you think? >>>>>>>>>>>> Thanks, >>>>>>>>>>>> -John >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Jun 20, 2019 at 2:46 PM Development <d...@yeralin.net >>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>>>>> <mailto:d...@yeralin.net>> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hey John, >>>>>>>>>>>> >>>>>>>>>>>> I gave read about TypeReference. It could work for the list serde. >>>>>>>>>>>> However, it is not directly >>>>>>>>>>>> supported: >>>>>>>>>>>> https://github.com/FasterXML/jackson-databind/issues/1490 >>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490> >>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 >>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490>> >>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 >>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490> >>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 >>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490>>> >>>>>>>>>>>> The only way is to pass an actual class object into the >>>>>>>>>>>> constructor, >>>>>>>>>>>> something like: >>>>>>>>>>>> >>>>>>>>>>>> It could be an option, but not a pretty one. What do you think of >>>>>>>>>>>> my >>>>>>>>>>>> approach to use vanilla java and canonical class name? (As >>>>>>>>>>>> described >>>>>>>>>>>> previously) >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> Daniyar Yeralin >>>>>>>>>>>> >>>>>>>>>>>> On Jun 20, 2019, at 2:45 PM, Development <d...@yeralin.net >>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>>>>> <mailto:d...@yeralin.net>> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi John, >>>>>>>>>>>> >>>>>>>>>>>> Thank you for your input! Yes, my idea looks a little bit over >>>>>>>>>>>> engineered :) >>>>>>>>>>>> >>>>>>>>>>>> I also wanted to see a feedback from Mathias as well since he gave >>>>>>>>>>>> me an idea about storing fixed/variable size entries. >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> Daniyar Yeralin >>>>>>>>>>>> >>>>>>>>>>>> On Jun 18, 2019, at 6:06 PM, John Roesler <j...@confluent.io >>>>>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io >>>>>>>>>>>> <mailto:j...@confluent.io>> >>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> >>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>> >>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> >>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>> >>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> >>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>> >>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> >>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi Daniyar, >>>>>>>>>>>> >>>>>>>>>>>> That's a very clever solution! >>>>>>>>>>>> >>>>>>>>>>>> One observation is that, now, this is what we might call a >>>>>>>>>>>> polymorphic >>>>>>>>>>>> serde. That is, you're detecting the actual concrete type and then >>>>>>>>>>>> promising to produce the exact same concrete type on read. >>>>>>>>>>>> There are >>>>>>>>>>>> some inherent problems with this approach, which in general >>>>>>>>>>>> require >>>>>>>>>>>> some kind of schema registry (not necessarily Schema >>>>>>>>>>>> Registry, just >>>>>>>>>>>> any registry for schemas) to solve. >>>>>>>>>>>> >>>>>>>>>>>> Notice that every serialized record has quite a bit of duplicated >>>>>>>>>>>> information: the concrete type as well as a byte to indicate >>>>>>>>>>>> whether >>>>>>>>>>>> the value type is a fixed size, and, if so, an integer to >>>>>>>>>>>> indicate the >>>>>>>>>>>> actual size. These constitute a schema, of sorts, because they >>>>>>>>>>>> tell us >>>>>>>>>>>> later how exactly to deserialize the data. Unfortunately, this >>>>>>>>>>>> information is completely redundant. In all likelihood, the >>>>>>>>>>>> information will be exactly the same for every record in the >>>>>>>>>>>> topic. >>>>>>>>>>>> This problem is essentially the core motivation for serializations >>>>>>>>>>>> like Avro: to move the schema outside of the serialization >>>>>>>>>>>> itself, so >>>>>>>>>>>> that the records won't contain so much redundant information. >>>>>>>>>>>> >>>>>>>>>>>> In this light, I'm wondering if it makes sense to go back to >>>>>>>>>>>> something >>>>>>>>>>>> like what you had earlier in which you don't support perfectly >>>>>>>>>>>> preserving the concrete type for _this_ serde, but instead just >>>>>>>>>>>> support deserializing to _some_ List. Then, you could defer full, >>>>>>>>>>>> perfect, type preservation to serdes that have an external >>>>>>>>>>>> system in >>>>>>>>>>>> which to register their type information. >>>>>>>>>>>> >>>>>>>>>>>> There does exist an alternative, if we really do want to >>>>>>>>>>>> preserve the >>>>>>>>>>>> concrete type (which does seem kind of nice). You can add a >>>>>>>>>>>> configuration option specifically for the serde to configure >>>>>>>>>>>> what the >>>>>>>>>>>> list type will be, and maybe what the element type is, as well. >>>>>>>>>>>> >>>>>>>>>>>> As far as "related work" goes, you might be interested to take >>>>>>>>>>>> a look >>>>>>>>>>>> at how Jackson can be configured to deserialize into a specific, >>>>>>>>>>>> arbitrarily nested, generically parameterized class structure. >>>>>>>>>>>> Specifically, you might find >>>>>>>>>>>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html >>>>>>>>>>>> >>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html> >>>>>>>>>>>> >>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html >>>>>>>>>>>> >>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>> >>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html >>>>>>>>>>>> >>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html> >>>>>>>>>>>> >>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html >>>>>>>>>>>> >>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>>> >>>>>>>>>>>> interesting. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> -John >>>>>>>>>>>> >>>>>>>>>>>> On Mon, Jun 17, 2019 at 12:38 PM Development <d...@yeralin.net >>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net >>>>>>>>>>>> <mailto:d...@yeralin.net>> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> >>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> bump >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>> >> > >
signature.asc
Description: OpenPGP digital signature