Hi Matthias,

I agree, ConsumerConfig did not seem like a right place for these 
configurations.
I’ll put them in ProducerConfig, ConsumerConfig, and StreamsConfig.

However, I have a question. What should I do in "configure(Map<String, ?> 
configs, boolean isKey)” methods? Which configurations should I try to locate? 
I was comparing my (de)serializer implementations with 
SessionWindows(De)serializer classes, and they use StreamsConfig class to get  
either StreamsConfig.DEFAULT_WINDOWED_KEY_SERDE_INNER_CLASS : 
StreamsConfig.DEFAULT_WINDOWED_VALUE_SERDE_INNER_CLASS

In my case, as I mentioned earlier, StreamsConfig class is not accessible from 
org.apache.kafka.common.serialization package. So, I can’t utilize it. Any 
suggestions here?

Best,
Daniyar Yeralin


> On Jul 18, 2019, at 8:46 PM, Matthias J. Sax <matth...@confluent.io> wrote:
> 
> Thanks!
> 
> One minor question about the configs. The KIP adds three classes, a
> Serializer, a Deserializer, and a Serde.
> 
> Hence, would it make sense to add the corresponding configs to
> `ConsumerConfig`, `ProducerConfig`, and `StreamsConfig` using slightly
> different names each time?
> 
> 
> Somethin like this:
> 
> ProducerConfig:
> 
> list.key/value.serializer.type
> list.key/value.serializer.inner
> 
> ConsumerConfig:
> 
> list.key/value.deserializer.type
> list.key/value.deserializer.inner
> 
> StreamsConfig:
> 
> default.list.key/value.serde.type
> default.list.key/value.serde.inner
> 
> 
> Adding `d.l.k/v.serde.t/i` to `CommonClientConfigs does not sound right
> to me. Also note, that it seems better to avoid the `default.` prefix
> for consumers and producers because there is only one Serializer or
> Deserializer anyway. Only for Streams, there are multiple and
> StreamsConfig specifies the default one of an operator does not
> overwrite it.
> 
> Thoughts?
> 
> 
> Also, the KIP should explicitly mention to what classed certain configs
> are added. Atm, the KIP only list parameter names, but does not state
> where those are added.
> 
> 
> -Matthias
> 
> 
> 
> 
> 
> On 7/16/19 1:11 PM, Development wrote:
>> Hi,
>> 
>> Yes, totally forgot about the statement. KIP-466 is updated.
>> 
>> Thank you so much John Roesler, Matthias J. Sax, Sophie Blee-Goldman for 
>> your valuable input!
>> 
>> I hope I did not cause too much trouble :)
>> 
>> I’ll start the vote now.
>> 
>> Best,
>> Daniyar Yeralin
>> 
>>> On Jul 16, 2019, at 3:17 PM, John Roesler <j...@confluent.io> wrote:
>>> 
>>> Hi Daniyar,
>>> 
>>> Thanks for that update. I took a look, and I think this is in good shape.
>>> 
>>> One note, the statement "New method public static <T> Serde<List<T>>
>>> ListSerde() in org.apache.kafka.common.serialization.Serdes class
>>> (infers list implementation and inner serde from config file)" is
>>> still present in the KIP, although I do it is was removed from the PR.
>>> 
>>> Once you remove that statement from the KIP, then I think this KIP is
>>> ready to go up for a vote! Then, we can really review the PR in
>>> earnest and get this thing merged.
>>> 
>>> Thanks,
>>> -john
>>> 
>>> On Tue, Jul 16, 2019 at 2:05 PM Development <d...@yeralin.net> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> Pushed new changes under my PR: https://github.com/apache/kafka/pull/6592 
>>>> <https://github.com/apache/kafka/pull/6592>
>>>> 
>>>> Feel free to put any comments in there.
>>>> 
>>>> Best,
>>>> Daniyar Yeralin
>>>> 
>>>>> On Jul 15, 2019, at 1:06 PM, Development <d...@yeralin.net> wrote:
>>>>> 
>>>>> Hi John,
>>>>> 
>>>>> I knew I was missing something. Yes, that makes sense now, I removed all 
>>>>> `listSerde()` methods, and left empty constructors instead.
>>>>> 
>>>>> As per `CommonClientConfigs` I looked at the class, it doesn’t have any 
>>>>> properties related to serdes, and that bothers me a little.
>>>>> 
>>>>> All properties like `default.key.serde` `default.windowed.key.serde.*` 
>>>>> are located in StreamsConfig. I don’t want to create a confusion.
>>>>> What also doesn’t make sense to me is that `WindowedSerdes` and its 
>>>>> (de)serializers are not located in org.apache.kafka.common.serialization. 
>>>>> I guess it kind of makes sense since windowed serdes are only available 
>>>>> for kafka streams, not vice versa.
>>>>> 
>>>>> If everyone is okay to put list properties in `CommonClientConfigs` 
>>>>> class, I’ll go ahead and do that then.
>>>>> 
>>>>> Thank you for your input!
>>>>> 
>>>>> Best,
>>>>> Daniyar Yeralin
>>>>> 
>>>>>> On Jul 15, 2019, at 11:45 AM, John Roesler <j...@confluent.io> wrote:
>>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> Regarding the placement, you might as well move the constants to 
>>>>>> `org.apache.kafka.clients.CommonClientConfigs`, so that the constants 
>>>>>> and the configs and the code are in the same module.
>>>>>> 
>>>>>> Regarding the constructor... What Matthias said is correct: The serde, 
>>>>>> serializer, and deserializer all need to have zero-arg constructors so 
>>>>>> they can be instantiated reflectively by Kafka. However, the factory 
>>>>>> method you proposed "New method public static <T> Serde<List<T>> 
>>>>>> ListSerde()" is not a constructor, and is not required. It would be used 
>>>>>> purely from the Java interface, but has the drawbacks I listed above. 
>>>>>> This method, not the constructor, is what I proposed to remove from the 
>>>>>> KIP.
>>>>>> 
>>>>>> Thanks,
>>>>>> -John
>>>>>> 
>>>>>> 
>>>>>> On Mon, Jul 15, 2019 at 10:15 AM Development <d...@yeralin.net 
>>>>>> <mailto:d...@yeralin.net>> wrote:
>>>>>> One problem though.
>>>>>> 
>>>>>> Since WindowedSerde (Windowed(De)Serializer) are so similar, I’m trying 
>>>>>> to mimic the implementation of my ListSerde accordingly.
>>>>>> 
>>>>>> I created couple constants under StreamsConfig:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> And trying to do similar construct:
>>>>>> final String propertyName = isKey ? 
>>>>>> StreamsConfig.DEFAULT_LIST_KEY_SERDE_INNER_CLASS : 
>>>>>> StreamsConfig.DEFAULT_LIST_VALUE_SERDE_INNER_CLASS;
>>>>>> But then found out that StreamsConfig is not accessible from 
>>>>>> org.apache.kafka.common.serialization package while window serde 
>>>>>> (de)serializers are located under org.apache.kafka.streams.kstream 
>>>>>> package.
>>>>>> 
>>>>>> What should I do? Should I move my classes under 
>>>>>> org.apache.kafka.streams.kstream package instead?
>>>>>> 
>>>>>>> On Jul 15, 2019, at 10:45 AM, Development <d...@yeralin.net 
>>>>>>> <mailto:d...@yeralin.net>> wrote:
>>>>>>> 
>>>>>>> Hi Matthias,
>>>>>>> 
>>>>>>> Thank you for your input.
>>>>>>> 
>>>>>>> I updated the KIP, made it a little more readable.
>>>>>>> 
>>>>>>> I think the configuration parameters strategy is finalized then.
>>>>>>> 
>>>>>>> Do you have any other questions/concerns regarding this KIP?
>>>>>>> 
>>>>>>> Meanwhile I’ll start doing appropriate code changes, and commit them 
>>>>>>> under my PR.
>>>>>>> 
>>>>>>> Best,
>>>>>>> Daniyar Yeralin
>>>>>>> 
>>>>>>>> On Jul 11, 2019, at 2:44 PM, Matthias J. Sax <matth...@confluent.io 
>>>>>>>> <mailto:matth...@confluent.io>> wrote:
>>>>>>>> 
>>>>>>>> Daniyar,
>>>>>>>> 
>>>>>>>> thanks for the update to the KIP. It's in really good shape and well
>>>>>>>> written.
>>>>>>>> 
>>>>>>>> About the default constructor question:
>>>>>>>> 
>>>>>>>> All Serdes/Serializer/Deserializer classes need a default constructor 
>>>>>>>> to
>>>>>>>> create them easily via reflections when specifies in a config. I
>>>>>>>> understand that it is not super user friendly, but all existing code
>>>>>>>> works this way. Hence, it seems best to stick with the established 
>>>>>>>> pattern.
>>>>>>>> 
>>>>>>>> We have a similar issue with `TimeWindowedSerde` and
>>>>>>>> `SessionWindowedSerde`, and I just recently did a PR to improve user
>>>>>>>> experience that address the exact issue John raised. (cf
>>>>>>>> https://github.com/apache/kafka/pull/7067 
>>>>>>>> <https://github.com/apache/kafka/pull/7067>)
>>>>>>>> 
>>>>>>>> Note, that if a user would instantiate the Serde manually, the user
>>>>>>>> would also need to call `configure()` to setup the inner serdes. Kafka
>>>>>>>> Streams would not setup those automatically and one might most likely
>>>>>>>> end-up with an NPE.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Coming back the KIP, and the parameter names. `WindowedSerdes` are
>>>>>>>> similar to `ListSerde` as they wrap another Serde. For 
>>>>>>>> `WindowedSerdes`,
>>>>>>>> we use the following parameter names:
>>>>>>>> 
>>>>>>>> - default.windowed.key.serde.inner
>>>>>>>> - default.windowed.value.serde.inner
>>>>>>>> 
>>>>>>>> 
>>>>>>>> It might be good to align the naming pattern. I would also suggest to
>>>>>>>> use `type` instead of `impl`?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> default.key.list.serde.impl  ->  default.list.key.serde.type
>>>>>>>> default.value.list.serde.impl  ->  default.list.value.serde.type
>>>>>>>> default.key.list.serde.element  ->  default.list.key.serde.inner
>>>>>>>> default.value.list.serde.element  ->  default.list.value.serde.inner
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> -Matthias
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 7/10/19 8:52 AM, Development wrote:
>>>>>>>>> Hi John,
>>>>>>>>> 
>>>>>>>>> Yes, I do agree. That totally makes sense. The only thing is that it 
>>>>>>>>> goes against what Matthias suggested earlier:
>>>>>>>>> "I think that ... `ListSerde` should have an default constructor and 
>>>>>>>>> it should be possible to pass in the `Class listClass` information 
>>>>>>>>> via a configuration. Otherwise, KafkaStreams cannot use it as default 
>>>>>>>>> serde.”
>>>>>>>>> 
>>>>>>>>> What do you think about that? I hope I’m not confusing anything.
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Daniyar Yeralin
>>>>>>>>> 
>>>>>>>>>> On Jul 9, 2019, at 5:56 PM, John Roesler <j...@confluent.io 
>>>>>>>>>> <mailto:j...@confluent.io>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Ah, my apologies, I must have just overlooked it. Thanks for the 
>>>>>>>>>> update, too.
>>>>>>>>>> 
>>>>>>>>>> Just one more super-small question, do we need this variant:
>>>>>>>>>> 
>>>>>>>>>>> New method public static <T> Serde<List<T>> ListSerde() in 
>>>>>>>>>>> org.apache.kafka.common.serialization.Serdes class (infers list 
>>>>>>>>>>> implementation and inner serde from config file)
>>>>>>>>>> 
>>>>>>>>>> It seems like this situation implies my config file is already set 
>>>>>>>>>> up for the list serde, so passing this serde (e.g., in Produced) 
>>>>>>>>>> would have the same effect as not specifying it.
>>>>>>>>>> 
>>>>>>>>>> I guess that it could be the case that you have the 
>>>>>>>>>> `default.key/value.serde` set to something else, like StringSerde, 
>>>>>>>>>> but you still have the `default.key/value.list.serde.impl/element` 
>>>>>>>>>> set. This seems like it would result in more confusion than 
>>>>>>>>>> convenience, so my gut instinct is maybe we shouldn't introduce the 
>>>>>>>>>> `ListSerde()` variant until people actually request it later on.
>>>>>>>>>> 
>>>>>>>>>> Thus, we'd just stick with fully config-driven or fully 
>>>>>>>>>> source-code-driven, not half/half.
>>>>>>>>>> 
>>>>>>>>>> What do you think?
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> -John
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Tue, Jul 9, 2019 at 9:58 AM Development <d...@yeralin.net 
>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>> <mailto:d...@yeralin.net>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hi John,
>>>>>>>>>>> 
>>>>>>>>>>> I hope everyone had a great long weekend.
>>>>>>>>>>> 
>>>>>>>>>>> Regarding Java interfaces, I may not understand you correctly, but 
>>>>>>>>>>> I think I already listed them:
>>>>>>>>>>> 
>>>>>>>>>>> So for Produced, you would use it in the following fashion, for 
>>>>>>>>>>> example: Produced.keySerde(Serdes.ListSerde(ArrayList.class, 
>>>>>>>>>>> Serdes.Integer()))
>>>>>>>>>>> 
>>>>>>>>>>> I also updated the KIP, and added a section “Serialization 
>>>>>>>>>>> Strategy” where I describe our logic of conditional serialization 
>>>>>>>>>>> based on the type of an inner serde.
>>>>>>>>>>> 
>>>>>>>>>>> Thank you!
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>> 
>>>>>>>>>>> On Jun 26, 2019, at 11:44 AM, John Roesler <j...@confluent.io 
>>>>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io 
>>>>>>>>>>> <mailto:j...@confluent.io>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Thanks for the update, Daniyar!
>>>>>>>>>>> 
>>>>>>>>>>> In addition to specifying the config interface, can you also specify
>>>>>>>>>>> the Java interface? Namely, if I need to pass an instance of this
>>>>>>>>>>> serde in to the DSL directly, as in Produced, Materialized, etc., 
>>>>>>>>>>> what
>>>>>>>>>>> constructor(s) would I have available? Likewise with the Serializer
>>>>>>>>>>> and Deserailizer. I don't think you need to specify the 
>>>>>>>>>>> implementation
>>>>>>>>>>> logic, since we've already discussed it here.
>>>>>>>>>>> 
>>>>>>>>>>> If you also want to specify the serialized format of the data 
>>>>>>>>>>> records
>>>>>>>>>>> in the KIP, it could be useful documentation, as well as letting us
>>>>>>>>>>> verify the schema for forward/backward compatibility concerns, etc.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> John
>>>>>>>>>>> 
>>>>>>>>>>> On Wed, Jun 26, 2019 at 10:33 AM Development <d...@yeralin.net 
>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>> <mailto:d...@yeralin.net>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Hey,
>>>>>>>>>>> 
>>>>>>>>>>> Finally made updates to the KIP: 
>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>>>>>>  
>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization>
>>>>>>>>>>>  
>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>>>>>>  
>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization>>
>>>>>>>>>>>  
>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>>>>>>  
>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization>
>>>>>>>>>>>  
>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>>>>>>  
>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization>>>
>>>>>>>>>>> Sorry for the delay :)
>>>>>>>>>>> 
>>>>>>>>>>> Thank You!
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>> 
>>>>>>>>>>> On Jun 22, 2019, at 12:49 AM, Matthias J. Sax 
>>>>>>>>>>> <matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>> 
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Yes, something like this. I did not think about good configuration
>>>>>>>>>>> parameter names yet. I am also not sure if I understand all proposed
>>>>>>>>>>> configs atm. But all configs should be listed and explained in the 
>>>>>>>>>>> KIP
>>>>>>>>>>> anyway, and we can discuss further after you have updated the KIP 
>>>>>>>>>>> (I can
>>>>>>>>>>> ask more detailed question if I have any).
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> -Matthias
>>>>>>>>>>> 
>>>>>>>>>>> On 6/21/19 2:05 PM, Development wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Yes, you are right. ByteSerializer is not what I need to have in a 
>>>>>>>>>>> list
>>>>>>>>>>> of primitives.
>>>>>>>>>>> 
>>>>>>>>>>> As for the default constructor and configurability, just want to 
>>>>>>>>>>> make
>>>>>>>>>>> sure. Is this what you have on your mind?
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Jun 21, 2019, at 2:51 PM, Matthias J. Sax <matth...@confluent.io 
>>>>>>>>>>> <mailto:matth...@confluent.io> <mailto:matth...@confluent.io 
>>>>>>>>>>> <mailto:matth...@confluent.io>>
>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>>> 
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Thanks for the update!
>>>>>>>>>>> 
>>>>>>>>>>> I think that `ListDeserializer`, `ListSerializer`, and `ListSerde`
>>>>>>>>>>> should have an default constructor and it should be possible to 
>>>>>>>>>>> pass in
>>>>>>>>>>> the `Class listClass` information via a configuration. Otherwise,
>>>>>>>>>>> KafkaStreams cannot use it as default serde.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> For the primitive serializers: `BytesSerializer` is not primitive 
>>>>>>>>>>> IMHO,
>>>>>>>>>>> as is it for `byte[]` with variable length -- it's for arrays, not 
>>>>>>>>>>> for
>>>>>>>>>>> single `byte` (note, that `Bytes` is a Kafka class wrapping 
>>>>>>>>>>> `byte[]`).
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> For tests, we can comment on the PR. No need to do this in the KIP
>>>>>>>>>>> discussion.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Can you also update the KIP?
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> -Matthias
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On 6/21/19 11:29 AM, Development wrote:
>>>>>>>>>>> 
>>>>>>>>>>> I made and pushed necessary commits, so we could review the final
>>>>>>>>>>> version under PR https://github.com/apache/kafka/pull/6592 
>>>>>>>>>>> <https://github.com/apache/kafka/pull/6592> 
>>>>>>>>>>> <https://github.com/apache/kafka/pull/6592 
>>>>>>>>>>> <https://github.com/apache/kafka/pull/6592>>
>>>>>>>>>>> 
>>>>>>>>>>> I also need some advice on writing tests for this new serde. So far 
>>>>>>>>>>> I
>>>>>>>>>>> only have two test cases (roundtrip and empty payload), I’m not sure
>>>>>>>>>>> if it is enough.
>>>>>>>>>>> 
>>>>>>>>>>> Thank y’all for your help in this KIP :)
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Jun 21, 2019, at 1:44 PM, John Roesler <j...@confluent.io 
>>>>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io 
>>>>>>>>>>> <mailto:j...@confluent.io>>
>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hey Daniyar,
>>>>>>>>>>> 
>>>>>>>>>>> Looks good to me! Thanks for considering it.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> -John
>>>>>>>>>>> 
>>>>>>>>>>> On Fri, Jun 21, 2019 at 9:04 AM Development <d...@yeralin.net 
>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>> <mailto:d...@yeralin.net>>
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>> wrote:
>>>>>>>>>>> Hey John and Matthias,
>>>>>>>>>>> 
>>>>>>>>>>> Yes, now I see it all. I’m storing lots of redundant information.
>>>>>>>>>>> Here is my final idea. Yes, now a user should pass a list type. I
>>>>>>>>>>> realized that’s the type is not really needed in ListSerializer, but
>>>>>>>>>>> only in ListDeserializer:
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> In ListSerializer we will start storing sizes only if serializer is
>>>>>>>>>>> not a primitive serializer:
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Then, in deserializer, we persist passed list type, so that during
>>>>>>>>>>> deserialization we could create an instance of it with predefined
>>>>>>>>>>> listSize for better performance.
>>>>>>>>>>> We also try to locate a primitiveSize based on passed deserializer.
>>>>>>>>>>> If it is not there, then primitiveSize will be null. Which means
>>>>>>>>>>> that each entry’s size was encoded individually.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> This looks much cleaner and more concise.
>>>>>>>>>>> 
>>>>>>>>>>> What do you think?
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>> 
>>>>>>>>>>> On Jun 20, 2019, at 5:45 PM, Matthias J. Sax <matth...@confluent.io 
>>>>>>>>>>> <mailto:matth...@confluent.io> <mailto:matth...@confluent.io 
>>>>>>>>>>> <mailto:matth...@confluent.io>>
>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>> 
>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io> 
>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>>> 
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> For encoding the list-type: I see John's point about re-encoding the
>>>>>>>>>>> list-type redundantly. However, I also don't like the idea that the
>>>>>>>>>>> Deserializer returns a fixed type...
>>>>>>>>>>> 
>>>>>>>>>>> Maybe it's best allow users to specify the target list type on
>>>>>>>>>>> deserialization via config?
>>>>>>>>>>> 
>>>>>>>>>>> Similar for the primitive types: I don't think we need to encode the
>>>>>>>>>>> type size, but users could specify the type on the deserializer 
>>>>>>>>>>> (via a
>>>>>>>>>>> config again)?
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> About generics: nesting could be arbitrarily deep. Hence, I doubt
>>>>>>>>>>> we can
>>>>>>>>>>> support this and a cast will be necessary at some point in the user
>>>>>>>>>>> code.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> -Matthias
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On 6/20/19 1:21 PM, John Roesler wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hey Daniyar,
>>>>>>>>>>> 
>>>>>>>>>>> Thanks for looking at it!
>>>>>>>>>>> 
>>>>>>>>>>> Something like your screenshot is more along the lines of what I was
>>>>>>>>>>> thinking. Sorry, but I didn't follow what you mean, how would that 
>>>>>>>>>>> not
>>>>>>>>>>> be "vanilla java"?
>>>>>>>>>>> 
>>>>>>>>>>> Unfortunately the deserializer needs more information, though. For
>>>>>>>>>>> example, what if the inner type is a Map<String,String>? The serde
>>>>>>>>>>> could
>>>>>>>>>>> only be used to produce a LinkedList<Map>, thus, we'd still need an
>>>>>>>>>>> inner serde, like you have in the KIP (Serde<T> innerSerde).
>>>>>>>>>>> 
>>>>>>>>>>> Something more like Serde<LinkedList<MyRecord>> = Serdes.listSerde(
>>>>>>>>>>> /**list type**/ LinkedList.class,
>>>>>>>>>>> /**inner serde**/ new MyRecordSerde()
>>>>>>>>>>> )
>>>>>>>>>>> 
>>>>>>>>>>> And in configuration, it's something like:
>>>>>>>>>>> default.key.serde: org...ListSerde
>>>>>>>>>>> default.key.list.serde.type: java.util.LinkedList
>>>>>>>>>>> default.key.list.serde.inner: com.mycompany.MyRecordSerde
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> What do you think?
>>>>>>>>>>> Thanks,
>>>>>>>>>>> -John
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, Jun 20, 2019 at 2:46 PM Development <d...@yeralin.net 
>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>> <mailto:d...@yeralin.net>>
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hey John,
>>>>>>>>>>> 
>>>>>>>>>>> I gave read about TypeReference. It could work for the list serde.
>>>>>>>>>>> However, it is not directly
>>>>>>>>>>> supported:
>>>>>>>>>>> https://github.com/FasterXML/jackson-databind/issues/1490 
>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490> 
>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 
>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490>>
>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 
>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490> 
>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490 
>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490>>>
>>>>>>>>>>> The only way is to pass an actual class object into the constructor,
>>>>>>>>>>> something like:
>>>>>>>>>>> 
>>>>>>>>>>> It could be an option, but not a pretty one. What do you think of my
>>>>>>>>>>> approach to use vanilla java and canonical class name? (As described
>>>>>>>>>>> previously)
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>> 
>>>>>>>>>>> On Jun 20, 2019, at 2:45 PM, Development <d...@yeralin.net 
>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>> <mailto:d...@yeralin.net>>
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hi John,
>>>>>>>>>>> 
>>>>>>>>>>> Thank you for your input! Yes, my idea looks a little bit over
>>>>>>>>>>> engineered :)
>>>>>>>>>>> 
>>>>>>>>>>> I also wanted to see a feedback from Mathias as well since he gave
>>>>>>>>>>> me an idea about storing fixed/variable size entries.
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>> 
>>>>>>>>>>> On Jun 18, 2019, at 6:06 PM, John Roesler <j...@confluent.io 
>>>>>>>>>>> <mailto:j...@confluent.io> <mailto:j...@confluent.io 
>>>>>>>>>>> <mailto:j...@confluent.io>>
>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>> 
>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>
>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>> 
>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io> 
>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hi Daniyar,
>>>>>>>>>>> 
>>>>>>>>>>> That's a very clever solution!
>>>>>>>>>>> 
>>>>>>>>>>> One observation is that, now, this is what we might call a
>>>>>>>>>>> polymorphic
>>>>>>>>>>> serde. That is, you're detecting the actual concrete type and then
>>>>>>>>>>> promising to produce the exact same concrete type on read.
>>>>>>>>>>> There are
>>>>>>>>>>> some inherent problems with this approach, which in general
>>>>>>>>>>> require
>>>>>>>>>>> some kind of  schema registry (not necessarily Schema
>>>>>>>>>>> Registry, just
>>>>>>>>>>> any registry for schemas) to solve.
>>>>>>>>>>> 
>>>>>>>>>>> Notice that every serialized record has quite a bit of duplicated
>>>>>>>>>>> information: the concrete type as well as a byte to indicate
>>>>>>>>>>> whether
>>>>>>>>>>> the value type is a fixed size, and, if so, an integer to
>>>>>>>>>>> indicate the
>>>>>>>>>>> actual size. These constitute a schema, of sorts, because they
>>>>>>>>>>> tell us
>>>>>>>>>>> later how exactly to deserialize the data. Unfortunately, this
>>>>>>>>>>> information is completely redundant. In all likelihood, the
>>>>>>>>>>> information will be exactly the same for every record in the
>>>>>>>>>>> topic.
>>>>>>>>>>> This problem is essentially the core motivation for serializations
>>>>>>>>>>> like Avro: to move the schema outside of the serialization
>>>>>>>>>>> itself, so
>>>>>>>>>>> that the records won't contain so much redundant information.
>>>>>>>>>>> 
>>>>>>>>>>> In this light, I'm wondering if it makes sense to go back to
>>>>>>>>>>> something
>>>>>>>>>>> like what you had earlier in which you don't support perfectly
>>>>>>>>>>> preserving the concrete type for _this_ serde, but instead just
>>>>>>>>>>> support deserializing to _some_ List. Then, you could defer full,
>>>>>>>>>>> perfect, type preservation to serdes that have an external
>>>>>>>>>>> system in
>>>>>>>>>>> which to register their type information.
>>>>>>>>>>> 
>>>>>>>>>>> There does exist an alternative, if we really do want to
>>>>>>>>>>> preserve the
>>>>>>>>>>> concrete type (which does seem kind of nice). You can add a
>>>>>>>>>>> configuration option specifically for the serde to configure
>>>>>>>>>>> what the
>>>>>>>>>>> list type will be, and maybe what the element type is, as well.
>>>>>>>>>>> 
>>>>>>>>>>> As far as "related work" goes, you might be interested to take
>>>>>>>>>>> a look
>>>>>>>>>>> at how Jackson can be configured to deserialize into a specific,
>>>>>>>>>>> arbitrarily nested, generically parameterized class structure.
>>>>>>>>>>> Specifically, you might find
>>>>>>>>>>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>>>>>>>>>  
>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>
>>>>>>>>>>>  
>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>>>>>>>>>  
>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>>
>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>>>>>>>>>  
>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>
>>>>>>>>>>>  
>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>>>>>>>>>  
>>>>>>>>>>> <https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>>>
>>>>>>>>>>> interesting.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> -John
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Jun 17, 2019 at 12:38 PM Development <d...@yeralin.net 
>>>>>>>>>>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>>>>>>>>>>> <mailto:d...@yeralin.net>>
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> 
>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> bump
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> 
> 

Reply via email to