>> Just to make sure I understand the problem you're highlighting:
>> I guess the difference is that the serializer and deserializer that are
>> nested inside the serde also need to be configured? So, by default I'd have
>> to specify all six configs when I'm using Streams?

That is not the problem. And you actually describe the solution for it
yourself:

>> I guess in the Serde, it could make use of a package-protected constructor
>> for the serializer and deserializer that fixes the list type and inner type
>> to the serde-configured ones. Then, when you're configuring Streams, you
>> only need to specify the StreamsConfigs.




The problem is, that `ListSerde` is in package `clients` and thus
`ListSerde` cannot access `StreamsConfig`, and hence cannot use
`StreamsConfig#DEFAULT_LIST_KEY_SERDE_TYPE` (and others). Therefore, we
either need to hard-code strings literal for the config names (what does
not sound right) or add `CommonClientConfig#DEFAULT_LIST_KEY_SERDE_TYPE`
(and others).

In StreamsConfig we would just redefine them for convenience:

> public static final String DEFAULT_LIST_KEY_SERDE_TYPE = 
> CommonClientConfig#DEFAULT_LIST_KEY_SERDE_TYPE;


Note that `TimeWindowSerde` is contained in `streams` package and thus
it can access `StreamsConfig` and
`StreamsConfig#DEFAULT_WINDOWED_KEY_SERDE_INNER_CLASS`.




Btw: I just realized that we actually don't need `ProducerConfig`

> list.key/value.serializer.type

because the list-type is irrelevant on write. We only need `inner` config.



-Matthias


On 7/23/19 1:30 PM, John Roesler wrote:
> Hmm, that's a tricky situation.
> 
> I think Daniyar was on the right track... Producer only cares about
> serializer configs, and Consumer only cares about deserializer configs.
> 
> I didn't see the problem with your proposal:
> 
> ProducerConfig:
>> list.key/value.serializer.type
>> list.key/value.serializer.inner
>> ConsumerConfig:
>> list.key/value.deserializer.type
>> list.key/value.deserializer.inner
>> StreamsConfig:
>> default.list.key/value.serde.type
>> default.list.key/value.serde.inner
> 
> 
> It seems like the key/value serde configs are a better analogy than the
> windowed serde.
> ProducerConfig: key.serializer
> ConsumerConfig: key.deserializer
> StreamsConfig: default.key.serde
> 
> Just to make sure I understand the problem you're highlighting:
> I guess the difference is that the serializer and deserializer that are
> nested inside the serde also need to be configured? So, by default I'd have
> to specify all six configs when I'm using Streams?
> 
> I guess in the Serde, it could make use of a package-protected constructor
> for the serializer and deserializer that fixes the list type and inner type
> to the serde-configured ones. Then, when you're configuring Streams, you
> only need to specify the StreamsConfigs.
> 
> Does that work?
> -John
> 
> 
> On Tue, Jul 23, 2019 at 11:39 AM Development <d...@yeralin.net> wrote:
> 
>> Bump
>>
>>> On Jul 22, 2019, at 11:26 AM, Development <d...@yeralin.net> wrote:
>>>
>>> Hey Matthias,
>>>
>>> It looks a little confusing, but I don’t have enough expertise to judge
>> on the configuration placement.
>>>
>>> If you think, it is fine I’ll go ahead with this approach.
>>>
>>> Best,
>>> Daniyar Yeralin
>>>
>>>> On Jul 19, 2019, at 5:49 PM, Matthias J. Sax <matth...@confluent.io>
>> wrote:
>>>>
>>>> Good point.
>>>>
>>>> I guess the simplest solution is, to actually add
>>>>
>>>>>> default.list.key/value.serde.type
>>>>>> default.list.key/value.serde.inner
>>>>
>>>> to both `CommonClientConfigs` and `StreamsConfig`.
>>>>
>>>> It's not super clean, but I think it's the best we can do. Thoughts?
>>>>
>>>>
>>>> -Matthias
>>>>
>>>> On 7/19/19 1:23 PM, Development wrote:
>>>>> Hi Matthias,
>>>>>
>>>>> I agree, ConsumerConfig did not seem like a right place for these
>> configurations.
>>>>> I’ll put them in ProducerConfig, ConsumerConfig, and StreamsConfig.
>>>>>
>>>>> However, I have a question. What should I do in "configure(Map<String,
>> ?> configs, boolean isKey)” methods? Which configurations should I try to
>> locate? I was comparing my (de)serializer implementations with
>> SessionWindows(De)serializer classes, and they use StreamsConfig class to
>> get  either StreamsConfig.DEFAULT_WINDOWED_KEY_SERDE_INNER_CLASS :
>> StreamsConfig.DEFAULT_WINDOWED_VALUE_SERDE_INNER_CLASS
>>>>>
>>>>> In my case, as I mentioned earlier, StreamsConfig class is not
>> accessible from org.apache.kafka.common.serialization package. So, I can’t
>> utilize it. Any suggestions here?
>>>>>
>>>>> Best,
>>>>> Daniyar Yeralin
>>>>>
>>>>>
>>>>>> On Jul 18, 2019, at 8:46 PM, Matthias J. Sax <matth...@confluent.io>
>> wrote:
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> One minor question about the configs. The KIP adds three classes, a
>>>>>> Serializer, a Deserializer, and a Serde.
>>>>>>
>>>>>> Hence, would it make sense to add the corresponding configs to
>>>>>> `ConsumerConfig`, `ProducerConfig`, and `StreamsConfig` using slightly
>>>>>> different names each time?
>>>>>>
>>>>>>
>>>>>> Somethin like this:
>>>>>>
>>>>>> ProducerConfig:
>>>>>>
>>>>>> list.key/value.serializer.type
>>>>>> list.key/value.serializer.inner
>>>>>>
>>>>>> ConsumerConfig:
>>>>>>
>>>>>> list.key/value.deserializer.type
>>>>>> list.key/value.deserializer.inner
>>>>>>
>>>>>> StreamsConfig:
>>>>>>
>>>>>> default.list.key/value.serde.type
>>>>>> default.list.key/value.serde.inner
>>>>>>
>>>>>>
>>>>>> Adding `d.l.k/v.serde.t/i` to `CommonClientConfigs does not sound
>> right
>>>>>> to me. Also note, that it seems better to avoid the `default.` prefix
>>>>>> for consumers and producers because there is only one Serializer or
>>>>>> Deserializer anyway. Only for Streams, there are multiple and
>>>>>> StreamsConfig specifies the default one of an operator does not
>>>>>> overwrite it.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>>
>>>>>> Also, the KIP should explicitly mention to what classed certain
>> configs
>>>>>> are added. Atm, the KIP only list parameter names, but does not state
>>>>>> where those are added.
>>>>>>
>>>>>>
>>>>>> -Matthias
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 7/16/19 1:11 PM, Development wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Yes, totally forgot about the statement. KIP-466 is updated.
>>>>>>>
>>>>>>> Thank you so much John Roesler, Matthias J. Sax, Sophie Blee-Goldman
>> for your valuable input!
>>>>>>>
>>>>>>> I hope I did not cause too much trouble :)
>>>>>>>
>>>>>>> I’ll start the vote now.
>>>>>>>
>>>>>>> Best,
>>>>>>> Daniyar Yeralin
>>>>>>>
>>>>>>>> On Jul 16, 2019, at 3:17 PM, John Roesler <j...@confluent.io>
>> wrote:
>>>>>>>>
>>>>>>>> Hi Daniyar,
>>>>>>>>
>>>>>>>> Thanks for that update. I took a look, and I think this is in good
>> shape.
>>>>>>>>
>>>>>>>> One note, the statement "New method public static <T> Serde<List<T>>
>>>>>>>> ListSerde() in org.apache.kafka.common.serialization.Serdes class
>>>>>>>> (infers list implementation and inner serde from config file)" is
>>>>>>>> still present in the KIP, although I do it is was removed from the
>> PR.
>>>>>>>>
>>>>>>>> Once you remove that statement from the KIP, then I think this KIP
>> is
>>>>>>>> ready to go up for a vote! Then, we can really review the PR in
>>>>>>>> earnest and get this thing merged.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -john
>>>>>>>>
>>>>>>>> On Tue, Jul 16, 2019 at 2:05 PM Development <d...@yeralin.net>
>> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Pushed new changes under my PR:
>> https://github.com/apache/kafka/pull/6592 <
>> https://github.com/apache/kafka/pull/6592>
>>>>>>>>>
>>>>>>>>> Feel free to put any comments in there.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Daniyar Yeralin
>>>>>>>>>
>>>>>>>>>> On Jul 15, 2019, at 1:06 PM, Development <d...@yeralin.net> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi John,
>>>>>>>>>>
>>>>>>>>>> I knew I was missing something. Yes, that makes sense now, I
>> removed all `listSerde()` methods, and left empty constructors instead.
>>>>>>>>>>
>>>>>>>>>> As per `CommonClientConfigs` I looked at the class, it doesn’t
>> have any properties related to serdes, and that bothers me a little.
>>>>>>>>>>
>>>>>>>>>> All properties like `default.key.serde`
>> `default.windowed.key.serde.*` are located in StreamsConfig. I don’t want
>> to create a confusion.
>>>>>>>>>> What also doesn’t make sense to me is that `WindowedSerdes` and
>> its (de)serializers are not located in
>> org.apache.kafka.common.serialization. I guess it kind of makes sense since
>> windowed serdes are only available for kafka streams, not vice versa.
>>>>>>>>>>
>>>>>>>>>> If everyone is okay to put list properties in
>> `CommonClientConfigs` class, I’ll go ahead and do that then.
>>>>>>>>>>
>>>>>>>>>> Thank you for your input!
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>
>>>>>>>>>>> On Jul 15, 2019, at 11:45 AM, John Roesler <j...@confluent.io>
>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> Regarding the placement, you might as well move the constants to
>> `org.apache.kafka.clients.CommonClientConfigs`, so that the constants and
>> the configs and the code are in the same module.
>>>>>>>>>>>
>>>>>>>>>>> Regarding the constructor... What Matthias said is correct: The
>> serde, serializer, and deserializer all need to have zero-arg constructors
>> so they can be instantiated reflectively by Kafka. However, the factory
>> method you proposed "New method public static <T> Serde<List<T>>
>> ListSerde()" is not a constructor, and is not required. It would be used
>> purely from the Java interface, but has the drawbacks I listed above. This
>> method, not the constructor, is what I proposed to remove from the KIP.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> -John
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jul 15, 2019 at 10:15 AM Development <d...@yeralin.net
>> <mailto:d...@yeralin.net>> wrote:
>>>>>>>>>>> One problem though.
>>>>>>>>>>>
>>>>>>>>>>> Since WindowedSerde (Windowed(De)Serializer) are so similar, I’m
>> trying to mimic the implementation of my ListSerde accordingly.
>>>>>>>>>>>
>>>>>>>>>>> I created couple constants under StreamsConfig:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> And trying to do similar construct:
>>>>>>>>>>> final String propertyName = isKey ?
>> StreamsConfig.DEFAULT_LIST_KEY_SERDE_INNER_CLASS :
>> StreamsConfig.DEFAULT_LIST_VALUE_SERDE_INNER_CLASS;
>>>>>>>>>>> But then found out that StreamsConfig is not accessible from
>> org.apache.kafka.common.serialization package while window serde
>> (de)serializers are located under org.apache.kafka.streams.kstream package.
>>>>>>>>>>>
>>>>>>>>>>> What should I do? Should I move my classes under
>> org.apache.kafka.streams.kstream package instead?
>>>>>>>>>>>
>>>>>>>>>>>> On Jul 15, 2019, at 10:45 AM, Development <d...@yeralin.net
>> <mailto:d...@yeralin.net>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Matthias,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for your input.
>>>>>>>>>>>>
>>>>>>>>>>>> I updated the KIP, made it a little more readable.
>>>>>>>>>>>>
>>>>>>>>>>>> I think the configuration parameters strategy is finalized then.
>>>>>>>>>>>>
>>>>>>>>>>>> Do you have any other questions/concerns regarding this KIP?
>>>>>>>>>>>>
>>>>>>>>>>>> Meanwhile I’ll start doing appropriate code changes, and commit
>> them under my PR.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>
>>>>>>>>>>>>> On Jul 11, 2019, at 2:44 PM, Matthias J. Sax <
>> matth...@confluent.io <mailto:matth...@confluent.io>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Daniyar,
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks for the update to the KIP. It's in really good shape
>> and well
>>>>>>>>>>>>> written.
>>>>>>>>>>>>>
>>>>>>>>>>>>> About the default constructor question:
>>>>>>>>>>>>>
>>>>>>>>>>>>> All Serdes/Serializer/Deserializer classes need a default
>> constructor to
>>>>>>>>>>>>> create them easily via reflections when specifies in a config.
>> I
>>>>>>>>>>>>> understand that it is not super user friendly, but all
>> existing code
>>>>>>>>>>>>> works this way. Hence, it seems best to stick with the
>> established pattern.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We have a similar issue with `TimeWindowedSerde` and
>>>>>>>>>>>>> `SessionWindowedSerde`, and I just recently did a PR to
>> improve user
>>>>>>>>>>>>> experience that address the exact issue John raised. (cf
>>>>>>>>>>>>> https://github.com/apache/kafka/pull/7067 <
>> https://github.com/apache/kafka/pull/7067>)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Note, that if a user would instantiate the Serde manually, the
>> user
>>>>>>>>>>>>> would also need to call `configure()` to setup the inner
>> serdes. Kafka
>>>>>>>>>>>>> Streams would not setup those automatically and one might most
>> likely
>>>>>>>>>>>>> end-up with an NPE.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Coming back the KIP, and the parameter names. `WindowedSerdes`
>> are
>>>>>>>>>>>>> similar to `ListSerde` as they wrap another Serde. For
>> `WindowedSerdes`,
>>>>>>>>>>>>> we use the following parameter names:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - default.windowed.key.serde.inner
>>>>>>>>>>>>> - default.windowed.value.serde.inner
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> It might be good to align the naming pattern. I would also
>> suggest to
>>>>>>>>>>>>> use `type` instead of `impl`?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> default.key.list.serde.impl  ->  default.list.key.serde.type
>>>>>>>>>>>>> default.value.list.serde.impl  ->
>> default.list.value.serde.type
>>>>>>>>>>>>> default.key.list.serde.element  ->
>> default.list.key.serde.inner
>>>>>>>>>>>>> default.value.list.serde.element  ->
>> default.list.value.serde.inner
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/10/19 8:52 AM, Development wrote:
>>>>>>>>>>>>>> Hi John,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, I do agree. That totally makes sense. The only thing is
>> that it goes against what Matthias suggested earlier:
>>>>>>>>>>>>>> "I think that ... `ListSerde` should have an default
>> constructor and it should be possible to pass in the `Class listClass`
>> information via a configuration. Otherwise, KafkaStreams cannot use it as
>> default serde.”
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What do you think about that? I hope I’m not confusing
>> anything.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Jul 9, 2019, at 5:56 PM, John Roesler <j...@confluent.io
>> <mailto:j...@confluent.io>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ah, my apologies, I must have just overlooked it. Thanks for
>> the update, too.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Just one more super-small question, do we need this variant:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> New method public static <T> Serde<List<T>> ListSerde() in
>> org.apache.kafka.common.serialization.Serdes class (infers list
>> implementation and inner serde from config file)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It seems like this situation implies my config file is
>> already set up for the list serde, so passing this serde (e.g., in
>> Produced) would have the same effect as not specifying it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I guess that it could be the case that you have the
>> `default.key/value.serde` set to something else, like StringSerde, but you
>> still have the `default.key/value.list.serde.impl/element` set. This seems
>> like it would result in more confusion than convenience, so my gut instinct
>> is maybe we shouldn't introduce the `ListSerde()` variant until people
>> actually request it later on.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thus, we'd just stick with fully config-driven or fully
>> source-code-driven, not half/half.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Jul 9, 2019 at 9:58 AM Development <d...@yeralin.net
>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>> <mailto:d...@yeralin.net>>>
>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi John,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I hope everyone had a great long weekend.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regarding Java interfaces, I may not understand you
>> correctly, but I think I already listed them:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So for Produced, you would use it in the following fashion,
>> for example: Produced.keySerde(Serdes.ListSerde(ArrayList.class,
>> Serdes.Integer()))
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I also updated the KIP, and added a section “Serialization
>> Strategy” where I describe our logic of conditional serialization based on
>> the type of an inner serde.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Jun 26, 2019, at 11:44 AM, John Roesler <
>> j...@confluent.io <mailto:j...@confluent.io> <mailto:j...@confluent.io
>> <mailto:j...@confluent.io>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for the update, Daniyar!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In addition to specifying the config interface, can you
>> also specify
>>>>>>>>>>>>>>>> the Java interface? Namely, if I need to pass an instance
>> of this
>>>>>>>>>>>>>>>> serde in to the DSL directly, as in Produced, Materialized,
>> etc., what
>>>>>>>>>>>>>>>> constructor(s) would I have available? Likewise with the
>> Serializer
>>>>>>>>>>>>>>>> and Deserailizer. I don't think you need to specify the
>> implementation
>>>>>>>>>>>>>>>> logic, since we've already discussed it here.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If you also want to specify the serialized format of the
>> data records
>>>>>>>>>>>>>>>> in the KIP, it could be useful documentation, as well as
>> letting us
>>>>>>>>>>>>>>>> verify the schema for forward/backward compatibility
>> concerns, etc.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Jun 26, 2019 at 10:33 AM Development <
>> d...@yeralin.net <mailto:d...@yeralin.net> <mailto:d...@yeralin.net <mailto:
>> d...@yeralin.net>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hey,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Finally made updates to the KIP:
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
>> <
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization>
>> <
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
>> <
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization>>
>> <
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization
>> <
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization>
>> <
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization
>> <
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466:+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>
>>>>>>>>>>>>>>>> Sorry for the delay :)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank You!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Jun 22, 2019, at 12:49 AM, Matthias J. Sax <
>> matth...@confluent.io <mailto:matth...@confluent.io> <mailto:
>> matth...@confluent.io <mailto:matth...@confluent.io>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yes, something like this. I did not think about good
>> configuration
>>>>>>>>>>>>>>>> parameter names yet. I am also not sure if I understand all
>> proposed
>>>>>>>>>>>>>>>> configs atm. But all configs should be listed and explained
>> in the KIP
>>>>>>>>>>>>>>>> anyway, and we can discuss further after you have updated
>> the KIP (I can
>>>>>>>>>>>>>>>> ask more detailed question if I have any).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 6/21/19 2:05 PM, Development wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yes, you are right. ByteSerializer is not what I need to
>> have in a list
>>>>>>>>>>>>>>>> of primitives.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As for the default constructor and configurability, just
>> want to make
>>>>>>>>>>>>>>>> sure. Is this what you have on your mind?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Jun 21, 2019, at 2:51 PM, Matthias J. Sax <
>> matth...@confluent.io <mailto:matth...@confluent.io> <mailto:
>> matth...@confluent.io <mailto:matth...@confluent.io>>
>>>>>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>
>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for the update!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think that `ListDeserializer`, `ListSerializer`, and
>> `ListSerde`
>>>>>>>>>>>>>>>> should have an default constructor and it should be
>> possible to pass in
>>>>>>>>>>>>>>>> the `Class listClass` information via a configuration.
>> Otherwise,
>>>>>>>>>>>>>>>> KafkaStreams cannot use it as default serde.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For the primitive serializers: `BytesSerializer` is not
>> primitive IMHO,
>>>>>>>>>>>>>>>> as is it for `byte[]` with variable length -- it's for
>> arrays, not for
>>>>>>>>>>>>>>>> single `byte` (note, that `Bytes` is a Kafka class wrapping
>> `byte[]`).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For tests, we can comment on the PR. No need to do this in
>> the KIP
>>>>>>>>>>>>>>>> discussion.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Can you also update the KIP?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 6/21/19 11:29 AM, Development wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I made and pushed necessary commits, so we could review the
>> final
>>>>>>>>>>>>>>>> version under PR https://github.com/apache/kafka/pull/6592
>> <https://github.com/apache/kafka/pull/6592> <
>> https://github.com/apache/kafka/pull/6592 <
>> https://github.com/apache/kafka/pull/6592>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I also need some advice on writing tests for this new
>> serde. So far I
>>>>>>>>>>>>>>>> only have two test cases (roundtrip and empty payload), I’m
>> not sure
>>>>>>>>>>>>>>>> if it is enough.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank y’all for your help in this KIP :)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Jun 21, 2019, at 1:44 PM, John Roesler <
>> j...@confluent.io <mailto:j...@confluent.io> <mailto:j...@confluent.io
>> <mailto:j...@confluent.io>>
>>>>>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>
>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hey Daniyar,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Looks good to me! Thanks for considering it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Jun 21, 2019 at 9:04 AM Development <
>> d...@yeralin.net <mailto:d...@yeralin.net> <mailto:d...@yeralin.net <mailto:
>> d...@yeralin.net>>
>>>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> <mailto:
>> d...@yeralin.net <mailto:d...@yeralin.net>>> <mailto:d...@yeralin.net
>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net 
>> <mailto:d...@yeralin.net>>>>
>> wrote:
>>>>>>>>>>>>>>>> Hey John and Matthias,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yes, now I see it all. I’m storing lots of redundant
>> information.
>>>>>>>>>>>>>>>> Here is my final idea. Yes, now a user should pass a list
>> type. I
>>>>>>>>>>>>>>>> realized that’s the type is not really needed in
>> ListSerializer, but
>>>>>>>>>>>>>>>> only in ListDeserializer:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In ListSerializer we will start storing sizes only if
>> serializer is
>>>>>>>>>>>>>>>> not a primitive serializer:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Then, in deserializer, we persist passed list type, so that
>> during
>>>>>>>>>>>>>>>> deserialization we could create an instance of it with
>> predefined
>>>>>>>>>>>>>>>> listSize for better performance.
>>>>>>>>>>>>>>>> We also try to locate a primitiveSize based on passed
>> deserializer.
>>>>>>>>>>>>>>>> If it is not there, then primitiveSize will be null. Which
>> means
>>>>>>>>>>>>>>>> that each entry’s size was encoded individually.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This looks much cleaner and more concise.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Jun 20, 2019, at 5:45 PM, Matthias J. Sax <
>> matth...@confluent.io <mailto:matth...@confluent.io> <mailto:
>> matth...@confluent.io <mailto:matth...@confluent.io>>
>>>>>>>>>>>>>>>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>
>> <mailto:matth...@confluent.io <mailto:matth...@confluent.io>>> <mailto:
>> matth...@confluent.io <mailto:matth...@confluent.io> <mailto:
>> matth...@confluent.io <mailto:matth...@confluent.io>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For encoding the list-type: I see John's point about
>> re-encoding the
>>>>>>>>>>>>>>>> list-type redundantly. However, I also don't like the idea
>> that the
>>>>>>>>>>>>>>>> Deserializer returns a fixed type...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Maybe it's best allow users to specify the target list type
>> on
>>>>>>>>>>>>>>>> deserialization via config?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Similar for the primitive types: I don't think we need to
>> encode the
>>>>>>>>>>>>>>>> type size, but users could specify the type on the
>> deserializer (via a
>>>>>>>>>>>>>>>> config again)?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> About generics: nesting could be arbitrarily deep. Hence, I
>> doubt
>>>>>>>>>>>>>>>> we can
>>>>>>>>>>>>>>>> support this and a cast will be necessary at some point in
>> the user
>>>>>>>>>>>>>>>> code.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 6/20/19 1:21 PM, John Roesler wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hey Daniyar,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for looking at it!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Something like your screenshot is more along the lines of
>> what I was
>>>>>>>>>>>>>>>> thinking. Sorry, but I didn't follow what you mean, how
>> would that not
>>>>>>>>>>>>>>>> be "vanilla java"?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Unfortunately the deserializer needs more information,
>> though. For
>>>>>>>>>>>>>>>> example, what if the inner type is a Map<String,String>?
>> The serde
>>>>>>>>>>>>>>>> could
>>>>>>>>>>>>>>>> only be used to produce a LinkedList<Map>, thus, we'd still
>> need an
>>>>>>>>>>>>>>>> inner serde, like you have in the KIP (Serde<T> innerSerde).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Something more like Serde<LinkedList<MyRecord>> =
>> Serdes.listSerde(
>>>>>>>>>>>>>>>> /**list type**/ LinkedList.class,
>>>>>>>>>>>>>>>> /**inner serde**/ new MyRecordSerde()
>>>>>>>>>>>>>>>> )
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> And in configuration, it's something like:
>>>>>>>>>>>>>>>> default.key.serde: org...ListSerde
>>>>>>>>>>>>>>>> default.key.list.serde.type: java.util.LinkedList
>>>>>>>>>>>>>>>> default.key.list.serde.inner: com.mycompany.MyRecordSerde
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Jun 20, 2019 at 2:46 PM Development <
>> d...@yeralin.net <mailto:d...@yeralin.net> <mailto:d...@yeralin.net <mailto:
>> d...@yeralin.net>>
>>>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> <mailto:
>> d...@yeralin.net <mailto:d...@yeralin.net>>> <mailto:d...@yeralin.net
>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net <mailto:d...@yeralin.net
>>>>>
>>>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> <mailto:
>> d...@yeralin.net <mailto:d...@yeralin.net>> <mailto:d...@yeralin.net <mailto:
>> d...@yeralin.net> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>>
>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hey John,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I gave read about TypeReference. It could work for the list
>> serde.
>>>>>>>>>>>>>>>> However, it is not directly
>>>>>>>>>>>>>>>> supported:
>>>>>>>>>>>>>>>> https://github.com/FasterXML/jackson-databind/issues/1490 <
>> https://github.com/FasterXML/jackson-databind/issues/1490> <
>> https://github.com/FasterXML/jackson-databind/issues/1490 <
>> https://github.com/FasterXML/jackson-databind/issues/1490>>
>>>>>>>>>>>>>>>> <https://github.com/FasterXML/jackson-databind/issues/1490
>> <https://github.com/FasterXML/jackson-databind/issues/1490> <
>> https://github.com/FasterXML/jackson-databind/issues/1490 <
>> https://github.com/FasterXML/jackson-databind/issues/1490>>>
>>>>>>>>>>>>>>>> The only way is to pass an actual class object into the
>> constructor,
>>>>>>>>>>>>>>>> something like:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It could be an option, but not a pretty one. What do you
>> think of my
>>>>>>>>>>>>>>>> approach to use vanilla java and canonical class name? (As
>> described
>>>>>>>>>>>>>>>> previously)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Jun 20, 2019, at 2:45 PM, Development <d...@yeralin.net
>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>
>>>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> <mailto:
>> d...@yeralin.net <mailto:d...@yeralin.net>>> <mailto:d...@yeralin.net
>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net <mailto:d...@yeralin.net
>>>>>
>>>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> <mailto:
>> d...@yeralin.net <mailto:d...@yeralin.net>> <mailto:d...@yeralin.net <mailto:
>> d...@yeralin.net> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>>
>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi John,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you for your input! Yes, my idea looks a little bit
>> over
>>>>>>>>>>>>>>>> engineered :)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I also wanted to see a feedback from Mathias as well since
>> he gave
>>>>>>>>>>>>>>>> me an idea about storing fixed/variable size entries.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Jun 18, 2019, at 6:06 PM, John Roesler <
>> j...@confluent.io <mailto:j...@confluent.io> <mailto:j...@confluent.io
>> <mailto:j...@confluent.io>>
>>>>>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>
>> <mailto:j...@confluent.io <mailto:j...@confluent.io>>> <mailto:
>> j...@confluent.io <mailto:j...@confluent.io> <mailto:j...@confluent.io
>> <mailto:j...@confluent.io>>>
>>>>>>>>>>>>>>>> <mailto:j...@confluent.io <mailto:j...@confluent.io>
>> <mailto:j...@confluent.io <mailto:j...@confluent.io>> <mailto:
>> j...@confluent.io <mailto:j...@confluent.io> <mailto:j...@confluent.io
>> <mailto:j...@confluent.io>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Daniyar,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> That's a very clever solution!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> One observation is that, now, this is what we might call a
>>>>>>>>>>>>>>>> polymorphic
>>>>>>>>>>>>>>>> serde. That is, you're detecting the actual concrete type
>> and then
>>>>>>>>>>>>>>>> promising to produce the exact same concrete type on read.
>>>>>>>>>>>>>>>> There are
>>>>>>>>>>>>>>>> some inherent problems with this approach, which in general
>>>>>>>>>>>>>>>> require
>>>>>>>>>>>>>>>> some kind of  schema registry (not necessarily Schema
>>>>>>>>>>>>>>>> Registry, just
>>>>>>>>>>>>>>>> any registry for schemas) to solve.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Notice that every serialized record has quite a bit of
>> duplicated
>>>>>>>>>>>>>>>> information: the concrete type as well as a byte to indicate
>>>>>>>>>>>>>>>> whether
>>>>>>>>>>>>>>>> the value type is a fixed size, and, if so, an integer to
>>>>>>>>>>>>>>>> indicate the
>>>>>>>>>>>>>>>> actual size. These constitute a schema, of sorts, because
>> they
>>>>>>>>>>>>>>>> tell us
>>>>>>>>>>>>>>>> later how exactly to deserialize the data. Unfortunately,
>> this
>>>>>>>>>>>>>>>> information is completely redundant. In all likelihood, the
>>>>>>>>>>>>>>>> information will be exactly the same for every record in the
>>>>>>>>>>>>>>>> topic.
>>>>>>>>>>>>>>>> This problem is essentially the core motivation for
>> serializations
>>>>>>>>>>>>>>>> like Avro: to move the schema outside of the serialization
>>>>>>>>>>>>>>>> itself, so
>>>>>>>>>>>>>>>> that the records won't contain so much redundant
>> information.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In this light, I'm wondering if it makes sense to go back to
>>>>>>>>>>>>>>>> something
>>>>>>>>>>>>>>>> like what you had earlier in which you don't support
>> perfectly
>>>>>>>>>>>>>>>> preserving the concrete type for _this_ serde, but instead
>> just
>>>>>>>>>>>>>>>> support deserializing to _some_ List. Then, you could defer
>> full,
>>>>>>>>>>>>>>>> perfect, type preservation to serdes that have an external
>>>>>>>>>>>>>>>> system in
>>>>>>>>>>>>>>>> which to register their type information.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> There does exist an alternative, if we really do want to
>>>>>>>>>>>>>>>> preserve the
>>>>>>>>>>>>>>>> concrete type (which does seem kind of nice). You can add a
>>>>>>>>>>>>>>>> configuration option specifically for the serde to configure
>>>>>>>>>>>>>>>> what the
>>>>>>>>>>>>>>>> list type will be, and maybe what the element type is, as
>> well.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As far as "related work" goes, you might be interested to
>> take
>>>>>>>>>>>>>>>> a look
>>>>>>>>>>>>>>>> at how Jackson can be configured to deserialize into a
>> specific,
>>>>>>>>>>>>>>>> arbitrarily nested, generically parameterized class
>> structure.
>>>>>>>>>>>>>>>> Specifically, you might find
>>>>>>>>>>>>>>>>
>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>> <
>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>
>> <
>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>> <
>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>>
>>>>>>>>>>>>>>>> <
>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>> <
>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html>
>> <
>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>> <
>> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>>>
>>>>>>>>>>>>>>>> interesting.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 12:38 PM Development <
>> d...@yeralin.net <mailto:d...@yeralin.net> <mailto:d...@yeralin.net <mailto:
>> d...@yeralin.net>>
>>>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> <mailto:
>> d...@yeralin.net <mailto:d...@yeralin.net>>> <mailto:d...@yeralin.net
>> <mailto:d...@yeralin.net> <mailto:d...@yeralin.net <mailto:d...@yeralin.net
>>>>>
>>>>>>>>>>>>>>>> <mailto:d...@yeralin.net <mailto:d...@yeralin.net> <mailto:
>> d...@yeralin.net <mailto:d...@yeralin.net>> <mailto:d...@yeralin.net <mailto:
>> d...@yeralin.net> <mailto:d...@yeralin.net <mailto:d...@yeralin.net>>>>>
>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> bump
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to