Thanks for the KIP. I also had a look into the PR and have two follow up
question:


 - did we consider to make the return type (ie, ArrayList, vs
LinkesList) configurable or encode it the serialized bytes?

 - atm the size of each element is encoded individually; did we consider
an optimization for fixed size elements (like Long) to avoid this overhead?



-Matthias

On 5/15/19 6:05 PM, John Roesler wrote:
> Sounds good!
> 
> On Tue, May 14, 2019 at 9:21 AM Development <d...@yeralin.net> wrote:
>>
>> Hey,
>>
>> I think it the proposal is finalized, no one raised any concerns. Shall we 
>> call it for a [VOTE]?
>>
>> Best,
>> Daniyar Yeralin
>>
>>> On May 10, 2019, at 10:17 AM, John Roesler <j...@confluent.io> wrote:
>>>
>>> Good observation, Daniyar.
>>>
>>> Maybe we should just not implement support for serdeFrom.
>>>
>>> We can always add it later, but I think you're right, we need some
>>> kind of more sophisticated support, or at least a second argument for
>>> the inner class.
>>>
>>> For now, it seems like most use cases would be satisfied without
>>> serdeFrom(...List...)
>>>
>>> -John
>>>
>>> On Fri, May 10, 2019 at 8:57 AM Development <d...@yeralin.net> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I was trying to add some test cases for the list serde, and it led me to 
>>>> this class `org.apache.kafka.common.serialization.SerializationTest`. I 
>>>> saw that it relies on method 
>>>> `org.apache.kafka.common.serialization.serdeFrom(Class<T> type)`
>>>>
>>>> Now, I’m not sure how to adapt List<T> serde for this method, since it 
>>>> will be a “nested class”. What is the best approach in this case?
>>>>
>>>> I remember that in Jackson for example, one uses a TypeFactory, and 
>>>> constructs “collectionType” of two classes. For example, 
>>>> `constructCollectionType(List.class, String.class).getClass()`. I don’t 
>>>> think it applies here.
>>>>
>>>> Any ideas?
>>>>
>>>> Best,
>>>> Daniyar Yeralin
>>>>
>>>>> On May 9, 2019, at 2:10 PM, Development <d...@yeralin.net> wrote:
>>>>>
>>>>> Hey Sophie,
>>>>>
>>>>> Thank you for your input. I think I’d rather finish this KIP as is, and 
>>>>> then open a new one for the Collections (if everyone agrees). I don’t 
>>>>> want to extend the current KIP-466, since most of the work is already 
>>>>> done for it.
>>>>>
>>>>> Meanwhile, I’ll start adding some test cases for this new list serde 
>>>>> since this discussion seems to be approaching its logical end.
>>>>>
>>>>> Best,
>>>>> Daniyar Yeralin
>>>>>
>>>>>> On May 9, 2019, at 1:35 PM, Sophie Blee-Goldman <sop...@confluent.io> 
>>>>>> wrote:
>>>>>>
>>>>>> Good point about serdes for other Collections. On the one hand I'd guess
>>>>>> that non-List Collections are probably relatively rare in practice (if
>>>>>> anyone disagrees please correct me!) but on the other hand, a) even if 
>>>>>> just
>>>>>> a small number of people benefit I think it's worth the extra effort and 
>>>>>> b)
>>>>>> if we do end up needing/wanting them in the future it would save us a KIP
>>>>>> to just add them now. Personally I feel it would make sense to expand the
>>>>>> scope of this KIP a bit to include all Collections as a logical unit, but
>>>>>> the ROI could be low..
>>>>>>
>>>>>> (I know of at least one instance in the unit tests where a Set serde 
>>>>>> could
>>>>>> be useful, and there may be more)
>>>>>>
>>>>>> On Thu, May 9, 2019 at 7:27 AM Development <d...@yeralin.net> wrote:
>>>>>>
>>>>>>> Hey,
>>>>>>>
>>>>>>> I don’t see any replies. Seems like this proposal can be finalized and
>>>>>>> called for a vote?
>>>>>>>
>>>>>>> Also I’ve been thinking. Do we need more serdes for other Collections?
>>>>>>> Like queue or set for example
>>>>>>>
>>>>>>> Best,
>>>>>>> Daniyar Yeralin
>>>>>>>
>>>>>>>> On May 8, 2019, at 2:28 PM, John Roesler <j...@confluent.io> wrote:
>>>>>>>>
>>>>>>>> Hi Daniyar,
>>>>>>>>
>>>>>>>> No worries about the procedural stuff. Prior experience with KIPs is
>>>>>>>> not required :)
>>>>>>>>
>>>>>>>> I was just trying to help you propose this stuff in a way that the
>>>>>>>> others will find easy to review.
>>>>>>>>
>>>>>>>> Thanks for updating the KIP. Thanks to the others for helping out with
>>>>>>>> the syntax.
>>>>>>>>
>>>>>>>> Given these updates, I'm curious if anyone else has feedback about
>>>>>>>> this proposal. Personally, I think it sounds fine!
>>>>>>>>
>>>>>>>> -John
>>>>>>>>
>>>>>>>> On Wed, May 8, 2019 at 1:01 PM Development <d...@yeralin.net> wrote:
>>>>>>>>>
>>>>>>>>> Hey,
>>>>>>>>>
>>>>>>>>> That worked! I certainly lack Java generics knowledge. Thanks for the
>>>>>>> snippet. I’ll update KIP again.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Daniyar Yeralin
>>>>>>>>>
>>>>>>>>>> On May 8, 2019, at 1:39 PM, Chris Egerton <chr...@confluent.io> 
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Daniyar,
>>>>>>>>>>
>>>>>>>>>> I think you may want to tweak your syntax a little:
>>>>>>>>>>
>>>>>>>>>> public static <T> Serde<List<T>> List(Serde<T> innerSerde) {
>>>>>>>>>> return new ListSerde<T>(innerSerde);
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> Does that work?
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On Wed, May 8, 2019 at 10:29 AM Development <d...@yeralin.net 
>>>>>>>>>> <mailto:
>>>>>>> d...@yeralin.net>> wrote:
>>>>>>>>>> Hi John,
>>>>>>>>>>
>>>>>>>>>> I updated JIRA and KIP.
>>>>>>>>>>
>>>>>>>>>> I didn’t know about the process, and created PR before I knew about
>>>>>>> KIPs :)
>>>>>>>>>>
>>>>>>>>>> As per static declaration, I don’t think Java allows that:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>
>>>>>>>>>>> On May 7, 2019, at 2:22 PM, John Roesler <j...@confluent.io <mailto:
>>>>>>> j...@confluent.io>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Thanks for that update. Do you mind making changes primarily on the
>>>>>>>>>>> KIP document ? (
>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>> <
>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>>> )
>>>>>>>>>>>
>>>>>>>>>>> This is the design document that we have to agree on and vote for, 
>>>>>>>>>>> the
>>>>>>>>>>> PR comes later. It can be nice to have an implementation to look at,
>>>>>>>>>>> but the KIP is the main artifact for this discussion.
>>>>>>>>>>>
>>>>>>>>>>> With this in mind, it will help get more reviewers to look at it if
>>>>>>>>>>> you can tidy up the KIP document so that it stands on its own. 
>>>>>>>>>>> People
>>>>>>>>>>> shouldn't have to look at any other document to understand the
>>>>>>>>>>> motivation of the proposal, and they shouldn't have to look at a PR 
>>>>>>>>>>> to
>>>>>>>>>>> see what the public API will look like. If it helps, you can take a
>>>>>>>>>>> look at some other recent KIPs.
>>>>>>>>>>>
>>>>>>>>>>> Given that the list serde needs an inner serde, I agree you can't 
>>>>>>>>>>> have
>>>>>>>>>>> a zero-argument static factory method for it, but it seems you could
>>>>>>>>>>> still have a static method:
>>>>>>>>>>> `public static Serde<List<T>> List(Serde<T> innerSerde)`.
>>>>>>>>>>>
>>>>>>>>>>> Thoughts?
>>>>>>>>>>>
>>>>>>>>>>> On Tue, May 7, 2019 at 12:18 PM Development <d...@yeralin.net 
>>>>>>>>>>> <mailto:
>>>>>>> d...@yeralin.net>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Absolutely agree. Already pushed another commit to remove 
>>>>>>>>>>>> comparator
>>>>>>> argument: https://github.com/apache/kafka/pull/6592 <
>>>>>>> https://github.com/apache/kafka/pull/6592> <
>>>>>>> https://github.com/apache/kafka/pull/6592 <
>>>>>>> https://github.com/apache/kafka/pull/6592>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for your input John! I really appreciate it.
>>>>>>>>>>>>
>>>>>>>>>>>> What about this point I made:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. Since type for List serde needs to be declared before hand, I
>>>>>>> could not create a static method for List Serde under
>>>>>>> org.apache.kafka.common.serialization.Serdes. I addressed it in the KIP:
>>>>>>>>>>>> P.S. Static method corresponding to ListSerde under
>>>>>>> org.apache.kafka.common.serialization.Serdes (something like static 
>>>>>>> public
>>>>>>> Serde<List<T>> List() {...} 
>>>>>>> inorg.apache.kafka.common.serialization.Serdes)
>>>>>>> class cannot be added because type needs to be defined beforehand. 
>>>>>>> That's
>>>>>>> why one needs to create List Serde in the following fashion:
>>>>>>>>>>>> new Serdes.ListSerde<String>(Serdes.String(),
>>>>>>> Comparator.comparing(String::length));
>>>>>>>>>>>> (can possibly be simplified by declaring import static
>>>>>>> org.apache.kafka.common.serialization.Serdes.ListSerde)
>>>>>>>>>>>>
>>>>>>>>>>>>> On May 7, 2019, at 11:50 AM, John Roesler <j...@confluent.io
>>>>>>> <mailto:j...@confluent.io>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the reply Daniyar,
>>>>>>>>>>>>>
>>>>>>>>>>>>> That makes much more sense! I thought I must be missing something,
>>>>>>> but I
>>>>>>>>>>>>> couldn't for the life of me figure it out.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What do you think about just taking an argument, instead of for a
>>>>>>>>>>>>> Comparator, for the Serde of the inner type? That way, the user 
>>>>>>>>>>>>> can
>>>>>>> control
>>>>>>>>>>>>> how exactly the inner data gets serialized, while also bounding 
>>>>>>>>>>>>> the
>>>>>>> generic
>>>>>>>>>>>>> parameter properly. As for the order, since the list is already 
>>>>>>>>>>>>> in a
>>>>>>>>>>>>> specific order, which the user themselves controls, it doesn't 
>>>>>>>>>>>>> seem
>>>>>>>>>>>>> strictly necessary to offer an option to sort the data during
>>>>>>> serialization.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> -John
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, May 6, 2019 at 8:47 PM Development <d...@yeralin.net
>>>>>>> <mailto:d...@yeralin.net>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi John,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I’m really sorry for the confusion. I cloned that JIRA ticket 
>>>>>>>>>>>>>> from
>>>>>>> an old
>>>>>>>>>>>>>> one about introducing UUID Serde, and I guess was too hasty while
>>>>>>> editing
>>>>>>>>>>>>>> the copy to notice the mistake. Just edited the ticket. Sorry for
>>>>>>> any
>>>>>>>>>>>>>> inconvenience .
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As per comparator, I agree. Let’s make user be responsible for
>>>>>>>>>>>>>> implementing comparable interface. I was just thinking to make 
>>>>>>>>>>>>>> the
>>>>>>> serde a
>>>>>>>>>>>>>> little more flexible (i.e. let user decide in which order records
>>>>>>> is going
>>>>>>>>>>>>>> to be inserted into a change log topic).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On May 6, 2019, at 5:37 PM, John Roesler <j...@confluent.io
>>>>>>> <mailto:j...@confluent.io>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Daniyar,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for the proposal!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If I understand the point about the comparator, is it just to
>>>>>>> capture the
>>>>>>>>>>>>>>> generic type parameter? If so, then anything that implements a
>>>>>>> known
>>>>>>>>>>>>>>> interface would work just as well, right? I've been considering
>>>>>>> adding
>>>>>>>>>>>>>>> something like the Jackson TypeReference (or similar classes in
>>>>>>> many
>>>>>>>>>>>>>> other
>>>>>>>>>>>>>>> projects). Would this be a good time to do it?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Note that it's not necessary to actually require that the
>>>>>>> captured type
>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>> Comparable (as this proposal currently does), it's just a way to
>>>>>>> make
>>>>>>>>>>>>>> sure
>>>>>>>>>>>>>>> there is some method that makes use of the generic type
>>>>>>> parameter, to
>>>>>>>>>>>>>> force
>>>>>>>>>>>>>>> the compiler to capture the type.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Just to make sure I understand the motivation... You expressed a
>>>>>>> desire
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>> be able to serialize UUIDs, which I didn't follow, since there 
>>>>>>>>>>>>>>> is
>>>>>>> a
>>>>>>>>>>>>>>> built-in UUID serde:
>>>>>>> org.apache.kafka.common.serialization.Serdes#UUID,
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> also, a UUID isn't a List. Did you mean that you need to use
>>>>>>> *lists of*
>>>>>>>>>>>>>>> UUIDs?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, May 6, 2019 at 11:49 AM Development <d...@yeralin.net
>>>>>>> <mailto:d...@yeralin.net>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Starting a discussion for KIP-466 adding support for List 
>>>>>>>>>>>>>>>> Serde.
>>>>>>> PR is
>>>>>>>>>>>>>>>> created under https://github.com/apache/kafka/pull/6592 <
>>>>>>> https://github.com/apache/kafka/pull/6592> <
>>>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/6592 <
>>>>>>> https://github.com/apache/kafka/pull/6592>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> There are two topics I would like to discuss:
>>>>>>>>>>>>>>>> 1. Since type for List serve needs to be declared before hand, 
>>>>>>>>>>>>>>>> I
>>>>>>> could
>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>> create a static method for List Serde under
>>>>>>>>>>>>>>>> org.apache.kafka.common.serialization.Serdes. I addressed it in
>>>>>>> the KIP:
>>>>>>>>>>>>>>>> P.S. Static method corresponding to ListSerde under
>>>>>>>>>>>>>>>> org.apache.kafka.common.serialization.Serdes (something like
>>>>>>> static
>>>>>>>>>>>>>> public
>>>>>>>>>>>>>>>> Serde<List<T>> List() {...}
>>>>>>>>>>>>>> inorg.apache.kafka.common.serialization.Serdes)
>>>>>>>>>>>>>>>> class cannot be added because type needs to be defined
>>>>>>> beforehand.
>>>>>>>>>>>>>> That's
>>>>>>>>>>>>>>>> why one needs to create List Serde in the following fashion:
>>>>>>>>>>>>>>>> new Serdes.ListSerde<String>(Serdes.String(),
>>>>>>>>>>>>>>>> Comparator.comparing(String::length));
>>>>>>>>>>>>>>>> (can possibly be simplified by declaring import static
>>>>>>>>>>>>>>>> org.apache.kafka.common.serialization.Serdes.ListSerde)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2. @miguno Michael G. Noll <https://github.com/miguno <
>>>>>>> https://github.com/miguno>> is questioning
>>>>>>>>>>>>>>>> whether I need to pass a comparator to ListDeserializer. This
>>>>>>> certainly
>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>> not required. Feel free to add your input:
>>>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/6592#discussion_r281152067
>>>>>>> <https://github.com/apache/kafka/pull/6592#discussion_r281152067>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Daniyar Yeralin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On May 6, 2019, at 11:59 AM, Daniyar Yeralin (JIRA) <
>>>>>>> j...@apache.org <mailto:j...@apache.org>>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Daniyar Yeralin created KAFKA-8326:
>>>>>>>>>>>>>>>>> --------------------------------------
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     Summary: Add List<T> Serde
>>>>>>>>>>>>>>>>>         Key: KAFKA-8326
>>>>>>>>>>>>>>>>>         URL:
>>>>>>> https://issues.apache.org/jira/browse/KAFKA-8326 <
>>>>>>> https://issues.apache.org/jira/browse/KAFKA-8326>
>>>>>>>>>>>>>>>>>     Project: Kafka
>>>>>>>>>>>>>>>>>  Issue Type: Improvement
>>>>>>>>>>>>>>>>>  Components: clients, streams
>>>>>>>>>>>>>>>>>    Reporter: Daniyar Yeralin
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I propose adding serializers and deserializers for the
>>>>>>> java.util.List
>>>>>>>>>>>>>>>> class.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have many use cases where I want to set the key of a Kafka
>>>>>>> message to
>>>>>>>>>>>>>>>> be a UUID. Currently, I need to turn UUIDs into strings or byte
>>>>>>> arrays
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>> use their associated Serdes, but it would be more convenient to
>>>>>>>>>>>>>> serialize
>>>>>>>>>>>>>>>> and deserialize UUIDs directly.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I believe there are many use cases where one would want to 
>>>>>>>>>>>>>>>>> have
>>>>>>> a List
>>>>>>>>>>>>>>>> serde. Ex. [
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>> https://stackoverflow.com/questions/41427174/aggregate-java-objects-in-a-list-with-kafka-streams-dsl-windows
>>>>>>> <
>>>>>>> https://stackoverflow.com/questions/41427174/aggregate-java-objects-in-a-list-with-kafka-streams-dsl-windows
>>>>>>>>
>>>>>>>>>>>>>> ],
>>>>>>>>>>>>>>>> [
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>> https://stackoverflow.com/questions/46365884/issue-with-arraylist-serde-in-kafka-streams-api
>>>>>>> <
>>>>>>> https://stackoverflow.com/questions/46365884/issue-with-arraylist-serde-in-kafka-streams-api
>>>>>>>>
>>>>>>>>>>>>>>>> ]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> KIP Link: [
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>> <
>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
>>>>>>>>
>>>>>>>>>>>>>>>> ]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> This message was sent by Atlassian JIRA
>>>>>>>>>>>>>>>>> (v7.6.3#76005)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to