[ https://issues.apache.org/jira/browse/KAFKA-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236334#comment-15236334 ]
Guozhang Wang commented on KAFKA-3545:
--------------------------------------

Thanks for reporting. Having generalized serdes for collection types is definitely on our road map.

As for "group-by" followed by "aggregate", as I mentioned in KAFKA-3544 there are already built-in operators where users can use a "selector" to pick the aggregation key and an "aggregator" to aggregate the records with the same selected key. And in KAFKA-3337 we plan to extract the "selector" into a separate "groupBy" operator in Kafka Streams DSL. Would that work for your case?

> Generalized Serdes for List/Map
> -------------------------------
>
>                 Key: KAFKA-3545
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3545
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Greg Fodor
>            Assignee: Guozhang Wang
>            Priority: Minor
>              Labels: api
>             Fix For: 0.10.1.0
>
>
> In working with Kafka Streams I've found it's often the case I want to
> perform a "group by" operation, where I repartition a stream based on a
> foreign key and then do an aggregation of all the values into a single
> collection, so the stream becomes one where each entry has a value that is a
> serialized list of values that belonged to the key. (This seems unrelated to
> the 'group by' operation talked about in KAFKA-3544.) Basically the same
> typical group by operation found in systems like Cascading.
> In order to create these intermediate list values I needed to define custom
> avro schemas that simply wrap the elements of interest into a list. It seems
> desirable that there be some basic facility for constructing simple Serdes of
> Lists/Maps/Sets of other types, potentially using avro's serialization under
> the hood. If this existed in the core library it would also enable the
> addition of higher level operations on streams that can use these Serdes to
> perform simple operations like the "group by" example I mention.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
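
To make the request in the issue description concrete, here is a minimal sketch of a generic list serializer built on top of a per-element serializer. It is not Kafka's API: `ElementCodec`, `ListCodec`, `StringCodec`, and `ListSerdeSketch` are hypothetical names, the record data is made up, and the length-prefixed byte layout is an assumption chosen for simplicity rather than the Avro encoding the reporter suggests.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a per-element serde (not a Kafka interface).
interface ElementCodec<T> {
    byte[] encode(T value);
    T decode(byte[] bytes);
}

// Hypothetical list codec. Assumed layout: element count, then each element
// written as a 4-byte length followed by the element's bytes.
final class ListCodec<T> implements ElementCodec<List<T>> {
    private final ElementCodec<T> inner;

    ListCodec(ElementCodec<T> inner) {
        this.inner = inner;
    }

    @Override
    public byte[] encode(List<T> values) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(values.size());
            for (T value : values) {
                byte[] bytes = inner.encode(value);
                out.writeInt(bytes.length);
                out.write(bytes);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    @Override
    public List<T> decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            List<T> values = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                byte[] bytes = new byte[in.readInt()];
                in.readFully(bytes);
                values.add(inner.decode(bytes));
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Element codec for UTF-8 strings, used only to exercise ListCodec.
final class StringCodec implements ElementCodec<String> {
    @Override
    public byte[] encode(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}

class ListSerdeSketch {
    public static void main(String[] args) {
        // Toy "group by": collect values under a made-up foreign key.
        String[][] records = {{"user-1", "click"}, {"user-2", "view"}, {"user-1", "purchase"}};
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String[] record : records) {
            grouped.computeIfAbsent(record[0], k -> new ArrayList<>()).add(record[1]);
        }

        // Serialize each grouped list and read it back.
        ListCodec<String> codec = new ListCodec<>(new StringCodec());
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            byte[] bytes = codec.encode(entry.getValue());
            System.out.println(entry.getKey() + " -> " + codec.decode(bytes)
                    + " (" + bytes.length + " bytes)");
        }
    }
}
```

Because `ListCodec<T>` itself implements `ElementCodec<List<T>>`, codecs compose: a list of lists needs no extra code. A facility in the core library would presumably wrap the same idea around Kafka's own serializer and deserializer types.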