Thanks.

Tim Ward

-----Original Message-----
From: Matthias J. Sax <matth...@confluent.io>
Sent: 13 August 2019 08:23
To: users@kafka.apache.org
Subject: Re: How do I tell Kafka Streams not to repartition?

At the moment, it's not possible to tell Kafka Streams at the DSL level
that repartitioning is not necessary after a key-changing operation.

I personally think it would be a good improvement to add this
functionality. It's not the first time somebody asked for it. Feel free
to create a JIRA (and maybe even contribute :) -- note that we would
need a KIP for this).


The only alternative you currently have is to not use
`groupByKey().aggregate()`, but `transformValues()` (or similar), and
implement the aggregation manually.
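
A minimal sketch of what "implement the aggregation manually" could look
like, written in plain Java so that it runs without Kafka on the classpath.
A `HashMap` stands in for the key-value state store that a real transformer
would obtain from its `ProcessorContext`. The class name, the summing
aggregation, and the child IDs are assumptions for illustration only; none
of them are part of the Kafka Streams API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the per-record logic a transformer passed to
// transformValues() might run. Not a Kafka Streams class.
final class ManualChildAggregator {

    // Assumption: a HashMap plays the role of the state store. In a real
    // topology this would be a fault-tolerant store attached to the transformer.
    private final Map<String, Long> store = new HashMap<>();

    // Adds `amount` to the running total for `childId` and returns the
    // updated total, which is what the transformer would emit downstream.
    long add(String childId, long amount) {
        long updated = store.getOrDefault(childId, 0L) + amount;
        store.put(childId, updated);
        return updated;
    }

    public static void main(String[] args) {
        ManualChildAggregator aggregator = new ManualChildAggregator();
        aggregator.add("child-1", 5L);
        aggregator.add("child-2", 7L);
        // Second update for child-1: the running total is read back from the store.
        System.out.println(aggregator.add("child-1", 3L));
    }
}
```

Because the aggregation reads and writes the store directly, no
`groupByKey()` is involved and so no repartition topic is created. The
sketch relies on the assumption stated in the original question: all
records for a given child ID arrive at the same application instance.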


-Matthias


On 8/12/19 1:25 AM, Tim Ward wrote:
> I'm using groupByKey, and it causes repartitioning.
>
> I suppose I could aggregate by parent ID, if the data structure into which I 
> aggregate by parent ID is itself a map from child ID to what I'm really 
> wanting to aggregate - is that what you had in mind? - I think it would work!
>
> Give or take a problem I've discovered with persistence following a crash in 
> the middle of aggregation, which I'll post separately.
>
> Tim Ward
>
> -----Original Message-----
> From: Boyang Chen <reluctanthero...@gmail.com>
> Sent: 09 August 2019 23:31
> To: users@kafka.apache.org
> Subject: Re: How do I tell Kafka Streams not to repartition?
>
> In case I'm not making myself clear, any operation that changes the record
> key will result in a repartition. Since you don't want that, you should
> keep the original key and call groupByKey afterwards, so that aggregation
> happens at the `parent id` level.
>
> On Fri, Aug 9, 2019 at 3:27 PM Boyang Chen <reluctanthero...@gmail.com>
> wrote:
>
>> Hey Tim,
>>
>> I think the functionality you need is groupByKey(), which avoids
>> repartitioning. Feel free to check it out here:
>> https://docs.confluent.io/current/streams/developer-guide/dsl-api.html#aggregating.
>> I recommend reading the whole thing, but feel free to just search for
>> `groupByKey`.
>>
>> On Fri, Aug 9, 2019 at 7:14 AM Tim Ward <tim.w...@origamienergy.com>
>> wrote:
>>
>>> I've got an input topic which is keyed by "parent ID". Each message
>>> contains multiple items of data, each for a different "child ID".
>>>
>>> To process these items separately I flatMapValues() the stream to make a
>>> new stream of the inner items of data, keyed by "child ID".
>>>
>>> Now, because I've changed the key, Kafka Streams thinks a repartition is
>>> needed. But in fact it isn't, because all the inner items for a particular
>>> "child ID" will be contained within messages keyed with the same "parent
>>> ID".
>>>
>>> How do I tell Kafka Streams that there is no need to repartition in this
>>> case, because all the data that should remain together in the same instance
>>> of the application will do so without repartitioning? (I appreciate that
>>> Streams can't know about the parent-child relationship unless I *do* tell
>>> it in some way.)
>>>
>>> Tim Ward
>>>
>>> This email is from Origami Energy Limited. The contents of this email and
>>> any attachment are confidential to the intended recipient(s). If you are
>>> not an intended recipient: (i) do not use, disclose, distribute, copy or
>>> publish this email or its contents; (ii) please contact Origami Energy
>>> Limited immediately; and then (iii) delete this email. For more
>>> information, our privacy policy is available here:
>>> https://origamienergy.com/privacy-policy/. Origami Energy Limited
>>> (company number 8619644) is a company registered in England with its
>>> registered office at Ashcombe Court, Woolsack Way, Godalming, GU7 1LQ.
>>>
>>
>

