Boyang,

Thanks for pushing this KIP. Re-reading it, it only allows specifying the
number of partitions for internally created repartition topics.

I still believe that using `through()` in its current form might be
too clumsy for some cases, because (from talking to users) it's a common
use case to repartition data explicitly, without caring about the topic
used. Hence, it would be good to allow repartitioning without the need
to create a topic manually.

Hence, a `KStream.selectKey().through()` that creates the topic
automatically would be helpful. Or maybe even better,
`KStream#repartition()`, with `repartition()` taking the KeyValue->key
selector as syntactic sugar (we could also add an overload that takes a
value->key mapper).

`#repartition()` could also take a `Produced` (maybe also `Consumed`?)
parameter if the user needs to set Serdes or the number of partitions,
similar to `through()` etc.

Adding `#repartition()` would provide a very intuitive name, it would be
easier to use than `selectKey()` plus `through()` (even if we allowed
`through()` to create the repartition topic), and it would keep a
separation of concerns (i.e., `through()` for user topics,
`repartition()` for internal topics).
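
To make the difference concrete, here is a rough toy model in plain
Java. It is not the Kafka Streams API: `ToyStream`, `TOPICS`, the
internal topic naming, and the `repartition(selector, numPartitions)`
signature are all made up for illustration. It only sketches the
semantics I have in mind, assuming `through()` keeps requiring a
manually created topic while `repartition()` manages its own.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Toy in-memory model only; every name in here is hypothetical.
class RepartitionSketch {

    // Stand-in for the broker: topic name -> partition count.
    static final Map<String, Integer> TOPICS = new HashMap<>();

    static final class ToyStream<K, V> {
        final List<Map.Entry<K, V>> records;
        final int partitions;

        ToyStream(List<Map.Entry<K, V>> records, int partitions) {
            this.records = records;
            this.partitions = partitions;
        }

        // Re-key every record; the partition count is unchanged.
        <K2> ToyStream<K2, V> selectKey(BiFunction<K, V, K2> selector) {
            List<Map.Entry<K2, V>> out = new ArrayList<>();
            for (Map.Entry<K, V> r : records) {
                K2 newKey = selector.apply(r.getKey(), r.getValue());
                out.add(new AbstractMap.SimpleImmutableEntry<>(newKey, r.getValue()));
            }
            return new ToyStream<>(out, partitions);
        }

        // User topic: must already exist; its partition count is adopted.
        ToyStream<K, V> through(String topic) {
            Integer n = TOPICS.get(topic);
            if (n == null) {
                throw new IllegalStateException(
                    "topic must be created manually first: " + topic);
            }
            return new ToyStream<>(records, n);
        }

        // Proposed sugar: re-key and write through an internal topic that is
        // created automatically with the requested partition count.
        <K2> ToyStream<K2, V> repartition(BiFunction<K, V, K2> selector,
                                          int numPartitions) {
            String internal = "internal-repartition-" + TOPICS.size();
            TOPICS.put(internal, numPartitions);
            return selectKey(selector).through(internal);
        }
    }
}
```

With this model, `stream.repartition((k, v) -> newKey, 4)` works without
any prior setup, while `stream.through("some-topic")` fails unless the
user registered `some-topic` beforehand.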


-Matthias

On 1/20/19 10:11 PM, Boyang Chen wrote:
> Hey all,
> 
> I would like to start a new discussion thread for refined KIP-221: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-221%3A+Enhance+KStream+with+Connecting+Topic+Creation+and+Repartition+Hint
> 
> The major goal for this KIP is simplified to empower KStream #to and #groupBy 
> APIs ability to rescale the repartition logic independent of upstream 
> partitions counts.
> 
> Let me know your thoughts, and credit to Jeyhun who is the original KIP owner!
> 
> Best,
> Boyang
> 
