Boyang,

Thanks for pushing this KIP. Re-reading it, it only allows specifying the number of partitions for internally created repartition topics.

I still believe that using `through()` in its current form might be too clumsy for some cases because, from talking to users, it's a regular use case to repartition data explicitly without caring about the topic used. Hence, it would be good to allow repartitioning without the need to create a topic manually.

A `KStream.selectKey().through()` that creates the topic automatically would be helpful. Or maybe even better, `KStream#repartition()`, with `repartition()` taking the KeyValue->key selector as syntactic sugar (we could also add an overload that takes a value->key mapper). `#repartition()` could also take a `Produced` (maybe also `Consumed`?) parameter if the user needs to set Serdes or the number of partitions, similar to `through()` etc.

Adding `#repartition()` would provide a very intuitive name, it would be easier to use than `selectKey()` plus `through()` (even if we allowed `through()` to create the repartition topic), and it would keep a separation of concerns (i.e., `through()` for user topics, `repartition()` for internal topics).

-Matthias

On 1/20/19 10:11 PM, Boyang Chen wrote:
> Hey all,
>
> I would like to start a new discussion thread for refined KIP-221:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-221%3A+Enhance+KStream+with+Connecting+Topic+Creation+and+Repartition+Hint
>
> The major goal for this KIP is simplified to empower KStream #to and #groupBy
> APIs ability to rescale the repartition logic independent of upstream
> partitions counts.
>
> Let me know your thoughts, and credit to Jeyhun who is the original KIP owner!
>
> Best,
> Boyang
>
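
P.S. To make the semantics I have in mind for a key-selector-based `repartition()` a bit more concrete, here is a toy sketch in plain Java. It is deliberately not the Kafka Streams API: `RepartitionSketch`, `Rec`, and the static `repartition` helper are hypothetical names made up for illustration, and routing by `hashCode()` modulo the partition count is a simplifying assumption (Kafka's default partitioner hashes the serialized key bytes instead). The only point is that records are re-keyed via a KeyValue->key selector and then routed to a caller-chosen number of partitions, independent of how the input was partitioned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

public class RepartitionSketch {

    // Toy stand-in for a keyed record; hypothetical, not a Kafka class.
    public static final class Rec<K, V> {
        public final K key;
        public final V value;

        public Rec(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    // Re-key every record with keySelector, then route it to one of
    // numPartitions buckets based on the NEW key.
    // Assumption: hashCode() modulo numPartitions stands in for a real partitioner.
    public static <K, V, K2> List<List<Rec<K2, V>>> repartition(
            List<Rec<K, V>> input,
            BiFunction<K, V, K2> keySelector,
            int numPartitions) {
        List<List<Rec<K2, V>>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (Rec<K, V> r : input) {
            K2 newKey = keySelector.apply(r.key, r.value);
            // floorMod keeps the index non-negative even for negative hash codes.
            int p = Math.floorMod(newKey.hashCode(), numPartitions);
            partitions.get(p).add(new Rec<>(newKey, r.value));
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Records keyed by order id; re-key by the value (the customer name).
        List<Rec<String, String>> input = List.of(
                new Rec<>("order-1", "alice"),
                new Rec<>("order-2", "bob"),
                new Rec<>("order-3", "alice"));

        List<List<Rec<String, String>>> out = repartition(input, (k, v) -> v, 4);
        for (int p = 0; p < out.size(); p++) {
            System.out.println("partition " + p + ": " + out.get(p).size() + " record(s)");
        }
    }
}
```

In this model, all records that end up with the same new key land in the same partition, which is the property downstream grouping relies on. The partition count is just an argument, analogous to what a `Produced`-style parameter could carry.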