Re: Avoiding data shuffling when reading pre-partitioned data from Kafka

2023-03-04 Thread Ken Krugler
Hi Tommy, To use stateful timers, you need to have a keyed stream, which gets tricky when you’re trying to avoid network traffic caused by the keyBy(). If the number of unique keys isn’t huge, I could think of yet another helicopter stunt that you could try :) It’s possible to calculate a compo

Re: unsubscribe

2023-03-04 Thread Yuxin Tan
Hi, please send an email to user-unsubscr...@flink.apache.org to unsubscribe. Best, Yuxin Xiangyu Su via user wrote on Fri, Mar 3, 2023 at 17:29:

Re: Avoiding data shuffling when reading pre-partitioned data from Kafka

2023-03-04 Thread Tommy May
Hello Ken, Thanks for the quick response! That is an interesting workaround. In our case, though, we are using a CoProcessFunction with stateful timers. Is there a similar workaround path available in that case? The one possible way I could find required partitioning data in Kafka in a very specific

Re: Avoiding data shuffling when reading pre-partitioned data from Kafka

2023-03-04 Thread Ken Krugler
Hi Tommy, I believe there is a way to make this work currently, but with lots of caveats and constraints. This assumes you want to avoid any network shuffle. 1. Both topics have names that return the same value for ((topicName.hashCode() * 31) & 0x7) % parallelism. 2. Both topics have the
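
For anyone who wants to check the first constraint against their own topic names, here is a minimal sketch of that start-index calculation. The class and method names (TopicAlignment, startIndex, aligned) are made up for illustration and are not Flink API. The mask is taken to be 0x7FFFFFFF, which clears the sign bit so the modulo result is never negative; that value is an assumption based on my recollection of Flink's Kafka partition assigner, so verify it against the connector version you actually run.

```java
class TopicAlignment {
    // Assumption: sign-clearing mask, so the modulo below never goes negative.
    static final int MASK = 0x7FFFFFFF;

    // Start index for a topic, following the formula quoted in the message.
    // The multiplication may overflow int; that is intended, since the mask
    // is applied afterwards.
    static int startIndex(String topicName, int parallelism) {
        return ((topicName.hashCode() * 31) & MASK) % parallelism;
    }

    // True when both topics get the same start index at this parallelism.
    static boolean aligned(String topicA, String topicB, int parallelism) {
        return startIndex(topicA, parallelism) == startIndex(topicB, parallelism);
    }

    public static void main(String[] args) {
        int parallelism = 4; // hypothetical job parallelism
        // Single-letter topic names keep the arithmetic easy to follow:
        // "a".hashCode() is 97, and 97 * 31 = 3007, and 3007 % 4 = 3.
        System.out.println(startIndex("a", parallelism)); // 3
        System.out.println(aligned("a", "e", parallelism)); // true
        System.out.println(aligned("a", "b", parallelism)); // false
    }
}
```

Since the result depends on the parallelism, a pair of topic names that lines up at one parallelism may stop lining up after a rescale, so the check is worth rerunning whenever the parallelism changes.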