Hi,

Have you seen the "Reinterpreting a pre-partitioned data stream as keyed stream" feature? [1] However, I'm not sure if and how it can be integrated with the Table API. Maybe someone more familiar with the Table API can help with that?
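For the DataStream API side, a minimal sketch of that feature might look like the following. This is untested and assumes a Flink streaming job with the flink-streaming-java dependency; the `Event` type, its `getUserId()` accessor, and the `MyKafkaSource` source are hypothetical placeholders. Note the caveat from the linked docs: the stream must already be partitioned exactly the way Flink's keyBy would partition it (by key group), which a plain Kafka partitioning by userId generally is not.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamUtils;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ReinterpretExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical source emitting events already partitioned by userId.
        DataStream<Event> source = env.addSource(new MyKafkaSource());

        // Declare the stream as already keyed, so Flink skips the keyBy shuffle.
        // CAVEAT: the upstream partitioning must match Flink's key-group
        // assignment exactly, otherwise keyed state will be incorrect.
        KeyedStream<Event, String> keyed =
            DataStreamUtils.reinterpretAsKeyedStream(source, Event::getUserId);

        // keyed can now be used with keyed state, windows, etc.
        env.execute("reinterpret-as-keyed-stream sketch");
    }
}
```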
Piotrek

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/experimental.html#reinterpreting-a-pre-partitioned-data-stream-as-keyed-stream

On Wed, 16 Sep 2020 at 05:35, Dan Hill <quietgol...@gmail.com> wrote:
> How do I avoid unnecessary reshuffles when using Kafka as input? My keys
> in Kafka are ~userId. The first few stages do joins that are usually
> (userId, someOtherKeyId). It makes sense for these joins to stay on the
> same machine and avoid unnecessary shuffling.
>
> What's the best way to avoid unnecessary shuffling when using the Table
> SQL interface? I see PARTITION BY on TABLE. I'm not sure how to specify
> the keys for Kafka.