Hi Piotr! Yes, that's what I'm using with DataStream. It works well in my prototype.
On Wed, Sep 16, 2020 at 8:58 AM Piotr Nowojski <pnowoj...@apache.org> wrote:

> Hi,
>
> Have you seen the "Reinterpreting a pre-partitioned data stream as keyed
> stream" feature? [1] However, I'm not sure if and how it can be integrated
> with the Table API. Maybe someone more familiar with the Table API can help
> with that?
>
> Piotrek
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/experimental.html#reinterpreting-a-pre-partitioned-data-stream-as-keyed-stream
>
> On Wed, Sep 16, 2020 at 05:35 Dan Hill <quietgol...@gmail.com> wrote:
>
>> How do I avoid unnecessary reshuffles when using Kafka as input? My keys
>> in Kafka are ~userId. The first few stages do joins that are usually on
>> (userId, someOtherKeyId). It makes sense for these joins to stay on the
>> same machine and avoid unnecessary shuffling.
>>
>> What's the best way to avoid unnecessary shuffling when using the Table
>> SQL interface? I see PARTITION BY on TABLE. I'm not sure how to specify
>> the keys for Kafka.
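[Editor's note for readers of this thread: the DataStream feature referenced above is `DataStreamUtils.reinterpretAsKeyedStream`, which tells Flink to trust that the stream is already partitioned by the given key, skipping the hash shuffle that a `keyBy` would normally insert. The sketch below is not Flink code; it is a plain-Java illustration of the invariant that makes this safe: if both inputs were written with the same partitioner on the same key (e.g. Kafka records keyed by userId), matching records are already co-located and can be joined per partition with no data movement. The class and method names (`CoPartitionedJoin`, `partition`, `partitionBy`) are illustrative, not part of any API.]

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CoPartitionedJoin {
    static final int PARTITIONS = 4;

    // Both sides must use the SAME partitioner on the SAME key field;
    // this mirrors keying Kafka records by userId on the producer side.
    static int partition(String key) {
        return Math.floorMod(key.hashCode(), PARTITIONS);
    }

    // Bucket records by the partition their key (element [0]) hashes to.
    static Map<Integer, List<String[]>> partitionBy(List<String[]> records) {
        Map<Integer, List<String[]>> out = new HashMap<>();
        for (String[] r : records) {
            out.computeIfAbsent(partition(r[0]), p -> new ArrayList<>()).add(r);
        }
        return out;
    }

    public static void main(String[] args) {
        // (userId, event) and (userId, attribute) records, keyed by userId.
        List<String[]> events = List.of(
                new String[]{"user1", "click"},
                new String[]{"user2", "view"});
        List<String[]> attrs = List.of(
                new String[]{"user1", "premium"},
                new String[]{"user2", "free"});

        Map<Integer, List<String[]>> lhs = partitionBy(events);
        Map<Integer, List<String[]>> rhs = partitionBy(attrs);

        // Join each partition locally: no record ever crosses partitions,
        // because equal keys hash to the same partition on both sides.
        for (int p = 0; p < PARTITIONS; p++) {
            for (String[] e : lhs.getOrDefault(p, List.of())) {
                for (String[] a : rhs.getOrDefault(p, List.of())) {
                    if (e[0].equals(a[0])) {
                        System.out.println(e[0] + ": " + e[1] + " + " + a[1]);
                    }
                }
            }
        }
    }
}
```

The caveat Piotr raises still applies: this only holds if Flink's notion of the partition matches how the data was actually written. Kafka's producer partitioner and Flink's key-group assignment are different hash schemes, which is why reinterpreting a pre-partitioned stream as keyed is an experimental, use-with-care feature.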