Hi,

Have you seen the "Reinterpreting a pre-partitioned data stream as keyed stream" feature? [1] However, I'm not sure if and how it can be integrated with the Table API. Maybe someone more familiar with the Table API can help with that?
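For the DataStream API side, a minimal sketch of that feature might look like the following. This is untested and assumes a Flink streaming job with the flink-streaming-java dependency; the `Event` type, its `getUserId()` accessor, and the `MyKafkaSource` source are hypothetical placeholders. Note the caveat from the linked docs: the stream must already be partitioned exactly the way Flink's keyBy would partition it (by key group), which a plain Kafka partitioning by userId generally is not.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamUtils;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ReinterpretExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical source emitting events already partitioned by userId.
        DataStream<Event> source = env.addSource(new MyKafkaSource());

        // Declare the stream as already keyed, so Flink skips the keyBy shuffle.
        // CAVEAT: the upstream partitioning must match Flink's key-group
        // assignment exactly, otherwise keyed state will be incorrect.
        KeyedStream<Event, String> keyed =
            DataStreamUtils.reinterpretAsKeyedStream(source, Event::getUserId);

        // keyed can now be used with keyed state, windows, etc.
        env.execute("reinterpret-as-keyed-stream sketch");
    }
}
```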
Piotrek

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/experimental.html#reinterpreting-a-pre-partitioned-data-stream-as-keyed-stream

On Wed, 16 Sep 2020 at 05:35, Dan Hill <quietgol...@gmail.com> wrote:
> How do I avoid unnecessary reshuffles when using Kafka as input? My keys
> in Kafka are ~userId. The first few stages do joins that are usually
> (userId, someOtherKeyId). It makes sense for these joins to stay on the
> same machine and avoid unnecessary shuffling.
>
> What's the best way to avoid unnecessary shuffling when using the Table
> SQL interface? I see PARTITION BY on TABLE. I'm not sure how to specify
> the keys for Kafka.