Hi! I'm just getting back to this.
Questions:
1. Across operators, do the same key group ids get mapped to the same
task managers? E.g. if an item is in key group 1 of operator A and that
runs on taskmanager-0, will key group 1 of operator B also run on
taskmanager-0?
2. Are there any internal
AFAIK, you can now express the partition key in the Table API, and it will be
used for co-location and optimization. So I'd probably give that a try first
and convert the Table to a DataStream where needed.
On Sat, Jul 24, 2021 at 9:22 PM Dan Hill wrote:
> Thanks Fabian and Senhong!
>
> Here's an example
Thanks Fabian and Senhong!
Here's an example diagram of the join that I want to do. There are more
layers of joins.
https://docs.google.com/presentation/d/17vYTBUIgrdxuYyEYXrSHypFhwwS7NdbyhVgioYMxPWc/edit#slide=id.p
1) Thanks! I'll look into these.
2) I'm using the same key across multiple Kaf
Hi Dan,
1) If the key doesn’t change in the downstream operators and you want to avoid
shuffling, DataStreamUtils#reinterpretAsKeyedStream may be helpful.
2) I am not sure whether you are saying that the data is already partitioned
in Kafka and you want to avoid shuffling in th
Hi Dan,
1) In general, there is no guarantee that your downstream operator is on the
same TM, even if it works on the same key group. Nevertheless, you can try to
force this kind of behaviour to prevent the network transfer by either chaining
the two operators (if no shuffle is in between) or confi
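
To illustrate the key group point, here is a minimal sketch in plain Java. It
assumes the key-group-to-subtask arithmetic matches what, to my understanding,
Flink's KeyGroupRangeAssignment#computeOperatorIndexForKeyGroup does. The class
name KeyGroupSketch and the values 128 and 4 are made up for the example. It
only shows which subtask index owns a key group; which TaskManager that
subtask runs on is decided by the scheduler and is not modelled here.

```java
public class KeyGroupSketch {

    // Assumed to mirror Flink's arithmetic: key groups are split into
    // contiguous ranges, one range per parallel subtask.
    public static int operatorIndexForKeyGroup(
            int maxParallelism, int parallelism, int keyGroupId) {
        return keyGroupId * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        int maxParallelism = 128; // hypothetical max parallelism (number of key groups)
        int parallelism = 4;      // hypothetical parallelism shared by operators A and B

        // Two operators with the same parallelism and max parallelism give the
        // same subtask index for a given key group.
        for (int keyGroup : new int[] {0, 1, 31, 32, 127}) {
            int subtask = operatorIndexForKeyGroup(maxParallelism, parallelism, keyGroup);
            System.out.println("key group " + keyGroup + " -> subtask " + subtask);
        }
    }
}
```

So key group 1 lands on subtask 0 for both operators under these settings, but
subtask 0 of operator A and subtask 0 of operator B only end up on the same TM
if the scheduler places them there.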
Hi.
1) If I use the same key in downstream operators (my key is a user id),
will the rows stay on the same TaskManager machine? I join in more info
based on the user id as the key. I'd like for these to stay on the same
machine rather than shuffle a bunch of user-specific info to multiple task
m