Re: Flink ID hashing

2021-01-18 Thread Rex Fenley
This is great info. Looks like it uses murmur hash below the surface too [1]. Thanks!

[1] https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/KeyGroupRangeAssignment.java#L76

On Mon, Jan 18, 2021 at 1:38 AM Timo Walther wrote:
> Hi Rex,
>
> fo

Re: Flink ID hashing

2021-01-18 Thread Timo Walther
Hi Rex,

for questions like this, I would recommend checking out the source code as well. Search for subclasses of `StreamPartitioner`. For example, for keyBy Flink uses: https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/partitio
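
For readers following the thread, here is a rough, self-contained sketch of the keyBy routing discussed here: the key's `hashCode()` is scrambled with a murmur-style hash, reduced to a key group, and the key group is then mapped to a subtask index. This is a simplified re-implementation for illustration, not a copy of Flink's code. The class name `KeyGroupSketch`, the method names, and the example values for parallelism are assumptions, and the mixing constants are the standard 32-bit MurmurHash3 ones. Please check `KeyGroupRangeAssignment` in the Flink repository for the authoritative logic.

```java
public class KeyGroupSketch {

    // Murmur-style scramble of a 32-bit hash code (assumed constants: standard
    // MurmurHash3 32-bit values). The result is always non-negative.
    static int murmurHash(int code) {
        code *= 0xcc9e2d51;
        code = Integer.rotateLeft(code, 15);
        code *= 0x1b873593;
        code = Integer.rotateLeft(code, 13);
        code = code * 5 + 0xe6546b64;
        code ^= 4;
        // Final avalanche step to spread the bits.
        code ^= code >>> 16;
        code *= 0x85ebca6b;
        code ^= code >>> 13;
        code *= 0xc2b2ae35;
        code ^= code >>> 16;
        if (code >= 0) {
            return code;
        } else if (code != Integer.MIN_VALUE) {
            return -code;
        } else {
            return 0;
        }
    }

    // Step 1: key -> key group in [0, maxParallelism).
    static int assignToKeyGroup(Object key, int maxParallelism) {
        return murmurHash(key.hashCode()) % maxParallelism;
    }

    // Step 2: key group -> subtask index in [0, parallelism).
    // Contiguous ranges of key groups land on the same subtask.
    static int operatorIndexForKeyGroup(int maxParallelism, int parallelism, int keyGroup) {
        return keyGroup * parallelism / maxParallelism;
    }

    // Both steps combined: key -> subtask index.
    static int subtaskFor(Object key, int maxParallelism, int parallelism) {
        int keyGroup = assignToKeyGroup(key, maxParallelism);
        return operatorIndexForKeyGroup(maxParallelism, parallelism, keyGroup);
    }

    public static void main(String[] args) {
        int maxParallelism = 128; // example value, an assumption
        int parallelism = 4;      // example value, an assumption
        for (int id = 0; id < 5; id++) {
            System.out.println("id " + id + " -> subtask "
                    + subtaskFor(id, maxParallelism, parallelism));
        }
    }
}
```

One consequence of the two-step design is that consecutive Integer IDs do not map to consecutive subtasks: the hash spreads them over key groups first, and only whole key-group ranges are assigned to subtasks.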

Flink ID hashing

2021-01-16 Thread Rex Fenley
Hello,

I'm wondering what sort of algorithm Flink uses to map an Integer ID to a subtask when distributing data. Also, which operators from the Table API cause data to be redistributed? I know Joins will; what about Aggregates, Sources, and Filters?

Thanks!

--
Rex Fenley | Software Engineer - Mobi