Hi all,
let's have a look at a simple Join with two DataSources and parallelism p=5. The whole Job consists of 3 parts: 1. DataSource Task 2. Join Task 3. DataSink Task In the first task, the data is provided and prepared for the Join task. In particular each DataSource task creates a ResultPartition which is divided into 5 subpartitions. Since 1/5 of the Join Task will be located in the same node, one of these subpartitions does not have to be shipped over the network. This one subpartition will be shipped to a LocalInputChannel (not RemoteInputChannel) and therefore will not get in touch with the network. Now I made some changes in the network part for my research and would like them to affect all subpartitions. Question: Is there a feature build into flink to completely disable the local stuff and send all subpartitions via network even if they have the same location and destination? If not - does anyone have an idea where to tweak this? Thanks. Chris