JIN SUN created FLINK-10643:
-------------------------------

             Summary: Bubble execution: Resource aware job execution
                 Key: FLINK-10643
                 URL: https://issues.apache.org/jira/browse/FLINK-10643
             Project: Flink
          Issue Type: New Feature
          Components: JobManager
            Reporter: JIN SUN
            Assignee: JIN SUN
             Fix For: 1.8.0
         Attachments: image-2018-10-22-16-28-32-355.png

Today Flink supports various channel types, such as pipelined channels and blocking channels. A blocking channel means that the data must be persisted as a batch before it can be consumed later. It also means that a downstream task cannot start processing data until its producer has finished, and that the downstream task depends only on this intermediate partition rather than on the upstream tasks. By leveraging this characteristic, Flink already supports fine-grained failover, which builds failover regions to reduce the cost of failover.

However, we can leverage this characteristic even more. As described in this [paper|http://www.vldb.org/pvldb/vol11/p746-yin.pdf] (VLDB 2018), *_Bubble Execution_* uses this characteristic not only to implement fine-grained failover, but also to balance resource utilization against job performance. As shown in the paper (and in the following chart), with 50% of the resources, it incurs an average slowdown of 25% (a 0.75 speedup) on the TPC-H benchmark.

!image-2018-10-22-16-28-32-355.png!

This JIRA is an umbrella issue that tries to apply the ideas of this paper to Flink.
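
To illustrate how blocking channels cut a job graph into independently schedulable and recoverable groups, here is a minimal sketch in Python. It is neither Flink's failover-region code nor the algorithm from the paper. It only assumes that tasks joined by pipelined edges must run together and that blocking edges form the boundaries. The function name `pipelined_regions`, the task names, and the edge tuples are all hypothetical.

```python
from collections import defaultdict


def pipelined_regions(tasks, edges):
    """Group tasks into regions connected by pipelined edges.

    tasks: iterable of hashable task names (hypothetical).
    edges: iterable of (producer, consumer, kind) tuples, where kind is
           "pipelined" or "blocking" (an assumed encoding for this sketch).
    Returns a sorted list of sorted task lists, one list per region.
    """
    # Union-find: every task starts out in its own region.
    parent = {t: t for t in tasks}

    def find(t):
        # Walk to the root, halving the path along the way.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for producer, consumer, kind in edges:
        # Only pipelined edges force producer and consumer to run together.
        # Blocking edges are skipped, so they become region boundaries.
        if kind == "pipelined":
            parent[find(producer)] = find(consumer)

    groups = defaultdict(list)
    for t in parent:
        groups[find(t)].append(t)
    return sorted(sorted(g) for g in groups.values())


tasks = ["source", "map", "join", "sink"]
edges = [
    ("source", "map", "pipelined"),
    ("map", "join", "blocking"),
    ("join", "sink", "pipelined"),
]
regions = pipelined_regions(tasks, edges)
print(regions)  # [['join', 'sink'], ['map', 'source']]
```

In this toy graph the blocking edge between `map` and `join` splits the job into two regions. Because the second region reads only the persisted intermediate partition, it could be scheduled later or restarted on its own, which is the property that both fine-grained failover and a resource-aware execution plan rely on.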