Zhanghao Chen created FLINK-31826:
-------------------------------------

             Summary: Incorrect estimation of the target data rate of a vertex 
when only a subset of its upstream vertex's output is consumed
                 Key: FLINK-31826
                 URL: https://issues.apache.org/jira/browse/FLINK-31826
             Project: Flink
          Issue Type: Improvement
          Components: Autoscaler
            Reporter: Zhanghao Chen
         Attachments: LHL7VKOG4B.jpg

Currently, a vertex's target data rate = the sum, over its upstream vertices,
of (upstream target data rate * input/output ratio). This assumes that all of
an upstream vertex's output goes into the current vertex. However, this does
not always hold. Consider the following job plan generated by a Flink SQL job.
The vertex in the middle has multiple Calc(select xx) operators chained, each
connected to a separate downstream task. The total num_rec_out_rate of the
middle vertex = SUM of num_rec_in_rate of its downstream tasks, so each
downstream task consumes only a subset of that output, yet the vertex-level
ratio attributes the full output to every one of them.
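
The mismatch can be illustrated with a small Python sketch. It is not the
autoscaler's code: the function names, the rates, and the reading of
"input/output ratio" as upstream output rate divided by upstream input rate are
all assumptions made for illustration.

```python
# Hypothetical numbers for the middle vertex (not taken from a real job).
upstream_in_rate = 100.0      # records/s currently entering the middle vertex
upstream_target_rate = 200.0  # target data rate assigned to the middle vertex

# Assumed observed num_rec_in_rate of three downstream tasks, each fed by a
# different chained Calc operator of the middle vertex.
downstream_in_rates = [100.0, 50.0, 50.0]

# As described in the issue, the vertex-level output rate is the sum of the
# input rates of all downstream tasks.
upstream_out_rate = sum(downstream_in_rates)


def vertex_level_estimate(target_rate, in_rate, out_rate):
    """Estimate that assumes every downstream vertex receives the whole output."""
    return target_rate * (out_rate / in_rate)


def edge_level_estimate(target_rate, in_rate, edge_out_rate):
    """Estimate that uses only the output flowing over one specific edge."""
    return target_rate * (edge_out_rate / in_rate)


# Every downstream vertex gets the same, inflated estimate.
vertex_level = [
    vertex_level_estimate(upstream_target_rate, upstream_in_rate, upstream_out_rate)
    for _ in downstream_in_rates
]

# With per-edge output rates, each estimate follows the data actually consumed.
edge_level = [
    edge_level_estimate(upstream_target_rate, upstream_in_rate, rate)
    for rate in downstream_in_rates
]

print("vertex-level estimates:", vertex_level)
print("edge-level estimates:  ", edge_level)
```

Under these assumed numbers, the vertex-level formula assigns every downstream
vertex the same estimate regardless of how much data it consumes, while the
edge-level variant scales each estimate with the traffic on its own edge.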

To fix this problem, we need operator-level output metrics and edge info. The
operator-level metrics part is easy, but AFAIK, there is no way to get the
operator-level edge info from the current Flink REST APIs.

!LHL7VKOG4B.jpg!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
