[ https://issues.apache.org/jira/browse/FLINK-14712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17187741#comment-17187741 ]
Piotr Nowojski commented on FLINK-14712:
----------------------------------------

I think the REST API/UI changes might be abandoned at the moment.

> Improve back-pressure reporting mechanism
> -----------------------------------------
>
>                 Key: FLINK-14712
>                 URL: https://issues.apache.org/jira/browse/FLINK-14712
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Metrics, Runtime / Network, Runtime / REST
>            Reporter: lining
>            Assignee: lining
>            Priority: Major
>         Attachments: image-2019-11-12-14-30-16-130.png
>
>
> h4. (1) The current monitor is heavy-weight.
>  * Backpressure monitoring works by repeatedly taking stack trace samples of your running tasks.
> h4. (2) It is difficult to find out which vertex is the source of backpressure.
>  * Users need to know the network metrics of the current vertex and of its upstream vertices to judge whether the current vertex is the source of backpressure. Currently, users have to record the relevant information themselves.
> h3. Proposed Changes
> 1. Expose the new mechanism implemented in FLINK-14472 as an "is back-pressured" metric.
> 2. Show the vertex that is the source of backpressure for the job.
> 3. Expose network metrics in IOMetricsInfo:
>  * SubTask
>  ** pool usage: outPoolUsage, inputExclusiveBuffersUsage, inputFloatingBuffersUsage.
>  *** These show whether the subtask is not back-pressured itself but is causing backpressure (full input, empty output).
>  *** Comparing exclusive and floating buffer usage shows whether all channels are back-pressured or only some of them.
>  ** back-pressured: shows whether the subtask is back-pressured.
>  * Vertex
>  ** pool usage: outPoolUsageAvg, inputExclusiveBuffersUsageAvg, inputFloatingBuffersUsageAvg
>  ** back-pressured: shows whether the vertex is back-pressured (merged over all of its subtasks)

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
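
For illustration only, the "full input, empty output" check and the per-vertex averages from the quoted proposal could be sketched roughly as below. This is not Flink code: the `SubtaskUsage` class, the function names, and the `HIGH`/`LOW` thresholds are hypothetical, and treating the input as "full" when either the exclusive or the floating usage is high is an assumption. Only the metric names come from the issue.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical thresholds; the issue does not define any.
HIGH = 0.9  # usage at or above this counts as "full"
LOW = 0.1   # usage at or below this counts as "empty"


@dataclass
class SubtaskUsage:
    """Pool usage of one subtask, each value in the range 0.0 to 1.0."""
    out_pool_usage: float                 # outPoolUsage
    input_exclusive_buffers_usage: float  # inputExclusiveBuffersUsage
    input_floating_buffers_usage: float   # inputFloatingBuffersUsage


def is_likely_source(u: SubtaskUsage) -> bool:
    """Return True for "full input, empty output".

    Assumption: the input counts as full if either the exclusive or the
    floating buffer usage reaches HIGH.
    """
    input_full = max(u.input_exclusive_buffers_usage,
                     u.input_floating_buffers_usage) >= HIGH
    output_empty = u.out_pool_usage <= LOW
    return input_full and output_empty


def vertex_averages(subtasks: list[SubtaskUsage]) -> dict[str, float]:
    """Average the subtask values into the proposed vertex-level fields."""
    return {
        "outPoolUsageAvg": mean(s.out_pool_usage for s in subtasks),
        "inputExclusiveBuffersUsageAvg": mean(
            s.input_exclusive_buffers_usage for s in subtasks),
        "inputFloatingBuffersUsageAvg": mean(
            s.input_floating_buffers_usage for s in subtasks),
    }
```

A subtask with full input buffers and an empty output pool would be flagged by `is_likely_source`, while one whose output pool is full would not, since it is back-pressured itself rather than causing the backpressure.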