Vladimir Matveev created FLINK-32137:
----------------------------------------

             Summary: Flame graph is hard to use with many task managers
                 Key: FLINK-32137
                 URL: https://issues.apache.org/jira/browse/FLINK-32137
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Web Frontend
    Affects Versions: 1.16.1
            Reporter: Vladimir Matveev
         Attachments: image (1).png

When many task managers execute the same operator, the flame graph becomes 
very hard to use. As the attached picture shows, it treats instances of the 
same lambda function as different classes, and their number appears to equal 
the number of task managers (i.e. each JVM generates its own "class" name, 
which I guess is expected for lambdas). This lambda function is deep within 
Flink's own call stack, so the graph looks like this regardless of the job's 
own logic, and there is nothing we can do at the job level to fix it.

This behavior makes the flame graph very hard to evaluate: all of the useful 
information gets "compressed" inside each "column" of the graph, while the 
split into columns itself carries no useful information, since it is just an 
artifact of how the JVM generates class names.
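
To illustrate the effect, here is a minimal sketch (not Flink code) of how 
frames that differ only in the JVM-generated lambda suffix could be merged 
before aggregation. The class name LambdaFrameNormalizer, the sample frame 
strings and the regular expression are all hypothetical; the suffix format 
("$$Lambda$<n>/<id>") is an assumption about typical HotSpot naming and is an 
implementation detail that varies between JVM versions.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LambdaFrameNormalizer {
    // Assumed shape of the generated suffix, e.g.
    // "$$Lambda$101/0x0000000800b8c440" or "$$Lambda$101/456789".
    private static final Pattern LAMBDA_SUFFIX =
            Pattern.compile("\\$\\$Lambda(\\$\\d+)?/(0x)?[0-9a-fA-F]+");

    // "$" is special in regex replacement strings, so quote it.
    private static final String REPLACEMENT =
            Matcher.quoteReplacement("$$Lambda");

    // Strips the per-JVM counter and id so equal lambdas get equal names.
    static String normalize(String frame) {
        return LAMBDA_SUFFIX.matcher(frame).replaceAll(REPLACEMENT);
    }

    // Counts samples per normalized frame, keeping first-seen order.
    static Map<String, Integer> countFrames(List<String> frames) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String frame : frames) {
            counts.merge(normalize(frame), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical frames sampled from two different task manager JVMs.
        List<String> frames = List.of(
                "org.example.Runner$$Lambda$101/0x0000000800b8c440.run",
                "org.example.Runner$$Lambda$245/0x0000000800c01840.run",
                "org.example.Runner.process");
        // prints {org.example.Runner$$Lambda.run=2, org.example.Runner.process=1}
        System.out.println(countFrames(frames));
    }
}
```

Without the normalization step the two lambda frames would be counted as two 
separate entries, which is the per-task-manager "column" split described 
above.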


