[ https://issues.apache.org/jira/browse/FLINK-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15924104#comment-15924104 ]

ASF GitHub Bot commented on FLINK-5090:
---------------------------------------

Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3348#discussion_r105895733
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java ---
    @@ -389,11 +389,20 @@ public Task(
                        ++counter;
                }
     
    +           invokableHasBeenCanceled = new AtomicBoolean(false);
    +
    +           // finally, create the executing thread, but do not start it
    +           executingThread = new Thread(TASK_THREADS_GROUP, this, taskNameWithSubtask);
    +
    +           // add metrics for buffers
    +           this.metrics.getIOMetricGroup().initializeBufferMetrics(this);
    +
                // register detailed network metrics, if configured
                if (tmConfig.getBoolean(TaskManagerOptions.NETWORK_DETAILED_METRICS_KEY)) {
    -                   MetricGroup networkGroup = metricGroup.addGroup("Network"); // same as in MetricUtils.instantiateNetworkMetrics()
    -                   MetricGroup outputGroup = networkGroup.addGroup("Output"); // this is optional
    -                   MetricGroup inputGroup = networkGroup.addGroup("Input"); // this is optional
    +                   // similar to MetricUtils.instantiateNetworkMetrics() but inside this IOMetricGroup
    +                   MetricGroup networkGroup = this.metrics.getIOMetricGroup().addGroup("Network");
    --- End diff --
    
    The point of the IOMetricGroup is to keep a lot of details out of the TaskMetricGroup without affecting the actual MetricGroup structure. IO metrics are handled a bit differently than other metrics in that they a) are also stored in the ExecutionGraph and b) are used from different parts of the code (such as multiple RecordWriters). We preemptively moved this logic into a separate class so that the TaskMetricGroup doesn't grow unmanageable over time.
    
    There isn't anything wrong with registering metrics or adding groups on it; they aren't lost or anything. I'm only mentioning it because you replaced existing code with something that is equivalent.
    
    If we want an actual "IO" group, we only have to modify these two lines:
    ```
    TaskMetricGroup:
    this.ioMetrics = new TaskIOMetricGroup();
    =>
    this.ioMetrics = new TaskIOMetricGroup(addGroup("IO"));
    ```
    
    ```
    TaskIOMetricGroup:
    public TaskIOMetricGroup(TaskMetricGroup parent) {
    =>
    public TaskIOMetricGroup(MetricGroup parent) {
    ```
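    
    As a rough illustration of the effect of that change, here is a minimal, self-contained sketch. It does not use Flink's classes: `SimpleGroup` and `SimpleIoMetrics` are hypothetical stand-ins for a metric group and for the IO helper, and `scope()` is an invented method that just joins group names with dots. The only point is that the helper registers under whatever parent it is handed, so handing it `addGroup("IO")` instead of the task group adds an "IO" level to the hierarchy.
    
    ```java
    import java.util.LinkedHashMap;
    import java.util.Map;
    
    class IoGroupSketch {
    
        /** Hypothetical stand-in for a metric group; this is NOT Flink's MetricGroup. */
        static class SimpleGroup {
            private final String name;
            private final SimpleGroup parent;
            private final Map<String, SimpleGroup> children = new LinkedHashMap<>();
    
            SimpleGroup(String name, SimpleGroup parent) {
                this.name = name;
                this.parent = parent;
            }
    
            /** Returns the child group with the given name, creating it on first use. */
            SimpleGroup addGroup(String childName) {
                return children.computeIfAbsent(childName, n -> new SimpleGroup(n, this));
            }
    
            /** Dot-separated path from the root group down to this group (invented for the sketch). */
            String scope() {
                return parent == null ? name : parent.scope() + "." + name;
            }
        }
    
        /** Hypothetical stand-in for the IO helper: it registers under the parent it is given. */
        static class SimpleIoMetrics {
            private final SimpleGroup parent;
    
            SimpleIoMetrics(SimpleGroup parent) {
                this.parent = parent;
            }
    
            /** Scope of the "Network" group as seen from this helper. */
            String networkScope() {
                return parent.addGroup("Network").scope();
            }
        }
    
        public static void main(String[] args) {
            SimpleGroup task = new SimpleGroup("task", null);
    
            // Helper registers directly on the task group: no extra level.
            SimpleIoMetrics flat = new SimpleIoMetrics(task);
            System.out.println(flat.networkScope());   // prints "task.Network"
    
            // Helper is handed addGroup("IO"): everything lands under an "IO" level.
            SimpleIoMetrics nested = new SimpleIoMetrics(task.addGroup("IO"));
            System.out.println(nested.networkScope()); // prints "task.IO.Network"
        }
    }
    ```
    
    Widening the constructor parameter to the general group type is what allows the second variant, since the result of `addGroup(...)` is a plain group rather than the task group itself.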


> Expose optionally detailed metrics about network queue lengths
> --------------------------------------------------------------
>
>                 Key: FLINK-5090
>                 URL: https://issues.apache.org/jira/browse/FLINK-5090
>             Project: Flink
>          Issue Type: New Feature
>          Components: Metrics, Network
>    Affects Versions: 1.1.3
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>
> For debugging purposes, it is important to have access to more detailed 
> metrics about the length of network input and output queues.



