
ASF GitHub Bot commented on FLINK-7781:

Github user zentol commented on a diff in the pull request:

    --- Diff: 
    @@ -0,0 +1,70 @@
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.flink.runtime.rest.handler.legacy.metrics;
    +import java.util.Map;
    +import java.util.concurrent.Executor;
    + * Request handler that returns, aggregated across all subtasks of a single tasks, a list of all available metrics or the
    + * values for a set of metrics.
    + *
    + * <p>If the query parameters do not contain a "get" parameter the list of all metrics is returned.
    + * {@code {"available": [ { "name" : "X", "id" : "X" } ] } }
    + *
    + * <p>If the query parameters do contain a "get" parameter, a comma-separated list of metric names is expected as a value.
    + * {@code /metrics?get=X,Y}
    + * The handler will then return a list containing the values of the requested metrics.
    + * {@code [ { "id" : "X", "value" : "S" }, { "id" : "Y", "value" : "T" } ] }
    + *
    + * <p>The "agg" query parameter is used to define which aggregates should be calculated. Available aggregations are
    + * "sum", "max", "min" and "avg".
    + */
    +public class SubtaskMetricsHandler extends AbstractMetricsHandler {
    +   private static final String SUBTASK_METRICS_REST_PATH = 
    +   public SubtaskMetricsHandler(Executor executor, MetricFetcher fetcher) {
    +           super(executor, fetcher);
    +   }
    +   @Override
    +   public String[] getPaths() {
    +           return new String[]{SUBTASK_METRICS_REST_PATH};
    +   }
    +   @Override
    +   protected Map<String, String> getMapFor(Map<String, String> pathParams, MetricStore metrics) {
    +           String subtaskNumString = 
    +           int subtaskNum;
    +           try {
    +                   subtaskNum = Integer.valueOf(subtaskNumString);
    +           } catch (NumberFormatException nfe) {
    +                   return null;
    +           }
    --- End diff --
    An unknown subtask index will cause the method to return null
    (```metrics.getSubtaskMetricStore(...)```), which is the same behavior as all
    other metric handlers.
    We can't provide an accurate message in this case; we only know that this
    subtask index is unknown to the store, but not why (does it exceed the
    parallelism, or have the metrics not arrived yet?).
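
To illustrate the point, here is a minimal, self-contained sketch of that lookup behavior. It does not use Flink's actual `MetricStore`; the class `SubtaskLookupSketch` and its `add` method are hypothetical stand-ins. It only shows that an unparseable index and an index the store has never seen both end up as the same `null` result, so the caller cannot tell the cases apart.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the handler's lookup; NOT Flink's MetricStore API.
final class SubtaskLookupSketch {

    // Maps subtask index -> (metric name -> value); purely illustrative.
    private final Map<Integer, Map<String, String>> subtasks = new HashMap<>();

    // Hypothetical helper that records one metric value for one subtask.
    void add(int subtaskIndex, String metric, String value) {
        subtasks.computeIfAbsent(subtaskIndex, k -> new HashMap<>()).put(metric, value);
    }

    // Returns null both for an unparseable index and for an index the store
    // has not seen (too large, or its metrics simply have not arrived yet).
    Map<String, String> getMapFor(String subtaskNumString) {
        int subtaskNum;
        try {
            subtaskNum = Integer.parseInt(subtaskNumString);
        } catch (NumberFormatException nfe) {
            return null;
        }
        return subtasks.get(subtaskNum);
    }
}
```

With only a `null` to go on, any error message would have to guess at the cause, which is why the handler stays consistent with the others instead.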

> Support simple on-demand metrics aggregation
> --------------------------------------------
>                 Key: FLINK-7781
>                 URL: https://issues.apache.org/jira/browse/FLINK-7781
>             Project: Flink
>          Issue Type: Improvement
>          Components: Metrics, REST
>    Affects Versions: 1.4.0
>            Reporter: Chesnay Schepler
>            Assignee: Chesnay Schepler
>             Fix For: 1.4.0
> We should support aggregations (min, max, avg, sum) of metrics in the REST 
> API. This is primarily about aggregating across subtasks, for example the 
> number of incoming records across all subtasks.
> This is useful for simple use-cases where a dedicated metrics backend is 
> overkill, and will allow us to provide better metrics in the web UI (since we 
> can expose these aggregates as well).
> I propose to add a new query parameter "agg=[min,max,avg,sum]". To start, 
> this parameter should only be supported for task metrics. (This is simply the 
> main use-case I have in mind.)
> The aggregation should (naturally) only work for numeric metrics.
> We will need a HashSet of the metrics that exist for the subtasks of a given 
> task, which has to be updated in {{MetricStore#add}}.
> All task metrics are either stored as
> # {{<subtask-index>.<metric>}} or
> # {{<subtask-index>.<operator-name>.<metric>}}.
> If a user sends a request {{get=mymetric,agg=sum}}, only the metrics of the 
> first kind are to be considered. Similarly, given a request 
> {{get=myoperator.mymetric,agg=sum}} only metrics of the second kind are to be 
> considered.
> Ideally, the name of the aggregated metric (i.e. the original name without 
> subtask index) is also contained in the list of available metrics.
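
The matching and aggregation rules described above could be sketched roughly as follows. This is not the implementation from the pull request: the class `SubtaskAggregationSketch`, its methods, and the flat `Map<String, String>` of task metrics are assumptions made for illustration. The sketch strips the leading `<subtask-index>.` from each key and compares the remainder to the requested name exactly, so `mymetric` only matches keys of the first kind and `myoperator.mymetric` only matches keys of the second kind. Non-numeric values are skipped.

```java
import java.util.DoubleSummaryStatistics;
import java.util.Map;

// Hypothetical sketch; names and data layout are assumptions, not Flink code.
final class SubtaskAggregationSketch {

    // Collects statistics over all subtask values whose key, minus the
    // leading "<subtask-index>." prefix, equals the requested metric name.
    static DoubleSummaryStatistics aggregate(Map<String, String> taskMetrics, String requested) {
        DoubleSummaryStatistics stats = new DoubleSummaryStatistics();
        for (Map.Entry<String, String> entry : taskMetrics.entrySet()) {
            String key = entry.getKey();
            int dot = key.indexOf('.');
            if (dot < 0 || !key.substring(dot + 1).equals(requested)) {
                continue; // different metric, or different kind of key
            }
            String value = entry.getValue();
            if (value == null) {
                continue;
            }
            try {
                stats.accept(Double.parseDouble(value));
            } catch (NumberFormatException nfe) {
                // Aggregation only works for numeric metrics; skip the rest.
            }
        }
        return stats;
    }

    // Picks the aggregate named by the "agg" parameter.
    static double select(DoubleSummaryStatistics stats, String agg) {
        switch (agg) {
            case "sum": return stats.getSum();
            case "min": return stats.getMin();
            case "max": return stats.getMax();
            case "avg": return stats.getAverage();
            default: throw new IllegalArgumentException("Unknown aggregation: " + agg);
        }
    }
}
```

One caveat of this sketch: it assumes the subtask index never contains a dot, and it makes no attempt to handle operator names that do.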
