Thanks for the KIP! This looks really useful.

1)

> If the recording level for the application is set to INFO, a DEBUG-level
> metric that should be aggregated will not record values even if the
> metrics that records the aggregation is on recording level INFO

I think I'm missing something here: how is it possible for there to be a
"DEBUG-level metric that should be aggregated" and yet "the metrics that
records the aggregation is on INFO"? I thought that each metric to be
aggregated would be uniquely defined by the combination of metricName +
metricGroup, so there wouldn't be both an INFO- and a DEBUG-level metric to
distinguish between. It seems like if a user tries to aggregate a
DEBUG-level metric but passes in the recording level as INFO, this would be
a misconfiguration. Can you give an example to help me understand this?

2)

Why do we require users to supply an aggregator and initializer? Shouldn't
that be determined solely by the actual metric being aggregated? For
example, if a user wants to aggregate `commit-latency-avg`, then the only
sensible way to aggregate it is by averaging. Similarly, the only sensible
aggregation for `commit-latency-max` is to take the max, a `-total` metric
should be summed, and so on.

If it's not always possible to infer the aggregation type, then WDYT about
providing a few aggregation options rather than a fully pluggable
BiFunction that users have to implement themselves? It seems like the vast
majority of aggregations would be one of {avg, min, max, sum}, so we should
just give users those options directly. For example, instead of the
`initialAggregate` and `aggregateFunction` parameters, we could have an
enum with possible values AVERAGE, MIN, MAX, and SUM (am I missing any in
that list?). If a user really wants to do something more complicated, they
can always roll it up themselves. And if people ask for it, we can also go
back and add the pluggable BiFunction option later.
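To make that suggestion concrete, here's a rough Java sketch of what the enum-based option might look like (all names here are hypothetical illustrations, not from the KIP):

```java
import java.util.stream.DoubleStream;

// Hypothetical sketch: a fixed set of aggregation types replacing the
// user-supplied initializer + BiFunction. Names are illustrative only.
enum AggregationType {
    AVERAGE, MIN, MAX, SUM;

    // Combine the per-thread/per-task values of one metric into a single value.
    double aggregate(double[] values) {
        DoubleStream s = DoubleStream.of(values);
        switch (this) {
            case AVERAGE: return s.average().orElse(Double.NaN);
            case MIN:     return s.min().orElse(Double.NaN);
            case MAX:     return s.max().orElse(Double.NaN);
            case SUM:     return s.sum();
            default:      throw new IllegalStateException("unknown type: " + this);
        }
    }
}

public class AggregationTypeDemo {
    public static void main(String[] args) {
        // e.g. commit-latency values reported by three stream threads
        double[] commitLatencies = {10.0, 20.0, 30.0};
        System.out.println(AggregationType.AVERAGE.aggregate(commitLatencies)); // 20.0
        System.out.println(AggregationType.MAX.aggregate(commitLatencies));     // 30.0
        System.out.println(AggregationType.SUM.aggregate(commitLatencies));     // 60.0
    }
}
```

A user would then pass e.g. `AggregationType.AVERAGE` instead of implementing the initializer and BiFunction themselves.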
I'd just rather keep things as simple for users as possible until we've
heard actual feedback that more complicated options are desired.

3)

nit: I feel the name `tagLabels` is a bit subtle; what about
`tagsToAggregate`? (Just a suggestion, feel free to ignore.)

4)

Also a bit of a nit: why put all the configs in the
MetricsAggregationConfig class, except for the groupOfMetricsToAggregate
and nameOfMetricsToAggregate? It seems like they are just as much a config
as the `tagLabels`, for example. WDYT?

Sophie

On Wed, Sep 30, 2020 at 4:49 AM Bruno Cadonna <br...@confluent.io> wrote:

> Hi,
>
> I would like to propose the following KIP to add an API to the Kafka
> Streams client to aggregate metrics.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-674%3A+API+to+Aggregate+Metrics+in+Kafka+Streams
>
> Best,
> Bruno