Hi Jun, Jiangjie,

I'd just like to clarify that KIP-225 seems to use the per-partition metric the same way KIP-223 does.

I believe avg and max are still necessary because the MetricsReporter doesn't work in a "push" manner, and the "Value" MeasurableStat only keeps the last recorded entry. A MetricsReporter therefore usually polls to grab a current view. With Value this view is incomplete, because every entry recorded between two polls is lost, so it is not possible to compute Max/Min/Avg from it. Max/Min/Avg use SampledStat, which works over a rolling window of samples, so periodic polling works for them.

This is why I believe it's necessary to keep Avg, Min and Max for these metrics: otherwise we wouldn't be able to recompute them in an external monitoring system.

Am I wrong in thinking this?

Thanks,
Charly

On Wed, Nov 15, 2017 at 2:02 AM, Jun Rao <j...@confluent.io> wrote:

> Hi, Charly,
>
> Thanks for KIP-225. Your proposal looks reasonable.
>
> Hi, Jiangjie,
>
> Do you think the approach that KIP-225 proposes is better for exposing the
> per partition metric? Also, do we really need the per partition record-lag-avg
> and record-lag-max? It seems that an external monitoring system can always
> derive that from the per partition record-lag.
>
> Thanks,
>
> Jun
>
> On Tue, Nov 14, 2017 at 6:57 AM, charly molter <charly.mol...@gmail.com>
> wrote:
>
> > Hi Jun, Hu,
> >
> > I have KIP-225 open for adding tags to records-lag:
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=74686649
> >
> > I have a patch more or less ready so I could probably get the fix checked
> > in (after the vote) and you could build on top of it. Otherwise we could
> > merge both KIPs if you want but they do sound different to me.
> >
> > Thanks!
> > Charly
> >
> > On Tue, Nov 14, 2017 at 11:42 AM, Hu Xi <huxi...@hotmail.com> wrote:
> >
> > > Jun,
> > >
> > > Let me double confirm with your comments:
> > >
> > > 1 remove partition-level records-lead-avg and records-lead-min since they
> > > both can be deduced by external monitoring system.
> > >
> > > 2 Tag partition-level records-lead with topic&partition info
> > >
> > > If they are the case you expect, do we need to do the same thing for those
> > > `lag` metrics? Seems partition-level records-lag metrics are not tagged
> > > with topic&partition information which might deserve a bug.
> > >
> > > huxihx
> > >
> > > ________________________________
> > > From: Jun Rao <j...@confluent.io>
> > > Sent: November 14, 2017 12:44
> > > To: dev@kafka.apache.org
> > > Subject: Re: Re: [DISCUSS]KIP-223 - Add per-topic min lead and per-partition
> > > lead metrics to KafkaConsumer
> > >
> > > Hi, Hu,
> > >
> > > Currently, records-lag-max is an attribute for the mbean
> > > kafka.consumer:type=consumer-fetch-manager-metrics,client-id="{client-id}".
> > > So, it probably makes sense for records-lead-min to be an attribute under
> > > the same mbean.
> > >
> > > The partition level records-lead can probably be an attribute for the mbean
> > > kafka.consumer:type=consumer-fetch-manager-metrics,client-id="{client-id}",topic=topic1,partition=0,
> > > where topic and partition are the tags. This matches the topic level mbeans
> > > that we have in the consumer. I am not sure what the per partition level
> > > records-lead-min and records-lead-avg are. Are they the min/avg of the lead
> > > since the consumer is started? I am not sure we need those since an
> > > external monitoring system can always derive them from records-lead.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Mon, Nov 13, 2017 at 8:10 PM, Hu Xi <huxi...@hotmail.com> wrote:
> > >
> > > > Jun,
> > > >
> > > > Thanks for the feedback. Some things need to make sure.
> > > > Currently, these
> > > > new-added metrics follow the exact naming convention with those 'lag'
> > > > counterparts, as shown below:
> > > >
> > > > Consumer-level metric:
> > > >
> > > > records-lag-max ==> records-lead-min
> > > >
> > > > Partition-level metrics:
> > > >
> > > > <topic>-<partitionId>.records-lag ==> <topic>-<partitionId>.records-lead
> > > >
> > > > <topic>-<partitionId>.records-lag-max ==> <topic>-<partitionId>.records-lead-min
> > > >
> > > > <topic>-<partitionId>.records-lag-avg ==> <topic>-<partitionId>.records-lead-avg
> > > >
> > > > Correct me if I am wrong, but what you mentioned `*records-lead-avg and
> > > > records-lead-min don't need the partition prefix since they are aggregates
> > > > across all partitions*` seemed to break the naming rule above. Do we
> > > > still have to keep the same rule with the "lag" metrics?
> > > >
> > > > huxihx
> > > >
> > > > ------------------------------
> > > > *From:* Jun Rao <j...@confluent.io>
> > > > *Sent:* November 14, 2017 1:48
> > > > *To:* dev@kafka.apache.org
> > > > *Subject:* Re: [DISCUSS]KIP-223 - Add per-topic min lead and per-partition
> > > > lead metrics to KafkaConsumer
> > > >
> > > > Hi, Hu,
> > > >
> > > > Thanks for the KIP. Looks good overall. Could you document the mbean name
> > > > for the new metrics? We probably want the name to be consistent with
> > > > records-max-lag as described in
> > > > http://kafka.apache.org/documentation/#monitoring. Also, it seems that
> > > > records-lead-avg and records-lead-min don't need the partition prefix since
> > > > they are aggregates across all partitions. For records-lead, it seems that
> > > > it's better to add the topic partition as a tag, instead of as a prefix in
> > > > the metric name.
> > > >
> > > > Jun
> > > >
> > > > On Thu, Nov 9, 2017 at 1:03 AM, Hu Xi <huxi...@hotmail.com> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > As per Jun Rao's suggestion, I opened up the KIP-223(
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-223+-+Add+per-topic+min+lead+and+per-partition+lead+metrics+to+KafkaConsumer)
> > > > > concerning adding new kinds of lag metrics for KafkaConsumer. Be free to
> > > > > leave your comments here. Thanks in advance.
> >
> > --
> > Charly Molter

--
Charly Molter
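
P.S. To make the polling argument at the top of this mail concrete, here is a minimal sketch. It is not Kafka's implementation: `LastValue`, `WindowedMax` and `simulate` are hypothetical names, the lag values are made up, and the window is only a simplified stand-in for what a SampledStat does. It records a series of lag values and lets a reporter poll every fifth record.

```python
from collections import deque


class LastValue:
    """Hypothetical gauge that keeps only the most recently recorded value."""

    def __init__(self):
        self.value = None

    def record(self, value, now):
        self.value = value

    def measure(self, now):
        return self.value


class WindowedMax:
    """Hypothetical stat that reports the max over the last `window` time units."""

    def __init__(self, window):
        self.window = window
        self.samples = deque()  # (timestamp, value) pairs, oldest first

    def record(self, value, now):
        self.samples.append((now, value))

    def measure(self, now):
        # Drop samples that have fallen out of the rolling window.
        while self.samples and self.samples[0][0] <= now - self.window:
            self.samples.popleft()
        return max((v for _, v in self.samples), default=None)


def simulate(lags, poll_every, window):
    """Record one lag per time unit; poll both stats every `poll_every` units."""
    last, wmax = LastValue(), WindowedMax(window)
    polled_last, polled_max = [], []
    for t, lag in enumerate(lags, start=1):
        last.record(lag, t)
        wmax.record(lag, t)
        if t % poll_every == 0:
            polled_last.append(last.measure(t))
            polled_max.append(wmax.measure(t))
    return polled_last, polled_max


# Made-up lag values with two spikes (50 and 90) that fall between polls.
lags = [1, 50, 2, 3, 4, 5, 90, 6, 7, 8]
polled_last, polled_max = simulate(lags, poll_every=5, window=5)
print("last value seen by poller:", polled_last)
print("windowed max seen by poller:", polled_max)
```

In this sketch the poller never sees either spike through the last-value gauge, so an external system could not reconstruct the max from it, while the windowed stat still reports both. The window here is chosen equal to the polling interval; a shorter window would also miss samples.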