Hi all, We can discuss the expiry implementation as part of the PR review. If there are no further comments, I will start vote on this KIP tomorrow. Please let me know if there are any other concerns.
On Fri, Aug 18, 2017 at 4:55 AM, Rajini Sivaram <rajinisiva...@gmail.com> wrote: > Perhaps we could register dynamically for now. If we find that the cost of > retaining these is high, we can add the code to expire them later. Is that > ok? > > Regards, > > Rajini > > > > On Fri, Aug 18, 2017 at 9:41 AM, Ismael Juma <ism...@juma.me.uk> wrote: > >> Can we quantify the cost of having these metrics around if they are >> dynamically registered? Given that the maximum is bounded at development >> time, is it really worth all the extra code? >> >> Ismael >> >> On Fri, Aug 18, 2017 at 9:34 AM, Rajini Sivaram <rajinisiva...@gmail.com> >> wrote: >> >> > Jun, >> > >> > It feels more consistent to add errors as yammer metrics similar to >> other >> > request metrics. Perhaps we could add some code to track and remove >> these >> > if unused? It is a bit more work, but it would keep the externals >> > consistent. >> > >> > Ismael/Manikumar, >> > >> > Agree that version as a String attribute makes more sense. >> Unfortunately, >> > the whole KafkaMetric implementation is written around a single "double" >> > type, so introducing a new type is a big change. But I suppose it can be >> > done. I have updated the KIP. >> > >> > Regards, >> > >> > Rajini >> > >> > >> > On Fri, Aug 18, 2017 at 7:42 AM, Manikumar <manikumar.re...@gmail.com> >> > wrote: >> > >> > > I agree it will be good if we can add "commit id/version" as an >> > > attribute value. >> > > It will be easy to parse. But as of now, KafkaMetric supports only >> > > numerical values. >> > > >> > > On Fri, Aug 18, 2017 at 5:49 AM, Ismael Juma <ism...@juma.me.uk> >> wrote: >> > > >> > > > Hi Rajini, >> > > > >> > > > About the gauges, I was thinking that the attribute would be the >> value >> > > > (i.e. commit id or version). I understand that Kafka Metrics doesn't >> > > > support this (unlike Yammer Metrics), but would it make sense to >> add? >> > > > >> > > > Ismael >> > > > >> > > > On Thu, Aug 17, 2017 at 2:54 PM, Rajini Sivaram < >> > rajinisiva...@gmail.com >> > > > >> > > > wrote: >> > > > >> > > > > Hi Ismael, >> > > > > >> > > > > Thank you for the review. >> > > > > >> > > > > 1. Agree on keeping it simple with dynamic registration and no >> > expiry. >> > > > Will >> > > > > wait for Jun's feedback before updating KIP. >> > > > > 2. I have switched to two metrics for commit-id and version (not >> sure >> > > if >> > > > it >> > > > > matches what you meant). I also added the client-id tag which is >> used >> > > in >> > > > > all metrics from clients. >> > > > > >> > > > > Regards, >> > > > > >> > > > > Rajini >> > > > > >> > > > > >> > > > > On Thu, Aug 17, 2017 at 10:55 AM, Ismael Juma <ism...@juma.me.uk> >> > > wrote: >> > > > > >> > > > > > Thanks for the KIP, Rajini. I think this is helpful too. A few >> > minor >> > > > > > comments. >> > > > > > >> > > > > > 1. About the number of metrics and expiration, if we dynamically >> > > > register >> > > > > > metrics for the error codes, the number is likely to be much >> lower >> > > than >> > > > > > 30*30, probably less than 100. If we were using Kafka Metrics >> for >> > > this, >> > > > > we >> > > > > > could easily add a long expiration period to be conservative, >> but I >> > > am >> > > > > not >> > > > > > sure this is supported by Yammer Metrics. If it is not, there's >> an >> > > > > argument >> > > > > > for keeping it simple. >> > > > > > >> > > > > > 2. Would it make sense to use 2 gauges for the version and >> commit >> > id? >> > > > It >> > > > > > seems more intuitive than having those values as tags. >> > > > > > >> > > > > > Ismael >> > > > > > >> > > > > > On Thu, Aug 17, 2017 at 10:19 AM, Rajini Sivaram < >> > > > > rajinisiva...@gmail.com> >> > > > > > wrote: >> > > > > > >> > > > > > > Hi Jun, >> > > > > > > >> > > > > > > Thank you for the review. >> > > > > > > >> > > > > > > 1. Makes sense. I have updated the KIP. >> > > > > > > 2. Moved to a new group ZooKeeperClient >> > > > > > > 3. It is a gauge, so it will have a single attribute called >> Value >> > > > with >> > > > > a >> > > > > > > constant value of 1. >> > > > > > > >> > > > > > > Regards, >> > > > > > > >> > > > > > > Rajini >> > > > > > > >> > > > > > > On Thu, Aug 17, 2017 at 3:16 AM, Jun Rao <j...@confluent.io> >> > wrote: >> > > > > > > >> > > > > > > > Hi, Rajini, >> > > > > > > > >> > > > > > > > Thanks for the KIP. A few comments. >> > > > > > > > >> > > > > > > > 1. We have 30+ requests and 30+ error code and growing. So, >> the >> > > > > > > combination >> > > > > > > > can be large. Perhaps it's useful to expire an error metric >> if >> > > it's >> > > > > no >> > > > > > > > longer updated after some time? We did something similar for >> > the >> > > > > quota >> > > > > > > > metric. >> > > > > > > > >> > > > > > > > 2. It's a bit weird to put the ZK latency metric under >> > > > > > > > type=SessionExpireListener. >> > > > > > > > Perhaps it's more intuitive to put it in a separate type. >> > > > > > > > >> > > > > > > > 3. For the client version metric, since we representing >> > commit_id >> > > > and >> > > > > > > > version as tags in the metric name. So the mbean will have >> no >> > > > > > attributes? >> > > > > > > > >> > > > > > > > Jun >> > > > > > > > >> > > > > > > > >> > > > > > > > >> > > > > > > > On Wed, Aug 16, 2017 at 4:05 PM, Roger Hoover < >> > > > > roger.hoo...@gmail.com> >> > > > > > > > wrote: >> > > > > > > > >> > > > > > > > > I think it would useful to make clear somewhere for each >> > > metric, >> > > > > the >> > > > > > > > level >> > > > > > > > > at which it's counted. I don't know all the details of >> the >> > > Kafka >> > > > > > > > protocol >> > > > > > > > > but it might be something like >> > > > > > > > > >> > > > > > > > > ProduceRequest, Fetch Request - counted at per-partition >> > level >> > > > > > > > > All other requests are 1:1 with client requests? >> > > > > > > > > >> > > > > > > > > Cheers, >> > > > > > > > > >> > > > > > > > > Roger >> > > > > > > > > >> > > > > > > > > On Wed, Aug 16, 2017 at 4:02 PM, Roger Hoover < >> > > > > > roger.hoo...@gmail.com> >> > > > > > > > > wrote: >> > > > > > > > > >> > > > > > > > > > Rajini, >> > > > > > > > > > >> > > > > > > > > > Thank you for the KIP. These are very helpful >> additions. >> > > One >> > > > > > > question >> > > > > > > > > on >> > > > > > > > > > the error code metrics: >> > > > > > > > > > >> > > > > > > > > > Will the total error counting happen at the the level of >> > > topic >> > > > > > > > partition? >> > > > > > > > > > For example, if a single ProduceRequest contains >> messages >> > to >> > > > > append >> > > > > > > to >> > > > > > > > 3 >> > > > > > > > > > partitions and say all 3 appends are successful, the >> > counter >> > > > > > > > > > for kafka.network:type=RequestMetrics,name= >> > > > ErrorsPerSec,request= >> > > > > > > > > ProduceRequest,error=0 >> > > > > > > > > > will be incremented by 3? >> > > > > > > > > > >> > > > > > > > > > Thanks, >> > > > > > > > > > >> > > > > > > > > > Roger >> > > > > > > > > > >> > > > > > > > > > On Wed, Aug 16, 2017 at 12:10 PM, Rajini Sivaram < >> > > > > > > > > rajinisiva...@gmail.com> >> > > > > > > > > > wrote: >> > > > > > > > > > >> > > > > > > > > >> I have created a KIP to add some additional metrics to >> > > support >> > > > > > > health >> > > > > > > > > >> checks: >> > > > > > > > > >> >> > > > > > > > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP- >> > > 188+-+ >> > > > > > > > > >> Add+new+metrics+to+support+health+checks >> > > > > > > > > >> >> > > > > > > > > >> Feedback and suggestions are welcome. >> > > > > > > > > >> >> > > > > > > > > >> Regards, >> > > > > > > > > >> >> > > > > > > > > >> Rajini >> > > > > > > > > >> >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> > >