Sriram, Makes sense. I am cool moving this stuff into its own repo if people think that is better. I'm not sure it would get much contribution but when I started messing with this I did have a lot of grand ideas of making adding metrics to a sensor dynamic so you could add more stuff in real-time(via jmx, say) and/or externalize all your metrics and config to a separate file like log4j with only the points of instrumentation hard-coded.
-Jay On Wed, Feb 12, 2014 at 2:07 PM, Sriram Subramanian < srsubraman...@linkedin.com> wrote: > I am actually neutral to this change. I found the replies were more > towards the implementation and features so far. I would like the community > to think about the questions below before making a decision. My opinion on > this is that it has potential to be its own project and it would attract > developers who are specifically interested in contributing to metrics. I > am skeptical that the Kafka contributors would focus on improving this > library (apart from bug fixes) instead of developing/contributing to other > core pieces. It would be useful to continue and keep it decoupled from > rest of Kafka (if it resides in the Kafka code base.) so that we can move > it out anytime to its own project. > > > On 2/12/14 1:21 PM, "Jay Kreps" <jay.kr...@gmail.com> wrote: > > >Hey Sriram, > > > >Not sure if these are actually meant as questions or more veiled comments. > >In an case I tried to give my 2 cents inline. > > > >On Tue, Feb 11, 2014 at 11:12 PM, Sriram Subramanian < > >srsubraman...@linkedin.com> wrote: > > > >> I think answering the questions below would help to make a better > >> decision. I am all for writing better code and having superior > >> functionalities but it is worth thinking about stuff outside just code > >>in > >> this case - > >> > >> 1. Does metric form a core piece of kafka? Does it help kafka greatly in > >> providing better core functionalities? I would always like a project to > >>do > >> one thing really well. Metrics is a non trivial amount of code. > >> > > > >Metrics are obviously important, and obviously improving our metrics > >system > >would be good. That said this may or may not be better, and even if it is > >better that betterness might not outweigh other considerations. That is > >what we are discussing. > > > > > >> 2. Does it make sense to be part of Kafka or its own project? If this > >> metrics library has the potential to be better than metrics-core, I > >>would > >> be interested in other projects take advantage of it. > >> > > > >It could be either. > > > >3. Can Kafka maintain this library as new members join and old members > >> leave? Would this be a piece of code that no one (in Kafka) in the > >>future > >> spends time improving if the original author left? > >> > > > >I am not going anywhere in the near term, but if I did, yes, this would be > >like any other code we have. As with yammer metrics or any other code at > >that point we would either use it as is or someone would improve it. > > > > > >> 4. Does it affect the schedule of producer rewrite? This needs its own > >> stabilization and modification to existing metric dashboards if the > >>format > >> is changed. Many times such cost are not factored in and a project loses > >> time before realizing the extra time required to make a library as this > >> operational. > >> > > > >Probably not. The metrics are going to change regardless of whether we use > >the same library or not. If we think this is better I don't mind putting > >in > >a little extra effort to get there. > > > >Irrespective I think this is probably not the right thing to optimize for. > > > > > >> I am sure we can do better when we write code to a specific use case (in > >> this case, kafka) rather than building a generic library that suits all > >> (metrics-core) but I would like us to have answers to the questions > >>above > >> and be prepared before we proceed to support this with the producer > >> rewrite. > > > > > >Naturally we are all considering exactly these things, that is exactly the > >reason I started the thread. > > > >-Jay > > > > > >> On 2/11/14 6:28 PM, "Jun Rao" <jun...@gmail.com> wrote: > >> > >> >Thanks for the detailed write-up. It's well thought through. A few > >> >comments: > >> > > >> >1. I have a couple of concerns on the percentiles. The first issue is > >>that > >> >It requires the user to know the value range. Since the range for > >>things > >> >like message size (in millions) is quite different from those like > >>request > >> >time (less than 100), it's going to be hard to pick a good global > >>default > >> >range. Different apps could be dealing with different message size. So > >> >they > >> >probably will have to customize the range. Another issue is that it can > >> >only report values at the bucket boundaries. So, if you have 1000 > >>buckets > >> >and a value range of 1 million, you will only see 1000 possible values > >>as > >> >the quantile, which is probably too sparse. The implementation of > >> >histogram > >> >in metrics-core keeps a fix size of samples, which avoids both issues. > >> > > >> >2. We need to document the 3-part metrics names better since it's not > >> >obvious what the convention is. Also, currently the name of the sensor > >>and > >> >the metrics defined in it are independent. Would it make sense to have > >>the > >> >sensor name be a prefix of the metric name? > >> > > >> >Overall, this approach seems to be cleaner than metrics-core by > >>decoupling > >> >measuring and reporting. The main benefit of metrics-core seems to be > >>the > >> >existing reporters. Since not that many people voted for metrics-core, > >>I > >> >am > >> >ok with going with the new implementation. My only recommendation is to > >> >address the concern on percentiles. > >> > > >> >Thanks, > >> > > >> >Jun > >> > > >> > > >> > > >> >On Thu, Feb 6, 2014 at 12:51 PM, Jay Kreps <jay.kr...@gmail.com> > wrote: > >> > > >> >> Hey guys, > >> >> > >> >> I wanted to kick off a quick discussion of metrics with respect to > >>the > >> >>new > >> >> producer and consumer (and potentially the server). > >> >> > >> >> At a high level I think there are three approaches we could take: > >> >> 1. Plain vanilla JMX > >> >> 2. Use Coda Hale (AKA Yammer) Metrics > >> >> 3. Do our own metrics (with JMX as one output) > >> >> > >> >> 1. Has the advantage that JMX is the most commonly used java thing > >>and > >> >> plugs in reasonably to most metrics systems. JMX is included in the > >>JDK > >> >>so > >> >> it doesn't impose any additional dependencies on clients. It has the > >> >> disadvantage that plain vanilla JMX is a pain to use. We would need a > >> >>bunch > >> >> of helper code for maintaining counters to make this reasonable. > >> >> > >> >> 2. Coda Hale metrics is pretty good and broadly used. It supports JMX > >> >> output as well as direct output to many other types of systems. The > >> >>primary > >> >> downside we have had with Coda Hale has to do with the clients and > >> >>library > >> >> incompatibilities. We are currently on an older more popular version. > >> >>The > >> >> newer version is a rewrite of the APIs and is incompatible. > >>Originally > >> >> these were totally incompatible and people had to choose one or the > >> >>other. > >> >> I think that has been improved so now the new version is a totally > >> >> different package. But even in this case you end up with both > >>versions > >> >>if > >> >> you use Kafka and we are on a different version than you which is > >>going > >> >>to > >> >> be pretty inconvenient. > >> >> > >> >> 3. Doing our own has the downside of potentially reinventing the > >>wheel, > >> >>and > >> >> potentially needing to work out any bugs in our code. The upsides > >>would > >> >> depend on the how good the reinvention was. As it happens I did a > >>quick > >> >> (~900 loc) version of a metrics library that is under > >> >>kafka.common.metrics. > >> >> I think it has some advantages over the Yammer metrics package for > >>our > >> >> usage beyond just not causing incompatibilities. I will describe this > >> >>code > >> >> so we can discuss the pros and cons. Although I favor this approach I > >> >>have > >> >> no emotional attachment and wouldn't be too sad if I ended up > >>deleting > >> >>it. > >> >> Here are javadocs for this code, though I haven't written much > >> >> documentation yet since I might end up deleting it: > >> >> > >> >> Here is a quick overview of this library. > >> >> > >> >> There are three main public interfaces: > >> >> Metrics - This is a repository of metrics being tracked. > >> >> Metric - A single, named numerical value being measured (i.e. a > >> >>counter). > >> >> Sensor - This is a thing that records values and updates zero or > >>more > >> >> metrics > >> >> > >> >> So let's say we want to track three values about message sizes; > >> >> specifically say we want to record the average, the maximum, the > >>total > >> >>rate > >> >> of bytes being sent, and a count of messages. Then we would do > >>something > >> >> like this: > >> >> > >> >> // setup code > >> >> Metrics metrics = new Metrics(); // this is a global "singleton" > >> >> Sensor sensor = metrics.sensor("kafka.producer.message.sizes"); > >> >> sensor.add("kafka.producer.message-size.avg", new Avg()); > >> >> sensor.add("kafka.producer.message-size.max", new Max()); > >> >> sensor.add("kafka.producer.bytes-sent-per-sec", new Rate()); > >> >> sensor.add("kafka.producer.message-count", new Count()); > >> >> > >> >> // now when we get a message we do this > >> >> sensor.record(messageSize); > >> >> > >> >> The above code creates the global metrics repository, creates a > >>single > >> >> Sensor, and defines 5 named metrics that are updated by that Sensor. > >> >> > >> >> Like Yammer Metrics (YM) I allow you to plug in "reporters", > >>including a > >> >> JMX reporter. Unlike the Coda Hale JMX reporter the reporter I have > >>keys > >> >> off the metric names not the Sensor names, which I think is an > >> >> improvement--I just use the convention that the last portion of the > >> >>name is > >> >> the attribute name, the second to last is the mbean name, and the > >>rest > >> >>is > >> >> the package. So in the above example there is a producer mbean that > >>has > >> >>a > >> >> avg and max attribute and a producer mbean that has a > >>bytes-sent-per-sec > >> >> and message-count attribute. This is nice because you can logically > >> >>group > >> >> the values reported irrespective of where in the program they are > >> >> computed--that is an mbean can logically group attributes computed > >>off > >> >> different sensors. This means you can report values by logical > >> >>subsystem. > >> >> > >> >> I also allow the concept of hierarchical Sensors which I think is a > >>good > >> >> convenience. I have noticed a common pattern in systems where you > >>need > >> >>to > >> >> roll up the same values along different dimensions. An simple > >>example is > >> >> metrics about qps, data rate, etc on the broker. These we want to > >> >>capture > >> >> in aggregate, but also broken down by topic-id. You can do this > >>purely > >> >>by > >> >> defining the sensor hierarchy: > >> >> Sensor allSizes = metrics.sensor("kafka.producer.sizes"); > >> >> Sensor topicSizes = metrics.sensor("kafka.producer." + topic + > >> >>".sizes", > >> >> allSizes); > >> >> Now each actual update will go to the appropriate topicSizes sensor > >> >>(based > >> >> on the topic name), but allSizes metrics will get updated too. I also > >> >> support multiple parents for each sensor as well as multiple layers > >>of > >> >> hiearchy, so you can define a more elaborate DAG of sensors. An > >>example > >> >>of > >> >> how this would be useful is if you wanted to record your metrics > >>broken > >> >> down by topic AND client id as well as the global aggregate. > >> >> > >> >> Each metric can take a configurable Quota value which allows us to > >>limit > >> >> the maximum value of that sensor. This is intended for use on the > >> >>server as > >> >> part of our Quota implementation. The way this works is that you > >>record > >> >> metrics as usual: > >> >> mySensor.record(42.0) > >> >> However if this event occurance causes one of the metrics to exceed > >>its > >> >> maximum allowable value (the quota) this call will throw a > >> >> QuotaViolationException. The cool thing about this is that it means > >>we > >> >>can > >> >> define quotas on anything we capture metrics for, which I think is > >> >>pretty > >> >> cool. > >> >> > >> >> Another question is how to handle windowing of the values? Metrics > >>want > >> >>to > >> >> record the "current" value, but the definition of current is > >>inherently > >> >> nebulous. A few of the obvious gotchas are that if you define > >>"current" > >> >>to > >> >> be a number of events you can end up measuring an arbitrarily long > >> >>window > >> >> of time if the event rate is low (e.g. you think you are getting 50 > >> >> messages/sec because that was the rate yesterday when all events > >> >>topped). > >> >> > >> >> Here is how I approach this. All the metrics use the same windowing > >> >> approach. We define a single window by a length of time or number of > >> >>values > >> >> (you can use either or both--if both the window ends when *either* > >>the > >> >>time > >> >> bound or event bound is hit). The typical problem with hard window > >> >> boundaries is that at the beginning of the window you have no data > >>and > >> >>the > >> >> first few samples are too small to be a valid sample. (Consider if > >>you > >> >>were > >> >> keeping an avg and the first value in the window happens to be very > >>very > >> >> high, if you check the avg at this exact time you will conclude the > >>avg > >> >>is > >> >> very high but on a sample size of one). One simple fix would be to > >> >>always > >> >> report the last complete window, however this is not appropriate here > >> >> because (1) we want to drive quotas off it so it needs to be current, > >> >>and > >> >> (2) since this is for monitoring you kind of care more about the > >>current > >> >> state. The ideal solution here would be to define a backwards looking > >> >> sliding window from the present, but many statistics are actually > >>very > >> >>hard > >> >> to compute in this model without retaining all the values which > >>would be > >> >> hopelessly inefficient. My solution to this is to keep a configurable > >> >> number of windows (default is two) and combine them for the estimate. > >> >>So in > >> >> a two sample case depending on when you ask you have between one and > >>two > >> >> complete samples worth of data to base the answer off of. Provided > >>the > >> >> sample window is large enough to get a valid result this satisfies > >>both > >> >>of > >> >> my criteria of incorporating the most recent data and having > >>reasonable > >> >> variance at all times. > >> >> > >> >> Another approach is to use an exponential weighting scheme to combine > >> >>all > >> >> history but emphasize the recent past. I have not done this as it > >>has a > >> >>lot > >> >> of issues for practical operational metrics. I'd be happy to > >>elaborate > >> >>on > >> >> this if anyone cares... > >> >> > >> >> The window size for metrics has a global default which can be > >> >>overridden at > >> >> either the sensor or individual metric level. > >> >> > >> >> In addition to these time series values the user can directly expose > >> >>some > >> >> method of their choosing JMX-style by implementing the Measurable > >> >>interface > >> >> and registering that value. E.g. > >> >> metrics.addMetric("my.metric", new Measurable() { > >> >> public double measure(MetricConfg config, long now) { > >> >> return this.calculateValueToExpose(); > >> >> } > >> >> }); > >> >> This is useful for exposing things like the accumulator free memory. > >> >> > >> >> The set of metrics is extensible, new metrics can be added by just > >> >> implementing the appropriate interfaces and registering with a > >>sensor. I > >> >> implement the following metrics: > >> >> total - the sum of all values from the given sensor > >> >> count - a windowed count of values from the sensor > >> >> avg - the sample average within the windows > >> >> max - the max over the windows > >> >> min - the min over the windows > >> >> rate - the rate in the windows (e.g. the total or count divided by > >>the > >> >> ellapsed time) > >> >> percentiles - a collection of percentiles computed over the window > >> >> > >> >> My approach to percentiles is a little different from the yammer > >>metrics > >> >> package. My complaint about the yammer metrics approach is that it > >>uses > >> >> rather expensive sampling and uses kind of a lot of memory to get a > >> >> reasonable sample. This is problematic for per-topic measurements. > >> >> > >> >> Instead I use a fixed range for the histogram (e.g. 0.0 to 30000.0) > >> >>which > >> >> directly allows you to specify the desired memory use. Any value > >>below > >> >>the > >> >> minimum is recorded as -Infinity and any value above the maximum as > >> >> +Infinity. I think this is okay as all metrics have an expected range > >> >> except for latency which can be arbitrarily large, but for very high > >> >> latency there is no need to model it exactly (e.g. 30 seconds + > >>really > >> >>is > >> >> effectively infinite). Within the range values are recorded in > >>buckets > >> >> which can be either fixed width or increasing width. The increasing > >> >>width > >> >> is analogous to the idea of significant figures, that is if your > >>value > >> >>is > >> >> in the range 0-10 you might want to be accurate to within 1ms, but if > >> >>it is > >> >> 20000 there is no need to be so accurate. I implemented a linear > >>bucket > >> >> size where the Nth bucket has width proportional to N. An exponential > >> >> bucket size would also be sensible and could likely be derived > >>directly > >> >> from the floating point representation of a the value. > >> >> > >> >> I'd like to get some feedback on this metrics code and make a > >>decision > >> >>on > >> >> whether we want to use it before I actually go ahead and add all the > >> >> instrumentation in the code (otherwise I'll have to redo it if we > >>switch > >> >> approaches). So the next topic of discussion will be which actual > >> >>metrics > >> >> to add. > >> >> > >> >> -Jay > >> >> > >> > >> > >