I agree that dropwizard already offers a lot of functionality and we shouldn't reimplement it. Therefore, I've introduced a Flink histogram interface which is implemented by a dropwizard histogram wrapper. That way Flink has it's own histogram abstraction and dropwizard histograms can be used with Flink via this wrapper class. See the PR [1] for more information.
[1] https://github.com/apache/flink/pull/2112 Cheers, Till On Tue, Jun 14, 2016 at 6:05 PM, Cody Innowhere <e.neve...@gmail.com> wrote: > Counter of dropwizard is thread-safe. > I think dropwizard metrics are implemented fairly well and used quite > widely in open source projects, I personally on the side of using > dropwizard metrics rather than re-implement them, unless for performance > reasons. Still, I'm +1 for adding a wrapper on top of dropwizard metrics. > > On Tue, Jun 14, 2016 at 10:45 PM, Till Rohrmann <trohrm...@apache.org> > wrote: > > > +1 for the thread safe metrics. This should be a rather low hanging fruit > > and easily added. > > > > If we decide to add a histogram, then I would also be in favour of > > implementing our own version of a histogram. This avoids adding a hard > > dependency on Dropwizard or another metrics library to Flink core. Adding > > our own implementation would of course entail to also add a Dropwizard > > wrapper for reporting. Thus, if a user component required a Dropwizard > > histogram, then one could simply use the wrapper. > > > > Alternatively, one could also rely on external system to compute > > histograms. For example, statsD supports the generation of histograms > from > > a stream of measurements. However, these histograms couldn't be used > within > > Flink. > > > > Implementation wise, the histogram would most likely follow the > > implementation of Dropwizard's histogram: > > > > The basic idea is that a histogram can add samples to a reservoir which > it > > uses to calculate a set of statistics when queried. The statistics > > comprises percentiles, mean, standard deviation and number of elements, > for > > example. > > > > The reservoir defines the strategy of how to sample the input stream. > There > > are different strategies conceivable: Uniform sampling, which constructs > a > > long term distribution of the seen elements, exponentially decaying > > sampling, which favours more recent elements, sliding window or buckets. > > > > The question is now whether such an implementation already covers most > use > > cases or whether histograms should support more functionaly. Feedback is > > highly appreciated :-) > > > > Cheers, > > Till > > > > On Mon, Jun 13, 2016 at 6:37 PM, Stephan Ewen <se...@apache.org> wrote: > > > > > I think it is totally fine to add a "ThreadsafeCounter" that uses an > > atomic > > > long internally > > > > > > On Sat, Jun 11, 2016 at 7:25 PM, Steve Cosenza <scose...@twitter.com> > > > wrote: > > > > > > > Also, we may be able to avoid the need for concurrent metrics by > > > > configuring each Finagle source to use a single thread. We'll > > investigate > > > > the performance implications this week and update you with the > results. > > > > > > > > Thanks, > > > > Steve > > > > > > > > > > > > On Friday, June 10, 2016, Steve Cosenza <scose...@twitter.com> > wrote: > > > > > > > >> +Chris Hogue who is also working on operationalizing Flink with me > > > >> > > > >> Hi Stephan, > > > >> > > > >> Thanks for the background on your current implementations! > > > >> > > > >> While we don't require a specific implementation for histogram, > > counter, > > > >> or gauge, it just became clear that we'll need threadsafe versions > of > > > all > > > >> three of these metrics. This is because our messaging source is > > > implemented > > > >> using Finagle, and Finagle expects to be able to emit stats > > concurrently > > > >> from its managed threads. > > > >> > > > >> That being said, if adding threadsafe versions of the Flink counters > > is > > > >> not an option, we'd also be fine with directly reading and writing > > from > > > the > > > >> singleton Codahale MetricsRegistry that you start up in each > > > TaskManager. > > > >> > > > >> Thanks, > > > >> Steve > > > >> > > > >> On Fri, Jun 10, 2016 at 7:10 AM, Stephan Ewen <se...@apache.org> > > wrote: > > > >> > > > >>> A recent discussion brought up the point of adding a "histogram" > > metric > > > >>> type to Flink. This open thread is to gather some more of the > > > requirements > > > >>> for that metric. > > > >>> > > > >>> The most important question is whether users need Flink to offer > > > >>> specific implementations of "Histogram", like for example the " > > > >>> com.codahale.metrics.Histogram", or if a " > > > >>> org.apache.flink.metrics.Histogram" interface would work as well. > > > >>> The histogram could still be reported for example via dropwizard > > > >>> reporters. > > > >>> > > > >>> *Option (1):* If a Flink Histogram works as well, it would be > simple > > to > > > >>> add one. The dropwizard reporter would need to wrap the Flink > > > Histogram for > > > >>> reporting. > > > >>> > > > >>> *Option (2)*: If the code needs the specific Dropwizard Histogram > > type, > > > >>> then one would need a wrapper class that makes a Flink Histogram > look > > > like > > > >>> a dropwizard histogram. > > > >>> > > > >>> ---------- > > > >>> > > > >>> As a bit of background for the discussion, here are some thoughts > > > behind > > > >>> the way that Metrics are currently implemented in Flink. > > > >>> > > > >>> - The metric types in Flink are independent from libraries like > > > >>> "dropwizard" to reduce dependencies and retain freedom to swap > > > >>> implementations. > > > >>> > > > >>> - Metric reporting allows to reuse reporters from dropwizard > > > >>> > > > >>> - Some Flink metric implementations are also more lightweight > than > > > for > > > >>> example in dropwizard. Counters for example are not thread safe, > but > > > do not > > > >>> impose memory barriers. That is important for metrics deep in the > > > streaming > > > >>> runtime. > > > >>> > > > >>> > > > >>> > > > >> > > > > > > > > -- > > > > -Steve > > > > > > > > Sent from Gmail Mobile > > > > > > > > > >