Re: Adding a Histogram Metric

Cody Innowhere Tue, 14 Jun 2016 09:07:11 -0700

Counter of dropwizard is thread-safe.
I think dropwizard metrics are implemented fairly well and used quite
widely in open source projects, I personally on the side of using
dropwizard metrics rather than re-implement them, unless for performance
reasons. Still, I'm +1 for adding a wrapper on top of dropwizard metrics.


On Tue, Jun 14, 2016 at 10:45 PM, Till Rohrmann <trohrm...@apache.org>
wrote:

> +1 for the thread safe metrics. This should be a rather low hanging fruit
> and easily added.
>
> If we decide to add a histogram, then I would also be in favour of
> implementing our own version of a histogram. This avoids adding a hard
> dependency on Dropwizard or another metrics library to Flink core. Adding
> our own implementation would of course entail to also add a Dropwizard
> wrapper for reporting. Thus, if a user component required a Dropwizard
> histogram, then one could simply use the wrapper.
>
> Alternatively, one could also rely on external system to compute
> histograms. For example, statsD supports the generation of histograms from
> a stream of measurements. However, these histograms couldn't be used within
> Flink.
>
> Implementation wise, the histogram would most likely follow the
> implementation of Dropwizard's histogram:
>
> The basic idea is that a histogram can add samples to a reservoir which it
> uses to calculate a set of statistics when queried. The statistics
> comprises percentiles, mean, standard deviation and number of elements, for
> example.
>
> The reservoir defines the strategy of how to sample the input stream. There
> are different strategies conceivable: Uniform sampling, which constructs a
> long term distribution of the seen elements, exponentially decaying
> sampling, which favours more recent elements, sliding window or buckets.
>
> The question is now whether such an implementation already covers most use
> cases or whether histograms should support more functionaly. Feedback is
> highly appreciated :-)
>
> Cheers,
> Till
>
> On Mon, Jun 13, 2016 at 6:37 PM, Stephan Ewen <se...@apache.org> wrote:
>
> > I think it is totally fine to add a "ThreadsafeCounter" that uses an
> atomic
> > long internally
> >
> > On Sat, Jun 11, 2016 at 7:25 PM, Steve Cosenza <scose...@twitter.com>
> > wrote:
> >
> > > Also, we may be able to avoid the need for concurrent metrics by
> > > configuring each Finagle source to use a single thread. We'll
> investigate
> > > the performance implications this week and update you with the results.
> > >
> > > Thanks,
> > > Steve
> > >
> > >
> > > On Friday, June 10, 2016, Steve Cosenza <scose...@twitter.com> wrote:
> > >
> > >> +Chris Hogue who is also working on operationalizing Flink with me
> > >>
> > >> Hi Stephan,
> > >>
> > >> Thanks for the background on your current implementations!
> > >>
> > >> While we don't require a specific implementation for histogram,
> counter,
> > >> or gauge, it just became clear that we'll need threadsafe versions of
> > all
> > >> three of these metrics. This is because our messaging source is
> > implemented
> > >> using Finagle, and Finagle expects to be able to emit stats
> concurrently
> > >> from its managed threads.
> > >>
> > >> That being said, if adding threadsafe versions of the Flink counters
> is
> > >> not an option, we'd also be fine with directly reading and writing
> from
> > the
> > >> singleton Codahale MetricsRegistry that you start up in each
> > TaskManager.
> > >>
> > >> Thanks,
> > >> Steve
> > >>
> > >> On Fri, Jun 10, 2016 at 7:10 AM, Stephan Ewen <se...@apache.org>
> wrote:
> > >>
> > >>> A recent discussion brought up the point of adding a "histogram"
> metric
> > >>> type to Flink. This open thread is to gather some more of the
> > requirements
> > >>> for that metric.
> > >>>
> > >>> The most important question is whether users need Flink to offer
> > >>> specific implementations of "Histogram", like for example the "
> > >>> com.codahale.metrics.Histogram", or if a "
> > >>> org.apache.flink.metrics.Histogram" interface would work as well.
> > >>> The histogram could still be reported for example via dropwizard
> > >>> reporters.
> > >>>
> > >>> *Option (1):* If a Flink Histogram works as well, it would be simple
> to
> > >>> add one. The dropwizard reporter would need to wrap the Flink
> > Histogram for
> > >>> reporting.
> > >>>
> > >>> *Option (2)*: If the code needs the specific Dropwizard Histogram
> type,
> > >>> then one would need a wrapper class that makes a Flink Histogram look
> > like
> > >>> a dropwizard histogram.
> > >>>
> > >>> ----------
> > >>>
> > >>> As a bit of background for the discussion, here are some thoughts
> > behind
> > >>> the way that Metrics are currently implemented in Flink.
> > >>>
> > >>>   - The metric types in Flink are independent from libraries like
> > >>> "dropwizard" to reduce dependencies and retain freedom to swap
> > >>> implementations.
> > >>>
> > >>>   - Metric reporting allows to reuse reporters from dropwizard
> > >>>
> > >>>   - Some Flink metric implementations are also more lightweight than
> > for
> > >>> example in dropwizard. Counters for example are not thread safe, but
> > do not
> > >>> impose memory barriers. That is important for metrics deep in the
> > streaming
> > >>> runtime.
> > >>>
> > >>>
> > >>>
> > >>
> > >
> > > --
> > > -Steve
> > >
> > > Sent from Gmail Mobile
> > >
> >
>

Re: Adding a Histogram Metric

Reply via email to