Re: Adding a Histogram Metric

Chesnay Schepler Sat, 18 Jun 2016 08:06:52 -0700

it would be scoped and reported just like any other Counter.


Regards,
Chesnay

On 18.06.2016 16:04, Steve Cosenza wrote:

Will the added metric be scoped appropriately (e.g. per operator)?

Also, if the added metric is a Counter will it be available whenlisting counters in AbstractReporter?


-Steve

On Friday, June 17, 2016, Chesnay Schepler <[email protected]<mailto:[email protected]>> wrote:


    That is currently not possible. We would have expose the internal
    addMetric(String name, Metric metric) method.

    Regards,
    Chesnay

    On 18.06.2016 04:48, Steve Cosenza wrote:

    Hi Till,

    How would I plugin a custom counter so that I could use
    the existing MetricGroup and AbstractReporter functionality?

    Steve

    On Friday, June 17, 2016, Till Rohrmann <[email protected]
    <javascript:_e(%7B%7D,'cvml','[email protected]');>> wrote:

        Hi Steve,

        I thought again about the thread-safe counters and I'm not so
        sure anymore whether we should add them or not. The reason is
        that we could also mark the Counter class as not final. Then
        it would be possible to define your user-defined counters
        which could be thread-safe. This would reduce the complexity
        on the Flink side since covering all metrics use cases might
        be difficult. Would that work for you? What do the others
        think about it?

        Cheers,
        Till

        On Thu, Jun 16, 2016 at 3:33 PM, Till Rohrmann
        <[email protected]
        <javascript:_e(%7B%7D,'cvml','[email protected]');>> wrote:

            I agree that dropwizard already offers a lot of
            functionality and we shouldn't reimplement it. Therefore,
            I've introduced a Flink histogram interface which is
            implemented by a dropwizard histogram wrapper. That way
            Flink has it's own histogram abstraction and dropwizard
            histograms can be used with Flink via this wrapper class.
            See the PR [1] for more information.

            [1] https://github.com/apache/flink/pull/2112

            Cheers,
            Till

            On Tue, Jun 14, 2016 at 6:05 PM, Cody Innowhere
            <[email protected]
            <javascript:_e(%7B%7D,'cvml','[email protected]');>> wrote:

                Counter of dropwizard is thread-safe.
                I think dropwizard metrics are implemented fairly
                well and used quite
                widely in open source projects, I personally on the
                side of using
                dropwizard metrics rather than re-implement them,
                unless for performance
                reasons. Still, I'm +1 for adding a wrapper on top of
                dropwizard metrics.

                On Tue, Jun 14, 2016 at 10:45 PM, Till Rohrmann
                <[email protected]>
                wrote:

                > +1 for the thread safe metrics. This should be a
                rather low hanging fruit
                > and easily added.
                >
                > If we decide to add a histogram, then I would also
                be in favour of
                > implementing our own version of a histogram. This
                avoids adding a hard
                > dependency on Dropwizard or another metrics library
                to Flink core. Adding
                > our own implementation would of course entail to
                also add a Dropwizard
                > wrapper for reporting. Thus, if a user component
                required a Dropwizard
                > histogram, then one could simply use the wrapper.
                >
                > Alternatively, one could also rely on external
                system to compute
                > histograms. For example, statsD supports the
                generation of histograms from
                > a stream of measurements. However, these histograms
                couldn't be used within
                > Flink.
                >
                > Implementation wise, the histogram would most
                likely follow the
                > implementation of Dropwizard's histogram:
                >
                > The basic idea is that a histogram can add samples
                to a reservoir which it
                > uses to calculate a set of statistics when queried.
                The statistics
                > comprises percentiles, mean, standard deviation and
                number of elements, for
                > example.
                >
                > The reservoir defines the strategy of how to sample
                the input stream. There
                > are different strategies conceivable: Uniform
                sampling, which constructs a
                > long term distribution of the seen elements,
                exponentially decaying
                > sampling, which favours more recent elements,
                sliding window or buckets.
                >
                > The question is now whether such an implementation
                already covers most use
                > cases or whether histograms should support more
                functionaly. Feedback is
                > highly appreciated :-)
                >
                > Cheers,
                > Till
                >
                > On Mon, Jun 13, 2016 at 6:37 PM, Stephan Ewen
                <[email protected]> wrote:
                >
                > > I think it is totally fine to add a
                "ThreadsafeCounter" that uses an
                > atomic
                > > long internally
                > >
                > > On Sat, Jun 11, 2016 at 7:25 PM, Steve Cosenza
                <[email protected]
                <javascript:_e(%7B%7D,'cvml','[email protected]');>>
                > > wrote:
                > >
                > > > Also, we may be able to avoid the need for
                concurrent metrics by
                > > > configuring each Finagle source to use a single
                thread. We'll
                > investigate
                > > > the performance implications this week and
                update you with the results.
                > > >
                > > > Thanks,
                > > > Steve
                > > >
                > > >
                > > > On Friday, June 10, 2016, Steve Cosenza
                <[email protected]
                <javascript:_e(%7B%7D,'cvml','[email protected]');>>
                wrote:
                > > >
                > > >> +Chris Hogue who is also working on
                operationalizing Flink with me
                > > >>
                > > >> Hi Stephan,
                > > >>
                > > >> Thanks for the background on your current
                implementations!
                > > >>
                > > >> While we don't require a specific
                implementation for histogram,
                > counter,
                > > >> or gauge, it just became clear that we'll need
                threadsafe versions of
                > > all
                > > >> three of these metrics. This is because our
                messaging source is
                > > implemented
                > > >> using Finagle, and Finagle expects to be able
                to emit stats
                > concurrently
                > > >> from its managed threads.
                > > >>
                > > >> That being said, if adding threadsafe versions
                of the Flink counters
                > is
                > > >> not an option, we'd also be fine with directly
                reading and writing
                > from
                > > the
                > > >> singleton Codahale MetricsRegistry that you
                start up in each
                > > TaskManager.
                > > >>
                > > >> Thanks,
                > > >> Steve
                > > >>
                > > >> On Fri, Jun 10, 2016 at 7:10 AM, Stephan Ewen
                <[email protected]
                <javascript:_e(%7B%7D,'cvml','[email protected]');>>
                > wrote:
                > > >>
                > > >>> A recent discussion brought up the point of
                adding a "histogram"
                > metric
                > > >>> type to Flink. This open thread is to gather
                some more of the
                > > requirements
                > > >>> for that metric.
                > > >>>
                > > >>> The most important question is whether users
                need Flink to offer
                > > >>> specific implementations of "Histogram", like
                for example the "
                > > >>> com.codahale.metrics.Histogram", or if a "
                > > >>> org.apache.flink.metrics.Histogram" interface
                would work as well.
                > > >>> The histogram could still be reported for
                example via dropwizard
                > > >>> reporters.
                > > >>>
                > > >>> *Option (1):* If a Flink Histogram works as
                well, it would be simple
                > to
                > > >>> add one. The dropwizard reporter would need
                to wrap the Flink
                > > Histogram for
                > > >>> reporting.
                > > >>>
                > > >>> *Option (2)*: If the code needs the specific
                Dropwizard Histogram
                > type,
                > > >>> then one would need a wrapper class that
                makes a Flink Histogram look
                > > like
                > > >>> a dropwizard histogram.
                > > >>>
                > > >>> ----------
                > > >>>
                > > >>> As a bit of background for the discussion,
                here are some thoughts
                > > behind
                > > >>> the way that Metrics are currently
                implemented in Flink.
                > > >>>
                > > >>>   - The metric types in Flink are independent
                from libraries like
                > > >>> "dropwizard" to reduce dependencies and
                retain freedom to swap
                > > >>> implementations.
                > > >>>
                > > >>>   - Metric reporting allows to reuse
                reporters from dropwizard
                > > >>>
                > > >>>   - Some Flink metric implementations are
                also more lightweight than
                > > for
                > > >>> example in dropwizard. Counters for example
                are not thread safe, but
                > > do not
                > > >>> impose memory barriers. That is important for
                metrics deep in the
                > > streaming
                > > >>> runtime.
                > > >>>
                > > >>>
                > > >>>
                > > >>
                > > >
                > > > --
                > > > -Steve
                > > >
                > > > Sent from Gmail Mobile
                > > >
                > >
                >

---Steve


    Sent from Gmail Mobile




--
-Steve

Sent from Gmail Mobile

Re: Adding a Histogram Metric

Reply via email to