Cool, if you're satisfied with the KIP now, maybe I can lobby for your vote ;)
The vote thread is still at only one binding +1, I think. Thanks, -John On Tue, Apr 3, 2018 at 9:05 PM, Matthias J. Sax <matth...@confluent.io> wrote: > Sounds great! > > The cryptic topic names can be an issue -- however, people can > `describe()` their topology to map the name to the corresponding > sub-topology/tasks to narrow the error down to the corresponding > operators. I think, this should be "sufficient for now" for debugging. > > Renaming those topic seems to be out-of-scope for this KIP. > > > -Matthias > > On 4/3/18 2:45 PM, Guozhang Wang wrote: > > Thanks John, your proposal looks fine to me. > > > > I'll go ahead and look into the PR for more details myself. > > > > > > Guozhang > > > > On Tue, Apr 3, 2018 at 1:35 PM, Bill Bejeck <bbej...@gmail.com> wrote: > > > >> Hi John, > >> > >> Thanks for making the updates. > >> > >> I agree with the information you've included in the logs as described > >> above, as log statements without enough context/information can be > >> frustrating. > >> > >> -Bill > >> > >> On Tue, Apr 3, 2018 at 3:29 PM, John Roesler <j...@confluent.io> wrote: > >> > >>> Allrighty, how about this, then... > >>> > >>> I'll move the metric back to the StreamThread and maintain the existing > >> tag > >>> (client-id=...(per-thread client-id)). It won't be present in the > >>> TopologyTestDriver's metrics. > >>> > >>> As a side note, I'm not sure that the location of the log messages has > >>> visibility into the name of the thread or the task, or the processor > >> node, > >>> for that matter. But at the end of the day, I don't think it really > >>> matters. > >>> > >>> None of those identifiers are in the public interface or > user-controlled. > >>> For them to be useful for debugging, users would have to gain a very > deep > >>> understanding of how their DSL program gets executed. From my > >> perspective, > >>> they are all included in metric tags only to prevent collisions between > >> the > >>> same metrics in different (e.g.) threads. > >>> > >>> I think what's important is to provide the right information in the > logs > >>> that users will be able to debug their issues. This is why the logs in > my > >>> pr include the topic/partition/offset of the offending data, as well as > >> the > >>> stacktrace of the exception from the deserializer (or for timestamps, > the > >>> extracted timestamp and the class name of their extractor). This > >>> information alone should let them pinpoint the offending data and fix > it. > >>> > >>> (I am aware that that topic name might be a repartition topic, and > >>> therefore also esoteric from the user's perspective, but I think it's > the > >>> best we can do right now. It might be nice to explicitly take on a > >>> debugging ergonomics task in the future and give all processor nodes > >>> human-friendly names. Then, we could surface these names in any logs or > >>> exceptions. But I'm inclined to call this out-of-scope for now.) > >>> > >>> Thanks again, > >>> -John > >>> > >>> On Tue, Apr 3, 2018 at 1:40 PM, Guozhang Wang <wangg...@gmail.com> > >> wrote: > >>> > >>>> 1. If we can indeed gather all the context information from the log4j > >>>> entries I'd suggest we change to thread-level (I'm not sure if that is > >>>> doable, so if John have already some WIP PR that can help us decide). > >>>> > >>>> 2. 
We can consider adding the API in TopologyTestDriver for general > >>> testing > >>>> purposes; that being said, I think Matthias has a good point that this > >>>> alone should not be a driving motivation for us to keep this metric as > >>>> task-level if 1) is true. > >>>> > >>>> > >>>> > >>>> Guozhang > >>>> > >>>> > >>>> On Tue, Apr 3, 2018 at 11:36 AM, Matthias J. Sax < > >> matth...@confluent.io> > >>>> wrote: > >>>> > >>>>> Thanks Guozhang, that was my intent. > >>>>> > >>>>> @John: yes, we should not nail down the exact log message. It's just > >> to > >>>>> point out the trade-off. If we can get the required information in > >> the > >>>>> logs, we might not need task level metrics. > >>>>> > >>>>> > >>>>> -Matthias > >>>>> > >>>>> On 4/3/18 11:26 AM, Guozhang Wang wrote: > >>>>>> I think Matthias' comment is that, we can still record the metrics > >> on > >>>> the > >>>>>> thread-level, while having the WARN log entry to include sufficient > >>>>> context > >>>>>> information so that users can still easily narrow down the > >>>> investigation > >>>>>> scope. > >>>>>> > >>>>>> > >>>>>> Guozhang > >>>>>> > >>>>>> On Tue, Apr 3, 2018 at 11:22 AM, John Roesler <j...@confluent.io> > >>>> wrote: > >>>>>> > >>>>>>> I agree we should add as much information as is reasonable to the > >>> log. > >>>>> For > >>>>>>> example, see this WIP PR I started for this KIP: > >>>>>>> > >>>>>>> https://github.com/apache/kafka/pull/4812/files#diff- > >>>>>>> 88d129f048bc842c7db5b2566a45fce8R80 > >>>>>>> > >>>>>>> and > >>>>>>> > >>>>>>> https://github.com/apache/kafka/pull/4812/files#diff- > >>>>>>> 69e6789eb675ec978a1abd24fed96eb1R111 > >>>>>>> > >>>>>>> I'm not sure if we should nail down the log messages in the KIP or > >>> in > >>>>> the > >>>>>>> PR discussion. What say you? > >>>>>>> > >>>>>>> Thanks, > >>>>>>> -John > >>>>>>> > >>>>>>> On Tue, Apr 3, 2018 at 12:20 AM, Matthias J. Sax < > >>>> matth...@confluent.io > >>>>>> > >>>>>>> wrote: > >>>>>>> > >>>>>>>> Thanks for sharing your thoughts. As I mentioned originally, I am > >>> not > >>>>>>>> sure about the right log level either. Your arguments are > >>> convincing > >>>> -- > >>>>>>>> thus, I am fine with keeping WARN level. > >>>>>>>> > >>>>>>>> The task vs thread level argument is an interesting one. > >> However, I > >>>> am > >>>>>>>> wondering if we should add this information into the > >> corresponding > >>>> WARN > >>>>>>>> logs that we write anyway? For this case, we can also log the > >>>>>>>> corresponding operator (and other information like topic name etc > >>> if > >>>>>>>> needed). WDYT about this? > >>>>>>>> > >>>>>>>> > >>>>>>>> -Matthias > >>>>>>>> > >>>>>>>> On 4/2/18 8:31 PM, Guozhang Wang wrote: > >>>>>>>>> Regarding logging: I'm inclined to keep logging at WARN level > >>> since > >>>>>>>> skipped > >>>>>>>>> records are not expected in normal execution (for all reasons > >> that > >>>> we > >>>>>>> are > >>>>>>>>> aware of), and hence when error happens users should be alerted > >>> from > >>>>>>>>> metrics and looked into the log files, so to me if it is really > >>>>>>> spamming > >>>>>>>>> the log files it is also a good alert for users. Besides for > >>>>>>> deserialize > >>>>>>>>> errors we already log at WARN level for this reason. > >>>>>>>>> > >>>>>>>>> Regarding the metrics-levels: I was pondering on that as well. 
> >>> What > >>>>>>> made > >>>>>>>> me > >>>>>>>>> to think and agree on task-level than thread-level is that for > >>> some > >>>>>>>> reasons > >>>>>>>>> like window retention, they may possibly be happening on a > >> subset > >>> of > >>>>>>>> input > >>>>>>>>> partitions, and tasks are correlated with partitions the > >>> task-level > >>>>>>>> metrics > >>>>>>>>> can help users to narrow down on the specific input data > >>> partitions. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Guozhang > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On Mon, Apr 2, 2018 at 6:43 PM, John Roesler <j...@confluent.io > >>> > >>>>>>> wrote: > >>>>>>>>> > >>>>>>>>>> Hi Matthias, > >>>>>>>>>> > >>>>>>>>>> No worries! Thanks for the reply. > >>>>>>>>>> > >>>>>>>>>> 1) There isn't a connection. I tried using the > >> TopologyTestDriver > >>>> to > >>>>>>>> write > >>>>>>>>>> a quick test exercising the current behavior and discovered > >> that > >>>> the > >>>>>>>>>> metrics weren't available. It seemed like they should be, so I > >>>> tacked > >>>>>>>> it on > >>>>>>>>>> to this KIP. If you feel it's inappropriate, I can pull it back > >>>> out. > >>>>>>>>>> > >>>>>>>>>> 2) I was also concerned about that, but I figured it would come > >>> up > >>>> in > >>>>>>>>>> discussion if I just went ahead and proposed it. And here we > >> are! > >>>>>>>>>> > >>>>>>>>>> Here's my thought: maybe there are two classes of skips: > >>>> "controlled" > >>>>>>>> and > >>>>>>>>>> "uncontrolled", where "controlled" means, as an app author, I > >>>>>>>> deliberately > >>>>>>>>>> filter out some events, and "uncontrolled" means that I simply > >>>> don't > >>>>>>>>>> account for some feature of the data, and the framework skips > >>> them > >>>>> (as > >>>>>>>>>> opposed to crashing). > >>>>>>>>>> > >>>>>>>>>> In this breakdowns, the skips I'm adding metrics for are all > >>>>>>>> uncontrolled > >>>>>>>>>> skips (and we hope to measure all the uncontrolled skips). Our > >>>> skips > >>>>>>> are > >>>>>>>>>> well documented, so it wouldn't be terrible to have an > >>> application > >>>> in > >>>>>>>> which > >>>>>>>>>> you know you expect to have tons of uncontrolled skips, but > >> it's > >>>> not > >>>>>>>> great > >>>>>>>>>> either, since you may also have some *unexpected* uncontrolled > >>>> skips. > >>>>>>>> It'll > >>>>>>>>>> be difficult to notice, since you're probably not alerting on > >> the > >>>>>>> metric > >>>>>>>>>> and filtering out the logs (whatever their level). > >>>>>>>>>> > >>>>>>>>>> I'd recommend any app author, as an alternative, to convert all > >>>>>>> expected > >>>>>>>>>> skips to controlled ones, by updating the topology to filter > >>> those > >>>>>>>> records > >>>>>>>>>> out. > >>>>>>>>>> > >>>>>>>>>> Following from my recommendation, as a library author, I'm > >>> inclined > >>>>> to > >>>>>>>> mark > >>>>>>>>>> those logs WARN, since in my opinion, they should be concerning > >>> to > >>>>> the > >>>>>>>> app > >>>>>>>>>> authors. I'd definitely want to show, rather than hide, them by > >>>>>>>> default, so > >>>>>>>>>> I would pick INFO at least. > >>>>>>>>>> > >>>>>>>>>> That said, logging is always a tricky issue for lower-level > >>>> libraries > >>>>>>>> that > >>>>>>>>>> run inside user code, since we don't have all the information > >> we > >>>> need > >>>>>>> to > >>>>>>>>>> make the right call. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On your last note, yeah, I got that impression from Guozhang as > >>>> well. > >>>>>>>>>> Thanks for the clarification. 
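To make the controlled-vs-uncontrolled distinction above concrete, here is a minimal sketch of converting an expected skip into a controlled one by filtering up front (topic names, types, and the null check are placeholders, not anything from the KIP):

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.KStream;

    public class ControlledSkipExample {
        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> input =
                builder.stream("events", Consumed.with(Serdes.String(), Serdes.String()));
            // Drop the records we know we cannot process before they reach an
            // operator that would otherwise skip them silently; the skip is
            // now "controlled" and visible in the application code.
            input.filter((key, value) -> key != null && value != null)
                 .to("cleaned-events");
        }
    }

With a filter like that in place, a non-zero skipped-records metric really does indicate something unexpected.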
> >>>>>>>>>> > >>>>>>>>>> -John > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On Mon, Apr 2, 2018 at 4:03 PM, Matthias J. Sax < > >>>>>>> matth...@confluent.io> > >>>>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> John, > >>>>>>>>>>> > >>>>>>>>>>> sorry for my late reply and thanks for updating the KIP. > >>>>>>>>>>> > >>>>>>>>>>> I like your approach about "metrics are for monitoring, logs > >> are > >>>> for > >>>>>>>>>>> debugging" -- however: > >>>>>>>>>>> > >>>>>>>>>>> 1) I don't see a connection between this and the task-level > >>>> metrics > >>>>>>>> that > >>>>>>>>>>> you propose to get the metrics in `TopologyTestDriver`. I > >> don't > >>>>> think > >>>>>>>>>>> people would monitor the `TopologyTestDriver` an thus > >> wondering > >>>> why > >>>>>>> it > >>>>>>>>>>> is important to include the metrics there? Thread-level metric > >>>> might > >>>>>>> be > >>>>>>>>>>> easier to monitor though (ie, less different metric to > >> monitor). > >>>>>>>>>>> > >>>>>>>>>>> 2) I am a little worried about WARN level logging and that it > >>>> might > >>>>>>> be > >>>>>>>>>>> too chatty -- as you pointed out, it's about debugging, thus > >>> DEBUG > >>>>>>>> level > >>>>>>>>>>> might be better. Not 100% sure about this to be honest. What > >> is > >>>> the > >>>>>>>>>>> general assumption about the frequency for skipped records? I > >>>> could > >>>>>>>>>>> imagine cases for which skipped records are quite frequent and > >>>> thus, > >>>>>>>>>>> WARN level logs might "flood" the logs > >>>>>>>>>>> > >>>>>>>>>>> One final remark: > >>>>>>>>>>> > >>>>>>>>>>>> More > >>>>>>>>>>>> generally, I would like to establish a pattern in which we > >>> could > >>>>> add > >>>>>>>>>> new > >>>>>>>>>>>> values for the "reason" tags without needing a KIP to do so. > >>>>>>>>>>> > >>>>>>>>>>> From my understanding, this is not feasible. Changing metrics > >> is > >>>>>>> always > >>>>>>>>>>> considered a public API change, and we need a KIP for any > >>> change. > >>>> As > >>>>>>> we > >>>>>>>>>>> moved away from tagging, it doesn't matter for the KIP anymore > >>> -- > >>>>>>> just > >>>>>>>>>>> wanted to point it out. > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> -Matthias > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> On 3/30/18 2:47 PM, John Roesler wrote: > >>>>>>>>>>>> Allrighty! The KIP is updated. > >>>>>>>>>>>> > >>>>>>>>>>>> Thanks again, all, for the feedback. > >>>>>>>>>>>> -John > >>>>>>>>>>>> > >>>>>>>>>>>> On Fri, Mar 30, 2018 at 3:35 PM, John Roesler < > >>> j...@confluent.io > >>>>> > >>>>>>>>>> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>>> Hey Guozhang and Bill, > >>>>>>>>>>>>> > >>>>>>>>>>>>> Ok, I'll update the KIP. At the risk of disturbing > >> consensus, > >>>> I'd > >>>>>>>> like > >>>>>>>>>>> to > >>>>>>>>>>>>> put it in the task instead of the thread so that it'll show > >> up > >>>> in > >>>>>>> the > >>>>>>>>>>>>> TopologyTestDriver metrics as well. > >>>>>>>>>>>>> > >>>>>>>>>>>>> I'm leaning toward keeping the scope where it is right now, > >>> but > >>>> if > >>>>>>>>>>> others > >>>>>>>>>>>>> want to advocate for tossing in some more metrics, we can go > >>>> that > >>>>>>>>>> route. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Thanks all, > >>>>>>>>>>>>> -John > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Fri, Mar 30, 2018 at 2:37 PM, Bill Bejeck < > >>> bbej...@gmail.com > >>>>> > >>>>>>>>>> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> Thanks for the KIP John, and sorry for the late comments. 
> >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I'm on the fence with providing a single level metrics, > >> but I > >>>>>>> think > >>>>>>>>>>> we'll > >>>>>>>>>>>>>> have that discussion outside of this KIP. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> * maintain one skipped-record metric (could be per-thread, > >>>>>>>> per-task, > >>>>>>>>>>> or > >>>>>>>>>>>>>>> per-processor-node) with no "reason" > >>>>>>>>>>>>>>> * introduce a warn-level log detailing the > >>>>> topic/partition/offset > >>>>>>>>>> and > >>>>>>>>>>>>>>> reason of the skipped record > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I'm +1 on both of these suggestions. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Finally, we have had requests in the past for some metrics > >>>> around > >>>>>>>>>> when > >>>>>>>>>>>>>> persistent store removes an expired window. Would adding > >>> that > >>>> to > >>>>>>>> our > >>>>>>>>>>>>>> metrics stretch the scope of this KIP too much? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Thanks again and overall I'm +1 on this KIP > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Bill > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Fri, Mar 30, 2018 at 2:00 PM, Guozhang Wang < > >>>>>>> wangg...@gmail.com> > >>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> The proposal sounds good to me. About "maintain only one > >>> level > >>>>> of > >>>>>>>>>>>>>> metrics" > >>>>>>>>>>>>>>> maybe we can discuss about that separately from this KIP > >>> since > >>>>>>> that > >>>>>>>>>>>>>> would > >>>>>>>>>>>>>>> be a larger scope of discussion. I agree that if we are > >>> going > >>>> to > >>>>>>>>>>>>>> maintain > >>>>>>>>>>>>>>> only one-level metrics it should be lowest level and we > >>> would > >>>>> let > >>>>>>>>>>> users > >>>>>>>>>>>>>> to > >>>>>>>>>>>>>>> do the roll-ups themselves, but I'm still not fully > >>> convinced > >>>>>>> that > >>>>>>>>>> we > >>>>>>>>>>>>>>> should just provide single-level metrics, because 1) I > >> think > >>>> for > >>>>>>>>>>>>>> different > >>>>>>>>>>>>>>> metrics people may be interested to investigate into > >>> different > >>>>>>>>>>>>>>> granularities, e.g. for poll / commit rate these are at > >> the > >>>>>>> lowest > >>>>>>>>>>>>>>> task-level metrics, while for process-rate / skip-rate > >> they > >>>> can > >>>>>>> be > >>>>>>>>>> as > >>>>>>>>>>>>>> low > >>>>>>>>>>>>>>> as processor-node metrics, and 2) user-side rolling ups > >> may > >>>> not > >>>>>>> be > >>>>>>>>>>> very > >>>>>>>>>>>>>>> straight-forward. But for 2) if someone can provide an > >>>> efficient > >>>>>>>> and > >>>>>>>>>>>>>> easy > >>>>>>>>>>>>>>> implementation of that I can be persuaded :) > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> For now I'm thinking we can add the metric on > >> thread-level, > >>>>>>> either > >>>>>>>>>>> with > >>>>>>>>>>>>>>> finer grained ones with "reason" tag plus an aggregated > >> one > >>>>>>> without > >>>>>>>>>>> the > >>>>>>>>>>>>>>> tag, or just having a single aggregated metric without the > >>> tag > >>>>>>>> looks > >>>>>>>>>>>>>> good > >>>>>>>>>>>>>>> to me. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Guozhang > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On Fri, Mar 30, 2018 at 8:05 AM, John Roesler < > >>>>> j...@confluent.io > >>>>>>>> > >>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Hey Guozhang, > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Thanks for the reply. Regarding JMX, I can dig it. I'll > >>>> provide > >>>>>>> a > >>>>>>>>>>>>>> list in > >>>>>>>>>>>>>>>> the KIP. I was also thinking we'd better start a > >>>> documentation > >>>>>>>> page > >>>>>>>>>>>>>> with > >>>>>>>>>>>>>>>> the metrics listed. 
> >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> I'd have no problem logging a warning when we skip > >> records. > >>>> On > >>>>>>> the > >>>>>>>>>>>>>> metric > >>>>>>>>>>>>>>>> front, really I'm just pushing for us to maintain only > >> one > >>>>> level > >>>>>>>> of > >>>>>>>>>>>>>>>> metrics. If that's more or less granular (i.e., maybe we > >>>> don't > >>>>>>>>>> have a > >>>>>>>>>>>>>>>> metric per reason and log the reason instead), that's > >> fine > >>> by > >>>>>>> me. > >>>>>>>> I > >>>>>>>>>>>>>> just > >>>>>>>>>>>>>>>> don't think it provides a lot of extra value per > >> complexity > >>>>>>>>>>> (interface > >>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>> implementation) to maintain roll-ups at the thread level > >> in > >>>>>>>>>> addition > >>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>> lower-level metrics. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> How about this instead: > >>>>>>>>>>>>>>>> * maintain one skipped-record metric (could be > >> per-thread, > >>>>>>>>>> per-task, > >>>>>>>>>>>>>> or > >>>>>>>>>>>>>>>> per-processor-node) with no "reason" > >>>>>>>>>>>>>>>> * introduce a warn-level log detailing the > >>>>>>> topic/partition/offset > >>>>>>>>>> and > >>>>>>>>>>>>>>>> reason of the skipped record > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> If you like that, I can update the KIP. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>>>>> -John > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> On Thu, Mar 29, 2018 at 6:22 PM, Guozhang Wang < > >>>>>>>> wangg...@gmail.com > >>>>>>>>>>> > >>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> One thing you mention is the notion of setting alerts > >> on > >>>>>>> coarser > >>>>>>>>>>>>>>>> metrics > >>>>>>>>>>>>>>>>> being easier than finer ones. All the metric alerting > >>>> systems > >>>>> I > >>>>>>>>>> have > >>>>>>>>>>>>>>> used > >>>>>>>>>>>>>>>>> make it equally easy to alert on metrics by-tag or over > >>>> tags. > >>>>>>> So > >>>>>>>>>> my > >>>>>>>>>>>>>>>>> experience doesn't say that this is a use case. Were you > >>>>>>> thinking > >>>>>>>>>>>>>> of an > >>>>>>>>>>>>>>>>> alerting system that makes such a pre-aggregation > >>> valuable? > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> For the commonly used JMX reporter tags will be encoded > >>>>>>> directly > >>>>>>>>>> as > >>>>>>>>>>>>>>> part > >>>>>>>>>>>>>>>> of > >>>>>>>>>>>>>>>>> the object name, and if users wants to monitor them they > >>>> need > >>>>>>> to > >>>>>>>>>>>>>> know > >>>>>>>>>>>>>>>> these > >>>>>>>>>>>>>>>>> values before hand. That is also why I think we do want > >> to > >>>>> list > >>>>>>>>>> all > >>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>> possible values of the reason tags in the KIP, since > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> In my email in response to Matthias, I gave an example > >> of > >>>> the > >>>>>>>>>>>>>> kind of > >>>>>>>>>>>>>>>>> scenario that would lead me as an operator to run with > >>> DEBUG > >>>>> on > >>>>>>>>>> all > >>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>> time, since I wouldn't be sure, having seen a skipped > >>> record > >>>>>>>> once, > >>>>>>>>>>>>>> that > >>>>>>>>>>>>>>>> it > >>>>>>>>>>>>>>>>> would ever happen again. The solution is to capture all > >>> the > >>>>>>>>>>>>>> available > >>>>>>>>>>>>>>>>> information about the reason and location of skips all > >> the > >>>>>>> time. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> That is a good point. 
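For reference, a sketch of what "tags encoded as part of the object name" means on the JMX side; the group and tag names below are illustrative, not the final metric names:

    import java.lang.management.ManagementFactory;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    public class JmxTagExample {
        public static void main(String[] args) throws Exception {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            // Each metric tag becomes a key=value property of the ObjectName,
            // so an alerting rule has to know the tag values (client-id,
            // task-id, a hypothetical "reason", ...) ahead of time, or use
            // wildcards to discover them.
            ObjectName pattern = new ObjectName(
                "kafka.streams:type=stream-task-metrics,client-id=*,task-id=*");
            server.queryNames(pattern, null).forEach(System.out::println);
        }
    }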
I think we can either expose all > >>>> levels > >>>>>>>>>>>>>> metrics > >>>>>>>>>>>>>>> as > >>>>>>>>>>>>>>>> by > >>>>>>>>>>>>>>>>> default, or only expose the most lower-level metrics and > >>> get > >>>>>>> rid > >>>>>>>>>> of > >>>>>>>>>>>>>>> other > >>>>>>>>>>>>>>>>> levels to let users do roll-ups themselves (which will > >> be > >>> a > >>>>>>> much > >>>>>>>>>>>>>> larger > >>>>>>>>>>>>>>>>> scope for discussion), or we can encourage users to not > >>>> purely > >>>>>>>>>>>>>> depend > >>>>>>>>>>>>>>> on > >>>>>>>>>>>>>>>>> metrics for such trouble shooting: that is to say, users > >>>> only > >>>>>>> be > >>>>>>>>>>>>>>> alerted > >>>>>>>>>>>>>>>>> based on metrics, and we can log a info / warn log4j > >> entry > >>>>> each > >>>>>>>>>>>>>> time we > >>>>>>>>>>>>>>>> are > >>>>>>>>>>>>>>>>> about to skip a record all over the places, so that upon > >>>> being > >>>>>>>>>>>>>> notified > >>>>>>>>>>>>>>>>> users can look into the logs to find the details on > >> where > >>> / > >>>>>>> when > >>>>>>>>>> it > >>>>>>>>>>>>>>>>> happens. WDYT? > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Guozhang > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> On Thu, Mar 29, 2018 at 3:57 PM, John Roesler < > >>>>>>> j...@confluent.io > >>>>>>>>> > >>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Hey Guozhang, > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Thanks for the review. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> 1. > >>>>>>>>>>>>>>>>>> Matthias raised the same question about the "reason" > >> tag > >>>>>>> values. > >>>>>>>>>> I > >>>>>>>>>>>>>>> can > >>>>>>>>>>>>>>>>> list > >>>>>>>>>>>>>>>>>> all possible values of the "reason" tag, but I'm > >> thinking > >>>>> this > >>>>>>>>>>>>>> level > >>>>>>>>>>>>>>> of > >>>>>>>>>>>>>>>>>> detail may not be KIP-worthy, maybe the code and > >>>>> documentation > >>>>>>>>>>>>>> review > >>>>>>>>>>>>>>>>> would > >>>>>>>>>>>>>>>>>> be sufficient. If you all disagree and would like it > >>>> included > >>>>>>> in > >>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>> KIP, I > >>>>>>>>>>>>>>>>>> can certainly do that. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> If we do provide roll-up metrics, I agree with the > >>> pattern > >>>> of > >>>>>>>>>>>>>> keeping > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> same name but eliminating the tags for the dimensions > >>> that > >>>>>>> were > >>>>>>>>>>>>>>>>> rolled-up. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> 2. > >>>>>>>>>>>>>>>>>> I'm not too sure that implementation efficiency really > >>>>>>> becomes a > >>>>>>>>>>>>>>> factor > >>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>> choosing whether to (by default) update one coarse > >> metric > >>>> at > >>>>>>> the > >>>>>>>>>>>>>>> thread > >>>>>>>>>>>>>>>>>> level or one granular metric at the processor-node > >> level, > >>>>>>> since > >>>>>>>>>>>>>> it's > >>>>>>>>>>>>>>>> just > >>>>>>>>>>>>>>>>>> one metric being updated either way. I do agree that if > >>> we > >>>>>>> were > >>>>>>>>>> to > >>>>>>>>>>>>>>>> update > >>>>>>>>>>>>>>>>>> the granular metrics and multiple roll-ups, then we > >>> should > >>>>>>>>>>>>>> consider > >>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> efficiency. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> I agree it's probably not necessary to surface the > >>> metrics > >>>>> for > >>>>>>>>>> all > >>>>>>>>>>>>>>>> nodes > >>>>>>>>>>>>>>>>>> regardless of whether they can or do skip records. > >>> Perhaps > >>>> we > >>>>>>>> can > >>>>>>>>>>>>>>>> lazily > >>>>>>>>>>>>>>>>>> register the metrics. 
> >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> In my email in response to Matthias, I gave an example > >> of > >>>> the > >>>>>>>>>>>>>> kind of > >>>>>>>>>>>>>>>>>> scenario that would lead me as an operator to run with > >>>> DEBUG > >>>>>>> on > >>>>>>>>>>>>>> all > >>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> time, since I wouldn't be sure, having seen a skipped > >>>> record > >>>>>>>>>> once, > >>>>>>>>>>>>>>> that > >>>>>>>>>>>>>>>>> it > >>>>>>>>>>>>>>>>>> would ever happen again. The solution is to capture all > >>> the > >>>>>>>>>>>>>> available > >>>>>>>>>>>>>>>>>> information about the reason and location of skips all > >>> the > >>>>>>> time. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> One thing you mention is the notion of setting alerts > >> on > >>>>>>> coarser > >>>>>>>>>>>>>>>> metrics > >>>>>>>>>>>>>>>>>> being easier than finer ones. All the metric alerting > >>>> systems > >>>>>>> I > >>>>>>>>>>>>>> have > >>>>>>>>>>>>>>>> used > >>>>>>>>>>>>>>>>>> make it equally easy to alert on metrics by-tag or over > >>>> tags. > >>>>>>> So > >>>>>>>>>>>>>> my > >>>>>>>>>>>>>>>>>> experience doesn't say that this is a use case. Were > >> you > >>>>>>>> thinking > >>>>>>>>>>>>>> of > >>>>>>>>>>>>>>> an > >>>>>>>>>>>>>>>>>> alerting system that makes such a pre-aggregation > >>> valuable? > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Thanks again, > >>>>>>>>>>>>>>>>>> -John > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> On Thu, Mar 29, 2018 at 5:24 PM, Guozhang Wang < > >>>>>>>>>>>>>> wangg...@gmail.com> > >>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Hello John, > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Thanks for the KIP. Some comments: > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> 1. Could you list all the possible values of the > >>> "reason" > >>>>>>> tag? > >>>>>>>>>>>>>> In > >>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> JIRA > >>>>>>>>>>>>>>>>>>> ticket I left some potential reasons but I'm not clear > >>> if > >>>>>>>> you're > >>>>>>>>>>>>>>>> going > >>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>> categorize each of them as a separate reason, or is > >>> there > >>>>> any > >>>>>>>>>>>>>>>>> additional > >>>>>>>>>>>>>>>>>>> ones you have in mind. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Also I'm wondering if we should add another metric > >> that > >>> do > >>>>>>> not > >>>>>>>>>>>>>> have > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>> reason tag but aggregates among all possible reasons? > >>> This > >>>>> is > >>>>>>>>>>>>>> for > >>>>>>>>>>>>>>>> users > >>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>> easily set their alerting notifications (otherwise > >> they > >>>> have > >>>>>>> to > >>>>>>>>>>>>>>> write > >>>>>>>>>>>>>>>>> on > >>>>>>>>>>>>>>>>>>> notification rule per reason) in their monitoring > >>> systems. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> 2. Note that the processor-node metrics is actually > >>>>>>>> "per-thread, > >>>>>>>>>>>>>>>>>> per-task, > >>>>>>>>>>>>>>>>>>> per-processor-node", and today we only set the > >>> per-thread > >>>>>>>>>>>>>> metrics > >>>>>>>>>>>>>>> as > >>>>>>>>>>>>>>>>> INFO > >>>>>>>>>>>>>>>>>>> while leaving the lower two layers as DEBUG. 
I agree > >>> with > >>>>>>> your > >>>>>>>>>>>>>>>> argument > >>>>>>>>>>>>>>>>>>> that we are missing the per-client roll-up metrics > >>> today, > >>>>> but > >>>>>>>>>>>>>> I'm > >>>>>>>>>>>>>>>>>> convinced > >>>>>>>>>>>>>>>>>>> that the right way to approach it would be > >>>>>>>>>>>>>>>> "just-providing-the-lowest- > >>>>>>>>>>>>>>>>>>> level > >>>>>>>>>>>>>>>>>>> metrics only". > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Note the recoding implementation of these three levels > >>> are > >>>>>>>>>>>>>>> different > >>>>>>>>>>>>>>>>>>> internally today: we did not just do the rolling up to > >>>>>>> generate > >>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>> higher-level metrics from the lower level ones, but we > >>>> just > >>>>>>>>>>>>>> record > >>>>>>>>>>>>>>>> them > >>>>>>>>>>>>>>>>>>> separately, which means that, if we turn on multiple > >>>> levels > >>>>>>> of > >>>>>>>>>>>>>>>> metrics, > >>>>>>>>>>>>>>>>>> we > >>>>>>>>>>>>>>>>>>> maybe duplicate collecting some metrics. One can argue > >>>> that > >>>>>>> is > >>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> best > >>>>>>>>>>>>>>>>>>> way to represent multi-level metrics collecting and > >>>>>>> reporting, > >>>>>>>>>>>>>> but > >>>>>>>>>>>>>>> by > >>>>>>>>>>>>>>>>>> only > >>>>>>>>>>>>>>>>>>> enabling thread-level metrics as INFO today, that > >>>>>>>> implementation > >>>>>>>>>>>>>>>> could > >>>>>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>> more efficient than only collecting the metrics at the > >>>>> lowest > >>>>>>>>>>>>>>> level, > >>>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>> then do the roll-up calculations outside of the > >> metrics > >>>>>>>> classes. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Plus, today not all processor-nodes may possibly skip > >>>>>>> records, > >>>>>>>>>>>>>>> AFAIK > >>>>>>>>>>>>>>>> we > >>>>>>>>>>>>>>>>>>> will only skip records at the source, sink, window and > >>>>>>>>>>>>>> aggregation > >>>>>>>>>>>>>>>>>>> processor nodes, so adding a metric per processor > >> looks > >>>> like > >>>>>>> an > >>>>>>>>>>>>>>>>> overkill > >>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>> me as well. On the other hand, from user's perspective > >>> the > >>>>>>>>>>>>>> "reason" > >>>>>>>>>>>>>>>> tag > >>>>>>>>>>>>>>>>>> may > >>>>>>>>>>>>>>>>>>> be sufficient for them to narrow down where inside the > >>>>>>> topology > >>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>> causing > >>>>>>>>>>>>>>>>>>> records to be dropped on the floor. So I think the > >>>>>>> "per-thread, > >>>>>>>>>>>>>>>>> per-task" > >>>>>>>>>>>>>>>>>>> level metrics should be sufficient for them in trouble > >>>> shoot > >>>>>>> in > >>>>>>>>>>>>>>> DEBUG > >>>>>>>>>>>>>>>>>> mode, > >>>>>>>>>>>>>>>>>>> and we can add another "per-thread" level metrics as > >>> INFO > >>>>>>> which > >>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>> turned > >>>>>>>>>>>>>>>>>>> on by default. So under normal execution users still > >>> only > >>>>>>> need > >>>>>>>>>>>>>> INFO > >>>>>>>>>>>>>>>>> level > >>>>>>>>>>>>>>>>>>> metrics for alerting (e.g. set alerts on all > >>>> skipped-records > >>>>>>>>>>>>>>> metrics > >>>>>>>>>>>>>>>> as > >>>>>>>>>>>>>>>>>>> non-zero), and then upon trouble shooting they can > >> turn > >>> on > >>>>>>>> DEBUG > >>>>>>>>>>>>>>>>> metrics > >>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>> look into which task is actually causing the skipped > >>>>> records. 
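A sketch of the "turn on DEBUG metrics" troubleshooting step described above, assuming the standard metrics.recording.level client config; the application id and broker address are placeholders:

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    public class RecordingLevelExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
            // The default recording level is INFO; switching to DEBUG also
            // records the finer-grained task- and processor-node-level
            // metrics for troubleshooting.
            props.put(StreamsConfig.METRICS_RECORDING_LEVEL_CONFIG, "DEBUG");
        }
    }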
> >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Guozhang > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> On Thu, Mar 29, 2018 at 2:03 PM, Matthias J. Sax < > >>>>>>>>>>>>>>>>> matth...@confluent.io> > >>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Thanks for the KIP John. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Reading the material on the related Jiras, I am > >>> wondering > >>>>>>> what > >>>>>>>>>>>>>>>>> `reason` > >>>>>>>>>>>>>>>>>>>> tags you want to introduce? Can you elaborate? The > >> KIP > >>>>>>> should > >>>>>>>>>>>>>>> list > >>>>>>>>>>>>>>>>>> those > >>>>>>>>>>>>>>>>>>>> IMHO. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> About the fine grained metrics vs the roll-up: you > >> say > >>>> that > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> the coarse metric aggregates across two dimensions > >>>>>>>>>>>>>>> simultaneously > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Can you elaborate why this is an issue? I am not > >>>> convinced > >>>>>>> atm > >>>>>>>>>>>>>>> that > >>>>>>>>>>>>>>>>> we > >>>>>>>>>>>>>>>>>>>> should put the fine grained metrics into INFO level > >> and > >>>>>>> remove > >>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>> roll-up at thread level. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Given that they have to do this sum to get a usable > >>>>>>>>>>>>>> top-level > >>>>>>>>>>>>>>>> view > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> This is a fair concern, but I don't share the > >>> conclusion. > >>>>>>>>>>>>>>> Offering > >>>>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>> built-in `KafkaStreams` "client" roll-up out of the > >> box > >>>>>>> might > >>>>>>>>>>>>>> be > >>>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>> better solution. In the past we did not offer this > >> due > >>> to > >>>>>>>>>>>>>>>> performance > >>>>>>>>>>>>>>>>>>>> concerns, but we could allow an "opt-in" mechanism. > >> If > >>>> you > >>>>>>>>>>>>>>>> disagree, > >>>>>>>>>>>>>>>>>> can > >>>>>>>>>>>>>>>>>>>> you provide some reasoning and add them to the > >>> "Rejected > >>>>>>>>>>>>>>>>> alternatives" > >>>>>>>>>>>>>>>>>>>> section. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> To rephrase: I understand the issue about missing > >>>> top-level > >>>>>>>>>>>>>> view, > >>>>>>>>>>>>>>>> but > >>>>>>>>>>>>>>>>>>>> instead of going more fine grained, we should > >> consider > >>> to > >>>>>>> add > >>>>>>>>>>>>>>> this > >>>>>>>>>>>>>>>>>>>> top-level view and add/keep the fine grained metrics > >> at > >>>>>>> DEBUG > >>>>>>>>>>>>>>> level > >>>>>>>>>>>>>>>>>> only > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> I am +1 to add TopologyTestDriver#metrics() and to > >>> remove > >>>>>>> old > >>>>>>>>>>>>>>>> metrics > >>>>>>>>>>>>>>>>>>>> directly as you suggested. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> On 3/28/18 6:42 PM, Ted Yu wrote: > >>>>>>>>>>>>>>>>>>>>> Looks good to me. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> On Wed, Mar 28, 2018 at 3:11 PM, John Roesler < > >>>>>>>>>>>>>>> j...@confluent.io > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Hello all, > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> I am proposing KIP-274 to improve the metrics > >> around > >>>>>>>>>>>>>> skipped > >>>>>>>>>>>>>>>>> records > >>>>>>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>>>>> Streams. 
> >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Please find the details here: > >>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/ > >>> confluence/display/KAFKA/KIP- > >>>>>>>>>>>>>>>>>>>>>> 274%3A+Kafka+Streams+Skipped+Records+Metrics > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Please let me know what you think! > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>>>>>>>>>>> -John > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>>> -- Guozhang > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>> -- Guozhang > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>> -- Guozhang > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>> > >>>>> > >>>> > >>>> > >>>> -- > >>>> -- Guozhang > >>>> > >>> > >> > > > > > > > >
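Finally, a hedged sketch of how the TopologyTestDriver#metrics() accessor discussed in this thread might be used once the KIP lands; the trivial topology and the "skipped" name filter are illustrative and may not match what finally ships:

    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.common.Metric;
    import org.apache.kafka.common.MetricName;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.TopologyTestDriver;

    public class SkippedRecordsMetricExample {
        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();
            builder.stream("input").to("output"); // trivial placeholder topology

            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "skip-metric-demo");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234");

            TopologyTestDriver driver = new TopologyTestDriver(builder.build(), props);
            // metrics() is the accessor proposed in this thread; here we just
            // print any metric whose name mentions skipped records.
            Map<MetricName, ? extends Metric> metrics = driver.metrics();
            metrics.forEach((name, metric) -> {
                if (name.name().contains("skipped")) {
                    System.out.println(name + " = " + metric.metricValue());
                }
            });
            driver.close();
        }
    }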