I agree we should add as much information as is reasonable to the log. For example, see this WIP PR I started for this KIP:
https://github.com/apache/kafka/pull/4812/files#diff-88d129f048bc842c7db5b2566a45fce8R80
and https://github.com/apache/kafka/pull/4812/files#diff-69e6789eb675ec978a1abd24fed96eb1R111

I'm not sure if we should nail down the log messages in the KIP or in the PR discussion. What say you?

Thanks,
-John

On Tue, Apr 3, 2018 at 12:20 AM, Matthias J. Sax <matth...@confluent.io> wrote:

> Thanks for sharing your thoughts. As I mentioned originally, I am not sure about the right log level either. Your arguments are convincing -- thus, I am fine with keeping WARN level.
>
> The task vs. thread level argument is an interesting one. However, I am wondering if we should add this information to the corresponding WARN logs that we write anyway? For this case, we can also log the corresponding operator (and other information, like the topic name, if needed). WDYT about this?
>
> -Matthias
>
> On 4/2/18 8:31 PM, Guozhang Wang wrote:
>> Regarding logging: I'm inclined to keep logging at WARN level, since skipped records are not expected in normal execution (for all the reasons that we are aware of), and hence when errors happen users should be alerted by the metrics and look into the log files. So to me, if it is really spamming the log files, that is also a good alert for users. Besides, for deserialization errors we already log at WARN level for this reason.
>>
>> Regarding the metrics levels: I was pondering that as well. What made me think about and agree on task-level rather than thread-level is that, for some reasons like window retention, skips may possibly be happening on a subset of input partitions; since tasks are correlated with partitions, the task-level metrics can help users narrow down on the specific input data partitions.
>>
>> Guozhang
>>
>> On Mon, Apr 2, 2018 at 6:43 PM, John Roesler <j...@confluent.io> wrote:
>>> Hi Matthias,
>>>
>>> No worries! Thanks for the reply.
>>>
>>> 1) There isn't a connection.
>>> I tried using the TopologyTestDriver to write a quick test exercising the current behavior and discovered that the metrics weren't available. It seemed like they should be, so I tacked it on to this KIP. If you feel it's inappropriate, I can pull it back out.
>>>
>>> 2) I was also concerned about that, but I figured it would come up in discussion if I just went ahead and proposed it. And here we are!
>>>
>>> Here's my thought: maybe there are two classes of skips: "controlled" and "uncontrolled", where "controlled" means, as an app author, I deliberately filter out some events, and "uncontrolled" means that I simply don't account for some feature of the data, and the framework skips the records (as opposed to crashing).
>>>
>>> In this breakdown, the skips I'm adding metrics for are all uncontrolled skips (and we hope to measure all the uncontrolled skips). Our skips are well documented, so it wouldn't be terrible to have an application in which you know to expect tons of uncontrolled skips, but it's not great either, since you may also have some *unexpected* uncontrolled skips. They'll be difficult to notice, since you're probably not alerting on the metric and you're filtering out the logs (whatever their level).
>>>
>>> I'd recommend that any app author, as an alternative, convert all expected skips to controlled ones by updating the topology to filter those records out.
>>>
>>> Following from my recommendation, as a library author I'm inclined to mark those logs WARN, since in my opinion they should be concerning to the app authors. I'd definitely want to show, rather than hide, them by default, so I would pick INFO at least.
>>>
>>> That said, logging is always a tricky issue for lower-level libraries that run inside user code, since we don't have all the information we need to make the right call.
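[Editor's note: John's "controlled skip" recommendation above amounts to putting an explicit filter ahead of the operators that would otherwise skip records silently. A minimal sketch of the idea, written against plain key/value pairs rather than the actual `KStream#filter` API; the null-key condition and all names here are illustrative assumptions, not the KIP's wording.]

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ControlledSkips {
    // The kind of predicate an app author would pass to KStream#filter
    // to turn "uncontrolled" skips (e.g. null-keyed records dropped
    // silently downstream) into deliberate, "controlled" ones.
    static boolean isProcessable(String key, String value) {
        return key != null && value != null;
    }

    static List<Map.Entry<String, String>> controlledFilter(List<Map.Entry<String, String>> records) {
        return records.stream()
                .filter(r -> isProcessable(r.getKey(), r.getValue()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> input = Arrays.asList(
                new AbstractMap.SimpleEntry<>("a", "1"),
                new AbstractMap.SimpleEntry<>(null, "2"), // would otherwise be an uncontrolled skip
                new AbstractMap.SimpleEntry<>("b", "3"));
        System.out.println(controlledFilter(input).size()); // prints 2
    }
}
```

With such a filter in the topology, the skipped-record metric only reflects the unexpected cases, which is what makes alerting on it useful.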
>>>
>>> On your last note, yeah, I got that impression from Guozhang as well. Thanks for the clarification.
>>>
>>> -John
>>>
>>> On Mon, Apr 2, 2018 at 4:03 PM, Matthias J. Sax <matth...@confluent.io> wrote:
>>>> John,
>>>>
>>>> sorry for my late reply and thanks for updating the KIP.
>>>>
>>>> I like your approach of "metrics are for monitoring, logs are for debugging" -- however:
>>>>
>>>> 1) I don't see a connection between this and the task-level metrics that you propose in order to get the metrics in `TopologyTestDriver`. I don't think people would monitor the `TopologyTestDriver`, and thus I'm wondering why it is important to include the metrics there? Thread-level metrics might be easier to monitor, though (i.e., fewer different metrics to monitor).
>>>>
>>>> 2) I am a little worried that WARN level logging might be too chatty -- as you pointed out, it's about debugging, thus DEBUG level might be better. Not 100% sure about this, to be honest. What is the general assumption about the frequency of skipped records? I could imagine cases in which skipped records are quite frequent, and thus WARN level logs might "flood" the logs.
>>>>
>>>> One final remark:
>>>>
>>>>> More generally, I would like to establish a pattern in which we could add new values for the "reason" tags without needing a KIP to do so.
>>>>
>>>> From my understanding, this is not feasible. Changing metrics is always considered a public API change, and we need a KIP for any change. As we moved away from tagging, it doesn't matter for the KIP anymore -- just wanted to point it out.
>>>>
>>>> -Matthias
>>>>
>>>> On 3/30/18 2:47 PM, John Roesler wrote:
>>>>> Allrighty! The KIP is updated.
>>>>>
>>>>> Thanks again, all, for the feedback.
>>>>> -John
>>>>>
>>>>> On Fri, Mar 30, 2018 at 3:35 PM, John Roesler <j...@confluent.io> wrote:
>>>>>> Hey Guozhang and Bill,
>>>>>>
>>>>>> Ok, I'll update the KIP. At the risk of disturbing consensus, I'd like to put it in the task instead of the thread so that it'll show up in the TopologyTestDriver metrics as well.
>>>>>>
>>>>>> I'm leaning toward keeping the scope where it is right now, but if others want to advocate for tossing in some more metrics, we can go that route.
>>>>>>
>>>>>> Thanks all,
>>>>>> -John
>>>>>>
>>>>>> On Fri, Mar 30, 2018 at 2:37 PM, Bill Bejeck <bbej...@gmail.com> wrote:
>>>>>>> Thanks for the KIP John, and sorry for the late comments.
>>>>>>>
>>>>>>> I'm on the fence about providing single-level metrics, but I think we'll have that discussion outside of this KIP.
>>>>>>>
>>>>>>>> * maintain one skipped-record metric (could be per-thread, per-task, or per-processor-node) with no "reason"
>>>>>>>> * introduce a warn-level log detailing the topic/partition/offset and reason of the skipped record
>>>>>>>
>>>>>>> I'm +1 on both of these suggestions.
>>>>>>>
>>>>>>> Finally, we have had requests in the past for some metrics around when a persistent store removes an expired window. Would adding that to our metrics stretch the scope of this KIP too much?
>>>>>>>
>>>>>>> Thanks again, and overall I'm +1 on this KIP.
>>>>>>>
>>>>>>> Bill
>>>>>>>
>>>>>>> On Fri, Mar 30, 2018 at 2:00 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>>>>>>>> The proposal sounds good to me. Regarding "maintain only one level of metrics", maybe we can discuss that separately from this KIP, since that would be a larger scope of discussion.
>>>>>>>> I agree that if we are going to maintain only one level of metrics, it should be the lowest level, and we would let users do the roll-ups themselves. But I'm still not fully convinced that we should just provide single-level metrics, because 1) I think for different metrics people may be interested in investigating at different granularities, e.g. poll / commit rate are at the lowest task-level metrics, while process-rate / skip-rate can be as low as processor-node metrics, and 2) user-side roll-ups may not be very straightforward. But for 2), if someone can provide an efficient and easy implementation of that, I can be persuaded :)
>>>>>>>>
>>>>>>>> For now I'm thinking we can add the metric at the thread level; either finer-grained ones with a "reason" tag plus an aggregated one without the tag, or just a single aggregated metric without the tag, looks good to me.
>>>>>>>>
>>>>>>>> Guozhang
>>>>>>>>
>>>>>>>> On Fri, Mar 30, 2018 at 8:05 AM, John Roesler <j...@confluent.io> wrote:
>>>>>>>>> Hey Guozhang,
>>>>>>>>>
>>>>>>>>> Thanks for the reply. Regarding JMX, I can dig it. I'll provide a list in the KIP. I was also thinking we'd better start a documentation page with the metrics listed.
>>>>>>>>>
>>>>>>>>> I'd have no problem logging a warning when we skip records. On the metric front, really I'm just pushing for us to maintain only one level of metrics. If that's more or less granular (i.e., maybe we don't have a metric per reason and log the reason instead), that's fine by me.
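[Editor's note: Guozhang's point 2) above, that user-side roll-ups may not be straightforward, boils down to grouping the finest-grained metrics by a coarser dimension and summing. A rough sketch of such a client-side roll-up, using a flat map keyed by a hypothetical "threadId/taskId" string in place of the real tag-based metrics registry; the key format is an assumption for illustration.]

```java
import java.util.HashMap;
import java.util.Map;

public class SkipRollup {
    // Sum task-level skipped-record counts up to the thread level.
    // Keys follow an assumed "threadId/taskId" format; the real
    // registry identifies metrics by name plus a map of tags.
    static Map<String, Double> rollUpToThread(Map<String, Double> perTask) {
        Map<String, Double> perThread = new HashMap<>();
        for (Map.Entry<String, Double> e : perTask.entrySet()) {
            String thread = e.getKey().split("/")[0];
            perThread.merge(thread, e.getValue(), Double::sum);
        }
        return perThread;
    }

    public static void main(String[] args) {
        Map<String, Double> perTask = new HashMap<>();
        perTask.put("thread-1/0_0", 3.0);
        perTask.put("thread-1/0_1", 2.0);
        perTask.put("thread-2/1_0", 1.0);
        System.out.println(rollUpToThread(perTask).get("thread-1")); // prints 5.0
    }
}
```

The summing itself is simple; the non-trivial part Guozhang alludes to is doing it continuously against rate metrics and keeping it in sync as tasks migrate between threads.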
>>>>>>>>> I just don't think it provides a lot of extra value per unit of complexity (interface and implementation) to maintain roll-ups at the thread level in addition to lower-level metrics.
>>>>>>>>>
>>>>>>>>> How about this instead:
>>>>>>>>> * maintain one skipped-record metric (could be per-thread, per-task, or per-processor-node) with no "reason"
>>>>>>>>> * introduce a warn-level log detailing the topic/partition/offset and reason of the skipped record
>>>>>>>>>
>>>>>>>>> If you like that, I can update the KIP.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> -John
>>>>>>>>>
>>>>>>>>> On Thu, Mar 29, 2018 at 6:22 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>>>>>>>>>>> One thing you mention is the notion of setting alerts on coarser metrics being easier than finer ones. All the metric alerting systems I have used make it equally easy to alert on metrics by-tag or over tags. So my experience doesn't say that this is a use case. Were you thinking of an alerting system that makes such a pre-aggregation valuable?
>>>>>>>>>>
>>>>>>>>>> For the commonly used JMX reporter, tags will be encoded directly as part of the object name, and if users want to monitor them they need to know these values beforehand.
>>>>>>>>>> That is also why I think we do want to list all the possible values of the reason tags in the KIP, since
>>>>>>>>>>
>>>>>>>>>>> In my email in response to Matthias, I gave an example of the kind of scenario that would lead me as an operator to run with DEBUG on all the time, since I wouldn't be sure, having seen a skipped record once, that it would ever happen again. The solution is to capture all the available information about the reason and location of skips all the time.
>>>>>>>>>>
>>>>>>>>>> That is a good point. I think we can either expose all levels of metrics by default, or only expose the lowest-level metrics and get rid of the other levels to let users do the roll-ups themselves (which would be a much larger scope of discussion), or we can encourage users not to depend purely on metrics for such troubleshooting: that is to say, users would only be alerted based on metrics, and we can log an info / warn log4j entry each time we are about to skip a record, in all the relevant places, so that upon being notified users can look into the logs to find the details on where / when it happens. WDYT?
>>>>>>>>>>
>>>>>>>>>> Guozhang
>>>>>>>>>>
>>>>>>>>>> On Thu, Mar 29, 2018 at 3:57 PM, John Roesler <j...@confluent.io> wrote:
>>>>>>>>>>> Hey Guozhang,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the review.
>>>>>>>>>>>
>>>>>>>>>>> 1.
>>>>>>>>>>> Matthias raised the same question about the "reason" tag values.
>>>>>>>>>>> I can list all possible values of the "reason" tag, but I'm thinking this level of detail may not be KIP-worthy; maybe the code and documentation review would be sufficient. If you all disagree and would like it included in the KIP, I can certainly do that.
>>>>>>>>>>>
>>>>>>>>>>> If we do provide roll-up metrics, I agree with the pattern of keeping the same name but eliminating the tags for the dimensions that were rolled up.
>>>>>>>>>>>
>>>>>>>>>>> 2.
>>>>>>>>>>> I'm not too sure that implementation efficiency really becomes a factor in choosing whether to (by default) update one coarse metric at the thread level or one granular metric at the processor-node level, since it's just one metric being updated either way. I do agree that if we were to update the granular metrics and multiple roll-ups, then we should consider the efficiency.
>>>>>>>>>>>
>>>>>>>>>>> I agree it's probably not necessary to surface the metrics for all nodes regardless of whether they can or do skip records. Perhaps we can lazily register the metrics.
>>>>>>>>>>>
>>>>>>>>>>> In my email in response to Matthias, I gave an example of the kind of scenario that would lead me as an operator to run with DEBUG on all the time, since I wouldn't be sure, having seen a skipped record once, that it would ever happen again. The solution is to capture all the available information about the reason and location of skips all the time.
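[Editor's note: on the JMX point Guozhang raised earlier in the thread, each metric tag becomes a key property in the MBean's ObjectName, so a per-reason tag yields one ObjectName per reason value: an operator alerting on a specific reason must know that value up front, while alerting across all reasons needs only a wildcard pattern. A sketch with the JDK's `javax.management.ObjectName`; the domain, type, and tag names follow the usual kafka.streams pattern but are assumptions, not the KIP's final names.]

```java
import javax.management.ObjectName;

public class SkipMetricNames {
    // The ObjectName a JMX reporter would register for a task-level
    // skipped-records metric carrying a "reason" tag (illustrative names).
    static ObjectName skipMetric(String threadId, String taskId, String reason) throws Exception {
        return new ObjectName(String.format(
                "kafka.streams:type=stream-task-metrics,thread-id=%s,task-id=%s,reason=%s",
                threadId, taskId, reason));
    }

    public static void main(String[] args) throws Exception {
        ObjectName name = skipMetric("thread-1", "0_0", "deserialization-error");
        // Alerting on one reason means knowing its tag value in advance...
        System.out.println(name.getKeyProperty("reason")); // prints deserialization-error
        // ...whereas a property-value wildcard matches every reason at once.
        ObjectName allReasons = new ObjectName(
                "kafka.streams:type=stream-task-metrics,thread-id=thread-1,task-id=0_0,reason=*");
        System.out.println(allReasons.apply(name)); // prints true
    }
}
```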
>>>>>>>>>>>
>>>>>>>>>>> One thing you mention is the notion of setting alerts on coarser metrics being easier than finer ones. All the metric alerting systems I have used make it equally easy to alert on metrics by-tag or over tags. So my experience doesn't say that this is a use case. Were you thinking of an alerting system that makes such a pre-aggregation valuable?
>>>>>>>>>>>
>>>>>>>>>>> Thanks again,
>>>>>>>>>>> -John
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Mar 29, 2018 at 5:24 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>>>>>>>>>>>> Hello John,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the KIP. Some comments:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. Could you list all the possible values of the "reason" tag? In the JIRA ticket I left some potential reasons, but I'm not clear whether you're going to categorize each of them as a separate reason, or whether there are additional ones you have in mind.
>>>>>>>>>>>>
>>>>>>>>>>>> Also, I'm wondering if we should add another metric that does not have the reason tag but aggregates over all possible reasons? This is for users to easily set their alerting notifications (otherwise they have to write one notification rule per reason) in their monitoring systems.
>>>>>>>>>>>>
>>>>>>>>>>>> 2. Note that the processor-node metrics are actually "per-thread, per-task, per-processor-node", and today we only set the per-thread metrics as INFO while leaving the lower two layers as DEBUG.
>>>>>>>>>>>> I agree with your argument that we are missing the per-client roll-up metrics today, but I'm not convinced that the right way to approach it would be "just providing the lowest-level metrics only".
>>>>>>>>>>>>
>>>>>>>>>>>> Note that the recording implementations of these three levels are different internally today: we did not just do the rolling up to generate the higher-level metrics from the lower-level ones; we record them separately, which means that if we turn on multiple levels of metrics, we may duplicate collecting some metrics. One can argue that this is not the best way to represent multi-level metrics collecting and reporting, but by only enabling thread-level metrics as INFO today, that implementation can be more efficient than only collecting the metrics at the lowest level and then doing the roll-up calculations outside of the metrics classes.
>>>>>>>>>>>>
>>>>>>>>>>>> Plus, today not all processor nodes may possibly skip records; AFAIK we will only skip records at the source, sink, window, and aggregation processor nodes, so adding a metric per processor looks like overkill to me as well. On the other hand, from the user's perspective the "reason" tag may be sufficient for them to narrow down where inside the topology records are being dropped on the floor.
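[Editor's note: John's earlier suggestion to lazily register the metrics fits Guozhang's observation here: if only source, sink, window, and aggregation nodes can skip, a node's skip counter need not exist until its first skip, so nodes that never skip never surface a metric at all. A minimal sketch of that lazy registration, with a plain map of counters standing in for the real sensor registry; all names are illustrative.]

```java
import java.util.HashMap;
import java.util.Map;

public class LazySkipMetrics {
    private final Map<String, Long> skipCounters = new HashMap<>();

    // Create the node's counter only on its first skip; merge() both
    // registers the metric lazily and increments it.
    void recordSkip(String nodeName) {
        skipCounters.merge(nodeName, 1L, Long::sum);
    }

    Map<String, Long> registered() {
        return skipCounters;
    }

    public static void main(String[] args) {
        LazySkipMetrics metrics = new LazySkipMetrics();
        metrics.recordSkip("KSTREAM-SOURCE-0000000000");
        metrics.recordSkip("KSTREAM-SOURCE-0000000000");
        // Only the node that actually skipped is registered.
        System.out.println(metrics.registered()); // prints {KSTREAM-SOURCE-0000000000=2}
    }
}
```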
>>>>>>>>>>>> So I think the "per-thread, per-task" level metrics should be sufficient for troubleshooting in DEBUG mode, and we can add another "per-thread" level metric as INFO which is turned on by default. So under normal execution users still only need INFO level metrics for alerting (e.g. set alerts on all skipped-records metrics being non-zero), and then when troubleshooting they can turn on DEBUG metrics to look into which task is actually causing the skipped records.
>>>>>>>>>>>>
>>>>>>>>>>>> Guozhang
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Mar 29, 2018 at 2:03 PM, Matthias J. Sax <matth...@confluent.io> wrote:
>>>>>>>>>>>>> Thanks for the KIP John.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Reading the material on the related Jiras, I am wondering what `reason` tags you want to introduce? Can you elaborate? The KIP should list those, IMHO.
>>>>>>>>>>>>>
>>>>>>>>>>>>> About the fine-grained metrics vs. the roll-up: you say that
>>>>>>>>>>>>>
>>>>>>>>>>>>>> the coarse metric aggregates across two dimensions simultaneously
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you elaborate on why this is an issue? I am not convinced atm that we should put the fine-grained metrics into INFO level and remove the roll-up at thread level.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Given that they have to do this sum to get a usable top-level view
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is a fair concern, but I don't share the conclusion.
>>>>>>>>>>>>> Offering a built-in `KafkaStreams` "client" roll-up out of the box might be a better solution. In the past we did not offer this due to performance concerns, but we could allow an "opt-in" mechanism. If you disagree, can you provide some reasoning and add it to the "Rejected alternatives" section?
>>>>>>>>>>>>>
>>>>>>>>>>>>> To rephrase: I understand the issue about the missing top-level view, but instead of going more fine-grained, we should consider adding this top-level view and adding/keeping the fine-grained metrics at DEBUG level only.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am +1 to add TopologyTestDriver#metrics() and to remove old metrics directly, as you suggested.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 3/28/18 6:42 PM, Ted Yu wrote:
>>>>>>>>>>>>>> Looks good to me.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Mar 28, 2018 at 3:11 PM, John Roesler <j...@confluent.io> wrote:
>>>>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am proposing KIP-274 to improve the metrics around skipped records in Streams.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please find the details here:
>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-274%3A+Kafka+Streams+Skipped+Records+Metrics
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please let me know what you think!
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> -John