Apologies for this duplicate reply; I did not notice the success confirmation on the first submission.

On 2021/06/14 04:52:11, Travis Bischel <travis.bisc...@gmail.com> wrote:
> Hi! I have a few thoughts on this KIP. First, I'd like to thank you for your work and writeup; it's clear that a lot of thought went into this and it's very thorough! However, I'm not convinced it's the right approach from a fundamental level.
>
> Fundamentally, this KIP seems like somewhat of a solution to an organizational problem. Metrics are organizational concerns, not Kafka operator concerns. Clients should make it easy to plug in metrics (this is the approach I take in my own client), and organizations should have processes such that all clients gather and ship metrics how that organization desires. If an organization is set up correctly, there is no reason for metrics to be forwarded through Kafka. This feels like a solution to an organization not properly setting up how processes ship metrics, and in some ways, it's an overbroad solution, and in other ways, it doesn't cover the entire problem.
>
> From the perspective of Kafka operators, it is easy to see that this KIP is nice in that it just dictates what clients should support for metrics and that the metrics should ship through Kafka. But, from the perspective of an observability team, this workflow is basically hijacking the standard flow that organizations may have. I would rather have applications collect metrics and ship them the same way every other application does. I'd rather not have to configure additional plugins within Kafka to take metrics and forward them.
>
> More importantly, this KIP prescribes cardinality problems, requires that to officially support the KIP a client must support all relevant metrics within the KIP, and requires that a client cannot support other metrics unless those other metrics also go through a KIP process. It is difficult to imagine all of these metrics being relevant to every organization, and there is no way for an organization to filter what is relevant within the client. Instead, the filtering is pushed downwards, meaning more network IO and more CPU costs to filter what is irrelevant and aggregate what needs to be aggregated, and more time for an organization to set up whatever it is that will be doing this filtering and aggregating. Contrast this with a client that enables hooking in to capture numbers that are relevant within an org itself: the org can gather what they want, ship only what they want, and ship directly to the observability system they have already set up. As an aside, it may also be wise to avoid shipping metrics through Kafka about client interaction with Kafka, because if Kafka is having problems, then orgs lose insight into those problems. This would be like statuspage using itself for status on its own systems.
>
> Another downside is that by dictating the important metrics, this KIP has two choices: either try to choose what is important to every org, and inevitably leave out something important to somebody else, or just add everything and let the orgs filter. This KIP mostly looks to go with the latter approach, meaning orgs will be shipping & filtering. With hooks, an org would be able to gather exactly what they want.
>
> As well, I expect that org applications have metrics on the state of the applications outside of the Kafka client. Applications are already sending non-Kafka-client related metrics outbound to observability systems. If a Kafka client provided hooks, then users could just gather the additional relevant Kafka client metrics and ship those metrics the same way they do all of their other metrics. It feels a bit odd for a Kafka client to have its own separate way of forwarding metrics. Another benefit of hooks in clients is that organizations do not _have_ to set up additional plugins to forward metrics from Kafka. Hooks avoid extra organizational work.
>
> The option that the KIP provides for users of clients to opt out of metrics may avoid some of the above issues (by just disabling things at the user level), but that's not really great from the perspective of client authors, because the existence of this KIP forces authors to either just not implement the KIP, or increase complexity within the KIP. Further, from an operator perspective, if I would prefer clients to ship metrics through the systems they already have in place, now I have to expect that anything that uses librdkafka or the official Java client will be shipping me metrics that I have to deal with (since the KIP is default enabled).
>
> Lastly, I'm a little wary that this KIP may stem from a product goal of Confluent: since most everything uses librdkafka or the Java client, then by defaulting clients sending metrics, Confluent gets an easy way to provide metric panels for a nice cloud UI. If any client does not want to support these metrics, and then a user wonders why these hypothetical panels have no metrics, then Confluent can just reply "use a supported client". Even if this (potentially unlikely) scenario is true, then hooks would still be a great alternative, because then Confluent could provide drop-in hooks for any client and the end result of easy-panels would be the same.
>
> In summary,
>
> - Metrics are more of an organizational concern, not specifically a broker operator concern.
>
> - The proposal seems to hijack how metrics are gathered within organizations.
>
> - I don't think KIPs should dictate which metrics should be gathered and which should not. Clients instead should make it easy for users to gather anything they could be interested in, and ignore anything they are not.
>
> - I think hooks are more extensible, more exact, and fit better into organizational workflows.
>
> On 2021/06/02 12:45:45, Magnus Edenhill <mag...@edenhill.se> wrote:
> > Hey all,
> >
> > I'm proposing KIP-714 to add remote Client metrics and observability.
> > This functionality will allow centralized monitoring and troubleshooting of clients and their internals.
> >
> > Please see
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability
> >
> > Looking forward to your feedback!
> >
> > Regards,
> > Magnus