[ https://issues.apache.org/jira/browse/KAFKA-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on KAFKA-3714 started by Eno Thereska.
-------------------------------------------
> Allow users greater access to register custom streams metrics
> -------------------------------------------------------------
>
> Key: KAFKA-3714
> URL: https://issues.apache.org/jira/browse/KAFKA-3714
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Jeff Klukas
> Assignee: Eno Thereska
> Priority: Minor
> Labels: api
> Fix For: 0.10.3.0
>
>
> Copying in some discussion that originally appeared in
> https://github.com/apache/kafka/pull/1362#issuecomment-219064302
>
> Kafka Streams is largely a higher-level abstraction on top of producers and
> consumers, and it seems sensible to match the KafkaStreams interface to that
> of KafkaProducer and KafkaConsumer where possible. For producers and
> consumers, the metric registry is internal and metrics are only exposed as an
> unmodifiable map. This allows users to access client metric values for use in
> application health checks, etc., but doesn't allow them to register new
> metrics.
>
> That approach seems reasonable if we assume that a user interested in
> defining custom metrics is already going to be using a separate metrics
> library. In such a case, users will likely find it easier to define metrics
> using whatever library they're familiar with rather than learning the API for
> Kafka's Metrics class. Is this a reasonable assumption?
>
> If we want to expose the Metrics instance so that users can define arbitrary
> metrics, I'd argue that there's a need for documentation updates. In
> particular, I find the notion of metric tags confusing. Tags can be defined
> in a MetricConfig when the Metrics instance is constructed,
> StreamsMetricsImpl is maintaining its own set of tags, and users can set tag
> overrides.
>
> If a user were to get access to the Metrics instance, they would be missing
> the tags defined in StreamsMetricsImpl.
> I'm imagining that users would want their custom metrics to sit alongside
> the predefined metrics with the same tags, and users shouldn't be expected
> to manage those additional tags themselves.
>
> So, why are we allowing users to define their own metrics via the
> StreamsMetrics interface in the first place? Is it that we'd like to be able
> to provide a built-in latency metric, but the definition depends on the
> details of the use case so there's no generic solution? That would be
> sufficient motivation for this special case of addLatencySensor. If we want
> to continue down that path and give users access to define a wider range of
> custom metrics, I'd prefer to extend the StreamsMetrics interface so that
> users can call methods on that object, automatically getting the tags
> appropriate for that instance rather than interacting with the raw Metrics
> instance.
> ---
> Guozhang had the following comments:
>
> 1) For the producer/consumer cases, all internal metrics are provided and
> abstracted from users, who just need to read the documentation to poll
> whichever of the provided metrics they are interested in. If they want to
> define more metrics, those are likely to live outside the clients themselves,
> and users can use whatever methods they like, so Metrics does not need to be
> exposed to users.
>
> 2) For streams, things are a bit different: users define the computational
> logic, which becomes part of the "Streams Client" processing and may be of
> interest for users to monitor themselves. Think of a customized processor
> that sends an email to some address based on a condition, where users want to
> monitor the average rate of emails sent. Hence it is worth considering
> whether or not they should be able to access the Metrics instance to define
> their own metrics alongside the predefined metrics provided by the library.
>
> 3) Now, since the Metrics class was not previously designed for public usage,
> it is not very user-friendly for defining sensors, especially given the
> semantic differences between name / scope / tags. StreamsMetrics tries
> to hide some of this semantic confusion from users, but it still exposes
> tags and hence is not perfect in doing so. We need to think of a better
> approach so that (a) user-defined metrics will be "aligned" with
> library-provided metrics (i.e. with the same name prefix within a single
> application, with a similar scope hierarchy definition, etc.), and (b) there
> are natural APIs to do so.
>
> I do not have concrete ideas about 3) above off the top of my head; comments
> are more than welcome.
> ---
> I'm not sure that I agree that 1) and 2) are truly different situations. A
> user might choose to send email messages within a bare consumer rather than a
> streams application, and still want to maintain a metric of sent emails. In
> this bare consumer case, we'd expect the user to define that email-sent
> metric outside of Kafka's metrics machinery.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
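
The suggestion in the discussion above, that users register custom metrics through a per-instance object that supplies its own tags instead of through the raw Metrics instance, can be illustrated with a minimal sketch. Everything here is hypothetical: InstanceMetricsSketch, its methods, the "stream-metrics" group name and the tag keys are invented for illustration and are not part of Kafka's API. The sketch uses only the JDK, keeps a plain counter rather than a rate, and assumes that instance tags take precedence when a user-supplied tag has the same key.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical sketch only: this class is NOT part of Kafka.
 * A per-instance facade that stamps its own tags onto every metric a user registers.
 */
final class InstanceMetricsSketch {
    private final String group;                      // shared name prefix (assumed value)
    private final Map<String, String> instanceTags;  // tags owned by this instance
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    InstanceMetricsSketch(String group, Map<String, String> instanceTags) {
        this.group = group;
        this.instanceTags = new TreeMap<>(instanceTags); // sorted, so keys are stable
    }

    /** Registers (or returns) a counter; instance tags are added automatically. */
    LongAdder counter(String name, Map<String, String> extraTags) {
        Map<String, String> tags = new TreeMap<>(extraTags);
        tags.putAll(instanceTags); // assumption: instance tags win on a key conflict
        String key = group + ":" + name + tags;
        return counters.computeIfAbsent(key, k -> new LongAdder());
    }

    /** Read-only snapshot, similar in spirit to the unmodifiable map the clients expose. */
    Map<String, Long> snapshot() {
        Map<String, Long> out = new TreeMap<>();
        counters.forEach((k, v) -> out.put(k, v.sum()));
        return Collections.unmodifiableMap(out);
    }
}

/** Usage: the email-sending processor from the discussion counts sent emails. */
final class EmailMetricDemo {
    public static void main(String[] args) {
        InstanceMetricsSketch metrics = new InstanceMetricsSketch(
            "stream-metrics", Collections.singletonMap("client-id", "app-1"));
        LongAdder emailsSent = metrics.counter(
            "emails-sent", Collections.singletonMap("processor", "mailer"));
        emailsSent.increment(); // call once per email sent
        System.out.println(metrics.snapshot());
    }
}
```

With a facade like this, the user never handles the instance tags directly, and the custom metric ends up under the same group prefix as the predefined ones. Whether user tags should instead be allowed to override instance tags is a design choice the discussion leaves open.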