Hi all, I've updated the KIP with responses to the latest comments: Java client dependencies (Thanks Kirk!), alternate designs (separate cluster, separate producer, etc), etc.
I will revive the vote thread. Thanks, Magnus Den mån 13 dec. 2021 kl 22:32 skrev Ryanne Dolan <ryannedo...@gmail.com>: > I think we should be very careful about introducing new runtime > dependencies into the clients. Historically this has been rare and > essentially necessary (e.g. compression libs). > > Ryanne > > On Mon, Dec 13, 2021, 1:06 PM Kirk True <k...@mustardgrain.com> wrote: > > > Hi Jun, > > > > On Thu, Dec 9, 2021, at 2:28 PM, Jun Rao wrote: > > > 13. Using OpenTelemetry. Does that require runtime dependency > > > on OpenTelemetry library? How good is the compatibility story > > > of OpenTelemetry? This is important since an application could have > other > > > OpenTelemetry dependencies than the Kafka client. > > > > The current design is that the OpenTelemetry JARs would ship with the > > client. Perhaps we can design the client such that the JARs aren't even > > loaded if the user has opted out. The user could even exclude the JARs > from > > their dependencies if they so wished. > > > > I can't speak to the compatibility of the libraries. Is it possible that > > we include a shaded version? > > > > Thanks, > > Kirk > > > > > > > > 14. The proposal listed idempotence=true. This is more of a > configuration > > > than a metric. Are we including that as a metric? What other > > configurations > > > are we including? Should we separate the configurations from the > metrics? > > > > > > Thanks, > > > > > > Jun > > > > > > On Mon, Nov 29, 2021 at 7:34 AM Magnus Edenhill <mag...@edenhill.se> > > wrote: > > > > > > > Hey Bob, > > > > > > > > That's a good point. > > > > > > > > Request type labels were considered but since they're already tracked > > by > > > > broker-side metrics > > > > they were left out as to avoid metric duplication, however those > > metrics > > > > are not per connection, > > > > so they won't be that useful in practice for troubleshooting specific > > > > client instances. > > > > > > > > I'll add the request_type label to the relevant metrics. > > > > > > > > Thanks, > > > > Magnus > > > > > > > > > > > > Den tis 23 nov. 2021 kl 19:20 skrev Bob Barrett > > > > <bob.barr...@confluent.io.invalid>: > > > > > > > > > Hi Magnus, > > > > > > > > > > Thanks for the thorough KIP, this seems very useful. > > > > > > > > > > Would it make sense to include the request type as a label for the > > > > > `client.request.success`, `client.request.errors` and > > > > `client.request.rtt` > > > > > metrics? I think it would be very useful to see which specific > > requests > > > > are > > > > > succeeding and failing for a client. One specific case I can think > of > > > > where > > > > > this could be useful is producer batch timeouts. If a Java > > application > > > > does > > > > > not enable producer client logs (unfortunately, in my experience > this > > > > > happens more often than it should), the application logs will only > > > > contain > > > > > the expiration error message, but no information about what is > > causing > > > > the > > > > > timeout. The requests might all be succeeding but taking too long > to > > > > > process batches, or metadata requests might be failing, or some or > > all > > > > > produce requests might be failing (if the bootstrap servers are > > reachable > > > > > from the client but one or more other brokers are not, for > example). > > If > > > > the > > > > > cluster operator is able to identify the specific requests that are > > slow > > > > or > > > > > failing for a client, they will be better able to diagnose the > issue > > > > > causing batch timeouts. > > > > > > > > > > One drawback I can think of is that this will increase the > > cardinality of > > > > > the request metrics. But any given client is only going to use a > > small > > > > > subset of the request types, and since we already have partition > > labels > > > > for > > > > > the topic-level metrics, I think request labels will still make up > a > > > > > relatively small percentage of the set of metrics. > > > > > > > > > > Thanks, > > > > > Bob > > > > > > > > > > On Mon, Nov 22, 2021 at 2:08 AM Viktor Somogyi-Vass < > > > > > viktorsomo...@gmail.com> > > > > > wrote: > > > > > > > > > > > Hi Magnus, > > > > > > > > > > > > I think this is a very useful addition. We also have a similar > (but > > > > much > > > > > > more simplistic) implementation of this. Maybe I missed it in the > > KIP > > > > but > > > > > > what about adding metrics about the subscription cache itself? > > That I > > > > > think > > > > > > would improve its usability and debuggability as we'd be able to > > see > > > > its > > > > > > performance, hit/miss rates, eviction counts and others. > > > > > > > > > > > > Best, > > > > > > Viktor > > > > > > > > > > > > On Thu, Nov 18, 2021 at 5:12 PM Magnus Edenhill < > > mag...@edenhill.se> > > > > > > wrote: > > > > > > > > > > > > > Hi Mickael, > > > > > > > > > > > > > > see inline. > > > > > > > > > > > > > > Den ons 10 nov. 2021 kl 15:21 skrev Mickael Maison < > > > > > > > mickael.mai...@gmail.com > > > > > > > >: > > > > > > > > > > > > > > > Hi Magnus, > > > > > > > > > > > > > > > > I see you've addressed some of the points I raised above but > > some > > > > (4, > > > > > > > > 5) have not been addressed yet. > > > > > > > > > > > > > > > > > > > > > > Re 4) How will the user/app know metrics are being sent. > > > > > > > > > > > > > > One possibility is to add a JMX metric (thus for user > > consumption) > > > > for > > > > > > the > > > > > > > number of metric pushes the > > > > > > > client has performed, or perhaps the number of metrics > > subscriptions > > > > > > > currently being collected. > > > > > > > Would that be sufficient? > > > > > > > > > > > > > > Re 5) Metric sizes and rates > > > > > > > > > > > > > > A worst case scenario for a producer that is producing to 50 > > unique > > > > > > topics > > > > > > > and emitting all standard metrics yields > > > > > > > a serialized size of around 100KB prior to compression, which > > > > > compresses > > > > > > > down to about 20-30% of that depending > > > > > > > on compression type and topic name uniqueness. > > > > > > > The numbers for a consumer would be similar. > > > > > > > > > > > > > > In practice the number of unique topics would be far less, and > > the > > > > > > > subscription set would typically be for a subset of metrics. > > > > > > > So we're probably closer to 1kb, or less, compressed size per > > client > > > > > per > > > > > > > push interval. > > > > > > > > > > > > > > As both the subscription set and push intervals are controlled > > by the > > > > > > > cluster operator it shouldn't be too hard > > > > > > > to strike a good balance between metrics overhead and > > granularity. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I'm really uneasy with this being enabled by default on the > > client > > > > > > > > side. When collecting data, I think the best practice is to > > ensure > > > > > > > > users are explicitly enabling it. > > > > > > > > > > > > > > > > > > > > > > Requiring metrics to be explicitly enabled on clients severely > > > > cripples > > > > > > its > > > > > > > usability and value. > > > > > > > > > > > > > > One of the problems that this KIP aims to solve is for useful > > metrics > > > > > to > > > > > > be > > > > > > > available on demand > > > > > > > regardless of the technical expertise of the user. As Ryanne > > points, > > > > > out > > > > > > a > > > > > > > savvy user/organization > > > > > > > will typically have metrics collection and monitoring in place > > > > already, > > > > > > and > > > > > > > the benefits of this KIP > > > > > > > are then more of a common set and format metrics across client > > > > > > > implementations and languages. > > > > > > > But that is not the typical Kafka user in my experience, > they're > > not > > > > > > Kafka > > > > > > > experts and they don't have the > > > > > > > knowledge of how to best instrument their clients. > > > > > > > Having metrics enabled by default for this user base allows the > > Kafka > > > > > > > operators to proactively and reactively > > > > > > > monitor and troubleshoot client issues, without the need for > the > > less > > > > > > savvy > > > > > > > user to do anything. > > > > > > > It is often too late to tell a user to enable metrics when the > > > > problem > > > > > > has > > > > > > > already occurred. > > > > > > > > > > > > > > Now, to be clear, even though metrics are enabled by default on > > > > clients > > > > > > it > > > > > > > is not enabled by default > > > > > > > on the brokers; the Kafka operator needs to build and set up a > > > > metrics > > > > > > > plugin and add metrics subscriptions > > > > > > > before anything is sent from the client. > > > > > > > It is opt-out on the clients and opt-in on the broker. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > You mentioned brokers already have > > > > > > > > some(most?) of the information contained in metrics, if so > > then why > > > > > > > > are we collecting it again? Surely there must be some new > > > > information > > > > > > > > in the client metrics. > > > > > > > > > > > > > > > > > > > > > > From the user's perspective the Kafka infrastructure extends > from > > > > > > > producer.send() to > > > > > > > messages being returned from consumer.poll(), a giant black box > > where > > > > > > > there's a lot going on between those > > > > > > > two points. The brokers currently only see what happens once > > those > > > > > > requests > > > > > > > and messages hits the broker, > > > > > > > but as Kafka clients are complex pieces of machinery there's a > > myriad > > > > > of > > > > > > > queues, timers, and state > > > > > > > that's critical to the operation and infrastructure that's not > > > > > currently > > > > > > > visible to the operator. > > > > > > > Relying on the user to accurately and timely provide this > missing > > > > > > > information is not generally feasible. > > > > > > > > > > > > > > > > > > > > > Most of the standard metrics listed in the KIP are data points > > that > > > > the > > > > > > > broker does not have. > > > > > > > Only a small number of metrics are duplicates (like the request > > > > counts > > > > > > and > > > > > > > sizes), but they are included > > > > > > > to ease correlation when inspecting these client metrics. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Moreover this is a brand new feature so it's even harder to > > justify > > > > > > > > enabling it and forcing onto all our users. If disabled by > > default, > > > > > > > > it's relatively easy to enable in a new release if we decide > > to, > > > > but > > > > > > > > once enabled by default it's much harder to disable. Also > this > > > > > feature > > > > > > > > will apply to all future metrics we will add. > > > > > > > > > > > > > > > > > > > > > > I think maturity of a feature implementation should be the > > deciding > > > > > > factor, > > > > > > > rather than > > > > > > > the design of it (which this KIP is). I.e., if the > > implementation is > > > > > not > > > > > > > deemed mature enough > > > > > > > for release X.Y it will be disabled. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Overall I think it's an interesting feature but I'd prefer to > > be > > > > > > > > slightly defensive and see how it works in practice before > > enabling > > > > > it > > > > > > > > everywhere. > > > > > > > > > > > > > > > > > > > > > > Right, and I agree on being defensive, but since this feature > > still > > > > > > > requires manual > > > > > > > enabling on the brokers before actually being used, I think > that > > > > gives > > > > > > > enough control > > > > > > > to opt-in or out of this feature as needed. > > > > > > > > > > > > > > Thanks for your comments! > > > > > > > > > > > > > > Regards, > > > > > > > Magnus > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Mickael > > > > > > > > > > > > > > > > On Mon, Nov 8, 2021 at 7:55 AM Magnus Edenhill < > > mag...@edenhill.se > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > Thanks David for pointing this out, > > > > > > > > > I've updated the KIP to include client_id as a matching > > selector. > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > Magnus > > > > > > > > > > > > > > > > > > Den tors 4 nov. 2021 kl 18:01 skrev David Mao > > > > > > > <d...@confluent.io.invalid > > > > > > > > >: > > > > > > > > > > > > > > > > > > > Hey Magnus, > > > > > > > > > > > > > > > > > > > > I noticed that the KIP outlines the initial selectors > > supported > > > > > as: > > > > > > > > > > > > > > > > > > > > - client_instance_id - CLIENT_INSTANCE_ID UUID string > > > > > > > > representation. > > > > > > > > > > - client_software_name - client software > implementation > > > > name. > > > > > > > > > > - client_software_version - client software > > implementation > > > > > > > version. > > > > > > > > > > > > > > > > > > > > In the given reactive monitoring workflow, we mention > that > > the > > > > > > > > application > > > > > > > > > > user does not know their client's client instance ID, but > > it's > > > > > > > outlined > > > > > > > > > > that the operator can add a metrics subscription > selecting > > for > > > > > > > > clientId. I > > > > > > > > > > don't see clientId as one of the supported selectors. > > > > > > > > > > I can see how this would have made sense in a previous > > > > iteration > > > > > > > given > > > > > > > > that > > > > > > > > > > the previous client instance ID proposal was to construct > > the > > > > > > client > > > > > > > > > > instance ID using clientId as a prefix. Now that the > client > > > > > > instance > > > > > > > > ID is > > > > > > > > > > a UUID, would we want to add clientId as a supported > > selector? > > > > > > > > > > Let me know what you think. > > > > > > > > > > > > > > > > > > > > David > > > > > > > > > > > > > > > > > > > > On Tue, Oct 19, 2021 at 12:33 PM Magnus Edenhill < > > > > > > mag...@edenhill.se > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Hi Mickael! > > > > > > > > > > > > > > > > > > > > > > Den tis 19 okt. 2021 kl 19:30 skrev Mickael Maison < > > > > > > > > > > > mickael.mai...@gmail.com > > > > > > > > > > > >: > > > > > > > > > > > > > > > > > > > > > > > Hi Magnus, > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the proposal. > > > > > > > > > > > > > > > > > > > > > > > > 1. Looking at the protocol section, isn't > > > > "ClientInstanceId" > > > > > > > > expected > > > > > > > > > > > > to be a field in GetTelemetrySubscriptionsResponseV0? > > > > > > Otherwise, > > > > > > > > how > > > > > > > > > > > > does a client retrieve this value? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Good catch, it got removed by mistake in one of the > > edits. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. In the client API section, you mention a new > method > > > > > > > > > > > > "clientInstanceId()". Can you clarify which > interfaces > > are > > > > > > > > affected? > > > > > > > > > > > > Is it only Consumer and Producer? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > And Admin. Will update the KIP. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 3. I'm a bit concerned this is enabled by default. > > Even if > > > > > the > > > > > > > data > > > > > > > > > > > > collected is supposed to be not sensitive, I think > > this can > > > > > be > > > > > > > > > > > > problematic in some environments. Also users don't > > seem to > > > > > have > > > > > > > the > > > > > > > > > > > > choice to only expose some metrics. Knowing how much > > data > > > > > > transit > > > > > > > > > > > > through some applications can be considered critical. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The broker already knows how much data transits through > > the > > > > > > client > > > > > > > > > > though, > > > > > > > > > > > right? > > > > > > > > > > > Care has been taken not to expose information in the > > standard > > > > > > > metrics > > > > > > > > > > that > > > > > > > > > > > might > > > > > > > > > > > reveal sensitive information. > > > > > > > > > > > > > > > > > > > > > > Do you have an example of how the proposed metrics > could > > leak > > > > > > > > sensitive > > > > > > > > > > > information? > > > > > > > > > > > As for limiting the what metrics to export; I guess > that > > > > could > > > > > > make > > > > > > > > sense > > > > > > > > > > > in some > > > > > > > > > > > very sensitive use-cases, but those users might disable > > > > metrics > > > > > > > > > > altogether > > > > > > > > > > > for now. > > > > > > > > > > > Could these concerns be addressed by a later KIP? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 4. As a user, how do you know if your application is > > > > actively > > > > > > > > sending > > > > > > > > > > > > metrics? Are there new metrics exposing what's going > > on, > > > > like > > > > > > how > > > > > > > > much > > > > > > > > > > > > data is being sent? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > That's a good question. > > > > > > > > > > > Since the proposed metrics interface is not aimed at, > or > > > > > directly > > > > > > > > > > available > > > > > > > > > > > to, the application > > > > > > > > > > > I guess there's little point of adding it here, but > > instead > > > > > > adding > > > > > > > > > > > something to the > > > > > > > > > > > existing JMX metrics? > > > > > > > > > > > Do you have any suggestions? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 5. If all metrics are enabled on a regular Consumer > or > > > > > > Producer, > > > > > > > do > > > > > > > > > > > > you have an idea how much throughput this would use? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > It depends on the number of partition/topics/etc the > > client > > > > is > > > > > > > > producing > > > > > > > > > > > to/consuming from. > > > > > > > > > > > I'll add some sizes to the KIP for some typical > > use-cases. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > Magnus > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Oct 19, 2021 at 5:06 PM Magnus Edenhill < > > > > > > > > mag...@edenhill.se> > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > Den tis 19 okt. 2021 kl 13:22 skrev Tom Bentley < > > > > > > > > tbent...@redhat.com > > > > > > > > > > >: > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Magnus, > > > > > > > > > > > > > > > > > > > > > > > > > > > > I reviewed the KIP since you called the vote > > (sorry for > > > > > not > > > > > > > > > > reviewing > > > > > > > > > > > > when > > > > > > > > > > > > > > you announced your intention to call the vote). I > > have > > > > a > > > > > > few > > > > > > > > > > > questions > > > > > > > > > > > > on > > > > > > > > > > > > > > some of the details. > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. There's no Javadoc on > > ClientTelemetryPayload.data(), > > > > > so > > > > > > I > > > > > > > > don't > > > > > > > > > > > know > > > > > > > > > > > > > > whether the payload is exposed through this > method > > as > > > > > > > > compressed or > > > > > > > > > > > > not. > > > > > > > > > > > > > > Later on you say "Decompression of the payloads > > will be > > > > > > > > handled by > > > > > > > > > > > the > > > > > > > > > > > > > > broker metrics plugin, the broker should expose a > > > > > suitable > > > > > > > > > > > > decompression > > > > > > > > > > > > > > API to the metrics plugin for this purpose.", > which > > > > > > suggests > > > > > > > > it's > > > > > > > > > > the > > > > > > > > > > > > > > compressed data in the buffer, but then we don't > > know > > > > > which > > > > > > > > codec > > > > > > > > > > was > > > > > > > > > > > > used, > > > > > > > > > > > > > > nor the API via which the plugin should > decompress > > it > > > > if > > > > > > > > required > > > > > > > > > > for > > > > > > > > > > > > > > forwarding to the ultimate metrics store. Should > > the > > > > > > > > > > > > ClientTelemetryPayload > > > > > > > > > > > > > > expose a method to get the compression and a > > > > > decompressor? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Good point, updated. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. The client-side API is expressed as > > StringOrError > > > > > > > > > > > > > > ClientInstance::ClientInstanceId(int > timeout_ms). I > > > > > > > understand > > > > > > > > that > > > > > > > > > > > > you're > > > > > > > > > > > > > > thinking about the librdkafka implementation, but > > it > > > > > would > > > > > > be > > > > > > > > good > > > > > > > > > > to > > > > > > > > > > > > show > > > > > > > > > > > > > > the API as it would appear on the Apache Kafka > > clients. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This was meant as pseudo-code, but I changed it to > > Java. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 3. "PushTelemetryRequest|Response - protocol > > request > > > > used > > > > > > by > > > > > > > > the > > > > > > > > > > > > client to > > > > > > > > > > > > > > send metrics to any broker it is connected to." > To > > be > > > > > > clear, > > > > > > > > this > > > > > > > > > > > means > > > > > > > > > > > > > > that the client can choose any of the connected > > brokers > > > > > and > > > > > > > > push to > > > > > > > > > > > > just > > > > > > > > > > > > > > one of them? What should a supporting client do > if > > it > > > > > gets > > > > > > an > > > > > > > > error > > > > > > > > > > > > when > > > > > > > > > > > > > > pushing metrics to a broker, retry sending to the > > same > > > > > > broker > > > > > > > > or > > > > > > > > > > try > > > > > > > > > > > > > > pushing to another broker, or drop the metrics? > > Should > > > > > > > > supporting > > > > > > > > > > > > clients > > > > > > > > > > > > > > send successive requests to a single broker, or > > round > > > > > > robin, > > > > > > > > or is > > > > > > > > > > > > that up > > > > > > > > > > > > > > to the client author? I'm guessing the behaviour > > should > > > > > be > > > > > > > > sticky > > > > > > > > > > to > > > > > > > > > > > > > > support the rate limiting features, but I think > it > > > > would > > > > > be > > > > > > > > good > > > > > > > > > > for > > > > > > > > > > > > client > > > > > > > > > > > > > > authors if this section were explicit on the > > > > recommended > > > > > > > > behaviour. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > You are right, I've updated the KIP to make this > > clearer. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 4. "Mapping the client instance id to an actual > > > > > application > > > > > > > > > > instance > > > > > > > > > > > > > > running on a (virtual) machine can be done by > > > > inspecting > > > > > > the > > > > > > > > > > metrics > > > > > > > > > > > > > > resource labels, such as the client source > address > > and > > > > > > source > > > > > > > > port, > > > > > > > > > > > or > > > > > > > > > > > > > > security principal, all of which are added by the > > > > > receiving > > > > > > > > broker. > > > > > > > > > > > > This > > > > > > > > > > > > > > will allow the operator together with the user to > > > > > identify > > > > > > > the > > > > > > > > > > actual > > > > > > > > > > > > > > application instance." Is this really always > true? > > The > > > > > > source > > > > > > > > IP > > > > > > > > > > and > > > > > > > > > > > > port > > > > > > > > > > > > > > might be a loadbalancer/proxy in some setups. The > > > > > > principal, > > > > > > > as > > > > > > > > > > > already > > > > > > > > > > > > > > mentioned in the KIP, might be shared between > > multiple > > > > > > > > > > applications. > > > > > > > > > > > > So at > > > > > > > > > > > > > > worst the organization running the clients might > > have > > > > to > > > > > > > > consult > > > > > > > > > > the > > > > > > > > > > > > logs > > > > > > > > > > > > > > of a set of client applications, right? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Yes, that's correct. There's no guaranteed mapping > > from > > > > > > > > > > > > client_instance_id > > > > > > > > > > > > > to > > > > > > > > > > > > > an actual instance, that's why the KIP recommends > > client > > > > > > > > > > > implementations > > > > > > > > > > > > to > > > > > > > > > > > > > log the client instance id > > > > > > > > > > > > > upon retrieval, and also provide an API for the > > > > application > > > > > > to > > > > > > > > > > retrieve > > > > > > > > > > > > the > > > > > > > > > > > > > instance id programmatically > > > > > > > > > > > > > if it has a better way of exposing it. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 5. "Tests indicate that a compression ratio up to > > 10x is > > > > > > > > possible for > > > > > > > > > > > the > > > > > > > > > > > > > > standard metrics." Client authors might > appreciate > > your > > > > > > > > mentioning > > > > > > > > > > > > which > > > > > > > > > > > > > > compression codec got these results. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Good point. Updated. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 6. "Should the client send a push request prior > to > > > > expiry > > > > > > of > > > > > > > > the > > > > > > > > > > > > previously > > > > > > > > > > > > > > calculated PushIntervalMs the broker will discard > > the > > > > > > metrics > > > > > > > > and > > > > > > > > > > > > return a > > > > > > > > > > > > > > PushTelemetryResponse with the ErrorCode set to > > > > > > RateLimited." > > > > > > > > Is > > > > > > > > > > this > > > > > > > > > > > > > > RATE_LIMITED a new error code? It's not mentioned > > in > > > > the > > > > > > "New > > > > > > > > Error > > > > > > > > > > > > Codes" > > > > > > > > > > > > > > section. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > That's a leftover, it should be using the standard > > > > > > ThrottleTime > > > > > > > > > > > > mechanism. > > > > > > > > > > > > > Fixed. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 7. In the section "Standard client resource > labels" > > > > > > > > application_id > > > > > > > > > > is > > > > > > > > > > > > > > described as Kafka Streams only, but the section > of > > > > > "Client > > > > > > > > > > > > Identification" > > > > > > > > > > > > > > talks about "application instance id as an > optional > > > > > future > > > > > > > > > > > nice-to-have > > > > > > > > > > > > > > that may be included as a metrics label if it has > > been > > > > > set > > > > > > by > > > > > > > > the > > > > > > > > > > > > user", so > > > > > > > > > > > > > > I'm confused whether non-Kafka Streams clients > > should > > > > set > > > > > > an > > > > > > > > > > > > application_id > > > > > > > > > > > > > > or not. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I'll clarify this in the KIP, but basically we > would > > need > > > > > to > > > > > > > add > > > > > > > > an ` > > > > > > > > > > > > > application.id` config > > > > > > > > > > > > > property for non-streams clients for this purpose, > > and > > > > > that's > > > > > > > > outside > > > > > > > > > > > the > > > > > > > > > > > > > scope of this KIP since we want to make it > > zero-conf:ish > > > > on > > > > > > the > > > > > > > > > > client > > > > > > > > > > > > side. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Kind regards, > > > > > > > > > > > > > > > > > > > > > > > > > > > > Tom > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the review, > > > > > > > > > > > > > Magnus > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Oct 7, 2021 at 5:26 PM Magnus Edenhill < > > > > > > > > mag...@edenhill.se > > > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I've updated the KIP following our recent > > discussions > > > > > on > > > > > > > the > > > > > > > > > > > mailing > > > > > > > > > > > > > > list: > > > > > > > > > > > > > > > - split the protocol in two, one for getting > the > > > > > metrics > > > > > > > > > > > > subscriptions, > > > > > > > > > > > > > > > and one for pushing the metrics. > > > > > > > > > > > > > > > - simplifications: initially only one > supported > > > > > metrics > > > > > > > > format, > > > > > > > > > > no > > > > > > > > > > > > > > > client.id in the instance id, etc. > > > > > > > > > > > > > > > - made CLIENT_METRICS subscription > configuration > > > > > entries > > > > > > > > more > > > > > > > > > > > > structured > > > > > > > > > > > > > > > and allowing better client matching > selectors > > (not > > > > > > only > > > > > > > > on the > > > > > > > > > > > > > > instance > > > > > > > > > > > > > > > id, but also the other > > > > > > > > > > > > > > > client resource labels, such as > > > > > client_software_name, > > > > > > > > etc.). > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Unless there are further comments I'll call the > > vote > > > > > in a > > > > > > > > day or > > > > > > > > > > > two. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > > > > > > Magnus > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Den mån 4 okt. 2021 kl 20:57 skrev Magnus > > Edenhill < > > > > > > > > > > > > mag...@edenhill.se>: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Gwen, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I'm finishing up the KIP based on the last > > couple > > > > of > > > > > > > > discussion > > > > > > > > > > > > points > > > > > > > > > > > > > > in > > > > > > > > > > > > > > > > this thread > > > > > > > > > > > > > > > > and will call the Vote later this week. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > > > > > > Magnus > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Den lör 2 okt. 2021 kl 02:01 skrev Gwen > Shapira > > > > > > > > > > > > > > > <g...@confluent.io.invalid > > > > > > > > > > > > > > > > >: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> Hey, > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> I noticed that there was no discussion for > the > > > > last > > > > > 10 > > > > > > > > days, > > > > > > > > > > > but I > > > > > > > > > > > > > > > >> couldn't > > > > > > > > > > > > > > > >> find the vote thread. Is there one that I'm > > > > missing? > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> Gwen > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> On Wed, Sep 22, 2021 at 4:58 AM Magnus > > Edenhill < > > > > > > > > > > > > mag...@edenhill.se> > > > > > > > > > > > > > > > >> wrote: > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > Den tis 21 sep. 2021 kl 06:58 skrev Colin > > > > McCabe < > > > > > > > > > > > > > > cmcc...@apache.org > > > > > > > > > > > > > > > >: > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > On Mon, Sep 20, 2021, at 17:35, Feng Min > > > > wrote: > > > > > > > > > > > > > > > >> > > > Thanks Magnus & Colin for the > > discussion. > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > Based on KIP-714's stateless design, > > Client > > > > > can > > > > > > > > pretty > > > > > > > > > > > much > > > > > > > > > > > > use > > > > > > > > > > > > > > > any > > > > > > > > > > > > > > > >> > > > connection to any broker to send > > metrics. We > > > > > are > > > > > > > not > > > > > > > > > > > > associating > > > > > > > > > > > > > > > >> > > connection > > > > > > > > > > > > > > > >> > > > with client metric state. Is my > > > > understanding > > > > > > > > correct? > > > > > > > > > > If > > > > > > > > > > > > yes, > > > > > > > > > > > > > > > how > > > > > > > > > > > > > > > >> > about > > > > > > > > > > > > > > > >> > > > the following two scenarios > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > 1) One Client (Client-ID) registers > two > > > > > > different > > > > > > > > client > > > > > > > > > > > > > > instance > > > > > > > > > > > > > > > id > > > > > > > > > > > > > > > >> > via > > > > > > > > > > > > > > > >> > > > separate registration. Is it > permitted? > > If > > > > OK, > > > > > > how > > > > > > > > to > > > > > > > > > > > > > > distinguish > > > > > > > > > > > > > > > >> them > > > > > > > > > > > > > > > >> > > from > > > > > > > > > > > > > > > >> > > > the case 2 below. > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > Hi Feng, > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > My understanding, which Magnus can > > clarify I > > > > > > guess, > > > > > > > is > > > > > > > > > > that > > > > > > > > > > > > you > > > > > > > > > > > > > > > could > > > > > > > > > > > > > > > >> > have > > > > > > > > > > > > > > > >> > > something like two Producer instances > > running > > > > > with > > > > > > > the > > > > > > > > > > same > > > > > > > > > > > > > > > client.id > > > > > > > > > > > > > > > >> > > (perhaps because they're using the same > > config > > > > > > file, > > > > > > > > for > > > > > > > > > > > > example). > > > > > > > > > > > > > > > >> They > > > > > > > > > > > > > > > >> > > could even be in the same process. But > > they > > > > > would > > > > > > > get > > > > > > > > > > > separate > > > > > > > > > > > > > > > UUIDs. > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > I believe Magnus used the term client to > > mean > > > > > > > > "Producer or > > > > > > > > > > > > > > > Consumer". > > > > > > > > > > > > > > > >> So > > > > > > > > > > > > > > > >> > > if you have both a Producer and a > > Consumer in > > > > > your > > > > > > > > > > > > application I > > > > > > > > > > > > > > > would > > > > > > > > > > > > > > > >> > > expect you'd get separate UUIDs for > both. > > > > Again > > > > > > > > Magnus can > > > > > > > > > > > > chime > > > > > > > > > > > > > > in > > > > > > > > > > > > > > > >> > here, I > > > > > > > > > > > > > > > >> > > guess. > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > That's correct. > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > 2) How about the client restarting? > > What's > > > > the > > > > > > > > > > > expectation? > > > > > > > > > > > > > > Should > > > > > > > > > > > > > > > >> the > > > > > > > > > > > > > > > >> > > > server expect the client to carry a > > > > persisted > > > > > > > client > > > > > > > > > > > > instance id > > > > > > > > > > > > > > > or > > > > > > > > > > > > > > > >> > > should > > > > > > > > > > > > > > > >> > > > the client be treated as a new > instance? > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > The KIP doesn't describe any mechanism > for > > > > > > > > persistence, > > > > > > > > > > so I > > > > > > > > > > > > would > > > > > > > > > > > > > > > >> assume > > > > > > > > > > > > > > > >> > > that when you restart the client you get > > a new > > > > > > > UUID. I > > > > > > > > > > agree > > > > > > > > > > > > that > > > > > > > > > > > > > > it > > > > > > > > > > > > > > > >> > would > > > > > > > > > > > > > > > >> > > be good to spell this out. > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > Right, it will not be persisted since a > > client > > > > > > > instance > > > > > > > > > > can't > > > > > > > > > > > be > > > > > > > > > > > > > > > >> restarted. > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > Will update the KIP to make this clearer. > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > /Magnus > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> -- > > > > > > > > > > > > > > > >> Gwen Shapira > > > > > > > > > > > > > > > >> Engineering Manager | Confluent > > > > > > > > > > > > > > > >> 650.450.2760 | @gwenshap > > > > > > > > > > > > > > > >> Follow us: Twitter | blog > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >