Hi all,

I've updated the KIP with responses to the latest comments: Java client
dependencies (Thanks Kirk!), alternate designs (separate cluster, separate
producer, etc), etc.

I will revive the vote thread.

Thanks,
Magnus


Den mån 13 dec. 2021 kl 22:32 skrev Ryanne Dolan <ryannedo...@gmail.com>:

> I think we should be very careful about introducing new runtime
> dependencies into the clients. Historically this has been rare and
> essentially necessary (e.g. compression libs).
>
> Ryanne
>
> On Mon, Dec 13, 2021, 1:06 PM Kirk True <k...@mustardgrain.com> wrote:
>
> > Hi Jun,
> >
> > On Thu, Dec 9, 2021, at 2:28 PM, Jun Rao wrote:
> > > 13. Using OpenTelemetry. Does that require runtime dependency
> > > on OpenTelemetry library? How good is the compatibility story
> > > of OpenTelemetry? This is important since an application could have
> other
> > > OpenTelemetry dependencies than the Kafka client.
> >
> > The current design is that the OpenTelemetry JARs would ship with the
> > client. Perhaps we can design the client such that the JARs aren't even
> > loaded if the user has opted out. The user could even exclude the JARs
> from
> > their dependencies if they so wished.
> >
> > I can't speak to the compatibility of the libraries. Is it possible that
> > we include a shaded version?
> >
> > Thanks,
> > Kirk
> >
> > >
> > > 14. The proposal listed idempotence=true. This is more of a
> configuration
> > > than a metric. Are we including that as a metric? What other
> > configurations
> > > are we including? Should we separate the configurations from the
> metrics?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Mon, Nov 29, 2021 at 7:34 AM Magnus Edenhill <mag...@edenhill.se>
> > wrote:
> > >
> > > > Hey Bob,
> > > >
> > > > That's a good point.
> > > >
> > > > Request type labels were considered but since they're already tracked
> > by
> > > > broker-side metrics
> > > > they were left out as to avoid metric duplication, however those
> > metrics
> > > > are not per connection,
> > > > so they won't be that useful in practice for troubleshooting specific
> > > > client instances.
> > > >
> > > > I'll add the request_type label to the relevant metrics.
> > > >
> > > > Thanks,
> > > > Magnus
> > > >
> > > >
> > > > Den tis 23 nov. 2021 kl 19:20 skrev Bob Barrett
> > > > <bob.barr...@confluent.io.invalid>:
> > > >
> > > > > Hi Magnus,
> > > > >
> > > > > Thanks for the thorough KIP, this seems very useful.
> > > > >
> > > > > Would it make sense to include the request type as a label for the
> > > > > `client.request.success`, `client.request.errors` and
> > > > `client.request.rtt`
> > > > > metrics? I think it would be very useful to see which specific
> > requests
> > > > are
> > > > > succeeding and failing for a client. One specific case I can think
> of
> > > > where
> > > > > this could be useful is producer batch timeouts. If a Java
> > application
> > > > does
> > > > > not enable producer client logs (unfortunately, in my experience
> this
> > > > > happens more often than it should), the application logs will only
> > > > contain
> > > > > the expiration error message, but no information about what is
> > causing
> > > > the
> > > > > timeout. The requests might all be succeeding but taking too long
> to
> > > > > process batches, or metadata requests might be failing, or some or
> > all
> > > > > produce requests might be failing (if the bootstrap servers are
> > reachable
> > > > > from the client but one or more other brokers are not, for
> example).
> > If
> > > > the
> > > > > cluster operator is able to identify the specific requests that are
> > slow
> > > > or
> > > > > failing for a client, they will be better able to diagnose the
> issue
> > > > > causing batch timeouts.
> > > > >
> > > > > One drawback I can think of is that this will increase the
> > cardinality of
> > > > > the request metrics. But any given client is only going to use a
> > small
> > > > > subset of the request types, and since we already have partition
> > labels
> > > > for
> > > > > the topic-level metrics, I think request labels will still make up
> a
> > > > > relatively small percentage of the set of metrics.
> > > > >
> > > > > Thanks,
> > > > > Bob
> > > > >
> > > > > On Mon, Nov 22, 2021 at 2:08 AM Viktor Somogyi-Vass <
> > > > > viktorsomo...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Magnus,
> > > > > >
> > > > > > I think this is a very useful addition. We also have a similar
> (but
> > > > much
> > > > > > more simplistic) implementation of this. Maybe I missed it in the
> > KIP
> > > > but
> > > > > > what about adding metrics about the subscription cache itself?
> > That I
> > > > > think
> > > > > > would improve its usability and debuggability as we'd be able to
> > see
> > > > its
> > > > > > performance, hit/miss rates, eviction counts and others.
> > > > > >
> > > > > > Best,
> > > > > > Viktor
> > > > > >
> > > > > > On Thu, Nov 18, 2021 at 5:12 PM Magnus Edenhill <
> > mag...@edenhill.se>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Mickael,
> > > > > > >
> > > > > > > see inline.
> > > > > > >
> > > > > > > Den ons 10 nov. 2021 kl 15:21 skrev Mickael Maison <
> > > > > > > mickael.mai...@gmail.com
> > > > > > > >:
> > > > > > >
> > > > > > > > Hi Magnus,
> > > > > > > >
> > > > > > > > I see you've addressed some of the points I raised above but
> > some
> > > > (4,
> > > > > > > > 5) have not been addressed yet.
> > > > > > > >
> > > > > > >
> > > > > > > Re 4) How will the user/app know metrics are being sent.
> > > > > > >
> > > > > > > One possibility is to add a JMX metric (thus for user
> > consumption)
> > > > for
> > > > > > the
> > > > > > > number of metric pushes the
> > > > > > > client has performed, or perhaps the number of metrics
> > subscriptions
> > > > > > > currently being collected.
> > > > > > > Would that be sufficient?
> > > > > > >
> > > > > > > Re 5) Metric sizes and rates
> > > > > > >
> > > > > > > A worst case scenario for a producer that is producing to 50
> > unique
> > > > > > topics
> > > > > > > and emitting all standard metrics yields
> > > > > > > a serialized size of around 100KB prior to compression, which
> > > > > compresses
> > > > > > > down to about 20-30% of that depending
> > > > > > > on compression type and topic name uniqueness.
> > > > > > > The numbers for a consumer would be similar.
> > > > > > >
> > > > > > > In practice the number of unique topics would be far less, and
> > the
> > > > > > > subscription set would typically be for a subset of metrics.
> > > > > > > So we're probably closer to 1kb, or less, compressed size per
> > client
> > > > > per
> > > > > > > push interval.
> > > > > > >
> > > > > > > As both the subscription set and push intervals are controlled
> > by the
> > > > > > > cluster operator it shouldn't be too hard
> > > > > > > to strike a good balance between metrics overhead and
> > granularity.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > I'm really uneasy with this being enabled by default on the
> > client
> > > > > > > > side. When collecting data, I think the best practice is to
> > ensure
> > > > > > > > users are explicitly enabling it.
> > > > > > > >
> > > > > > >
> > > > > > > Requiring metrics to be explicitly enabled on clients severely
> > > > cripples
> > > > > > its
> > > > > > > usability and value.
> > > > > > >
> > > > > > > One of the problems that this KIP aims to solve is for useful
> > metrics
> > > > > to
> > > > > > be
> > > > > > > available on demand
> > > > > > > regardless of the technical expertise of the user. As Ryanne
> > points,
> > > > > out
> > > > > > a
> > > > > > > savvy user/organization
> > > > > > > will typically have metrics collection and monitoring in place
> > > > already,
> > > > > > and
> > > > > > > the benefits of this KIP
> > > > > > > are then more of a common set and format metrics across client
> > > > > > > implementations and languages.
> > > > > > > But that is not the typical Kafka user in my experience,
> they're
> > not
> > > > > > Kafka
> > > > > > > experts and they don't have the
> > > > > > > knowledge of how to best instrument their clients.
> > > > > > > Having metrics enabled by default for this user base allows the
> > Kafka
> > > > > > > operators to proactively and reactively
> > > > > > > monitor and troubleshoot client issues, without the need for
> the
> > less
> > > > > > savvy
> > > > > > > user to do anything.
> > > > > > > It is often too late to tell a user to enable metrics when the
> > > > problem
> > > > > > has
> > > > > > > already occurred.
> > > > > > >
> > > > > > > Now, to be clear, even though metrics are enabled by default on
> > > > clients
> > > > > > it
> > > > > > > is not enabled by default
> > > > > > > on the brokers; the Kafka operator needs to build and set up a
> > > > metrics
> > > > > > > plugin and add metrics subscriptions
> > > > > > > before anything is sent from the client.
> > > > > > > It is opt-out on the clients and opt-in on the broker.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > You mentioned brokers already have
> > > > > > > > some(most?) of the information contained in metrics, if so
> > then why
> > > > > > > > are we collecting it again? Surely there must be some new
> > > > information
> > > > > > > > in the client metrics.
> > > > > > > >
> > > > > > >
> > > > > > > From the user's perspective the Kafka infrastructure extends
> from
> > > > > > > producer.send() to
> > > > > > > messages being returned from consumer.poll(), a giant black box
> > where
> > > > > > > there's a lot going on between those
> > > > > > > two points. The brokers currently only see what happens once
> > those
> > > > > > requests
> > > > > > > and messages hits the broker,
> > > > > > > but as Kafka clients are complex pieces of machinery there's a
> > myriad
> > > > > of
> > > > > > > queues, timers, and state
> > > > > > > that's critical to the operation and infrastructure that's not
> > > > > currently
> > > > > > > visible to the operator.
> > > > > > > Relying on the user to accurately and timely provide this
> missing
> > > > > > > information is not generally feasible.
> > > > > > >
> > > > > > >
> > > > > > > Most of the standard metrics listed in the KIP are data points
> > that
> > > > the
> > > > > > > broker does not have.
> > > > > > > Only a small number of metrics are duplicates (like the request
> > > > counts
> > > > > > and
> > > > > > > sizes), but they are included
> > > > > > > to ease correlation when inspecting these client metrics.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Moreover this is a brand new feature so it's even harder to
> > justify
> > > > > > > > enabling it and forcing onto all our users. If disabled by
> > default,
> > > > > > > > it's relatively easy to enable in a new release if we decide
> > to,
> > > > but
> > > > > > > > once enabled by default it's much harder to disable. Also
> this
> > > > > feature
> > > > > > > > will apply to all future metrics we will add.
> > > > > > > >
> > > > > > >
> > > > > > > I think maturity of a feature implementation should be the
> > deciding
> > > > > > factor,
> > > > > > > rather than
> > > > > > > the design of it (which this KIP is). I.e., if the
> > implementation is
> > > > > not
> > > > > > > deemed mature enough
> > > > > > > for release X.Y it will be disabled.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Overall I think it's an interesting feature but I'd prefer to
> > be
> > > > > > > > slightly defensive and see how it works in practice before
> > enabling
> > > > > it
> > > > > > > > everywhere.
> > > > > > > >
> > > > > > >
> > > > > > > Right, and I agree on being defensive, but since this feature
> > still
> > > > > > > requires manual
> > > > > > > enabling on the brokers before actually being used, I think
> that
> > > > gives
> > > > > > > enough control
> > > > > > > to opt-in or out of this feature as needed.
> > > > > > >
> > > > > > > Thanks for your comments!
> > > > > > >
> > > > > > > Regards,
> > > > > > > Magnus
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Mickael
> > > > > > > >
> > > > > > > > On Mon, Nov 8, 2021 at 7:55 AM Magnus Edenhill <
> > mag...@edenhill.se
> > > > >
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Thanks David for pointing this out,
> > > > > > > > > I've updated the KIP to include client_id as a matching
> > selector.
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Magnus
> > > > > > > > >
> > > > > > > > > Den tors 4 nov. 2021 kl 18:01 skrev David Mao
> > > > > > > <d...@confluent.io.invalid
> > > > > > > > >:
> > > > > > > > >
> > > > > > > > > > Hey Magnus,
> > > > > > > > > >
> > > > > > > > > > I noticed that the KIP outlines the initial selectors
> > supported
> > > > > as:
> > > > > > > > > >
> > > > > > > > > >    - client_instance_id - CLIENT_INSTANCE_ID UUID string
> > > > > > > > representation.
> > > > > > > > > >    - client_software_name  - client software
> implementation
> > > > name.
> > > > > > > > > >    - client_software_version  - client software
> > implementation
> > > > > > > version.
> > > > > > > > > >
> > > > > > > > > > In the given reactive monitoring workflow, we mention
> that
> > the
> > > > > > > > application
> > > > > > > > > > user does not know their client's client instance ID, but
> > it's
> > > > > > > outlined
> > > > > > > > > > that the operator can add a metrics subscription
> selecting
> > for
> > > > > > > > clientId. I
> > > > > > > > > > don't see clientId as one of the supported selectors.
> > > > > > > > > > I can see how this would have made sense in a previous
> > > > iteration
> > > > > > > given
> > > > > > > > that
> > > > > > > > > > the previous client instance ID proposal was to construct
> > the
> > > > > > client
> > > > > > > > > > instance ID using clientId as a prefix. Now that the
> client
> > > > > > instance
> > > > > > > > ID is
> > > > > > > > > > a UUID, would we want to add clientId as a supported
> > selector?
> > > > > > > > > > Let me know what you think.
> > > > > > > > > >
> > > > > > > > > > David
> > > > > > > > > >
> > > > > > > > > > On Tue, Oct 19, 2021 at 12:33 PM Magnus Edenhill <
> > > > > > mag...@edenhill.se
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Mickael!
> > > > > > > > > > >
> > > > > > > > > > > Den tis 19 okt. 2021 kl 19:30 skrev Mickael Maison <
> > > > > > > > > > > mickael.mai...@gmail.com
> > > > > > > > > > > >:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Magnus,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for the proposal.
> > > > > > > > > > > >
> > > > > > > > > > > > 1. Looking at the protocol section, isn't
> > > > "ClientInstanceId"
> > > > > > > > expected
> > > > > > > > > > > > to be a field in GetTelemetrySubscriptionsResponseV0?
> > > > > > Otherwise,
> > > > > > > > how
> > > > > > > > > > > > does a client retrieve this value?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Good catch, it got removed by mistake in one of the
> > edits.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > 2. In the client API section, you mention a new
> method
> > > > > > > > > > > > "clientInstanceId()". Can you clarify which
> interfaces
> > are
> > > > > > > > affected?
> > > > > > > > > > > > Is it only Consumer and Producer?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > And Admin. Will update the KIP.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > 3. I'm a bit concerned this is enabled by default.
> > Even if
> > > > > the
> > > > > > > data
> > > > > > > > > > > > collected is supposed to be not sensitive, I think
> > this can
> > > > > be
> > > > > > > > > > > > problematic in some environments. Also users don't
> > seem to
> > > > > have
> > > > > > > the
> > > > > > > > > > > > choice to only expose some metrics. Knowing how much
> > data
> > > > > > transit
> > > > > > > > > > > > through some applications can be considered critical.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > The broker already knows how much data transits through
> > the
> > > > > > client
> > > > > > > > > > though,
> > > > > > > > > > > right?
> > > > > > > > > > > Care has been taken not to expose information in the
> > standard
> > > > > > > metrics
> > > > > > > > > > that
> > > > > > > > > > > might
> > > > > > > > > > > reveal sensitive information.
> > > > > > > > > > >
> > > > > > > > > > > Do you have an example of how the proposed metrics
> could
> > leak
> > > > > > > > sensitive
> > > > > > > > > > > information?
> > > > > > > > > > > As for limiting the what metrics to export; I guess
> that
> > > > could
> > > > > > make
> > > > > > > > sense
> > > > > > > > > > > in some
> > > > > > > > > > > very sensitive use-cases, but those users might disable
> > > > metrics
> > > > > > > > > > altogether
> > > > > > > > > > > for now.
> > > > > > > > > > > Could these concerns be addressed by a later KIP?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > 4. As a user, how do you know if your application is
> > > > actively
> > > > > > > > sending
> > > > > > > > > > > > metrics? Are there new metrics exposing what's going
> > on,
> > > > like
> > > > > > how
> > > > > > > > much
> > > > > > > > > > > > data is being sent?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > That's a good question.
> > > > > > > > > > > Since the proposed metrics interface is not aimed at,
> or
> > > > > directly
> > > > > > > > > > available
> > > > > > > > > > > to, the application
> > > > > > > > > > > I guess there's little point of adding it here, but
> > instead
> > > > > > adding
> > > > > > > > > > > something to the
> > > > > > > > > > > existing JMX metrics?
> > > > > > > > > > > Do you have any suggestions?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > 5. If all metrics are enabled on a regular Consumer
> or
> > > > > > Producer,
> > > > > > > do
> > > > > > > > > > > > you have an idea how much throughput this would use?
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > It depends on the number of partition/topics/etc the
> > client
> > > > is
> > > > > > > > producing
> > > > > > > > > > > to/consuming from.
> > > > > > > > > > > I'll add some sizes to the KIP for some typical
> > use-cases.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Magnus
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > Thanks
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Oct 19, 2021 at 5:06 PM Magnus Edenhill <
> > > > > > > > mag...@edenhill.se>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Den tis 19 okt. 2021 kl 13:22 skrev Tom Bentley <
> > > > > > > > tbent...@redhat.com
> > > > > > > > > > >:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Magnus,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I reviewed the KIP since you called the vote
> > (sorry for
> > > > > not
> > > > > > > > > > reviewing
> > > > > > > > > > > > when
> > > > > > > > > > > > > > you announced your intention to call the vote). I
> > have
> > > > a
> > > > > > few
> > > > > > > > > > > questions
> > > > > > > > > > > > on
> > > > > > > > > > > > > > some of the details.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1. There's no Javadoc on
> > ClientTelemetryPayload.data(),
> > > > > so
> > > > > > I
> > > > > > > > don't
> > > > > > > > > > > know
> > > > > > > > > > > > > > whether the payload is exposed through this
> method
> > as
> > > > > > > > compressed or
> > > > > > > > > > > > not.
> > > > > > > > > > > > > > Later on you say "Decompression of the payloads
> > will be
> > > > > > > > handled by
> > > > > > > > > > > the
> > > > > > > > > > > > > > broker metrics plugin, the broker should expose a
> > > > > suitable
> > > > > > > > > > > > decompression
> > > > > > > > > > > > > > API to the metrics plugin for this purpose.",
> which
> > > > > > suggests
> > > > > > > > it's
> > > > > > > > > > the
> > > > > > > > > > > > > > compressed data in the buffer, but then we don't
> > know
> > > > > which
> > > > > > > > codec
> > > > > > > > > > was
> > > > > > > > > > > > used,
> > > > > > > > > > > > > > nor the API via which the plugin should
> decompress
> > it
> > > > if
> > > > > > > > required
> > > > > > > > > > for
> > > > > > > > > > > > > > forwarding to the ultimate metrics store. Should
> > the
> > > > > > > > > > > > ClientTelemetryPayload
> > > > > > > > > > > > > > expose a method to get the compression and a
> > > > > decompressor?
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Good point, updated.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > 2. The client-side API is expressed as
> > StringOrError
> > > > > > > > > > > > > > ClientInstance::ClientInstanceId(int
> timeout_ms). I
> > > > > > > understand
> > > > > > > > that
> > > > > > > > > > > > you're
> > > > > > > > > > > > > > thinking about the librdkafka implementation, but
> > it
> > > > > would
> > > > > > be
> > > > > > > > good
> > > > > > > > > > to
> > > > > > > > > > > > show
> > > > > > > > > > > > > > the API as it would appear on the Apache Kafka
> > clients.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > This was meant as pseudo-code, but I changed it to
> > Java.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > 3. "PushTelemetryRequest|Response - protocol
> > request
> > > > used
> > > > > > by
> > > > > > > > the
> > > > > > > > > > > > client to
> > > > > > > > > > > > > > send metrics to any broker it is connected to."
> To
> > be
> > > > > > clear,
> > > > > > > > this
> > > > > > > > > > > means
> > > > > > > > > > > > > > that the client can choose any of the connected
> > brokers
> > > > > and
> > > > > > > > push to
> > > > > > > > > > > > just
> > > > > > > > > > > > > > one of them? What should a supporting client do
> if
> > it
> > > > > gets
> > > > > > an
> > > > > > > > error
> > > > > > > > > > > > when
> > > > > > > > > > > > > > pushing metrics to a broker, retry sending to the
> > same
> > > > > > broker
> > > > > > > > or
> > > > > > > > > > try
> > > > > > > > > > > > > > pushing to another broker, or drop the metrics?
> > Should
> > > > > > > > supporting
> > > > > > > > > > > > clients
> > > > > > > > > > > > > > send successive requests to a single broker, or
> > round
> > > > > > robin,
> > > > > > > > or is
> > > > > > > > > > > > that up
> > > > > > > > > > > > > > to the client author? I'm guessing the behaviour
> > should
> > > > > be
> > > > > > > > sticky
> > > > > > > > > > to
> > > > > > > > > > > > > > support the rate limiting features, but I think
> it
> > > > would
> > > > > be
> > > > > > > > good
> > > > > > > > > > for
> > > > > > > > > > > > client
> > > > > > > > > > > > > > authors if this section were explicit on the
> > > > recommended
> > > > > > > > behaviour.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > You are right, I've updated the KIP to make this
> > clearer.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > 4. "Mapping the client instance id to an actual
> > > > > application
> > > > > > > > > > instance
> > > > > > > > > > > > > > running on a (virtual) machine can be done by
> > > > inspecting
> > > > > > the
> > > > > > > > > > metrics
> > > > > > > > > > > > > > resource labels, such as the client source
> address
> > and
> > > > > > source
> > > > > > > > port,
> > > > > > > > > > > or
> > > > > > > > > > > > > > security principal, all of which are added by the
> > > > > receiving
> > > > > > > > broker.
> > > > > > > > > > > > This
> > > > > > > > > > > > > > will allow the operator together with the user to
> > > > > identify
> > > > > > > the
> > > > > > > > > > actual
> > > > > > > > > > > > > > application instance." Is this really always
> true?
> > The
> > > > > > source
> > > > > > > > IP
> > > > > > > > > > and
> > > > > > > > > > > > port
> > > > > > > > > > > > > > might be a loadbalancer/proxy in some setups. The
> > > > > > principal,
> > > > > > > as
> > > > > > > > > > > already
> > > > > > > > > > > > > > mentioned in the KIP, might be shared between
> > multiple
> > > > > > > > > > applications.
> > > > > > > > > > > > So at
> > > > > > > > > > > > > > worst the organization running the clients might
> > have
> > > > to
> > > > > > > > consult
> > > > > > > > > > the
> > > > > > > > > > > > logs
> > > > > > > > > > > > > > of a set of client applications, right?
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yes, that's correct. There's no guaranteed mapping
> > from
> > > > > > > > > > > > client_instance_id
> > > > > > > > > > > > > to
> > > > > > > > > > > > > an actual instance, that's why the KIP recommends
> > client
> > > > > > > > > > > implementations
> > > > > > > > > > > > to
> > > > > > > > > > > > > log the client instance id
> > > > > > > > > > > > > upon retrieval, and also provide an API for the
> > > > application
> > > > > > to
> > > > > > > > > > retrieve
> > > > > > > > > > > > the
> > > > > > > > > > > > > instance id programmatically
> > > > > > > > > > > > > if it has a better way of exposing it.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > 5. "Tests indicate that a compression ratio up to
> > 10x is
> > > > > > > > possible for
> > > > > > > > > > > the
> > > > > > > > > > > > > > standard metrics." Client authors might
> appreciate
> > your
> > > > > > > > mentioning
> > > > > > > > > > > > which
> > > > > > > > > > > > > > compression codec got these results.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Good point. Updated.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > 6. "Should the client send a push request prior
> to
> > > > expiry
> > > > > > of
> > > > > > > > the
> > > > > > > > > > > > previously
> > > > > > > > > > > > > > calculated PushIntervalMs the broker will discard
> > the
> > > > > > metrics
> > > > > > > > and
> > > > > > > > > > > > return a
> > > > > > > > > > > > > > PushTelemetryResponse with the ErrorCode set to
> > > > > > RateLimited."
> > > > > > > > Is
> > > > > > > > > > this
> > > > > > > > > > > > > > RATE_LIMITED a new error code? It's not mentioned
> > in
> > > > the
> > > > > > "New
> > > > > > > > Error
> > > > > > > > > > > > Codes"
> > > > > > > > > > > > > > section.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > That's a leftover, it should be using the standard
> > > > > > ThrottleTime
> > > > > > > > > > > > mechanism.
> > > > > > > > > > > > > Fixed.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > 7. In the section "Standard client resource
> labels"
> > > > > > > > application_id
> > > > > > > > > > is
> > > > > > > > > > > > > > described as Kafka Streams only, but the section
> of
> > > > > "Client
> > > > > > > > > > > > Identification"
> > > > > > > > > > > > > > talks about "application instance id as an
> optional
> > > > > future
> > > > > > > > > > > nice-to-have
> > > > > > > > > > > > > > that may be included as a metrics label if it has
> > been
> > > > > set
> > > > > > by
> > > > > > > > the
> > > > > > > > > > > > user", so
> > > > > > > > > > > > > > I'm confused whether non-Kafka Streams clients
> > should
> > > > set
> > > > > > an
> > > > > > > > > > > > application_id
> > > > > > > > > > > > > > or not.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'll clarify this in the KIP, but basically we
> would
> > need
> > > > > to
> > > > > > > add
> > > > > > > > an `
> > > > > > > > > > > > > application.id` config
> > > > > > > > > > > > > property for non-streams clients for this purpose,
> > and
> > > > > that's
> > > > > > > > outside
> > > > > > > > > > > the
> > > > > > > > > > > > > scope of this KIP since we want to make it
> > zero-conf:ish
> > > > on
> > > > > > the
> > > > > > > > > > client
> > > > > > > > > > > > side.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Kind regards,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Tom
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for the review,
> > > > > > > > > > > > > Magnus
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Thu, Oct 7, 2021 at 5:26 PM Magnus Edenhill <
> > > > > > > > mag...@edenhill.se
> > > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I've updated the KIP following our recent
> > discussions
> > > > > on
> > > > > > > the
> > > > > > > > > > > mailing
> > > > > > > > > > > > > > list:
> > > > > > > > > > > > > > >  - split the protocol in two, one for getting
> the
> > > > > metrics
> > > > > > > > > > > > subscriptions,
> > > > > > > > > > > > > > > and one for pushing the metrics.
> > > > > > > > > > > > > > >  - simplifications: initially only one
> supported
> > > > > metrics
> > > > > > > > format,
> > > > > > > > > > no
> > > > > > > > > > > > > > > client.id in the instance id, etc.
> > > > > > > > > > > > > > >  - made CLIENT_METRICS subscription
> configuration
> > > > > entries
> > > > > > > > more
> > > > > > > > > > > > structured
> > > > > > > > > > > > > > >    and allowing better client matching
> selectors
> > (not
> > > > > > only
> > > > > > > > on the
> > > > > > > > > > > > > > instance
> > > > > > > > > > > > > > > id, but also the other
> > > > > > > > > > > > > > >    client resource labels, such as
> > > > > client_software_name,
> > > > > > > > etc.).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Unless there are further comments I'll call the
> > vote
> > > > > in a
> > > > > > > > day or
> > > > > > > > > > > two.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > Magnus
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Den mån 4 okt. 2021 kl 20:57 skrev Magnus
> > Edenhill <
> > > > > > > > > > > > mag...@edenhill.se>:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi Gwen,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I'm finishing up the KIP based on the last
> > couple
> > > > of
> > > > > > > > discussion
> > > > > > > > > > > > points
> > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > this thread
> > > > > > > > > > > > > > > > and will call the Vote later this week.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > Magnus
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Den lör 2 okt. 2021 kl 02:01 skrev Gwen
> Shapira
> > > > > > > > > > > > > > > <g...@confluent.io.invalid
> > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> Hey,
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >> I noticed that there was no discussion for
> the
> > > > last
> > > > > 10
> > > > > > > > days,
> > > > > > > > > > > but I
> > > > > > > > > > > > > > > >> couldn't
> > > > > > > > > > > > > > > >> find the vote thread. Is there one that I'm
> > > > missing?
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >> Gwen
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >> On Wed, Sep 22, 2021 at 4:58 AM Magnus
> > Edenhill <
> > > > > > > > > > > > mag...@edenhill.se>
> > > > > > > > > > > > > > > >> wrote:
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >> > Den tis 21 sep. 2021 kl 06:58 skrev Colin
> > > > McCabe <
> > > > > > > > > > > > > > cmcc...@apache.org
> > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > > On Mon, Sep 20, 2021, at 17:35, Feng Min
> > > > wrote:
> > > > > > > > > > > > > > > >> > > > Thanks Magnus & Colin for the
> > discussion.
> > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > >> > > > Based on KIP-714's stateless design,
> > Client
> > > > > can
> > > > > > > > pretty
> > > > > > > > > > > much
> > > > > > > > > > > > use
> > > > > > > > > > > > > > > any
> > > > > > > > > > > > > > > >> > > > connection to any broker to send
> > metrics. We
> > > > > are
> > > > > > > not
> > > > > > > > > > > > associating
> > > > > > > > > > > > > > > >> > > connection
> > > > > > > > > > > > > > > >> > > > with client metric state. Is my
> > > > understanding
> > > > > > > > correct?
> > > > > > > > > > If
> > > > > > > > > > > > yes,
> > > > > > > > > > > > > > > how
> > > > > > > > > > > > > > > >> > about
> > > > > > > > > > > > > > > >> > > > the following two scenarios
> > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > >> > > > 1) One Client (Client-ID) registers
> two
> > > > > > different
> > > > > > > > client
> > > > > > > > > > > > > > instance
> > > > > > > > > > > > > > > id
> > > > > > > > > > > > > > > >> > via
> > > > > > > > > > > > > > > >> > > > separate registration. Is it
> permitted?
> > If
> > > > OK,
> > > > > > how
> > > > > > > > to
> > > > > > > > > > > > > > distinguish
> > > > > > > > > > > > > > > >> them
> > > > > > > > > > > > > > > >> > > from
> > > > > > > > > > > > > > > >> > > > the case 2 below.
> > > > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > >> > > Hi Feng,
> > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > >> > > My understanding, which Magnus can
> > clarify I
> > > > > > guess,
> > > > > > > is
> > > > > > > > > > that
> > > > > > > > > > > > you
> > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > >> > have
> > > > > > > > > > > > > > > >> > > something like two Producer instances
> > running
> > > > > with
> > > > > > > the
> > > > > > > > > > same
> > > > > > > > > > > > > > > client.id
> > > > > > > > > > > > > > > >> > > (perhaps because they're using the same
> > config
> > > > > > file,
> > > > > > > > for
> > > > > > > > > > > > example).
> > > > > > > > > > > > > > > >> They
> > > > > > > > > > > > > > > >> > > could even be in the same process. But
> > they
> > > > > would
> > > > > > > get
> > > > > > > > > > > separate
> > > > > > > > > > > > > > > UUIDs.
> > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > >> > > I believe Magnus used the term client to
> > mean
> > > > > > > > "Producer or
> > > > > > > > > > > > > > > Consumer".
> > > > > > > > > > > > > > > >> So
> > > > > > > > > > > > > > > >> > > if you have both a Producer and a
> > Consumer in
> > > > > your
> > > > > > > > > > > > application I
> > > > > > > > > > > > > > > would
> > > > > > > > > > > > > > > >> > > expect you'd get separate UUIDs for
> both.
> > > > Again
> > > > > > > > Magnus can
> > > > > > > > > > > > chime
> > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > >> > here, I
> > > > > > > > > > > > > > > >> > > guess.
> > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > That's correct.
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > >> > > > 2) How about the client restarting?
> > What's
> > > > the
> > > > > > > > > > > expectation?
> > > > > > > > > > > > > > Should
> > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > >> > > > server expect the client to carry a
> > > > persisted
> > > > > > > client
> > > > > > > > > > > > instance id
> > > > > > > > > > > > > > > or
> > > > > > > > > > > > > > > >> > > should
> > > > > > > > > > > > > > > >> > > > the client be treated as a new
> instance?
> > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > >> > > The KIP doesn't describe any mechanism
> for
> > > > > > > > persistence,
> > > > > > > > > > so I
> > > > > > > > > > > > would
> > > > > > > > > > > > > > > >> assume
> > > > > > > > > > > > > > > >> > > that when you restart the client you get
> > a new
> > > > > > > UUID. I
> > > > > > > > > > agree
> > > > > > > > > > > > that
> > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > >> > would
> > > > > > > > > > > > > > > >> > > be good to spell this out.
> > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > > >> > Right, it will not be persisted since a
> > client
> > > > > > > instance
> > > > > > > > > > can't
> > > > > > > > > > > be
> > > > > > > > > > > > > > > >> restarted.
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > Will update the KIP to make this clearer.
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >> > /Magnus
> > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >> --
> > > > > > > > > > > > > > > >> Gwen Shapira
> > > > > > > > > > > > > > > >> Engineering Manager | Confluent
> > > > > > > > > > > > > > > >> 650.450.2760 | @gwenshap
> > > > > > > > > > > > > > > >> Follow us: Twitter | blog
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to