On Fri, Aug 23, 2019, at 00:07, Magnus Edenhill wrote:
> Great proposal, this feature is well overdue!
> 
> 1)
> From an operator's perspective I don't think the kafka client
> implementation name and version are sufficient,
> I also believe the application name and version are of interest.
> You could have all applications in your cluster run the same kafka client
> and version, but only one type or
> version of an application misbehave and needing to be tracked down.
Hi Magnus,

I think it might be better to leave this out of scope for now, and think about 
it in the context of more generalized request tracing.  This is a very deep 
rabbit hole, and I could easily see it delaying this KIP for a long time.  For 
example, if you have multiple Spark jobs producing to Kafka, just knowing that 
a client is being used by Spark may not be that helpful.  So maybe you want a 
third set of fields to describe the spark application ID and version, etc?  And 
then maybe that, itself, was created by some framework... etc. Probably better 
to defer this discussion for now and see how version tracking works out.

> 
> While the application and client name and version could be combined in the
> ClientName/ClientVersion fields by
> the user (e.g. like User-Agent), it would not be in a generalized format
> and hard for generic monitoring tools to parse correctly.
> 
> So I'd suggest keeping ClientName and ClientVersion as the client
> implementation name ("java" or "org.apache.kafka...") and version,
> which can't be changed by the user/app developer, and providing two
> optional fields for the application counterpart:
> ApplicationName and ApplicationVersion, which are backed by corresponding
> configuration properties (application.name, application.version).
> 
> 2)
> Do ..Name and ..Version need to be two separate fields, seeing how the two
> fields are ambigious when separated?
> If we're looking to identify unique versions, combining the two fields
> would be sufficient (e.g., "java-2.3.1", "librdkafka/1.2.0", "sarama@1.2.3")
> and perhaps easier to work with.
> The actual format or content of the name-version string is irrelevant as
> long as it identifies a unique name+version.
> 

Hmm.  Wouldn't the same arguments you made above about a combined named+version 
field being "hard for generic monitoring tools to parse correctly" apply here?  
In any case, there seems to be no reason not to just have two fields.  It 
avoids string parsing.

> 
> 3)
> As for allowed characters, will the broker fail the ApiVersionResponse if
> any of these fields contain invalid characters,
> or will the broker sanitize the strings?
> For future backwards compatibility (when the broker constraints change but
> clients are not updated) I suggest the latter.
> 

I would argue that we should be strict about the characters that we accept, and 
just close the connection if the string is bad.  There's no reason to let 
clients troll us with "librdkafka  " (two spaces at the end) or a version 
string with slashes or control characters in it.  In fact, I would argue we 
should just allow something like ([\.][a-z][A-Z][0-9])+  This ensures that JMX 
will work well.  We can always loosen the restrictions later if there is a real 
need.

> 4)
> And while we're at it, can we add the broker name and version to the
> ApiVersionResponse?
> While an application must not use this information to detect features (Hi
> Jay!), it is good for troubleshooting
> and providing more meaningful logs to the client user in case a feature
> (based on the actual api versions) is not available.

I can think of several cases where people tried to set up client-side hacks 
based on the broker version, and were only stopped by the fact that we don't 
expose this information.  I agree with Jay that we should think very carefully 
before exposing it.  In any case, this seems out of scope...

best,
Colin

> 
> /Magnus
> 
> 
> Den tors 22 aug. 2019 kl 10:09 skrev David Jacot <dja...@confluent.io>:
> 
> > Hi Satish,
> >
> > Thank you for your feedback!
> >
> > Please find my answers below.
> >
> > >> Did you consider taking version property by loading
> > “kafka/kafka-version.properties” as a resource while java client is
> > initialized?  “kafka/kafka-version.properties” is shipped with
> > kafka-clients jar.
> >
> > I wasn't aware of the property file. It is exactly what I need. Thanks for
> > pointing that out!
> >
> > >> I assume this metric value will be the total no of clients connected
> > to a broker irrespective of whether name and version follow the
> > expected pattern ([-.\w]+) or not.
> >
> > That is correct.
> >
> > >> It seems client name and client version are treated as tags for
> > `ConnectedClients` metric. If so, you may implement this metric
> > similar to `BrokerTopicMetrics` with topic tag as mentioned here[1].
> > When is the metric removed for a specific client-name and
> > client-version?
> >
> > That is correct. Client name and version are treated as tags like in
> > BrokerTopicMetrics. My plan is to remove the metric when it goes
> > back to zero - when all clients with a given name & version are
> > disconnected.
> >
> > Best,
> > David
> >
> > On Wed, Aug 21, 2019 at 6:52 PM Satish Duggana <satish.dugg...@gmail.com>
> > wrote:
> >
> > > Hi David,
> > > Thanks for the KIP. I have a couple of questions.
> > >
> > > >> For the Java client, the idea is to define two constants in the code
> > to
> > > store its name and its version. If possible, the version will be set
> > > automatically based on metadata coming from gradle (or the repo itself)
> > to
> > > avoid having to do manual changes.
> > >
> > > Did you consider taking version property by loading
> > > “kafka/kafka-version.properties” as a resource while java client is
> > > initialized?  “kafka/kafka-version.properties” is shipped with
> > > kafka-clients jar.
> > >
> > > >> kafka.server:type=ClientMetrics,name=ConnectedClients
> > > I assume this metric value will be the total no of clients connected
> > > to a broker irrespective of whether name and version follow the
> > > expected pattern ([-.\w]+) or not.
> > >
> > > >>
> > >
> > kafka.server:type=ClientMetrics,name=ConnectedClients,clientname=([-.\w]+),clientversion=([-.\w]+)
> > > It seems client name and client version are treated as tags for
> > > `ConnectedClients` metric. If so, you may implement this metric
> > > similar to `BrokerTopicMetrics` with topic tag as mentioned here[1].
> > > When is the metric removed for a specific client-name and
> > > client-version?
> > >
> > > 1.
> > >
> > https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaRequestHandler.scala#L231
> > >
> > > Thanks,
> > > Satish.
> > >
> > >
> > >
> > >
> > > On Wed, Aug 21, 2019 at 5:33 PM David Jacot <dja...@confluent.io> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > I would like to start a discussion for KIP-511:
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-511%3A+Collect+and+Expose+Client%27s+Name+and+Version+in+the+Brokers
> > > >
> > > > Let me know what you think.
> > > >
> > > > Best,
> > > > David
> > >
> >
>

Reply via email to