Hi all, I have updated the KIP to address the various comments. I have also added a section about the handling of the ApiVersionsRequest/Response in the broker.
Please, let me know what you think. I would like to make it for the next release if possible. Best, David On Fri, Aug 30, 2019 at 10:31 AM David Jacot <dja...@confluent.io> wrote: > Hi Magnus, > > Thank you for your feedback. Please, find my comments below. > > 1. I thought that the clientId was meant for this purpose (providing > information about the application). Is there a gap I don't see? > > 2. I have put two fields to avoid requiring deep validation and parsing on > the broker. I believe that it will be easier to use the metadata downstream > like this. > > 3. Good point. I think that the broker should have some sort of validation > and fail the ApiVersionRequest and/or the Connection if the validation > fails. To avoid backward compatibility issues, I think we should come up > with a minimal validation and ensure it won't become more restrictive in > the future. > > 4. I don't have strong opinion regarding this one but as the focus of the > KIP is to gather the client's information, I suggest to discuss/address > this later on. > > Best, > David > > On Thu, Aug 29, 2019 at 6:56 PM Colin McCabe <cmcc...@apache.org> wrote: > >> On Fri, Aug 23, 2019, at 00:07, Magnus Edenhill wrote: >> > Great proposal, this feature is well overdue! >> > >> > 1) >> > From an operator's perspective I don't think the kafka client >> > implementation name and version are sufficient, >> > I also believe the application name and version are of interest. >> > You could have all applications in your cluster run the same kafka >> client >> > and version, but only one type or >> > version of an application misbehave and needing to be tracked down. >> >> Hi Magnus, >> >> I think it might be better to leave this out of scope for now, and think >> about it in the context of more generalized request tracing. This is a >> very deep rabbit hole, and I could easily see it delaying this KIP for a >> long time. For example, if you have multiple Spark jobs producing to >> Kafka, just knowing that a client is being used by Spark may not be that >> helpful. So maybe you want a third set of fields to describe the spark >> application ID and version, etc? And then maybe that, itself, was created >> by some framework... etc. Probably better to defer this discussion for now >> and see how version tracking works out. >> >> > >> > While the application and client name and version could be combined in >> the >> > ClientName/ClientVersion fields by >> > the user (e.g. like User-Agent), it would not be in a generalized format >> > and hard for generic monitoring tools to parse correctly. >> > >> > So I'd suggest keeping ClientName and ClientVersion as the client >> > implementation name ("java" or "org.apache.kafka...") and version, >> > which can't be changed by the user/app developer, and providing two >> > optional fields for the application counterpart: >> > ApplicationName and ApplicationVersion, which are backed by >> corresponding >> > configuration properties (application.name, application.version). >> > >> > 2) >> > Do ..Name and ..Version need to be two separate fields, seeing how the >> two >> > fields are ambigious when separated? >> > If we're looking to identify unique versions, combining the two fields >> > would be sufficient (e.g., "java-2.3.1", "librdkafka/1.2.0", >> "sarama@1.2.3") >> > and perhaps easier to work with. >> > The actual format or content of the name-version string is irrelevant as >> > long as it identifies a unique name+version. >> > >> >> Hmm. Wouldn't the same arguments you made above about a combined >> named+version field being "hard for generic monitoring tools to parse >> correctly" apply here? In any case, there seems to be no reason not to >> just have two fields. It avoids string parsing. >> >> > >> > 3) >> > As for allowed characters, will the broker fail the ApiVersionResponse >> if >> > any of these fields contain invalid characters, >> > or will the broker sanitize the strings? >> > For future backwards compatibility (when the broker constraints change >> but >> > clients are not updated) I suggest the latter. >> > >> >> I would argue that we should be strict about the characters that we >> accept, and just close the connection if the string is bad. There's no >> reason to let clients troll us with "librdkafka " (two spaces at the end) >> or a version string with slashes or control characters in it. In fact, I >> would argue we should just allow something like ([\.][a-z][A-Z][0-9])+ >> This ensures that JMX will work well. We can always loosen the >> restrictions later if there is a real need. >> >> > 4) >> > And while we're at it, can we add the broker name and version to the >> > ApiVersionResponse? >> > While an application must not use this information to detect features >> (Hi >> > Jay!), it is good for troubleshooting >> > and providing more meaningful logs to the client user in case a feature >> > (based on the actual api versions) is not available. >> >> I can think of several cases where people tried to set up client-side >> hacks based on the broker version, and were only stopped by the fact that >> we don't expose this information. I agree with Jay that we should think >> very carefully before exposing it. In any case, this seems out of scope... >> >> best, >> Colin >> >> > >> > /Magnus >> > >> > >> > Den tors 22 aug. 2019 kl 10:09 skrev David Jacot <dja...@confluent.io>: >> > >> > > Hi Satish, >> > > >> > > Thank you for your feedback! >> > > >> > > Please find my answers below. >> > > >> > > >> Did you consider taking version property by loading >> > > “kafka/kafka-version.properties” as a resource while java client is >> > > initialized? “kafka/kafka-version.properties” is shipped with >> > > kafka-clients jar. >> > > >> > > I wasn't aware of the property file. It is exactly what I need. >> Thanks for >> > > pointing that out! >> > > >> > > >> I assume this metric value will be the total no of clients >> connected >> > > to a broker irrespective of whether name and version follow the >> > > expected pattern ([-.\w]+) or not. >> > > >> > > That is correct. >> > > >> > > >> It seems client name and client version are treated as tags for >> > > `ConnectedClients` metric. If so, you may implement this metric >> > > similar to `BrokerTopicMetrics` with topic tag as mentioned here[1]. >> > > When is the metric removed for a specific client-name and >> > > client-version? >> > > >> > > That is correct. Client name and version are treated as tags like in >> > > BrokerTopicMetrics. My plan is to remove the metric when it goes >> > > back to zero - when all clients with a given name & version are >> > > disconnected. >> > > >> > > Best, >> > > David >> > > >> > > On Wed, Aug 21, 2019 at 6:52 PM Satish Duggana < >> satish.dugg...@gmail.com> >> > > wrote: >> > > >> > > > Hi David, >> > > > Thanks for the KIP. I have a couple of questions. >> > > > >> > > > >> For the Java client, the idea is to define two constants in the >> code >> > > to >> > > > store its name and its version. If possible, the version will be set >> > > > automatically based on metadata coming from gradle (or the repo >> itself) >> > > to >> > > > avoid having to do manual changes. >> > > > >> > > > Did you consider taking version property by loading >> > > > “kafka/kafka-version.properties” as a resource while java client is >> > > > initialized? “kafka/kafka-version.properties” is shipped with >> > > > kafka-clients jar. >> > > > >> > > > >> kafka.server:type=ClientMetrics,name=ConnectedClients >> > > > I assume this metric value will be the total no of clients connected >> > > > to a broker irrespective of whether name and version follow the >> > > > expected pattern ([-.\w]+) or not. >> > > > >> > > > >> >> > > > >> > > >> kafka.server:type=ClientMetrics,name=ConnectedClients,clientname=([-.\w]+),clientversion=([-.\w]+) >> > > > It seems client name and client version are treated as tags for >> > > > `ConnectedClients` metric. If so, you may implement this metric >> > > > similar to `BrokerTopicMetrics` with topic tag as mentioned here[1]. >> > > > When is the metric removed for a specific client-name and >> > > > client-version? >> > > > >> > > > 1. >> > > > >> > > >> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaRequestHandler.scala#L231 >> > > > >> > > > Thanks, >> > > > Satish. >> > > > >> > > > >> > > > >> > > > >> > > > On Wed, Aug 21, 2019 at 5:33 PM David Jacot <dja...@confluent.io> >> wrote: >> > > > > >> > > > > Hi all, >> > > > > >> > > > > I would like to start a discussion for KIP-511: >> > > > > >> > > > >> > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-511%3A+Collect+and+Expose+Client%27s+Name+and+Version+in+the+Brokers >> > > > > >> > > > > Let me know what you think. >> > > > > >> > > > > Best, >> > > > > David >> > > > >> > > >> > >> >