It would be nice to get the unknown api workaround into 0.9, even if we can't have ProtocolVersionRequest. It should be a small change, and it allows clients going forward to detect when they have connected to an old broker, which lets them surface a helpful error to the user. This is much better than retrying indefinitely, which is probably how most current clients handle this case.
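
To make the intent concrete, here is a rough sketch of the kind of check a client could add once the workaround is in place (purely illustrative Java, not actual client code; the class and method names are made up):

import java.nio.ByteBuffer;

public class UnsupportedApiCheck {

    // Under the proposed workaround, a broker that does not recognize a
    // request's API key/version answers with an empty body, which the client
    // can take to mean "this request version is not supported".
    public static boolean isUnsupportedApi(ByteBuffer responseBody) {
        return responseBody != null && !responseBody.hasRemaining();
    }

    public static void main(String[] args) {
        ByteBuffer body = ByteBuffer.allocate(0); // what an old broker would send back
        if (isUnsupportedApi(body)) {
            // Surface a clear error rather than retrying indefinitely.
            System.err.println("Broker does not support this request version; a broker upgrade is required.");
        }
    }
}
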
-Jason

On Tue, Oct 6, 2015 at 2:25 PM, Magnus Edenhill <mag...@edenhill.se> wrote:
> After today's KIP call we decided on the following regarding KIP-35:
> * limit scope to just propagate supported API versions (no key-value tags, broker info, etc)
> * let the new API return the full list of broker's supported ApiKeys and ApiVersions, rather than an aggregated global version
> * ApiVersions are sorted in order of preference
> * rename API from BrokerMetadataRequest to ProtocolVersionRequest
> * workaround for disconnect-on-unknown-api-request remains valid.
>
> The wiki page has been updated accordingly:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
>
> Thanks,
> Magnus
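
(To make sure I'm reading the outcome above correctly: the renamed API would presumably end up looking something like the sketch below. The field types and the empty request body are my guess; the wiki page is authoritative.)

ProtocolVersionRequest =>            // my guess: no fields needed

ProtocolVersionResponse => [ApiKey [ApiVersion]]
  ApiKey => int16      // request type
  ApiVersion => int16  // versions of ApiKey supported by the broker, sorted in order of preference
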
> 2015-10-06 17:28 GMT+02:00 Joel Koshy <jjkosh...@gmail.com>:
> > Thanks for the write-up and discussion. This would have been super useful in our last round of deployments at LinkedIn for which we ended up having to hack around a number of incompatibilities. I could list all of the compatibility issues that have hit us, but some are irrelevant to this specific KIP (e.g., ZooKeeper registration versions). So I should perhaps just list two that I think are relevant:
> >
> > First, is the way that our metrics collection works. We have a special metrics producer on every service that emits to a separate metrics cluster. Kafka brokers also use this producer to emit to the (separate) metrics cluster. So when we upgrade our test clusters, the metrics producer in those clusters end up sending the latest produce request version to the yet to be upgraded metrics cluster. This caused an issue for us in the last round of deployments which bumped up the protocol version for the quota-related throttle-time response field. We got around that by just setting the metrics producer requiredAcks to zero (since the error occurs on parsing the response - and the old broker fortunately did not check the request version).
> >
> > Second, the inter-broker protocol versioning scheme works fine across official Apache releases but we picked up intermediate versions that contained some request version bumps, and then follow-up versions that picked up some more request bumps. For people deploying off trunk, protocol version lookup would help.
> >
> > General comments on the discussion and KIP:
> >
> > I like Grant's suggestion on using this to avoid the explicit inter-broker-protocol-version - this will not only help address the second compatibility issue above, but I'm all for anything that eliminates an hour of config deployment (our deployments can take that long!)
> >
> > +1 on explicit response fields vs. key-value pairs - I don't see this reflected on the wiki though.
> >
> > Aggregate protocol version vs specific request version: so you are associating an increasing aggregate version (for each request version bump). It may be useful to allow look up of the supported version (or version range) for each request type. The BrokerMetadataResponse could alternately return a vector of supported version ranges for each request type.
> >
> > Error response for unrecognized request versions: One option raised in the discussion was to always include the highest supported version of that request type in the response, but it may be worthwhile avoiding that (since it is irrelevant most of the time) and fold that into the BrokerMetadataRequest instead.
> >
> > Max-message size/compression-codec: I actually prefer having this only in TopicMetadataResponse and leave it out of the BrokerMetadataRequest/Response (even for the defaults) since these are really topic-specific fields. Rack-info on the other hand should probably be there (at some point) in the BrokerMetadataResponse, and this should perhaps be just a raw string that would need some pluggable (deployment-specific) parsing.
> >
> > Thanks,
> >
> > Joel
> >
> > On Wed, Sep 30, 2015 at 3:18 PM, Magnus Edenhill <mag...@edenhill.se> wrote:
> > > Everyone, thanks for your comments and input this far, here follows an update of the proposal based on the discussions.
> > >
> > > BrokerMetadataRequest => [NodeId]
> > >   NodeId => int32   // Request Metadata for these brokers only.
> > >                     // Empty array: retrieve for all brokers.
> > >                     // Use -1 for current broker only.
> > >
> > > BrokerMetadataResponse => [BrokerId Host Port ProtocolVersionMin ProtocolVersionMax [Key Value]]
> > >   NodeId => int32              // Broker NodeId
> > >   Host => string               // Broker Host
> > >   Port => int32                // Broker Port
> > >   ProtocolVersionMin => int32  // Broker's minimum supported protocol version
> > >   ProtocolVersionMax => int32  // Broker's maximum supported protocol version
> > >   Key => string                // Tag name
> > >   Value => string              // Tag value
> > >
> > > Builtin tags:
> > >   "broker.id" = "9"
> > >   "broker.version" = "0.9.0.0-SNAPSHOT-d12ca4f"
> > >   "broker.version.int" = "0x00090000"
> > >   "compression.codecs" = "gzip,snappy,lz4"
> > >   "message.max.bytes" = "1000000"
> > >   "message.formats" = "v1,v2"   // KIP-31
> > >   "endpoints" = "plaintext://host:9092,ssl://host:9192"
> > >
> > > These are all documented, including their value format and how to parse it.
> > >
> > > The "broker.id" has multiple purposes:
> > > * allows upgrading the bootstrap broker connection to a proper one since the broker_id is initially not known, but would be with this.
> > > * verifying that the broker connected to is actually the broker id that was learnt through TopicMetadata.
> > >
> > > The BrokerMetadata may be used in broker-broker communication during upgrades to decide on a common protocol version between brokers with different versions.
> > >
> > > User-provided tags (server.properties), examples:
> > >   "aws.zone" = "eu-central-1"
> > >   "rack" = "r8a9"
> > >   "cluster" = "kafka3"
> > >
> > > User provided tags are configured through the broker configuration file, server.properties, e.g.:
> > >   tag.aws.zone = eu-central-1
> > >   tag.rack = r8a9
> > >   tag.cluster = kafka3
> > >
> > > /Magnus
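
(As an aside, here is roughly how a client might use the ProtocolVersionMin/ProtocolVersionMax fields above to pick a version, or fail fast when the ranges don't overlap. Illustrative Java only; the names are made up.)

public class ProtocolVersionPick {

    // Returns the highest protocol version supported by both sides, or -1 if
    // the ranges do not overlap (the client should then surface a clear error).
    public static int pick(int clientMin, int clientMax, int brokerMin, int brokerMax) {
        int candidate = Math.min(clientMax, brokerMax);
        return candidate >= Math.max(clientMin, brokerMin) ? candidate : -1;
    }

    public static void main(String[] args) {
        // e.g. the client supports versions 0..3 and the broker reports 1..2 -> use 2
        System.out.println(pick(0, 3, 1, 2));
    }
}
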
> > > 2015-10-01 0:11 GMT+02:00 Magnus Edenhill <mag...@edenhill.se>:
> > >> Very good points, Todd, totally agree.
> > >>
> > >> 2015-09-30 1:04 GMT+02:00 Todd Palino <tpal...@gmail.com>:
> > >>> We should also consider what else should be negotiated between the broker and the client as this comes together. The version is definitely first, but there are other things, such as the max message size, that should not need to be replicated on both the broker and the client. Granted, max message size has per-topic overrides as well, but that should also be considered (possibly as an addition to the topic metadata response).
> > >>>
> > >>> Ideally you never want a requirement that is enforced by the broker to be a surprise to the client, whether that's a supported version or a configuration parameter. The client should not have to know it in advance (except for the most basic of connection parameters), and even if it does have it as a configuration option, it should be able to know before it even starts running that what it has configured is in conflict with the server.
> > >>>
> > >>> -Todd
> > >>>
> > >>> On Tue, Sep 29, 2015 at 11:08 AM, Mayuresh Gharat <gharatmayures...@gmail.com> wrote:
> > >>> > Right. But there should be a max old version that the broker should support to avoid these incompatibility issues. For example, if the broker is at version X, it should be able to support the versions (clients and interbroker) till X-2. In case we have brokers and clients older than that it can send a response warning them to upgrade till X-2 minimum. The backward compatibility limit can be discussed further. This will help for rolling upgrades.
> > >>> >
> > >>> > Thanks,
> > >>> >
> > >>> > Mayuresh
> > >>> >
> > >>> > On Tue, Sep 29, 2015 at 8:25 AM, Grant Henke <ghe...@cloudera.com> wrote:
> > >>> > > If we create a protocol version negotiation api for clients, can we use it to replace or improve the ease of upgrades that break inter-broker messaging?
> > >>> > >
> > >>> > > Currently upgrades that break the wire protocol take 2 rolling restarts. The first restart we set inter.broker.protocol.version telling all brokers to communicate on the old version, and then we restart again removing the inter.broker.protocol.version. With this api the brokers could agree on a version to communicate with, and when bounced re-negotiate to the new version.
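
(For context, the two-restart procedure described above looks roughly like the sketch below today; the exact version string is only an example, check the upgrade docs for the real values. A negotiation API could let the brokers work this out themselves.)

# server.properties during the first rolling restart: new binaries are
# deployed, but brokers keep speaking the old inter-broker protocol
inter.broker.protocol.version=0.8.2

# second rolling restart: remove the line (or set it to the new version)
# so the brokers switch over to the new protocol
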
> > >>> > > On Mon, Sep 28, 2015 at 10:26 PM, Mayuresh Gharat <gharatmayures...@gmail.com> wrote:
> > >>> > > > Nice write-up.
> > >>> > > >
> > >>> > > > Just had a question, instead of returning an empty response back to the client, would it be better for the broker to return a response that gives some more info to the client regarding the min version they need to upgrade to in order to communicate with the broker.
> > >>> > > >
> > >>> > > > Thanks,
> > >>> > > >
> > >>> > > > Mayuresh
> > >>> > > >
> > >>> > > > On Mon, Sep 28, 2015 at 6:36 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:
> > >>> > > > > Thanks for the writeup. I also think having a specific protocol for client-broker version negotiation is better.
> > >>> > > > >
> > >>> > > > > I'm wondering is it better to let the broker to decide the version to use? It might have some value If brokers have preference for a particular version. Using a global version is a good approach. For the client-broker negotiation, I am thinking about something like:
> > >>> > > > >
> > >>> > > > > ProtocolSyncRequest => ClientType [ProtocolVersion]
> > >>> > > > >   ClientType => int8
> > >>> > > > >   ProtocolVersion => int16
> > >>> > > > >
> > >>> > > > > ProtocolSyncResponse => PreferredVersion
> > >>> > > > >   PreferredVersion => int16
> > >>> > > > >
> > >>> > > > > Thanks,
> > >>> > > > >
> > >>> > > > > Jiangjie (Becket) Qin
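
(If the broker gets to pick, the selection could be as simple as the sketch below: walk the broker's preference-ordered list and return the first version the client also supports. Illustrative Java with made-up values, not a proposed implementation.)

import java.util.List;

public class ProtocolSyncSketch {

    // Broker's supported versions, most preferred first (made-up values).
    private static final List<Short> BROKER_PREFERENCE = List.of((short) 2, (short) 1, (short) 0);

    // Returns the broker's most preferred version that the client also
    // supports, or -1 if there is no common version.
    public static short preferredVersion(List<Short> clientVersions) {
        for (short v : BROKER_PREFERENCE) {
            if (clientVersions.contains(v)) {
                return v;
            }
        }
        return (short) -1;
    }

    public static void main(String[] args) {
        // Client supports versions 0 and 1 -> the broker picks 1.
        System.out.println(preferredVersion(List.of((short) 0, (short) 1)));
    }
}
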
> > >>> > > > > On Mon, Sep 28, 2015 at 11:59 AM, Jun Rao <j...@confluent.io> wrote:
> > >>> > > > > > I agree with Ewen that having the keys explicitly specified in the response is better.
> > >>> > > > > >
> > >>> > > > > > In addition to the supported protocol version, there are other interesting metadata at the broker level that could be of interest to things like admin tools (e.g., used disk space, remaining disk space, etc). I am wondering if we should separate those into different requests. For inquiring the protocol version, we can have a separate BrokerProtocolRequest. The response will just include the broker version and perhaps a list of supported requests and versions?
> > >>> > > > > >
> > >>> > > > > > As for sending an empty response for unrecognized requests, do you know how that is handled in other similar systems?
> > >>> > > > > >
> > >>> > > > > > Thanks,
> > >>> > > > > >
> > >>> > > > > > Jun
> > >>> > > > > >
> > >>> > > > > > On Mon, Sep 28, 2015 at 10:47 AM, Jason Gustafson <ja...@confluent.io> wrote:
> > >>> > > > > > > Having the version API can make clients more robust, so I'm in favor. One note on the addition of the "rack" field. Since this is a broker-specific setting, the client would have to query BrokerMetadata for every new broker it connects to (unless we also expose rack in TopicMetadata). This is also kind of unfortunate for admin utilities leveraging this API. It might be more convenient to allow this API to return broker metadata for the full cluster, assuming all of it could be made available in Zookeeper.
> > >>> > > > > > >
> > >>> > > > > > > As for using the empty response to indicate an incompatible API, it seems like that could work. I think some of the clients might catch response parsing exceptions and retry anyway, but that's no worse than retrying because of a disconnect in the same case.
> > >>> > > > > > >
> > >>> > > > > > > -Jason
> > >>> > > > > > >
> > >>> > > > > > > On Fri, Sep 25, 2015 at 9:34 PM, Ewen Cheslack-Postava <e...@confluent.io> wrote:
> > >>> > > > > > > > The basic functionality is definitely useful here. I'm generally in favor of exposing some info about broker versions to clients.
> > >>> > > > > > > >
> > >>> > > > > > > > I'd prefer to keep the key/values explicit. Making them extensible string/string pairs is good for avoiding unnecessary version changes in the protocol, but I think we should explicitly define the valid key/value formats in each version of the protocol. New keys can safely be ignored, but actually specifying the format of the values is important if we ever need to evolve those formats.
> > >>> > > > > > > >
> > >>> > > > > > > > I like some of the examples you've provided for returned key/value pairs and I think we should provide some of them even when the values should be obvious from the broker version.
> > >>> > > > > > > >
> > >>> > > > > > > > * broker.version - are we definitely standardizing on this versioning format? 4 digits, with each level indicating the intuitive levels of compatibility? Is there any chance we'll have a 0.10.0.0? This might seem like a trivial consideration, but after fighting versioning in different packaging systems, I'm a bit more sensitive to the annoying effects that not considering this carefully can have. We're at a particularly sensitive point as we hit .9 -> .10 or .9 -> 1.0 (!!!!).
> > >>> > > > > > > > * supported.compression.codecs - nit, but I'd like to figure out a good way to keep these as close to the actual config name as possible. In this case, the setting is compression.codec, so just "compression.codecs" seems ideal to me.
> > >>> > > > > > > > * rack: I think there's a ton of demand for something like this, but I'd really like to think through it more before exposing anything. "rack" is *very* specific to a deployment scenario. I think we're comfortable adapting the terminology, but the meaning can change drastically, even under seemingly similar deployments. For example, if you deploy in EC2, you might create instances in multiple AZs. Within an AZ, you might consider all the nodes on the same "rack". But there are also placement groups within each AZ that provide better guarantees on network performance. Are any nodes in the same AZ considered on the same rack, or do you need special guarantees to be on the same "rack"? In general, I don't think there's a generic "rack" identifier we can expose -- we'll need to do something specialized depending on different environments. This is a case where extensibility in the format is probably really useful.
> > >>> > > > > > > > * Properties like supported.compression.codecs can presumably be determined just via the broker version. Is there a good reason for including them? Perhaps explicit info >> implicit info?
> > >>> > > > > > > > * "All existing clients should be able to handle this gracefully without any alterations to the code" - which clients does this refer to? For example, will the Java clients continue to behave the same way they do today? I'm curious because empty responses seem *very* different from closing connections.
> > >>> > > > > > > >
> > >>> > > > > > > > -Ewen
> > >>> > > > > > > >
> > >>> > > > > > > > On Fri, Sep 25, 2015 at 2:53 PM, Magnus Edenhill <mag...@edenhill.se> wrote:
> > >>> > > > > > > > > Good evening,
> > >>> > > > > > > > >
> > >>> > > > > > > > > KIP-35 was created to address current and future broker-client compatibility.
> > >>> > > > > > > > >
> > >>> > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
> > >>> > > > > > > > >
> > >>> > > > > > > > > Summary:
> > >>> > > > > > > > > * allow clients to retrieve the broker's protocol version
> > >>> > > > > > > > > * make broker handle unknown protocol requests gracefully
> > >>> > > > > > > > >
> > >>> > > > > > > > > Feedback and comments welcome!
> > >>> > > > > > > > >
> > >>> > > > > > > > > Regards,
> > >>> > > > > > > > > Magnus
> > >>> > > > > > > >
> > >>> > > > > > > > --
> > >>> > > > > > > > Thanks,
> > >>> > > > > > > > Ewen
> > >>> > > >
> > >>> > > > --
> > >>> > > > -Regards,
> > >>> > > > Mayuresh R. Gharat
> > >>> > > > (862) 250-7125
> > >>> > >
> > >>> > > --
> > >>> > > Grant Henke
> > >>> > > Software Engineer | Cloudera
> > >>> > > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
> > >>> >
> > >>> > --
> > >>> > -Regards,
> > >>> > Mayuresh R. Gharat
> > >>> > (862) 250-7125