Everyone, thanks for your comments and input this far, here
follows an update of the proposal based on the discussions.


 BrokerMetadataRequest => [NodeId]
   NodeId => int32   // Request Metadata for these brokers only.
                     // Empty array: retrieve for all brokers.
                     // Use -1 for current broker only.

 BrokerMetadataResponse => [BrokerId Host Port ProtocolVersionMin
ProtocolVersionMax [Key Value]]
  NodeId => int32              // Broker NodeId
  Host => string               // Broker Host
  Port => int32                // Broker Port
  ProtocolVersionMin => int32  // Broker's minimum supported protocol
version
  ProtocolVersionMax => int32  // Broker's maximum supported protocol
version
  Key => string                // Tag name
  Value => stirng              // Tag value


Builtin tags:
 "broker.id"          = "9"
 "broker.version"     = "0.9.0.0-SNAPSHOT-d12ca4f"
 "broker.version.int" = "0x00090000"
 "compression.codecs" = "gzip,snappy,lz4"
 "message.max.bytes"  = "1000000"
 "message.formats"    = "v1,v2"  // KIP-31
 "endpoints"          = "plaintext://host:9092,ssl://host:9192"

These are all documented, including their value format and how to parse it.

The "broker.id" has multiple purposes:
 * allows upgrading the bootstrap broker connection to a proper one since
the
   broker_id is initially not known, but would be with this.
 * verifying that the broker connected to is actually the broker id that
was learnt
   through TopicMetadata.


The BrokerMetadata may be used in broker-broker communication during
upgrades
to decide on a common protocol version between brokers with different
versions.



User-provided tags (server.properties), examples:
 "aws.zone"           = "eu-central-1"
 "rack"               = "r8a9"
 "cluster"            = "kafka3"

User provided tags are configured through the broker configuration file,
server.properties, e.g.:

   tag.aws.zone = eu-central-1
   tag.rack = r8a9
   tag.cluster = kafka3




/Magnus


2015-10-01 0:11 GMT+02:00 Magnus Edenhill <mag...@edenhill.se>:

>
> Very good points, Todd, totally agree.
>
>
> 2015-09-30 1:04 GMT+02:00 Todd Palino <tpal...@gmail.com>:
>
>> We should also consider what else should be negotiated between the broker
>> and the client as this comes together. The version is definitely first,
>> but
>> there are other things, such as the max message size, that should not need
>> to be replicated on both the broker and the client. Granted, max message
>> size has per-topic overrides as well, but that should also be considered
>> (possibly as an addition to the topic metadata response).
>>
>> Ideally you never want a requirement that is enforced by the broker to be
>> a
>> surprise to the client, whether that's a supported version or a
>> configuration parameter. The client should not have to know it in advance
>> (except for the most basic of connection parameters), and even if it does
>> have it as a configuration option, it should be able to know before it
>> even
>> starts running that what it has configured is in conflict with the server.
>>
>> -Todd
>>
>>
>> On Tue, Sep 29, 2015 at 11:08 AM, Mayuresh Gharat <
>> gharatmayures...@gmail.com> wrote:
>>
>> > Right. But there should be a max old version that the broker should
>> support
>> > to avoid these incompatibility issues.
>> > For example, if the broker is at version X, it should be able to support
>> > the versions (clients and interbroker) till X-2. In case we have brokers
>> > and clients older than that it can send a response warning them to
>> upgrade
>> > till X-2 minimum.
>> > The backward compatibility limit can be discussed further. This will
>> help
>> > for rolling upgrades.
>> >
>> > Thanks,
>> >
>> > Mayuresh
>> >
>> > On Tue, Sep 29, 2015 at 8:25 AM, Grant Henke <ghe...@cloudera.com>
>> wrote:
>> >
>> > > If we create a protocol version negotiation api for clients, can we
>> use
>> > it
>> > > to replace or improve the ease of upgrades that break inter-broker
>> > > messaging?
>> > >
>> > > Currently upgrades that break the wire protocol take 2 rolling
>> restarts.
>> > > The first restart we set inter.broker.protocol.version telling all
>> > brokers
>> > > to communicate on the old version, and then we restart again removing
>> the
>> > > inter.broker.protocol.version. With this api the brokers could agree
>> on a
>> > > version to communicate with, and when bounced re-negotiate to the new
>> > > version.
>> > >
>> > >
>> > > On Mon, Sep 28, 2015 at 10:26 PM, Mayuresh Gharat <
>> > > gharatmayures...@gmail.com> wrote:
>> > >
>> > > > Nice write-up.
>> > > >
>> > > > Just had a question, instead of returning an empty response back to
>> the
>> > > > client, would it be better for the broker to return a response that
>> > gives
>> > > > some more info to the client regarding the min version they need to
>> > > upgrade
>> > > > to in order to communicate with the broker.
>> > > >
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Mayuresh
>> > > >
>> > > > On Mon, Sep 28, 2015 at 6:36 PM, Jiangjie Qin
>> > <j...@linkedin.com.invalid
>> > > >
>> > > > wrote:
>> > > >
>> > > > > Thanks for the writeup. I also think having a specific protocol
>> for
>> > > > > client-broker version negotiation is better.
>> > > > >
>> > > > > I'm wondering is it better to let the broker to decide the
>> version to
>> > > > use?
>> > > > > It might have some value If brokers have preference for a
>> particular
>> > > > > version.
>> > > > > Using a global version is a good approach. For the client-broker
>> > > > > negotiation, I am thinking about something like:
>> > > > >
>> > > > > ProtocolSyncRequest => ClientType [ProtocolVersion]
>> > > > >     ClientType => int8
>> > > > >     ProtocolVersion => int16
>> > > > >
>> > > > > ProtocolSyncResponse => PreferredVersion
>> > > > >     PreferredVersion => int16
>> > > > >
>> > > > > Thanks,
>> > > > >
>> > > > > Jiangjie (Becket) Qin
>> > > > >
>> > > > > On Mon, Sep 28, 2015 at 11:59 AM, Jun Rao <j...@confluent.io>
>> wrote:
>> > > > >
>> > > > > > I agree with Ewen that having the keys explicitly specified in
>> the
>> > > > > response
>> > > > > > is better.
>> > > > > >
>> > > > > > In addition to the supported protocol version, there are other
>> > > > > interesting
>> > > > > > metadata at the broker level that could be of interest to things
>> > like
>> > > > > admin
>> > > > > > tools (e.g., used disk space, remaining disk space, etc). I am
>> > > > wondering
>> > > > > if
>> > > > > > we should separate those into different requests. For inquiring
>> the
>> > > > > > protocol version, we can have a separate BrokerProtocolRequest.
>> The
>> > > > > > response will just include the broker version and perhaps a
>> list of
>> > > > > > supported requests and versions?
>> > > > > >
>> > > > > > As for sending an empty response for unrecognized requests, do
>> you
>> > > how
>> > > > is
>> > > > > > that handled in other similar systems?
>> > > > > >
>> > > > > > Thanks,
>> > > > > >
>> > > > > > Jun
>> > > > > >
>> > > > > > On Mon, Sep 28, 2015 at 10:47 AM, Jason Gustafson <
>> > > ja...@confluent.io>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Having the version API can make clients more robust, so I'm in
>> > > favor.
>> > > > > One
>> > > > > > > note on the addition of the "rack" field. Since this is a
>> > > > > broker-specific
>> > > > > > > setting, the client would have to query BrokerMetadata for
>> every
>> > > new
>> > > > > > broker
>> > > > > > > it connects to (unless we also expose rack in TopicMetadata).
>> > This
>> > > is
>> > > > > > also
>> > > > > > > kind of unfortunate for admin utilities leveraging this API.
>> It
>> > > might
>> > > > > be
>> > > > > > > more convenient to allow this API to return broker metadata
>> for
>> > the
>> > > > > full
>> > > > > > > cluster, assuming all of it could be made available in
>> Zookeeper.
>> > > > > > >
>> > > > > > > As for using the empty response to indicate an incompatible
>> API,
>> > it
>> > > > > seems
>> > > > > > > like that could work. I think some of the clients might catch
>> > > > response
>> > > > > > > parsing exceptions and retry anyway, but that's no worse than
>> > > > retrying
>> > > > > > > because of a disconnect in the same case.
>> > > > > > >
>> > > > > > > -Jason
>> > > > > > >
>> > > > > > > On Fri, Sep 25, 2015 at 9:34 PM, Ewen Cheslack-Postava <
>> > > > > > e...@confluent.io>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > The basic functionality is definitely useful here. I'm
>> > generally
>> > > in
>> > > > > > favor
>> > > > > > > > of exposing some info about broker versions to clients.
>> > > > > > > >
>> > > > > > > > I'd prefer to keep the key/values explicit. Making them
>> > > extensible
>> > > > > > > > string/string pairs is good for avoiding unnecessary version
>> > > > changes
>> > > > > in
>> > > > > > > the
>> > > > > > > > protocol, but I think we should explicitly define the valid
>> > > > key/value
>> > > > > > > > formats in each version of the protocol. New keys can
>> safely be
>> > > > > > ignored,
>> > > > > > > > but actually specifying the format of the values is
>> important
>> > if
>> > > we
>> > > > > > ever
>> > > > > > > > need to evolve those formats.
>> > > > > > > >
>> > > > > > > > I like some of the examples you've provided for returned
>> > > key/value
>> > > > > > pairs
>> > > > > > > > and I think we should provide some of them even when the
>> values
>> > > > > should
>> > > > > > be
>> > > > > > > > obvious from the broker version.
>> > > > > > > >
>> > > > > > > > * broker.version - are we definitely standardizing on this
>> > > > versioning
>> > > > > > > > format? 4 digits, with each level indicating the intuitive
>> > levels
>> > > > of
>> > > > > > > > compatibility? Is there any chance we'll have a 0.10.0.0?
>> This
>> > > > might
>> > > > > > seem
>> > > > > > > > like a trivial consideration, but after fighting versioning
>> in
>> > > > > > different
>> > > > > > > > packaging systems, I'm a bit more sensitive to the annoying
>> > > effects
>> > > > > > that
>> > > > > > > > not considering this carefully can have. We're at a
>> > particularly
>> > > > > > > sensitive
>> > > > > > > > point as we hit .9 -> .10 or .9 -> 1.0 (!!!!).
>> > > > > > > > * supported.compression.codecs - nit, but I'd like to figure
>> > out
>> > > a
>> > > > > good
>> > > > > > > way
>> > > > > > > > to keep these as close to the actual config name as
>> possible.
>> > in
>> > > > this
>> > > > > > > case,
>> > > > > > > > the setting is compression.codec, so just
>> "compression.codecs"
>> > > > seems
>> > > > > > > ideal
>> > > > > > > > to me.
>> > > > > > > > * rack: I think there's a ton of demand for something like
>> > this,
>> > > > but
>> > > > > > I'd
>> > > > > > > > really like to think through it more before exposing
>> anything.
>> > > > "rack"
>> > > > > > is
>> > > > > > > > *very* specific to a deployment scenario. I think we're
>> > > comfortable
>> > > > > > > > adapting the terminology, but the meaning can change
>> > drastically,
>> > > > > even
>> > > > > > > > under seemingly similar deployments. For example, if you
>> deploy
>> > > in
>> > > > > EC2,
>> > > > > > > you
>> > > > > > > > might create instances in multiple AZs. Within an AZ, you
>> might
>> > > > > > consider
>> > > > > > > > all the nodes on the same "rack". But there are also
>> placement
>> > > > groups
>> > > > > > > > within each AZ that provide better guarantees on network
>> > > > performance.
>> > > > > > Are
>> > > > > > > > any nodes in the same AZ considered on the same rack or do
>> you
>> > > need
>> > > > > > > special
>> > > > > > > > guarantees to be on the same "rack". In general, I don't
>> think
>> > > > > there's
>> > > > > > a
>> > > > > > > > generic "rack" identifier we can expose -- we'll need to do
>> > > > something
>> > > > > > > > specialized depending on different environments. This is a
>> case
>> > > > where
>> > > > > > > > extensibility in the format is probably really useful.
>> > > > > > > > * Properties like supported.compression.codecs can
>> presumably
>> > be
>> > > > > > > determined
>> > > > > > > > just via the broker version. Is there a good reason for
>> > including
>> > > > > them?
>> > > > > > > > Perhaps explicit info >> implicit info?
>> > > > > > > > * "All existing clients should be able to handle this
>> > gracefully
>> > > > > > without
>> > > > > > > > any alterations to the code" - which clients does this refer
>> > to?
>> > > > For
>> > > > > > > > example, will the Java clients continue to behave the same
>> way
>> > > they
>> > > > > do
>> > > > > > > > today? I'm curious because empty responses seem *very*
>> > different
>> > > > from
>> > > > > > > > closing connections.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > -Ewen
>> > > > > > > >
>> > > > > > > > On Fri, Sep 25, 2015 at 2:53 PM, Magnus Edenhill <
>> > > > mag...@edenhill.se
>> > > > > >
>> > > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > Good evening,
>> > > > > > > > >
>> > > > > > > > > KIP-35 was created to address current and future
>> > broker-client
>> > > > > > > > > compatibility.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-35+-+Retrieving+protocol+version
>> > > > > > > > >
>> > > > > > > > > Summary:
>> > > > > > > > >  * allow clients to retrieve the broker's protocol version
>> > > > > > > > >  * make broker handle unknown protocol requests gracefully
>> > > > > > > > >
>> > > > > > > > > Feedback and comments welcome!
>> > > > > > > > >
>> > > > > > > > > Regards,
>> > > > > > > > > Magnus
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > --
>> > > > > > > > Thanks,
>> > > > > > > > Ewen
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > -Regards,
>> > > > Mayuresh R. Gharat
>> > > > (862) 250-7125
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Grant Henke
>> > > Software Engineer | Cloudera
>> > > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>> > >
>> >
>> >
>> >
>> > --
>> > -Regards,
>> > Mayuresh R. Gharat
>> > (862) 250-7125
>> >
>>
>
>

Reply via email to