Yes, the TopicMetadataResponse format is a bit weird. The main reason it's
done that way is that we don't want to return null replica objects to the
client. An alternative is to have the broker return all the replica ids and
a list of live brokers, and let the client decide what to do with replica
ids without a matching broker.

Thanks,

Jun

On Wed, Jan 14, 2015 at 6:53 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

> I agree.
>
> Also, is this behavior a good one? It seems kind of hacky to return both
> an error code and a result, no?
>
> -Jay
>
> On Wed, Jan 14, 2015 at 6:35 PM, Dana Powers <dana.pow...@rd.io> wrote:
>
> > Thanks -- I see that this was more of a bug in 0.8.1 than a regression
> > in 0.8.2.  But I do think the 0.8.2 bug fix to the metadata cache means
> > that the very common scenario of a single broker failure (and subsequent
> > partition leadership change) will now return error codes in the
> > MetadataResponse -- different from 0.8.1 -- and those errors may cause
> > pain to some users if the client doesn't know how to handle them.  The
> > "fix" for users is to upgrade client code (or verify that existing
> > client code handles this well) before upgrading to 0.8.2 in a production
> > environment.
> >
> > What would be really useful for the non-Java community is a list or
> > specification of what error codes should be expected for each API
> > response (here the MetadataResponse), along with perhaps even some
> > context-related notes on what they mean.  As it stands, the protocol
> > document leaves all of the ErrorCode documentation until the end and
> > doesn't give any context about *when* to handle each error:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-ErrorCodes
> >
> > I would volunteer to go into the wiki and help with that effort, but I
> > also feel like protocol document changes perhaps deserve a stricter
> > review process -- maybe the KIP process mentioned separately on the dev
> > list.  Maybe the protocol document itself should be versioned and
> > released with the core project.
> >
> > Nonetheless, right now the error-handling part of managing clients is
> > fairly ad-hoc and I think we should work to tighten that process up.
> >
> > Dana Powers
> > Rdio, Inc.
> > dana.pow...@rd.io
> > rdio.com/people/dpkp/
> >
> >
> > On Wed, Jan 14, 2015 at 5:43 PM, Jun Rao <j...@confluent.io> wrote:
> >
> > > Hi, Dana,
> > >
> > > Thanks for reporting this. I investigated this a bit more. What you
> > > observed is the following: a client gets a partition-level error code
> > > of ReplicaNotAvailableError in a TopicMetadataResponse when one of the
> > > replicas is offline. The short story is that this behavior can already
> > > happen in 0.8.1, although it is less likely to show up in 0.8.1 than
> > > in 0.8.2.
> > >
> > > Currently, when sending a topic metadata response, the broker only
> > > includes replicas (in either the ISR or the assigned replica set) that
> > > are alive. To indicate that a replica is missing, we set the
> > > partition-level error code to ReplicaNotAvailableError. In most cases,
> > > the client probably just cares about the leader in the response.
> > > However, this error code could be useful for some other clients (e.g.,
> > > for building admin tools). Since our Java/Scala producer/consumer
> > > clients (both 0.8.1 and 0.8.2) only care about the leader, they ignore
> > > the error code. That's why they are not affected by this behavior. The
> > > reason this behavior doesn't show up as often in 0.8.1 as in 0.8.2 is
> > > that 0.8.1 had a bug such that dead brokers were never removed from
> > > the metadata cache on the broker. That bug has since been fixed in
> > > 0.8.2. To reproduce this behavior in 0.8.1, you can do the following:
> > > (1) start 2 brokers, (2) create a topic with 1 partition and 2
> > > replicas, (3) bring down both brokers, (4) restart only 1 broker, (5)
> > > issue a TopicMetadataRequest on that topic, (6) you should see the
> > > ReplicaNotAvailableError code.
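> > >
> > > As a rough sketch of that "only care about the leader" handling (not
> > > the actual code in any client, and the names here are made up; error
> > > code 9 is what the protocol guide lists for ReplicaNotAvailable), a
> > > client could treat the partition-level error as non-fatal whenever a
> > > leader is known:
> > >
> > >   REPLICA_NOT_AVAILABLE = 9  # from the protocol guide's error table
> > >
> > >   def check_partition(error_code, leader_id):
> > >       # A replica is missing but a leader is known: the partition is
> > >       # still usable, so producing/consuming can proceed.
> > >       if error_code in (0, REPLICA_NOT_AVAILABLE) and leader_id != -1:
> > >           return
> > >       # No leader, or an unexpected error code: surface it so the
> > >       # client can refresh metadata before retrying.
> > >       raise RuntimeError(
> > >           "partition unusable, error code %d" % error_code)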
> > >
> > > So, technically, this is not a regression from 0.8.1. I agree that we
> > > should have documented this behavior more clearly. Really sorry about
> > > that.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Wed, Jan 14, 2015 at 1:14 PM, Dana Powers <dana.pow...@rd.io> wrote:
> > >
> > > > Overall the 0.8.2.0 release candidate looks really good.
> > > >
> > > > All of the kafka-python integration tests pass as they do with prior
> > > > servers, except one...  When testing recovery from a broker failure /
> > > > leader switch, we now see a ReplicaNotAvailableError in broker
> > > > metadata / PartitionMetadata, which we do not see in the same test
> > > > against previous servers.  I understand from discussion around
> > > > KAFKA-1609 and KAFKA-1649 that this behavior is expected and that
> > > > clients should ignore the error (or at least treat it as
> > > > non-critical).  But strictly speaking this is a behavior change and
> > > > could cause client issues.  Indeed, anyone using older versions of
> > > > kafka-python against this release candidate will get bad failures on
> > > > leader switch (exactly when you don't want bad client failures!).  It
> > > > may be that it is our fault for not handling this in kafka-python,
> > > > but at the least I think this needs to be flagged as a possible issue
> > > > for third-party clients.  Also, KAFKA-1649 doesn't look like it was
> > > > ever actually resolved...  The protocol document does not mention
> > > > anything about clients ignoring this error code.
> > > >
> > > > Dana Powers
> > > > Rdio, Inc.
> > > > dana.pow...@rd.io
> > > > rdio.com/people/dpkp/
> > > >
> > >
> >
>
