Hey Jun,

I am happy to work on it for a few days if that is what it takes to discuss KIP-113. But if it takes 2+ weeks to discuss KIP-113, I am wondering if we can vote for KIP-112 first. We aim to start the JBOD test in a test cluster by the end of Q1, which is only three weeks away. If we miss that schedule, we may have to do a one-broker-per-disk setup first, which causes unnecessary trouble when migrating to JBOD later. It will greatly help me meet this schedule if we can vote for KIP-112 and get the patch reviewed while KIP-113 is under discussion.
It seems reasonable to vote for KIP-112 before KIP-113. The reason is that we have agreed on the motivation of both KIPs and we already have a compelling case for JBOD. What we need to discuss for KIP-113 is its implementation details rather than its motivation. Whatever implementation we end up with for KIP-113 should be implementable on top of KIP-112 without any backward-incompatible change. I understand that you are busy reviewing many things, so I would prefer not to ask you for a quick response on KIP-113. If KIP-112 is blocked on KIP-113, I would be under some pressure to bother you repeatedly. Allowing me to work on the KIP-112 implementation would make our schedule easier. :)

Thank you for all the time you have spent reviewing these KIPs!

Dong

On Thu, Mar 2, 2017 at 9:36 AM, Jun Rao <j...@confluent.io> wrote: > Hi, Dong, > > Ok. We can keep LeaderAndIsrRequest as it is in the wiki. > > Since we need both KIP-112 and KIP-113 to make a compelling case for JBOD, > perhaps we should discuss KIP-113 before voting for both? I left some > comments in the other thread. > > Thanks, > > Jun > > On Wed, Mar 1, 2017 at 1:58 PM, Dong Lin <lindon...@gmail.com> wrote: > > > Hey Jun, > > > > Do you think it is OK to keep the existing wire protocol in the KIP? I am > > wondering if we can initiate vote for this KIP. > > > > Thanks, > > Dong > > > > > > > > On Tue, Feb 28, 2017 at 2:41 PM, Dong Lin <lindon...@gmail.com> wrote: > > > > > Hey Jun, > > > > > > I just realized that StopReplicaRequest itself doesn't specify the > > > replicaId in the wire protocol. Thus controller would need to log the > > > brokerId with StopReplicaRequest in the log. Thus it may be > > > reasonable for controller to do the same with LeaderAndIsrRequest and > > only > > > specify the isNewReplica for the broker that receives > > LeaderAndIsrRequest. > > > > > > Thanks, > > > Dong > > > > > > On Tue, Feb 28, 2017 at 2:14 PM, Dong Lin <lindon...@gmail.com> wrote: > > > > > >> Hi Jun, > > >> > > >> Yeah there is tradeoff between controller's implementation complexity > > vs. > > >> wire-protocol complexity. I personally think it is more important to > > keep > > >> wire-protocol concise and only add information in wire-protocol if > > >> necessary. It seems fine to add a little bit complexity to > controller's > > >> implementation, e.g. log destination broker per LeaderAndIsrRequet. > > Becket > > >> also shares this opinion with me. Is the only purpose of doing so to > > make > > >> controller log simpler? > > >> > > >> And certainly, I have added Todd's comment in the wiki. > > >> > > >> Thanks, > > >> Dong > > >> > > >> > > >> On Tue, Feb 28, 2017 at 1:37 PM, Jun Rao <j...@confluent.io> wrote: > > >> > > >>> Hi, Dong, > > >>> > > >>> 52. What you suggested would work. However, I am thinking that it's > > >>> probably simpler to just set isNewReplica at the replica level. That > > way, > > >>> the LeaderAndIsrRequest can be created a bit simpler. When reading a > > >>> LeaderAndIsrRequest in the controller log, it's easier to see which > > >>> replicas are new without looking at which broker the request is > > intended > > >>> for. > > >>> > > >>> Could you also add those additional points from Todd's on 1 broker > per > > >>> disk > > >>> vs JBOD vs RAID5/6 to the KIP? > > >>> > > >>> Thanks, > > >>> > > >>> Hi, Todd, > > >>> > > >>> Thanks for the feedback. That's very useful. 
> > >>> > > >>> Jun > > >>> > > >>> On Tue, Feb 28, 2017 at 10:25 AM, Dong Lin <lindon...@gmail.com> > > wrote: > > >>> > > >>> > Hey Jun, > > >>> > > > >>> > Certainly, I have added Todd to reply to the thread. And I have > > >>> updated the > > >>> > item to in the wiki. > > >>> > > > >>> > 50. The full statement is "Broker assumes a log directory to be > good > > >>> after > > >>> > it starts, and mark log directory as bad once there is IOException > > when > > >>> > broker attempts to access (i.e. read or write) the log directory". > > This > > >>> > statement seems reasonable, right? If a log directory is actually > > bad, > > >>> then > > >>> > the broker will first assume it is OK, try to read logs on this log > > >>> > directory, encounter IOException, and then mark it as bad. > > >>> > > > >>> > 51. My bad. I thought I removed it but I didn't. It is removed now. > > >>> > > > >>> > 52. I don't think so.. The isNewReplica field in the > > >>> LeaderAndIsrRequest is > > >>> > only relevant to the replica (i.e. broker) that receives the > > >>> > LeaderAndIsrRequest. There is no need to specify whether each > replica > > >>> is > > >>> > new inside LeaderAndIsrRequest. In other words, if a broker sends > > >>> > LeaderAndIsrRequest to three different replicas of a given > partition, > > >>> the > > >>> > isNewReplica field can be different across these three requests. > > >>> > > > >>> > Yeah, I would definitely want to start discussion on KIP-113 after > we > > >>> have > > >>> > reached agreement on KIP-112. I have actually opened KIP-113 > > discussion > > >>> > thread on 1/12 together with this thread. I have yet to add the > > >>> ability to > > >>> > list offline directories in KIP-113 which we discussed in this > > thread. > > >>> > > > >>> > Thanks for all your reviews! Is there further concern with the > latest > > >>> KIP? > > >>> > > > >>> > Thanks! > > >>> > Dong > > >>> > > > >>> > On Tue, Feb 28, 2017 at 9:40 AM, Jun Rao <j...@confluent.io> wrote: > > >>> > > > >>> > > Hi, Dong, > > >>> > > > > >>> > > RAID6 is an improvement over RAID5 and can tolerate 2 disks > > failure. > > >>> > Eno's > > >>> > > point is that the rebuild of RAID5/RAID6 requires reading more > data > > >>> > > compared with RAID10, which increases the probability of error > > during > > >>> > > rebuild. This makes sense. In any case, do you think you could > ask > > >>> the > > >>> > SREs > > >>> > > at LinkedIn to share their opinions on RAID5/RAID6? > > >>> > > > > >>> > > Yes, when a replica is offline due to a bad disk, it makes sense > to > > >>> > handle > > >>> > > it immediately as if a StopReplicaRequest is received (i.e., > > replica > > >>> is > > >>> > no > > >>> > > longer considered a leader and is removed from any replica > fetcher > > >>> > thread). > > >>> > > Could you add that detail in item 2. in the wiki? > > >>> > > > > >>> > > 50. The wiki says "Broker assumes a log directory to be good > after > > it > > >>> > > starts" : A log directory actually could be bad during startup. > > >>> > > > > >>> > > 51. In item 4, the wiki says "The controller watches the path > > >>> > > /log_dir_event_notification for new znode.". This doesn't seem be > > >>> needed > > >>> > > now? > > >>> > > > > >>> > > 52. The isNewReplica field in LeaderAndIsrRequest should be for > > each > > >>> > > replica inside the replicas field, right? > > >>> > > > > >>> > > Other than those, the current KIP looks good to me. 
Do you want > to > > >>> start > > >>> > a > > >>> > > separate discussion thread on KIP-113? I do have some comments > > there. > > >>> > > > > >>> > > Thanks for working on this! > > >>> > > > > >>> > > Jun > > >>> > > > > >>> > > > > >>> > > On Mon, Feb 27, 2017 at 5:51 PM, Dong Lin <lindon...@gmail.com> > > >>> wrote: > > >>> > > > > >>> > > > Hi Jun, > > >>> > > > > > >>> > > > In addition to the Eno's reference of why rebuild time with > > RAID-5 > > >>> is > > >>> > > more > > >>> > > > expensive, another concern is that RAID-5 will fail if more > than > > >>> one > > >>> > disk > > >>> > > > fails. JBOD is still works with 1+ disk failure and has better > > >>> > > performance > > >>> > > > with one disk failure. These seems like good argument for using > > >>> JBOD > > >>> > > > instead of RAID-5. > > >>> > > > > > >>> > > > If a leader replica goes offline, the broker should first take > > all > > >>> > > actions > > >>> > > > (i.e. remove the partition from fetcher thread) as if it has > > >>> received > > >>> > > > StopReplicaRequest for this partition because the replica can > no > > >>> longer > > >>> > > > work anyway. It will also respond with error to any > > ProduceRequest > > >>> and > > >>> > > > FetchRequest for partition. The broker notifies controller by > > >>> writing > > >>> > > > notification znode in ZK. The controller learns the disk > failure > > >>> event > > >>> > > from > > >>> > > > ZK, sends LeaderAndIsrRequest and receives LeaderAndIsrResponse > > to > > >>> > learn > > >>> > > > that the replica is offline. The controller will then elect new > > >>> leader > > >>> > > for > > >>> > > > this partition and sends LeaderAndIsrRequest/MetadataUp > > dateRequest > > >>> to > > >>> > > > relevant brokers. The broker should stop adjusting the ISR for > > this > > >>> > > > partition as if the broker is already offline. I am not sure > > there > > >>> is > > >>> > any > > >>> > > > inconsistency in broker's behavior when it is leader or > follower. > > >>> Is > > >>> > > there > > >>> > > > any concern with this approach? > > >>> > > > > > >>> > > > Thanks for catching this. I have removed that reference from > the > > >>> KIP. > > >>> > > > > > >>> > > > Hi Eno, > > >>> > > > > > >>> > > > Thank you for providing the reference of the RAID-5. In > LinkedIn > > we > > >>> > have > > >>> > > 10 > > >>> > > > disks per Kafka machine. It will not be a show-stopper > > >>> operationally > > >>> > for > > >>> > > > LinkedIn if we have to deploy one-broker-per-disk. On the other > > >>> hand we > > >>> > > > previously discussed the advantage of JBOD vs. > > one-broker-per-disk > > >>> or > > >>> > > > one-broker-per-machine. One-broker-per-disk suffers from the > > >>> problems > > >>> > > > described in the KIP and one-broker-per-machine increases the > > >>> failure > > >>> > > > caused by disk failure by 10X. Since JBOD is strictly better > than > > >>> > either > > >>> > > of > > >>> > > > the two, it is also better then one-broker-per-multiple-disk > > which > > >>> is > > >>> > > > somewhere between one-broker-per-disk and > one-broker-per-machine. > > >>> > > > > > >>> > > > I personally think the benefits of JBOD design is worth the > > >>> > > implementation > > >>> > > > complexity it introduces. I would also argue that it is > > reasonable > > >>> for > > >>> > > > Kafka to manage this low level detail because Kafka is already > > >>> exposing > > >>> > > and > > >>> > > > managing replication factor of its data. 
But whether the > > >>> complexity is > > >>> > > > worthwhile can be subjective and I can not prove my opinion. I > am > > >>> > > > contributing significant amount of time to do this KIP because > > >>> Kafka > > >>> > > > develops at LinkedIn believes it is useful and worth the > effort. > > >>> Yeah, > > >>> > it > > >>> > > > will be useful to see what everyone else think about it. > > >>> > > > > > >>> > > > > > >>> > > > Thanks, > > >>> > > > Dong > > >>> > > > > > >>> > > > > > >>> > > > On Mon, Feb 27, 2017 at 1:16 PM, Jun Rao <j...@confluent.io> > > wrote: > > >>> > > > > > >>> > > > > Hi, Dong, > > >>> > > > > > > >>> > > > > For RAID5, I am not sure the rebuild cost is a big concern. > If > > a > > >>> disk > > >>> > > > > fails, typically an admin has to bring down the broker, > replace > > >>> the > > >>> > > > failed > > >>> > > > > disk with a new one, trigger the RAID rebuild, and bring up > the > > >>> > broker. > > >>> > > > > This way, there is no performance impact at runtime due to > > >>> rebuild. > > >>> > The > > >>> > > > > benefit is that a broker doesn't fail in a hard way when > there > > >>> is a > > >>> > > disk > > >>> > > > > failure and can be brought down in a controlled way for > > >>> maintenance. > > >>> > > > While > > >>> > > > > the broker is running with a failed disk, reads may be more > > >>> expensive > > >>> > > > since > > >>> > > > > they have to be computed from the parity. However, if most > > reads > > >>> are > > >>> > > from > > >>> > > > > page cache, this may not be a big issue either. So, it would > be > > >>> > useful > > >>> > > to > > >>> > > > > do some tests on RAID5 before we completely rule it out. > > >>> > > > > > > >>> > > > > Regarding whether to remove an offline replica from the > fetcher > > >>> > thread > > >>> > > > > immediately. What do we do when a failed replica is a leader? > > Do > > >>> we > > >>> > do > > >>> > > > > nothing or mark the replica as not the leader immediately? > > >>> > Intuitively, > > >>> > > > it > > >>> > > > > seems it's better if the broker acts consistently on a failed > > >>> replica > > >>> > > > > whether it's a leader or a follower. For ISR churns, I was > just > > >>> > > pointing > > >>> > > > > out that if we don't send StopReplicaRequest to a broker to > be > > >>> shut > > >>> > > down > > >>> > > > in > > >>> > > > > a controlled way, then the leader will shrink ISR, expand it > > and > > >>> > shrink > > >>> > > > it > > >>> > > > > again after the timeout. > > >>> > > > > > > >>> > > > > The KIP seems to still reference " > > >>> > > > > /broker/topics/[topic]/partitions/[partitionId]/ > > >>> > > > controller_managed_state". > > >>> > > > > > > >>> > > > > Thanks, > > >>> > > > > > > >>> > > > > Jun > > >>> > > > > > > >>> > > > > On Sat, Feb 25, 2017 at 7:49 PM, Dong Lin < > lindon...@gmail.com > > > > > >>> > wrote: > > >>> > > > > > > >>> > > > > > Hey Jun, > > >>> > > > > > > > >>> > > > > > Thanks for the suggestion. I think it is a good idea to > know > > >>> put > > >>> > > > created > > >>> > > > > > flag in ZK and simply specify isNewReplica=true in > > >>> > > LeaderAndIsrRequest > > >>> > > > if > > >>> > > > > > repilcas was in NewReplica state. 
It will only fail the > > replica > > >>> > > > creation > > >>> > > > > in > > >>> > > > > > the scenario that the controller fails after > > >>> > > > > > topic-creation/partition-reassignment/partition-number- > chang > > e > > >>> but > > >>> > > > before > > >>> > > > > > actually sends out the LeaderAndIsrRequest while there is > > >>> ongoing > > >>> > > disk > > >>> > > > > > failure, which should be pretty rare and acceptable. This > > >>> should > > >>> > > > simplify > > >>> > > > > > the design of this KIP. > > >>> > > > > > > > >>> > > > > > Regarding RAID-5, I think the concern with RAID-5/6 is not > > just > > >>> > about > > >>> > > > > > performance when there is no failure. For example, RAID-5 > can > > >>> > support > > >>> > > > up > > >>> > > > > to > > >>> > > > > > one disk failure and it takes time to rebuild disk after > one > > >>> disk > > >>> > > > > > failure. RAID 5 implementations are susceptible to system > > >>> failures > > >>> > > > > because > > >>> > > > > > of trends regarding array rebuild time and the chance of > > drive > > >>> > > failure > > >>> > > > > > during rebuild. There is no such performance degradation > for > > >>> JBOD > > >>> > and > > >>> > > > > JBOD > > >>> > > > > > can support multiple log directory failure without reducing > > >>> > > performance > > >>> > > > > of > > >>> > > > > > good log directories. Would this be a reasonable reason for > > >>> using > > >>> > > JBOD > > >>> > > > > > instead of RAID-5/6? > > >>> > > > > > > > >>> > > > > > Previously we discussed wether broker should remove offline > > >>> replica > > >>> > > > from > > >>> > > > > > replica fetcher thread. I still think it should do it > instead > > >>> of > > >>> > > > > printing a > > >>> > > > > > lot of error in the log4j log. We can still let controller > > send > > >>> > > > > > StopReplicaRequest to the broker. I am not sure I undertand > > why > > >>> > > > allowing > > >>> > > > > > broker to remove offline replica from fetcher thread will > > >>> increase > > >>> > > > churns > > >>> > > > > > in ISR. Do you think this is concern with this approach? > > >>> > > > > > > > >>> > > > > > I have updated the KIP to remove created flag from ZK and > > >>> change > > >>> > the > > >>> > > > > filed > > >>> > > > > > name to isNewReplica. Can you check if there is any issue > > with > > >>> the > > >>> > > > latest > > >>> > > > > > KIP? Thanks for your time! > > >>> > > > > > > > >>> > > > > > Regards, > > >>> > > > > > Dong > > >>> > > > > > > > >>> > > > > > > > >>> > > > > > On Sat, Feb 25, 2017 at 9:11 AM, Jun Rao <j...@confluent.io > > > > >>> wrote: > > >>> > > > > > > > >>> > > > > > > Hi, Dong, > > >>> > > > > > > > > >>> > > > > > > Thanks for the reply. > > >>> > > > > > > > > >>> > > > > > > Personally, I'd prefer not to write the created flag per > > >>> replica > > >>> > in > > >>> > > > ZK. > > >>> > > > > > > Your suggestion of disabling replica creation if there > is a > > >>> bad > > >>> > log > > >>> > > > > > > directory on the broker could work. The only thing is > that > > >>> it may > > >>> > > > delay > > >>> > > > > > the > > >>> > > > > > > creation of new replicas. I was thinking that an > > alternative > > >>> is > > >>> > to > > >>> > > > > extend > > >>> > > > > > > LeaderAndIsrRequest by adding a isNewReplica field per > > >>> replica. 
> > >>> > > That > > >>> > > > > > field > > >>> > > > > > > will be set when a replica is transitioning from the > > >>> NewReplica > > >>> > > state > > >>> > > > > to > > >>> > > > > > > Online state. Then, when a broker receives a > > >>> LeaderAndIsrRequest, > > >>> > > if > > >>> > > > a > > >>> > > > > > > replica is marked as the new replica, it will be created > > on a > > >>> > good > > >>> > > > log > > >>> > > > > > > directory, if not already present. Otherwise, it only > > >>> creates the > > >>> > > > > replica > > >>> > > > > > > if all log directories are good and the replica is not > > >>> already > > >>> > > > present. > > >>> > > > > > > This way, we don't delay the processing of new replicas > in > > >>> the > > >>> > > common > > >>> > > > > > case. > > >>> > > > > > > > > >>> > > > > > > I am ok with not persisting the offline replicas in ZK > and > > >>> just > > >>> > > > > > discovering > > >>> > > > > > > them through the LeaderAndIsrRequest. It handles the > cases > > >>> when a > > >>> > > > > broker > > >>> > > > > > > starts up with bad log directories better. So, the > > additional > > >>> > > > overhead > > >>> > > > > of > > >>> > > > > > > rediscovering the offline replicas is justified. > > >>> > > > > > > > > >>> > > > > > > > > >>> > > > > > > Another high level question. The proposal rejected > RAID5/6 > > >>> since > > >>> > it > > >>> > > > > adds > > >>> > > > > > > additional I/Os. The main issue with RAID5 is that to > > write a > > >>> > block > > >>> > > > > that > > >>> > > > > > > doesn't match the RAID stripe size, we have to first read > > >>> the old > > >>> > > > > parity > > >>> > > > > > to > > >>> > > > > > > compute the new one, which increases the number of I/Os ( > > >>> > > > > > > http://rickardnobel.se/raid-5-write-penalty/). I am > > >>> wondering if > > >>> > > you > > >>> > > > > > have > > >>> > > > > > > tested RAID5's performance by creating a file system > whose > > >>> block > > >>> > > size > > >>> > > > > > > matches the RAID stripe size ( > > https://www.percona.com/blog/ > > >>> > > > > > > 2011/12/16/setting-up-xfs-the-simple-edition/). This > way, > > >>> > writing > > >>> > > a > > >>> > > > > > block > > >>> > > > > > > doesn't require a read first. A large block size may > > >>> increase the > > >>> > > > > amount > > >>> > > > > > of > > >>> > > > > > > data writes, when the same block has to be written to > disk > > >>> > multiple > > >>> > > > > > times. > > >>> > > > > > > However, this is probably ok in Kafka's use case since we > > >>> batch > > >>> > the > > >>> > > > I/O > > >>> > > > > > > flush already. As you can see, we will be adding some > > >>> complexity > > >>> > to > > >>> > > > > > support > > >>> > > > > > > JBOD in Kafka one way or another. If we can tune the > > >>> performance > > >>> > of > > >>> > > > > RAID5 > > >>> > > > > > > to match that of RAID10, perhaps using RAID5 is a simpler > > >>> > solution. > > >>> > > > > > > > > >>> > > > > > > Thanks, > > >>> > > > > > > > > >>> > > > > > > Jun > > >>> > > > > > > > > >>> > > > > > > > > >>> > > > > > > On Fri, Feb 24, 2017 at 10:17 AM, Dong Lin < > > >>> lindon...@gmail.com> > > >>> > > > > wrote: > > >>> > > > > > > > > >>> > > > > > > > Hey Jun, > > >>> > > > > > > > > > >>> > > > > > > > I don't think we should allow failed replicas to be > > >>> re-created > > >>> > on > > >>> > > > the > > >>> > > > > > > good > > >>> > > > > > > > disks. Say there are 2 disks and each of them is 51% > > >>> loaded. 
If > > >>> > > any > > >>> > > > > > disk > > >>> > > > > > > > fail, and we allow replicas to be re-created on the > other > > >>> > disks, > > >>> > > > both > > >>> > > > > > > disks > > >>> > > > > > > > will fail. Alternatively we can disable replica > creation > > if > > >>> > there > > >>> > > > is > > >>> > > > > > bad > > >>> > > > > > > > disk on a broker. I personally think it is worth the > > >>> additional > > >>> > > > > > > complexity > > >>> > > > > > > > in the broker to store created replicas in ZK so that > we > > >>> allow > > >>> > > new > > >>> > > > > > > replicas > > >>> > > > > > > > to be created on the broker even when there is bad log > > >>> > directory. > > >>> > > > > This > > >>> > > > > > > > approach won't add complexity in the controller. But I > am > > >>> fine > > >>> > > with > > >>> > > > > > > > disabling replica creation when there is bad log > > directory > > >>> that > > >>> > > if > > >>> > > > it > > >>> > > > > > is > > >>> > > > > > > > the only blocking issue for this KIP. > > >>> > > > > > > > > > >>> > > > > > > > Whether we store created flags is independent of > > >>> whether/how we > > >>> > > > store > > >>> > > > > > > > offline replicas. Per our previous discussion, do you > > >>> think it > > >>> > is > > >>> > > > OK > > >>> > > > > > not > > >>> > > > > > > > store offline replicas in ZK and propagate the offline > > >>> replicas > > >>> > > > from > > >>> > > > > > > broker > > >>> > > > > > > > to controller via LeaderAndIsrRequest? > > >>> > > > > > > > > > >>> > > > > > > > Thanks, > > >>> > > > > > > > Dong > > >>> > > > > > > > > > >>> > > > > > > > > >>> > > > > > > > >>> > > > > > > >>> > > > > > >>> > > > > >>> > > > >>> > > >> > > >> > > > > > >
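To summarize the replica-creation behavior the thread converges on, here is a rough Java sketch; the class and method names (ReplicaCreationRule, shouldCreateReplica) are hypothetical and this is not the actual broker code:

    // Rough sketch of the proposed replica-creation rule, with hypothetical
    // names; not the actual Kafka broker code.
    final class ReplicaCreationRule {

        /**
         * Decide whether the broker should create a local log for a replica
         * named in a LeaderAndIsrRequest.
         *
         * @param isNewReplica     proposed per-replica flag, set by the controller
         *                         when the replica transitions from NewReplica to Online
         * @param anyLogDirOffline whether at least one log directory is marked bad
         * @param logAlreadyExists whether a local log already exists for the partition
         */
        static boolean shouldCreateReplica(boolean isNewReplica,
                                           boolean anyLogDirOffline,
                                           boolean logAlreadyExists) {
            if (logAlreadyExists) {
                return false;   // nothing to create
            }
            if (isNewReplica) {
                return true;    // a brand-new replica can go to any good log directory
            }
            // An existing replica that is missing locally may live on a bad log
            // directory, so only re-create it when every log directory is good.
            return !anyLogDirOffline;
        }

        private ReplicaCreationRule() { }
    }

The point is that a replica flagged as new can always be placed on a good log directory, while a replica that is merely missing locally is not re-created while any log directory is bad, since it may still live on the bad one.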
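Similarly, here is a rough sketch of the broker-side handling of a bad log directory as described above (a directory is assumed good until an IOException, and its replicas are then treated as if a StopReplicaRequest had been received); the helper methods are hypothetical:

    import java.io.IOException;
    import java.util.Collections;
    import java.util.List;

    // Rough sketch of the broker-side handling of a bad log directory, with
    // hypothetical helper methods; not the actual Kafka broker code.
    final class LogDirFailureHandler {

        // A log directory is assumed good after startup and is only marked bad
        // once an IOException is hit while reading from or writing to it.
        void onLogDirIOException(String logDir, IOException cause) {
            markLogDirOffline(logDir);

            for (String topicPartition : partitionsOn(logDir)) {
                // Handle the replica as if a StopReplicaRequest had been received:
                // it stops acting as leader, is removed from the replica fetcher
                // threads, and produce/fetch requests for it get an error response.
                stopReplica(topicPartition);
            }

            // The controller then learns which replicas are offline through the
            // LeaderAndIsrRequest/Response exchange (rather than persisting the
            // offline replicas in ZooKeeper) and elects new leaders as needed.
        }

        // --- hypothetical helpers, stubbed out for the sketch ---
        private void markLogDirOffline(String logDir) { }
        private List<String> partitionsOn(String logDir) { return Collections.emptyList(); }
        private void stopReplica(String topicPartition) { }
    }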