Thanks, that helps a lot. For a long time I have been used to reading the documentation on the official Kafka site; you made me realize there are also a lot of resources on Confluent.
M. Manna <manme...@gmail.com> wrote on Thu, Mar 12, 2020 at 9:06 PM:

> Please see the following link from Confluent. Also, if you register with
> Confluent Technical Talks, they are running quite a lot of nice and
> simplified webinars this month on the Fundamentals of Kafka.
>
> https://www.youtube.com/watch?v=ibozaujze9k
>
> I thought the 2-part presentation was quite good (but I don't work for
> Confluent :), so a disclaimer in advance).
>
> There is also an upcoming webinar on how Kafka is integrated in your
> application/architecture.
>
> I hope it helps.
>
> Regards,
> M. Manna
>
> On Thu, 12 Mar 2020 at 00:51, 张祥 <xiangzhang1...@gmail.com> wrote:
>
> > Thanks, very helpful!
> >
> > Peter Bukowinski <pmb...@gmail.com> wrote on Thu, Mar 12, 2020 at 5:48 AM:
> >
> > > Yes, that's correct. While a broker is down:
> > >
> > > - All topic partitions assigned to that broker will be under-replicated.
> > > - Topic partitions with an unmet minimum ISR count will be offline.
> > > - Leadership of partitions meeting the minimum ISR count will move to
> > >   the next in-sync replica in the replica list.
> > > - If no in-sync replica exists for a topic partition, it will be offline.
> > >
> > > Setting unclean.leader.election.enable=true will allow an out-of-sync
> > > replica to become a leader. If topic partition availability is more
> > > important to you than data integrity, you should allow unclean leader
> > > election.
> > >
> > > > On Mar 11, 2020, at 6:11 AM, 张祥 <xiangzhang1...@gmail.com> wrote:
> > > >
> > > > Hi Peter, following up on what we talked about before, I want to
> > > > understand what will happen when one broker goes down. I would say it
> > > > will be very similar to what happens under a disk failure, except that
> > > > the rules apply to all the partitions on that broker instead of only
> > > > the one malfunctioning disk. Am I right? Thanks.
> > > >
> > > > 张祥 <xiangzhang1...@gmail.com> wrote on Thu, Mar 5, 2020 at 9:25 AM:
> > > >
> > > >> Thanks Peter, really appreciate it.
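Peter's leader-election rules above can be sketched as a small simulation. This is a toy model with a hypothetical helper name, not Kafka's actual controller code: the leader is the first replica in the assigned replica list that is still in the ISR, and an out-of-sync replica is eligible only when unclean leader election is enabled.

```python
def elect_leader(replicas, isr, unclean_enabled=False):
    """Pick a leader for one partition, per the rules in the thread above.

    replicas: assigned replica list (broker ids, in preference order)
    isr: set of broker ids currently in sync
    """
    for broker in replicas:
        if broker in isr:
            return broker        # next in-sync replica in replica-list order
    if unclean_enabled and replicas:
        return replicas[0]       # out-of-sync replica may lead (possible data loss)
    return None                  # no eligible leader: partition goes offline

# Replica list [1, 2, 3]; broker 1 is down, so the ISR has shrunk to {2, 3}.
print(elect_leader([1, 2, 3], {2, 3}))       # 2 -- leadership moves to broker 2
print(elect_leader([1, 2, 3], set()))        # None -- partition offline
print(elect_leader([1, 2, 3], set(), True))  # 1 -- unclean election picks an out-of-sync replica
```

This mirrors the availability-versus-integrity trade-off Peter mentions: the third call keeps the partition online at the cost of any records the out-of-sync replica never received.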
> > > >>
> > > >> Peter Bukowinski <pmb...@gmail.com> wrote on Wed, Mar 4, 2020 at 11:50 PM:
> > > >>
> > > >>> Yes, you should restart the broker. I don't believe there's any
> > > >>> code to check whether a log directory previously marked as failed
> > > >>> has returned to healthy.
> > > >>>
> > > >>> I always restart the broker after a hardware repair. I treat broker
> > > >>> restarts as a normal, non-disruptive operation in my clusters. I
> > > >>> use a minimum of 3x replication.
> > > >>>
> > > >>> -- Peter (from phone)
> > > >>>
> > > >>>> On Mar 4, 2020, at 12:46 AM, 张祥 <xiangzhang1...@gmail.com> wrote:
> > > >>>>
> > > >>>> Another question: as I recall, the broker needs to be restarted
> > > >>>> after replacing the disk to recover. Is that correct? If so, I
> > > >>>> take it that Kafka cannot detect by itself that the disk has been
> > > >>>> replaced, so a manual restart is necessary.
> > > >>>>
> > > >>>> 张祥 <xiangzhang1...@gmail.com> wrote on Wed, Mar 4, 2020 at 2:48 PM:
> > > >>>>
> > > >>>>> Thanks Peter, it makes a lot of sense.
> > > >>>>>
> > > >>>>> Peter Bukowinski <pmb...@gmail.com> wrote on Tue, Mar 3, 2020 at 11:56 AM:
> > > >>>>>
> > > >>>>>> Whether your brokers have a single data directory or multiple
> > > >>>>>> data directories on separate disks, when a disk fails, the
> > > >>>>>> topic partitions located on that disk become unavailable. What
> > > >>>>>> happens next depends on how your cluster and topics are
> > > >>>>>> configured.
> > > >>>>>>
> > > >>>>>> If the topics on the affected broker have replicas and the
> > > >>>>>> minimum ISR (in-sync replicas) count is met, then all topic
> > > >>>>>> partitions will remain online and leaders will move to another
> > > >>>>>> broker. Producers and consumers will continue to operate as
> > > >>>>>> usual.
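For reference, the multiple-data-directory (JBOD) setup discussed above is configured by listing one directory per disk in the broker's log.dirs property; Kafka spreads partitions across these directories, and since the JBOD improvements in Kafka 1.0 a failed directory takes only its own partitions offline rather than the whole broker. The paths below are illustrative, not defaults:

```
# server.properties (illustrative paths): one log directory per physical disk
log.dirs=/data/disk1/kafka-logs,/data/disk2/kafka-logs,/data/disk3/kafka-logs
```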
> > > >>>>>>
> > > >>>>>> If the topics don't have replicas or the minimum ISR count is
> > > >>>>>> not met, then the topic partitions on the failed disk will be
> > > >>>>>> offline. Producers can still send data to the affected topics --
> > > >>>>>> it will just go to the online partitions. Consumers can still
> > > >>>>>> consume data from the online partitions.
> > > >>>>>>
> > > >>>>>> -- Peter
> > > >>>>>>
> > > >>>>>>> On Mar 2, 2020, at 7:00 PM, 张祥 <xiangzhang1...@gmail.com> wrote:
> > > >>>>>>>
> > > >>>>>>> Hi community,
> > > >>>>>>>
> > > >>>>>>> I ran into a disk failure when using Kafka, and fortunately it
> > > >>>>>>> did not crash the entire cluster. So I am wondering how Kafka
> > > >>>>>>> handles multiple disks and how it manages to keep working in
> > > >>>>>>> case of a single disk failure. The more detailed, the better.
> > > >>>>>>> Thanks!
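The "producers keep writing to the online partitions" behavior described in the thread can be sketched with a toy routing model. This is an illustration only (hypothetical helper name; real clients consult cluster metadata to learn which partitions have a live leader):

```python
import itertools

def writable_partitions(partition_online):
    """Partitions a producer can still target: those whose leader is online."""
    return [p for p, online in partition_online.items() if online]

# Topic with 3 partitions; partition 1 lived on the failed disk and is offline.
status = {0: True, 1: False, 2: True}
available = writable_partitions(status)
print(available)  # [0, 2]

# An unkeyed round-robin producer would then cycle over the online partitions only:
rr = itertools.cycle(available)
print([next(rr) for _ in range(4)])  # [0, 2, 0, 2]
```

Note the flip side: keyed records hashed to an offline partition cannot be redirected without breaking key ordering, which is why replication (and meeting min ISR) matters more than this fallback.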