That's why you run at least 3 brokers in production, with the replication factor set to 3, min.insync.replicas (min.isr) set to 2, and each broker on a different rack. You could also use MirrorMaker 2 (MM2) or Replicator to copy the data to another DC...
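
To illustrate why replication factor 3 with min.insync.replicas 2 is the usual starting point, here is a minimal Python sketch. It is a toy model, not Kafka code: the function names are made up for this example, and it assumes producers use acks=all and that every replica starts out in sync.

```python
def writes_allowed(in_sync_replicas: int, min_insync_replicas: int) -> bool:
    """Toy model of the acks=all rule: a produce request is accepted
    only while the ISR has at least min.insync.replicas members."""
    return in_sync_replicas >= min_insync_replicas


def tolerated_broker_failures(replication_factor: int, min_insync_replicas: int) -> int:
    """How many replicas can drop out of the ISR before acks=all writes
    are rejected, assuming all replicas were in sync to begin with."""
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("min.insync.replicas must be between 1 and the replication factor")
    return replication_factor - min_insync_replicas


if __name__ == "__main__":
    # Hypothetical settings to compare: (replication factor, min.insync.replicas)
    for rf, min_isr in [(3, 2), (3, 3)]:
        spare = tolerated_broker_failures(rf, min_isr)
        print(f"RF={rf}, min.insync.replicas={min_isr}: "
              f"writes survive {spare} broker(s) down")
```

Under these assumptions, RF=3 with min.insync.replicas=2 keeps accepting acks=all writes with one broker down, while requiring all 3 replicas in the ISR blocks writes as soon as any single broker is unavailable.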
On Fri, Jun 18, 2021, 00:33, Jhanssen Fávaro <jhanssenfav...@gmail.com> wrote:

> That's a disaster recovery simulation; we need to validate a way to avoid
> that in a disaster case/scenario!! I mean, if I have a disaster and the
> servers get rebooted, we need to guard against this Kafka weakness.
>
> Regards,
> Jhanssen Fávaro de Oliveira
>
> On Thu, Jun 17, 2021 at 6:30 PM Sunil Unnithan <sunilu...@gmail.com> wrote:
>
> > Why would you reboot all three brokers on the same week/day?
> >
> > On Thu, Jun 17, 2021 at 5:26 PM Jhanssen Fávaro <jhanssenfav...@gmail.com> wrote:
> >
> > > Sunil,
> > > Business needs... Anyway, if it was 2, we would face the same problem.
> > > For example, if the partition leader was the last one to be rebooted
> > > and then got its disk corrupted, the erase would happen the same way.
> > >
> > > Regards,
> > >
> > > On 2021/06/17 21:23:40, Sunil Unnithan <sunilu...@gmail.com> wrote:
> > > > Why isr=all? Why not use min.isr=2 in this case?
> > > >
> > > > On Thu, Jun 17, 2021 at 5:11 PM Jhanssen Fávaro <jhanssenfav...@gmail.com> wrote:
> > > >
> > > > > Basically, if we have 3 brokers and the ISR == all, and the broker
> > > > > leading a partition was the last server to be restarted/rebooted
> > > > > and got a disk corruption during its startup, all the followers
> > > > > will mark the topic as offline.
> > > > > So, if that last leader broker with the corrupted disk starts, it
> > > > > will take back the partition leadership and then erase all the
> > > > > other followers/brokers in the cluster.
> > > > >
> > > > > It should at least "ask" the other 2 brokers whether they are not
> > > > > zeroed. Is there any way to prevent this data from being truncated
> > > > > in the followers?
> > > > >
> > > > > Best Regards,
> > > > > Jhanssen
> > > > >
> > > > > On 2021/06/17 20:54:50, Jhanssen Fávaro <jhanssenfav...@gmail.com> wrote:
> > > > > > Hi all, we were testing Kafka disaster recovery at our sites.
> > > > > >
> > > > > > Is there any way to avoid the scenario in this post?
> > > > > > https://blog.softwaremill.com/help-kafka-ate-my-data-ae2e5d3e6576
> > > > > >
> > > > > > But the Unclean Leader exception is not an option in our case.
> > > > > > FYI..
> > > > > > We needed to deactivate our systemctl for the Kafka brokers to
> > > > > > avoid a service startup with a corrupted leader disk.
> > > > > >
> > > > > > Best Regards!