[ https://issues.apache.org/jira/browse/KAFKA-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16307606#comment-16307606 ]
Ted Yu commented on KAFKA-6414: ------------------------------- Interesting. Since the proposal changes the semantics of replica failover handling, does this need a KIP ? > Inverse replication for replicas that are far behind > ---------------------------------------------------- > > Key: KAFKA-6414 > URL: https://issues.apache.org/jira/browse/KAFKA-6414 > Project: Kafka > Issue Type: Bug > Components: replication > Reporter: Ivan Babrou > > Let's suppose the following starting point: > * 1 topic > * 1 partition > * 1 reader > * 24h retention period > * leader outbound bandwidth is 3x of inbound bandwidth (1x replication + 1x > reader + 1x slack = total outbound) > In this scenario, when replica fails and needs to be brought back from > scratch, you can catch up at 2x inbound bandwidth (1x regular replication + > 1x slack used). > 2x catch-up speed means replica will be at the point where leader is now in > 24h / 2x = 12h. However, in 12h the oldest 12h of the topic will fall out of > retention cliff and will be deleted. There's absolutely to use for this data, > it will never be read from the replica in any scenario. And this not even > including the fact that we still need to replicate 12h more of data that > accumulated since the time we started. > My suggestion is to refill sufficiently out of sync replicas backwards from > the tip: newest segments first, oldest segments last. Then we can stop when > we hit retention cliff and replicate far less data. The lower the ratio of > catch-up bandwidth to inbound bandwidth, the higher the returns would be. > This will also set a hard cap on retention time: it will be no higher than > retention period if catch-up speed if >1x (if it's less, you're forever out > of ISR anyway). > What exactly "sufficiently out of sync" means in terms of lag is a topic for > a debate. The default segment size is 1GiB, I'd say that being >1 full > segments behind probably warrants this. > As of now, the solution for slow recovery appears to be to reduce retention > to speed up recovery, which doesn't seem very friendly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)