I just left a comment in your PR about the *schema* part. PTAL!

Thanks,
sinan

Rajan Dhabalia <rdhaba...@apache.org> wrote on Thu, Dec 21, 2023 at 01:19:

> >>>>> I don't see the topic load issues. The topic loading works fine, and
> the producer works fine. But the proposal said it would resolve the topic
> load issue, can you reproduce the topic load issue?
>
> Yes, you tried a different use case, not the one mentioned in the PIP.
> Deleted ledgers return a specific error code that the broker handles, and
> the broker skips such non-recoverable ledgers. However, you can reproduce
> issue #21751 when bookies are removed from the cluster without graceful
> recovery. In that case brokers cannot identify the errors as
> non-recoverable, which can affect multiple ledgers and topics, and those
> topics stay unavailable until the managed-ledger metadata is manually
> cleaned up for each topic.
>
> >>>>>> And introduce a new configuration such as
> `ledgerFailedToRecoverThreashold`, if the ledger continues to
> fail-recover,.
>
> No, let's not introduce such unnecessary complication. We already have the
> autoSkipNonRecoverableData flag to handle non-recoverable errors, and one
> doesn't want to bet on a number of retries before skipping non-recoverable
> data. Instead, one needs a control for when one is sure about actual data
> loss, or when bookies have been removed from the cluster, and really needs
> to force-skip such non-recoverable data with a flag.
>
> Thanks,
> Rajan
>
> On Wed, Dec 20, 2023 at 1:17 AM 太上玄元道君 <dao...@apache.org> wrote:
>
> > In my understanding, the PIP is for certain `extreme` conditions. Some
> > ledgers failing to recover is an event with a very low probability, and
> > it should be hard to reproduce (unless we delete some ledgers manually).
> >
> > If we skip these failed-recover ledgers, message production should be
> > able to proceed smoothly.
> >
> > But for message consumption, how can we deal with it?
> > 1. Skip them: this leads to data loss, even if these ledgers only failed
> >    to recover temporarily.
> > 2. Don't skip them: consumers may not be able to receive messages from
> >    brokers, and consumption cannot proceed normally, even if these
> >    ledgers were deleted and can never recover.
> >
> > So we must accurately determine whether these ledgers are temporarily
> > unable to recover or will never be able to recover.
> > Maybe we need to persist the number of failed recoveries of the ledger
> > in the MetadataStore: if the ledger recovers successfully, set it to 0,
> > else +1.
> > And introduce a new configuration such as
> > `ledgerFailedToRecoverThreashold`:
> > if the ledger continues to fail-recover, and the number of times is
> > greater than `ledgerFailedToRecoverThreashold`, delete the ledger from
> > the MetadataStore.
> >
> > Thanks
> >
> > PengHui Li <peng...@apache.org> wrote on Wed, Dec 20, 2023 at 16:32:
> >
> > > Hi Rajan,
> > >
> > > I tried to test the case that you provided in the proposal.
> > >
> > > - Produce messages to a topic
> > > - Unload the topic 5 times to ensure we have some ledgers in the topic
> > > - Delete one ledger by using the bookkeeper shell
> > > - Unload the topic again
> > > - Start to produce messages again, it works
> > > - Start a consumer to consume messages from the earliest position, it
> > >   gets stuck on the deleted ledger
> > >
> > > I don't see the topic load issues. The topic loading works fine, and
> > > the producer works fine.
> > > But the proposal said it would resolve the topic load issue, can you
> > > reproduce the topic load issue?
> > >
> > > Regards,
> > > Penghui
> > >
> > > On Wed, Dec 20, 2023 at 3:28 AM Rajan Dhabalia <rdhaba...@apache.org>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > We have an issue where topics fail to load in unrecoverable
> > > > situations, impacting topic availability:
> > > > https://github.com/apache/pulsar/issues/21751
> > > > This PIP addresses the issue and allows brokers to handle such
> > > > situations and maintain topic availability:
> > > >
> > > > PIP: https://github.com/apache/pulsar/pull/21752
> > > >
> > > > Thanks,
> > > > Rajan
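
For reference, the retry-counter idea raised earlier in the thread (count failed recoveries per ledger, reset to 0 on success, drop the ledger once the count is greater than `ledgerFailedToRecoverThreashold`) could be sketched roughly as below. This is only an illustration of that suggestion, which was declined in the thread in favor of an explicit force-skip flag. `LedgerRecoveryTracker` and its methods are hypothetical names, not Pulsar APIs, and an in-memory map stands in for the MetadataStore.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Pulsar: counts consecutive recovery
// failures per ledger, as suggested in the thread.
class LedgerRecoveryTracker {
    // Stand-in for the proposed `ledgerFailedToRecoverThreashold` setting.
    private final int threshold;
    // In-memory stand-in for counts the suggestion would persist in the MetadataStore.
    private final Map<Long, Integer> failures = new HashMap<>();

    LedgerRecoveryTracker(int threshold) {
        if (threshold < 0) {
            throw new IllegalArgumentException("threshold must be >= 0");
        }
        this.threshold = threshold;
    }

    // Records one failed recovery ("else +1"). Returns true once the count is
    // greater than the threshold, i.e. the ledger would be treated as lost.
    boolean recordFailure(long ledgerId) {
        int count = failures.merge(ledgerId, 1, Integer::sum);
        return count > threshold;
    }

    // Resets the count after a successful recovery ("set it to 0").
    void recordSuccess(long ledgerId) {
        failures.remove(ledgerId);
    }

    // Current number of consecutive failures recorded for the ledger.
    int failureCount(long ledgerId) {
        return failures.getOrDefault(ledgerId, 0);
    }
}
```

With a threshold of 2, the first two failures of a ledger keep it in place and the third marks it for removal. A successful recovery in between resets the count, which is what separates a temporary failure from a permanent one in this scheme.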