An LAC read will fail in this way if Ack Quorum or more bookies respond with anything other than OK, NoSuchEntry, or NoSuchLedger.
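
To make that rule concrete, here is a minimal sketch of it in Java. It is not BookKeeper's actual `PendingReadLacOp` logic; the class `LacReadRule`, the `Response` enum, and the method names are all hypothetical and only encode the condition as stated above.

```java
// Illustrative sketch only: NOT BookKeeper source code.
// All names here (LacReadRule, Response, lacReadFails) are hypothetical.
final class LacReadRule {

    /** Hypothetical per-bookie outcomes of a LAC read. */
    enum Response { OK, NO_SUCH_ENTRY, NO_SUCH_LEDGER, BOOKIE_UNAVAILABLE, OTHER_ERROR }

    /** True for the three responses the rule above treats as acceptable. */
    static boolean isTolerated(Response r) {
        return r == Response.OK
                || r == Response.NO_SUCH_ENTRY
                || r == Response.NO_SUCH_LEDGER;
    }

    /** The read fails once ackQuorum or more responses are not acceptable. */
    static boolean lacReadFails(int ackQuorum, Response... responses) {
        int bad = 0;
        for (Response r : responses) {
            if (!isTolerated(r)) {
                bad++;
            }
        }
        return bad >= ackQuorum;
    }
}
```

Under these assumptions, with an ensemble of three and an ack quorum of 2, one unavailable bookie does not fail the read, but two unavailable bookies do.
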
What is your ack quorum? If it is just 1 (not a good setting), then a single bookie being down will make the LAC read fail this way. If your ack quorum is 2, then 2 bookies being down will cause it, and so on.

Jack

On Tue, Sep 14, 2021 at 1:17 PM zhangao <gaozhangmin...@qq.com.invalid> wrote:

> As the title says, when a bookie is lost, a ledger whose state is open
> cannot be replicated, because the LAC has to be read from the failed bookie.
> Reading the LAC from the failed bookie fails, because it cannot be
> connected to.
>
> How does BookKeeper auto recovery deal with open ledgers on a failed bookie?
>
> I don't know if it's a bug or not.
>
> The error log:
>
> 12:29:57.072 [main-EventThread] INFO
> org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve
> x.x.x.x:3181, bookie is unknown
> org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException:
> Bookie handle is not available
>
> 12:29:57.072 [main-EventThread] ERROR
> org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to
> x.x.x.x:3181 as endpoint resolution failed (probably bookie is down) err
> org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException:
> Cannot resolve bookieId x.x.x.x:3181, bookie does not exist or it is not
> running
>
> 12:29:57.078 [BookKeeperClientWorker-OrderedExecutor-29-0] INFO
> org.apache.bookkeeper.client.PendingReadLacOp - While readLac ledger: 96789
> did not hear success responses from all of ensemble
>
> 12:29:57.078 [ReplicationWorker] INFO
> org.apache.bookkeeper.replication.ReplicationWorker - BKReadException while
> rereplicating ledger 96789. Enough Bookies might not have available So, no
> harm to continue