1. Any open ledgers whose current ensemble includes the downed bookie
will be unrecoverable until it comes back online. If other ledgers hosted
on that bookie are recoverable, then either those ledgers are closed, or
they are open (or in recovery) but their last ensemble does not include
this bookie. The LAC read is only sent to the bookies of the current
ensemble (a ledger may have multiple ensembles if ensemble changes
occurred).
2. Look for "coverageSet.checkCovered()"
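
As a rough illustration of the kind of check point 2 refers to, here is a
minimal, self-contained sketch. It is not the BookKeeper source: the names
LacCoverageSketch, Response, and isCovered are hypothetical, and it assumes
write quorums are laid out round-robin over the ensemble. A bookie counts
as "unknown" if it answered with anything other than OK, NoSuchEntry, or
NoSuchLedger (or did not answer at all), and the read is treated as failed
once any write quorum contains ack quorum or more unknown bookies.

```java
class LacCoverageSketch {
    // Simplified stand-in for the response codes a bookie can return.
    enum Response { OK, NO_SUCH_ENTRY, NO_SUCH_LEDGER, UNAVAILABLE }

    // Returns true if every write quorum has fewer than ackQuorum
    // bookies whose state is unknown, i.e. the LAC read can be trusted.
    static boolean isCovered(Response[] responses, int writeQuorum, int ackQuorum) {
        int ensembleSize = responses.length;
        for (int start = 0; start < ensembleSize; start++) {
            int unknown = 0;
            // Assumption: write quorum i is the writeQuorum consecutive
            // bookies starting at index i, wrapping around the ensemble.
            for (int j = 0; j < writeQuorum; j++) {
                Response r = responses[(start + j) % ensembleSize];
                if (r == Response.UNAVAILABLE) {
                    unknown++;
                }
            }
            // Enough unknown bookies could have acked an entry we cannot see.
            if (unknown >= ackQuorum) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Response[] oneDown = { Response.OK, Response.OK, Response.UNAVAILABLE };
        System.out.println(isCovered(oneDown, 2, 1)); // false
        System.out.println(isCovered(oneDown, 2, 2)); // true
    }
}
```

With an ensemble of 3 and a write quorum of 2, one unreachable bookie is
enough to fail the check when the ack quorum is 1, but not when it is 2.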

Jack

On Tue, Sep 14, 2021 at 3:00 PM zhangao <gaozhangmin...@qq.com.invalid>
wrote:

> [ External sender. Exercise caution. ]
>
> My ack quorum is 1. Please let me explain my confusion:
> 1. When one bookie is down, as you said, why can some ledgers be replicated
> successfully while others cannot?
> 2. In the code below from PendingReadLacOp, I don't see any code related
> to the ack quorum when reading the LAC.
>
>
> public void initiate() {
>     for (int i = 0; i < currentEnsemble.size(); i++) {
>         bookieClient.readLac(currentEnsemble.get(i), lh.ledgerId, this, i);
>     }
> }
>
>
> ------------------ Original Message ------------------
> From: "dev" <jvanligh...@splunk.com.INVALID>
> Sent: Tuesday, September 14, 2021, 8:49 PM
> To: "dev" <dev@bookkeeper.apache.org>
>
> Subject: Re: AutoRecovery failed replicate ledger , because, it would read
> lac from failed bookie
>
>
>
> An LAC read will fail in this way if Ack Quorum or more bookies respond
> with anything other than OK, NoSuchEntry, or NoSuchLedger.
>
> What is your ack quorum? If it is just 1 (not a good setting), then a
> single bookie being down will make the LAC read fail this way. If your ack
> quorum is 2, then 2 bookies being down will cause it, and so on.
>
> Jack
>
> On Tue, Sep 14, 2021 at 1:17 PM zhangao <gaozhangmin...@qq.com.invalid>
> wrote:
>
> > As the title says, when a bookie is lost, ledgers in the open state
> > cannot be replicated because the LAC read is sent to the failed bookie.
> > The LAC read fails because the failed bookie cannot be connected to.
> >
> > How does BookKeeper auto recovery deal with open ledgers on a failed
> > bookie?
> >
> > I don't know if it's a bug or not.
> >
> > The error log:
> >
> > 12:29:57.072 [main-EventThread] INFO
> > org.apache.bookkeeper.client.DefaultBookieAddressResolver - Cannot resolve
> > x.x.x.x:3181, bookie is unknown
> > org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException:
> > Bookie handle is not available
> >
> > 12:29:57.072 [main-EventThread] ERROR
> > org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to
> > x.x.x.x:3181 as endpoint resolution failed (probably bookie is down) err
> > org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException:
> > Cannot resolve bookieId x.x.x.x:3181, bookie does not exist or it is not
> > running
> >
> > 12:29:57.078 [BookKeeperClientWorker-OrderedExecutor-29-0] INFO
> > org.apache.bookkeeper.client.PendingReadLacOp - While readLac ledger: 96789
> > did not hear success responses from all of ensemble
> >
> > 12:29:57.078 [ReplicationWorker] INFO
> > org.apache.bookkeeper.replication.ReplicationWorker - BKReadException while
> > rereplicating ledger 96789. Enough Bookies might not have available So, no
> > harm to continue
