> Thanks for the detailed response. Just one question: if the writer doesn't
> fail but a bookie write fails (say, a soft failure caused by a network problem
> or GC pause), the writer will create a new fragment within the ledger. So the
> same sequence of operations that happen while closing the ledger nee
Thanks for the detailed response. Just one question: if the writer doesn't
fail but a bookie write fails (say, a soft failure caused by a network problem
or GC pause), the writer will create a new fragment within the ledger. So the
same sequence of operations that happen while closing the ledger needs to
ha
> So.. log truncation, the way it's needed in leader-based systems like Raft
> and Kafka, where the leader may have entries appended to its log that are not
> replicated. If the leader crashes before replicating entries, another node is
> elected leader. Once the previous leader rejoins the cluster
> You also ask for pointers. Are you looking for code pointers
Yes.. looking for code pointers..
So.. log truncation, the way it's needed in leader-based systems like Raft
and Kafka, where the leader may have entries appended to its log that are not
replicated. If the leader crashes before replicating
> From the bookie perspective, if a bookie of a ledger ensemble crashes while a
> ledger is being written to, then it is replaced and the history of the ledger
> is updated in the ledger metadata according to the last add confirmed by the
> crashed bookie. If the bookie crashes after the ledger
Hi Umesh,
The BookKeeper protocol is closer to register protocols than it is to consensus
protocols, although we do need consensus for agreement on the closed state of a
ledger. A ledger has a single writer, and either the writer closes it naturally
or a reader needs to close it by running a re
And in case writing to a bookie in a quorum fails, and the entry is
rewritten on another set of bookies, will the truncation of entries on
that specific bookie beyond the last add confirmed also be handled by
autorecovery? So recovery does addition of missing entries, and truncation
of ext
> But who takes care of updating a particular bookie in case it crashes (or
> is temporarily partitioned) and rejoins the cluster?
Autorecovery takes care of this. The metadata describes the entries
that should exist on a bookie. If this doesn't match what actually
exists on the bookie, autorecovery
Hi,
When handling partial failures in a replicated log implementation, there are
a few requirements that must be met to sync logs across multiple nodes
after a node failure. E.g., in Raft, if a node fails, a sync-up happens
in which the leader communicates with it to push all the entries and