Re: Log truncation and sync up when bookie fails and rejoins

2020-01-28 Thread Ivan Kelly
> Thanks for the detailed response. Just one question, if writer doesn't fail, but bookie write fails (Say a soft failure because of network problem or GC pause), the writer will create a new fragment within a ledger. So the same sequence of operations that happen while closing the ledger nee

Re: Log truncation and sync up when bookie fails and rejoins

2020-01-28 Thread Unmesh Joshi
Thanks for the detailed response. Just one question, if writer doesn't fail, but bookie write fails (Say a soft failure because of network problem or GC pause), the writer will create a new fragment within a ledger. So the same sequence of operations that happen while closing the ledger needs to ha

Re: Log truncation and sync up when bookie fails and rejoins

2020-01-28 Thread Ivan Kelly
> So.. log truncation, the way it's needed in leader based systems like RAFT and Kafka, where leader may have entries appended to its log which are not replicated. If leader crashes before replicating entries, which will elect other node as leader. Once the previous leader rejoins the cluster

Re: Log truncation and sync up when bookie fails and rejoins

2020-01-28 Thread Unmesh Joshi
> You also ask for pointers. Are you looking for code pointers
Yes.. looking for code pointers.. So.. log truncation, the way it's needed in leader based systems like RAFT and Kafka, where leader may have entries appended to its log which are not replicated. If leader crashes before replicating

Re: Log truncation and sync up when bookie fails and rejoins

2020-01-28 Thread Ivan Kelly
> From the bookie perspective, if a bookie of a ledger ensemble crashes while a ledger is being written to, then it is replaced and the history of the ledger is updated in the ledger metadata according to the last add confirmed by the crashed bookie. If the bookie crashes after the ledger
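A rough sketch of the ensemble change described in the quote above, in Python for brevity. It is not BookKeeper code: `replace_failed_bookie` and the dict-based metadata layout are hypothetical, and the sketch assumes the new fragment starts at the entry right after the last add confirmed.

```python
from typing import Dict, List


def replace_failed_bookie(
    ensembles: Dict[int, List[str]],
    last_add_confirmed: int,
    failed: str,
    replacement: str,
) -> Dict[int, List[str]]:
    """Return updated metadata after swapping out a failed bookie.

    `ensembles` maps the first entry id of each fragment to the bookies
    that store that fragment (hypothetical layout for illustration).
    """
    current_start = max(ensembles)
    current = ensembles[current_start]
    if failed not in current:
        raise ValueError(f"{failed} is not in the current ensemble")

    new_start = last_add_confirmed + 1
    if new_start < current_start:
        raise ValueError("new fragment cannot start before the current one")

    # Same ensemble, with the failed bookie swapped for the replacement.
    new_ensemble = [replacement if b == failed else b for b in current]

    # Earlier fragments are left untouched; only a new fragment is recorded.
    updated = dict(ensembles)
    updated[new_start] = new_ensemble
    return updated
```

For example, with entries up to 41 confirmed on `["b1", "b2", "b3"]` and `b2` failing, the metadata gains a second fragment starting at entry 42 on `["b1", "b4", "b3"]`, while the first fragment still names `b2`.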

Re: Log truncation and sync up when bookie fails and rejoins

2020-01-28 Thread Flavio Junqueira
Hi Umesh, The BookKeeper protocol is closer to register protocols than it is to consensus protocols, although we do need consensus for agreement on the closed state of a ledger. A ledger has a single writer, and either the writer closes it naturally or a reader needs to close it by running a re

Re: Log truncation and sync up when bookie fails and rejoins

2020-01-28 Thread Unmesh Joshi
And in case a write to one bookie in a quorum fails, and the entry is rewritten on another set of bookies, will the truncation of entries on that specific bookie beyond last add confirmed also be handled by auto recovery? So recovery does addition of missing entries, and truncation of ext

Re: Log truncation and sync up when bookie fails and rejoins

2020-01-28 Thread Ivan Kelly
> But who takes care of updating a particular Bookie in case it crashes (or is temporarily partitioned) and rejoins the cluster?
Autorecovery takes care of this. The metadata describes the entries that should exist on a bookie. If this doesn't match what actually exists on the bookie, autorecovery
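To make the "metadata vs. what actually exists" comparison concrete, here is a minimal Python sketch. It is an illustration only, not the autorecovery implementation: the function names and the metadata layout are hypothetical, and it assumes entries are striped round-robin across the ensemble with a fixed write quorum.

```python
from typing import Dict, List, Set


def bookies_for_entry(entry_id: int, ensemble: List[str], write_quorum: int) -> List[str]:
    """Bookies expected to hold an entry, assuming round-robin striping."""
    n = len(ensemble)
    return [ensemble[(entry_id + i) % n] for i in range(write_quorum)]


def missing_entries(
    ensembles: Dict[int, List[str]],
    last_entry_id: int,
    write_quorum: int,
    bookie: str,
    stored: Set[int],
) -> List[int]:
    """Entries the metadata says `bookie` should hold but that are not in `stored`.

    `ensembles` maps the first entry id of each fragment to its ensemble.
    """
    starts = sorted(ensembles)
    missing = []
    for idx, start in enumerate(starts):
        # A fragment ends just before the next one starts, or at the last entry.
        end = starts[idx + 1] - 1 if idx + 1 < len(starts) else last_entry_id
        for entry_id in range(start, end + 1):
            expected = bookies_for_entry(entry_id, ensembles[start], write_quorum)
            if bookie in expected and entry_id not in stored:
                missing.append(entry_id)
    return missing
```

A recovery process along these lines would then copy each missing entry from another bookie in that entry's write set.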

Log truncation and sync up when bookie fails and rejoins

2020-01-28 Thread Unmesh Joshi
Hi, In case of partial failures while implementing a Replicated Log, there are a few requirements which need to be fulfilled to sync logs on multiple nodes in case of node failure. e.g. In RAFT, if a node fails, there is a sync up that happens with communication from leader to push all the entries and