On Mon, Aug 6, 2018 at 12:08 AM Ivan Kelly <iv...@apache.org> wrote:

> >> Recovery operates on a few seconds of data (from the last LAC written
> >> to the end of the ledger, call this LLAC).
> >
> > the data during this duration can be very large if the traffic of the
> > ledger is large. That has been observed at Twitter's production. so
> > when we are talking about "a few seconds of data", we can't assume the
> > amount of data is little. That says the recovery can be taking time than
>
> Yes, it can be large, but still it is only a few seconds worth of
> data. It is the amount of data that can be transmitted in the period
> of one roundtrip, as the next roundtrip will update the LAC.
> I didn't mean to imply the data was small. I was implying that the
> data was small in comparison to the overall size of that ledger.
>
> > what we can expect, so if we don't handle failures during recovery how
> > we are able to ensure we have enough data copy during recovery.
>
> Consider an e3w3a2 ledger; there are two cases where you can lose a
> bookie during recovery.
>
> Case one, one bookie is lost. You can still recover, as ack=2 is
> available.
> Case two, two bookies are lost. You can't recover, but the ledger is
> unavailable anyhow, since any entry in the ledger may only have been
> replicated to 2.
>
> However, with e3w3a3 I guess you wouldn't be able to recover at all,
> and we have to handle that case.
>
> > I am not sure "make ledger metadata immutable" == "getting rid of
> > merging ledger metadata".
> > because I don't think these are same thing. making ledger metadata
> > immutable will make code much clearer and simpler because the ledger
> > metadata is immutable. how getting rid of merging ledger metadata is a
> > different thing, when you make ledger metadata immutable, it will help
> > make merging ledger metadata on conflicts clearer.
>
> I wouldn't call it merging in this case.

That's fine.

> Merging implies taking two
> valid pieces of metadata and getting another usable, valid metadata
> from them.
> What happens with immutable metadata is that you are taking one valid
> metadata, and applying operations to it. So in the failure-during-recovery
> case, we would have a list of AddEnsemble operations which
> we add when we try to close.
>
> In theory this is perfectly valid and clean. It just can look messy in
> the code, due to how the PendingAddOp reaches back into the ledger
> handle to get the current ensemble.
>

That's okay since it is reality which we have to face anyway. But the most
important thing is that we can't get rid of ensemble changes during ledger
recovery.

> So, in conclusion, I will keep the handling.

Thank you.

> In any case, these
> changes are all still blocked on
> https://github.com/apache/bookkeeper/pull/1577.
>
> -Ivan
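
For readers following the thread, the "one valid metadata plus a list of operations" idea discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions: `MetadataOpsSketch`, `Metadata`, `applyTo`, `applyAll` and the bookie names are hypothetical and are not BookKeeper's actual classes or API. `AddEnsemble` here only borrows the operation name used in the mail, and the real change may look quite different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch only: these are NOT BookKeeper's real classes.
final class MetadataOpsSketch {

    /** Immutable snapshot: first entry id of a segment -> bookies serving it. */
    static final class Metadata {
        final SortedMap<Long, List<String>> ensembles;

        Metadata(SortedMap<Long, List<String>> ensembles) {
            // Defensive deep copy so later changes by the caller cannot leak in.
            TreeMap<Long, List<String>> copy = new TreeMap<>();
            for (Map.Entry<Long, List<String>> e : ensembles.entrySet()) {
                copy.put(e.getKey(),
                         Collections.unmodifiableList(new ArrayList<>(e.getValue())));
            }
            this.ensembles = Collections.unmodifiableSortedMap(copy);
        }
    }

    /** One recorded ensemble change: from firstEntryId onwards, use these bookies. */
    static final class AddEnsemble {
        final long firstEntryId;
        final List<String> bookies;

        AddEnsemble(long firstEntryId, List<String> bookies) {
            this.firstEntryId = firstEntryId;
            this.bookies = Collections.unmodifiableList(new ArrayList<>(bookies));
        }

        /** Returns a new snapshot; the base snapshot is left untouched. */
        Metadata applyTo(Metadata base) {
            TreeMap<Long, List<String>> next = new TreeMap<>(base.ensembles);
            next.put(firstEntryId, bookies);
            return new Metadata(next);
        }
    }

    /** Replays the queued operations, in order, on top of one valid base. */
    static Metadata applyAll(Metadata base, List<AddEnsemble> ops) {
        Metadata current = base;
        for (AddEnsemble op : ops) {
            current = op.applyTo(current);
        }
        return current;
    }

    public static void main(String[] args) {
        TreeMap<Long, List<String>> initial = new TreeMap<>();
        initial.put(0L, Arrays.asList("b1", "b2", "b3"));
        Metadata base = new Metadata(initial);

        // Two bookie failures seen during recovery, queued until close.
        List<AddEnsemble> pending = Arrays.asList(
            new AddEnsemble(10L, Arrays.asList("b1", "b2", "b4")),
            new AddEnsemble(15L, Arrays.asList("b1", "b5", "b4")));

        Metadata atClose = applyAll(base, pending);
        System.out.println("at close: " + atClose.ensembles);
        System.out.println("base:     " + base.ensembles);
    }
}
```

The point of the sketch is that nothing is merged: there is a single valid base snapshot, and each queued operation produces a new snapshot from the previous one, so the base stays usable if the close attempt has to be retried.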