Hi,

I've recently modelled the BookKeeper protocol in TLA+ and can confirm that once an entry is confirmed, it is not replayed to another bookie. This leaves a "hole", as the entry is now replicated to only 2 bookies. However, the new data integrity check that Ivan worked on will be able to repair that hole when it is run periodically.
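
To make the two cases concrete, here is a rough Python sketch of the behaviour as discussed in this thread. It is a toy model only, not BookKeeper client code: `Entry`, `handle_write_failure`, and the bookie names are all made up for illustration, and it ignores the "delayed write failure" handling JV mentions.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    entry_id: int
    acked_by: set = field(default_factory=set)  # bookies that stored the entry


def handle_write_failure(entry, failed_bookie, ensemble, spare_bookies, ack_quorum):
    """Toy model (hypothetical names): decide what happens when one bookie write fails.

    Returns (ensemble, replayed), where `replayed` says whether the entry
    was re-sent to a replacement bookie.
    """
    if len(entry.acked_by) >= ack_quorum:
        # Qa already satisfied: the client has been acked and the LAC has moved,
        # so the entry is not replayed. The failed bookie is left with a "hole".
        return ensemble, False
    # Qa not yet satisfied: change the ensemble and retry on the new bookie.
    replacement = spare_bookies.pop(0)
    new_ensemble = [replacement if b == failed_bookie else b for b in ensemble]
    return new_ensemble, True


ensemble = ["b1", "b2", "b3"]  # Qw = 3, Qa = 2

confirmed = Entry(entry_id=7, acked_by={"b1", "b2"})
print(handle_write_failure(confirmed, "b3", ensemble, ["b4"], ack_quorum=2))
# (['b1', 'b2', 'b3'], False)

pending = Entry(entry_id=8, acked_by={"b1"})
print(handle_write_failure(pending, "b3", ensemble, ["b4"], ack_quorum=2))
# (['b1', 'b2', 'b4'], True)
```

In the first case entry 7 stays on only b1 and b2, which is the hole that a periodic integrity check would have to repair.
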
Jack

On Sat, Jan 9, 2021 at 1:06 AM Venkateswara Rao Jujjuri <jujj...@gmail.com> wrote:

> On Fri, Jan 8, 2021 at 2:29 PM Matteo Merli <matteo.me...@gmail.com>
> wrote:
>
> > On Fri, Jan 8, 2021 at 2:15 PM Venkateswara Rao Jujjuri
> > <jujj...@gmail.com> wrote:
> > >
> > > > otherwise the write will timeout internally and it will get
> > > > replayed to a new bookie.
> > >
> > > If Qa is met and the writes of Qw-Qa fail after we send the success
> > > to the client, why would the write be replayed on a new bookie?
> >
> > I think the original intention was to avoid having 1 bookie with a
> > "hole" in the entries sequence. If you then lose one of the 2 bookies,
> > it would be difficult to know which entries need to be recovered.
> >
>
> @Matteo Merli <matteo.me...@gmail.com> I don't believe we retry the write
> on a bookie if Qa is satisfied and the write to a bookie timed out.
> Once the entry is ack'ed to the client we move the LAC and can't
> retroactively change the active segment's ensemble.
>
> > will get replayed to a new bookie
>
> This will happen only if we are not able to satisfy Qa and go through
> ensemble changes.
> We change the ensemble and retry the write only if a bookie write fails
> before satisfying Qa.
> We have added a new feature called handling "delayed write failure", but
> that happens only for new entries, not retroactively.
>
> I may be missing something here, and not understanding your point.
>
> Thanks,
> JV
>
> --
> Jvrao
> ---
> First they ignore you, then they laugh at you, then they fight you, then
> you win. - Mahatma Gandhi