Hi Jack,

> I've recently modelled the BookKeeper protocol in TLA+ and can confirm that
> once confirmed, that an entry is not replayed to another bookie.

Should I assume that you modeled it after the code? Otherwise, what did you use 
as a reference? Is the TLA+ spec available anywhere? It sounds like a good 
development.

> once confirmed, that an entry is not replayed to another bookie.

I'd like to understand this a bit better. I think this is saying that if I have
an entry e that has been acknowledged by only AQ bookies, with AQ < WQ, and at
least one bookie b in the ledger ensemble crashes before it writes e, then e is
still considered confirmed and, when b is replaced with b' for the ledger, e is
not replicated on b'.

If that's the case, then isn't it a bug?
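
To make the scenario concrete, here is a minimal sketch in Python. It is a toy
model under my reading of the thread, not BookKeeper code: the Bookie class,
add_entry, and the values WQ = 3 and AQ = 2 are all hypothetical, and the
"no replay to b'" step is simply assumed as described above.

```python
# Toy model of the scenario above; not BookKeeper code. All names are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Bookie:
    name: str
    alive: bool = True
    entries: set = field(default_factory=set)

    def write(self, entry_id: int) -> bool:
        """Store the entry and return True only if the bookie is alive."""
        if not self.alive:
            return False
        self.entries.add(entry_id)
        return True


def add_entry(ensemble, entry_id, ack_quorum):
    """Send the entry to every bookie; confirm once ack_quorum acks arrive."""
    acks = sum(1 for bookie in ensemble if bookie.write(entry_id))
    return acks >= ack_quorum


WQ, AQ = 3, 2
b1, b2, b = Bookie("b1"), Bookie("b2"), Bookie("b")
b.alive = False  # b crashes before it writes e

e = 0
confirmed = add_entry([b1, b2, b], e, AQ)

# Ensemble change: b is replaced by b'. Assumption taken from the thread:
# the already-confirmed entry e is NOT replayed to b'.
b_prime = Bookie("b'")
new_ensemble = [b1, b2, b_prime]

replicas = [bookie.name for bookie in new_ensemble if e in bookie.entries]
print(confirmed, replicas)  # True ['b1', 'b2']
```

In this model e is confirmed but ends up on only AQ of the WQ bookies in the
new ensemble, which is the hole I am asking about.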

> the new data integrity check that Ivan worked on, when run periodically
> will be able to repair that hole.

This is good, but I'm not sure this is a replacement for a proper fix.

Please let me know if I'm missing anything.

-Flavio 

> On 11 Jan 2021, at 09:31, Jack Vanlightly <jvanligh...@splunk.com.INVALID> 
> wrote:
> 
> Hi,
> 
> I've recently modelled the BookKeeper protocol in TLA+ and can confirm that
> once confirmed, that an entry is not replayed to another bookie. This
> leaves a "hole" as the entry is now replicated only to 2 bookies, however,
> the new data integrity check that Ivan worked on, when run periodically
> will be able to repair that hole.
> 
> Jack
> 
> On Sat, Jan 9, 2021 at 1:06 AM Venkateswara Rao Jujjuri <jujj...@gmail.com>
> wrote:
> 
>> On Fri, Jan 8, 2021 at 2:29 PM Matteo Merli <matteo.me...@gmail.com>
>> wrote:
>> 
>>> On Fri, Jan 8, 2021 at 2:15 PM Venkateswara Rao Jujjuri
>>> <jujj...@gmail.com> wrote:
>>>> 
>>>>> otherwise the write will time out internally and it will get replayed
>>>>> to a new bookie.
>>>> If Qa is met and the writes of Qw-Qa fail after we send the success to
>>>> the client, why would the write be replayed on a new bookie?
>>> 
>>> I think the original intention was to avoid having 1 bookie with a
>>> "hole" in the entries sequence. If you then lose one of the 2 bookies,
>>> it would be difficult to know which entries need to be recovered.
>>> 
>> 
>> @Matteo Merli <matteo.me...@gmail.com> I don't believe we retry the write
>> on a bookie if Qa is satisfied and the write to that bookie timed out.
>> Once the entry is ack'ed to the client we move the LAC and can't
>> retroactively change the active segment's ensemble.
>> 
>>> will get replayed to a new bookie
>> This will happen only if we are not able to satisfy Qa and go through
>> ensemble changes.
>> We change the ensemble and retry the write only if a bookie write fails
>> before satisfying Qa.
>> We have added a new feature to handle "delayed write failure", but that
>> applies only to new entries, not retroactively.
>> 
>> I may be missing something here, and not understanding your point.
>> 
>> Thanks,
>> JV
