Awesome work Jack! I haven't worked with formal verification but have always been interested. Can you share the tools, thinking processes, setup etc that went into this?
On Mon, Feb 22, 2021 at 9:19 AM Jack Vanlightly <jvanligh...@splunk.com.invalid> wrote: > Hi all, > > I have written a TLA+ specification for the replication protocol. It models > the life of a single ledger and includes two writers, the original writer > that creates the ledger and a second writer that can at any point (or not) > start the recovery process. > > It found one problem with the fencing that allowed for a data loss scenario > to occur. You can find the details of that particular problem in issue 2614 > ( > https://urldefense.com/v3/__https://github.com/apache/bookkeeper/issues/2614__;!!DCbAVzZNrAf4!VUV5pNTOHyZQOg1ZhwuWT7JIcxobv26MM14ErklQukaMnxsSftnbe-mMrmNe4qfOnA$ > ). > > The specification is here: > https://urldefense.com/v3/__https://github.com/Vanlightly/bookkeeper-tlaplus__;!!DCbAVzZNrAf4!VUV5pNTOHyZQOg1ZhwuWT7JIcxobv26MM14ErklQukaMnxsSftnbe-mMrmOuWn47BA$ > > The state space for a 4 broker deployment with a Qw 3, Qa 2 ledger is > pretty large and takes days to run on a 96 core beast. However, you can run > it with smaller ensemble sizes to run in mere hours. > > Check it out if you're interested in formal verification. Also if there are > other aspects of BookKeeper that could benefit from verification let me > know. > > Best regards > Jack > > > -- > *Jack Vanlightly* > Principal Software Engineer > Splunk Inc. > jvanligh...@splunk.com <kramas...@splunk.com> > Barcelona > -- Karan Mehta <https://smart.salesforce.com/sig/k.mehta//us_mb/default/link.html>