On Tue, Jun 2, 2020, 17:09 Unmesh Joshi <unmeshjo...@gmail.com> wrote:
> >>> This sounds strange and you are not the first one that is asking this
> >>> question
> If the order is changed, writing to journal ahead of writing to ledger,
> will it make any difference?

AFAIK it should not make any difference.

> >>> The acknowledgement is sent to the client only after a successful
> >>> fdatasync on the journal (if you do not ask for DEFERRED_SYNC or
> >>> disable fsyncs explicitly)
> Ah, I missed the callback passed in the QueueEntry. The flush
> implementation, though, seems to be writing to file (BufferedChannel.flush);
> it doesn't seem to be doing an actual fileChannel.force?

The callback is only called after 'force'.
I don't have my laptop here now but I am sure.
We have a background thread that performs 'force', in order to group
commits.
Please check.

> >>> it is super fast and it
> >>> guarantees the data have been persisted durably.
> Just curious, are there any throughput/latency tests to look at?

We only have a benchmark tool, but no public results.
I suggest you run benchmarks on your use case. We will be happy to help
you.

Enrico

> Thanks,
> Unmesh
>
> On Tue, Jun 2, 2020 at 7:23 PM Enrico Olivelli <eolive...@gmail.com>
> wrote:
>
> > On Tue, Jun 2, 2020, 15:20 Unmesh Joshi <unmeshjo...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I was going through the bookkeeper code, particularly to see when and
> > > how transaction logs are written and flushed to disk.
> > > Just curious to understand why, in the Bookie.addEntryInternal method,
> > > writes to the journal happen after the writes to the ledger. (
> > > https://github.com/apache/bookkeeper/blob/master/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Bookie.java
> > > )
> > > Also, journal writes are not flushed to disk synchronously, as they
> > > happen in their own dedicated thread (and can also be done in batches).
> > > So I had two questions.
> > > 1. Why are journal writes not done before the writes to ledgers?
> >
> > This sounds strange and you are not the first one that is asking this
> > question.
> > Basically, entries in BK are immutable, and when the bookie restarts it
> > replays the journal.
> > The LAC protocol shields reader clients from reading entries that have
> > not been acknowledged to the writer.
> >
> > > 2. Why not wait until journal writes are successful (even if perhaps
> > > not synced to disk) before returning the response?
> >
> > The acknowledgement is sent to the client only after a successful
> > fdatasync on the journal (if you do not ask for DEFERRED_SYNC or disable
> > fsyncs explicitly).
> > This is basically one of the core features of BK: it is super fast and it
> > guarantees the data have been persisted durably.
> >
> > Enrico
> >
> > > If these things are not done, is there always a risk of losing data in
> > > case of a server or disk crash?
> > >
> > > Thanks,
> > > Unmesh
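
For readers following the thread, the group-commit idea mentioned above (a background thread that performs 'force' and only then invokes the callbacks) can be sketched roughly as follows. This is a simplified illustration and not BookKeeper's actual Journal code: the class `GroupCommitSketch`, the `PendingEntry` type, and the `add`/`shutdown` methods are all hypothetical names chosen for this example. The only real API used is `java.nio.channels.FileChannel`, whose `force(false)` asks the OS to flush file content (not necessarily metadata) to the storage device.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of group commit; NOT BookKeeper's real Journal class.
final class GroupCommitSketch {

    // One queued write plus the callback that acknowledges it (hypothetical type).
    static final class PendingEntry {
        final ByteBuffer data;
        final Runnable onDurable;

        PendingEntry(ByteBuffer data, Runnable onDurable) {
            this.data = data;
            this.onDurable = onDurable;
        }
    }

    private final BlockingQueue<PendingEntry> queue = new LinkedBlockingQueue<>();
    private final FileChannel channel;
    private final Thread journalThread;
    private volatile boolean running = true;

    GroupCommitSketch(Path journalFile) throws IOException {
        channel = FileChannel.open(journalFile,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
        journalThread = new Thread(this::run, "journal-sketch");
        journalThread.start();
    }

    // Enqueue an entry and return immediately; the callback fires after force().
    void add(byte[] payload, Runnable onDurable) {
        queue.add(new PendingEntry(ByteBuffer.wrap(payload), onDurable));
    }

    private void run() {
        List<PendingEntry> batch = new ArrayList<>();
        try {
            while (running || !queue.isEmpty()) {
                PendingEntry first = queue.poll(10, TimeUnit.MILLISECONDS);
                if (first == null) {
                    continue;
                }
                batch.add(first);
                queue.drainTo(batch); // group everything that arrived meanwhile

                for (PendingEntry e : batch) {
                    while (e.data.hasRemaining()) {
                        channel.write(e.data); // reaches the OS cache only
                    }
                }
                channel.force(false); // one sync covers the whole batch

                for (PendingEntry e : batch) {
                    e.onDurable.run(); // acknowledge only after the sync
                }
                batch.clear();
            }
        } catch (IOException | InterruptedException ex) {
            // Sketch only: a real system would fail the pending callbacks here.
            Thread.currentThread().interrupt();
        }
    }

    // Stop accepting work, let the thread drain the queue, then close the file.
    void shutdown() throws IOException, InterruptedException {
        running = false;
        journalThread.join();
        channel.close();
    }
}
```

A caller would construct the sketch with a journal file path, call `add(bytes, callback)` for each entry, and treat the callback as the point at which a response may be sent. The point of the design is that a single `force` is amortized over every entry in the batch, so callers get a durability guarantee without paying for one sync per entry.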