Hello,
It looks like we all want to add a feature to BookKeeper to allow us to
skip writes to the journals.

We already have these patches:
https://github.com/apache/bookkeeper/pull/2401
https://github.com/apache/bookkeeper/pull/2157
and IIRC there are also tentative BPs.

The limitation of these patches is that they simply skip all of the
writes to the journal, and this in turn is a big problem:
- if you restart the bookie it is likely that you lose your data, and
especially the 'fenced' flag
- clients cannot rely on most of the guarantees that BK provides

Another problem is that those implementations work on a per-bookie
basis. I understand that the user in those cases is Pulsar, and usually you
do not share your BK cluster with other applications (but is that really
true? Think about Pulsar Functions and the BK StreamStorage service...).

By the way, this is not the case for us at EmailSuccess.com and at
MagNews.com, where we share the bookies with other components
(like HerdDB, DistributedLog, BlobIt).

Skipping the journal is a good trade-off in several cases, because it makes
writes much faster and also reduces write amplification.

I would like to wrap up all of this stuff and provide a feature to BK, to
be used consistently by all of the users.

I think that it would be far better to have a WriteFlag to enable this
feature. This way each client would be able to express its own
durability constraints and service level, instead of the whole bookie.
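
To make the idea concrete, here is a minimal, self-contained sketch of a
per-write flag. It does not use the real BookKeeper classes: the enum
SketchWriteFlag, the value SKIP_JOURNAL and the method mustJournal are
hypothetical names invented for illustration (the real client API already
has a WriteFlag enum with DEFERRED_SYNC, which is the kind of thing I have
in mind).

```java
import java.util.EnumSet;

class JournalFlagSketch {
    // Hypothetical per-write flag; SKIP_JOURNAL is an illustrative name,
    // not an existing BookKeeper flag.
    enum SketchWriteFlag { SKIP_JOURNAL }

    // Bookie-side decision: journal the entry unless the client opted out.
    static boolean mustJournal(EnumSet<SketchWriteFlag> flags) {
        return !flags.contains(SketchWriteFlag.SKIP_JOURNAL);
    }

    public static void main(String[] args) {
        // A client with strict durability needs passes no flags.
        EnumSet<SketchWriteFlag> strict = EnumSet.noneOf(SketchWriteFlag.class);
        // A client that accepts weaker guarantees opts out explicitly.
        EnumSet<SketchWriteFlag> relaxed = EnumSet.of(SketchWriteFlag.SKIP_JOURNAL);

        System.out.println("strict client journaled:  " + mustJournal(strict));
        System.out.println("relaxed client journaled: " + mustJournal(relaxed));
    }
}
```

The point of the sketch is that the choice travels with the write, so two
applications sharing the same bookie can ask for different guarantees.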

Also, when the Bookie is not writing to the Journal, after a restart it
should tell clients that it is not able to return data for a given
ledger, or to say whether the ledger has been fenced. IIUC Ivan and Matteo
already have this change in their private fork.
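
A rough sketch of that behaviour, under my own assumptions and not taken
from the existing patches or from that fork: the bookie durably remembers
which ledgers received non-journaled writes, and after a restart it answers
UNKNOWN for those ledgers instead of guessing. All the names here
(RestartStateSketch, Answer, recordUnjournaledWrite, simulateRestart) are
hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class RestartStateSketch {
    enum Answer { FENCED, NOT_FENCED, UNKNOWN }

    // Assumption: this set is persisted outside the journal before the
    // first non-journaled write to a ledger is acknowledged.
    private final Set<Long> unjournaledLedgers = new HashSet<>();
    private final Set<Long> fencedLedgers = new HashSet<>();
    private boolean restarted = false;

    void recordUnjournaledWrite(long ledgerId) {
        unjournaledLedgers.add(ledgerId);
    }

    void fence(long ledgerId) {
        fencedLedgers.add(ledgerId);
    }

    // Models a restart: fence state of ledgers that skipped the journal
    // can no longer be trusted, so it is dropped.
    void simulateRestart() {
        fencedLedgers.removeAll(unjournaledLedgers);
        restarted = true;
    }

    Answer isFenced(long ledgerId) {
        if (fencedLedgers.contains(ledgerId)) {
            return Answer.FENCED;
        }
        if (restarted && unjournaledLedgers.contains(ledgerId)) {
            // Tell the client explicitly that this bookie cannot answer.
            return Answer.UNKNOWN;
        }
        return Answer.NOT_FENCED;
    }

    public static void main(String[] args) {
        RestartStateSketch bookie = new RestartStateSketch();
        bookie.recordUnjournaledWrite(1L);
        bookie.fence(1L);
        bookie.fence(2L); // ledger 2 only ever used the journal
        bookie.simulateRestart();
        System.out.println("ledger 1: " + bookie.isFenced(1L));
        System.out.println("ledger 2: " + bookie.isFenced(2L));
    }
}
```

The same "I cannot answer" response would apply to reads of entries of such
a ledger, so that the client can go to another bookie of the ensemble.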


I will be happy to start a BP or to help any other volunteer in writing it.
We should work as a community on this topic.

Thoughts?
Enrico
