Hi all,

I have started a proposal to contribute a table (aka key/value) service component as a contrib module to the BookKeeper community. This BP, together with the other BPs I sent last week, forms the overall idea of how we can improve metadata management in BookKeeper (I will say a bit more about this later in this email).

**why it was developed**

Two main categories of use cases drove the need for a key/value-like service.

The first is metadata storage. BookKeeper needs a key/value-like store (currently ZooKeeper) to hold ledger metadata, and systems built on top of BookKeeper, such as DistributedLog and Pulsar, follow the same pattern: they all need a key/value-like store for their metadata. We all know ZooKeeper is the scalability bottleneck. It is also a frequent source of issues in production systems (based on my own, admittedly biased, production experience).

The second is state storage in real-time/streaming analytics and computation. In streaming analytics, computation jobs usually process streaming data. They usually need to store some form of state for the computation operators, and to serve that state as final results for queries. This state is usually represented in key/value form and is usually backed by a write-ahead log (WAL). BookKeeper is already used in this area, via DistributedLog and Pulsar, for storing and serving log/streaming data. Ideally BookKeeper would also be able to store and serve state data, for the sake of unification and simplification, and to reduce the complexity of deployment and operations.

Hence we prototyped and developed a table service component as an add-on to BookKeeper. We'd like to contribute it as a contrib module to BookKeeper and continue its development, integration and evaluation in the BookKeeper community. We hope it can follow the path BookKeeper itself took in ZooKeeper: BookKeeper started as a contrib module in ZooKeeper, was developed in that community, and grew into what it is now.

**how it is aligned with metadata storage**

BP-28, BP-29 and BP-30 are related to some extent. BP-28 is more of a cleanup proposal to carry on Jia's work (on service discovery interfaces).
The goal of BP-28 is to produce a clean metadata API module, define a clean dependency between the BookKeeper implementation and the metadata service, and allow us to actually plug in different metadata services without touching the BookKeeper implementation.

BP-29 and BP-30 can be thought of as two different metadata service implementations based on the metadata API contract defined in BP-28. BP-29 uses Etcd as the metadata service, while BP-30 provides a built-in key/value service as the metadata service. Both BP-29 and BP-30 have pros and cons. However, they do not conflict with each other. Allowing the two approaches to proceed concurrently will help us understand more about metadata management in BookKeeper and its ecosystem (e.g. dlog, pulsar), which will help steer the project in a healthy direction.

**Proposed Changes**

This proposal is to add the table service as a contrib module under the `stream` directory, the same way we handle `dlog`. We can mark it as "preview"/"alpha" in 4.7 and continue developing the module in the BookKeeper community.

The details of the proposal can be found in the Google doc linked below:

https://docs.google.com/document/d/155xAwWv5IdOitHh1NVMEwCMGgB28M3FyMiQSxEpjE-Y/edit#heading=h.56rbh52koe3f

Please take a look. Comments are welcome.

- Sijie