Hi all,

I hope you have all had a chance to go through the BP.

I've also made the repo of the table service that we have been working on public:
https://github.com/streamlio/stream-storage

I would like to call a vote on including this work as a contrib module in bookkeeper 4.7 as a preview, so that we can continue developing this idea for bookkeeper metadata storage in subsequent releases, much as bookkeeper itself grew out of a contrib module in zookeeper.

Here is the PR that adds this BP to the bookkeeper_proposals page:
https://github.com/apache/bookkeeper/pull/1185

Please take a look and cast your vote.

- Sijie

On Sun, Feb 11, 2018 at 5:07 PM, Sijie Guo <guosi...@gmail.com> wrote:

> Thanks JV and Enrico.
>
> I would like to include this as a contrib in bookkeeper for 4.7, like
> bookkeeper was grown from a contrib in zookeeper before.
>
> So if the idea sounds good to you guys, and if you guys think this is
> aligned with the bookkeeper roadmap, let’s try to move this forward with a
> contrib module in bookkeeper and continue the development in bookkeeper.
>
> If there are no major concerns, I would like to call a vote this week.
>
> Sijie
>
>
> On Thu, Feb 8, 2018 at 12:01 AM Venkateswara Rao Jujjuri <
> jujj...@gmail.com> wrote:
>
>> A great step to move forward. BP-29 and BP-30 along with reorganizing ZK
>> will help the BK to shape perfect MDS abstraction.
>> While BP-30 is ambitious, it is a perfect way to start ambitious projects.
>> :)
>>
>> JV
>>
>> On Wed, Feb 7, 2018 at 6:49 AM, Enrico Olivelli <eolive...@gmail.com>
>> wrote:
>>
>> > It is very interesting! Thank you.
>> > I will look into it soon
>> >
>> > Enrico
>> >
>> > On Wed, Feb 7, 2018, 15:24 Sijie Guo <guosi...@gmail.com> wrote:
>> >
>> > > Hi all,
>> > >
>> > > I started a proposal of contributing a table (aka key/value) service
>> > > component as a contrib module to the bookkeeper community.

>> > > This BP, together with the other BPs I sent last week, forms the idea
>> > > of how we can improve metadata management in bookkeeper (I will talk
>> > > a bit more about this at the end of this email).
>> > >
>> > > **why it was developed**
>> > >
>> > > Two main categories of use cases were driving the need for a
>> > > key/value-like service.
>> > >
>> > > One is metadata storage. Bookkeeper needs a key/value-like storage
>> > > (currently it is zookeeper) to store the ledger's metadata, and systems
>> > > built on top of bookkeeper, like distributedlog/pulsar, follow the same
>> > > pattern. They all need a key/value-like storage to store their
>> > > metadata. We all know zookeeper is the scalability bottleneck. It is
>> > > also a source of issues for production systems (based on my biased
>> > > production experience).
>> > >
>> > > The other one is state storage in real-time/streaming
>> > > analytics/computation. In streaming analytics, the computation jobs
>> > > usually process streaming data. They usually need to store some sort of
>> > > state of the computation operators in a storage and serve the
>> > > computation state as final results for queries. That state is usually
>> > > represented in key/value form and usually backed by a WAL. BookKeeper
>> > > has been used in this area via distributedlog/pulsar for storing and
>> > > serving log / streaming data. It would be ideal for bookkeeper to also
>> > > be able to store and serve state data, for the sake of unification and
>> > > simplification, and to reduce the complexity of deployment and
>> > > operations.
>> > >
>> > > Hence we prototyped/developed a table service component as an add-on
>> > > to bookkeeper. We'd like to contribute this as a contrib module to
>> > > bookkeeper and continue the development, integration and evaluation in
>> > > the bookkeeper community.
>> > >
>> > > We hope this can be like bookkeeper in zookeeper: bookkeeper was a
>> > > contrib module in zookeeper, was developed in the community, and grew
>> > > into what it is now.
>> > >
>> > > **how it is aligned with metadata storage**
>> > >
>> > > BP-28, BP-29 and BP-30 are related to some extent.
>> > >
>> > > BP-28 is more of a cleanup proposal to carry on Jia's work (on service
>> > > discovery interfaces). It is meant to produce a clean metadata api
>> > > module, define a clean dependency between the bookkeeper implementation
>> > > and the metadata service, and allow us to really plug in different
>> > > metadata services without touching/changing the bookkeeper
>> > > implementation.
>> > >
>> > > BP-29 and BP-30 can be thought of as two different metadata service
>> > > implementations based on the metadata api contract defined in BP-28.
>> > >
>> > > BP-29 is to use Etcd as the metadata service, while BP-30 is to have a
>> > > built-in key/value service as the metadata service. Both BP-29 and BP-30
>> > > have pros and cons. However, they are not opposed to each other.
>> > > Allowing two concurrent approaches will help us understand more about
>> > > metadata management in bookkeeper and its ecosystem (e.g. dlog,
>> > > pulsar), which will lead the project in a healthy direction.
>> > >
>> > > **Proposed Changes**
>> > >
>> > > This proposal is to add this table service as a contrib module under
>> > > the `stream` directory, just as we handle `dlog`. We can mark it as
>> > > "preview"/"alpha" in 4.7 and continue the development of this module in
>> > > the bookkeeper community.
>> > >
>> > > The details of the proposal can be found in the google doc attached
>> > > below:
>> > >
>> > > https://docs.google.com/document/d/155xAwWv5IdOitHh1NVMEwCMGgB28M3FyMiQSxEpjE-Y/edit#heading=h.56rbh52koe3f
>> > >
>> > > Please take a look. Comments are welcome.
>> > >
>> > > - Sijie
>> > >
>> >
>> > --
>> >
>> > -- Enrico Olivelli
>> >
>>
>> --
>> Jvrao
>> ---
>> First they ignore you, then they laugh at you, then they fight you, then
>> you win. - Mahatma Gandhi
>>
>