Thanks JV and Enrico. I would like to include this as a contrib module in bookkeeper for 4.7, just as bookkeeper itself grew out of a contrib module in zookeeper.
So if the idea sounds good to you, and if you think it is aligned with the bookkeeper roadmap, let's move this forward as a contrib module in bookkeeper and continue the development there. If there are no major concerns, I would like to call a vote this week.

Sijie

On Thu, Feb 8, 2018 at 12:01 AM Venkateswara Rao Jujjuri <jujj...@gmail.com> wrote:

> A great step to move forward. BP-29 and BP-30 along with reorganizing ZK
> will help the BK to shape perfect MDS abstraction.
> While BP-30 is ambitious, it is a perfect way to start ambitious projects. :)
>
> JV
>
> On Wed, Feb 7, 2018 at 6:49 AM, Enrico Olivelli <eolive...@gmail.com> wrote:
>
> > It is very interesting! Thank you.
> > I will look into it soon
> >
> > Enrico
> >
> > On Wed, Feb 7, 2018, 15:24 Sijie Guo <guosi...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > I started a proposal to contribute a table (aka key/value) service
> > > component as a contrib module to the bookkeeper community. This BP,
> > > together with the other BPs I sent last week, forms the idea of how we
> > > can improve metadata management in bookkeeper (I will talk a bit more
> > > about this at the end of this email).
> > >
> > > **why it was developed**
> > >
> > > Two main categories of use cases drove the need for a key/value-like
> > > service.
> > >
> > > One is metadata storage. Bookkeeper needs a key/value-like store
> > > (currently zookeeper) to hold the ledger metadata, and systems built on
> > > top of bookkeeper, like distributedlog/pulsar, follow the same pattern.
> > > They all need a key/value-like store for their metadata. We all know
> > > zookeeper is the scalability bottleneck. It is also a source of issues
> > > in production systems (based on my biased production experience).
> > >
> > > The other one is state storage in real-time/streaming
> > > analytics/computation. In streaming analytics, the computation jobs
> > > usually process streaming data. They usually need to store some sort of
> > > state of the computation operators in a store and serve that
> > > computation state as final results for queries. That state is usually
> > > represented in key/value form and usually backed by a WAL. BookKeeper
> > > has been used in this area via distributedlog/pulsar for storing and
> > > serving log / streaming data. It would be ideal for bookkeeper to also
> > > be able to store and serve state data, for the sake of unification and
> > > simplification, and to reduce the complexity of deployment and
> > > operations.
> > >
> > > Hence we prototyped/developed a table service component as an add-on to
> > > bookkeeper. We'd like to contribute this as a contrib module to
> > > bookkeeper and continue the development, integration and evaluation in
> > > the bookkeeper community.
> > >
> > > We hope this can be like bookkeeper in zookeeper. Bookkeeper was a
> > > contrib module in zookeeper; it was developed in the community and grew
> > > into what it is now.
> > >
> > > **how it is aligned with metadata storage**
> > >
> > > BP-28, BP-29 and BP-30 are related to some extent.
> > >
> > > BP-28 is more of a cleanup proposal to carry on Jia's work (on service
> > > discovery interfaces). It is meant to produce a clean metadata api
> > > module, define a clean dependency between the bookkeeper implementation
> > > and the metadata service, and allow us to plug in different metadata
> > > services without touching/changing the bookkeeper implementation.
> > >
> > > BP-29 and BP-30 can be thought of as two different metadata service
> > > implementations based on the metadata api contract defined in BP-28.
> > >
> > > BP-29 is to use Etcd as the metadata service, while BP-30 is to have a
> > > built-in key/value service as the metadata service. Both BP-29 and
> > > BP-30 have pros and cons. However, they are not opposed to each other.
> > > Allowing two concurrent approaches will help us understand more about
> > > metadata management in bookkeeper and its ecosystem (e.g. dlog,
> > > pulsar), which will lead the project in a healthy direction.
> > >
> > > **Proposed Changes**
> > >
> > > This proposal is to add this table service as a contrib module under
> > > the `stream` directory, just as we handle `dlog`. We can mark it as
> > > "preview"/"alpha" in 4.7 and continue the development of this module in
> > > the bookkeeper community.
> > >
> > > The details of the proposal can be found in the google doc linked
> > > below:
> > >
> > > https://docs.google.com/document/d/155xAwWv5IdOitHh1NVMEwCMGgB28M3FyMiQSxEpjE-Y/edit#heading=h.56rbh52koe3f
> > >
> > > Please take a look. Comments are welcome.
> > >
> > > - Sijie
> >
> > --
> > -- Enrico Olivelli
>
> --
> Jvrao
> ---
> First they ignore you, then they laugh at you, then they fight you, then
> you win. - Mahatma Gandhi