Hi all,

Hope you guys have gone through the BP.

I've also made the repo for the table service that we have been working on
public: https://github.com/streamlio/stream-storage

I would like to call a vote on including this work as a contrib module in
bookkeeper for 4.7 as a preview, so we can continue developing this idea
for bookkeeper metadata storage in subsequent releases, much like
bookkeeper itself was developed as a contrib module in zookeeper.

Here is the PR for adding this BP to the bookkeeper_proposals page:
https://github.com/apache/bookkeeper/pull/1185

Please take a look and cast your vote.

- Sijie

On Sun, Feb 11, 2018 at 5:07 PM, Sijie Guo <guosi...@gmail.com> wrote:

> Thanks JV and Enrico.
>
> I would like to include this as a contrib module in bookkeeper for 4.7,
> just as bookkeeper grew out of a contrib module in zookeeper.
>
> So if the idea sounds good to you guys, and if you guys think this is
> aligned with the bookkeeper roadmap, let's try to move this forward with a
> contrib module in bookkeeper and continue the development in bookkeeper.
>
> If there are no major concerns, I would like to call a vote this week.
>
> Sijie
>
>
> On Thu, Feb 8, 2018 at 12:01 AM Venkateswara Rao Jujjuri <
> jujj...@gmail.com> wrote:
>
>> A great step forward. BP-29 and BP-30, along with reorganizing ZK,
>> will help BK shape a perfect MDS abstraction.
>> While BP-30 is ambitious, it is a perfect way to start ambitious projects.
>> :)
>>
>> JV
>>
>> On Wed, Feb 7, 2018 at 6:49 AM, Enrico Olivelli <eolive...@gmail.com>
>> wrote:
>>
>> > It is very interesting! Thank you.
>> > I will look into it soon
>> >
>> > Enrico
>> >
>> > On Wed, Feb 7, 2018, 15:24 Sijie Guo <guosi...@gmail.com> wrote:
>> >
>> > > Hi all,
>> > >
>> > > I started a proposal to contribute a table (aka key/value) service
>> > > component as a contrib module to the bookkeeper community. This BP,
>> > > together with the other BPs I sent last week, forms the idea of how we
>> > > can improve metadata management in bookkeeper (I will talk a bit more
>> > > about this at the end of this email).
>> > >
>> > > **why it was developed**
>> > >
>> > > Two main categories of use cases drove the need for a key/value-like
>> > > service.
>> > >
>> > > One is metadata storage. Bookkeeper needs a key/value-like store
>> > > (currently zookeeper) for the ledger's metadata, and systems built on
>> > > top of bookkeeper, like distributedlog/pulsar, follow the same pattern.
>> > > They all need a key/value-like store for their metadata. We all know
>> > > zookeeper is the scalability bottleneck. It is also a source of issues
>> > > in production systems (based on my biased production experience).
>> > >
>> > > The other is state storage in real-time/streaming analytics and
>> > > computation. In streaming analytics, computation jobs usually process
>> > > streaming data. They usually need to store some state of the
>> > > computation operators in a storage system and serve that state as
>> > > final results for queries. This state is usually represented in
>> > > key/value form and usually backed by a WAL. BookKeeper has been used
>> > > in this area via distributedlog/pulsar for storing and serving log /
>> > > streaming data. It would be ideal for bookkeeper to also be able to
>> > > store and serve state data, for the sake of unification and
>> > > simplification, and to reduce the complexity of deployment and
>> > > operations.
>> > >
>> > > Hence we prototyped/developed a table service component as an add-on to
>> > > bookkeeper. We'd like to contribute it as a contrib module to
>> > > bookkeeper and continue its development, integration and evaluation in
>> > > the bookkeeper community.
>> > >
>> > > We hope this can follow the path bookkeeper took in zookeeper: bookkeeper
>> > > was a contrib module in zookeeper, was developed in that community, and
>> > > grew into what it is now.
>> > >
>> > > **how it is aligned with metadata storage**
>> > >
>> > > BP-28, BP-29 and BP-30 are related to some extent.
>> > >
>> > > BP-28 is more of a cleanup proposal that carries on Jia's work (on
>> > > service discovery interfaces). Its goal is to produce a clean metadata
>> > > api module, define a clean dependency between the bookkeeper
>> > > implementation and the metadata service, and allow us to really plug in
>> > > different metadata services without touching/changing the bookkeeper
>> > > implementation.
>> > >
>> > > BP-29 and BP-30 can be thought of as two different metadata service
>> > > implementations based on the metadata api contract defined in BP-28.
>> > >
>> > > BP-29 uses Etcd as the metadata service, while BP-30 provides a
>> > > built-in key/value service as the metadata service. Both BP-29 and
>> > > BP-30 have pros and cons. However, they do not conflict with each other.
>> > > Allowing two concurrent approaches will help us understand more about
>> > > metadata management in bookkeeper and its ecosystem (e.g. dlog, pulsar),
>> > > which will lead the project in a healthy direction.
>> > >
>> > > **Proposed Changes**
>> > >
>> > > This proposal is to add the table service as a contrib module under the
>> > > `stream` directory, just as we handle `dlog`. We can mark it as
>> > > "preview"/"alpha" in 4.7 and continue the development of this module in
>> > > the bookkeeper community.
>> > >
>> > > The details of the proposal can be found in the google doc linked below:
>> > >
>> > >
>> > > https://docs.google.com/document/d/155xAwWv5IdOitHh1NVMEwCMGgB28M3FyMiQSxEpjE-Y/edit#heading=h.56rbh52koe3f
>> > >
>> > > Please take a look. Comments are welcome.
>> > >
>> > > - Sijie
>> > >
>> >
>> >
>> > --
>> >
>> >
>> > -- Enrico Olivelli
>> >
>>
>>
>>
>> --
>> Jvrao
>> ---
>> First they ignore you, then they laugh at you, then they fight you, then
>> you win. - Mahatma Gandhi
>>
>
