On Tue, Feb 13, 2018 at 2:35 PM Enrico Olivelli <eolive...@gmail.com> wrote:

> All clear Sijie,
> IMHO many API function names do not follow the usual Java naming
> conventions. I know this is a large code contribution, but do you think it
> will be possible to discuss the 'names' before releasing the first version?


Which ones don’t follow the Java naming conventions?

They should follow the BookKeeper conventions, since we are making this part of BK.

>
>
> Last question: this will be a contrib module, are we going to release it
> with regular releases of BK ?


Yes, it will be released with the regular releases. We will keep it as a
contrib or preview module for a few releases before declaring it mature.


>
> Thanks
> Enrico
>
> Il mar 13 feb 2018, 00:47 Sijie Guo <guosi...@gmail.com> ha scritto:
>
> > On Tue, Feb 13, 2018 at 6:35 AM Enrico Olivelli <eolive...@gmail.com>
> > wrote:
> >
> > > (Maybe it is better to comment on the google doc, but these are very
> > > high-level questions)
> > >
> > > Some questions:
> > > 1) I see we initially still need zookeeper. It would be interesting to
> > > know if you want to drop it completely in the future. Certainly the
> > > usage of zk in this case will be very limited, because it will only
> > > have to support Helix.
> >
> >
> > This BP does not directly remove zookeeper. It is more about providing a
> > key/value service, which can be used later for storing user ledger
> > metadata. That can then reduce the amount of metadata in zookeeper to some
> > system ledgers. We will have a separate BP to address the metadata
> > bootstrap problem.
> >
> > In other words, we have to have a key/value service in place before we
> > talk about removing zookeeper. That is the reason for doing key/value
> > first.
> >
> >
> >
> > >
> > > 2) I see we are going to use DL for checkpoints, which may in turn
> > > need this system (as one motivation is the support of DL metadata). It
> > > seems to me to be some kind of circular dependency. Can you explain how
> > > to implement this?
> >
> >
> > DL is used because of its namespacing and reopenable features. It is
> > similar to bookkeeper ledger metadata itself. Some system dlogs will
> > still use zookeeper, but they will be addressed by the metadata bootstrap
> > in a separate BP once key/value is mature (as mentioned in 1).
> >
> >
> > >
> > > 3) I see that checkpoints will be done by copying raw RocksDB files. I
> > > have no experience with RocksDB; is it safe to directly copy files and
> > > obtain a consistent snapshot?
> >
> > https://github.com/facebook/rocksdb/wiki/Checkpoints
> >
> >
> >
> > >
> > > 4) If BookKeeper needs this service for metadata and checkpoints are
> > > written to BK itself, how can the system boot? Surely I am missing one
> > > piece.
> >
> >
> > See the comment at 1)
> >
> >
> > >
> > > Great work
> > >
> > > Enrico
> > >
> > > Il lun 12 feb 2018, 02:07 Sijie Guo <guosi...@gmail.com> ha scritto:
> > >
> > > > Thanks JV and Enrico.
> > > >
> > > > I would like to include this as a contrib module in bookkeeper for
> > > > 4.7, just as bookkeeper itself grew out of a contrib module in
> > > > zookeeper.
> > > >
> > > > So if the idea sounds good to you guys, and if you guys think this is
> > > > aligned with the bookkeeper roadmap, let’s try to move this forward
> > > > as a contrib module in bookkeeper and continue the development in
> > > > bookkeeper.
> > > >
> > > > If there are no major concerns, I would like to call a vote this
> > > > week.
> > > >
> > > > Sijie
> > > >
> > > >
> > > > On Thu, Feb 8, 2018 at 12:01 AM Venkateswara Rao Jujjuri <
> > > > jujj...@gmail.com>
> > > > wrote:
> > > >
> > > > > A great step forward. BP-29 and BP-30, along with reorganizing ZK,
> > > > > will help BK shape a perfect MDS abstraction.
> > > > > While BP-30 is ambitious, it is a perfect way to start ambitious
> > > > > projects. :)
> > > > >
> > > > > JV
> > > > >
> > > > > On Wed, Feb 7, 2018 at 6:49 AM, Enrico Olivelli <
> eolive...@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > It is very interesting! Thank you.
> > > > > > I will look into it soon
> > > > > >
> > > > > > Enrico
> > > > > >
> > > > > > Il mer 7 feb 2018, 15:24 Sijie Guo <guosi...@gmail.com> ha
> > scritto:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > > I started a proposal to contribute a table (aka key/value)
> > > > > > > > service component as a contrib module to the bookkeeper
> > > > > > > > community. This BP, together with the other BPs I sent last
> > > > > > > > week, forms the idea of how we can improve metadata
> > > > > > > > management in bookkeeper (I will talk a bit more about this
> > > > > > > > at the end of this email).
> > > > > > >
> > > > > > > **why it was developed**
> > > > > > >
> > > > > > > > Two main categories of use cases drove the need for a
> > > > > > > > key/value-like service.
> > > > > > >
> > > > > > > > One is metadata storage. BookKeeper needs a key/value-like
> > > > > > > > store (currently zookeeper) for its ledger metadata, and
> > > > > > > > systems built on top of bookkeeper, like
> > > > > > > > distributedlog/pulsar, follow the same pattern. They all
> > > > > > > > need a key/value-like store for their metadata. We all know
> > > > > > > > zookeeper is the scalability bottleneck. It is also a source
> > > > > > > > of issues in production systems (based on my biased
> > > > > > > > production experience).
> > > > > > >
> > > > > > > > The other one is state storage in real-time/streaming
> > > > > > > > analytics/computation. In streaming analytics, computation
> > > > > > > > jobs usually process streaming data. They usually need to
> > > > > > > > store some sort of state for the computation operators in a
> > > > > > > > store and serve that computation state as final results for
> > > > > > > > queries. That state is usually represented in key/value form
> > > > > > > > and usually backed by a WAL. BookKeeper has been used in
> > > > > > > > this area via distributedlog/pulsar for storing and serving
> > > > > > > > log/streaming data. It would be ideal for bookkeeper to also
> > > > > > > > be able to store and serve state data, for the sake of
> > > > > > > > unification and simplification, and to reduce the complexity
> > > > > > > > of deployment and operations.
> > > > > > >
> > > > > > > > Hence we prototyped/developed a table service component as
> > > > > > > > an add-on to bookkeeper. We'd like to contribute this as a
> > > > > > > > contrib module to bookkeeper and continue the development,
> > > > > > > > integration and evaluation in the bookkeeper community.
> > > > > > >
> > > > > > > > We hope this can be like bookkeeper in zookeeper: bookkeeper
> > > > > > > > was a contrib module in zookeeper, was developed in that
> > > > > > > > community, and grew into what it is now.
> > > > > > >
> > > > > > > **how it is aligned with metadata storage**
> > > > > > >
> > > > > > > > BP-28, BP-29 and BP-30 are related to some extent.
> > > > > > >
> > > > > > > > BP-28 is more of a cleanup proposal to carry on Jia's work
> > > > > > > > (on service discovery interfaces). It aims to produce a
> > > > > > > > clean metadata API module, define a clean dependency between
> > > > > > > > the bookkeeper implementation and the metadata service, and
> > > > > > > > allow us to really plug in different metadata services
> > > > > > > > without touching/changing the bookkeeper implementation.
> > > > > > >
> > > > > > > > BP-29 and BP-30 can be thought of as two different metadata
> > > > > > > > service implementations based on the metadata API contract
> > > > > > > > defined in BP-28.
> > > > > > >
> > > > > > > > BP-29 is to use Etcd as the metadata service, while BP-30 is
> > > > > > > > to have a built-in key/value service as the metadata
> > > > > > > > service. Both BP-29 and BP-30 have pros and cons. However,
> > > > > > > > they are not opposed to each other. Allowing two concurrent
> > > > > > > > approaches will help us understand more about metadata
> > > > > > > > management in bookkeeper and its ecosystem (e.g. dlog,
> > > > > > > > pulsar), which will lead the project in a healthy direction.
> > > > > > >
> > > > > > > **Proposed Changes**
> > > > > > >
> > > > > > > > This proposal adds the table service as a contrib module
> > > > > > > > under the `stream` directory, just as we handle `dlog`. We
> > > > > > > > can mark it as "preview"/"alpha" in 4.7 and continue the
> > > > > > > > development of this module in the bookkeeper community.
> > > > > > > >
> > > > > > > > The details of the proposal can be found in the google doc
> > > > > > > > linked below:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > https://docs.google.com/document/d/155xAwWv5IdOitHh1NVMEwCMGgB28M3FyMiQSxEpjE-Y/edit#heading=h.56rbh52koe3f
> > > > > > >
> > > > > > > Please take a look. Comments are welcome.
> > > > > > >
> > > > > > > - Sijie
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > >
> > > > > > -- Enrico Olivelli
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Jvrao
> > > > > ---
> > > > > First they ignore you, then they laugh at you, then they fight you,
> > > then
> > > > > you win. - Mahatma Gandhi
> > > > >
> > > >
> > >
> > >
> > > --
> > >
> > >
> > > -- Enrico Olivelli
> > >
> >
>
>
> --
>
>
> -- Enrico Olivelli
>
