Since there are no objections, I would like to mark this BP as approved.

On Wed, Sep 13, 2017 at 4:24 PM, Sijie Guo <guosi...@gmail.com> wrote:

> On Wed, Sep 13, 2017 at 1:18 AM, Enrico Olivelli <eolive...@gmail.com>
> wrote:
>
> > 2017-09-13 10:10 GMT+02:00 Sijie Guo <guosi...@gmail.com>:
> >
> > > On Wed, Sep 13, 2017 at 12:16 AM, Enrico Olivelli <eolive...@gmail.com
> >
> > > wrote:
> > >
> > > > I think that this is a good direction to go.
> > > >
> > > > > I believe the reasons about ZK in huge systems, even though that is not
> > > > > my case, so I cannot add comments on this use case.
> > > >
> > > > > I am fine with the direction as long as we are still going to support
> > > > > ZooKeeper.
> > > > BookKeeper is in the Hadoop / ZooKeeper ecosystem and several
> products
> > > rely
> > > > on ZK too, for instance in my systems it is usual to have
> > > > BookKeeper/Kafka/HBase/Majordodo....  and so I am not going to live
> > > > without
> > > > zookeeper in the short/mid term.
> > > >
> > > > > I am really OK with dropping ZK because for "simple" systems, when you
> > > > > need only BK, the burden of setting up a ZooKeeper server is awkward
> > > > > for customers. I usually redistribute BK + ZK with my applications, and
> > > > > we are talking about small clusters of up to 10 machines.
> > > >
> > >
> > > > Just to clarify - we are not dropping ZK here. We are just proposing to
> > > > have a ledger manager implementation that doesn't depend on zookeeper
> > > > directly.
> > > We are not modifying any existing ledger manager implementation.
> > >
> >
> >
> > Yep, we are on the same page
> > for this proposal the bookie will be a sort of "proxy" between the client
> > and the actual ledger manager implementation which will "live" inside the
> > bookie
> > it is only a new ledger manager to be used in clients, this ledger
> manager
> > will issue RPCs (or kind of "streaming" RPCs) to a list of bookies
> >
> >
> > >
> > >
> > > >
> > > > > The direction of this proposal is OK for me and it is very similar to
> > > > > the work I was starting on "standalone mode".
> > >
> > >
> > > > > I think it will be very easy to support the case of having a single
> > > > > bookie with this approach, or even client + bookie in the same JVM.
> > > > > Having multiple bookies will require us to add some other coordination
> > > > > facility between bookies. I would like to know if there is already some
> > > > > idea about this: are we going to use another product like etcd or
> > > > > jgroups, or implement our own coordination protocol?
> > >
> > >
> > > we are not replacing A with B, even with zookeeper. the ledger
> management
> > > is already abstracted in interfaces.
> > > the users can use whatever system they prefer as the metadata store.
> > >
> > > our direction is to provide an option to store metadata as well as data
> > in
> > > bookies. so in this option, there is no external metadata storage
> needed.
> > >
> >
> > Sorry. Maybe my question was not clear.
> > If you have multiple bookies and each bookie holds its own version of
> > metadata, how do you coordinate them? Which will be the source of truth?
> > Maybe we should start a new email thread in the future to talk about
> > "alternative distributed metadata storages"
> >
>
> It is out of the scope of this BP. We will have a follow-up BP to cover this
> part.
>
>
>
>
> >
> > Anyway, the meaning and the scope of the proposal are clear to me and I am
> > really OK with it. I hope it will get approved soon.
> >
> > -- Enrico
> >
> >
> > >
> > >
> > > > > ZK is simple but it is very effective.
> > > > > Maybe we could help the ZK community to move forward and resolve
> > > > > the problems we are bringing to light
> > > >
> > > >
> > > > Enrico
> > > >
> > > >
> > > > 2017-09-13 3:15 GMT+02:00 Jia Zhai <zhaiji...@gmail.com>:
> > > >
> > > > > Any thoughts or comments
> > > > > :)
> > > > >
> > > > > Thanks a lot.
> > > > > -Jia
> > > > >
> > > > > On Tue, Sep 12, 2017 at 4:30 PM, Jia Zhai <zhaiji...@gmail.com>
> > wrote:
> > > > >
> > > > > > > This blog post also touches on the limitations of zookeeper in
> > > > > > > bookkeeper:
> > > > > > > https://bitworks.software/blog/en/2017-07-12-replicated-scalable-commitlog-with-apachebookkeeper.html
> > > > > >
> > > > > > On Thu, Sep 7, 2017 at 9:45 AM, Jia Zhai <zhaiji...@gmail.com>
> > > wrote:
> > > > > >
> > > > > >> 👍. Thanks a lot for the suggestions and feed back.
> > > > > >>
> > > > > >> On Thu, Sep 7, 2017 at 4:24 AM, Sijie Guo <guosi...@gmail.com>
> > > wrote:
> > > > > >>
> > > > > >>> On Wed, Sep 6, 2017 at 1:07 PM, Enrico Olivelli <
> > > eolive...@gmail.com
> > > > >
> > > > > >>> wrote:
> > > > > >>>
> > > > > >>> > Off topic curiosity... Jia and Sijie, do you think we are
> going
> > > to
> > > > > >>> drop ZK
> > > > > >>> > from DL too?
> > > > > >>> >
> > > > > >>>
> > > > > >>> Yes. That's the goal - 1) for large deployment, we are trying
> to
> > > > > overcome
> > > > > >>> the limitation of zookeeper; 2) for smaller deployments, it
> will
> > > make
> > > > > >>> deployment much easier, you just need to deploy a cluster of
> > > bookies.
> > > > > >>> once
> > > > > >>> it is done, you can use ledger api or log stream api to access
> > the
> > > > > >>> bookkeeper cluster.
> > > > > >>>
> > > > > > >>> Both DL and BK have pluggable metadata storage. They have very
> > > > > > >>> clear interfaces defining metadata operations, so it is
> > > > > > >>> straightforward to use a different metadata storage.
> > > > > >>>
> > > > > >>>
> > > > > >>> > Enrico
> > > > > >>> >
> > > > > >>> > On mer 6 set 2017, 19:51 Enrico Olivelli <
> eolive...@gmail.com>
> > > > > wrote:
> > > > > >>> >
> > > > > >>> > >
> > > > > >>> > >
> > > > > >>> > > On mer 6 set 2017, 18:25 Sijie Guo <guosi...@gmail.com>
> > wrote:
> > > > > >>> > >
> > > > > >>> > >> On Sep 6, 2017 4:57 AM, "Enrico Olivelli" <
> > > eolive...@gmail.com>
> > > > > >>> wrote:
> > > > > >>> > >>
> > > > > >>> > >> Thank you Sijie and Jia for your comments and
> explanations,
> > > > > >>> > >> answers inline
> > > > > >>> > >>
> > > > > >>> > >> 2017-09-06 2:23 GMT+02:00 Jia Zhai <zhaiji...@gmail.com>:
> > > > > >>> > >>
> > > > > >>> > >> > Thanks a lot Enrico and Sijie for your comments and
> > > > information
> > > > > on
> > > > > >>> > this.
> > > > > >>> > >> >
> > > > > >>> > >> > On Tue, Sep 5, 2017 at 9:31 PM, Enrico Olivelli <
> > > > > >>> eolive...@gmail.com>
> > > > > >>> > >> > wrote:
> > > > > >>> > >> >
> > > > > >>> > >> > > Great to see you working on this !
> > > > > > >>> > >> > > It would be great to have such a feature, as it is the
> > > > > > >>> > >> > > first step to a 'standalone' BookKeeper mode
> > > > > >>> > >> > >
> > > > > >>> > >> > > Some complementary ideas/first look questions:
> > > > > >>> > >> > > - the document does not talk about security, IMHO we
> > have
> > > at
> > > > > >>> least
> > > > > >>> > to
> > > > > >>> > >> > cover
> > > > > >>> > >> > > authentication and TLS, it would be great to leverage
> > > > existing
> > > > > >>> > >> > AuthPlugins,
> > > > > >>> > >> > > as they are based on exchanging byte[] (as SASL wants)
> > > > > >>> > >> > >
> > > > > > >>> > >> > [Jia] It is a good idea. We left the security part out for
> > > > > > >>> > >> > now for a few reasons. 1) To keep this BP focused on
> > > > > > >>> > >> > removing zookeeper dependencies from the client. 2) It is
> > > > > > >>> > >> > introduced as a separate implementation of existing
> > > > > > >>> > >> > interfaces, so it won’t impact the existing security story.
> > > > > > >>> > >> > And for sure, we will add the security part later after
> > > > > > >>> > >> > this.
> > > > > >>> > >> >
> > > > > >>> > >>
> > > > > >>> > >>
> > > > > >>> > >> I am fine, I am only afraid that we won't be able to
> support
> > > it
> > > > in
> > > > > >>> the
> > > > > >>> > >> (near) future,
> > > > > >>> > >> maybe you could just only cite the security story and add
> > some
> > > > > >>> reference
> > > > > >>> > >> to
> > > > > >>> > >> how we would deal with it in future
> > > > > >>> > >>
> > > > > >>> > >>
> > > > > > >>> > >> The new ledger manager will first be marked as experimental,
> > > > > > >>> > >> until it is stable and has security features.
> > > > > >>> > >>
> > > > > >>> > >> How does that sound?
> > > > > >>> > >>
> > > > > >>> > >
> > > > > >>> > > Ok
> > > > > >>> > >
> > > > > >>> > >>
> > > > > >>> > >>
> > > > > >>> > >>
> > > > > >>> > >> >
> > > > > >>> > >> > - do we have some kind of "bootstrap servers list"
> > > > configuration
> > > > > >>> > option
> > > > > >>> > >> ?
> > > > > >>> > >> > > the list should be complete or just a subset of
> bookies
> > ?
> > > at
> > > > > >>> > >> connection
> > > > > >>> > >> > the
> > > > > >>> > >> > > client could discover the list of other bookies
> > > > > >>> > >> > >
> > > > > > >>> > >> > [Jia] Yes, we will have a `clientBootstrapBookies` setting
> > > > > > >>> > >> > in the server set. It can be a list of bookies or simply a
> > > > > > >>> > >> > DNS name over the bookies.
> > > > > > >>> > >> > Will add this to the BP
> > > > > >>> > >> >
> > > > > >>> > >> > - will the client connect to only one bookie at a time ?
> > how
> > > > we
> > > > > >>> will
> > > > > >>> > >> deal
> > > > > >>> > >> > > with errors ?
> > > > > >>> > >> > >
> > > > > > >>> > >> > [Jia] It will connect to the list of bootstrap servers.
> > > > > > >>> > >> > gRPC will load balance the requests and manage the
> > > > > > >>> > >> > connection errors.
> > > > > >>> > >> >
> > > > > >>> > >> > - should the bookie write on ZK metadata its gRPC
> endpoint
> > > > info
> > > > > ?
> > > > > >>> > (this
> > > > > >>> > >> > > will be useful for a bookie to tell about other
> bookies
> > to
> > > > the
> > > > > >>> > >> connected
> > > > > >>> > >> > > clients)
> > > > > >>> > >> > >
> > > > > > >>> > >> > [Jia] No, it won’t. We don’t see a strong reason to add it,
> > > > > > >>> > >> > especially since eventually we may eliminate zookeeper
> > > > > > >>> > >> > completely.
> > > > > > >>> > >> > It can be a fixed port `3281`, or in a scheduler-based
> > > > > > >>> > >> > environment, it is very easy to have a load balancer
> > > > > > >>> > >> > sitting in front of those bookies.
> > > > > >>> > >> >
> > > > > >>> > >>
> > > > > >>> > >> I think a fixed port is not a good way.
> > > > > >>> > >> You will not be able to run more than one bookie on a
> single
> > > > host.
> > > > > >>> > >>
> > > > > >>> > >> We should support:
> > > > > >>> > >> - configurable port
> > > > > >>> > >> - ephemeral port for tests
> > > > > >>> > >>
> > > > > >>> > >>
> > > > > > >>> > >> I think what Jia means is a configurable port, but a
> > > > > > >>> > >> relatively fixed one, which the client doesn't discover from
> > > > > > >>> > >> zookeeper.
> > > > > >>> > >>
> > > > > >>> > >
> > > > > >>> > > Very good
> > > > > >>> > >
> > > > > >>> > >>
> > > > > >>> > >>
> > > > > > >>> > >> Ideally I would like to have the local transport option, in
> > > > > > >>> > >> order to have a single JVM, but this is not a blocker. As we
> > > > > > >>> > >> are running gRPC on netty it should be feasible, or we can
> > > > > > >>> > >> create some kind of short-circuit between the client and the
> > > > > > >>> > >> Bookie
> > > > > >>> > >>
> > > > > >>> > >>
> > > > > > >>> > >> gRPC supports an in-process channel, so you don't need to use
> > > > > > >>> > >> the low-level netty settings.
> > > > > >>> > >>
> > > > > >>> > >
> > > > > >>> > > Great
> > > > > >>> > >
> > > > > >>> > > So it sounds all good to me thanks
> > > > > >>> > >
> > > > > >>> > > Enrico
> > > > > >>> > >
> > > > > >>> > >
> > > > > >>> > >>
> > > > > > >>> > >> I am OK with not writing this to the bookie metadata, leaving
> > > > > > >>> > >> it up to the client to have a configured list of bookies
> > > > > > >>> > >> enabled for metadata operations
> > > > > >>> > >>
> > > > > >>> > >>
> > > > > >>> > >>
> > > > > >>> > >>
> > > > > >>> > >> >
> > > > > > >>> > >> > > - the bookie will somehow be a proxy for zookeeper. I
> > > > > > >>> > >> > > think that the 'watch' part is the most complex: we will
> > > > > > >>> > >> > > have to deal with reconnections, errors... Maybe it is
> > > > > > >>> > >> > > worth writing more detail about this
> > > > > >>> > >> > >
> > > > > >>> > >> > [Jia] The `watch` API is using the `streaming` rpc in
> > gRPC.
> > > It
> > > > > is
> > > > > >>> a
> > > > > >>> > >> > straightforward proxy behavior, if a connection is
> broken,
> > > the
> > > > > >>> client
> > > > > >>> > >> will
> > > > > >>> > >> > simply retry on watching again.
> > > > > >>> > >> >
> > > > > >>> > >> >
> > > > > >>> > >> > > Minor issues:
> > > > > >>> > >> > > - Maybe you can consider using ledgerId and not
> > ledger_id,
> > > > > like
> > > > > >>> in
> > > > > >>> > >> > > LedgerMetadataFormat we are using lastEntryId
> > > > > >>> > >> > >
> > > > > >>> > >> > [Jia] Thanks, It is a protobuf style. The protobuf will
> > > > convert
> > > > > >>> > >> `ledger_id`
> > > > > >>> > >> > to `ledgerId`. We don’t need to worry about this.
> > > > > >>> > >> >
> > > > > >>> > >>
> > > > > >>> > >> got it, thanks
> > > > > >>> > >>
> > > > > >>> > >>
> > > > > >>> > >> >
> > > > > >>> > >> >
> > > > > > >>> > >> > > - In the "motivation" part you write that having more
> > > > > > >>> > >> > > clients than the number of bookies would be a problem for
> > > > > > >>> > >> > > zookeeper; actually zookeeper is very good at dealing
> > > > > > >>> > >> > > with a huge number of clients. I am always running
> > > > > > >>> > >> > > clusters with 3-5 bookies and 10-100 writing clients and
> > > > > > >>> > >> > > this has never given trouble
> > > > > >>> > >> >
> > > > > >>> > >> > [Jia] :) Seems “10-100 writing clients” is not “a huge
> > > number
> > > > of
> > > > > >>> > >> clients”.
> > > > > >>> > >> >
> > > > > >>> > >>
> > > > > > >>> > >> OK, I agree with you and Sijie, I have no experience of
> > > > > > >>> > >> larger clusters
> > > > > >>> > >>
> > > > > >>> > >>
> > > > > >>> > >> >
> > > > > >>> > >> > >
> > > > > >>> > >> >
> > > > > >>> > >> >
> > > > > >>> > >> >
> > > > > >>> > >> > > Future:
> > > > > >>> > >> > > - as bookies will be proxies maybe we should take care
> > not
> > > > to
> > > > > >>> > >> overwhelm
> > > > > >>> > >> a
> > > > > >>> > >> > > bookie with too many clients
> > > > > >>> > >> > >
> > > > > > >>> > >> > [Jia] First, gRPC is based on Netty and the protocol is
> > > > > > >>> > >> > http2, so the connection is multiplexed. We don’t need to
> > > > > > >>> > >> > worry about connection count.
> > > > > > >>> > >> > Second, all the bookies are treated equally for the
> > > > > > >>> > >> > metadata operations, and gRPC will load balance the
> > > > > > >>> > >> > requests across the bookies. We don’t need to worry about
> > > > > > >>> > >> > some bookies being overwhelmed.
> > > > > >>> > >> >
> > > > > >>> > >>
> > > > > >>> > >> gRPC sounds great
> > > > > >>> > >>
> > > > > >>> > >>
> > > > > >>> > >> >
> > > > > >>> > >> >
> > > > > > >>> > >> > > - iteration on ledgers: sometimes the client enumerates
> > > > > > >>> > >> > > ledgers but it is not interested in having all of them.
> > > > > > >>> > >> > > As we are using the bookie as a proxy, maybe some kind of
> > > > > > >>> > >> > > "filter" (at least on custom metadata) would be great to
> > > > > > >>> > >> > > limit the number of returned items. Another point: I
> > > > > > >>> > >> > > don't know gRPC, but it does not seem to be very clear
> > > > > > >>> > >> > > how to 'stop' the iteration
> > > > > >>> > >> > >
> > > > > >>> > >> > [Jia] Thanks, We can add it later. For now, we would
> like
> > to
> > > > > >>> focus on
> > > > > >>> > >> > adding the features the ledger manager needs.
> > > > > >>> > >> >
> > > > > >>> > >>
> > > > > >>> > >> Yup
> > > > > >>> > >>
> > > > > >>> > >> -- Enrico
> > > > > >>> > >>
> > > > > >>> > >>
> > > > > >>> > >> >
> > > > > >>> > >> > >
> > > > > >>> > >> > > -- Enrico
> > > > > >>> > >> > >
> > > > > >>> > >> > >
> > > > > >>> > >> > > 2017-09-05 15:10 GMT+02:00 Jia Zhai <
> > zhaiji...@gmail.com
> > > >:
> > > > > >>> > >> > >
> > > > > >>> > >> > > > Hi all,
> > > > > >>> > >> > > >
> > > > > >>> > >> > > > I have just posted a proposal to remove zookeeper
> > > > dependency
> > > > > >>> from
> > > > > >>> > >> > > > bookkeeper client, to make bookkeeper client a thin
> > > > client:
> > > > > >>> > >> > > >
> > > > > > >>> > >> > > > https://cwiki.apache.org/confluence/display/BOOKKEEPER/BP-16%3A+remove+zookeeper+dependency+from+bookkeeper+client
> > > > > >>> > >> > > >
> > > > > >>> > >> > > >
> > > > > >>> > >> > > > BookKeeper uses zookeeper for service discovery
> > > > (discovering
> > > > > >>> the
> > > > > >>> > >> > > available
> > > > > >>> > >> > > > bookies in the cluster), metadata management
> (storing
> > > all
> > > > > the
> > > > > >>> > >> metadata
> > > > > >>> > >> > > for
> > > > > >>> > >> > > > ledgers). However it exposes the metadata storage
> > > directly
> > > > > to
> > > > > >>> the
> > > > > >>> > >> > > clients,
> > > > > >>> > >> > > > making bookkeeper client a very thick client. It
> also
> > > > > exposes
> > > > > >>> some
> > > > > >>> > >> > > > problems.
> > > > > >>> > >> > > >
> > > > > >>> > >> > > > This BP explores the possibility of eliminating
> > > zookeeper
> > > > > >>> > completely
> > > > > >>> > >> > from
> > > > > >>> > >> > > > client side, to produce a thin bookkeeper client.
> > > > > >>> > >> > > >
> > > > > >>> > >> > > > I will send a patch as soon as we agree on the
> > proposal.
> > > > > >>> > >> > > >
> > > > > >>> > >> > > >
> > > > > >>> > >> > > > Thanks.
> > > > > >>> > >> > > >
> > > > > >>> > >> > > > -Jia
> > > > > >>> > >> > > >
> > > > > >>> > >> > >
> > > > > >>> > >> >
> > > > > >>> > >>
> > > > > >>> > > --
> > > > > >>> > >
> > > > > >>> > >
> > > > > >>> > > -- Enrico Olivelli
> > > > > >>> > >
> > > > > >>> > --
> > > > > >>> >
> > > > > >>> >
> > > > > >>> > -- Enrico Olivelli
> > > > > >>> >
> > > > > >>>
> > > > > >>
> > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>
