On Wed, 6 Sep 2017, 18:25 Sijie Guo <guosi...@gmail.com> wrote:

> On Sep 6, 2017 4:57 AM, "Enrico Olivelli" <eolive...@gmail.com> wrote:
>
> Thank you Sijie and Jia for your comments and explanations,
> answers inline
>
> 2017-09-06 2:23 GMT+02:00 Jia Zhai <zhaiji...@gmail.com>:
>
> > Thanks a lot Enrico and Sijie for your comments and information on this.
> >
> > On Tue, Sep 5, 2017 at 9:31 PM, Enrico Olivelli <eolive...@gmail.com>
> > wrote:
> >
> > > Great to see you working on this!
> > > It would be great to have such a feature, as it is the first step to a
> > > 'standalone' BookKeeper mode
> > >
> > > Some complementary ideas/first look questions:
> > > - the document does not talk about security. IMHO we have to cover at
> > > least authentication and TLS. It would be great to leverage the existing
> > > AuthPlugins, as they are based on exchanging byte[] (as SASL wants)
> > >
> > [Jia] It is a good idea. We left the security part out for now for a few
> > reasons. 1) To keep this BP focused on removing zookeeper dependencies
> > from the client. 2) It is introduced as a separate implementation of
> > existing interfaces, so it won’t impact the existing security story. And
> > for sure, we will add the security part later, after this.
> >
>
>
> I am fine with that. I am only afraid that we won't be able to support it
> in the (near) future.
> Maybe you could just mention the security story and add some reference to
> how we would deal with it in the future
>
>
> The new ledger manager will first be marked as experimental, until it is
> stable and has the security features.
>
> How does that sound?
>

Ok

>
>
>
> >
> > > - do we have some kind of "bootstrap servers list" configuration option?
> > > Should the list be complete or just a subset of bookies? At connection
> > > time the client could discover the list of other bookies
> > >
> > [Jia] Yes, we will have a `clientBootstrapBookies` setting in the server
> > set. It can be a list of bookies or simply a DNS name over the bookies.
> > Will add this to the BP.
> >
> > > - will the client connect to only one bookie at a time? How will we deal
> > > with errors?
> > >
> > [Jia] It will connect to the list of bootstrap servers. gRPC will load
> > balance the requests and manage the connection errors.
> >
> > > - should the bookie write its gRPC endpoint info to the ZK metadata? (this
> > > will be useful for a bookie to tell connected clients about the other
> > > bookies)
> > >
> > [Jia] No, it won’t. We don’t see a strong reason to add it, especially
> > since we may eventually eliminate zookeeper completely.
> > It can be a fixed port `3281`, or in a scheduler-based environment, it is
> > very easy to have a load balancer sitting in front of those bookies.
> >
>
> I don't think a fixed port is a good approach.
> You would not be able to run more than one bookie on a single host.
>
> We should support:
> - configurable port
> - ephemeral port for tests
>
>
> I think what Jia means is a configurable port, but a relatively fixed
> one: the client doesn't discover this port from zookeeper.
>

Very good
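
For reference, a minimal sketch of the two port modes discussed above
(configurable port, and ephemeral port for tests), using only java.net.
The property name `bookieMetadataPort` and the class `PortSketch` are
hypothetical and not taken from the BP; binding to port 0 asks the OS for
any free port, which is what makes several bookies on one host possible
in tests.

```java
import java.io.IOException;
import java.net.ServerSocket;

class PortSketch {
    // Hypothetical property name, used only for illustration.
    static final String PORT_PROPERTY = "bookieMetadataPort";

    // Reads the configured port; falls back to 0 (ephemeral) when unset.
    static int configuredPort() {
        return Integer.getInteger(PORT_PROPERTY, 0);
    }

    // Binds a listener; port 0 lets the OS pick any free port.
    static ServerSocket bind(int port) throws IOException {
        return new ServerSocket(port);
    }

    public static void main(String[] args) throws IOException {
        // Two listeners on the same host: one configured, one ephemeral.
        try (ServerSocket first = bind(configuredPort());
             ServerSocket second = bind(0)) {
            System.out.println("first listener on port " + first.getLocalPort());
            System.out.println("second listener on port " + second.getLocalPort());
        }
    }
}
```

Running it with `-DbookieMetadataPort=3281` would pin the first listener to
that port, while leaving the property unset gives both listeners an
OS-assigned port.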

>
>
> Ideally I would like to have the local transport option, in order to run
> in a single JVM, but this is not a blocker. As we are running gRPC on
> netty it should be feasible, or we can create some kind of short-circuit
> between the client and the Bookie
>
>
> gRPC supports an in-process channel, so you don't need to use the
> low-level netty settings.
>

Great

So it all sounds good to me, thanks

Enrico


>
> I am OK with not writing this to the bookie metadata, leaving it up to the
> client to have a configured list of bookies enabled for metadata operations
>
>
>
>
> >
> > > - the bookie will somehow be a proxy for zookeeper. I think that the
> > > 'watch' part is the most complex: we will have to deal with
> > > reconnections, errors... maybe it is worth writing more detail about this
> > >
> > [Jia] The `watch` API uses the `streaming` rpc in gRPC. It is a
> > straightforward proxy behavior: if a connection is broken, the client
> > will simply retry watching again.
> >
> >
> > > Minor issues:
> > > - Maybe you can consider using ledgerId and not ledger_id, like in
> > > LedgerMetadataFormat we are using lastEntryId
> > >
> > [Jia] Thanks. It is the protobuf style. Protobuf will convert
> > `ledger_id` to `ledgerId`. We don’t need to worry about this.
> >
>
> got it, thanks
>
>
> >
> >
> > > - In the "motivation" part you write that having more clients than
> > > the number of bookies would be a problem for zookeeper. Actually
> > > zookeeper is very good at dealing with a huge number of clients.
> > > I am always running clusters with 3-5 bookies and 10-100 writing
> > > clients and this has never caused trouble
> >
> > [Jia] :) Seems “10-100 writing clients” is not “a huge number of clients”.
> >
>
> OK, I agree with you and Sijie, I have no experience of larger clusters
>
>
> >
> > >
> >
> >
> >
> > > Future:
> > > - as bookies will be proxies maybe we should take care not to overwhelm
> > > a bookie with too many clients
> > >
> > [Jia] First, gRPC is based on Netty and the protocol is http2, so the
> > connection is multiplexed. We don’t need to worry about connection count.
> > Second, all the bookies are treated equally for the metadata operations,
> > and gRPC will load balance the requests across the bookies. We don’t need
> > to worry about some bookies being overwhelmed.
> >
>
> gRPC sounds great
>
>
> >
> >
> > > - iteration on ledgers: sometimes the client enumerates ledgers but is
> > > not interested in having all of them. As we are using the bookie as a
> > > proxy, maybe some kind of "filter" (at least on custom metadata) would
> > > be great to limit the number of returned items. Another point: I don't
> > > know gRPC, but it does not seem to be very clear how to 'stop' the
> > > iteration
> > >
> > [Jia] Thanks, we can add it later. For now, we would like to focus on
> > adding the features the ledger manager needs.
> >
>
> Yup
>
> -- Enrico
>
>
> >
> > >
> > > -- Enrico
> > >
> > >
> > > 2017-09-05 15:10 GMT+02:00 Jia Zhai <zhaiji...@gmail.com>:
> > >
> > > > Hi all,
> > > >
> > > > I have just posted a proposal to remove zookeeper dependency from
> > > > bookkeeper client, to make bookkeeper client a thin client:
> > > >
> > > > https://cwiki.apache.org/confluence/display/BOOKKEEPER/
> > > > BP-16%3A+remove+zookeeper+dependency+from+bookkeeper+client
> > > >
> > > >
> > > > BookKeeper uses zookeeper for service discovery (discovering the
> > > > available bookies in the cluster), metadata management (storing all
> > > > the metadata for ledgers). However it exposes the metadata storage
> > > > directly to the clients, making bookkeeper client a very thick
> > > > client. It also exposes some problems.
> > > >
> > > > This BP explores the possibility of eliminating zookeeper completely
> > > > from client side, to produce a thin bookkeeper client.
> > > >
> > > > I will send a patch as soon as we agree on the proposal.
> > > >
> > > >
> > > > Thanks.
> > > >
> > > > -Jia
> > > >
> > >
> >
>
-- 


-- Enrico Olivelli
