On Wed, Sep 6, 2017, 18:25 Sijie Guo <guosi...@gmail.com> wrote:

> On Sep 6, 2017 4:57 AM, "Enrico Olivelli" <eolive...@gmail.com> wrote:
>
> Thank you Sijie and Jia for your comments and explanations,
> answers inline
>
> 2017-09-06 2:23 GMT+02:00 Jia Zhai <zhaiji...@gmail.com>:
>
> > Thanks a lot Enrico and Sijie for your comments and information on this.
> >
> > On Tue, Sep 5, 2017 at 9:31 PM, Enrico Olivelli <eolive...@gmail.com>
> > wrote:
> >
> > > Great to see you working on this!
> > > It would be great to have such a feature, as it is the first step to a
> > > 'standalone' BookKeeper mode
> > >
> > > Some complementary ideas/first look questions:
> > > - the document does not talk about security, IMHO we have at least to
> > > cover authentication and TLS, it would be great to leverage existing
> > > AuthPlugins, as they are based on exchanging byte[] (as SASL wants)
> > >
> > [Jia] It is a good idea. We left the security part for now for a few
> > reasons. 1) Make this BP more focused on removing zookeeper dependencies
> > from client. 2) It is introduced as a separate implementation of existing
> > interfaces. So it won’t impact the existing security story. And for sure,
> > we will add the security part later after this.
> >
>
> I am fine, I am only afraid that we won't be able to support it in the
> (near) future,
> maybe you could just cite the security story and add some reference to
> how we would deal with it in future
>
>
> The new ledger manager will be first marked as experimental, until it is
> stable and has security features.
>
> How does that sound?
>
Ok

> > > - do we have some kind of "bootstrap servers list" configuration option ?
> > > the list should be complete or just a subset of bookies ? at connection
> > > the client could discover the list of other bookies
> > >
> > [Jia] Yes, we will have a `clientBootstrapBookies` setting in the server
> > set. It can be a list of bookies or just simply a DNS over the bookies.
> > Will add this to the BP
> >
> > > - will the client connect to only one bookie at a time ? how will we deal
> > > with errors ?
> > >
> > [Jia] It will connect to the list of bootstrap servers. gRPC will load
> > balance the requests and manage the connection errors.
> >
> > > - should the bookie write on ZK metadata its gRPC endpoint info ? (this
> > > will be useful for a bookie to tell about other bookies to the connected
> > > clients)
> > >
> > [Jia] No, it won’t. We don’t see a strong reason to add it. Especially
> > eventually we may eliminate zookeeper completely.
> > It can be a fixed port `3281`, or in a scheduler-based environment, it is
> > very easy to have a load balancer sitting in front of those bookies.
> >
>
> I think a fixed port is not a good way.
> You will not be able to run more than one bookie on a single host.
>
> We should support:
> - configurable port
> - ephemeral port for tests
>
>
> I think what Jia means is a configurable port, but it is a relatively fixed
> port, which the client doesn't discover from zookeeper.
>

Very good

> Ideally I would like to have the local transport option, in order to have a
> single JVM, but this is not a blocker problem, as we are running gRPC on
> netty it should be feasible or we can create some kind of short-circuit
> between the client and the Bookie
>
>
> gRPC supports an in-process channel. So you don't need to use the low level
> netty settings.
>

Great
So it sounds all good to me
thanks
Enrico

> I am OK for not writing this to the bookie metadata, leaving it up to the
> client to have a configured list of bookies enabled for metadata operations
>
> > > - the bookie will be somehow a proxy for zookeeper, I think that the
> > > 'watch' part is the more complex, we will have to deal with
> > > reconnections, errors....maybe it is worth writing more detail about this
> > >
> > [Jia] The `watch` API is using the `streaming` rpc in gRPC. It is a
> > straightforward proxy behavior, if a connection is broken, the client
> > will simply retry on watching again.
> >
> > > Minor issues:
> > > - Maybe you can consider using ledgerId and not ledger_id, like in
> > > LedgerMetadataFormat we are using lastEntryId
> > >
> > [Jia] Thanks, it is the protobuf style. Protobuf will convert `ledger_id`
> > to `ledgerId`. We don’t need to worry about this.
> >
>
> got it, thanks
>
> > > - In the "motivation" part you write that having more clients
> > > than the number of bookies would be a problem for zookeeper, actually
> > > zookeeper is very good at dealing with a huge number of clients.
> > > Actually I am always running clusters with 3-5 bookies and 10-100
> > > writing clients and this has never given troubles
> > >
> > [Jia] :) Seems “10-100 writing clients” is not “a huge number of clients”.
> >
>
> OK, I agree with you and Sijie, I have no experience of larger clusters
>
> > > Future:
> > > - as bookies will be proxies maybe we should take care not to overwhelm
> > > a bookie with too many clients
> > >
> > [Jia] First, gRPC is based on Netty, the protocol is http2, so the
> > connection is multiplexed. We don’t need to worry about connection count.
> > Second, all the bookies are treated equally for the metadata operations,
> > gRPC will load balance the requests across the bookies.
> > We don’t need to worry about some bookies being overwhelmed.
> >
>
> gRPC sounds great
>
> > > - iteration on ledgers, sometimes the client enumerates ledgers but it
> > > is not interested in having all of them, as we are using the bookie as
> > > proxy maybe some kind of "filter" (at least on custom metadata) would be
> > > great to limit the number of returned items. Other point, I don't know
> > > gRPC but it does not seem to be very clear how to 'stop' the iteration
> > >
> > [Jia] Thanks, we can add it later. For now, we would like to focus on
> > adding the features the ledger manager needs.
> >
>
> Yup
>
> -- Enrico
>
> > > -- Enrico
> > >
> > > 2017-09-05 15:10 GMT+02:00 Jia Zhai <zhaiji...@gmail.com>:
> > >
> > > > Hi all,
> > > >
> > > > I have just posted a proposal to remove zookeeper dependency from
> > > > bookkeeper client, to make bookkeeper client a thin client:
> > > >
> > > > https://cwiki.apache.org/confluence/display/BOOKKEEPER/BP-16%3A+remove+zookeeper+dependency+from+bookkeeper+client
> > > >
> > > > BookKeeper uses zookeeper for service discovery (discovering the
> > > > available bookies in the cluster), metadata management (storing all
> > > > the metadata for ledgers). However it exposes the metadata storage
> > > > directly to the clients, making bookkeeper client a very thick client.
> > > > It also exposes some problems.
> > > >
> > > > This BP explores the possibility of eliminating zookeeper completely
> > > > from client side, to produce a thin bookkeeper client.
> > > >
> > > > I will send a patch as soon as we agree on the proposal.
> > > >
> > > > Thanks.
> > > >
> > > > -Jia
> > > >

--
-- Enrico Olivelli
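Editorial illustration of the `watch` behavior described in the thread ("if a connection is broken, the client will simply retry on watching again"). This is only a minimal sketch of such a retry loop, not code from BP-16 or BookKeeper. The names `watch_with_retry`, `open_stream`, `on_event` and `make_flaky_stream` are hypothetical, and `open_stream` is a plain-Python stand-in for the gRPC streaming call, so no gRPC API is used here.

```python
import time
from typing import Callable, Iterable


def watch_with_retry(open_stream: Callable[[], Iterable[str]],
                     on_event: Callable[[str], None],
                     max_attempts: int = 5,
                     backoff_seconds: float = 0.0) -> int:
    """Consume a watch stream, re-opening it whenever the connection breaks.

    open_stream is a hypothetical stand-in for the streaming rpc: it returns
    an iterable of events and may raise ConnectionError while iterating.
    Returns the number of times the watch had to be re-opened.
    """
    retries = 0
    for attempt in range(max_attempts):
        try:
            for event in open_stream():
                on_event(event)
            return retries  # the stream ended cleanly
        except ConnectionError:
            retries += 1
            # Linear backoff before watching again (assumed policy).
            time.sleep(backoff_seconds * (attempt + 1))
    raise ConnectionError(
        "watch stream kept failing after %d attempts" % max_attempts)


def make_flaky_stream() -> Callable[[], Iterable[str]]:
    """Build a fake stream that breaks once, then delivers normally."""
    calls = {"n": 0}

    def open_stream() -> Iterable[str]:
        calls["n"] += 1
        if calls["n"] == 1:
            def broken():
                yield "ledger-1"
                raise ConnectionError("stream reset")
            return broken()
        return iter(["ledger-2"])

    return open_stream


if __name__ == "__main__":
    seen = []
    reopened = watch_with_retry(make_flaky_stream(), seen.append)
    print(reopened, seen)
```

In this sketch the client only resumes receiving events after it re-opens the stream. Whether it also needs to re-read the current metadata to catch changes missed while disconnected is not covered in the thread.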