Any thoughts or comments? :)

Thanks a lot.
-Jia

On Tue, Sep 12, 2017 at 4:30 PM, Jia Zhai <zhaiji...@gmail.com> wrote:

> This blog post also touches briefly on the limitations of zookeeper in
> bookkeeper:
> https://bitworks.software/blog/en/2017-07-12-replicated-scalable-commitlog-with-apachebookkeeper.html
>
> On Thu, Sep 7, 2017 at 9:45 AM, Jia Zhai <zhaiji...@gmail.com> wrote:
>
>> 👍. Thanks a lot for the suggestions and feedback.
>>
>> On Thu, Sep 7, 2017 at 4:24 AM, Sijie Guo <guosi...@gmail.com> wrote:
>>
>>> On Wed, Sep 6, 2017 at 1:07 PM, Enrico Olivelli <eolive...@gmail.com>
>>> wrote:
>>>
>>> > Off topic curiosity... Jia and Sijie, do you think we are going to
>>> > drop ZK from DL too?
>>> >
>>>
>>> Yes. That's the goal: 1) for large deployments, we are trying to overcome
>>> the limitations of zookeeper; 2) for smaller deployments, it will make
>>> deployment much easier, since you just need to deploy a cluster of bookies.
>>> Once it is done, you can use the ledger API or the log stream API to access
>>> the bookkeeper cluster.
>>>
>>> Both DL and BK have pluggable metadata storage. They have very clear
>>> interfaces for defining metadata operations, so it is straightforward to
>>> use a different metadata storage.
>>>
>>> > Enrico
>>> >
>>> > On Wed, Sep 6, 2017, 19:51 Enrico Olivelli <eolive...@gmail.com> wrote:
>>> >
>>> > > On Wed, Sep 6, 2017, 18:25 Sijie Guo <guosi...@gmail.com> wrote:
>>> > >
>>> > >> On Sep 6, 2017 4:57 AM, "Enrico Olivelli" <eolive...@gmail.com>
>>> > >> wrote:
>>> > >>
>>> > >> Thank you Sijie and Jia for your comments and explanations,
>>> > >> answers inline
>>> > >>
>>> > >> 2017-09-06 2:23 GMT+02:00 Jia Zhai <zhaiji...@gmail.com>:
>>> > >>
>>> > >> > Thanks a lot Enrico and Sijie for your comments and information on
>>> > >> > this.
>>> > >> >
>>> > >> > On Tue, Sep 5, 2017 at 9:31 PM, Enrico Olivelli <eolive...@gmail.com>
>>> > >> > wrote:
>>> > >> >
>>> > >> > > Great to see you working on this!
>>> > >> > > It would be great to have such a feature, as it is the first step
>>> > >> > > to a 'standalone' BookKeeper mode.
>>> > >> > >
>>> > >> > > Some complementary ideas/first look questions:
>>> > >> > > - the document does not talk about security. IMHO we have at least
>>> > >> > > to cover authentication and TLS; it would be great to leverage
>>> > >> > > existing AuthPlugins, as they are based on exchanging byte[] (as
>>> > >> > > SASL wants)
>>> > >> > >
>>> > >> > [Jia] It is a good idea. We left the security part out for now for a
>>> > >> > few reasons: 1) to keep this BP focused on removing zookeeper
>>> > >> > dependencies from the client; 2) it is introduced as a separate
>>> > >> > implementation of existing interfaces, so it won’t impact the existing
>>> > >> > security story. And for sure, we will add the security part later,
>>> > >> > after this.
>>> > >> >
>>> > >>
>>> > >> I am fine with that. I am only afraid that we won't be able to support
>>> > >> it in the (near) future. Maybe you could just cite the security story
>>> > >> and add some reference to how we would deal with it in the future.
>>> > >>
>>> > >>
>>> > >> The new ledger manager will first be marked as experimental, until it
>>> > >> is stable and has security features.
>>> > >>
>>> > >> How does that sound?
>>> > >>
>>> > >
>>> > > Ok
>>> > >
>>> > >> > > - do we have some kind of "bootstrap servers list" configuration
>>> > >> > > option? Should the list be complete or just a subset of bookies? At
>>> > >> > > connection the client could discover the list of other bookies
>>> > >> > >
>>> > >> > [Jia] Yes, we will have a `clientBootstrapBookies` setting in the
>>> > >> > server set. It can be a list of bookies or just simply a DNS over the
>>> > >> > bookies.
>>> > >> > Will add this to the BP
>>> > >> >
>>> > >> > > - will the client connect to only one bookie at a time? How will we
>>> > >> > > deal with errors?
>>> > >> > >
>>> > >> > [Jia] It will connect to the list of bootstrap servers. gRPC will load
>>> > >> > balance the requests and manage the connection errors.
>>> > >> >
>>> > >> > > - should the bookie write its gRPC endpoint info to the ZK metadata?
>>> > >> > > (this will be useful for a bookie to tell the connected clients
>>> > >> > > about other bookies)
>>> > >> > >
>>> > >> > [Jia] No, it won’t. We don’t see a strong reason to add it, especially
>>> > >> > since eventually we may eliminate zookeeper completely.
>>> > >> > It can be a fixed port `3281`, or in a scheduler-based environment, it
>>> > >> > is very easy to have a load balancer sitting in front of those
>>> > >> > bookies.
>>> > >> >
>>> > >>
>>> > >> I think a fixed port is not a good way.
>>> > >> You will not be able to run more than one bookie on a single host.
>>> > >>
>>> > >> We should support:
>>> > >> - configurable port
>>> > >> - ephemeral port for tests
>>> > >>
>>> > >>
>>> > >> I think what Jia means is a configurable port, but one that is
>>> > >> relatively fixed, in that the client doesn't discover this port from
>>> > >> zookeeper.
>>> > >>
>>> > >
>>> > > Very good
>>> > >
>>> > >>
>>> > >> Ideally I would like to have the local transport option, in order to
>>> > >> have a single JVM, but this is not a blocker. As we are running gRPC on
>>> > >> netty it should be feasible, or we can create some kind of short-circuit
>>> > >> between the client and the Bookie
>>> > >>
>>> > >>
>>> > >> gRPC supports an in-process channel, so you don't need to use the low
>>> > >> level netty settings.
>>> > >>
>>> > >
>>> > > Great
>>> > >
>>> > > So it sounds all good to me thanks
>>> > >
>>> > > Enrico
>>> > >
>>> > >>
>>> > >> I am OK with not writing this to the bookie metadata, leaving it up to
>>> > >> the client to have a configured list of bookies enabled for metadata
>>> > >> operations
>>> > >>
>>> > >> > > - the bookie will be somehow a proxy for zookeeper. I think that the
>>> > >> > > 'watch' part is the most complex, we will have to deal with
>>> > >> > > reconnections, errors... maybe it is worth writing more detail
>>> > >> > > about this
>>> > >> > >
>>> > >> > [Jia] The `watch` API uses the `streaming` rpc in gRPC. It is a
>>> > >> > straightforward proxy behavior: if a connection is broken, the client
>>> > >> > will simply retry watching again.
>>> > >> >
>>> > >> > > Minor issues:
>>> > >> > > - Maybe you can consider using ledgerId and not ledger_id, like in
>>> > >> > > LedgerMetadataFormat we are using lastEntryId
>>> > >> > >
>>> > >> > [Jia] Thanks. It is the protobuf style. Protobuf will convert
>>> > >> > `ledger_id` to `ledgerId`. We don’t need to worry about this.
>>> > >> >
>>> > >>
>>> > >> got it, thanks
>>> > >>
>>> > >> > > - In the "motivation" part you write that having more clients than
>>> > >> > > the number of bookies would be a problem for zookeeper; actually
>>> > >> > > zookeeper is very good at dealing with a huge number of clients.
>>> > >> > > I am always running clusters with 3-5 bookies and 10-100 writing
>>> > >> > > clients and this has never given troubles
>>> > >> >
>>> > >> > [Jia] :) Seems “10-100 writing clients” is not “a huge number of
>>> > >> > clients”.
>>> > >> >
>>> > >>
>>> > >> OK, I agree with you and Sijie, I have no experience of larger clusters
>>> > >>
>>> > >> > > Future:
>>> > >> > > - as bookies will be proxies maybe we should take care not to
>>> > >> > > overwhelm a bookie with too many clients
>>> > >> > >
>>> > >> > [Jia] First, gRPC is based on Netty and the protocol is HTTP/2, so the
>>> > >> > connection is multiplexed. We don’t need to worry about connection
>>> > >> > count. Second, all the bookies are treated equally for the metadata
>>> > >> > operations, and gRPC will load balance the requests across the
>>> > >> > bookies. We don’t need to worry about some bookies being overwhelmed.
>>> > >> >
>>> > >>
>>> > >> gRPC sounds great
>>> > >>
>>> > >> > > - iteration on ledgers: sometimes the client enumerates ledgers but
>>> > >> > > it is not interested in having all of them. As we are using the
>>> > >> > > bookie as a proxy, maybe some kind of "filter" (at least on custom
>>> > >> > > metadata) would be great to limit the number of returned items.
>>> > >> > > Another point: I don't know gRPC, but it does not seem to be very
>>> > >> > > clear how to 'stop' the iteration
>>> > >> > >
>>> > >> > [Jia] Thanks. We can add it later. For now, we would like to focus on
>>> > >> > adding the features the ledger manager needs.
>>> > >> >
>>> > >>
>>> > >> Yup
>>> > >>
>>> > >> -- Enrico
>>> > >>
>>> > >> > > -- Enrico
>>> > >> > >
>>> > >> > > 2017-09-05 15:10 GMT+02:00 Jia Zhai <zhaiji...@gmail.com>:
>>> > >> > >
>>> > >> > > > Hi all,
>>> > >> > > >
>>> > >> > > > I have just posted a proposal to remove the zookeeper dependency
>>> > >> > > > from the bookkeeper client, to make the bookkeeper client a thin
>>> > >> > > > client:
>>> > >> > > >
>>> > >> > > > https://cwiki.apache.org/confluence/display/BOOKKEEPER/BP-16%3A+remove+zookeeper+dependency+from+bookkeeper+client
>>> > >> > > >
>>> > >> > > > BookKeeper uses zookeeper for service discovery (discovering the
>>> > >> > > > available bookies in the cluster) and metadata management (storing
>>> > >> > > > all the metadata for ledgers). However, it exposes the metadata
>>> > >> > > > storage directly to the clients, making the bookkeeper client a
>>> > >> > > > very thick client. It also exposes some problems.
>>> > >> > > >
>>> > >> > > > This BP explores the possibility of eliminating zookeeper
>>> > >> > > > completely from the client side, to produce a thin bookkeeper
>>> > >> > > > client.
>>> > >> > > >
>>> > >> > > > I will send a patch as soon as we agree on the proposal.
>>> > >> > > >
>>> > >> > > > Thanks.
>>> > >> > > >
>>> > >> > > > -Jia
>>> > >> > > >
>>> > > --
>>> > >
>>> > > -- Enrico Olivelli
>>> > >
>>> > --
>>> >
>>> > -- Enrico Olivelli
>>> >
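
For the `watch` discussion in the thread (a streaming rpc where, if the connection is broken, the client simply retries watching again), the behavior could be sketched roughly as below. This is only an illustration under stated assumptions: it is written in Python for brevity, it does not use gRPC, and `StreamBroken`, `watch_with_retry` and `make_flaky_stream` are hypothetical names made up for this sketch. They are not part of BookKeeper or of BP-16.

```python
import time
from typing import Callable, Iterable, List


class StreamBroken(Exception):
    """Hypothetical error standing in for a dropped watch connection."""


def watch_with_retry(
    open_stream: Callable[[], Iterable[str]],
    on_event: Callable[[str], None],
    max_retries: int = 3,
    backoff_seconds: float = 0.0,
) -> int:
    """Consume watch events, re-opening the stream whenever it breaks.

    Returns how many times the watch had to be re-established.
    """
    retries = 0
    while True:
        try:
            for event in open_stream():
                on_event(event)
            return retries  # the stream ended cleanly
        except StreamBroken:
            if retries >= max_retries:
                raise  # give up and surface the error to the caller
            retries += 1
            time.sleep(backoff_seconds)  # pause before watching again


def make_flaky_stream() -> Callable[[], Iterable[str]]:
    """Build a fake stream opener whose first connection breaks mid-stream."""
    attempts = {"n": 0}

    def open_stream() -> Iterable[str]:
        attempts["n"] += 1
        yield "ledger-1 updated"
        if attempts["n"] == 1:
            raise StreamBroken("connection reset")
        yield "ledger-2 updated"

    return open_stream


if __name__ == "__main__":
    seen: List[str] = []
    count = watch_with_retry(make_flaky_stream(), seen.append)
    print(count, seen)
```

In this sketch the first event is delivered a second time after the re-subscribe, so a watcher built this way would presumably need to tolerate repeated notifications. How the real proxy would resume a watch is not specified in the thread.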