On Mon, Nov 9, 2015, at 10:24 PM, Kevin Carter wrote:
> Hello all,
>
> The rationale behind using a solution like zookeeper makes sense; however, in reviewing the thread I found myself asking if there was a better way to address the problem without adding a Java-based solution as the default. While it has been covered that the current implementation would be a reference and that "other" driver support in Tooz would allow for any backend a deployer may want, the work being proposed within devstack [0] would become the default development case, thus making it the de facto standard, and I think we could do better in terms of supporting developers and delivering capability.
>
> My thoughts on using Redis+Redislock instead of Java+Zookeeper as the default option:
> * Tooz already supports redislock.
> * Redis has an established cluster system known for general ease of use and reliability on distributed systems.
This one I somewhat suspect; the clustering support was only released about six months ago: https://github.com/antirez/redis/blob/3.0/00-RELEASENOTES#L130 So I'm not exactly sure how established it is (or even how widely deployed and tested); does anyone have experience with it, configuring it, handling its failure modes? It'd be nice to know how it works (and I'm generally curious).

> * Several OpenStack projects already support Redis as a backend option or have extended capabilities using Redis.
> * Redis can be deployed on RHEL, SUSE, and DEB-based systems with ease.
> * Redis is open-source software licensed under the three-clause BSD license and would not have any of the questionable license implications found when dealing with anything Java.
> * The inclusion of Redis would work on a single node, allowing developers to continue working in VMs running on laptops with 4 GB of RAM, but would also scale to support the multi-controller use case with ease. This would also give developers the ability to work on systems that actually resemble production.
> * Redislock brings with it no additional developer-facing language dependencies (Redis is written in ANSI C and works ... without external dependencies [1]) while also providing a plethora of language bindings [2].
>
> I apologize for questioning the proposed solution so late into the development of this thread, and for not making the summit conversations to talk more with everyone who worked on the proposal. While the ship may have sailed on this point for now, I figured I'd ask why we might go down the path of Zookeeper+Java when a solution with likely little to no development effort already exists, can support just about any production/development environment, has lots of bindings, and (IMHO) would integrate with the larger community more easily; many OpenStack developers and deployers already know Redis. Including ZK+Java in DevStack and making it the default essentially creates new hard dependencies, one of which is Java, and I'd like to avoid that if at all possible; basically, I think we can do better.
>
> [0] - https://review.openstack.org/#/c/241040/
> [1] - http://redis.io/topics/introduction
> [2] - http://redis.io/topics/distlock
>
> --
>
> Kevin Carter
> IRC: cloudnull
>
> ________________________________________
> From: Fox, Kevin M <kevin....@pnnl.gov>
> Sent: Monday, November 9, 2015 1:54 PM
> To: maishsk+openst...@maishsk.com; OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit
>
> Dedicating 3 controller nodes in a small cloud is not always the best allocation of resources. You're thinking of medium to large clouds. Small production clouds are a thing too, and at that scale a little downtime, if you actually hit the rare case of a node failure on the controller, may be acceptable. It's up to an op to decide.
>
> We've also experienced that HA software sometimes causes more, or longer, downtime than it solves, due to its complexity, the knowledge required, proper testing, etc. Again, the risk gets higher the smaller the cloud is, in some ways.
>
> Being able to keep it simple and small for that case, then scale by switching out pieces as needed, does have some tangible benefits.
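On the point above that Tooz already supports redislock, and on being able to switch out pieces as needed: for what it's worth, the code that consumes the lock looks the same whichever backend ends up as the default; only the connection URL changes. A minimal sketch of what I mean, assuming a recent Tooz and a local Redis on its default port (the member and lock names are made up for illustration):

    from tooz import coordination

    # The backend is chosen purely by this URL; pointing it at
    # 'zookeeper://127.0.0.1:2181' (or any other Tooz driver) should be
    # the only change a deployer has to make.
    coordinator = coordination.get_coordinator(
        'redis://127.0.0.1:6379', b'api-worker-1')
    coordinator.start()

    # Take a named distributed lock around the critical section.
    lock = coordinator.get_lock(b'resize-instance-lock')
    with lock:
        do_the_risky_thing()  # placeholder for whatever is being serialized

    coordinator.stop()

I haven't run exactly this, but it's roughly the shape of the API either way, which is sort of the point of going through Tooz in the first place.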
>
> Thanks,
> Kevin
> ________________________________________
> From: Maish Saidel-Keesing [mais...@maishsk.com]
> Sent: Monday, November 09, 2015 11:35 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit
>
> On 11/05/15 23:18, Fox, Kevin M wrote:
> > You're assuming there are only two choices, zk or db+rabbit. I'm claiming both are suboptimal at present; a third might be needed. Though even with its flaws, the db+rabbit choice has a few benefits too.
> >
> > You also seem to assert that to support large clouds, the default must be something that can scale that large. While that would be nice, I don't think it's a requirement if it's overly burdensome on deployers of non-huge clouds.
> >
> > I don't have metrics, but I would be surprised if most deployments today (production + other) used 3 controllers with a full HA setup. I would guess that the majority are single-controller setups. With those, the
> I think it would be safe to assume that any kind of production cloud, or any operator that considers their OpenStack environment something close to production ready, would not be daft enough to deploy their whole environment based on a single controller, which is a whopper of a single point of failure.
>
> Most Fuel (Mirantis) deployments use multiple controllers.
> RHOS also recommends multiple controllers.
>
> I don't think that we as a community can afford to assume that one controller will suffice.
> That does not mean that maintaining zk will be any easier, though.
> > overhead of maintaining a whole dlm like zk seems like overkill. If db+rabbit would work for that one case, that would be one less thing for an op to set up; they already have to set up db+rabbit. Or even a dlm plugin of some sort that won't scale but would be very easy to deploy and change out later when needed would be very useful.
> >
> > etcd is starting to show up in a lot of other projects, and so it may be at sites already. Being able to support it may be less of a burden to operators than zk in some cases.
> >
> > If your cloud grows to the point where the dlm choice really matters for scalability/correctness, then you probably have enough staff members to deal with adding in zk, and that's probably the right choice.
> >
> > You can have multiple suggested things in addition to one default. Default to the thing that makes the most sense in the most common deployments, and make specific recommendations for certain scenarios: "if greater than 100 nodes, we strongly recommend using zk", or something to that effect.
> >
> > Thanks,
> > Kevin
> >
> > ________________________________________
> > From: Clint Byrum [cl...@fewbar.com]
> > Sent: Thursday, November 05, 2015 11:44 AM
> > To: openstack-dev
> > Subject: Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit
> >
> > Excerpts from Fox, Kevin M's message of 2015-11-04 14:32:42 -0800:
> >> To clarify that statement a little more,
> >>
> >> Speaking only for myself as an op, I don't want to support yet one more snowflake in a sea of snowflakes, one that works differently than all the rest, without a very good reason.
> >>
> >> Java has its own set of issues associated with the JVM: care-and-feeding sorts of things.
> >> If we are to invest time/money/people in learning how to properly maintain it, it's easier to justify if it's not a one-off for just DLM.
> >>
> >> So I wouldn't go so far as to say we're vehemently opposed to java, just that DLM is probably not a strong enough feature on its own to justify requiring pulling in java. It's been only a very recent thing that you could convince folks that DLM was needed at all. So either make java optional, or find some other use case that needs java badly enough that you can make java a required component. I suspect some day searchlight might be compelling enough for that, but not today.
> >>
> >> As for the default, the default should be a good reference. If most sites would run with etcd or something else, since java isn't needed, then don't default zookeeper on.
> >>
> > There are a number of reasons, but the most important are:
> >
> > * Resilience in the face of failures: the current database+MQ based solutions are all custom-made and have unknown characteristics when there are network partitions and node failures.
> > * Scalability: the current database+MQ solutions rely on polling the database and/or sending lots of heartbeat messages, or even using the database to store heartbeat transactions. This scales fine for tiny clusters, but when every new node adds more churn to the MQ and database, this will be (and has been observed to be) intractable.
> > * Tech debt: OpenStack is inventing lock solutions and then maintaining them. And service discovery solutions, and then maintaining them. Wouldn't you rather have better upgrade stories, more stability, more scale, and more features?
> >
> > If those aren't compelling enough reasons to deploy a mature java service like Zookeeper, I don't know what would be. But I do think using the abstraction layer of tooz will at least allow us to move forward without having to convince everybody everywhere that this is actually just the path of least resistance.
>
> --
> Best Regards,
> Maish Saidel-Keesing
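One closing thought on Clint's scalability and tech-debt points: the same Tooz coordinator that gives us locks also covers the group-membership/heartbeat side, which is the part projects keep reinventing on top of the database today. A rough sketch, with the backend URL, group, and member names purely as placeholders (nothing here is blessed, and I haven't run exactly this):

    from tooz import coordination

    coordinator = coordination.get_coordinator(
        'zookeeper://127.0.0.1:2181', b'compute-host-7')
    coordinator.start()

    group = b'compute-services'
    # Best-effort group creation (a small race is possible; real code
    # would handle the "group already exists" error from the driver).
    if group not in coordinator.get_groups().get():
        coordinator.create_group(group).get()
    coordinator.join_group(group).get()

    # Liveness comes from heartbeating the coordinator rather than
    # writing periodic "I'm alive" rows into the database.
    coordinator.heartbeat()

    # Anyone can then ask which members are currently alive in the group.
    members = coordinator.get_members(group).get()
    print(members)

    coordinator.stop()

That's the sort of thing that would get us out of polling the database for service liveness, regardless of which backend ends up as the default.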