On 5/8/2017 1:10 PM, Octave J. Orgeron wrote:
I do agree that scalability and high availability are definitely issues for OpenStack when you dig deeper into the sub-components. There is a lot of reinventing the wheel in how distributed services are implemented inside OpenStack, and those implementations have deficiencies. For some services you have a scheduler that can scale out, but the conductor or worker process doesn't. A good example is cinder, where cinder-volume doesn't scale out in a distributed manner and doesn't have a good mechanism for recovering when an instance fails. Across the services you see different mechanisms for coordinating requests and tasks, such as rabbitmq, redis, memcached, tooz, mysql, etc. So as an operator, you have to sift through those choices and configure the prerequisite infrastructure. This is a good example of a problem that should be solved with a single architecturally sound solution that all services can standardize on.
There was an architecture workgroup specifically designed to understand past architectural decisions in OpenStack, what the differences are between the projects, and how to address some of those issues, but due to lack of participation the group dissolved shortly after the Barcelona summit. This is, again, another example of how, if you want to make these kinds of massive changes, it's going to take massive involvement and leadership.
The problem in a lot of those cases comes down to development being detached from the actual use cases customers and operators have in the real world. Having a distributed control plane with multiple instances of the API, scheduler, coordinator, and other processes is typically not testable without a larger hardware setup. When you get to large-scale deployments, you need an active/active setup for the control plane. It's definitely not something you could develop for or test against on a single laptop with devstack, especially if you want to use more than a handful of the OpenStack services.
I think we can all agree with this. Developers don't have a lab with 1000 nodes lying around to hack on. There was OSIC, but that's gone. I've been asking companies for help in Nova with scale testing, so that we know what the major issues are, and to report those back in a form we can act on. People will report that there are issues, but not do the profiling, or at least not report the results of profiling upstream to help us out. So again, this is really up to companies that have the resources to do this kind of scale testing, report back, and help fix the issues upstream in the community. That doesn't require OpenStack 2.0.
--
Thanks,
Matt