On Sun, 21 May 2017, Monty Taylor wrote:
As the discussion around PostgreSQL has progressed, it has become clear to me that there is a decently deep philosophical question on which we do not currently share either definition or agreement. I believe that the lack of clarity on this point is one of the things that makes the PostgreSQL conversation difficult.
Good analysis. I think this does get to at least some of the core differences, maybe even most. And as with so many other things we do in OpenStack, because we have landed somewhere in the middle between the two positions, we find ourselves in a pickle (see, for example, the different needs for and attitudes to orchestration underlying this thread [1]). You're right to say we need to pick one and move in that direction, but our standard struggles with reaching agreement across the entire community, especially on an opinionated position, will need to be overcome. Writing about it to make it visible is a good start.
In the "external" approach, we document the expectations and then write the code assuming that the database is set up appropriately. We may provide some helper tools, such as 'nova-manage db sync' and documentation on the sequence of steps the operator should take.
In the "active" approach, we still document expectations, but we also validate them. If they are not what we expect but can be changed at runtime, we change them, overriding conflicting environmental config, and if we can't, we hard-stop, indicating an unsuitable environment. Rather than providing helper tools, we perform the steps needed ourselves, in the order they need to be performed, ensuring that they are done in the manner in which they need to be done.
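To make the "active" behaviour concrete, here is a minimal Python sketch of the validate, override, or hard-stop sequence described above. It is only an illustration under stated assumptions: the setting names, the expected values, the `enforce` function and the `UnsuitableEnvironment` exception are all hypothetical, and a plain dict stands in for the live database configuration. None of it is existing OpenStack code.

```python
class UnsuitableEnvironment(Exception):
    """Raised when a setting is wrong and cannot be changed at runtime."""


# Hypothetical expectations, for illustration only:
# setting name -> (wanted value, can it be changed at runtime?)
EXPECTED = {
    "character_set": ("utf8", False),
    "sql_mode": ("TRADITIONAL", True),
}


def enforce(settings, expected=EXPECTED):
    """Bring `settings` in line with `expected`, or hard-stop.

    `settings` is a plain dict standing in for the live database config.
    Returns the names of the settings that were overridden.
    """
    changed = []
    for name, (want, runtime_changeable) in sorted(expected.items()):
        have = settings.get(name)
        if have == want:
            continue
        if not runtime_changeable:
            # Cannot be fixed from here: refuse to run at all.
            raise UnsuitableEnvironment(
                "%s is %r, need %r" % (name, have, want))
        # Override the conflicting environmental config.
        settings[name] = want
        changed.append(name)
    return changed
```

Called with `{"character_set": "utf8", "sql_mode": ""}`, the sketch overrides `sql_mode` and carries on. Called with the wrong `character_set`, it raises instead of starting.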
I think there's a middle ground here, which is "externalize but validate":
* document expectations
* validate them
* do _not_ change at runtime, but tell people what's wrong
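A rough Python sketch of what "externalize but validate" could look like follows. The setting names, expected values and function names are assumptions made up for illustration, and a plain dict stands in for whatever the database actually reports. The point is only that the check reads and reports and never writes anything back.

```python
# Hypothetical documented expectations; names and values are illustrative.
EXPECTED = {
    "character_set": "utf8",
    "sql_mode": "TRADITIONAL",
}


def validate_settings(actual, expected=EXPECTED):
    """Return a list of human-readable problems; never modify anything."""
    problems = []
    for name, want in sorted(expected.items()):
        have = actual.get(name)
        if have != want:
            problems.append("%s is %r, expected %r" % (name, have, want))
    return problems


def report(actual):
    """Print any problems and return True if the environment looks suitable."""
    problems = validate_settings(actual)
    for problem in problems:
        print("database check failed: " + problem)
    return not problems
```

What happens after a failed check (refuse to start, or warn and continue) is left to the caller, which keeps the decision with the deployer.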
Some operations have one and only one "right" way to be done. For those operations if we take an 'active' approach, we can implement them once and not make all of our deployers and distributors each implement and run them. However, there is a cost to that. Automatic and prescriptive behavior has a higher dev cost that is proportional to the number of supported architectures. This then implies a need to limit deployer architecture choices.
That "higher dev cost" is one of my objections to the 'active' approach but it is another implication that worries me more. If we limit deployer architecture choices at the persistence layer then it seems very likely that we will be tempted to build more and more power and control into the persistence layer rather than in the so-called "business" layer. In my experience this is a recipe for ossification. The persistence layer needs to be dumb and replaceable.
On the other hand, taking an 'external' approach allows us to federate the work of supporting the different architectures to the deployers. This means more work on the deployer's part, but also potentially a greater amount of freedom on their part to deploy supporting services the way they want. It means that some of the things that have been requested of us - such as easier operation and an increase in the number of things that can be upgraded with no-downtime - might become prohibitively costly for us to implement.
That's not necessarily the case. Consider that in an external approach, where the persistence layer is opaque to the application, it means that third parties (downstream consumers, the market, the invisible hand, etc) have the option to do all kinds of wacky stuff. Probably avec containers™. In that model, the core functionality is simple and adequate but not deluxe. Deluxe is an after-market add on.
BUT - without a decision as to what our long-term philosophical intent in this space is that is clear and understandable to everyone, we cannot have successful discussions about the impact of implementation choices, since we will not have a shared understanding of the problem space or the solutions we're talking about.
Yes.
For my part - I hear complaints that OpenStack is 'difficult' to operate and requests for us to make it easier. This is why I have been advocating some actions that are clearly rooted in an 'active' worldview.
If OpenStack were more of a monolith instead of a system with three to many different databases, along with some optional number of other ways to do other kinds of (short term) persistence, I would find the 'active' model a good option. If we were to start over I'd say let's do that. But as it stands, implementing actually useful 'active' management of the database feels like a very large amount of work that will take so long that, by the time we complete it, it will be not just out of date but also limit us. External but validate feels much more viable. What we really want is that people can get reasonably good results without trying that hard and great (but also various) results with a bit of effort. So that means it ought to be possible to do enough OpenStack to think it is cool with whatever database I happen to have handy. And then once I dig it I should be able to manage it effectively using the solutions that are best for my environment.
Finally, this is focused on the database layer but similar questions arise in other places. What is our philosophy on prescriptive/active choices on our part coupled with automated action and ease of operation vs. expanded choices for the deployer at the expense of configuration and operational complexity? For now let's see if we can answer it for databases, and see where that gets us.
I continue to think that this issue is somewhat special at the persistence layer because of the balance of who it impacts the most: the deployers, developers, and distributors more than the users[2]. Making global conclusions about external and active based on this issue may be premature.
Thanks for reading.
Thanks for writing. You've done a lot of writing lately. Is good.
[1] http://lists.openstack.org/pipermail/openstack-operators/2017-May/013464.html
[2] That our database choices impact the users (e.g., the case and encoding things at the API layer) is simply a mistake that we all made together, a bug to be fixed, not an architectural artifact.
--
Chris Dent ┬──┬◡ノ(° -°ノ) https://anticdent.org/
freenode: cdent tw: @anticdent