So as the person who drove the rolling upgrade requirements into keystone in 
this cycle (because we have real customers that need it), and having first 
written the keystone upgrade process to be “versioned object ready” (because I 
assumed we would do this the same as everyone else), and subsequently 
re-written it to be “DB Trigger ready”…and written migration scripts for both 
these cases for the (in fact very minor) DB changes that keystone has in 
Newton…I guess I should also weigh in here :-)

For me, the argument comes down to:

a) Is the pain that needs to be cured by the rolling upgrade requirement broadly 
in the same place in the various projects (i.e. nova, glance, keystone etc.)? 
If it is, then working towards a common solution is always preferable (whatever 
that solution is).
b) I would characterise the difference between the trigger approach, the 
versioned objects approach and the “in-app” approach as: do we want a small 
amount of very nasty complexity, or that complexity spread out so it is not as 
bad in any one place, but covers a broader area? Probably fewer people can 
(successfully) write the nasty trigger code than can write, say, the “do it all 
in the app” code. LOC (which, of course, isn’t always a good measure) also 
reflects this characterisation, with the trigger code having probably the 
fewest LOC, and the app code the greatest.
c) I don’t really follow the argument that the trigger code in migrations is 
somehow less desirable because we use higher level sqla abstractions in our 
main-line code - I’ve always seen migration as different and expected that we 
might have to do strange things there. Further, we should be aware of the 
time-periods involved…the migration cycle is a small % of the elapsed time the 
cloud is running (well, hopefully) - so again, do we solve the “issues of 
migration” as part of the migration cycle (which is what the trigger approach 
does), or make our code (effectively) continually migration aware (using 
versioned objects or in-app code)?
d) The actual process (for an operator) is simpler for a rolling upgrade with 
triggers than for the alternatives, since you don’t require several of the 
checkpoints (e.g. knowing when you can move out of compatibility mode). 
Operator error is also a cause of problems in upgrades (especially as the 
complexity of a cloud increases).

From a purely keystone perspective, my gut feeling is that the trigger approach 
is actually likely to lead to a more robust, not less robust, solution - 
because we solve the very specific problems of a given migration (e.g. the need 
to keep column A in sync with column B) for a short period of time, right at 
the point of pain, with well established techniques - albeit complex ones that 
need coders experienced in those techniques. I actually prefer a small locality 
of complexity (marked with “there be dragons here, be careful”) to spreading 
medium pain over a large area, which by definition is updated by many…and may 
do the wrong thing inadvertently. It is also simpler for operators.
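To make the “keep column A in sync with column B” case concrete, here is a 
minimal, hypothetical sketch (not keystone’s actual migration code; table and 
column names are invented) using SQLite triggers driven from Python. The expand 
migration adds the new column plus a pair of triggers, so old-release nodes 
(which only write the old column) and new-release nodes (which only write the 
new one) keep both columns populated; once every node is upgraded, a contract 
migration would drop the triggers and the old column.

```python
# Hypothetical rolling-upgrade sketch: old nodes write "password",
# new nodes write "password_hash"; triggers keep the two in sync.
# (INSERT triggers, which a real migration would also need, are
# omitted here for brevity.)
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, password TEXT);

-- Expand migration: add the new column...
ALTER TABLE user ADD COLUMN password_hash TEXT;

-- ...and triggers so an update from either release populates both
-- columns.  SQLite's recursive_triggers pragma is off by default,
-- so the trigger bodies do not re-fire each other.
CREATE TRIGGER user_sync_new AFTER UPDATE OF password ON user
BEGIN
    UPDATE user SET password_hash = NEW.password WHERE id = NEW.id;
END;
CREATE TRIGGER user_sync_old AFTER UPDATE OF password_hash ON user
BEGIN
    UPDATE user SET password = NEW.password_hash WHERE id = NEW.id;
END;
""")

# An old-release node writes only the old column...
conn.execute("INSERT INTO user (id, password) VALUES (1, 'secret')")
conn.execute("UPDATE user SET password = 's3cret' WHERE id = 1")

# ...yet the new column is kept in step by the trigger.
row = conn.execute(
    "SELECT password, password_hash FROM user WHERE id = 1").fetchone()
print(row)  # ('s3cret', 's3cret')
```

The complexity lives entirely inside the migration script, which is the point 
being argued above: once the contract migration removes the triggers, no 
migration-awareness remains in the main-line code.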

I do recognise, however, that “let’s not do different stuff for a core project 
like keystone” is a powerful argument. I just don’t know how to square this 
with the fact that, although I started in the “versioned objects camp”, having 
worked through many of the issues I have come to believe that the trigger 
approach will be more reliable overall for this specific use case. From the 
other reactions to this thread, I don’t detect a lot of support for the trigger 
approach becoming our overall, cross-project solution.

The actual migrations needed in keystone for Newton are minor, so one 
possibility is to use keystone as a guinea pig for this approach in Newton…if 
we had to undo this in a subsequent release, we would not be talking about 
rafts of migration code to redo.

Henry



> On 1 Sep 2016, at 09:45, Robert Collins <robe...@robertcollins.net> wrote:
> 
> On 31 August 2016 at 01:57, Clint Byrum <cl...@fewbar.com> wrote:
>> 
>> 
>> It's simple, these are the holy SQL schema commandments:
>> 
>> Don't delete columns, ignore them.
>> Don't change columns, create new ones.
>> When you create a column, give it a default that makes sense.
> 
> I'm sure you're aware of this but I think it's worth clarifying for
> non-DBAish folk: non-NULL values can change a DDL statement's execution
> time from O(1) to O(N) depending on the DB in use. E.g. for Postgres
> DDL requires an exclusive table lock, and adding a column with any
> non-NULL value (including constants) requires calculating a new value
> for every row, vs just updating the metadata - see
> https://www.postgresql.org/docs/9.5/static/sql-altertable.html
> """
> When a column is added with ADD COLUMN, all existing rows in the table
> are initialized with the column's default value (NULL if no DEFAULT
> clause is specified). If there is no DEFAULT clause, this is merely a
> metadata change and does not require any immediate update of the
> table's data; the added NULL values are supplied on readout, instead.
> """
> 
>> Do not add new foreign key constraints.
> 
> What's the reason for this - if it's to avoid exclusive locks, I'd
> note that the other rules above don't avoid exclusive locks - again,
> DB specific, and for better or worse we are now testing on multiple DB
> engines via 3rd party testing.
> 
> https://dev.launchpad.net/Database/LivePatching has some info from our
> experience doing online and very fast offline patches in Launchpad.
> 
> -Rob
> 
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

