On 01/01/2014 02:16 PM, Day, Phil wrote:
<snip>
>> So you are really making 2 assumptions here that aren't valid:
>>    * it was known to be a backwards compatibility problem - because it
>> wasn't, and no one provided feedback over the course of 4 days to indicate
>> that it was (there were some alternative implementation ideas, but no one
>> said "hold on, this could break X")

> That's not really an assumption - I'm saying it would have been spotted had
> there been more reviews, because I know I would have spotted it, as I think
> would some of my colleagues. We've been looking at a number of problems with
> backing files recently, including the security fix we posted a week or so
> ago, and have been working on how to configure some of our images to use
> ext4 - so it was an area that was fresh in our minds.

>>    * it wasn't urgent - and like I said the tone really was towards this
>> being an important need for TripleO


> I'm assuming (or maybe asserting would be a better term) that it wasn't
> urgent because:

> - Nothing in the commit message suggests it's urgent - it describes it more
> as a major performance enhancement

> - It's not linked to a high-priority bug or approved blueprint

> - The capability to use ext4 already existed via configuration options. If
> this was needed for the TripleO gate tests (and again, no mention of this in
> the commit itself) then why couldn't the tests have been configured to use
> ext4 rather than changing the default (see the sketch below)?
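
For reference, the configuration route being referred to would look roughly
like this in nova.conf (a sketch - assuming the default_ephemeral_format and
virt_mkfs options of that era, so the exact names are worth double-checking):

    [DEFAULT]
    # format new ephemeral disks as ext4 instead of relying on the
    # tree-wide default (ext3 before the change under discussion)
    default_ephemeral_format = ext4
    # or, equivalently, override the mkfs command used for guest disks
    virt_mkfs = default=mkfs.ext4 -L %(fs_label)s -F %(target)s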

> So no, sorry - I'm not willing yet to back away from my assumption that this
> wasn't an urgently needed fix, especially because of the last point.

> It may have been the tone in IRC, but should I really be expected to look
> there to understand the background to a submitted change?

Again, I think in this case we did make a mistake, but I also think we'd have made the same mistake given the same data in the future. So I don't think review policy changes are the fix here.

Again, mostly opinion there. Backed up with a few footnotes in the response to Tim's post on this thread.

>> Which is why I don't think saying 3 week minimum review time is helpful,
>> because if any of the reviewers had thought it was backwards incompatible,
>> they would have -1ed it. Humans only get so much of that right.


> I think I agree that a time limit is the wrong way to go, as the number of
> active reviewers varies a lot throughout the year. Something based more on
> the number and type of reviewers (and maybe we need a distinction now that
> goes beyond just core and non-core) would probably be more workable.


>> So we need to handle this defensively and beef up the automated systems.
>> Does that require more people adding in defensive tests to those scenarios?
>> Definitely. The fact that right now basically Dean and I are the only ones
>> that understand our upgrade testing (with a few others bootstrapping) is
>> way too few people touching that problem.
>>
>> If seamless upgrades are the most important thing to people, we need more
>> people in the trenches on the tooling that helps us verify that.

>>         -Sean
> As I said before, I'm not arguing against improving the automated tests, but
> of course we also have to be cognisant that extending the time taken to run
> tests is in itself a problem. And however much effort we can divert into
> adding tests, it will take time before that gets us improved coverage of
> this kind of guest / upgrade issue.
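
To make that scenario concrete: the sort of defensive check we're talking
about boils down to booting a guest on the old code, upgrading, and asserting
the guest still works. A rough sketch in Python (hypothetical resource name;
the real upgrade tooling, grenade, is shell-based):

    # grenade-style "guest survives the upgrade" check (sketch)
    import subprocess

    def nova(*args):
        # shell out to the nova CLI; assumes credentials in the environment
        return subprocess.check_output(("nova",) + args).decode()

    def create_before_upgrade():
        # boot a guest on the *old* code so its ephemeral disk is created
        # with the old default filesystem (ext3 before this change)
        nova("boot", "--image", "cirros", "--flavor", "m1.tiny",
             "--poll", "upgrade-canary")

    def verify_after_upgrade():
        # after the upgrade the pre-existing guest must still be ACTIVE;
        # a changed default only affects newly created disks, which is
        # exactly the kind of thing a defensive test pins down
        out = nova("show", "upgrade-canary")
        assert "ACTIVE" in out, "guest did not survive the upgrade"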

Gate time is not a problem; we can run this matrix in parallel.

My complaint here is that "however much effort" is basically zero right now. I know; I'm reviewing and trying to help land what little effort is there.

I'd probably be a little more positive about looking at non-code-based solutions here if there were significant effort on the code-based solutions and we still had gaps. The fact that there isn't makes me pretty negative on policy solutions.

> In the meantime, as you and Rob have pointed out, the review process clearly
> isn't working as well as we'd all like it to, with some changes getting no
> reviews and others sitting for weeks with many +1s (for example, here's a
> bug fix with seven +1s that has been stuck since 6th Dec:
> https://review.openstack.org/#/c/56258/ ) - none of which seems to be really
> correlated with the relative priority of the proposed changes. So I'm
> suggesting that we look at what can be done to improve the review process
> now while the automated testing is improved.

Has it been brought up in the Nova meeting? That's usually a good place to bring up bug fixes that fall through the cracks.

        -Sean

--
Sean Dague
Samsung Research America
s...@dague.net / sean.da...@samsung.com
http://dague.net

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
