Hi David,

Thank you for caring.

TL;DR: Quality isn't quite where we'd like it to be; action is being taken; backing out uplifts isn't always straightforward; too many simultaneous branches/releases for too long; more integration tests needed; which branch should we dogfood?

I did some intensive dogfooding of v1-train during my week of PTO recently and I agree that quality isn't quite where we'd like it to be right now. Action is being taken to resolve this, including a joint UX, Engineering & QA SWAT team which was assigned just yesterday to focus closely on some key issues in order to get v1.0.1 into a more shippable state and something we can be proud of.

To address the case of bug 855021 in particular: unfortunately I was on PTO when this bug was discovered and it took a while for it to get assigned, but in this case backing out the change wouldn't have been that simple. Firstly, it took quite a long time to figure out exactly what combination of uplifts had caused the regression (the bug wasn't reproducible on master at any point). Even once the cause was established (an uplifted patch was subtly dependent on a patch which hadn't been requested for uplift), backing out the patch would have caused another regression, then backing out the patch which caused that regression would have caused another regression, and so on. In the end it turned out to be simpler to write a patch specifically for v1-train and v1.0.1 to resolve the issue, but that wasn't obvious from the beginning.

I have to take responsibility for that particular bug because it was caused by uplifts that I was asked to carry out myself and I should have spotted it during a mammoth merge conflict resolution. But I think this is a symptom of the wider issue of how the b2g18/v1-train branches are used and how our different branches have diverged as they've existed for longer than anticipated. This has been discussed at some length (https://groups.google.com/d/msg/mozilla.dev.b2g/dgRjI_kxSCM/xg-x30u0HKMJ).
This is something I think will improve in the future as we change the way in which we do branching and learn lessons about trying to work on too many different releases simultaneously.

We also have a lot of room for improvement on our test coverage, and whilst there's some awesome work being done on automated smoketests, it would be great to see more support for Gaia developers with regard to writing more integration tests (in JavaScript, not Python).

I agree with your point about dogfooding and would encourage everyone to do so, but in the short term this does raise the question of which branch we should be dogfooding. Trunk? v1.1? v1.0.1? OTA updates pushed out to dogfooders are currently based on the b2g18/v1-train branches, but given the level of ongoing development on 1.0.1 I wonder if we should be dogfooding that branch instead, then move on to 1.1 once 1.0.1 is more stable. It's kind of scary that nobody is dogfooding 1.0.1.

Ben

On Thu, Apr 4, 2013 at 5:51 PM, L. David Baron <[email protected]> wrote:

> Back in November I wrote a blog post called "Eating dogfood and
> shipping software": http://dbaron.org/log/20121119-dogfood
>
> Shipping good end-user software requires that the people developing
> the software and managing the development of the software actually
> use that software and understand the experience of users using that
> software. I think that understanding has been critical to Mozilla's
> past successes in shipping end-user software. Using the software
> leads to an understanding of which problems are the important ones,
> and it leads to the ability to test and polish the user experience.
>
> Those of us who have worked on Firefox for a long time often take
> this feedback cycle for granted. But I'm worried that Firefox OS
> isn't getting enough of this sort of feedback.
>
> Getting this feedback requires, of course, that people use the phone
> daily. And that, in turn, requires that they consider it usable.
>
> I've been trying to use builds of v1-train for the last few months,
> and my experience lately hasn't been that great; it's been quite a
> few weeks since I've pulled a new tree and found the result usable
> on a basic level. The most recent example (this week) is
> https://bugzilla.mozilla.org/show_bug.cgi?id=855021 . Frankly, I'm
> *shocked* that this sort of thing doesn't lead to immediate backout.
> (I don't think this is an issue related to anybody involved in that
> particular bug; I think it's about a culture that the project as a
> whole needs to build.) Whether or not we have the test coverage to
> indicate that basic functionality is broken, when basic
> functionality breaks, it needs to get fixed immediately, the same
> way that an orange or red tree needs to be fixed immediately.
>
> (Backing out isn't a punishment. Backing out is a way to keep the
> tree in an always-usable state. And landing again, when the problem
> is fixed, should be easy. Developers should expect to be backed out
> some (small) percentage of the time.)
>
> It would obviously be better if these problems *did* turn the tree
> orange. But even when they don't, they need to be treated with the
> same priority.
>
> It should be an expectation that everyone working on the product is
> using the product (and updating it regularly). And it should be an
> expectation that discovering a problem serious enough to interfere
> with that is a high priority (i.e., needs to be dealt with within
> hours most of the time, which may mean simply checking that somebody
> else has filed a bug and is looking for the regression range, or it
> may mean doing that).
>
> -David
>
> --
> 𝄞 L. David Baron http://dbaron.org/ 𝄂
> 𝄢 Mozilla http://www.mozilla.org/ 𝄂
> _______________________________________________
> dev-b2g mailing list
> [email protected]
> https://lists.mozilla.org/listinfo/dev-b2g
>

--
Ben Francis
http://tola.me.uk
