Hi David,

Thank you for caring.

TL;DR: Quality isn't quite where we'd like it to be, action is being taken,
backing out uplifts isn't always straightforward, too many simultaneous
branches/releases for too long, more integration tests needed, which branch
should we dogfood?

I did some intensive dogfooding of v1-train during my week of PTO recently
and I agree that quality isn't quite where we'd like it to be right now.
Action is being taken to resolve this, including a joint UX, Engineering &
QA SWAT team which was assigned just yesterday to focus closely on some
key issues in order to get v1.0.1 into a more shippable state and something
we can be proud of.

To address the case of bug 855021 in particular, unfortunately I was on PTO
when this bug was discovered and it took a while for it to get assigned,
but in this case simply backing out the change wouldn't have been that
simple. Firstly, it took quite a long time to figure out exactly what
combination of uplifts had caused the regression (the bug wasn't
reproducible on master at any point). Even once the cause was established
(an uplifted patch was subtly dependent on a patch which hadn't been
requested for uplift), backing out the patch would have caused another
regression, then backing out the patch which caused that regression would
have caused another regression, and so on. In the end it turned out to be
simpler to write a patch specifically for v1-train and v1.0.1 to resolve
the issue, but that wasn't obvious from the beginning.

I have to take responsibility for that particular bug because it was caused
by uplifts that I was asked to carry out myself and I should have spotted
it during a mammoth merge conflict resolution. But I think this is a
symptom of the wider issue of usage of the b2g18/v1-train branches and how
our different branches have diverged as they've existed for longer than
anticipated. This has been discussed at some length (
https://groups.google.com/d/msg/mozilla.dev.b2g/dgRjI_kxSCM/xg-x30u0HKMJ).
This is something I think will improve in the future as we change the way
in which we do branching and learn lessons about trying to work on too many
different releases simultaneously.

We also have a lot of room for improvement on our test coverage and whilst
there's some awesome work being done on automated smoketests, it would be
great to see more support for Gaia developers with regards to writing more
integration tests (in JavaScript, not Python).

I agree with your point about dogfooding and would encourage everyone to do
so, but in the short term this does raise the question of which branch we
should be dogfooding. Trunk? v1.1? v1.0.1? OTA updates pushed out to
dogfooders are currently based on the b2g18/v1-train branches, but given
the level of ongoing development on 1.0.1 I wonder if we should be
dogfooding that branch instead, then move onto 1.1 once 1.0.1 is more
stable. It's kind of scary that nobody is dogfooding 1.0.1.

Ben


On Thu, Apr 4, 2013 at 5:51 PM, L. David Baron <[email protected]> wrote:

> Back in November I wrote a blog post called "Eating dogfood and
> shipping software": http://dbaron.org/log/20121119-dogfood
>
> Shipping good end-user software requires that the people developing
> the software and managing the development of the software actually
> use that software and understand the experience of users using that
> software.  I think that understanding has been critical to Mozilla's
> past successes in shipping end-user software.  Using the software
> leads to an understanding of which problems are the important ones,
> and it leads to the ability to test and polish the user experience.
>
> Those of us who have worked on Firefox for a long time often take
> this feedback cycle for granted.  But I'm worried that Firefox OS
> isn't getting enough of this sort of feedback.
>
> Getting this feedback requires, of course, that people use the phone
> daily.  And that, in turn, requires that they consider it usable.
>
> I've been trying to use builds of v1-train for the last few months,
> and my experience lately hasn't been that great; it's been quite a
> few weeks since I've pulled a new tree and found the result usable
> on a basic level.  The most recent example (this week) is
> https://bugzilla.mozilla.org/show_bug.cgi?id=855021 .  Frankly, I'm
> *shocked* that this sort of thing doesn't lead to immediate backout.
> (I don't think this is an issue related to anybody involved in that
> particular bug; I think it's about a culture that the project as a
> whole needs to build.)  Whether or not we have the test coverage to
> indicate that basic functionality is broken, when basic
> functionality breaks, it needs to get fixed immediately, the same
> way that an orange or red tree needs to be fixed immediately.
>
> (Backing out isn't a punishment.  Backing out is a way to keep the
> tree in an always-usable state.  And landing again, when the problem
> is fixed, should be easy.  Developers should expect to be backed out
> some (small) percentage of the time.)
>
> It would obviously be better if these problems *did* turn the tree
> orange.  But even when they don't, they need to be treated with the
> same priority.
>
> It should be an expectation that everyone working on the product is
> using the product (and updating it regularly).  And it should be an
> expectation that discovering a problem serious enough to interfere
> with that is a high priority (i.e., needs to be dealt with within
> hours most of the time, which may mean simply checking that somebody
> else has filed a bug and is looking for the regression range, or it
> may mean doing that).
>
> -David
>
> --
> 𝄞   L. David Baron                         http://dbaron.org/   𝄂
> 𝄢   Mozilla                           http://www.mozilla.org/   𝄂
> _______________________________________________
> dev-b2g mailing list
> [email protected]
> https://lists.mozilla.org/listinfo/dev-b2g
>



-- 
Ben Francis
http://tola.me.uk
