On 08/29/2012 04:03 PM, Ehsan Akhgari wrote:
> I don't believe that the current situation is acceptable, especially
> with the recent focus on performance (through the Snappy project), and I
> would like to ask people if they have any ideas on what we can do to fix
> this. The fix might be turning off some Talos tests if they're really
> not useful, asking someone or a group of people to go over these test
> results, get better tools with them, etc. But _something_ needs to
> happen here.
Thanks for starting this discussion. I have some suggestions:
* Less is more. We can pay more attention to tests if every alert is
for something we care about. We can *track* stuff like Trace Malloc
Allocs if there are people who find the data useful in their work, but
we should not *alert* on it unless it is a key user-facing metric.
* I don't like our reactive approach, which focuses on trying to identify
regressions and then deciding whether to fix them in place, back them
out, or ignore them. Instead we should proactively set goals for what
our performance should be, and focus on the best way to get it there (or
keep it there). The goals should be based on the desired user experience,
and we should focus only on metrics that reflect those user experience
goals.
* Engineering teams should have to answer for these metrics; for example
they should be included in quarterly goals. At Amazon, item #1 in the
quarterly goals for each team was always to meet our metrics
commitments. Slipping a key metric past a certain threshold should stop
other work for the team until it's rectified.
* We need staff whose job includes deciding which regressions are
meaningful, identifying the cause, following up to make sure it's backed
out or fixed, and refining the process and tools used to make all this
possible. Too much slips through the cracks when we leave this to
volunteers (including "employeeteers" like Ehsan or me).
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform