Hey everyone,
On Mon, Dec 16, 2013 at 1:15 PM, Łukasz 'sil2100' Zemczak <lukasz.zemc...@canonical.com> wrote:

> Right, it might sound a bit scary indeed. The main thing that Didier
> wanted to say is that we want all the tests to be reliable.

I totally agree.

> We no longer do re-runs in case of tests that are failing (due to flakiness)

...nor should you!

> and are no longer allowed to release a component that has an unreliable test.

Excellent!

> Tests that are flaky only add to the overall confusion.

We are in violent agreement here :)

> Of course, as you say, the best way is to fix the test properly!

Right on :)

> Sometimes though, as we experienced, it's either not that easy to do or
> there are simply no resources available in a team.

Absolutely - and the QA department probably hasn't been as helpful as we'd like to be - sorry for that! [1] As I mentioned earlier, hopefully the TnT team and Dave Morley's daily test work should help you clear up the remaining issues.

> Integration test reliability needs to also be relatively high-priority whenever a
> related issue pops up - but with many other, seemingly more important tasks
> queued, sometimes flaky tests stay around for too long.

The key word there for me is "seemingly". I absolutely understand that you guys want to ship new features. My point is that while shipping new features might seem to be more important than fixing your existing test cases, I think you're mistaken :)

It's a nice idea that we'll get the image to go "green", and *then* we'll start making sure our tests are reliable. However, I fear that's not how human beings are wired up inside. Instead, I believe we'll get the image green, then have this exact same discussion the next time some test case starts failing.

The general wisdom around automated testing is that your test cases should be treated with the same level of respect and attention as your production code. Personally, I don't think we're doing that, and I can point to many examples where the application has changed (possibly regressed, possibly not - that's a judgment call) and the test cases haven't been updated.

> And since we won't release a component with such anomalies, sometimes temporarily
> skipping the test is the only way to release a component. Since what use
> is a test that cannot give proper results!

I totally agree that test results should be meaningful. However, by skipping the test, you're not improving the quality of the application, you're just reducing the amount of information you have to hand. I think our test results are meaningful already - sometimes they're just a little hard to interpret. We need to make autopilot do a better job of providing you with all the information you need in a test result, and we're working with the CI team on that already.
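To make that trade-off concrete, here's a minimal, self-contained sketch in plain Python unittest (deliberately not an autopilot test - FakeApp, wait_until and GreetingTests are made-up names for illustration only). The first test "handles" a race by skipping, which just throws information away; the second removes the race by polling for the asynchronous update, so a failure actually points at a real problem:

import random
import threading
import time
import unittest


class FakeApp:
    """Stand-in for an app whose greeting label updates asynchronously."""

    def __init__(self):
        self.greeting = ""
        delay = random.uniform(0.0, 0.3)
        if delay < 0.15:
            # Sometimes the update has already landed by the time the test
            # looks at it - this is what makes the naive test flaky.
            self.greeting = "Hello"
        else:
            threading.Timer(delay, self._update).start()

    def _update(self):
        self.greeting = "Hello"


def wait_until(predicate, timeout=5.0, interval=0.05):
    """Poll predicate until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()


class GreetingTests(unittest.TestCase):

    @unittest.skip("flaky: greeting sometimes not updated yet")
    def test_greeting_skipped(self):
        # Skipping hides the flakiness, but it also hides any real
        # regression on this code path: less information, not more quality.
        app = FakeApp()
        self.assertEqual(app.greeting, "Hello")

    def test_greeting_fixed(self):
        # Fixing the race instead: wait for the asynchronous update, then
        # assert. A failure here now means something genuinely broke.
        app = FakeApp()
        self.assertTrue(
            wait_until(lambda: app.greeting == "Hello"),
            "greeting never became 'Hello' within the timeout",
        )


if __name__ == "__main__":
    unittest.main()

In a real autopilot test the polling would be done with something like the Eventually matcher rather than a hand-rolled helper, but the principle is the same: make the assertion wait for the condition instead of hoping the timing works out, and only reach for a skip when the behaviour under test genuinely no longer exists.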
> I guess those words were to enforce this high-priority to fixing
> integration flakiness.

The latest image (I'm looking at this: http://ci.ubuntu.com/smokeng/trusty/touch/maguro/69:20131216.1:20131211.2/5498/ ) is looking pretty good. I can't see any application crashes or autopilot crashes. All the failures I see are either application regressions or test regressions. If they're the former, they should be fixed as a matter of urgency; if they're the latter, skipping them is just regressing our test coverage.

> Good to hear about the TnT team - I guess we'll be poking you guys about
> some of the problems we'll be encountering.

The team's mandate is to improve the tools the QA department provides to the rest of the development community. We're focusing on autopilot right now, so you should start to see some changes there. If you have feature requests for autopilot (that aren't already tracked in bugs), we're the ones to talk to. If you're seeing issues in the tool itself (as opposed to a test case), we're also the ones to talk to.

Anyway, I hope that clears up some confusion.

Cheers,

[1] In our defence, the number of developers we're supporting is crazy high. If you look at the ratio of QA engineers to feature engineers in other companies that have a strong culture of quality, we do pretty well, considering how few QA engineers we have. I'm not complaining, nor am I offering any excuses: I actually think we've done pretty darn well over the last few years, but it's hard to meet everyone's requests in a timely manner with so few resources.

--
Thomi Richards
thomi.richa...@canonical.com