On 4/18/13 6:41 PM, "David Nalley" <da...@gnsa.us> wrote:
>On Thu, Apr 18, 2013 at 6:26 PM, Will Chan <will.c...@citrix.com> wrote:
>>
>> > -----Original Message-----
>> > From: Chip Childers [mailto:chip.child...@sungard.com]
>> > Sent: Monday, April 15, 2013 7:22 AM
>> > To: dev@cloudstack.apache.org
>> > Cc: cloudstack-...@incubator.apache.org
>> > Subject: Re: [ASFCS42] Proposed schedule for our next release
>> >
>> > On Thu, Apr 11, 2013 at 02:50:02PM -0700, Animesh Chaturvedi wrote:
>> > >
>> > > I want to call out my concern on technical debt we have
>> > > accumulated so far.
>> > >
>> > > I did an analysis on JIRA bugs yesterday night PST on "Affects
>> > > Version = 4.1" and created since Dec 2012
>> > >
>> > > Total records : 429
>> > > Resolution Type (Invalid, Duplicate, Cannot reproduce etc.) : 87
>> > >   (30 Blockers, 27 Critical, 27 Major, 4 Minor)
>> > > Valid Defects : 429-87= 342
>> > > Fixed : 246 (60 Blockers, 70 Critical, 99 Majors) out of which
>> > >   217 were fixed since Feb
>> > > Unresolved : 96 (1 Blocker, 8 Critical, 64 Major)
>> > >
>> > > With this data it looks like we have fixed 2/3 of valid defects
>> > > in little over 2 months and pretty much deferring around 1/3 rd
>> > > of issues for future release.
>> > >
>> > > I also looked at overall backlog of bugs (Critical, Major and
>> > > Blockers only) as of 4/10/2013 - 10:0PM PST.
>> > >
>> > > 284 open (18 Blocker, 38 Critical, 228 Major) ; By Fix version
>> > > - Release 4.0.x and prior: 13
>> > > - 4.1: 70
>> > > - 4.2 : 97
>> > > - Future: 8
>> > > - No version: 107
>> > >
>> > > Looking at that we fixed 217 bugs in roughly 2 months during 4.1
>> > > cycle, fixing the backlog of bug will probably take us 2 months.
>> > > Should we extend the 4.2 test cycle by 2 months [Original
>> > > Schedule: 6/1 - 7/22, Extended Schedule: 6/1-9/22] to reduce the
>> > > technical debt significantly? I would like to hear how community
>> > > wants to address technical debt. Based on the input and consensus
>> > > I will publish the agreed schedule next week.
>> > >
>> > >
>> >
>> > I don't think that an extension of time changes bug counts really.
>> > IMO, we need to pull together to have some bug-fix focused effort
>> > applied to the code-base. It's also another reason that I'm so big
>> > on making sure that automated tests come in with the new features.
>> > That doesn't address test scenarios that human testers can come up
>> > with, but if a developer spends the time to think about testing the
>> > basic feature and codifies that, we should at least avoid the "this
>> > actually doesn't work at all" types of bugs.
>> >
>> > There's a school of thought that says, don't build another feature
>> > until you have sorted out the known bugs in the current features. I
>> > don't think we could really pull that off, but perhaps a different
>> > thread to rally people around the bug backlog is in order?
>> >
>> > -chip
>>
>> Sorry to chime in so late to this thread as I've been offsite for the
>> better part of this week. I was one of the original 4 month release
>> crowd but after the recent two releases of ACS, I'm starting to wonder
>> if we shouldn't start moving this to a 6 month cycle instead of two.
>> Here are some high level observations based on the previous two
>> releases:
>>
>> 1. It doesn't seem like we are on a true 4 month time based release
>> schedule. Both 4.0 and 4.1 were delayed more than several weeks past
>> the original proposed GA date. 4.0 was released 11/6 and let's assume
>> that 4.1 will ship within a week or two. That's almost a 6 month
>> release cycle.
>
>So both 4.0 and 4.1 strike me as extraordinary. 4.0 was our first
>release - and we had lots of issues to resolve. 4.1 introduced a ton
>of packaging and name changes that I also consider to be hopefully one
>time. Really - we've only been through our release cycle once, so I am
>not ready to declare it perpetually behind schedule.
>
>
>> Every release incurs a fixed cost of release notes, upgrade testing,
>> etc.
>> that I suspect at least eats a month worth of time depending on
>> people's schedule. That's 3 months out of the year rather than two if
>> we can get a 6 months cycle. We can use that extra month for other
>> purposes if need be. I suppose if we want to continue to release past
>> the proposed hard GA date, then I guess it doesn't matter if it's 4 or
>> 6 months. It's basically a release when the release mgmt. team feels
>> it's right to release based on current bugs, etc.
>>
>
>Having seen the point releases twice now, which still need upgrade
>testing, release notes, etc I don't get the feeling that the
>'overhead' referred to above is the problem. Joe may disagree with
>me.
>
>> 2. As more and more features/development go in, it just means more
>> destabilization of the code. 4.0 was delayed and the majority of that
>> work was licensing files. 4.1 got just a bit more complicated with new
>> feature development and the delay is now much longer. Not all features
>> are created equal in terms of testing. Some may require more time to
>> develop but may not impact the entire system like for example, adding
>> a new hypervisor. However, work like refactoring vm sync or other more
>> internal code could affect the entire stack and require more QA time.
>> We need extra time for new code to settle in.
>>
>
>I wonder why we would merge a feature that we can't prove doesn't break
>the entire stack and prove that it works. Some of this is the missing
>automation you talk about below. Essentially we have no way, sometimes
>until months after the merge, to tell if something works or not
>because we rely on manual QA to test it.
>
>> 3. ACS is still dependent largely on manual QA. Let's face it, our
>> automated testing/unit testing isn't mature enough quite yet and we
>> cannot always expect manual QA to be there and on ACS schedule.
>> CloudStack releases have some type of quality expectations as well as
>> support for upgrades.
>> Upgrades and migration scripts aren't that easily automatable. Chip
>> and others have been very diligent on ensuring that code check in has
>> the appropriate tests but it's not there yet.
>>
>> 4. ACS development is based on volunteer work and many of us have a
>> $dayjob and may not be able to assist with fixing bugs in ACS
>> schedule. Having only a couple of months to fix bugs and expect others
>> to follow our ACS schedule seems a bit rushed. Wearing my Citrix hat
>> now, I can tell you that 2 months of QA and bug fixing is not enough
>> to release a quality GA release. And that is with me breathing down
>> the necks of many of the engineers to get them fixed on time. ACS does
>> not have this type of culture and nor should it. Given that, we should
>> be a bit more flexible in terms of allowing people eventually to act
>> on issues.
>>
>
>
>So a couple of other comments.
>We have folks clamoring for the awesome new features. To the point
>they are creating derivative works (which tells me we are doing some
>things right as folks are finding it easy enough to do)
>
>What I gathered from reading the above doesn't really have anything to
>do with schedule:
>* New development destabilizes our code base, and is a threat to
>quality and the release schedule
>* We can not depend on the current level of manual QA to be present
>going forward.
>
>This brings me to the conclusion that as a community we should seriously
>temper our inclusion of new features and make our focus automated
>testing until such time as pushing a release out is less months of
>manual QA processes and more of a decision. This makes me want to
>raise the barrier for merges even higher. Perhaps running the entire
>Marvin suite with the proposed merge is what we need to begin
>mandating.
>
>--David "who wishes he had kept working on Automated QA tasks" Nalley :)

How does "temper the inclusion of new features" jibe with "folks
clamoring for awesome new features"?

--
Chiradeep