Re: [ASFCS42] Proposed schedule for our next release

David Nalley Thu, 18 Apr 2013 19:29:24 -0700

On Thu, Apr 18, 2013 at 10:26 PM, Chiradeep Vittal
<chiradeep.vit...@citrix.com> wrote:
>
>
> On 4/18/13 6:41 PM, "David Nalley" <da...@gnsa.us> wrote:
>
>>On Thu, Apr 18, 2013 at 6:26 PM, Will Chan <will.c...@citrix.com> wrote:
>>>
>>> > -----Original Message-----
>>> > From: Chip Childers [mailto:chip.child...@sungard.com]
>>> > Sent: Monday, April 15, 2013 7:22 AM
>>> > To: dev@cloudstack.apache.org
>>> > Cc: cloudstack-...@incubator.apache.org
>>> > Subject: Re: [ASFCS42] Proposed schedule for our next release
>>> >
>>> > On Thu, Apr 11, 2013 at 02:50:02PM -0700, Animesh Chaturvedi wrote:
>>> > >
>>> > > I want to call out my concern on technical debt we have accumulated
>>> > > so far.
>>> > >
>>> > >  I did an analysis on JIRA bugs yesterday night PST on "Affects
>>> > > Version = 4.1" and created since Dec 2012
>>> > >
>>> > > Total records : 429
>>> > > Resolution Type (Invalid, Duplicate, Cannot reproduce etc.) : 87 (30
>>> > > Blockers, 27 Critical, 27 Major, 4 Minor)
>>> > > Valid Defects : 429-87=342
>>> > > Fixed : 246 (60 Blockers, 70 Critical, 99 Majors) out of which 217
>>> > > were fixed since Feb
>>> > > Unresolved : 96 (1 Blocker, 8 Critical, 64 Major)
>>> > >
>>> > > With this data it looks like we have fixed 2/3 of valid defects in a
>>> > > little over 2 months and are pretty much deferring around 1/3 of
>>> > > issues to a future release.
>>> > >
>>> > > I also looked at overall backlog of bugs (Critical, Major and
>>> > > Blockers only) as of 4/10/2013 - 10:0PM PST.
>>> > >
>>> > > 284 open (18 Blocker, 38 Critical, 228 Major) ; By Fix version
>>> > >     -  Release 4.0.x and prior: 13
>>> > >     -  4.1: 70
>>> > >     -  4.2 : 97
>>> > >     -  Future: 8
>>> > >     -  No version: 107
>>> > >
>>> > > Looking at that we fixed 217 bugs in roughly 2 months during the 4.1
>>> > > cycle, fixing the backlog of bugs will probably take us 2 months.
>>> > > Should we extend the 4.2 test cycle by 2 months [Original Schedule:
>>> > > 6/1 - 7/22, Extended Schedule: 6/1-9/22] to reduce the technical debt
>>> > > significantly? I would like to hear how the community wants to address
>>> > > technical debt. Based on the input and consensus I will publish the
>>> > > agreed schedule next week.
>>> > >
>>> > >
>>> >
>>> > I don't think that an extension of time changes bug counts really.
>>> > IMO, we need to pull together to have some bug-fix focused effort
>>> > applied to the code-base.  It's also another reason that I'm so big on
>>> > making sure that automated tests come in with the new features.  That
>>> > doesn't address test scenarios that human testers can come up with,
>>> > but if a developer spends the time to think about testing the basic
>>> > feature and codifies that, we should at least avoid the "this actually
>>> > doesn't work at all" types of bugs.
>>> >
>>> > There's a school of thought that says, don't build another feature
>>> > until you have sorted out the known bugs in the current features.  I
>>> > don't think we could really pull that off, but perhaps a different
>>> > thread to rally people around the bug backlog is in order?
>>> >
>>> > -chip
>>>
>>> Sorry to chime in so late to this thread as I've been offsite for the
>>> better part of this week.  I was one of the original 4 month release
>>> crowd but after the recent two releases of ACS, I'm starting to wonder
>>> if we shouldn't start moving this to a 6 month cycle instead of two.
>>> Here are some high level observations based on the previous two releases:
>>>
>>> 1. It doesn't seem like we are on a true 4 month time based release
>>> schedule.  Both 4.0 and 4.1 were delayed more than several weeks past the
>>> original proposed GA date.  4.0 was released 11/6 and let's assume that
>>> 4.1 will ship within a week or two.  That's almost a 6 month release
>>> cycle.
>>
>>So both 4.0 and 4.1 strike me as extraordinary. 4.0 was our first
>>release - and we had lots of issues to resolve. 4.1 introduced a ton
>>of packaging and name changes that I also consider to be hopefully one
>>time. Really - we've only been through our release cycle once, so I am
>>not ready to declare it perpetually behind schedule.
>>
>>
>>> Every release incurs a fixed cost of release notes, upgrade testing,
>>> etc. that I suspect at least eats a month worth of time depending on
>>> people's schedule.  That's 3 months out of the year rather than two if
>>> we can get a 6 months cycle.  We can use that extra month for other
>>> purposes if need be.  I suppose if we want to continue to release past
>>> the proposed hard GA date, then I guess it doesn't matter if it's 4 or
>>> 6 months.  It's basically a release when the release mgmt. team feels
>>> it's right to release based on current bugs, etc.
>>>
>>
>>Having seen the point releases twice now, which still need upgrade
>>testing, release notes, etc I don't get the feeling that the
>>'overhead' referred to above is the problem. Joe may disagree with
>>me.
>>
>>> 2. As more and more features/development go in, it just means more
>>> destabilization of the code.  4.0 was delayed and the majority of that
>>> work was licensing files.  4.1 got just a bit more complicated with new
>>> feature development and the delay is now much longer.  Not all features
>>> are created equal in terms of testing.  Some may require more time to
>>> develop but may not impact the entire system like for example, adding a
>>> new hypervisor.  However, work like refactoring vm sync or other more
>>> internal code could affect the entire stack and require more QA time.
>>> We need extra time for new code to settle in.
>>>
>>
>>I wonder why we would merge a feature that we can't prove doesn't break
>>the entire stack and prove that it works. Some of this is the missing
>>automation you talk about below. Essentially we have no way, sometimes
>>until months after the merge, to tell if something works or not
>>because we rely on manual QA to test it.
>>
>>> 3. ACS is still dependent largely on manual QA.  Let's face it, our
>>> automated testing/unit testing isn't mature enough quite yet and we
>>> cannot always expect manual QA to be there and on ACS schedule.
>>> CloudStack releases have some type of quality expectations as well as
>>> support for upgrades.  Upgrades and migration scripts aren't that easily
>>> automatable.  Chip and others have been very diligent on ensuring that
>>> code check in has the appropriate tests but it's not there yet.
>>>
>>> 4. ACS development is based on volunteer work and many of us have a
>>> $dayjob and may not be able to assist with fixing bugs in ACS schedule.
>>> Having only a couple of months to fix bugs and expect others to follow
>>> our ACS schedule seems a bit rushed.  Wearing my Citrix hat now, I can
>>> tell you that 2 months of QA and bug fixing is not enough to release a
>>> quality GA release.  And that is with me breathing down the necks of
>>> many of the engineers to get them fixed on time.  ACS does not have this
>>> type of culture and nor should it.  Given that, we should be a bit more
>>> flexible in terms of allowing people eventually to act on issues.
>>>
>>
>>
>>So a couple of other comments.
>>We have folks clamoring for the awesome new features. To the point
>>they are creating derivative works (which tells me we are doing some
>>things right as folks are finding it easy enough to do)
>>
>>What I gathered from reading the above doesn't really have anything to
>>do with schedule:
>>* New development destabilizes our code base, and is a threat to
>>quality and the release schedule
>>* We can not depend on the current level of manual QA to be present
>>going forward.
>>
>>This brings me to the conclusion that as a community we should seriously
>>temper our inclusion of new features and make our focus automated
>>testing until such time as pushing a release out is less about months of
>>manual QA processes and more of a decision. This makes me want to
>>raise the barrier for merges even higher. Perhaps running the entire
>>Marvin suite with the proposed merge is what we need to begin
>>mandating.
>>
>>--David "who wishes he had kept working on Automated QA tasks" Nalley :)
>
> How does "temper the inclusion of new features" jibe with "folks clamoring
> for awesome new features" ?
>


It doesn't.
But as Animesh indicates, all we are really doing is racking up
technical debt. We will have to pay the piper one way or another.
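
For anyone who wants to sanity-check the arithmetic behind the
technical-debt numbers quoted above, here is a minimal Python sketch. The
figures are copied from Animesh's JIRA analysis; the variable names are
illustrative, and the assumption that the 4.1 fix rate carries over
unchanged to the open backlog is only that, an assumption, not something
measured.

```python
# Figures quoted from Animesh's JIRA analysis earlier in this thread.
total_records = 429
not_valid = 87          # Invalid, Duplicate, Cannot reproduce, etc.
fixed = 246
unresolved = 96
fixed_since_feb = 217   # fixed in "roughly 2 months"
open_backlog = 284      # open Blocker/Critical/Major bugs as of 4/10/2013

valid = total_records - not_valid
fixed_share = fixed / valid
unresolved_share = unresolved / valid

# Assumption: the fix rate seen during 4.1 applies unchanged to the backlog.
fixes_per_month = fixed_since_feb / 2
months_for_backlog = open_backlog / fixes_per_month

print(f"valid defects: {valid}")
print(f"fixed: {fixed_share:.0%}, unresolved: {unresolved_share:.0%}")
print(f"backlog at 4.1 fix rate: {months_for_backlog:.1f} months")
```

Under that assumption the split comes out near 72% fixed and 28%
unresolved, and the open backlog works out to roughly 2.6 months of
fixing, which is in the same ballpark as the "2/3 fixed, 1/3 deferred"
and "2 months" estimates in the quoted mail.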

--David
