If this were creating a new bug, for example "oh, your VPCs won't work anymore for this release", or "here's a new UI, but it's really buggy and barely functional", then I'd agree with this train of thought. Instead, we are saying "we recently found out that CloudStack has had this behavior since 2.2.x, and it will be fixed in 4.2"*. That's a totally different thing. If 4.1 ends up being a poor-quality release that everyone remembers compared to others, it's not going to be because we didn't address something that has been around for several releases and that nobody has noticed.
* Assuming we verify that it's not a regression, which I'm still very interested in knowing.

On Wed, May 22, 2013 at 9:51 AM, John Burwell <jburw...@basho.com> wrote:
> Marcus,
>
> I would say that the only thing for an open source project worse than not
> releasing is releasing a poor quality release. A late release with high
> quality is soon forgotten. An on-time or late release with poor quality
> lingers in folks' memory. The KDE project made the near-fatal mistake of
> following the same logic when they released 4.0, and the reputation of KDE 4.x
> continues to suffer from it to this day. CloudStack is trusted to run at the
> core of our users' operations. In my view, if we err, we should err on the side
> of quality to avoid erosion of that trust. If we ever lost that trust,
> our new features would never be evaluated.
>
> Thanks,
> -John
>
> On May 22, 2013, at 11:18 AM, Marcus Sorensen <shadow...@gmail.com> wrote:
>
>> Thanks for the response. Time sync is certainly an issue, I think one
>> of the things we are trying to gauge is whether the system vm
>> functionality has been impacted by time sync such that anyone has
>> noticed or cared. That's not to detract from the point that having
>> time sync is optimal, and affects a lot of things, but functionally,
>> back to my item #1, can we confirm that earlier versions have gotten
>> out of sync, and if so, do we have bug reports showing that it has
>> mattered?
>>
>> To counter the argument, there are plenty of people looking for the
>> features in 4.1, that wouldn't choose cloudstack because it's not
>> released yet. Then there's the delay impact to 4.2, and keeping all of
>> those features out of the hands of people as well.
>>
>> For me, the fear is that we end up pushing 4.1 back to or near where
>> 4.2 would have been otherwise released, at which point we haven't
>> really accomplished anything but delayed the release of the working
>> features in 4.1.
>>
>>
>> On Wed, May 22, 2013 at 9:09 AM, John Burwell <jburw...@basho.com> wrote:
>>> Marcus,
>>>
>>> For me, S3 integration and Xen feature parity are not the primary reasons
>>> that this defect should remain a blocker. Time synchronization is a basic
>>> and essential assumption for systems such as CloudStack. This defect
>>> yields file and log timestamps from secondary storage that are unreliable
>>> -- impacting customers in an accredited environment (e.g. SOX) or that rely
>>> on those timestamps for any downstream operations. It also stands as a
>>> significant impediment to operational debugging. Additionally, as others
>>> have pointed out, time drifts also impact encryption, and possibly
>>> handshake operations between the system VMs and management server. While
>>> I appreciate and fully support a time-based release cycle, there has to be
>>> a quality threshold for any release. Looking at it from an operations
>>> perspective, failure to maintain time sync across components is
>>> unacceptable. Assuming I used Xen, I ask myself, "Would I deploy a 4.1.0
>>> if the known issues list stated that the system VMs could not maintain time
>>> sync?", and, without hesitation, I would answer, "No.", and follow it up
>>> quickly, "Oh no, I hope the release I have in production doesn't have this
>>> problem."
>>>
>>> Thanks,
>>> -John
>>>
>>> On May 22, 2013, at 10:35 AM, Marcus Sorensen <shadow...@gmail.com> wrote:
>>>
>>>> I feel like we need to clarify what's at risk here. Not to disrespect
>>>> anyone's opinion, but I'm just not getting where this is being
>>>> considered a major feature. I think the very idea of Xen not having
>>>> feature parity (regardless of the feature) is distasteful to a lot of
>>>> us, and it should be. But consider that we are already two months
>>>> behind on a four month release cycle, and it sounds like fixing this
>>>> could take a month (if no issues are found, two weeks to qual the new
>>>> template).
>>>> We run a time-based release, not a feature-based release.
>>>> Not all features are expected to be fully functional to get out the
>>>> door. Isn't the correct option to just mark the feature experimental,
>>>> tell them to run the newer template at their risk if they want it?
>>>>
>>>> 1) We need to verify whether this bug has been around for a long time,
>>>> because it will tell us how much it really matters and thus whether or
>>>> not it's a blocker. This addresses the 'timestamp of logs' and other
>>>> issues not related to new features.
>>>>
>>>> 2) We need to reiterate exactly what features are being affected. The
>>>> original e-mail lists 'S3 integration' as the only feature affected.
>>>> As far as I understand it, the actual feature impacted is a 'secondary
>>>> storage sync', if you have multiple zones, multiple secondary
>>>> storages, this backs up and handles the copying of templates, etc so
>>>> you don't have to manually register them everywhere.
>>>>
>>>> I appreciate John's work for getting that secondary storage sync
>>>> feature in place. I really wish we would have noticed the issue
>>>> earlier on, then we may not be having this discussion. That said, no
>>>> disrespect intended toward John, I'm having a hard time understanding
>>>> how this is a feature worth holding up the release. It's not a new
>>>> primary or secondary storage type integration, and it's not a feature
>>>> where the admin is helpless to do it themselves. If VPC doesn't work,
>>>> the admin can't do anything about it. If this sync doesn't work, the
>>>> admin writes a script that copies their stuff everywhere.
>>>>
>>>> Please, if anyone considers this a major feature worth blocking on,
>>>> explain to us why. Are you willing to push back release of all of the
>>>> other new features, and push back the 4.2 features, to have this one
>>>> feature in June, or whenever 4.1 gets out?
>>>>
>>>>
>>>> On Wed, May 22, 2013 at 2:14 AM, Sebastien Goasguen <run...@gmail.com>
>>>> wrote:
>>>>> +1 on moving forward.
>>>>>
>>>>> On this issue and on the upgrade issue I have realized that we forgot
>>>>> about our time based release philosophy.
>>>>>
>>>>> There will always be bugs in the software. If we know them we can
>>>>> acknowledge them in release notes and get started quickly on the next
>>>>> releases.
>>>>>
>>>>> To keep it short, I am now of the opinion (and I know I am kind of
>>>>> switching mind here), that we should release 4.1 asap and start working
>>>>> on the bug fix versions right away.
>>>>>
>>>>> If we do release often, then folks stuck on a particular bug can expect a
>>>>> quick turnaround and fix of their problems.
>>>>>
>>>>> -sebastien
>>>>>
>>>>> On May 22, 2013, at 2:59 AM, Mathias Mullins <mathias.mull...@citrix.com>
>>>>> wrote:
>>>>>
>>>>>> -1 on this.
>>>>>>
>>>>>> New features really should be across the board for the Hypervisors. Part
>>>>>> of the thing that distinguishes ACS is its support across Xen / VMware /
>>>>>> KVM. Do we really want to start getting in the habit of pushing forward
>>>>>> new features that are not across the fully functional hypervisors?
>>>>>>
>>>>>> I agree with Outback this also will start to affect the Xen/XCP community
>>>>>> by basically setting them apart and out on what a lot of people see as a
>>>>>> major feature.
>>>>>>
>>>>>> I think it sets a really bad precedent. If it was Hyper-V which is not
>>>>>> fully functional and not a major feature-set right now, I would be +1 on
>>>>>> this.
>>>>>>
>>>>>> MHO
>>>>>> Matt
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 5/20/13 4:15 PM, "Chip Childers" <chip.child...@sungard.com> wrote:
>>>>>>
>>>>>>> All,
>>>>>>>
>>>>>>> As discussed on another thread [1], we identified a bug
>>>>>>> (CLOUDSTACK-2492) in the current 3.x system VMs, where the System VMs
>>>>>>> are not configured to sync their time with either the host HV or an NTP
>>>>>>> service.
>>>>>>> That bug affects the system VMs for all three primary HVs
>>>>>>> (KVM, Xen and vSphere). Patches have been committed addressing vSphere and
>>>>>>> KVM. It appears that a correction for Xen would require the re-build of
>>>>>>> a system VM image and a full round of regression testing that image.
>>>>>>>
>>>>>>> Given that the discussion thread has not resulted in a consensus on this
>>>>>>> issue, I unfortunately believe that the only path forward is to call for
>>>>>>> a formal VOTE.
>>>>>>>
>>>>>>> Please respond with one of the following:
>>>>>>>
>>>>>>> +1: proceed with 4.1 without the Xen portion of CLOUDSTACK-2492 being
>>>>>>> resolved
>>>>>>> +0: don't care one way or the other
>>>>>>> -1: do *not* proceed with any further 4.1 release candidates until
>>>>>>> CLOUDSTACK-2492 has been fully resolved
>>>>>>>
>>>>>>> -chip
>>>>>>>
>>>>>>> [1] http://markmail.org/message/rw7vciq3r33biasb
>>>>>>
>>>>>
>>>
>