Maybe I am wrong, but are we debating a problem that could be fixed by adding a simple package to the systemvm?
On Wed, May 22, 2013 at 1:01 PM, Chiradeep Vittal <chiradeep.vit...@citrix.com> wrote:
> As the author of the original systemvm (and current contributor to the
> systemvm), I can confidently state that this issue has been there since
> 2.2.0.
> The issue is that the Debian 2.6.32 kernel is a PVOPS kernel. All PVOPS
> kernels require ntp to keep time sync.
> http://www.gossamer-threads.com/lists/xen/users/234750
>
> On 5/22/13 9:56 AM, "Marcus Sorensen" <shadow...@gmail.com> wrote:
>
> >If this were creating a new bug, for example "oh, your VPCs won't work
> >anymore for this release", or "here's a new UI, but it's really buggy
> >and barely functional" then I'd agree with this train of thought.
> >Instead, we are saying "we recently found out that since 2.2.x
> >cloudstack has had this behavior, and it will be fixed in 4.2"*.
> >That's a totally different thing. If 4.1 ends up being a poor quality
> >release that everyone remembers compared to others, it's not going to
> >be because we didn't address something that has been around for
> >several releases, that nobody has noticed.
> >
> >* Assuming we verify that it's not a regression, which I'm still very
> >interested in knowing
> >
> >On Wed, May 22, 2013 at 9:51 AM, John Burwell <jburw...@basho.com> wrote:
> >> Marcus,
> >>
> >> I would say that the only thing for an open source project worse than
> >> not releasing is releasing a poor quality release. A late release with
> >> high quality is soon forgotten. An on-time or late release with poor
> >> quality lingers in folks' memory. The KDE project made the near fatal
> >> mistake of following the same logic when they released 4.0, and the
> >> reputation of KDE 4.x continues to suffer from it to this day.
> >> CloudStack is trusted to run at the core of our users' operations. In my
> >> view, if we err, we should err on the side of quality to avoid
> >> erosion of that trust. If we ever lost that trust, our new features
> >> would never be evaluated.
> >>
> >> Thanks,
> >> -John
> >>
> >> On May 22, 2013, at 11:18 AM, Marcus Sorensen <shadow...@gmail.com> wrote:
> >>
> >>> Thanks for the response. Time sync is certainly an issue, I think one
> >>> of the things we are trying to gauge is whether the system vm
> >>> functionality has been impacted by time sync such that anyone has
> >>> noticed or cared. That's not to detract from the point that having
> >>> time sync is optimal, and affects a lot of things, but functionally,
> >>> back to my item #1, can we confirm that earlier versions have gotten
> >>> out of sync, and if so, do we have bug reports showing that it has
> >>> mattered?
> >>>
> >>> To counter the argument, there are plenty of people looking for the
> >>> features in 4.1, that wouldn't choose cloudstack because it's not
> >>> released yet. Then there's the delay impact to 4.2, and keeping all of
> >>> those features out of the hands of people as well.
> >>>
> >>> For me, the fear is that we end up pushing 4.1 back to or near where
> >>> 4.2 would have been otherwise released, at which point we haven't
> >>> really accomplished anything but delayed the release of the working
> >>> features in 4.1.
> >>>
> >>>
> >>> On Wed, May 22, 2013 at 9:09 AM, John Burwell <jburw...@basho.com> wrote:
> >>>> Marcus,
> >>>>
> >>>> For me, S3 integration and Xen feature parity are not the primary
> >>>> reasons that this defect should remain a blocker. Time
> >>>> synchronization is a basic and essential assumption for systems such
> >>>> as CloudStack. This defect yields file and log timestamps from
> >>>> secondary storage that are unreliable -- impacting customers in an
> >>>> accredited environment (e.g. SOX) or that rely on those timestamps for
> >>>> any downstream operations. It also stands as a significant impediment
> >>>> to operational debugging.
> >>>> Additionally, as others have pointed out,
> >>>> time drifts also impact encryption, and possibly handshake operations
> >>>> between the system VMs and management server. While I appreciate and
> >>>> fully support a time-based release cycle, there has to be a quality
> >>>> threshold for any release. Looking at it from an operations
> >>>> perspective, failure to maintain time sync across components is
> >>>> unacceptable. Assuming I used Xen, I ask myself, "Would I deploy a
> >>>> 4.1.0 if the known issues list stated that the system VMs could not
> >>>> maintain time sync?", and, without hesitation, I would answer, "No.",
> >>>> and follow it up quickly, "Oh no, I hope the release I have in
> >>>> production doesn't have this problem."
> >>>>
> >>>> Thanks,
> >>>> -John
> >>>>
> >>>> On May 22, 2013, at 10:35 AM, Marcus Sorensen <shadow...@gmail.com> wrote:
> >>>>
> >>>>> I feel like we need to clarify what's at risk here. Not to disrespect
> >>>>> anyone's opinion, but I'm just not getting where this is being
> >>>>> considered a major feature. I think the very idea of Xen not having
> >>>>> feature parity (regardless of the feature) is distasteful to a lot of
> >>>>> us, and it should be. But consider that we are already two months
> >>>>> behind on a four month release cycle, and it sounds like fixing this
> >>>>> could take a month (if no issues are found, two weeks to qual the new
> >>>>> template). We run a time-based release, not a feature-based release.
> >>>>> Not all features are expected to be fully functional to get out the
> >>>>> door. Isn't the correct option to just mark the feature experimental,
> >>>>> tell them to run the newer template at their risk if they want it?
> >>>>>
> >>>>> 1) We need to verify whether this bug has been around for a long time,
> >>>>> because it will tell us how much it really matters and thus whether or
> >>>>> not it's a blocker.
> >>>>> This addresses the 'timestamp of logs' and other
> >>>>> issues not related to new features.
> >>>>>
> >>>>> 2) We need to reiterate exactly what features are being affected. The
> >>>>> original e-mail lists 'S3 integration' as the only feature affected.
> >>>>> As far as I understand it, the actual feature impacted is a 'secondary
> >>>>> storage sync', if you have multiple zones, multiple secondary
> >>>>> storages, this backs up and handles the copying of templates, etc so
> >>>>> you don't have to manually register them everywhere.
> >>>>>
> >>>>> I appreciate John's work for getting that secondary storage sync
> >>>>> feature in place. I really wish we would have noticed the issue
> >>>>> earlier on, then we may not be having this discussion. That said, no
> >>>>> disrespect intended toward John, I'm having a hard time understanding
> >>>>> how this is a feature worth holding up the release. It's not a new
> >>>>> primary or secondary storage type integration, and it's not a feature
> >>>>> where the admin is helpless to do it themselves. If VPC doesn't work,
> >>>>> the admin can't do anything about it. If this sync doesn't work, the
> >>>>> admin writes a script that copies their stuff everywhere.
> >>>>>
> >>>>> Please, if anyone considers this a major feature worth blocking on,
> >>>>> explain to us why. Are you willing to push back release of all of the
> >>>>> other new features, and push back the 4.2 features, to have this one
> >>>>> feature in June, or whenever 4.1 gets out?
> >>>>>
> >>>>>
> >>>>> On Wed, May 22, 2013 at 2:14 AM, Sebastien Goasguen <run...@gmail.com> wrote:
> >>>>>> +1 on moving forward.
> >>>>>>
> >>>>>> On this issue and on the upgrade issue I have realized that we
> >>>>>> forgot about our time-based release philosophy.
> >>>>>>
> >>>>>> There will always be bugs in the software. If we know them we can
> >>>>>> acknowledge them in release notes and get started quickly on the
> >>>>>> next releases.
> >>>>>>
> >>>>>> To keep it short, I am now of the opinion (and I know I am kind of
> >>>>>> switching mind here), that we should release 4.1 asap and start
> >>>>>> working on the bug fix versions right away.
> >>>>>>
> >>>>>> If we do release often, then folks stuck on a particular bug can
> >>>>>> expect a quick turnaround and fix of their problems.
> >>>>>>
> >>>>>> -sebastien
> >>>>>>
> >>>>>> On May 22, 2013, at 2:59 AM, Mathias Mullins <mathias.mull...@citrix.com> wrote:
> >>>>>>
> >>>>>>> -1 on this.
> >>>>>>>
> >>>>>>> New features really should be across the board for the Hypervisors.
> >>>>>>> Part of the thing that distinguishes ACS is its support across
> >>>>>>> Xen / VMware / KVM. Do we really want to start getting in the habit
> >>>>>>> of pushing forward new features that are not across the fully
> >>>>>>> functional hypervisors?
> >>>>>>>
> >>>>>>> I agree with Outback this also will start to affect the Xen/XCP
> >>>>>>> community by basically setting them apart and out on what a lot of
> >>>>>>> people see as a major feature.
> >>>>>>>
> >>>>>>> I think it sets a really bad precedent. If it was Hyper-V which is
> >>>>>>> not fully functional and not a major feature-set right now, I would
> >>>>>>> be +1 on this.
> >>>>>>>
> >>>>>>> MHO
> >>>>>>> Matt
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 5/20/13 4:15 PM, "Chip Childers" <chip.child...@sungard.com> wrote:
> >>>>>>>
> >>>>>>>> All,
> >>>>>>>>
> >>>>>>>> As discussed on another thread [1], we identified a bug
> >>>>>>>> (CLOUDSTACK-2492) in the current 3.x system VMs, where the System VMs
> >>>>>>>> are not configured to sync their time with either the host HV or an
> >>>>>>>> NTP service. That bug affects the system VMs for all three primary
> >>>>>>>> HVs (KVM, Xen and vSphere). Patches have been committed addressing
> >>>>>>>> vSphere and KVM.
> >>>>>>>> It appears that a correction for Xen would require the re-build of
> >>>>>>>> a system VM image and a full round of regression testing of that
> >>>>>>>> image.
> >>>>>>>>
> >>>>>>>> Given that the discussion thread has not resulted in a consensus on
> >>>>>>>> this issue, I unfortunately believe that the only path forward is to
> >>>>>>>> call for a formal VOTE.
> >>>>>>>>
> >>>>>>>> Please respond with one of the following:
> >>>>>>>>
> >>>>>>>> +1: proceed with 4.1 without the Xen portion of CLOUDSTACK-2492 being
> >>>>>>>> resolved
> >>>>>>>> +0: don't care one way or the other
> >>>>>>>> -1: do *not* proceed with any further 4.1 release candidates until
> >>>>>>>> CLOUDSTACK-2492 has been fully resolved
> >>>>>>>>
> >>>>>>>> -chip
> >>>>>>>>
> >>>>>>>> [1] http://markmail.org/message/rw7vciq3r33biasb