Maybe I am wrong, but are we debating a problem that can be fixed by
adding a simple package to the systemvm?


On Wed, May 22, 2013 at 1:01 PM, Chiradeep Vittal <
chiradeep.vit...@citrix.com> wrote:

> As the author of the original systemvm (and current contributor to the
> systemvm), I can confidently state that this issue has been there since
> 2.2.0.
> The issue is that the Debian 2.6.32 kernel is a PVOPS kernel. All PVOPS
> kernels require ntp to keep time sync.
> http://www.gossamer-threads.com/lists/xen/users/234750
>
> On 5/22/13 9:56 AM, "Marcus Sorensen" <shadow...@gmail.com> wrote:
>
> >If this were creating a new bug, for example "oh, your VPCs won't work
> >anymore for this release", or "here's a new UI, but it's really buggy
> >and barely functional" then I'd agree with this train of thought.
> >Instead, we are saying "we recently found out that since 2.2.x
> >cloudstack has had this behavior, and it will be fixed in 4.2"*.
> >That's a totally different thing. If 4.1 ends up being a poor quality
> >release that everyone remembers compared to others, it's not going to
> >be because we didn't address something that has been around for
> >several releases, that nobody has noticed.
> >
> >* Assuming we verify that it's not a regression, which I'm still very
> >interested in knowing
> >
> >On Wed, May 22, 2013 at 9:51 AM, John Burwell <jburw...@basho.com> wrote:
> >> Marcus,
> >>
> >> I would say that the only thing for an open source project worse than
> >>not releasing is releasing a poor quality release.  A late release with
> >>high quality is soon forgotten.  An on-time or late release with poor
> >>quality lingers in folks' memory. The KDE project made the near fatal
> >>mistake of following the same logic when they released 4.0, and the
> >>reputation of KDE 4.x continues to suffer from it to this day.
> >>CloudStack is trusted to run at the core of our users' operations.  In my
> >>view, if we err, we should err on the side of quality to avoid
> >>erosion of that trust.  If we ever lost that trust, our new features
> >>would never be evaluated.
> >
> >>
> >> Thanks,
> >> -John
> >>
> >> On May 22, 2013, at 11:18 AM, Marcus Sorensen <shadow...@gmail.com>
> >>wrote:
> >>
> >>> Thanks for the response. Time sync is certainly an issue, I think one
> >>> of the things we are trying to gauge is whether the system vm
> >>> functionality has been impacted by time sync such that anyone has
> >>> noticed or cared.  That's not to detract from the point that having
> >>> time sync is optimal, and affects a lot of things, but functionally,
> >>> back to my item #1, can we confirm that earlier versions have gotten
> >>> out of sync, and if so, do we have bug reports showing that it has
> >>> mattered?
> >>>
> >>>  To counter the argument, there are plenty of people looking for the
> >>> features in 4.1, that wouldn't choose cloudstack because it's not
> >>> released yet. Then there's the delay impact to 4.2, and keeping all of
> >>> those features out of the hands of people as well.
> >>>
> >>> For me, the fear is that we end up pushing 4.1 back to or near where
> >>> 4.2 would have been otherwise released, at which point we haven't
> >>> really accomplished anything but delayed the release of the working
> >>> features in 4.1.
> >>>
> >>>
> >>> On Wed, May 22, 2013 at 9:09 AM, John Burwell <jburw...@basho.com>
> >>>wrote:
> >>>> Marcus,
> >>>>
> >>>> For me, S3 integration and Xen feature parity are not the primary
> >>>>reasons that this defect should remain a blocker.  Time
> >>>>synchronization is a basic and essential assumption for systems such
> >>>>as CloudStack.  This defect yields file and log timestamps from
> >>>>secondary storage that are unreliable -- impacting customers in an
> >>>>accredited environment (e.g. SOX) or that rely on those timestamps for
> >>>>any downstream operations.  It also stands as a significant impediment
> >>>>to operational debugging.  Additionally, as others have pointed out,
> >>>>time drifts also impact encryption, and possibly handshake operations
> >>>>between the system VMs and the management server.  While I appreciate and
> >>>>fully support a time-based release cycle, there has to be a quality
> >>>>threshold for any release.  Looking at it from an operations
> >>>>perspective, failure to maintain time sync across components is
> >>>>unacceptable.   Assuming I used Xen, I ask myself, "Would I deploy a
> >>>>4.1.0 if the known issues list stated that the system VMs could not
> >>>>maintain time sync?", and, without hesitation, I would answer, "No.",
> >>>>and follow it up quickly, "Oh no, I hope the release I have in
> >>>>production doesn't have this problem."
> >>>>
> >>>> Thanks,
> >>>> -John
> >>>>
> >>>> On May 22, 2013, at 10:35 AM, Marcus Sorensen <shadow...@gmail.com>
> >>>>wrote:
> >>>>
> >>>>> I feel like we need to clarify what's at risk here. Not to disrespect
> >>>>> anyone's opinion, but I'm just not getting where this is being
> >>>>> considered a major feature.  I think the very idea of Xen not having
> >>>>> feature parity (regardless of the feature) is distasteful to a lot of
> >>>>> us, and it should be. But consider that we are already two months
> >>>>> behind on a four month release cycle, and it sounds like fixing this
> >>>>> could take a month (if no issues are found, two weeks to qual the new
> >>>>> template). We run a time-based release, not a feature-based release.
> >>>>> Not all features are expected to be fully functional to get out the
> >>>>> door. Isn't the correct option to just mark the feature experimental,
> >>>>> tell them to run the newer template at their risk if they want it?
> >>>>>
> >>>>> 1) We need to verify whether this bug has been around for a long
> >>>>> time, because it will tell us how much it really matters and thus
> >>>>> whether or not it's a blocker. This addresses the "timestamp of logs"
> >>>>> and other issues not related to new features.
> >>>>>
> >>>>> 2) We need to reiterate exactly what features are being affected. The
> >>>>> original e-mail lists 'S3 integration' as the only feature affected.
> >>>>> As far as I understand it, the actual feature impacted is a
> >>>>> 'secondary storage sync': if you have multiple zones and multiple
> >>>>> secondary storages, this backs up and handles the copying of
> >>>>> templates, etc., so you don't have to manually register them
> >>>>> everywhere.
> >>>>>
> >>>>> I appreciate John's work for getting that secondary storage sync
> >>>>> feature in place. I really wish we would have noticed the issue
> >>>>> earlier on, then we may not be having this discussion. That said, no
> >>>>> disrespect intended toward John, I'm having a hard time understanding
> >>>>> how this is a feature worth holding up the release. It's not a new
> >>>>> primary or secondary storage type integration, and it's not a feature
> >>>>> where the admin is helpless to do it themselves. If VPC doesn't work,
> >>>>> the admin can't do anything about it. If this sync doesn't work, the
> >>>>> admin writes a script that copies their stuff everywhere.
> >>>>>
> >>>>> Please, if anyone considers this a major feature worth blocking on,
> >>>>> explain to us why. Are you willing to push back release of all of the
> >>>>> other new features, and push back the 4.2 features, to have this one
> >>>>> feature in June, or whenever 4.1 gets out?
> >>>>>
> >>>>>
> >>>>> On Wed, May 22, 2013 at 2:14 AM, Sebastien Goasguen
> >>>>><run...@gmail.com> wrote:
> >>>>>> +1 on moving forward.
> >>>>>>
> >>>>>> On this issue and on the upgrade issue I have realized that we
> >>>>>>forgot about our time based release philosophy.
> >>>>>>
> >>>>>> There will always be bugs in the software. If we know them we can
> >>>>>>acknowledge them in release notes and get started quickly on the
> >>>>>>next releases.
> >>>>>>
> >>>>>> To keep it short, I am now of the opinion (and I know I am kind of
> >>>>>>switching mind here), that we should release 4.1 asap and start
> >>>>>>working on the bug fix versions right away.
> >>>>>>
> >>>>>> If we do release often, then folks stuck on a particular bug can
> >>>>>>expect a quick turn around and fix of their problems.
> >>>>>>
> >>>>>> -sebastien
> >>>>>>
> >>>>>> On May 22, 2013, at 2:59 AM, Mathias Mullins
> >>>>>><mathias.mull...@citrix.com> wrote:
> >>>>>>
> >>>>>>> -1 on this.
> >>>>>>>
> >>>>>>> New features really should be across the board for the
> >>>>>>> Hypervisors. Part of the thing that distinguishes ACS is its
> >>>>>>> support across Xen / VMware / KVM. Do we really want to start
> >>>>>>> getting in the habit of pushing forward new features that are not
> >>>>>>> across the fully functional hypervisors?
> >>>>>>>
> >>>>>>> I agree with Outback this also will start to affect the Xen/XCP
> >>>>>>> community by basically setting them apart and out on what a lot of
> >>>>>>> people see as a major feature.
> >>>>>>>
> >>>>>>> I think it sets a really bad precedent. If it was Hyper-V which is
> >>>>>>> not fully functional and not a major feature-set right now, I would
> >>>>>>> be +1 on this.
> >>>>>>>
> >>>>>>> MHO
> >>>>>>> Matt
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 5/20/13 4:15 PM, "Chip Childers" <chip.child...@sungard.com>
> >>>>>>>wrote:
> >>>>>>>
> >>>>>>>> All,
> >>>>>>>>
> >>>>>>>> As discussed on another thread [1], we identified a bug
> >>>>>>>> (CLOUDSTACK-2492) in the current 3.x system VMs, where the System
> >>>>>>>> VMs are not configured to sync their time with either the host HV
> >>>>>>>> or an NTP service.  That bug affects the system VMs for all three
> >>>>>>>> primary HVs (KVM, Xen and vSphere).  Patches have been committed
> >>>>>>>> addressing vSphere and KVM.  It appears that a correction for Xen
> >>>>>>>> would require the re-build of a system VM image and a full round of
> >>>>>>>> regression testing that image.
> >>>>>>>>
> >>>>>>>> Given that the discussion thread has not resulted in a consensus
> >>>>>>>> on this issue, I unfortunately believe that the only path forward
> >>>>>>>> is to call for a formal VOTE.
> >>>>>>>>
> >>>>>>>> Please respond with one of the following:
> >>>>>>>>
> >>>>>>>> +1: proceed with 4.1 without the Xen portion of CLOUDSTACK-2492
> >>>>>>>> being resolved
> >>>>>>>> +0: don't care one way or the other
> >>>>>>>> -1: do *not* proceed with any further 4.1 release candidates until
> >>>>>>>> CLOUDSTACK-2492 has been fully resolved
> >>>>>>>>
> >>>>>>>> -chip
> >>>>>>>>
> >>>>>>>> [1] http://markmail.org/message/rw7vciq3r33biasb
> >>>>>>>
> >>>>>>
> >>>>
> >>
>
>
