Re: Autoscale groups not working

Bryan Tiang Fri, 16 Aug 2024 04:38:00 -0700

Hey Nathan,

We’ve run some tests and verified that the scale-up and scale-down policies
work smoothly for us.


We’re using Linstor as the storage, not NFS.

Not sure how else to troubleshoot the issue you’re having. Anyone else?

Regards,
Bryan
On 16 Aug 2024 at 8:44 AM +0800, Nathan Gleason 
<[email protected]>, wrote:
> Yes, this is our test & dev system. Our prod is still running 4.18.1.0 and it 
> works there. I did go through and upgrade all of the VRs. But this test was 
> with a new network and VR, new VMs etc…
>
> Thank you,
> Nathan
>
> > On Aug 15, 2024, at 20:39, Bryan Tiang <[email protected]> wrote:
> >
> > Hey Nathan,
> >
> > Strange, so just an update of the CloudStack version caused this to break?
> > I’ll check with my guys later today.
> >
> > After the upgrade, the VR system template version should also need an
> > upgrade. Has that been done? If not, will a cleanup work?
> >
> > It’s not really solving the root problem, but it is a stopgap solution.
> >
> > Regards,
> > Bryan
> > On 16 Aug 2024 at 8:31 AM +0800, Nathan Gleason 
> > <[email protected]>, wrote:
> > > Hi Bryan, thanks for the replies.
> > >
> > > We are able to create ASGs without issue as well, it’s just the scaling 
> > > that doesn’t work. We don’t have any orphan VMs from the LB. We did see 
> > > the Windows nugget, but these are all just regular Ubuntu VMs. We haven’t 
> > > had any node failures.
> > >
> > > Thank you,
> > > Nathan
> > >
> > > > On Aug 15, 2024, at 20:05, Bryan Tiang <[email protected]> wrote:
> > > >
> > > > Hey Nathan,
> > > >
> > > > As for the metrics not showing right, do you happen to be using Windows 
> > > > Guest VMs for the autoscale?
> > > >
> > > > I remember something about Windows VMs not being able to get metrics on 
> > > > CloudStack correctly unless some setting was made in the hypervisor.
> > > >
> > > > Regards,
> > > > Bryan
> > > > On 16 Aug 2024 at 8:03 AM +0800, Bryan Tiang 
> > > > <[email protected]>, wrote:
> > > > > Hey Nathan,
> > > > >
> > > > > > Our company uses around 20 Autoscale Groups at the moment on
> > > > > 4.19.1.1 with KVM.
> > > > >
> > > > > > Since the upgrade, we definitely tested being able to create new ASGs
> > > > > > without issue, but I’ve not tested the scale up and down scenario.
> > > > >
> > > > > > But we did find a bug where, if an ASG VM is restarted after a node
> > > > > > failure, the scale-down policies don’t work anymore. GitHub ticket
> > > > > > below:
> > > > >
> > > > > https://github.com/apache/cloudstack/issues/9336
> > > > >
> > > > > Might affect Scale Up policies too.
> > > > >
> > > > > > What we did was recreate the VR with Cleanup. If that doesn’t work,
> > > > > > turn off the policy, delete the affected ASG VMs and turn it on again
> > > > > > so all VMs are new again.
> > > > >
> > > > > > Another possible scenario I can think of is that your ASG VM became
> > > > > > an orphan for some reason (again from a restart after node failure),
> > > > > > where the VM is restarted but not recorded under the LB. You can go
> > > > > > to ASG Group > LB > Press + icon to display all the VMs and see if
> > > > > > yours is there. If not, this can explain why the scale up/down
> > > > > > policies no longer work.
> > > > >
> > > > > https://github.com/apache/cloudstack/issues/9145
> > > > >
> > > > > Regards,
> > > > > Bryan
> > > > > On 16 Aug 2024 at 6:19 AM +0800, Nathan Gleason 
> > > > > <[email protected]>, wrote:
> > > > > > Hello,
> > > > > >
> > > > > > > We’ve recently upgraded CloudStack from 4.18.1.0 to 4.19.1.0, then
> > > > > > to 4.19.1.1. While testing Autoscale we’ve found that a ScaleUp 
> > > > > > policy for “VM CPU - average percentage” does not work. We run 
> > > > > > cpuburn on the VM to load the CPU to 100%. We see the metrics in 
> > > > > > the UI as well as in the autoscale_vmgroup_statistics table. But 
> > > > > > scale up never happens. We’ve set the threshold anywhere from 1% to 
> > > > > > > 50% but it does not work. We have restarted cloudstack-management,
> > > > > > cloudstack-agent, libvirtd, etc… Has anyone encountered this?
> > > > > >
> > > > > > This may be unrelated but we have also noticed that memory metrics 
> > > > > > for all of the VMs are incorrect. We load the memory with memtester 
> > > > > > and while the VM shows the memory usage properly, the metrics do 
> > > > > > not. We found this while testing “VM Memory - average percentage” 
> > > > > > in ScaleUp policies.
> > > > > >
> > > > > > Versions:
> > > > > >
> > > > > > OS: Ubuntu 22.04
> > > > > > > CloudStack: 4.19.1.1
> > > > > > Hypervisor: KVM
> > > > > >
> > > > > > Libvirt:
> > > > > > Compiled against library: libvirt 8.0.0
> > > > > > Using library: libvirt 8.0.0
> > > > > > Using API: QEMU 8.0.0
> > > > > > Running hypervisor: QEMU 6.2.0
> > > > > >
> > > > > > Thank you,
> > > > > > Nathan
> > >
>
