Hey Nathan,

We’ve run some tests and verified that the scale up and scale down policies work smoothly for us.
We’re using Linstor as the storage, not NFS. Not sure how to troubleshoot the issue you’re having anymore. Anyone else?

Regards,
Bryan
On 16 Aug 2024 at 8:44 AM +0800, Nathan Gleason <[email protected]>, wrote:
> Yes, this is our test & dev system. Our prod is still running 4.18.1.0 and it
> works there. I did go through and upgrade all of the VRs. But this test was
> with a new network and VR, new VMs etc…
>
> Thank you,
> Nathan
>
> > On Aug 15, 2024, at 20:39, Bryan Tiang <[email protected]> wrote:
> >
> > Hey Nathan,
> >
> > Strange, so just an update of the CloudStack version caused this to break?
> > I’ll check with my guys later today.
> >
> > After the upgrade, the VR system template version should need an upgrade.
> > Is that done? If not, will a cleanup work?
> >
> > It’s not really solving the root problem but it is a stopgap solution.
> >
> > Regards,
> > Bryan
> > On 16 Aug 2024 at 8:31 AM +0800, Nathan Gleason
> > <[email protected]>, wrote:
> > > Hi Bryan, thanks for the replies.
> > >
> > > We are able to create ASGs without issue as well, it’s just the scaling
> > > that doesn’t work. We don’t have any orphan VMs from the LB. We did see
> > > the Windows nugget, but these are all just regular Ubuntu VMs. We haven’t
> > > had any node failures.
> > >
> > > Thank you,
> > > Nathan
> > >
> > > > On Aug 15, 2024, at 20:05, Bryan Tiang <[email protected]> wrote:
> > > >
> > > > Hey Nathan,
> > > >
> > > > As for the metrics not showing right, do you happen to be using Windows
> > > > Guest VMs for the autoscale?
> > > >
> > > > I remember something about Windows VMs not being able to get metrics on
> > > > CloudStack correctly unless some setting was made in the hypervisor.
> > > >
> > > > Regards,
> > > > Bryan
> > > > On 16 Aug 2024 at 8:03 AM +0800, Bryan Tiang
> > > > <[email protected]>, wrote:
> > > > > Hey Nathan,
> > > > >
> > > > > Our company uses around 20 Autoscale Groups at the moment with
> > > > > 4.19.1.1 with KVM.
> > > > >
> > > > > Since the upgrade, we definitely tested being able to create new ASGs
> > > > > without issue, but I’ve not tested the scale up and down scenario.
> > > > >
> > > > > But we did find a bug where if an ASG VM is restarted after a node
> > > > > failure, the scale down policies don’t work anymore. GitHub ticket
> > > > > below:
> > > > >
> > > > > https://github.com/apache/cloudstack/issues/9336
> > > > >
> > > > > Might affect Scale Up policies too.
> > > > >
> > > > > What we did was recreate the VR with Cleanup. If that doesn’t work,
> > > > > turn off the policy, delete affected ASG VMs and turn it on again so
> > > > > all VMs are new again.
> > > > >
> > > > > Another possible scenario I can think of is that your ASG VM
> > > > > became an orphan for some reason (again from restart after node
> > > > > failure) where the VM is restarted but not recorded under the LB. You
> > > > > can go to ASG Group > LB > Press + icon to display all the VMs and
> > > > > see if yours is there. If not, this can explain why the scale up/down
> > > > > policies no longer work.
> > > > >
> > > > > https://github.com/apache/cloudstack/issues/9145
> > > > >
> > > > > Regards,
> > > > > Bryan
> > > > > On 16 Aug 2024 at 6:19 AM +0800, Nathan Gleason
> > > > > <[email protected]>, wrote:
> > > > > > Hello,
> > > > > >
> > > > > > We’ve recently upgraded CloudStack from 4.18.1.0 to 4.19.1.0, then
> > > > > > to 4.19.1.1. While testing Autoscale we’ve found that a ScaleUp
> > > > > > policy for “VM CPU - average percentage” does not work. We run
> > > > > > cpuburn on the VM to load the CPU to 100%. We see the metrics in
> > > > > > the UI as well as in the autoscale_vmgroup_statistics table. But
> > > > > > scale up never happens. We’ve set the threshold anywhere from 1% to
> > > > > > 50% but it does not work. We have restarted cloudstack-management,
> > > > > > cloudstack-agent, libvirtd, etc… Has anyone encountered this?
> > > > > >
> > > > > > This may be unrelated but we have also noticed that memory metrics
> > > > > > for all of the VMs are incorrect. We load the memory with memtester
> > > > > > and while the VM shows the memory usage properly, the metrics do
> > > > > > not. We found this while testing “VM Memory - average percentage”
> > > > > > in ScaleUp policies.
> > > > > >
> > > > > > Versions:
> > > > > >
> > > > > > OS: Ubuntu 22.04
> > > > > > Cloudstack: 4.19.1.1
> > > > > > Hypervisor: KVM
> > > > > >
> > > > > > Libvirt:
> > > > > > Compiled against library: libvirt 8.0.0
> > > > > > Using library: libvirt 8.0.0
> > > > > > Using API: QEMU 8.0.0
> > > > > > Running hypervisor: QEMU 6.2.0
> > > > > >
> > > > > > Thank you,
> > > > > > Nathan
> > > >
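
The orphan check suggested earlier in the thread (comparing the VMs in the autoscale group with the VMs shown under the LB rule) amounts to a set difference. Below is a minimal Python sketch, assuming both lists of VM IDs have already been collected by hand from the UI or from the API. The function name and the sample IDs are hypothetical and are not part of CloudStack.

```python
def find_orphan_vms(asg_vm_ids, lb_vm_ids):
    """Return the ASG VM IDs that are NOT registered under the LB rule.

    Both arguments are plain iterables of VM ID strings (hypothetical
    inputs gathered manually; this function does not talk to CloudStack).
    """
    return sorted(set(asg_vm_ids) - set(lb_vm_ids))


if __name__ == "__main__":
    # Hypothetical sample data: three VMs in the autoscale group,
    # but only two of them listed under the load balancer rule.
    asg_vms = ["vm-a", "vm-b", "vm-c"]
    lb_vms = ["vm-a", "vm-c"]

    orphans = find_orphan_vms(asg_vms, lb_vms)
    if orphans:
        print("ASG VMs missing from the LB:", ", ".join(orphans))
    else:
        print("All ASG VMs are registered under the LB.")
```

An empty result means every ASG VM is still registered under the LB, which would point away from the orphan scenario described above.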
