Hi Rohit,

Like it! Our VR system is due for some rethinking! I don't have many points to add to the issues you highlight; the list seems pretty complete.

Here are some more features or ideas that would be interesting to see in a new VR system:

- Use of, or support for, a routing protocol such as BGP or OSPF, so we could provide a more dynamic PrivateGateway concept. Using FRR [1]?
- Have an API-driven way to configure iptables and other network services.
- Could we decouple network services such as LB, VPN server, and gateways from the VR?

Debian has been a pain for building the VR because the ISO defined in our config needs constant updating, but on the other hand it has proven to be a reliable OS: we have seen some of our VRs with uptime over 1000 days. As an alternative OS, I'd be curious to look at VyOS [2] or Alpine Linux.

From a certain perspective, us providing the systemvm template makes sure that the systemVM will deploy and work reliably, and it leaves a single template to test. By comparison, a system that only provides RPM/DEB packages and a mechanism to push configs could require testing all kinds of template scenarios, since users could use any version of any distro to deploy their systemVMs/VRs.

"Users forget to register the right systemvmtemplate for a new ACS version and upgrades fail": Maybe we could automatically register the new template during the upgrade process? This feature used to exist in Citrix CloudPlatform.

[1] https://frrouting.org/
[2] https://github.com/vyos/vyos-1x

On Wed, Aug 11, 2021 at 12:37 PM Rohit Yadav <rohit.ya...@shapeblue.com> wrote:

> All,
>
> We've over the years create a VR that largely is stable but we've
> discussed the pain of maintaining, extending, and upgrading VRs both in
> lists and in user-groups and CCC conferences.
>
> On a high-level the pain points are:
>
>   * It's difficult to debug, investigate VR for operators and support
>     team
>   * Maintaining the VR code, fixing bugs, implementing features is a
>     pain for developers; further the xml&json databags based
>     programming model is confusing
>   * Any fix or changes requires a new systemvmtemplate or VR codebase
>     whose patching requires restarting the VR or destroying an old VR
>   * No uniform VR programming API (current approach is SSH+databags and
>     execute a script), this makes testing VR difficult and QA in
>     isolation is not possible
>   * Users forget to register the right systemvmtemplate for a new ACS
>     version and upgrades fail
>   * Others (please share yours)?
>
> Among these pain points my colleagues have proposed a PR targeting 4.16
> [1] that aims to unify systemvmtemplate as a building block that is bundled
> as part of ACS rpm/deb/* packages which CloudStack will automatically
> seed/register/use with which upgrades will be as simple as a yum update or
> apt-get upgrade. Further, my colleagues and I are exploring a live patch
> API which in near future that can patch a running systemvm/VR using
> systemvm.iso (or deprecate systemvm.iso and use ssh/scp to patch?) without
> requiring to reboot/recreate it. Hopefully, this addresses some of those
> pain points. We request the community for your feedback and
> review/participation in the PR.
>
> Open questions, topics to discuss and gather feedback:
>
>   * VR programming: Should we explore a new light-weight VR agent that
>     provides an API (restful/grpc, or CLI?), some mechanism of live
>     patching VR code, packages, and kernel?
>   * Refactoring isolated network and VPC codebases into a unified
>     codebase and feature sets (assumption that isolated network are
>     largely a VPC with single tier), does it benefit the community,
>     users, and developers?
>   * Underlying OS:
>       * should we consider something other than Debian, any suggestions?
>       * or explore stable/widely used and maintained opensource router
>         distributions such as OpenWRT [2] which ships with a UI and
>         CLI/configuration system UCI [3]? The cons of the approach are
>         a new dependency and some likely missing packages.
>       * In current VR codebase, most of the effort is spent in
>         implementing/maintaining router packages/configure codebase
>         which we can get rid of by depending on a stable Linux router
>         distro which ships with some API/config system. Not choosing an
>         existing router distribution means we continue to DIY router
>         programming+config management codebase.
>   * Any other ideas?
>
> Thoughts, feedback?
>
> [1] https://github.com/apache/cloudstack/pull/4329
> [2] https://github.com/openwrt/openwrt/graphs/contributors
> [3] https://openwrt.org/docs/guide-user/base-system/uci
>
>
> Regards.
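
To make the API-driven iptables idea from the reply a bit more concrete, here is a rough Python sketch. Everything in it is hypothetical: FirewallRule, to_iptables_args and the allowed-value sets are names made up for illustration and are not part of the CloudStack codebase. Only the iptables flags themselves (-A, -p, -s, --dport, -j) are real. The point is that an agent on the VR could accept a structured rule, validate it, and only then build the command, rather than receiving databags and scripts over SSH.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical allow-lists; a real agent would likely support more.
ALLOWED_CHAINS = {"INPUT", "FORWARD", "OUTPUT"}
ALLOWED_PROTOCOLS = {"tcp", "udp", "icmp"}
ALLOWED_ACTIONS = {"ACCEPT", "DROP", "REJECT"}


@dataclass(frozen=True)
class FirewallRule:
    """Hypothetical structured rule that an API request would carry."""
    chain: str
    protocol: str
    action: str
    source: Optional[str] = None  # source CIDR, e.g. "10.1.1.0/24"
    dport: Optional[int] = None   # destination port, tcp/udp only


def to_iptables_args(rule: FirewallRule) -> List[str]:
    """Validate a rule and translate it into an iptables argument list."""
    if rule.chain not in ALLOWED_CHAINS:
        raise ValueError(f"unsupported chain: {rule.chain}")
    if rule.protocol not in ALLOWED_PROTOCOLS:
        raise ValueError(f"unsupported protocol: {rule.protocol}")
    if rule.action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {rule.action}")

    args = ["iptables", "-A", rule.chain, "-p", rule.protocol]

    if rule.source is not None:
        # Raises ValueError if the CIDR is malformed.
        ipaddress.ip_network(rule.source)
        args += ["-s", rule.source]

    if rule.dport is not None:
        if rule.protocol not in ("tcp", "udp"):
            raise ValueError("dport is only valid for tcp or udp")
        if not 1 <= rule.dport <= 65535:
            raise ValueError(f"port out of range: {rule.dport}")
        args += ["--dport", str(rule.dport)]

    args += ["-j", rule.action]
    return args


if __name__ == "__main__":
    ssh_rule = FirewallRule("INPUT", "tcp", "ACCEPT", "10.1.1.0/24", 22)
    # The agent would hand this list to subprocess.run(); here we only print it.
    print(" ".join(to_iptables_args(ssh_rule)))
```

Because the rule is validated before any command is built, the translation step can be unit-tested without a running VR, which speaks to the "QA in isolation is not possible" pain point in the original message.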