Daniel P. Berrangé, Oct 22, 2024 at 16:29:
> On Tue, Oct 22, 2024 at 04:16:36PM +0200, Anthony Harivel wrote:
>> Daniel P. Berrangé, Oct 22, 2024 at 15:15:
>> > On Tue, Oct 22, 2024 at 02:46:15PM +0200, Igor Mammedov wrote:
>> >> On Fri, 18 Oct 2024 13:59:34 +0100
>> >> Daniel P. Berrangé <berra...@redhat.com> wrote:
>> >>
>> >> > On Fri, Oct 18, 2024 at 02:25:26PM +0200, Igor Mammedov wrote:
>> >> > > On Wed, 16 Oct 2024 14:56:39 +0200
>> >> > > "Anthony Harivel" <ahari...@redhat.com> wrote:
>> >> [...]
>> >>
>> >> > >
>> >> > > This also leads to a question, whether we should account for
>> >> > > non-VCPU threads at all. Looking at real hardware, those
>> >> > > MSRs return power usage of CPUs only, and they do not
>> >> > > return consumption from auxiliary system components
>> >> > > (io/memory/...). One can consider non-VCPU threads in QEMU
>> >> > > as auxiliary components, so we probably should not
>> >> > > account for them at all when modeling the same hw feature.
>> >> > > (aka be consistent with what real hw does).
>> >> >
>> >> > I understand your POV, but I think that would be a mistake,
>> >> > and would undermine the usefulness of the feature.
>> >> >
>> >> > The deployment model has a cluster of hosts and guests, all
>> >> > belonging to the same user. The user goal is to measure host
>> >> > power consumption imposed by the guest, and dynamically adjust
>> >> > guest workloads in order to minimize power consumption of the
>> >> > host.
>> >>
>> >> For cloud use-case, host side is likely in a better position
>> >> to accomplish the task of saving power by migrating VM to
>> >> another socket/host to compact idle load. (I've found at least 1
>> >> Kubernetes tool[1], which does energy monitoring). Perhaps there
>> >> are schedulers out there that do that using its data.
>>
>> I also work for the Kepler project. I use it to monitor my VM as a black
>> box and I used it inside my VM with this feature enabled. Thanks to that
>> I can optimize the workloads (dpdk application, database, ...) inside my VM.
>>
>> This is the use-case in NFV deployment and I'm pretty sure this could be
>> the use-case of many others.
>>
>> >
>> > The host admin can merely shuffle workloads around, hoping that
>> > a different packing of workloads onto machines will reduce power
>> > by some amount. You might win a few %, or low 10s of % with this
>> > if you're good at it.
>> >
>> > The guest admin can change the way their workload operates to
>> > reduce its inherent power consumption baseline. You could easily
>> > come across ways to win high 10s of % with this. That's why it
>> > is interesting to expose power consumption info to the guest
>> > admin.
>> >
>> > IOW, neither makes the other obsolete, both approaches are
>> > desirable.
>> >
>> >> > The guest workloads can impose non-negligible power consumption
>> >> > loads on non-vCPU threads in QEMU. Without that accounted for,
>> >> > any adjustments will be working from (sometimes very) inaccurate
>> >> > data.
>> >>
>> >> Perhaps adding one or several energy sensors (ex: some i2c ones),
>> >> would let us provide auxiliary threads consumption to guest, and
>> >> even make it more granular if necessary (incl. vhost user/out of
>> >> process device models or pass-through devices if they have PMU).
>> >> It would be better than further muddling vCPUs consumption
>> >> estimates with something that doesn't belong there.
>>
>> I'm confused about your statement. Like, every software power metering
>> tool out there is using RAPL (Kepler, Scaphandre, PowerMon, etc) and custom
>> sensors would be better than what everyone is using?
>> The goal is not to be accurate. The goal is to be able to compare
>> A against B in the same environment and RAPL is giving reproducible
>> values to do so.
>
> Be careful with saying "The goal is not to be accurate", as that's
> a very broad statement, and I don't think it is true.
>
> If you're doing A/B comparisons, you *do* need accuracy, in the
> sense that if a guest workload config change alters host CPU
> power consumption, you want that to be reflected in what the
> guest is told about its power usage.
>
> ie if a change in B moves some power usage from a vCPU thread
> to a non-vCPU thread, you don't want that power usage to
> disappear from what's reported to the guest. It would give you
> the false idea that B is more efficient than A, even if the
> non-vCPU thread for B was consuming x2 what the original vCPU
> thread was for A.
>
> What I think you don't need is for the absolute magnitude of
> the reported power consumption to be a precise match to the
> actual power consumption.
>
> ie if A and B are reported as 7 and 9 Watts respectively, it
> doesn't matter if the actual consumption was 12 and 15 watts.
>
Right, my bad, I agree. When I said "not accurate" I was indeed talking about the absolute magnitude of the reported power consumption. Your example above is exactly what I had in mind. Sorry for my clumsy shortcut and thanks for clarifying this important point.
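
To make the distinction concrete, here is a minimal sketch of the A/B argument. It is purely illustrative: the function names, the wattage figures and the `scale` factor are all made up for this example and have nothing to do with how the actual QEMU code computes the values. It only shows that dropping non-vCPU threads can flip the A/B ranking, while a constant error in absolute magnitude does not.

```python
# Hypothetical per-config power draw in watts (made-up numbers).
# In B, work moved off the vCPU thread onto a non-vCPU (auxiliary) thread,
# which now consumes 2x what A's vCPU thread did.
CONFIG_A = {"vcpu_watts": 4.0, "aux_watts": 1.0}
CONFIG_B = {"vcpu_watts": 1.0, "aux_watts": 8.0}


def reported_power(vcpu_watts, aux_watts, include_aux, scale=1.0):
    """Value the guest would see; `scale` models an absolute-magnitude error."""
    total = vcpu_watts + (aux_watts if include_aux else 0.0)
    return scale * total


def more_efficient(include_aux, scale=1.0):
    """Return which config ("A" or "B") looks cheaper to the guest."""
    a = reported_power(**CONFIG_A, include_aux=include_aux, scale=scale)
    b = reported_power(**CONFIG_B, include_aux=include_aux, scale=scale)
    return "A" if a < b else "B"


if __name__ == "__main__":
    print("vCPU threads only:      ", more_efficient(include_aux=False))
    print("all QEMU threads:       ", more_efficient(include_aux=True))
    print("all threads, scaled 0.6:", more_efficient(include_aux=True, scale=0.6))
```

With only vCPU threads counted, B looks cheaper even though it draws more overall; with all threads counted, A wins, and it still wins when every reported value is off by the same factor.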