Re: [PATCH v6 0/3] Add support for the RAPL MSRs series

Daniel P. Berrangé Tue, 22 Oct 2024 06:16:28 -0700

On Tue, Oct 22, 2024 at 02:46:15PM +0200, Igor Mammedov wrote:
> On Fri, 18 Oct 2024 13:59:34 +0100
> Daniel P. Berrangé <berra...@redhat.com> wrote:
> 
> > On Fri, Oct 18, 2024 at 02:25:26PM +0200, Igor Mammedov wrote:
> > > On Wed, 16 Oct 2024 14:56:39 +0200
> > > "Anthony Harivel" <ahari...@redhat.com> wrote:
> [...]
> 
> > > 
> > > This also leads to a question, if we should account for
> > > not VCPU threads at all. Looking at real hardware, those
> > > MSRs return power usage of CPUs only, and they do not
> > > return consumption from auxiliary system components
> > > (io/memory/...). One can consider non VCPU threads in QEMU
> > > as auxiliary components, so we probably should not
> > > account for them at all when modeling the same hw feature.
> > > (aka be consistent with what real hw does).  
> > 
> > I understand your POV, but I think that would be a mistake,
> > and would undermine the usefulness of the feature.
> > 
> > The deployment model has a cluster of hosts and guests, all
> > belonging to the same user. The user goal is to measure host
> > power consumption imposed by the guest, and dynamically adjust
> > guest workloads in order to minimize power consumption of the
> > host.
> 
> For cloud use-case, host side is likely in a better position
> to accomplish the task of saving power by migrating VM to
> another socket/host to compact idle load. (I've found at least 1
> Kubernetes tool[1], which does energy monitoring). Perhaps there
> are schedulers out there that do that using its data.


The host admin can merely shuffle workloads around, hoping that
a different packing of workloads onto machines will reduce power
consumption by some amount. You might win a few %, or low 10s of %,
with this if you're good at it.

The guest admin can change the way their workload operates to
reduce its inherent power consumption baseline. You could easily
come across ways to win high 10s of % with this. That's why it
is interesting to expose power consumption info to the guest
admin.

IOW, neither makes the other obsolete; both approaches are
desirable.

> > The guest workloads can impose non-negligible power consumption
> > loads on non-vCPU threads in QEMU. Without that accounted for,
> > any adjustments will be working from (sometimes very) inaccurate
> > data.
> 
> Perhaps adding one or several energy sensors (ex: some i2c ones),
> would let us provide auxiliary threads consumption to guest, and
> even make it more granular if necessary (incl. vhost user/out of
> process device models or pass-through devices if they have PMU).
> It would be better than further muddling vCPUs consumption
> estimates with something that doesn't belong there.

There's a tradeoff here, in that info directly associated with
backend threads effectively exposes private QEMU impl
details as public ABI. IOW, we don't want too fine a granularity
here; we need it abstracted sufficiently that different
backend choices for a given device don't change what sensors are
exposed.

I also wonder how existing power monitoring applications
would consume such custom sensors - is there sufficient
standardization in this area that we're not inventing
something totally QEMU specific ?

> > IOW, I think it is right to include non-vCPU threads usage in
> > the reported info, as it is still fundamentally part of the
> > load that the guest imposes on host pCPUs it is permitted to
> > run on.
> 
> 
> From what I've read, process energy usage done via RAPL is not
> exactly accurate. But there are monitoring tools out there that
> use RAPL and other sources to make energy consumption monitoring
> more reliable.
> 
> Reinventing that wheel and pulling all of the nuances of process
> power monitoring inside of QEMU process, needlessly complicates it.
> Maybe we should reuse one of existing tools and channel its data
> through appropriate QEMU channels (RAPL/emulated PMU counters/...).

Note, this feature is already released in QEMU 9.1.0.

> Implementing RAPL in pure form though looks fine to me,
> so the same tools could use it the same way as on the host
> if needed without VM specific quirks.

IMHO the so-called "pure" form is misleading to applications, unless
we first provide some other practical way to expose the data that
we would be throwing away from RAPL.
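
To make the difference concrete, here is a minimal sketch. It assumes
energy is attributed to a guest in proportion to the CPU time its QEMU
threads consumed during a sampling interval. The function name, the
parameters and the numbers are all made up for illustration; none of
this is taken from the patch series or from QEMU's code.

```python
def guest_energy_uj(pkg_delta_uj, host_ticks, thread_ticks, vcpu_tids,
                    include_aux=True):
    """Attribute a share of a package energy delta to one guest.

    pkg_delta_uj : energy used by the package over the interval (microjoules)
    host_ticks   : total CPU ticks consumed on that package over the interval
    thread_ticks : {tid: ticks} for every thread of the QEMU process
    vcpu_tids    : set of tids that are vCPU threads
    include_aux  : if False, count vCPU threads only (the "pure" form)
    """
    if host_ticks <= 0:
        return 0.0
    counted = sum(ticks for tid, ticks in thread_ticks.items()
                  if include_aux or tid in vcpu_tids)
    return pkg_delta_uj * counted / host_ticks


# Hypothetical interval: two vCPU threads plus one I/O thread.
ticks = {101: 200, 102: 200, 201: 100}
vcpus = {101, 102}

pure = guest_energy_uj(1_000_000, 1000, ticks, vcpus, include_aux=False)
full = guest_energy_uj(1_000_000, 1000, ticks, vcpus, include_aux=True)
dropped = full - pure  # load imposed by the guest but hidden from it
```

With these made-up numbers the vCPU-only figure leaves out a fifth of
what the guest actually imposed on the host, and that share grows with
I/O-heavy workloads.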

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
