* Daniel P. Berrange <[email protected]> [2010-08-24 11:02:44]:
> On Tue, Aug 24, 2010 at 01:05:26PM +0530, Balbir Singh wrote:
> > * Nikunj A. Dadhania <[email protected]> [2010-08-24 11:53:27]:
> >
> > >
> > > Subject: [RFC] Memory controller exploitation in libvirt
> > >
> > > Memory CGroup is a kernel feature that can be exploited effectively in
> > > the current libvirt/qemu driver. Here is a shot at that.
> > >
> > > At present, QEMU uses the memory ballooning feature, where memory can be
> > > inflated/deflated as and when needed, co-operatively between the host and
> > > the guest. There should be some mechanism by which the host can have more
> > > control over the guest's memory usage. Memory CGroup provides features
> > > such as hard-limit and soft-limit for memory, and hard-limit for swap area.
> > >
> > > Design 1: Provide new API and XML changes for resource management
> > > =================================================================
> > >
> > > Not all of the memory controller tunables are supported by the current
> > > abstractions provided by the libvirt API. libvirt works on various OSes.
> > > This new API will support GNU/Linux initially, and as and when other
> > > platforms start supporting memory tunables, the interface could be enabled
> > > for them. The proposal adds the following two function pointers to the
> > > virDriver interface:
> > >
> > > 1) domainSetMemoryParameters: which would take one or more name-value
> > > pairs. This makes the API extensible, and agnostic to the kind of
> > > parameters supported by various Hypervisors.
> > > 2) domainGetMemoryParameters: For getting current memory parameters
> > >
> > > Corresponding libvirt public API:
> > > int virDomainSetMemoryParameters (virDomainPtr domain,
> > >                                   virMemoryParameterPtr params,
> > >                                   unsigned int nparams);
> > > int virDomainGetMemoryParameters (virDomainPtr domain,
> > >                                   virMemoryParameterPtr params,
> > >                                   unsigned int nparams);
> > >
> > >
> >
> > Does nparams imply setting several parameters together? Does bulk
> > loading help? I would prefer splitting out the API, if possible, into:
> >
> > virCgroupSetMemory() - already present in src/util/cgroup.c
> > virCgroupGetMemory() - already present in src/util/cgroup.c
> > virCgroupSetMemorySoftLimit()
> > virCgroupSetMemoryHardLimit()
> > virCgroupSetMemorySwapHardLimit()
> > virCgroupGetStats()
>
> Nope, we don't want cgroups exposed in the public API, since this
> has to be applicable to the VMware and OpenVZ drivers too.
>
I am not talking about exposing these as public API; they would
be a part of src/util/cgroup.c and be used by the qemu driver.
It is good to abstract out the OS-independent parts, but my concern
is double exposure through APIs such as driver->setMemory(), which is
currently used, and the newer API.
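
To illustrate the layering under discussion, here is a minimal C sketch in
which a single name-value entry point in the driver dispatches to helpers
that stay internal. Every name in it (memParam, cgroupSetHardLimit,
driverSetMemoryParameters, and so on) is hypothetical and is not libvirt's
actual code; the helpers only record the value instead of touching a cgroup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name-value pair, standing in for the proposed
 * virMemoryParameter type; not libvirt's real definition. */
typedef struct {
    const char *name;
    unsigned long long value; /* bytes */
} memParam;

/* Hypothetical internal helpers of the kind that could live in
 * src/util/cgroup.c. In this sketch they only record the value. */
static unsigned long long hardLimit, softLimit, swapHardLimit;

static int cgroupSetHardLimit(unsigned long long v)     { hardLimit = v; return 0; }
static int cgroupSetSoftLimit(unsigned long long v)     { softLimit = v; return 0; }
static int cgroupSetSwapHardLimit(unsigned long long v) { swapHardLimit = v; return 0; }

/* Driver-level entry point: maps hypervisor-agnostic parameter names
 * onto the internal helpers. Returns 0 on success, or -1 as soon as a
 * name is not supported by this driver or a helper fails. */
static int driverSetMemoryParameters(const memParam *params, unsigned int nparams)
{
    assert(params != NULL || nparams == 0);

    for (unsigned int i = 0; i < nparams; i++) {
        const memParam *p = &params[i];
        int rc;

        if (strcmp(p->name, "MemoryHardLimits") == 0)
            rc = cgroupSetHardLimit(p->value);
        else if (strcmp(p->name, "MemorySoftLimits") == 0)
            rc = cgroupSetSoftLimit(p->value);
        else if (strcmp(p->name, "SwapHardLimits") == 0)
            rc = cgroupSetSwapHardLimit(p->value);
        else
            rc = -1; /* tunable not supported by this driver */

        if (rc < 0)
            return -1;
    }
    return 0;
}
```

With this shape, only driverSetMemoryParameters() would be reachable from
the public API, and a driver for another hypervisor could accept the same
names without any cgroup code behind them.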

> > > Parameter list supported:
> > >
> > > MemoryHardLimits (memory.limit_in_bytes) - Maximum memory
> > > MemorySoftLimits (memory.soft_limit_in_bytes) - Desired memory
> >
> > Soft limits allow you to set a memory limit that applies under contention.
> >
> > > MemoryMinimumGuarantee - Minimum memory required (without this amount
> > > of memory, VM should not be started)
> > >
> > > SwapHardLimits (memory.memsw.limit_in_bytes) - Maximum swap
> > > SwapSoftLimits (Currently not supported by kernel) - Desired swap space
> > >
> >
> > We *don't* support SwapSoftLimits in the memory cgroup controller, and
> > there are no plans to support it in the future at this point. The
> > semantics are just too hard to get right at the moment.
>
> That's not a huge problem. Since we have many hypervisors to support
> in libvirt, I expect the set of tunables will expand over time, and
> not every hypervisor driver in libvirt will support every tunable.
> They'll just pick the tunables that apply to them. We can leave
> SwapSoftLimits out of the public API until we find a HV that needs it.
>
> >
> > > Tunables memory.limit_in_bytes, memory.soft_limit_in_bytes and
> > > memory.memsw.limit_in_bytes are provided by the memory controller in
> > > the Linux kernel.
> > >
> > > I am not an expert here, so just listing what new elements need to be
> > > added to the XML schema:
> > >
> > > <define name="resource">
> > >   <element name="memory">
> > >     <element name="memoryHardLimit"/>
> > >     <element name="memorySoftLimit"/>
> > >     <element name="memoryMinGuarantee"/>
> > >     <element name="swapHardLimit"/>
> > >     <element name="swapSoftLimit"/>
> > >   </element>
> > > </define>
> > >
> >
> > I'd prefer a syntax that integrates well with what we currently have
> >
> > <cgroup>
> >   <path>...</path>
> >   <controller>
> >     <name>..</name>
> >     <soft_limit>...</soft_limit>
> >     <hard_limit>...</hard_limit>
> >   </controller>
> >   ...
> > </cgroup>
>
> That is exposing far too much info about the cgroups implementation
> details. The XML representation needs to be decoupled from the
> implementation.
>
Don't we already expose a lot of information about qemu, for example
about vhost-net, the command line, or virtio, in the qemu configuration
of a guest? I am not opposed to having a higher-level abstraction, but
I am concerned that some of the nitty-gritty details like swappiness (yes,
that is a tunable) or the interpretation of stats might vary widely
across operating systems. Hence, I felt it is better to expose it as a
part of the qemu-cgroup-linux driver combo.
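
As a rough illustration of how Linux-specific these details are, the
following C sketch builds the control-file path for a memory tunable. The
file names are those of the Linux cgroup (v1) memory controller; the enum,
the function names, and the mount point and group used in the usage note
are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical abstract tunables; the names are illustrative only. */
typedef enum {
    TUN_HARD_LIMIT,
    TUN_SOFT_LIMIT,
    TUN_SWAP_HARD_LIMIT, /* the kernel file limits memory + swap together */
    TUN_SWAPPINESS
} memTunable;

/* Map an abstract tunable to its cgroup v1 memory-controller file. */
static const char *tunableFile(memTunable t)
{
    switch (t) {
    case TUN_HARD_LIMIT:      return "memory.limit_in_bytes";
    case TUN_SOFT_LIMIT:      return "memory.soft_limit_in_bytes";
    case TUN_SWAP_HARD_LIMIT: return "memory.memsw.limit_in_bytes";
    case TUN_SWAPPINESS:      return "memory.swappiness";
    }
    return NULL; /* unknown tunable */
}

/* Write "<mount>/<group>/<file>" into buf. Returns 0 on success, or -1
 * if the tunable is unknown or the path does not fit in len bytes. */
static int tunablePath(char *buf, size_t len, const char *mount,
                       const char *group, memTunable t)
{
    const char *file = tunableFile(t);
    int n;

    assert(buf != NULL && len > 0);
    if (file == NULL)
        return -1;

    n = snprintf(buf, len, "%s/%s/%s", mount, group, file);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For example, tunablePath(buf, sizeof(buf), "/cgroup/memory", "vm1",
TUN_SWAPPINESS) fills buf with "/cgroup/memory/vm1/memory.swappiness",
given a large enough buffer. Another operating system would need a
different mapping, or might have no equivalent for some entries.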
--
Three Cheers,
Balbir
--
libvir-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/libvir-list