On Thu, Nov 27, 2014 at 2:45 PM, Denis V. Lunev <d...@openvz.org> wrote:
> Excessive virtio_balloon inflation can cause invocation of OOM-killer,
> when Linux is under severe memory pressure. Various mechanisms are
> responsible for correct virtio_balloon memory management. Nevertheless it
> is often the case that these control tools does not have enough time to
> react on fast changing memory load. As a result OS runs out of memory and
> invokes OOM-killer. The balancing of memory by use of the virtio balloon
> should not cause the termination of processes while there are pages in the
> balloon. Now there is no way for virtio balloon driver to free memory at
> the last moment before some process get killed by OOM-killer.
>
> This does not provide a security breach as balloon itself is running
> inside Guest OS and is working in the cooperation with the host. Thus
> some improvements from Guest side should be considered as normal.
>
> To solve the problem, introduce a virtio_balloon callback which is
> expected to be called from the oom notifier call chain in out_of_memory()
> function. If virtio balloon could release some memory, it will make the
> system to return and retry the allocation that forced the out of memory
> killer to run.
>
> This behavior should be enabled if and only if appropriate feature bit
> is set on the device. It is off by default.
>
> This functionality was recently merged into vanilla Linux (actually in
> linux-next at the moment)
>
> commit 5a10b7dbf904bfe01bb9fcc6298f7df09eed77d5
> Author: Raushaniya Maksudova <rmaksud...@parallels.com>
> Date: Mon Nov 10 09:36:29 2014 +1030
>
> This patch adds respective control bits into QEMU. It introduces
> deflate-on-oom option for baloon device which do the trick.
>
> Signed-off-by: Denis V. Lunev <d...@openvz.org>
> CC: Raushaniya Maksudova <rmaksud...@parallels.com>
> CC: Anthony Liguori <aligu...@amazon.com>
> CC: Michael S. Tsirkin <m...@redhat.com>
> ---
>  hw/virtio/virtio-balloon.c         | 6 ++++--
>  include/hw/virtio/virtio-balloon.h | 2 ++
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 7bfbb75..4d043ce 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -305,8 +305,8 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
>
>  static uint32_t virtio_balloon_get_features(VirtIODevice *vdev, uint32_t f)
>  {
> -    f |= (1 << VIRTIO_BALLOON_F_STATS_VQ);
> -    return f;
> +    VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
> +    return (f | VIRTIO_BALLOON_F_STATS_VQ) | dev->host_features;
>  }
>
>  static void virtio_balloon_stat(void *opaque, BalloonInfo *info)
> @@ -409,6 +409,8 @@ static void virtio_balloon_device_unrealize(DeviceState
> *dev, Error **errp)
>  }
>
>  static Property virtio_balloon_properties[] = {
> +    DEFINE_PROP_BIT("deflate-on-oom", VirtIOBalloon, host_features,
> +                    VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>
> diff --git a/include/hw/virtio/virtio-balloon.h
> b/include/hw/virtio/virtio-balloon.h
> index f863bfe..2e1ccd9 100644
> --- a/include/hw/virtio/virtio-balloon.h
> +++ b/include/hw/virtio/virtio-balloon.h
> @@ -30,6 +30,7 @@
>  /* The feature bitmap for virtio balloon */
>  #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */
>  #define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory stats virtqueue */
> +#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */
>
>  /* Size of a PFN in the balloon interface. */
>  #define VIRTIO_BALLOON_PFN_SHIFT 12
> @@ -67,6 +68,7 @@ typedef struct VirtIOBalloon {
>      QEMUTimer *stats_timer;
>      int64_t stats_last_update;
>      int64_t stats_poll_interval;
> +    uint32_t host_features;
>  } VirtIOBalloon;
>
>  #endif
> --
> 1.9.1
>

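A side note on the virtio_balloon_get_features() hunk, if I am reading it right: the VIRTIO_BALLOON_F_* constants are bit positions, so OR-ing VIRTIO_BALLOON_F_STATS_VQ directly sets bit 0 (MUST_TELL_HOST) rather than bit 1. Below is a minimal standalone sketch of the difference. The helper names (feature_mask, balloon_features, balloon_features_as_posted) are hypothetical and not QEMU API, and I am assuming host_features already holds a mask, as a bit property would store it.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions, copied from the quoted virtio-balloon.h hunk. */
#define VIRTIO_BALLOON_F_MUST_TELL_HOST 0
#define VIRTIO_BALLOON_F_STATS_VQ       1
#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2

/* Hypothetical helper: turn a bit position into a 32-bit mask. */
static inline uint32_t feature_mask(unsigned bit)
{
    assert(bit < 32);
    return UINT32_C(1) << bit;
}

/*
 * Hypothetical stand-in for get_features: always offer STATS_VQ,
 * plus whatever the device properties enabled. host_features is
 * assumed to be a mask already, not a bit position.
 */
static uint32_t balloon_features(uint32_t f, uint32_t host_features)
{
    return f | feature_mask(VIRTIO_BALLOON_F_STATS_VQ) | host_features;
}

/*
 * What the posted hunk computes: it ORs in the bit position (1),
 * which sets bit 0 instead of bit 1.
 */
static uint32_t balloon_features_as_posted(uint32_t f, uint32_t host_features)
{
    return (f | VIRTIO_BALLOON_F_STATS_VQ) | host_features;
}
```

With f = 0 and no properties set, the first variant yields a mask with only bit 1 set, while the posted expression yields a mask with only bit 0 set.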
Have you tried this with a system-wide OOM on a real workload? This behavior can work perfectly with dedicated memory cgroups, but I'm afraid it would be unusable when the entire system stalls and waits for a balloon deflation.
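To make the concern concrete, here is a toy user-space model of the retry loop described in the commit message: the notifier hands back a batch of balloon pages, and the allocation is retried as long as something was freed. Every name in it (toy_guest, toy_balloon_oom_notify, toy_alloc, the batch parameter) is hypothetical; this is a sketch of the idea under those assumptions, not the kernel or QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy guest state; all fields are illustrative only. */
struct toy_guest {
    size_t free_pages;     /* pages the allocator can hand out */
    size_t balloon_pages;  /* pages currently held by the balloon */
    bool deflate_on_oom;   /* models the negotiated feature bit */
};

/*
 * Models the OOM notifier callback: return up to `batch` pages from
 * the balloon to the guest and report how many were freed.
 */
static size_t toy_balloon_oom_notify(struct toy_guest *g, size_t batch)
{
    if (!g->deflate_on_oom)
        return 0;
    size_t n = g->balloon_pages < batch ? g->balloon_pages : batch;
    g->balloon_pages -= n;
    g->free_pages += n;
    return n;
}

/*
 * Models the allocation slow path: keep retrying while the notifier
 * frees something. Returns false at the point where the OOM killer
 * would otherwise be invoked.
 */
static bool toy_alloc(struct toy_guest *g, size_t want, size_t batch)
{
    assert(g != NULL);
    while (g->free_pages < want) {
        if (toy_balloon_oom_notify(g, batch) == 0)
            return false; /* nothing left to deflate */
    }
    g->free_pages -= want;
    return true;
}
```

In this model each retry only proceeds once the notifier has returned. In a real guest under a system-wide OOM the deflation path itself has to make progress while everything else is stalled, and that is the case I would like to see tested.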