On Tue, Feb 4, 2020 at 3:58 PM Tyler Sanderson <[email protected]> wrote:

>
>
> On Tue, Feb 4, 2020 at 11:17 AM David Hildenbrand <[email protected]>
> wrote:
>
>> On 04.02.20 19:52, Tyler Sanderson wrote:
>> >
>> >
>> > On Tue, Feb 4, 2020 at 12:29 AM David Hildenbrand <[email protected]>
>> > wrote:
>> >
>> >     On 03.02.20 21:32, Tyler Sanderson wrote:
>> >     > There were apparently good reasons for moving away from OOM
>> >     > notifier callback:
>> >     > https://lkml.org/lkml/2018/7/12/314
>> >     > https://lkml.org/lkml/2018/8/2/322
>> >     >
>> >     > In particular the OOM notifier is worse than the shrinker because:
>> >
>> >     The issue is that DEFLATE_ON_OOM is under-specified.
>> >
>> >     >
>> >     >  1. It is last-resort, which means the system has already gone
>> >     >     through heroics to prevent OOM. Those heroic reclaim efforts
>> >     >     are expensive and impact application performance.
>> >
>> >     That's *exactly* what "deflate on OOM" suggests.
>> >
>> >
>> > It seems there are some use cases where "deflate on OOM" is desired and
>> > others where "deflate on pressure" is desired.
>> > This suggests adding a new feature bit "DEFLATE_ON_PRESSURE" that
>> > registers the shrinker, and reverting DEFLATE_ON_OOM to use the OOM
>> > notifier callback.
>> >
>> > This lets users configure the balloon for their use case.
>>
>> You want the old behavior back, so why should we introduce a new one? Or
>> am I missing something? (you did want us to revert to old handling, no?)
>>
> Reverting actually doesn't help me because this has been the behavior
> since Linux 4.19 which is already widely in use. So my device
> implementation needs to handle the shrinker behavior anyways. I started
> this conversation to ask what the intended device implementation was.
>
I should clarify: reverting _would_ improve guest performance under my
implementation, so I guess I'm in favor. But I think we should also consider
reasonable alternative implementations, which suggests adding a new feature
bit so that device implementations can choose between the two behaviors.
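
To make the proposal concrete, here is a rough user-space sketch of how a
guest driver might pick its deflate path from the negotiated feature bits.
Only VIRTIO_BALLOON_F_DEFLATE_ON_OOM (bit 2) is a real feature bit.
VIRTIO_BALLOON_F_DEFLATE_ON_PRESSURE, its bit number, the enum and both
helper functions are hypothetical names made up for illustration, and the
rule that the shrinker wins when both bits are set is an assumption, not
something anyone has agreed on in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Real feature bit, as defined in <linux/virtio_balloon.h>. */
#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM       2
/* HYPOTHETICAL: this bit does not exist; the number is a placeholder. */
#define VIRTIO_BALLOON_F_DEFLATE_ON_PRESSURE  31

/* Hypothetical summary of how the guest gives balloon pages back. */
enum deflate_policy {
    DEFLATE_NEVER,        /* guest never deflates on its own */
    DEFLATE_VIA_OOM,      /* last resort: OOM notifier callback */
    DEFLATE_VIA_SHRINKER, /* early: shrinker, runs under memory pressure */
};

/* Return 1 if the given feature bit is set in the negotiated features. */
static int has_feature(uint64_t features, unsigned int bit)
{
    assert(bit < 64);
    return (int)((features >> bit) & 1);
}

/*
 * Choose the deflate path from the negotiated features.
 * Assumption: if both bits are negotiated, the shrinker path is used,
 * because it would fire before the system ever reaches OOM.
 */
static enum deflate_policy pick_deflate_policy(uint64_t features)
{
    if (has_feature(features, VIRTIO_BALLOON_F_DEFLATE_ON_PRESSURE))
        return DEFLATE_VIA_SHRINKER;
    if (has_feature(features, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
        return DEFLATE_VIA_OOM;
    return DEFLATE_NEVER;
}
```

With something like this, a device that only offers DEFLATE_ON_OOM would get
the OOM notifier behavior, and a device that wants the shrinker behavior
would have to offer the new bit explicitly.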


> I think there are reasonable device implementations that would prefer the
> shrinker behavior (it turns out that mine doesn't).
> For example, an implementation that slowly inflates the balloon for the
> purpose of memory overcommit. It might leave the balloon inflated and
> expect any memory pressure (including page cache usage) to deflate the
> balloon as a way to dynamically right-size the balloon.
>
> Two reasons I didn't go with the above implementation:
> 1. I need to support guests before Linux 4.19 which don't have the
> shrinker behavior.
> 2. Memory in the balloon does not appear as "available" in /proc/meminfo
> even though it is freeable. This is confusing to users, but isn't a deal
> breaker.
>
> If we added a DEFLATE_ON_PRESSURE feature bit that indicated shrinker API
> support then that would resolve reason #1 (ideally we would backport the
> bit to 4.19).
>
> In any case, the shrinker behavior when pressuring page cache is more of
> an inefficiency than a bug. It's not clear to me that it necessitates
> reverting. If there were/are reasons to be on the shrinker interface then I
> think those carry similar weight as the problem itself.
>
>
>>
>> I consider virtio-balloon to this very day a big hack. And I don't see
>> it getting better with new config knobs. Having that said, the
>> technologies that are candidates to replace it (free page reporting,
>> taming the guest page cache, etc.) are still not ready - so we'll have
>> to stick with it for now :( .
>>
>> >
>> > I'm actually not sure how you would safely do memory overcommit without
>> > DEFLATE_ON_OOM. So I think it unlocks a huge use case.
>>
>> Using better suited technologies that are not ready yet (well, some form
>> of free page reporting is available under IBM z already but in a
>> proprietary form) ;) Anyhow, I remember that DEFLATE_ON_OOM only makes
>> it less likely to crash your guest, but not that you are safe to squeeze
>> the last bit out of your guest VM.
>>
> Can you elaborate on the danger of DEFLATE_ON_OOM? I haven't seen any
> problems in testing but I'd really like to know about the dangers.
> Is there a difference in safety between the OOM notifier callback and the
> shrinker API?
>
>
>>
>> --
>> Thanks,
>>
>> David / dhildenb
>>
>>
_______________________________________________
Virtualization mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
