On Sat, Jul 27, 2024 at 01:18:32PM +0800, Guoyi Tu wrote:
> On 2024/7/25 19:57, Daniel P. Berrangé wrote:
> > On Thu, Jul 25, 2024 at 01:35:21PM +0200, Markus Armbruster wrote:
> > > Guoyi Tu <t...@chinatelecom.cn> writes:
> > >
> > > > In the test environment, we conducted IO stress tests on all storage
> > > > disks within a virtual machine that had five storage devices mounted.
> > > > During testing, we found that the qemu process allocated a large
> > > > amount of memory (~800MB) to handle these IO operations.
> > > >
> > > > When the test ended, although qemu called free() to release the
> > > > allocated memory, the memory was not actually returned to the
> > > > operating system, as observed via the top command.
> > > >
> > > > Upon researching the glibc memory management mechanism, we found
> > > > that when small chunks of memory are allocated in user space and
> > > > then released with free(), the glibc memory management mechanism
> > > > does not necessarily return this memory to the operating system.
> > > > Instead, it retains the memory until certain conditions are met
> > > > for release.
> > >
> > > Yes.
> >
> > Looking at the mallopt(3) man page, the M_TRIM_THRESHOLD is said to
> > control when glibc releases the top of the heap back to the OS. It is
> > said to default to 128 kb.
>
> Yes, the M_TRIM_THRESHOLD option can control glibc to release the free
> memory at the top of the heap, but glibc will not release the free
> memory in the middle of the heap.
>
> > I'm curious how we get from that default to 800 MB of unused memory.
> > Is it related to the number of distinct malloc arenas that are in use ?
>
> At least 600MB of memory is free, and this memory might be in the middle
> of the heap and cannot be automatically released.
>
> > I'm curious what malloc_stats() would report before & after malloc_trim
> > when QEMU is in this situation with lots of wasted memory.
>
> Here is the test case:
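The test case itself is snipped from the quote below. Purely as an
illustration of the behaviour being discussed, and not the original
reproducer, a standalone C sketch that frees many small heap chunks,
keeps one allocation pinned near the top of the heap, and prints
malloc_stats() before and after an explicit malloc_trim(0) might look
like the following; the chunk count and size are arbitrary:

  /*
   * Illustrative sketch only, not the snipped test case: allocate many
   * small chunks, free all but the last one so the freed memory sits in
   * the middle of the heap (the small top chunk means free() never hits
   * M_TRIM_THRESHOLD), then print malloc_stats() before and after an
   * explicit malloc_trim(0).  RSS can additionally be watched via top
   * or /proc/self/status while this runs.
   */
  #include <malloc.h>
  #include <stdio.h>
  #include <stdlib.h>

  #define NCHUNKS (1024 * 1024)
  #define CHUNKSZ 256              /* small, well below M_MMAP_THRESHOLD */

  static void *chunks[NCHUNKS];

  int main(void)
  {
      for (size_t i = 0; i < NCHUNKS; i++) {
          chunks[i] = malloc(CHUNKSZ);
      }

      /* Keep the final allocation alive; the memory freed below it can
       * not be given back by trimming the top of the heap. */
      for (size_t i = 0; i < NCHUNKS - 1; i++) {
          free(chunks[i]);
      }

      printf("=== after free(), before malloc_trim(0) ===\n");
      fflush(stdout);
      malloc_stats();              /* glibc writes the stats to stderr */

      malloc_trim(0);              /* ask glibc to return freed pages */

      printf("=== after malloc_trim(0) ===\n");
      fflush(stdout);
      malloc_stats();

      free(chunks[NCHUNKS - 1]);
      return 0;
  }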
snip

That looks like an artificial reproducer, rather than the real world
QEMU scenario.

What's the actual I/O stress test scenario you use to reproduce the
problem in QEMU, and how have you configured QEMU (ie what CLI args) ?

I'm fairly inclined to suggest that having such a huge amount of freed
memory is a glibc bug, but to escalate this to glibc requires us to
provide them better real world examples of the problems.

> > The above usage is automatic, while this proposal requires that
> > an external mgmt app monitor QEMU and tell it to free memory.
> > I'm wondering if the latter is really desirable, or whether QEMU
> > can call this itself when reasonable ?
>
> Yes, I have also considered implementing an automatic memory release
> function within qemu. This approach would require qemu to periodically
> monitor the IO load of all backend storage, and if the IO load is very
> low over a period of time, it would proactively release memory.

I would note that systemd has logic which monitors either
/proc/pressure/memory or $CGROUP/memory.pressure and, in response to
events on that, calls malloc_trim:

  https://github.com/systemd/systemd/blob/main/docs/MEMORY_PRESSURE.md
  https://docs.kernel.org/accounting/psi.html

Something like that might be better, as it lets us hide the specific
design & impl choices inside QEMU, letting us change/evolve them at
will without impacting public API designs.

With regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
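To make the PSI-based idea above more concrete, a minimal standalone
sketch (not QEMU code) of the pattern the systemd and kernel docs
describe could look like the following: arm a trigger on
/proc/pressure/memory, then call malloc_trim(0) whenever the kernel
reports that pressure crossed the threshold. The trigger values are
arbitrary examples, and a cgroup's memory.pressure file could be used
in place of the system-wide one.

  /*
   * Sketch only: watch memory pressure via a PSI trigger and hand freed
   * pages back to the OS when sustained pressure is reported.
   * Trigger "some 150000 1000000" = 150ms of stall within a 1s window.
   */
  #include <fcntl.h>
  #include <malloc.h>
  #include <poll.h>
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>

  int main(void)
  {
      const char *trig = "some 150000 1000000";
      int fd = open("/proc/pressure/memory", O_RDWR | O_NONBLOCK);

      if (fd < 0) {
          perror("open /proc/pressure/memory");
          return 1;
      }
      /* The PSI trigger ABI expects the trailing NUL to be written too. */
      if (write(fd, trig, strlen(trig) + 1) < 0) {
          perror("arm PSI trigger");
          return 1;
      }

      for (;;) {
          struct pollfd pfd = { .fd = fd, .events = POLLPRI };

          if (poll(&pfd, 1, -1) < 0) {
              perror("poll");
              return 1;
          }
          if (pfd.revents & POLLERR) {
              fprintf(stderr, "PSI trigger no longer available\n");
              return 1;
          }
          if (pfd.revents & POLLPRI) {
              /* Pressure threshold crossed: release freed heap pages. */
              malloc_trim(0);
          }
      }
  }

Inside QEMU the equivalent would presumably hang the PSI fd off the main
loop's fd handlers rather than a blocking poll(), which is exactly the
kind of internal design choice the mail argues for keeping hidden from
any public API.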