On 24.06.2016 at 10:20, Paolo Bonzini wrote:
>
> On 24/06/2016 10:11, Peter Lieven wrote:
>> On 24.06.2016 at 06:10, Paolo Bonzini wrote:
>>>>> If it's 10M, nothing. If there is a 100M regression that is also caused
>>>>> by RCU, we have to give up on it for that data structure, or mmap/munmap
>>>>> the affected data structures.
>>>> If it were only 10MB I would agree. But if I run the VM described earlier
>>>> in this thread, it goes from ~35MB with QEMU 2.2.0 to ~130-150MB with
>>>> current master. This is with the coroutine pool disabled. With the
>>>> coroutine pool it can grow to something like 300-350MB.
>>>>
>>>> Is there an easy way to determine whether RCU is the problem? I have the
>>>> same symptoms, and valgrind doesn't see the allocated memory. Is it
>>>> possible to make rcu_call invoke the function directly - maybe with a
>>>> lock around it that serializes the calls? Even if it's expensive, it
>>>> might show whether we are looking in the right place.
>>> Yes, you can do that. Just make it call the function without locks; for
>>> a quick PoC it will be okay.
>> Unfortunately, it leads to immediate segfaults because a lot of things seem
>> to go horribly wrong ;-)
>>
>> Do you have any other idea than reverting all the RCU patches for this
>> section?
> Try freeing under the big QEMU lock:
>
>     if (!qemu_mutex_iothread_locked()) {
>         unlock = true;
>         qemu_mutex_lock_iothread();
>     }
>     ...
>     if (unlock) {
>         qemu_mutex_unlock_iothread();
>     }
>
> afbe70535ff1a8a7a32910cc15ebecc0ba92e7da should be easy to backport.
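If I read the suggestion correctly, the reclaim callback would then look
roughly like the sketch below (just to make sure I understood it; MyObject
and reclaim_myobject are made-up names for illustration, not actual QEMU
code):

    typedef struct MyObject {
        struct rcu_head rcu;
        /* ... payload ... */
    } MyObject;

    /* RCU callback: take the big QEMU lock around the actual free, unless
     * the caller already holds it. */
    static void reclaim_myobject(struct rcu_head *rcu)
    {
        MyObject *obj = container_of(rcu, MyObject, rcu);
        bool unlock = false;

        if (!qemu_mutex_iothread_locked()) {
            unlock = true;
            qemu_mutex_lock_iothread();
        }
        g_free(obj);
        if (unlock) {
            qemu_mutex_unlock_iothread();
        }
    }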
Will check this out. Meanwhile, I read a little about returning RSS to the
kernel, as I was wondering why RSS and HWM are at almost the same high level.

It seems that ptmalloc (the glibc default allocator) is very reluctant to
return memory to the kernel. There is indeed no guarantee that freed memory
is returned; only mmap'ed memory that is unmapped is guaranteed to be given
back.

So I tried the following without reverting anything:

    MALLOC_MMAP_THRESHOLD_=4096 ./x86_64-softmmu/qemu-system-x86_64 ...

No idea on the performance impact yet, but it solves the issue.

With the default threshold my test VM rises to about 154MB RSS usage:

    VmHWM:    154284 kB
    VmRSS:    154284 kB

With the option it looks like this:

    VmHWM:     50588 kB
    VmRSS:     41920 kB

With jemalloc I can observe that the HWM is still high, but RSS stays below
it - still in the order of about 100MB, though.

Peter
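P.S.: The environment variable should correspond to the M_MMAP_THRESHOLD
mallopt() tunable, so for a quick test from within the code something like
the following early in main() ought to be equivalent (a sketch, not tested
inside QEMU):

    #include <malloc.h>

    int main(int argc, char **argv)
    {
        /* Same effect as MALLOC_MMAP_THRESHOLD_=4096 in the environment:
         * allocations of 4096 bytes or more are served by mmap() and are
         * therefore returned to the kernel on free().  Setting the threshold
         * explicitly also disables glibc's dynamic threshold adjustment. */
        mallopt(M_MMAP_THRESHOLD, 4096);

        /* ... rest of the program ... */
        return 0;
    }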