Re: [Qemu-devel] [RFC PATCH 2/3] cpus-common: Cache allocated work items

Paolo Bonzini Tue, 29 Aug 2017 13:39:15 -0700

On 28 Aug 2017 11:43 PM, "Pranith Kumar" <bobby.pr...@gmail.com> wrote:


On Mon, Aug 28, 2017 at 1:47 PM, Richard Henderson
<richard.hender...@linaro.org> wrote:
> On 08/27/2017 08:53 PM, Pranith Kumar wrote:
>> Using heaptrack, I found that quite a few of our temporary allocations
>> are coming from allocating work items. Instead of doing this
>> continuously, we can cache the allocated items and reuse them instead
>> of freeing them.
>>
>> This reduces the number of allocations by 25% (200000 -> 150000 for
>> ARM64 boot+shutdown test).
>>
>> Signed-off-by: Pranith Kumar <bobby.pr...@gmail.com>
>
> Why does this list need to record a "last" element?
> It would seem a simple lifo would be sufficient.
>
> (You would also be able to manage the list via cmpxchg without a
> separate lock, but perhaps the difference between the two isn't
> measurable.)
>

Yes, seems like a better design choice. Will fix in next iteration.
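
For readers following the thread, the LIFO-with-cmpxchg idea can be sketched roughly as below. This is only an illustration, not the patch under review: struct work_item, free_list, work_item_put() and work_item_get() are hypothetical names, and C11 <stdatomic.h> stands in for QEMU's own atomic helpers. Items are never returned to the system allocator here, so a stale read of old->next cannot touch freed memory, but a pop that races with other pops can still hit the ABA problem. The sketch is therefore only safe as written with a single consumer, or with an added tag/generation counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a work item; not QEMU's real structure. */
struct work_item {
    struct work_item *next;
    void (*func)(void *data);
    void *data;
};

/* Head of the LIFO cache of unused items (no "last" pointer needed). */
static _Atomic(struct work_item *) free_list = NULL;

/* Push an unused item back onto the cache instead of freeing it. */
static void work_item_put(struct work_item *wi)
{
    struct work_item *old = atomic_load(&free_list);

    assert(wi != NULL);
    do {
        wi->next = old;
        /* On failure, old is reloaded with the current head; retry. */
    } while (!atomic_compare_exchange_weak(&free_list, &old, wi));
}

/* Pop a cached item, falling back to calloc() when the cache is empty. */
static struct work_item *work_item_get(void)
{
    struct work_item *old = atomic_load(&free_list);

    while (old != NULL &&
           !atomic_compare_exchange_weak(&free_list, &old, old->next)) {
        /* old now holds the current head; loop and try again. */
    }
    return old ? old : calloc(1, sizeof(*old));
}
```

The most recently released item is the first one handed out again, which also tends to keep it warm in the cache.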


More recent glibc will also have an efficient per-thread allocator, and
though I haven't yet benchmarked the newer glibc malloc, GSlice is slower
than at least both tcmalloc and jemalloc. Perhaps you could instead make
work items statically allocated?
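
One way to read the "statically allocated" suggestion is to embed the work item in the object that owns it, so queueing never calls the allocator. The sketch below is an assumption about what that could look like, not code from QEMU: struct vcpu, flush_work, queue_static_work() and run_static_work() are all hypothetical names, and the pending flag is a plain bool, so any real cross-thread use would need the appropriate locking or atomics.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical work item that lives inside its owner. */
struct work_item {
    void (*func)(void *data);
    void *data;
    bool pending;   /* true while queued and not yet run */
};

/* Hypothetical per-CPU state with one preallocated work item. */
struct vcpu {
    struct work_item flush_work;
    int flushes;
};

/* Example callback: count how often the work ran. */
static void count_flush(void *data)
{
    struct vcpu *cpu = data;
    cpu->flushes++;
}

/* Queue the embedded item; returns false if it is already pending. */
static bool queue_static_work(struct work_item *wi,
                              void (*func)(void *data), void *data)
{
    if (wi->pending) {
        return false;
    }
    wi->func = func;
    wi->data = data;
    wi->pending = true;
    return true;
}

/* Run the item if it is pending and make it reusable again. */
static void run_static_work(struct work_item *wi)
{
    if (wi->pending) {
        wi->pending = false;
        wi->func(wi->data);
    }
}
```

The trade-off is that each embedded item can be queued only once at a time, which suits work that is idempotent or coalescable.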

Thanks,

Paolo


Thanks,
--
Pranith
