On 04/02/2016 08:48, Gonglei (Arei) wrote:
> 11.44%  qemu-kvm                 [.] memory_region_find
>   6.31%  qemu-kvm                 [.] qemu_get_ram_ptr
>   4.61%  libpthread-2.19.so       [.] __pthread_mutex_unlock_usercnt
>   3.54%  qemu-kvm                 [.] qemu_ram_addr_from_host
>   2.80%  libpthread-2.19.so       [.] pthread_mutex_lock
>   2.55%  qemu-kvm                 [.] object_unref
>   2.49%  libc-2.19.so             [.] malloc
>   2.47%  libc-2.19.so             [.] _int_malloc
>   2.34%  libc-2.19.so             [.] _int_free
>   2.18%  qemu-kvm                 [.] object_ref
>   2.18%  qemu-kvm                 [.] address_space_translate
>   2.03%  libc-2.19.so             [.] __memcpy_sse2_unaligned
>   1.76%  libc-2.19.so             [.] malloc_consolidate
>   1.56%  qemu-kvm                 [.] addrrange_intersection
>   1.52%  qemu-kvm                 [.] vring_pop
>   1.36%  qemu-kvm                 [.] find_next_zero_bit
>   1.30%  [kernel]                 [k] native_write_msr_safe
>   1.29%  qemu-kvm                 [.] addrrange_intersects
>   1.21%  qemu-kvm                 [.] vring_map
>   0.93%  qemu-kvm                 [.] virtio_notify
> 
> Do you have any thoughts on how to decrease the CPU overhead and get higher
> throughput? Thanks!

Using chunks larger than 256 bytes will reduce the overhead in
memory_region_find and qemu_get_ram_ptr.  You could expect a further
10-12% improvement.
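
To illustrate why the chunk size matters, here is a rough sketch of the
arithmetic, assuming each request pays a fixed cost (address translation,
ring handling) plus a cost proportional to its payload.  The function names
and the cost constants below are made up for illustration; they are not
measurements and not QEMU APIs.

```python
def requests_needed(total_bytes: int, chunk_bytes: int) -> int:
    """Number of requests needed to move total_bytes in chunk_bytes pieces."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    return -(-total_bytes // chunk_bytes)  # ceiling division


def fixed_cost_share(chunk_bytes: int, per_request_cost: float,
                     per_byte_cost: float) -> float:
    """Fraction of one request's cost that does not scale with its payload."""
    total = per_request_cost + per_byte_cost * chunk_bytes
    return per_request_cost / total


if __name__ == "__main__":
    total = 1 << 20  # 1 MiB of payload, an arbitrary amount for illustration
    for chunk in (256, 1024, 4096):
        n = requests_needed(total, chunk)
        # per_request_cost and per_byte_cost are hypothetical units, not
        # values taken from the profile above.
        share = fixed_cost_share(chunk, per_request_cost=1.0, per_byte_cost=0.01)
        print(f"{chunk:5d} B chunks: {n:5d} requests, fixed-cost share {share:.0%}")
```

The point is only that the per-request work is paid fewer times for the same
amount of data as the chunks grow; the actual gain has to be measured.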

Paolo
