QEMU currently accesses the dirty bitmaps very liberally, which is understandable since the accesses are cheap. However, this is not good for squeezing maximum performance out of dataplane, and it is also not good if the accesses become more expensive, as is the case when they use atomic primitives.
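As a rough illustration of why atomic accesses cost more than plain loads and stores, here is a minimal sketch of setting a dirty bit and of test-and-clear over a range. It is only an assumption of the general shape: the `sketch_` names are hypothetical, and it uses C11 `<stdatomic.h>` rather than QEMU's own atomic helpers, so the functions added by this series may well look different.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper (not QEMU's API): atomically mark bit nr as dirty. */
void sketch_set_bit_atomic(atomic_ulong *map, size_t nr)
{
    assert(map != NULL);
    /* A read-modify-write, so concurrent setters never lose each other's bits. */
    atomic_fetch_or(&map[nr / BITS_PER_WORD], 1UL << (nr % BITS_PER_WORD));
}

/*
 * Hypothetical helper (not QEMU's API): atomically clear the bits in
 * [start, start + count) and return true if any of them was set.
 * Each bit is cleared with its own atomic operation, so a bit set
 * concurrently is either reported now or left for the next call.
 */
bool sketch_test_and_clear_atomic(atomic_ulong *map, size_t start, size_t count)
{
    bool dirty = false;

    assert(map != NULL);
    for (size_t nr = start; nr < start + count; nr++) {
        unsigned long mask = 1UL << (nr % BITS_PER_WORD);
        unsigned long old = atomic_fetch_and(&map[nr / BITS_PER_WORD], ~mask);

        dirty |= (old & mask) != 0;
    }
    return dirty;
}
```

Every call here is a locked read-modify-write rather than a plain store, which is why it pays to skip bitmaps that nobody is going to read. A real implementation would presumably handle whole words at a time instead of looping bit by bit.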

Patches 1-2 make acpi-build.c only use public memory APIs.

Patches 3-7 optimize access to the VGA dirty bitmap, by restricting it to video RAM only.

Patches 8-15 optimize access to the code and migration bitmaps, by tracking them only if TCG is enabled and only if migration is in progress, respectively. Note that the first iteration of migration already does not look at the migration bitmap (commit 70c8652, migration: do not search dirty pages in bulk stage, 2013-03-26).

Patches 16-21 are Stefan's patches to convert bitmap access to use atomic primitives.

While the main purpose of these patches is a working dirty bitmap for dataplane (and possibly multithreaded TCG), there's something that they are immediately useful for: patch 22 makes the migration thread synchronize the bitmap outside the big QEMU lock, thus removing the last source of jitter during the RAM copy phase of migration.

Please review and test! (The series is available as branch "atomic-dirty" on my github repository.) In particular, I suspect that the postcopy patches might be good at finding bugs.

Paolo

Paolo Bonzini (16):
  memory: add memory_region_ram_resize
  acpi-build: remove dependency from ram_addr.h
  memory: the only dirty memory flag for users is DIRTY_MEMORY_VGA
  display: enable DIRTY_MEMORY_VGA tracking explicitly
  memory: return bitmap from memory_region_is_logging
  framebuffer: check memory_region_is_logging
  ui/console: check memory_region_is_logging
  memory: track DIRTY_MEMORY_CODE in mr->dirty_log_mask
  memory: return DIRTY_MEMORY_MIGRATION from memory_region_is_logging
  ram_addr: tweaks to xen_modified_memory
  exec: simplify notdirty_mem_write
  exec: use memory_region_is_logging to optimize dirty tracking
  exec: pass client mask to cpu_physical_memory_set_dirty_range
  exec: only check relevant bitmaps for cleanliness
  memory: do not touch code dirty bitmap unless TCG is enabled
  migration: run bitmap sync outside iothread lock

Stefan Hajnoczi (6):
  bitmap: add atomic set functions
  bitmap: add atomic test and clear
  memory: use atomic ops for setting dirty memory bits
  migration: move dirty bitmap sync to ram_addr.h
  memory: replace cpu_physical_memory_reset_dirty() with test-and-clear
  memory: make cpu_physical_memory_sync_dirty_bitmap() fully atomic

 arch_init.c                  |  56 +++----------------
 cputlb.c                     |   4 +-
 exec.c                       | 103 ++++++++++++++++------------------
 hw/core/loader.c             |   8 +--
 hw/display/cg3.c             |   1 +
 hw/display/exynos4210_fimd.c |   7 ++-
 hw/display/framebuffer.c     |  23 ++++++--
 hw/display/g364fb.c          |   2 +-
 hw/display/sm501.c           |   1 +
 hw/display/tcx.c             |   1 +
 hw/display/vmware_vga.c      |   2 +-
 hw/i386/acpi-build.c         |  36 ++++++------
 hw/virtio/vhost.c            |   3 +-
 include/exec/memory.h        |  27 +++++++--
 include/exec/ram_addr.h      | 128 ++++++++++++++++++++++++++++---------------
 include/hw/loader.h          |   8 ++-
 include/qemu/bitmap.h        |   4 ++
 include/qemu/bitops.h        |  14 +++++
 kvm-all.c                    |   3 +-
 memory.c                     |  34 ++++++++----
 ui/console.c                 |  14 +++--
 util/bitmap.c                |  78 ++++++++++++++++++++++++++
 xen-hvm.c                    |   3 +-
 23 files changed, 356 insertions(+), 204 deletions(-)

-- 
2.3.3