This is v5 of my patch series with various optimizations to
zero buffer checking and some migration tweaks.

Thanks especially to Eric Blake, Orit Wassermann and Paolo Bonzini
for reviewing.

v5:
- move zero splat vector to a different patch
- fix indentation of can_use_buffer_find_nonzero_offset()
- do not unroll the first loop in buffer_find_nonzero_offset()
  to optimize it for zero page checking
- use an older unrolled version of find_next_bit() without
  SIMD instructions, as there is no evidence that the vectorized
  version is faster (it may even be slower) and the older code
  is easier to understand.
- added a note to the commit message of patch 8
  about the skipped pages field in QMP MigrationStats.
- fixed the order of key-value pairs of MigrationStats in
  qapi-schema.json
- updated info about the performance benefit of is_zero_page()
  to the latest benchmark results in the commit message.
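
For illustration, the idea of keeping the first loop of
buffer_find_nonzero_offset() non-unrolled might look roughly like the
sketch below. It is a simplified, portable version that uses plain 64-bit
words instead of vector types. sketch_find_nonzero_offset() and
SKETCH_UNROLL_FACTOR are made-up names, and the preconditions are
assumptions; this is not the code from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical stand-in for BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR */
#define SKETCH_UNROLL_FACTOR 8

/*
 * Returns the offset of the first non-zero word if it lies in the first
 * group of SKETCH_UNROLL_FACTOR words, otherwise the offset of the first
 * group that contains a non-zero word, or len if the buffer is all zero.
 */
static size_t sketch_find_nonzero_offset(const void *buf, size_t len)
{
    const uint64_t *p = buf;
    size_t words = len / sizeof(uint64_t);
    size_t i;

    /* assumed preconditions: word aligned, whole number of groups */
    assert((uintptr_t)buf % sizeof(uint64_t) == 0);
    assert(len % (SKETCH_UNROLL_FACTOR * sizeof(uint64_t)) == 0);

    if (!len) {
        return 0;
    }

    /* first group: test one word at a time so that a buffer with
     * non-zero data near its start is rejected as early as possible */
    for (i = 0; i < SKETCH_UNROLL_FACTOR; i++) {
        if (p[i]) {
            return i * sizeof(uint64_t);
        }
    }

    /* remaining groups: OR the words together, one branch per group */
    for (i = SKETCH_UNROLL_FACTOR; i < words; i += SKETCH_UNROLL_FACTOR) {
        uint64_t acc = p[i + 0] | p[i + 1] | p[i + 2] | p[i + 3] |
                       p[i + 4] | p[i + 5] | p[i + 6] | p[i + 7];
        if (acc) {
            break;
        }
    }

    return i * sizeof(uint64_t);
}
```

With such a function, a zero page check reduces to comparing the
returned offset with the page size.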

v4:
- do not inline buffer_find_nonzero_offset()
- inline can_use_buffer_find_nonzero_offset() correctly
- re-add asserts in buffer_find_nonzero_offset() as profiling
  shows they do not hurt.
- replace the last occurrences of the scalar 8 with
  BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
- avoid dereferencing p already in patch 5 where we
  know that the page (p) is zero
- explicitly set bytes_sent = 0 if we skip a zero page.
  bytes_sent was 0 before, but it was not obvious.
- add accounting information for skipped zero pages
- fix errors reported by checkpatch.pl
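
The bytes_sent and accounting changes above can be pictured with the
following sketch. It only models the decision, not the actual code in
arch_init.c. All names (sketch_save_page(), struct sketch_stats, the
SKETCH_* constants) are hypothetical, and it assumes that outside the
bulk stage a zero page is sent as a short marker of some fixed size.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE   4096
#define SKETCH_MARKER_SIZE 9    /* hypothetical size of a zero page marker */

/* hypothetical counters, loosely modelled on migration statistics */
struct sketch_stats {
    unsigned long skipped_pages;  /* zero pages not sent at all */
    unsigned long zero_markers;   /* zero pages sent as a marker */
    unsigned long normal_pages;   /* pages sent in full */
};

/* plain byte-wise zero check, kept simple for the sketch */
static bool sketch_page_is_zero(const unsigned char *page)
{
    size_t i;

    for (i = 0; i < SKETCH_PAGE_SIZE; i++) {
        if (page[i]) {
            return false;
        }
    }
    return true;
}

/* returns the number of bytes that would go over the wire */
static size_t sketch_save_page(const unsigned char *page, bool bulk_stage,
                               struct sketch_stats *st)
{
    size_t bytes_sent;

    if (sketch_page_is_zero(page)) {
        if (bulk_stage) {
            /* skip the page, but account for it and make the
             * zero byte count explicit */
            st->skipped_pages++;
            bytes_sent = 0;
        } else {
            st->zero_markers++;
            bytes_sent = SKETCH_MARKER_SIZE;
        }
    } else {
        st->normal_pages++;
        bytes_sent = SKETCH_PAGE_SIZE;
    }

    return bytes_sent;
}
```

Skipping is only safe under the assumption that the destination memory
starts out zeroed, which is why the sketch limits it to the bulk stage.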

v3:
- remove asserts, inline functions and add a function that
  checks whether buffer_find_nonzero_offset() can be used.
- use the above check function in buffer_is_zero() and
  find_next_bit().
- use buffer_find_nonzero_offset() directly to find
  zero pages. We know that all requirements are met
  for memory pages.
- fix C89 violation in buffer_is_zero().
- avoid dereferencing p in ram_save_block() if we already
  know the page is zero.
- fix initialization of last_offset in reset_ram_globals().
- avoid skipping pages with offset == 0 in bulk stage in
  migration_bitmap_find_and_reset_dirty().
- compared to v1, also check for zero pages after the bulk
  stage of RAM migration, as there are guests (e.g. Windows)
  which zero out large amounts of memory while running.
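
The check function pattern from the first two items can be sketched as
follows. This is a portable approximation under assumed requirements
(word alignment and a length that is a multiple of the unrolled group
size). The names sketch_can_use_fast_path() and sketch_buffer_is_zero()
are made up and the real requirements in the series may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical stand-in for BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR */
#define SKETCH_UNROLL_FACTOR 8

/* assumed requirements for the fast path: the buffer is word aligned
 * and its length is a whole number of unrolled groups */
static bool sketch_can_use_fast_path(const void *buf, size_t len)
{
    return (uintptr_t)buf % sizeof(uint64_t) == 0 &&
           len % (SKETCH_UNROLL_FACTOR * sizeof(uint64_t)) == 0;
}

static bool sketch_buffer_is_zero(const void *buf, size_t len)
{
    const unsigned char *bytes = buf;
    size_t i;

    if (sketch_can_use_fast_path(buf, len)) {
        /* fast path: scan word by word (a real implementation would
         * call the optimized search function here) */
        const uint64_t *words = buf;

        for (i = 0; i < len / sizeof(uint64_t); i++) {
            if (words[i]) {
                return false;
            }
        }
        return true;
    }

    /* fallback: byte-wise scan for buffers that do not qualify */
    for (i = 0; i < len; i++) {
        if (bytes[i]) {
            return false;
        }
    }
    return true;
}
```

Page-aligned, page-sized buffers always pass such a check, which is what
allows the search function to be called directly for memory pages.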

v2:
- fix description, add trivial zero check and add asserts
  to buffer_find_nonzero_offset().
- add a constant for the unroll factor of buffer_find_nonzero_offset()
- replace is_dup_page() by buffer_is_zero()
- added test results to xbzrle patch
- optimize descriptions

Peter Lieven (10):
  move vector definitions to qemu-common.h
  add a zero splat vector to qemu-common.h
  cutils: add a function to find non-zero content in a buffer
  buffer_is_zero: use vector optimizations if possible
  bitops: unroll while loop in find_next_bit()
  migration: search for zero instead of dup pages
  migration: add an indicator for bulk state of ram migration
  migration: do not sent zero pages in bulk stage
  migration: do not search dirty pages in bulk stage
  migration: use XBZRLE only after bulk stage

 arch_init.c                   |   74 +++++++++++++++++++----------------------
 hmp.c                         |    2 ++
 include/migration/migration.h |    2 ++
 include/qemu-common.h         |   37 +++++++++++++++++++++
 migration.c                   |    3 +-
 qapi-schema.json              |    8 +++--
 qmp-commands.hx               |    3 +-
 util/bitops.c                 |   18 +++++++++-
 util/cutils.c                 |   60 +++++++++++++++++++++++++++++++++
 9 files changed, 162 insertions(+), 45 deletions(-)

-- 
1.7.9.5

