This is v10 of the in-memory part of the incremental backup feature. There are some remaining issues for which I am requesting commentary, please see below under the "RFC" section for the lingering questions I am aware of as of this revision, the broad overview below, and the detailed changelog.
Enough changes have been made that most Reviewed-By lines from previous iterations have been removed. (Sorry!) This series was originally authored by Fam Zheng; his cover letter is included below. ~John Snow ================================================================= This is the in memory part of the incremental backup feature. With the added commands, we can create a bitmap on a block backend, from which point of time all the writes are tracked by the bitmap, marking sectors as dirty. Later, we call drive-backup and pass the bitmap to it, to do an incremental backup. See the last patch which adds some tests for this use case. Fam ================================================================= For convenience, this patchset is available on github: https://github.com/jnsnow/qemu/commits/dbm-backup v10 (Overview): - I've included one of Vladimir's bitmap fixes as patch #1. - QMP commands and transactions are now protected via aio_context functions. - QMP commands use "node-ref" as a parameter name now. Reasoning is thus: Either a "device name" or a "node name" can be used to reference a BDS, which is the key item we are actually acting on with these bitmap commands. Thus, I refer to this unified parameter as a "Node Reference," or "node-ref." We could also argue for "backend-ref" or "device-ref" for the reverse semantics: where we accept a unified parameter, but we intend to resolve it to the BlockBackend instead of resolving the parameter given to the BlockDriverState. Or, we could use "reference" for both cases, and to match the existing BlockdevRef command. - All QMP commands added are now per-node via a unified parameter name. Though backup only operates on "devices," you are free to create bitmaps for any arbitrary node you wish. It is obviously only useful for the root node, currently. - Bitmap Sync modes (CONSUME, RESET) are removed. See below (changelog, RFC questions) for more details. - Code to recover the bitmap after a failure has been added, but I have some major questions about drive_backup guarantees. See RFC questions. v10 (Detailed Changelog): (1/13) New Patch: - Included Vladimir Sementsov-Ogievskiy's patch that clarifies the semantics of the bitmap primitives. (3/13): - Edited function names for consistency (Stefanha) - Removed compile-time constants (Stefanha) - Acquire aio_context for bitmap add/remove (Stefanha) - Documented optional granularity for bitmap-add in qmp-commands.hx file (Eric Blake) - block_dirty_bitmap_lookup moved forward to this patch to allow more re-use. (Stefanha) - Fixed a problem where the block_dirty_bitmap_lookup didn't always set an error if it returned NULL. - Added an optional return BDS lookup parameter to bdrv_dirty_bitmap_lookup. - Renamed "device" to "node-ref" for block_dirty_bitmap_lookup, adjusted calls to bdrv_lookup_bs() to reflect unified parameter usage. - qmp_block_dirty_bitmap_{add,remove} now reference arbitrary node names via @node-ref. (4/13): - Default granularity and granularity getters are both now uint64_t (Stefanha) (5/13): - Added documentation to warn about the necessity of updating the hbitmap deep copy. (Stefanha) (6/13) - Renamed bdrv_reset_dirty_bitmap to bdrv_clear_dirty_bitmap to be consistent with Vladimir's patches. - Removed const qualifier for bdrv_copy_dirty_bitmap, to accommodate patch 8. (7/13) New Patch: - Added an hbitmap_merge operation to join two bitmaps together. (8/13) New Patch: - Added bdrv_reclaim_dirty_bitmap() to allow us to "roll back" a bitmap into the bitmap that it was spawned from. This will let us maintain an accurate account of dirty sectors even after a failure. - This adds an "originator" pointer to the BdrvDirtyBitmap and is why "const" was removed for copy. (9/13): - QMP semantics changes as outlined for Patch #3. - Enable/Disable now protected by aio_context (Stefanha) (10/13): - Add coroutine_fn annotation to block backup helper. (Stefanha) - Removed sync_bitmap_gran from BackupBlockJob, just use the getter on the BdrvDirtyBitmap instead. (Stefanha) - bdrv_dirty_iter_set was renamed to bdrv_set_dirty_iter. - Calls to bdrv_reset_dirty_bitmap are modified to bdrv_clear_dirty_bitmap to reflect the rename in patch #6. - Bitmap usage modes (RESET vs. CONSUME) has been deleted, for the reason of targeting a simpler core usage first before targeting optimizations. CONSUME is particularly problematic in the case of errors; so this mode is omitted for now. - Adjusted error message to use MirrorSyncMode enum, not (incorrectly) the BitmapSyncMode enum. - In the event of a failure, the sync_bitmap is now merged back into the original bitmap so that we do not lose any dirty bitmap information needlessly. (11/13): - Changed block_dirty_bitmap_add_abort to use qmp_block_dirty_bitmap_remove: - Old code used bdrv_lookup_bs to acquire bitmap and bs anyway, and ignored the failure case. - New code relies on the same information, and with a NULL errp pointer we will similarly ignore the failure case. prepare() should have properly vetted this information before this point anyway. This new code is also now protected via aio_context through the use of the QMP commands. - More code re-use of block_dirty_bitmap_lookup to get both the bs and bitmap pointers. - aio_context protection and cleanup in new abort methods. (13/13): - Modified test for new parameter names. v9: - Edited commit message, for English embetterment (02/10) - Rebased on top of stefanha/block-next (06,08/10) - Adjusted error message and line length (07/10) v8: - Changed int64_t return for bdrv_dbm_calc_def_granularity to uint64_t (2/10) - Updated commit message (2/10) - Removed redundant check for null in device parameter (2/10) - Removed comment cruft. (2/10) - Removed redundant local_err propagation (several) - Updated commit message (3/10) - Fix HBitmap copy loop index (4/10) - Remove redundant ternary (5/10) - Shift up the block_dirty_bitmap_lookup function (6/10) - Error messages cleanup (7/10) - Add an assertion to explain the re-use of .prepare() for two transactions. (8/10) - Removed BDS argument from bitmap enable/disable helper; it was unused. (8/10) v7: (highlights) - First version being authored by jsnow - Addressed most list feedback from V6, many small changes. All feedback was either addressed on-list (as a wontfix) or patched. - Replaced all error_set with error_setg - Replaced all bdrv_find with bdrv_lookup_bs() - Adjusted the max granularity to share a common function with backup/mirror that attempts to "guess" a reasonable default. It clamps between [4K,64K] currently. - The BdrvDirtyBitmap object now counts granularity exclusively in bytes to match its interface. It leaves the sector granularity concerns to HBitmap. - Reworked the backup loop to utilize the hbitmap iterator. There are some extra concerns to handle arrhythmic cases where the granularity of the bitmap does not match the backup cluster size. This iteration works best when it does match, but it's not a deal-breaker if it doesn't -- it just gets less efficient. - Reworked the transactional functions so that abort() wouldn't "undo" a redundant command. They now have been split into a prepare and a commit function (with state) and do not provide an abort command. - Added a block_dirty_bitmap_lookup(device, name, errp) function to shorten a few of the commands added in this series, particularly qmp_enable, qmp_disable, and the transaction preparations. v6: Re-send of v5. v5: Rebase to master. v4: Last version tailored by Fam Zheng. ================================================================= RFC Questions: (1) Return values for QMP documentation. Versions 1-9 indicate that a QMP command will return "DeviceNotFound," but this seems to be related to the old error_set implementation instead of the newer(?) error_setg. How should docs/writing-qmp-commands.txt and this documentation be updated accordingly? (2) qmp_drive_backup coherency Does qmp_drive_backup promise a /coherent/ backup? This has implications for how we handle the failure mode for sync bitmaps in block/backup; those implications are examined below. == MODE: CONSUME == The "CONSUME" BitmapUseMode would take the BdrvDirtyBitmap and copy the sectors indicated by the bitmap, then discard the bitmap. This means that from the time of the start of the backup, nobody is keeping track of new writes. In the failure case, this means that either: (A) Backups are coherent: We cannot retry the operation, because we simply do not know which bits may have been altered from the time we started the backup. The usefulness of the bitmap is lost on errors. (B) Backups are not necessarily coherent: We can retry the operation by just keeping this bitmap around and retrying the operation. Additional bits may have been changed since, but we are only interested in producing some differential, without regard to coherency. The usefulness of this is uncertain, because we do not have the needed information to flush the remaining differential to another image, so we can never produce a differential that forms a coherent backing chain. For the above reasons, I have omitted the CONSUME mode from v10 of the patchset entirely. == MODE: RESET == The "RESET" BitmapUseMode, now the only mode, makes a copy of the BdrvDirtyBitmap for use during backup and clears the original bitmap. This way we are tracking new writes during backup. In the failure case, this means that either for coherent or incoherent snapshots, we should be able to take the list of sectors we originally meant to copy and merge them back into the list of sectors that have become modified since we began copying them. In either the coherent or incoherent case, this should be sufficient for maintaining data integrity and allowing for retries on backup failure. In the incoherent case, this will allow us to flush remaining dirty clusters to disk to produce a coherent incremental backup chain when we go to close the drive. In this mode, a backup retry will then attempt to copy all clusters modified since the last successful backup, because modified sectors will accumulate in the bitmap between each failure. == EFFICIENCY == If backups are not guaranteed or expected to be coherent; we may leave too many bits set. For example, say we copy clusters [0 - 100]. Let's say that cluster 75 is one that is marked dirty. As we begin our backup, we make a copy of the original bitmap and clear all bits in the original. During our backup, as we copy cluster 25, a write occurs that marks cluster 75 dirty in the original bitmap. Now it is marked as dirty in both bitmaps. If drive backups are not coherent, by the time we arrive at cluster 75, it includes the newest data that marked the bit dirty, but no bitmaps have cluster 75 marked clean. Assuming cluster 75 is not written to again before the next incremental backup is requested, when we go to do our backup we will copy cluster 75 needlessly. This scenario only happens if incremental backups are NOT guaranteed to be self-consistent. If they are, there is no problem. If they are not, the existing code can be optimized by marking sectors as clean in the original bitmap as they are copied out. == What the heck? == The big RFC question here is "Are backups expected to be self-consistent?" because depending on the answer, we may need to treat bitmaps different during backup or upon error after a backup. Fam Zheng (8): qapi: Add optional field "name" to block dirty bitmap qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove block: Introduce bdrv_dirty_bitmap_granularity() hbitmap: Add hbitmap_copy qmp: Add block-dirty-bitmap-enable and block-dirty-bitmap-disable qapi: Add transaction support to block-dirty-bitmap-{add, enable, disable} qmp: Add dirty bitmap 'enabled' field in query-block qemu-iotests: Add tests for drive-backup sync=dirty-bitmap John Snow (4): block: Add bdrv_copy_dirty_bitmap and bdrv_clear_dirty_bitmap hbitmap: add hbitmap_merge block: add bdrv_reclaim_dirty_bitmap qmp: Add support of "dirty-bitmap" sync mode for drive-backup Vladimir Sementsov-Ogievskiy (1): block: fix spoiling all dirty bitmaps by mirror and migration block.c | 157 ++++++++++++++++++++++++-- block/backup.c | 119 ++++++++++++++++---- block/mirror.c | 27 +++-- blockdev.c | 254 +++++++++++++++++++++++++++++++++++++++++- hmp.c | 3 +- include/block/block.h | 25 ++++- include/block/block_int.h | 2 + include/qemu/hbitmap.h | 19 ++++ migration/block.c | 7 +- qapi-schema.json | 5 +- qapi/block-core.json | 101 ++++++++++++++++- qmp-commands.hx | 68 ++++++++++- tests/qemu-iotests/056 | 33 +++++- tests/qemu-iotests/056.out | 4 +- tests/qemu-iotests/iotests.py | 8 ++ util/hbitmap.c | 48 ++++++++ 16 files changed, 814 insertions(+), 66 deletions(-) -- 1.9.3