On 07/06/2018 10:38 AM, John Snow wrote:
I think the "discard stashed state, making undo impossible"
interpretation is good because .commit() is not allowed to fail. That
function should only do things that never fail.
I think this is probably the correct way to proceed, and we ought to
formalize this model. I think it's actually nearly impossible to do it
the other way around, too.
Let's say we've got two actions that both do error-checking in .prepare,
but perform their actual stateful changes in .commit, but are
interdependent.
{ a, b }
- a.prepare() checks state and confirms we will be able to proceed, but
changes no state.
- b.prepare() checks state and confirms it will be able to proceed, but
can't see what a is planning to do.
- a.commit() succeeds as expected.
- b.commit() does not fail as it is not allowed to, but the effects are
undetermined because we have not checked the state change from a.commit()
I'm also trying to figure out if we guarantee that .abort is always
called in reverse order of .prepare. If we have { a, b, c }, where
'a'.prepare makes one change, 'b'.prepare makes another that 'a' cannot
directly undo, then 'c' fails, we're still okay if 'b'.abort can roll
back to the state it had during prepare, so that 'a'.abort can then undo
state cleanly. And for consistency, I'd argue that .commit should have
the same ordering as .abort (that is, every .prepare - .abort/.commit
pair is a nested stack to track for unwinding purposes, and should be
unwound in reverse order of setting aside resources up front).
In general we can get around this, but the more actions we add, the
harder it is to do proper error checking while considering the
hypothetical state after some actions have committed.
I think the model where we take effect in .prepare() and undo it if
necessary in .abort() is actually easier to model in your head, because
error checking is simpler. That's probably the right model, then.
Yes, I'm also leaning heavily towards strong guarantees. In addition to
our examples of which operations depend on which bitmaps, we have this
additional case:
{ blockdev-snapshot, blockdev-snapshot-internal-sync }
vs.
{ blockdev-snapshot-internal-sync, blockdev-snapshot }
Yes, I know that mixing internal and external snapshots is currently
VERY dangerous
(https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg00865.html) if
you aren't careful - but there are two plausible outcomes that can be
chosen, where the order matters. In the first instance, we want to
create a new external snapshot (going from 'A' to 'A <- B') and THEN
create an internal snapshot (image 'B' gets an internal snapshot); in
the second instance we want to create an internal snapshot (image 'A'
gets an internal snapshot) and THEN create an external snapshot (going
from 'A' to 'A <- B'). Since the file where the internal snapshot is
created is indeterminate unless we strictly process the transaction in
the order that it was given, that argues that we want the strong order
guarantees.
At the very least, I think that's correct for 3.0 and the immediate
future. If you disagree, please speak up because I've long been
particularly uncertain about this aspect.
--John
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org