Hi Akihiko,

Thanks for the review.

On Wed, Aug 06, 2025 at 02:24:37PM +0900, Akihiko Odaki wrote:
> On 2025/08/06 3:25, Arun Menon wrote:
> > We consider,
> >    - the saving of the device state (save or subsection save) and
> >    - running the cleanup after (post_save)
> > as an atomic operation. And that is why, post_save() is called regardless
> > of whether there is a preceeding error. This means that, it is possible
> > that we have 2 distict errors, one from the preceeding function and the
> > other from the post_save() function.
> > 
> > This commit changes the error handling behavior when two errors occur during
> > a save operation.
> > 
> > 1) Preceding errors were propagated before, but they are now dismissed, if 
> > there
> >     is a new post_save() error.
> >      - A failure during the main save operation, means the system failed to
> >        reach a new desired state. A failure during the post-save cleanup,
> >        however, is a more critical issue as it can leave the system in an
> >        inconsistent or corrupted state. At present, all post_save() calls
> >        succeed. However, the introduction of error handling variants of 
> > these
> >        functions (post_save_errp) in the following commit, imply that we 
> > need
> >        to handle and propagate these errors. Therefore, we prioritize 
> > reporting
> >        the more severe error.
> 
> This explains why the post_save() error is propagated instead of propagating
> the preceding error. But the preceding errors still do not have to be
> dismissed if we report them with error_report_err() instead.

Yes, both can also be reported.

> 
> > 
> > 2) post_save() errors were dismissed before, but they are now propagated.
> >      - The original design implicitly assumed post-save hooks were 
> > infallible
> >        cleanup routines.
> >        This will not be the case after introducing the post/pre save/load 
> > errp
> >        variant functions. Dismissing these errors prevents users from
> >        identifying the specific operation that failed.
> 
> Re-iterating previous comments, if introducing post save errp variant
> functions break the assumption, we just don't have to introduce one. The
> reason to introduce one needs to be explained.

Sure, let's keep the original design and try the approach where we rename
post_save to cleanup_save and return void. This should make things clear.

That way, we can avoid introducing post_save_errp() and also we can discard
patches [PATCH v9 23/27] Refactor vmstate_save_state_v() function and this
one [PATCH v9 24/27] Propagate last encountered error in vmstate_save_state_v()
function.

I will post a new version with that patch.

> 
> >        The entire save-and-cleanup process is treated as a single
> >        logical operation, and all potential failures are communicated.
> > 
> > Signed-off-by: Arun Menon <arme...@redhat.com>
> > ---
> >   migration/vmstate.c | 14 ++++++++++++--
> >   1 file changed, 12 insertions(+), 2 deletions(-)
> > 
> > diff --git a/migration/vmstate.c b/migration/vmstate.c
> > index 
> > fbc59caadbbcc75fe6de27b459aa9aa25e76aa0a..ef78a1e62ad92e9608de72d125da80ea496c8dd1
> >  100644
> > --- a/migration/vmstate.c
> > +++ b/migration/vmstate.c
> > @@ -554,6 +554,12 @@ static int vmstate_save_dispatch(QEMUFile *f,
> >                       error_setg(errp, "Save of field %s/%s failed",
> >                                   vmsd->name, field->name);
> >                       ps_ret = post_save_dispatch(vmsd, opaque, &local_err);
> > +                    if (ps_ret) {
> > +                        ret = ps_ret;
> > +                        error_free_or_abort(errp);
> > +                        error_setg(errp, "post-save for %s failed, ret: 
> > %d",
> > +                                   vmsd->name, ret);
> > +                    }
> >                       return ret;
> >                   }
> > @@ -603,10 +609,14 @@ int vmstate_save_state_v(QEMUFile *f, const 
> > VMStateDescription *vmsd,
> >       }
> >       ret = vmstate_subsection_save(f, vmsd, opaque, vmdesc, errp);
> > +
> >       ps_ret = post_save_dispatch(vmsd, opaque, &local_err);
> > -    if (!ret && ps_ret) {
> > +    if (ps_ret) {
> > +        if (ret) {
> > +            error_free_or_abort(errp);
> > +        }
> >           ret = ps_ret;
> > -        error_setg(errp, "post-save failed: %s", vmsd->name);
> > +        error_propagate(errp, local_err);
> >       }
> >       return ret;
> > 
> 
Regards,
Arun


Reply via email to