Il lun 29 lug 2024, 14:20 Kevin Wolf <kw...@redhat.com> ha scritto:

> Apparently both oVirt and Kubevirt unconditionally use the stop policy,
> so I'm afraid in this case we must acknowledge that our expectations
> don't match reality.
>

Yeah, of course.

If I understand correctly, not having a pr-manager could mean that QEMU
> itself is sufficiently privileged and then the same logic would apply.
>
> But even if it means that we can't change any persistent reservations
> from the VM, what use would stopping the VM be? You would run into the
> exact case I'm describing in the commit message: You try to resume the
> VM and it immediately stops again because the request still doesn't get
> through. Or do you expect the host admin to take some manual action
> then?
>

Yes, if the PR operation is not allowed then the host admin would probably
get a notification and release the PR (or perhaps shutdown the VM with an
error) itself.

And what would you do about the Windows cluster validation case that
> intentionally sends a request which reservations don't and shouldn't
> allow? There is nothing on the host side to fix there. The guest is only
> happy when it gets an error back.
>

Yes, in that case (which is the most common one) there is nothing you can
do, so the patch is a good idea even if the case without a PR manager is a
bit murky.

Paolo

> > -        if (status == CHECK_CONDITION) {
> > > +        switch (status) {
> > > +        case CHECK_CONDITION:
> > >              req_has_sense = true;
> > >              error = scsi_sense_buf_to_errno(r->req.sense,
> sizeof(r->req.sense));
> > > -        } else {
> > > +            break;
> > > +        case RESERVATION_CONFLICT:
> > > +            /* Don't apply the error policy, always report to the
> guest */
> >
> > This is the only case where you get error == 0. Maybe remove it from
> > the initializer, and set it here?
>
> Not sure why the initialiser was added in the first place, but yes, I
> can do that.
>
> Kevin
>
> > On Mon, Jul 29, 2024 at 11:47 AM Kevin Wolf <kw...@redhat.com> wrote:
> > >
> > > RESERVATION_CONFLICT is not a backend error, but indicates that the
> > > guest tried to make a request that it isn't allowed to execute. Pass
> the
> > > error to the guest so that it can decide what to do with it.
> > >
> > > Without this, if we stop the VM in response to a RESERVATION_CONFLICT,
> > > it can happen that the VM cannot be resumed any more because every
> > > attempt to resume it immediately runs into the same error and stops the
> > > VM again.
> > >
> > > One case that expects RESERVATION_CONFLICT errors to be visible in the
> > > guest is running the validation tests in Windows 2019's Failover
> Cluster
> > > Manager, which intentionally tries to execute invalid requests to see
> if
> > > they are properly rejected.
> > >
> > > Buglink: https://issues.redhat.com/browse/RHEL-50000
> > > Signed-off-by: Kevin Wolf <kw...@redhat.com>
> > > ---
> > >  hw/scsi/scsi-disk.c | 15 +++++++++++----
> > >  1 file changed, 11 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> > > index 69a195177e..e173b238de 100644
> > > --- a/hw/scsi/scsi-disk.c
> > > +++ b/hw/scsi/scsi-disk.c
> > > @@ -235,11 +235,17 @@ static bool scsi_handle_rw_error(SCSIDiskReq *r,
> int ret, bool acct_failed)
> > >      } else {
> > >          /* A passthrough command has completed with nonzero status.
> */
> > >          status = ret;
> > > -        if (status == CHECK_CONDITION) {
> > > +        switch (status) {
> > > +        case CHECK_CONDITION:
> > >              req_has_sense = true;
> > >              error = scsi_sense_buf_to_errno(r->req.sense,
> sizeof(r->req.sense));
> > > -        } else {
> > > +            break;
> > > +        case RESERVATION_CONFLICT:
> > > +            /* Don't apply the error policy, always report to the
> guest */
> > > +            break;
> > > +        default:
> > >              error = EINVAL;
> > > +            break;
> > >          }
> > >      }
> > >
> > > @@ -249,8 +255,9 @@ static bool scsi_handle_rw_error(SCSIDiskReq *r,
> int ret, bool acct_failed)
> > >       * are usually retried immediately, so do not post them to QMP and
> > >       * do not account them as failed I/O.
> > >       */
> > > -    if (req_has_sense &&
> > > -        scsi_sense_buf_is_guest_recoverable(r->req.sense,
> sizeof(r->req.sense))) {
> > > +    if (!error || (req_has_sense &&
> > > +                   scsi_sense_buf_is_guest_recoverable(r->req.sense,
> > > +
>  sizeof(r->req.sense)))) {
> > >          action = BLOCK_ERROR_ACTION_REPORT;
> > >          acct_failed = false;
> > >      } else {
> > > --
> > > 2.45.2
> > >
> >
>
>

Reply via email to