On 24.09.2018 14:16, Halil Pasic wrote:
>
> On 09/24/2018 01:36 PM, Cornelia Huck wrote:
>> On Wed, 12 Sep 2018 15:43:03 -0400
>> Tony Krowiak <akrow...@linux.vnet.ibm.com> wrote:
>>
>>> From: Tony Krowiak <akrow...@linux.ibm.com>
>>>
>>> Let's call PAPQ(ZAPQ) to zeroize a queue for each queue configured
>>> for a mediated matrix device when it is released.
>>>
>>> Zeroizing a queue resets the queue, clears all pending
>>> messages for the queue entries and disables adapter interruptions
>>> associated with the queue.
>>>
>>> Signed-off-by: Tony Krowiak <akrow...@linux.ibm.com>
>>> Reviewed-by: Halil Pasic <pa...@linux.ibm.com>
>>> Tested-by: Michael Mueller <m...@linux.ibm.com>
>>> Tested-by: Farhan Ali <al...@linux.ibm.com>
>>> Signed-off-by: Christian Borntraeger <borntrae...@de.ibm.com>
>>> ---
>>>  drivers/s390/crypto/vfio_ap_ops.c |   44 +++++++++++++++++++++++++++++++++++++
>>>  1 files changed, 44 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
>>> index f8b276a..48b1b78 100644
>>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>>> @@ -829,6 +829,49 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>>>     return NOTIFY_OK;
>>>  }
>>>  
>>> +static int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
>>> +                               unsigned int retry)
>>> +{
>>> +   struct ap_queue_status status;
>>> +
>>> +   do {
>>> +           status = ap_zapq(AP_MKQID(apid, apqi));
>>> +           switch (status.response_code) {
>>> +           case AP_RESPONSE_NORMAL:
>>> +                   return 0;
>>> +           case AP_RESPONSE_RESET_IN_PROGRESS:
>>> +           case AP_RESPONSE_BUSY:
>>> +                   msleep(20);
>>> +                   break;
>>> +           default:
>>> +                   /* things are really broken, give up */
>>> +                   return -EIO;
>>> +           }
>>> +   } while (retry--);
>>> +
>>> +   return -EBUSY;
>> So, this function may either return 0, -EIO (things are really broken),
>> or -EBUSY (still busy after multiple tries)...
>>
>>> +}
>>> +
>>> +static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev)
>>> +{
>>> +   int ret;
>>> +   int rc = 0;
>>> +   unsigned long apid, apqi;
>>> +   struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>> +
>>> +   for_each_set_bit_inv(apid, matrix_mdev->matrix.apm,
>>> +                        matrix_mdev->matrix.apm_max + 1) {
>>> +           for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm,
>>> +                                matrix_mdev->matrix.aqm_max + 1) {
>>> +                   ret = vfio_ap_mdev_reset_queue(apid, apqi, 1);
>>> +                   if (ret)
>>> +                           rc = ret;
>> ...and here, we return the last error of any of the resets. Two
>> questions:
>>
>> - Does it make sense to continue if we get -EIO? IOW, does "really
>>   broken" only refer to a certain tuple and other tuples still can/need
>>   to be reset?
> I think it does make sense to continue, because IMHO "things are really
> broken" is an overstatement (I mean the APQN invalid case). One could
> argue about whether skipping the current card (adapter) would be justified.
>
> IMHO the current code is good enough for a first shot, and we can think
> about fine-tuning it later.
Absolutely. The -EIO case is reached, for example, when the APQN
is 'deconfigured', which means the crypto adapter is logically unplugged.
So the -EIO case should NOT lead to fatal actions such as panic()
or shutting down a KVM guest.
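
To make the three outcomes concrete, here is a minimal userspace sketch of the
retry loop, assuming a scripted stand-in for ap_zapq(). The mock_* names and
the MOCK_RESPONSE_* values are hypothetical and exist only for illustration;
the control flow mirrors the patch, minus the msleep(). Note that
"do { } while (retry--)" with retry = 1 makes two attempts in total.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the AP response codes used in the patch. */
enum mock_response {
	MOCK_RESPONSE_NORMAL,
	MOCK_RESPONSE_RESET_IN_PROGRESS,
	MOCK_RESPONSE_BUSY,
	MOCK_RESPONSE_DECONFIGURED,
};

/* Hypothetical queue that replays a scripted list of responses. */
struct mock_queue {
	const enum mock_response *script;
	size_t len;
	size_t pos;	/* number of reset attempts made so far */
};

/* Stand-in for ap_zapq(): returns the next scripted response. */
static enum mock_response mock_zapq(struct mock_queue *q)
{
	size_t i;

	assert(q->len > 0);
	/* Once the script is exhausted, keep repeating its last entry. */
	i = q->pos < q->len ? q->pos : q->len - 1;
	q->pos++;
	return q->script[i];
}

/* Same control flow as vfio_ap_mdev_reset_queue() in the patch. */
static int mock_reset_queue(struct mock_queue *q, unsigned int retry)
{
	do {
		switch (mock_zapq(q)) {
		case MOCK_RESPONSE_NORMAL:
			return 0;
		case MOCK_RESPONSE_RESET_IN_PROGRESS:
		case MOCK_RESPONSE_BUSY:
			break;	/* the driver sleeps 20 ms here */
		default:
			/* e.g. deconfigured: report it, nothing fatal */
			return -EIO;
		}
	} while (retry--);

	return -EBUSY;
}
```

A caller that sees -EIO from this sketch simply moves on to the next queue;
nothing in the loop itself treats the condition as fatal.
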
>> - Is the return code useful in any way, as we don't know which tuple it
>>   refers to?
>>
> Well, good question. It conveys that the operation can 'fail'. AFAIR -EBUSY
> is mostly fine given what the architecture says, if we are satisfied with just
> a reset. And the cases behind -EIO might actually be OK too in the same sense.
> My guess is that, based on the return value, client code can tell whether we
> have zeroized all queues or basically just reset them (like rapq). We could
> log that to some debug facility or whatever, I guess, but at the moment we
> don't care.
>
> In the end I think the code is good enough as is, and if we want we can
> improve on it later.
>
> Regards,
> Halil
>
>
>>> +           }
>>> +   }
>>> +
>>> +   return rc;
>>> +}
>>> +
>>>  static int vfio_ap_mdev_open(struct mdev_device *mdev)
>>>  {
>>>     struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>> @@ -859,6 +902,7 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
>>>     if (matrix_mdev->kvm)
>>>             kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
>>>  
>>> +   vfio_ap_mdev_reset_queues(mdev);
>>>     vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>>>                              &matrix_mdev->group_notifier);
>>>     matrix_mdev->kvm = NULL;
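
On the question of which tuple a return code refers to: if that ever becomes
interesting, one possible direction is to record the first failing
(APID, APQI) pair next to the aggregated return code. The following is only a
userspace sketch under that assumption. struct reset_report, reset_fn,
reset_all_queues() and example_reset() are hypothetical names, not part of the
patch, and plain arrays replace the for_each_set_bit_inv() bitmap walk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical record of what went wrong during a reset pass. */
struct reset_report {
	unsigned int failures;	/* number of tuples that did not reset */
	int first_err;		/* error code of the first failing tuple */
	unsigned int first_apid;
	unsigned int first_apqi;
};

/* Hypothetical per-queue reset callback; stands in for the ZAPQ wrapper. */
typedef int (*reset_fn)(unsigned int apid, unsigned int apqi);

static int reset_all_queues(const unsigned int *apids, size_t n_apids,
			    const unsigned int *apqis, size_t n_apqis,
			    reset_fn reset, struct reset_report *report)
{
	int rc = 0;
	size_t i, j;

	assert(reset && report);
	report->failures = 0;
	report->first_err = 0;
	report->first_apid = 0;
	report->first_apqi = 0;

	for (i = 0; i < n_apids; i++) {
		for (j = 0; j < n_apqis; j++) {
			int ret = reset(apids[i], apqis[j]);

			if (!ret)
				continue;
			/* Remember only the first failing tuple. */
			if (!report->failures) {
				report->first_err = ret;
				report->first_apid = apids[i];
				report->first_apqi = apqis[j];
			}
			report->failures++;
			/* As in the patch: keep going, the last error wins. */
			rc = ret;
		}
	}

	return rc;
}

/* Example callback: adapter 2 is deconfigured, queue (1, 5) stays busy. */
static int example_reset(unsigned int apid, unsigned int apqi)
{
	if (apid == 2)
		return -EIO;
	if (apid == 1 && apqi == 5)
		return -EBUSY;
	return 0;
}
```

The return value keeps the semantics of the patch (the last error seen), so
existing callers would not notice a difference; the report is purely
additional information that could be fed to a debug facility.
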
