On 5/13/2021 3:38 PM, Kevin Traynor wrote:
> On 13/05/2021 15:11, David Marchand wrote:
>> On Thu, May 13, 2021 at 2:38 PM Chenbo Xia <chenbo....@intel.com> wrote:
>>>
>>> This patch fixes an issue of application crash because of vhost iotlb
>>> not initialized when virtio has multiqueue enabled.
>>>
>>> iotlb messages will be sent when some queues are not enabled. If we
>>> initialize iotlb in vhost_user_set_vring_num, it could happen that
>>> iotlb update comes when iotlb pool of disabled queues are not
>>> initialized.
>>
>> This makes the problem I reproduced disappear at init, but I noticed
>> the segfault after restarting testpmd once.
>> And a little bit after this, my vm crashed.
>>
>> This is not systematic, so I guess there is some condition with how
>> the virtio device is initialised in the vm.
>>
> 
> Ok, no point in Red Hat QA testing RC3 yet, if it is still faulty.
> 
> fyi - if you want to fix with a new patch it will likely delay Red Hat
> QA testing RC3 (maybe others?) and probably they will only have cycles
> for one RC3 test run.
> 
> If you choose to revert, we can ask Red Hat QA to test RC3 without
> further delay. Please let us know when you consider the options.
> 

If the patch is not good to go as it is I suggest reverting it, as far as I know
Chenbo will be off for Friday & Monday, so it doesn't leave much time to
update/test a new version.

>>
>> One question below.
>>
>>
>> Bugzilla ID: 703
>>
>>> Fixes: 968bbc7e2e50 ("vhost: avoid IOTLB mempool allocation while IOMMU 
>>> disabled")
>>>
>>
>> Reported-by: Pei Zhang <pezh...@redhat.com>
>>
>>> Signed-off-by: Chenbo Xia <chenbo....@intel.com>
>>> ---
>>>  lib/vhost/vhost_user.c | 13 +++++++++----
>>>  1 file changed, 9 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
>>> index 611ff209e3..ae4df8eb69 100644
>>> --- a/lib/vhost/vhost_user.c
>>> +++ b/lib/vhost/vhost_user.c
>>> @@ -311,6 +311,7 @@ vhost_user_set_features(struct virtio_net **pdev, 
>>> struct VhostUserMsg *msg,
>>>         uint64_t features = msg->payload.u64;
>>>         uint64_t vhost_features = 0;
>>>         struct rte_vdpa_device *vdpa_dev;
>>> +       uint32_t i;
>>>
>>>         if (validate_msg_fds(msg, 0) != 0)
>>>                 return RTE_VHOST_MSG_RESULT_ERR;
>>> @@ -389,6 +390,14 @@ vhost_user_set_features(struct virtio_net **pdev, 
>>> struct VhostUserMsg *msg,
>>>                 vdpa_dev->ops->set_features(dev->vid);
>>>
>>>         dev->flags &= ~VIRTIO_DEV_FEATURES_FAILED;
>>> +
>>> +       if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)) {
>>> +               for (i = 0; i < dev->nr_vring; i++) {
>>
>> I don't know the vhost-user protocol.
>> At this point of the device init/life, are we sure nr_vring is set to
>> the max number of vring?
>> The logs I have tend to say it is the case, but is there a guarantee
>> in the protocol?
>>
>>
>> Another way to fix would be to allocate on the first
>> VHOST_USER_IOTLB_MSG message received for a vring.
>>
>>
>>> +                       if (vhost_user_iotlb_init(dev, i))
>>> +                               return RTE_VHOST_MSG_RESULT_ERR;
>>> +               }
>>> +       }
>>> +
>>>         return RTE_VHOST_MSG_RESULT_OK;
>>>  }
>>>
>>
>>
> 

Reply via email to