Hi Fan, On 10/1/20 10:07 AM, Zhang, Roy Fan wrote: > Hi Maxime, > >> -----Original Message----- >> From: Maxime Coquelin <maxime.coque...@redhat.com> >> Sent: Thursday, October 1, 2020 8:55 AM >> To: Zhang, Roy Fan <roy.fan.zh...@intel.com>; Xia, Chenbo >> <chenbo....@intel.com>; Liu, Changpeng <changpeng....@intel.com>; >> dev@dpdk.org >> Cc: ma...@mellanox.com; Zawadzki, Tomasz <tomasz.zawad...@intel.com>; >> Yigit, Ferruh <ferruh.yi...@intel.com> >> Subject: Re: [dpdk-dev] [PATCH] vhost: return ready when at least 1 vring is >> configured >> >> >> >> On 9/30/20 6:37 PM, Zhang, Roy Fan wrote: >>> Hi Chenbo and Maxime, >>> >>> Thanks for replying the email. >>> >>> >>>> -----Original Message----- >>>> From: Maxime Coquelin <maxime.coque...@redhat.com> >>>> Sent: Wednesday, September 30, 2020 4:37 PM >>>> To: Xia, Chenbo <chenbo....@intel.com>; Zhang, Roy Fan >>>> <roy.fan.zh...@intel.com>; Liu, Changpeng <changpeng....@intel.com>; >>>> dev@dpdk.org >>>> Cc: ma...@mellanox.com; Zawadzki, Tomasz >> <tomasz.zawad...@intel.com>; >>>> Yigit, Ferruh <ferruh.yi...@intel.com> >>>> Subject: Re: [dpdk-dev] [PATCH] vhost: return ready when at least 1 vring >> is >>>> configured >>>> >>>> Hi, >>>> >>>> On 9/30/20 4:48 AM, Xia, Chenbo wrote: >>>>> Hi Fan & Maxime, >>>>> >>>>> I am thinking that should we move set_features outside of new_device >>>> callback >>>>> for crypto device? I see that net and blk devices both set features >> between >>>> register >>>>> and start, and personally I think this makes sense that device features >> are >>>> set >>>>> before device start and ready. How do you think 😊? >>> >>> The reason it is set inside rte_vhost_crypto_create() is logically speaking >>> the user shouldn't expect to have to set the feature flags even after the >> create >>> function is called - and what I know in the application the only way to know >> the >>> vid for the first time is when new_device() is invoked. So if there is away >>> to >> know >>> the vid before new_device() is invoked I am happy to change the sample >> app. >> >> I think the Vhost-crypto API should be fixed. >> The good news is that it is still experimental, so we can fix it without >> worries (BTW, except the DPDK example, are there other application using >> the Vhost-crypto API?). >> >> The .new_device() callback is called when the Virtio device is ready, >> meaning when the backend can start processing the virtqueues. So feature >> negotiation should have taken place before that. >> >> I'm surprised it worked before, because doesn't the features negotiation >> takes place before the memory table are set? If so, how can the first >> virtqueue can be tested as ready if the vring address is not set? >> >> One other issue here, which is triggering the issue is that given how >> the registration is done, VIRTIO_DEV_BUILTIN_VIRTIO_NET flag is set for >> Vhost-crypto, which shouldn't happen. Even before last release rework, >> you should have faced issues when more than 2 queues where in used: >> > > Vhost-crypto was not working since 20.05. Changpeng's patch which set the > Number of queues to one made it working again so we waited it merged.
But the patch introducing the regression was introduced in v20.08, I'm confused. > However the patch was rejected by you. Indeed, I rejected the patch because it would break net backends. > I suppose there is another way - adding a new API called > "rte_vhost_crypto_set_feature(const char *socket)" so we don't have to > rely on rte_vhost_crypto_create() to set the feature flags > > what do you think? The set_features thing is just another problem. The main problem is that VIRTIO_DEV_BUILTIN_VIRTIO_NET gets set for crypto backend, which does not make sense. I proposed a fix below to be able to differentiate between builtin net and crypto backends below, but I think you missed it. Please check below. > Regards, > Fan > >> static int >> vhost_user_set_features(struct virtio_net **pdev, struct VhostUserMsg >> *msg, >> int main_fd __rte_unused) >> { >> ... >> if ((dev->flags & VIRTIO_DEV_BUILTIN_VIRTIO_NET) && >> !(dev->features & (1ULL << VIRTIO_NET_F_MQ))) { >> /* >> * Remove all but first queue pair if MQ hasn't been >> * negotiated. This is safe because the device is not >> * running at this stage. >> */ >> while (dev->nr_vring > 2) { >> struct vhost_virtqueue *vq; >> >> vq = dev->virtqueue[--dev->nr_vring]; >> if (!vq) >> continue; >> >> dev->virtqueue[dev->nr_vring] = NULL; >> cleanup_vq(vq, 1); >> cleanup_vq_inflight(dev, vq); >> free_vq(dev, vq); >> } >> } >> >> As VIRTIO_NET_F_MQ is never negotiated with crypto devices, it means you >> can not have more than two queues. >> >>>> >>>> Indeed, we cannot consider the device to be ready (and so call >>>> .new_device callback) if features haven't been negotiated. >>>> >>>> I agree, rte_vhost_driver_set_features() has to be called before >>>> .new_device(), and that's the reason why it takes socket's path and not >>>> vid as input. >>> >>> Different than vhost_blk, vhost_crypto lies in the library and needs to be >>> able to be treated evenly as virtio_net. Without the new_device() calling >>> rte_vhost_crypto_create() the feature flag cannot be set. Without setting >>> the feature flag the device is always treated as virtio_net device, hence it >>> cannot pass the virtio_is_ready() check as the number of queues virtio >>> crypt uses is only 1 instead of 2 (required by virtio_net). >> >> OK, so we are aligned, we need to find a proper solution. I think you >> need a specific driver start function that does not set >> VIRTIO_DEV_BUILTIN_VIRTIO_NET. >> >> First we can change that VIRTIO_DEV_BUILTIN_VIRTIO_NET flag by a new >> field in the device without breaking the API: >> >> enum virtio_backend_type { >> VIRTIO_DEV_UNKNOWN = 0, /* Likely external */ >> VIRTIO_DEV_BUILTIN_NET, >> VIRTIO_DEV_BUILTIN_CRYPTO, >> }; >> >> struct virtio_net { >> ... >> enum virtio_backend_type type; >> }; >> >> >> Then, introduce a new API to start crypto backend that would be called >> instead of rte_vhost_driver_start(): >> >> int >> rte_vhost_crypto_driver_start(const char *path) >> { >> >> return vhost_driver_start(path, VIRTIO_DEV_BUILTIN_CRYPTO); >> } >> >> >> int >> rte_vhost_driver_start(const char *path) >> { >> >> return vhost_driver_start(path, VIRTIO_DEV_BUILTIN_NET); >> } >> >> And then propagate the info down to vhost_new_device(). >> >> Does that make sense? Note that it does not fix the set_feature thing, which would also need to be fixed. But it should revert the behaviour for crypto backends back to pre-v20.08, as Changpeng patch did. >> Note that issue has been reported during v20.11 cycle was it was >> introduced in v20.08. It means it has not been tested. Does Intel QE has >> some Vhost crypto tests? >> Thanks, >> Maxime