Re: [dpdk-dev] [PATCH v5] net/i40e: fix mirror rule reset when port is stopped
> -Original Message- > From: Wu, Jingjing > Sent: Saturday, September 23, 2017 3:27 AM > To: Ananyev, Konstantin ; Dai, Wei > ; Xing, Beilei > Cc: dev@dpdk.org; sta...@dpdk.org > Subject: RE: [dpdk-dev] [PATCH v5] net/i40e: fix mirror rule reset when port > is stopped > > > > > -Original Message- > > From: Ananyev, Konstantin > > Sent: Thursday, September 21, 2017 6:46 AM > > To: Dai, Wei ; Wu, Jingjing ; > > Xing, Beilei > > > > Cc: dev@dpdk.org; sta...@dpdk.org > > Subject: RE: [dpdk-dev] [PATCH v5] net/i40e: fix mirror rule reset when > > port is stopped > > > > Hi Wei, > > > > > > > > Hi, Konstantin > > > > > > Without this patch, when a port is stopped, all mirror SW resource are > > > cleared but HW > > settings are still there. > > > And once the port is started again, creating mirror rule may fail due to > > > remain HW > > settings. > > > > > > There is a testpmd use case which can show why this patch is needed. > > > 1. bind PF to igb_uio > > > #/root/dpdk-devbind.py -b igb_uio :08:00.0 > > > 2. create 2 VFs > > > #echo 2 > /sys/bus/pci/devices/\:08\:00.0/max_vfs > > > 3. launch testpmd with PF, and set port mirror configuration > > > #./x86_64-native-linuxapp-gcc/app/testpmd -c 3 -n 4 -- -i > > > Testpmd > set port 0 mirror-rule 0 pool-mirror-up 0x1 dst-pool 1 on > > > Testpmd > quit > > > 4. relaunch testpmd with PF, and set port mirror as the same configuration > > > #./x86_64-native-linuxapp-gcc/app/testpmd -c 3 -n 4 -- -i > > > Testpmd > set port 0 mirror-rule 0 pool-mirror-up 0x1 dst-pool 1 on > > > I40e_mirror_rule_set( ): failed to add mirror rule: ret = -53, aq_err = > > > 13. > > > Mirror rule add error: (Function not implemented) > > > > > > When testpmd is quitted, rte_eth_dev_stop( ) and then rte_eth_dev_close( > > > ) are called. > > > In these function, i40e_dev_stop( ) and i40e_dev_close( ) is called. > > > And i40e_dev_stop is also called by i40e_dev_close( ). > > > > > > Without this patch, the error in second running of testpmd always happen. > > > This patch can address this error. > > > > Thanks for explanation. > > Though it seems that the patch while fixing the issue might introduce some > > inconsistences: > > 1. right now for i40e: dev_stop(port); dev_start(port) would keep existing > > HW mirror rule > > working. > >With your patch is not any more. > > 2. What about ixgbe? Do we also need to change its behavior? > > As I can see right now ixgbe doesn't reset any mirror rules at > > dev_stop(). > >Would it be reset automatically, or do we need to update ixgbe PMD too? > > > > About #1 - if we'll decide that this is a desired behavior that dev_stop() > > voids > > all mirror rules, then at least we probably need to update the > > documentation. > > As alternative - at dev_stop() we can reset only actual HW rule, but keep SW > > configuration intact, > > and restore them at dev_start(). > > I personally think second option is a bit better - as it preserves existing > > behavior, > > and probably more convenient for the user. > > Yes, you reminded me the mirror is to VF. The mirror rule should be kept or > at least > be restored when device start again.. > Because the dev_stop should only stop the pf interface, but not affect VF. It > looks > like we don't delete the VEB we dev_stop. > The reset behavior may need to be done in dev_close, but not in dev_stop. if we can move that code into dev_close() , then indeed might be a better approach. > > > > > About the patch itself, why just not: > > while ((p_mirror = TAILQ_FIRST(&pf->mirror_list))) > > i40e_mirror_rule_reset(dev, p_mirror->index); > > > Wei's v1 patch is doing like that. But I commented it to change it. Because > i40e_mirror_rule_reset is doing a search in the list which is not necessary. Yes, but it is a control path, so need to be extra quick here. As a plus - avoid code duplication and easier to control/maintain it. Konstantin
Re: [dpdk-dev] [PATCH v4 00/41] Introduce NXP DPAA Bus, Mempool and PMD
> -Original Message- > From: Thomas Monjalon [mailto:tho...@monjalon.net] > Sent: Friday, September 22, 2017 7:49 PM > To: Shreyansh Jain > Cc: dev@dpdk.org; ferruh.yi...@intel.com; Hemant Agrawal > > Subject: Re: [PATCH v4 00/41] Introduce NXP DPAA Bus, Mempool and PMD > > 22/09/2017 16:00, Shreyansh Jain: > > From: Thomas Monjalon [mailto:tho...@monjalon.net] > > > At the beginning of fslmc work, I had understood that every NXP SoC were > > > connecting components with the same principle which we could call the > > > "Freescale bus". > > > Then you came with this bus named bus/fslmc, not bus/dpaa2. > > > Now I am confused. What is the exact scope of fslmc? Is it just DPAA2? > > > > My memory is poor. I will have to look through the old emails what happened > - but I recall there was a discussion in initial phases about the naming. > "fslmc" came out as a name that is what is the real name of the DPAA2 bus. > There was initial a confusion if name of bus in Linux Kernel should match or > not - but, we realized that bus is *not* device and device name is "dpaa2". > > > > As for whether fslmc would cover multiple SoC - that is still true. There > are multiple SoCs within the DPAA2 umbrella. LS20XX, LS108X series and some > more - all of which use the FSLMC bus (DPAA2 architecture, on FSLMC bus, > having 'dpaa2' devices). > > > > There is another architecture, an old one, which are still popular. This is > platform type bus which is aptly named 'dpaa' - and here the confusion of bus > name and device doesn't appear. (DPAA bus, using DPAA architecture, exposing > 'dpaa' devices). > > > > Exact scope of FSLMC is just DPAA2 architecture based SoCs. There are many > here with new coming up. > > Exact scope of DPAA bus is just DPAA architecture based SoCs. There are > many here. > > > > Does this clear your doubt to some extent? > > Yes it is a lot clearer! Thanks > > Now that I better understand, I think flsmc bus should have been named > dpaa2 bus. Is it too late? :) I resonate your thought that drivers/bus/dpaa2, drivers/mempool/dpaa2, drivers/net/dpaa2, drivers/crypto/dpaa2_sec would have been more uniform. But again, that would have misled a lot of DPAA2 users into thinking bus name is 'dpaa2' which is not the case. And anyways, the changes required in the code to reflect this name change are not worthwhile. I would prefer to go as is.
Re: [dpdk-dev] [PATCH v4 3/4] eventdev: Add eventdev ethernet Rx adapter
On 9/21/2017 9:13 PM, Pavan Nikhilesh Bhagavatula wrote: Hi Nikhil, Few comments Inline + * - 0: Success, statistics reset successfully. Invalid description. Thanks Pavan, for catching these, will fix. + * - <0: Error code on failure, if the adapter doesn't use a rte_service + * function, this function returns -ESRCH. + */ +int rte_event_eth_rx_adapter_service_id_get(uint8_t id, uint32_t *service_id); + +#ifdef __cplusplus +} +#endif +#endif /* _RTE_EVENT_ETH_RX_ADAPTER_ */ diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.c b/lib/librte_eventdev/rte_event_eth_rx_adapter.c new file mode 100644 index 0..d5b655dae --- /dev/null +++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.c @@ -0,0 +1,1238 @@ + +static int +rx_adapter_ctrl(uint8_t id, int start) +{ + struct rte_event_eth_rx_adapter *rx_adapter; + struct rte_eventdev *dev; + struct eth_device_info *dev_info; + uint32_t i; + int use_service = 0; + int stop = !start; + + RTE_EVENT_ETH_RX_ADAPTER_ID_VALID_OR_ERR_RET(id, -EINVAL); + rx_adapter = id_to_rx_adapter(id); + if (!rx_adapter) + return -EINVAL; + + dev = &rte_eventdevs[rx_adapter->eventdev_id]; + + for (i = 0; i < rte_eth_dev_count(); i++) { + dev_info = &rx_adapter->eth_devices[i]; + /* if start check for num dev queues */ + if (start && !dev_info->nb_dev_queues) + continue; + /* if stop check if dev has been started */ + if (stop && !dev_info->dev_rx_started) + continue;:1 + use_service |= !dev_info->internal_event_port; + dev_info->dev_rx_started = start; + if (!dev_info->internal_event_port) + continue; + start ? (*dev->dev_ops->eth_rx_adapter_start)(dev, + &rte_eth_devices[i]) : + (*dev->dev_ops->eth_rx_adapter_stop)(dev, + &rte_eth_devices[i]); + } + + if (use_service) Here setting the service run state is not sufficient we need to enable the service on a service core calling rte_service_start_with_defaults() should be sufficient. Yes it is necessary but insufficient. IMO, If the application is controlling core masks, the application flow at startup looks like: rte_event_eth_rx_adapter_create(id,..) ... rte_event_eth_rx_adapter_start(id) if (!rte_event_eth_rx_adapter_service_id_get(id, &service_id)) { rte_service_lcore_add(rx_lcore_id); rte_service_map_lcore_set(service_id, rx_lcore_id, 1); rte_service_lcore_start(rx_lcore_id) } Since rte_service_start_with_defaults() is invoked before the adapter is created, how would it get assigned a core etc ? Nikhil + rte_service_runstate_set(rx_adapter->service_id, start); + + return 0; +} + Regards, Pavan
[dpdk-dev] KNI module on FreeBSD
Hi, I can't make my DPDK application on FreeBSD, and i give the following error fatal error: rte_kni.h: No such file or directory #include In, ubuntu and debian, I load rte_kni.ko before compilation; However, there is no such file in FreeBSD version! my .../dpdk-2.2.0/x86_64-native-bsdapp-gcc/kmod/ directory only contains: contigmem.konic_uio.ko Thanks in advance for your help
Re: [dpdk-dev] KNI module on FreeBSD
KNI isn’t supported on FreeBSD. > On Sep 23, 2017, at 1:42 PM, Mostafa Salari wrote: > > Hi, > I can't make my DPDK application on FreeBSD, and i give the following error >fatal error: rte_kni.h: No such file or directory >#include > In, ubuntu and debian, I load rte_kni.ko before compilation; However, there > is no such file in FreeBSD version! > my .../dpdk-2.2.0/x86_64-native-bsdapp-gcc/kmod/ directory only contains: > contigmem.konic_uio.ko > Thanks in advance for your help
Re: [dpdk-dev] [PATCH v5] net/i40e: fix mirror rule reset when port is stopped
> -Original Message- > From: Ananyev, Konstantin > Sent: Saturday, September 23, 2017 6:37 PM > To: Wu, Jingjing ; Dai, Wei ; > Xing, Beilei > Cc: dev@dpdk.org; sta...@dpdk.org > Subject: RE: [dpdk-dev] [PATCH v5] net/i40e: fix mirror rule reset when port > is stopped > > > > > -Original Message- > > From: Wu, Jingjing > > Sent: Saturday, September 23, 2017 3:27 AM > > To: Ananyev, Konstantin ; Dai, Wei > > ; Xing, Beilei > > Cc: dev@dpdk.org; sta...@dpdk.org > > Subject: RE: [dpdk-dev] [PATCH v5] net/i40e: fix mirror rule reset > > when port is stopped > > > > > > > > > -Original Message- > > > From: Ananyev, Konstantin > > > Sent: Thursday, September 21, 2017 6:46 AM > > > To: Dai, Wei ; Wu, Jingjing > > > ; Xing, Beilei > > > Cc: dev@dpdk.org; sta...@dpdk.org > > > Subject: RE: [dpdk-dev] [PATCH v5] net/i40e: fix mirror rule reset > > > when port is stopped > > > > > > Hi Wei, > > > > > > > > > > > Hi, Konstantin > > > > > > > > Without this patch, when a port is stopped, all mirror SW resource > > > > are cleared but HW > > > settings are still there. > > > > And once the port is started again, creating mirror rule may fail > > > > due to remain HW > > > settings. > > > > > > > > There is a testpmd use case which can show why this patch is needed. > > > > 1. bind PF to igb_uio > > > > #/root/dpdk-devbind.py -b igb_uio :08:00.0 2. create 2 VFs > > > > #echo 2 > /sys/bus/pci/devices/\:08\:00.0/max_vfs > > > > 3. launch testpmd with PF, and set port mirror configuration > > > > #./x86_64-native-linuxapp-gcc/app/testpmd -c 3 -n 4 -- -i Testpmd > > > > > set port 0 mirror-rule 0 pool-mirror-up 0x1 dst-pool 1 on > > > > Testpmd > quit 4. relaunch testpmd with PF, and set port mirror as > > > > the same configuration #./x86_64-native-linuxapp-gcc/app/testpmd > > > > -c 3 -n 4 -- -i Testpmd > set port 0 mirror-rule 0 pool-mirror-up > > > > 0x1 dst-pool 1 on I40e_mirror_rule_set( ): failed to add mirror > > > > rule: ret = -53, aq_err = 13. > > > > Mirror rule add error: (Function not implemented) > > > > > > > > When testpmd is quitted, rte_eth_dev_stop( ) and then > rte_eth_dev_close( ) are called. > > > > In these function, i40e_dev_stop( ) and i40e_dev_close( ) is called. > > > > And i40e_dev_stop is also called by i40e_dev_close( ). > > > > > > > > Without this patch, the error in second running of testpmd always > happen. > > > > This patch can address this error. > > > > > > Thanks for explanation. > > > Though it seems that the patch while fixing the issue might > > > introduce some > > > inconsistences: > > > 1. right now for i40e: dev_stop(port); dev_start(port) would keep > > > existing HW mirror rule working. > > >With your patch is not any more. > > > 2. What about ixgbe? Do we also need to change its behavior? > > > As I can see right now ixgbe doesn't reset any mirror rules at > dev_stop(). > > >Would it be reset automatically, or do we need to update ixgbe PMD > too? > > > > > > About #1 - if we'll decide that this is a desired behavior that > > > dev_stop() voids all mirror rules, then at least we probably need to > update the documentation. > > > As alternative - at dev_stop() we can reset only actual HW rule, but > > > keep SW configuration intact, and restore them at dev_start(). > > > I personally think second option is a bit better - as it preserves > > > existing behavior, and probably more convenient for the user. > > > > Yes, you reminded me the mirror is to VF. The mirror rule should be > > kept or at least be restored when device start again.. > > Because the dev_stop should only stop the pf interface, but not affect > > VF. It looks like we don't delete the VEB we dev_stop. > > The reset behavior may need to be done in dev_close, but not in dev_stop. > > if we can move that code into dev_close() , then indeed might be a better > approach. > > > > > > > > > About the patch itself, why just not: > > > while ((p_mirror = TAILQ_FIRST(&pf->mirror_list))) > > > i40e_mirror_rule_reset(dev, p_mirror->index); > > > > > Wei's v1 patch is doing like that. But I commented it to change it. > > Because i40e_mirror_rule_reset is doing a search in the list which is not > necessary. > > Yes, but it is a control path, so need to be extra quick here. > As a plus - avoid code duplication and easier to control/maintain it. > Konstantin > Thanks to Konstantin and Jingjing for all of your suggestion. I will work out v6 patch to put all these mirror reset operations in dev_close( ). As to ixgbe, same scheme will be tried and tested.
[dpdk-dev] [PATCH] doc: fix a typo in testpmd guide
This patch fixes a trivial typo in testpmd guide. Signed-off-by: Rami Rosen --- doc/guides/testpmd_app_ug/testpmd_funcs.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst index 2ed62f5..a8e0c37 100644 --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst @@ -392,7 +392,7 @@ ddp get info Display information about dynamic device personalization (DDP) profile:: - testpmd> ddp get info (profile_patch) + testpmd> ddp get info (profile_path) show vf stats ~ -- 1.9.1
[dpdk-dev] [PATCH] vhost: Expose virtio interrupt requirement on rte_vhos API
Performance tests with the OVS DPDK datapath have shown that the tx throughput over a vhostuser port into a VM with an interrupt-based virtio driver is limited by the overhead incurred by virtio interrupts. The OVS PMD spends up to 30% of its cycles in system calls kicking the eventfd. Also the core running the vCPU is heavily loaded with generating the virtio interrupts in KVM on the host and handling these interrupts in the virtio-net driver in the guest. This limits the throughput to about 500-700 Kpps with a single vCPU. OVS is trying to address this issue by batching packets to a vhostuser port for some time to limit the virtio interrupt frequency. With a 50 us batching period we have measured an iperf3 throughput increase by 15% and a PMD utilization decrease from 45% to 30%. On the other hand, guests using virtio PMDs do not profit from time-based tx batching. Instead they experience a 2-3% performance penalty and an average latency increase of 30-40 us. OVS therefore intends to apply time-based tx batching only for vhostuser tx queues that need to trigger virtio interrupts. Today this information is hidden inside the rte_vhost library and not accessible to users of the API. This patch adds a function to the API to query it. Signed-off-by: Jan Scheurich --- lib/librte_vhost/rte_vhost.h | 12 lib/librte_vhost/vhost.c | 19 +++ 2 files changed, 31 insertions(+) diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h index 8c974eb..d62338b 100644 --- a/lib/librte_vhost/rte_vhost.h +++ b/lib/librte_vhost/rte_vhost.h @@ -444,6 +444,18 @@ int rte_vhost_get_vhost_vring(int vid, uint16_t vring_idx, */ uint32_t rte_vhost_rx_queue_count(int vid, uint16_t qid); +/** + * Does the virtio driver request interrupts for a vhost tx queue? + * + * @param vid + * vhost device ID + * @param qid + * virtio queue index in mq case + * @return + * 1 if true, 0 if false + */ +int rte_vhost_tx_interrupt_requested(int vid, uint16_t qid); + #ifdef __cplusplus } #endif diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c index 0b6aa1c..bd1ebf9 100644 --- a/lib/librte_vhost/vhost.c +++ b/lib/librte_vhost/vhost.c @@ -503,3 +503,22 @@ struct virtio_net * return *((volatile uint16_t *)&vq->avail->idx) - vq->last_avail_idx; } + +int rte_vhost_tx_interrupt_requested(int vid, uint16_t qid) +{ +struct virtio_net *dev; +struct vhost_virtqueue *vq; + +dev = get_device(vid); +if (dev == NULL) +return 0; + +vq = dev->virtqueue[qid]; +if (vq == NULL) +return 0; + +if (unlikely(vq->enabled == 0 || vq->avail == NULL)) +return 0; + +return !(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT); +}
[dpdk-dev] [PATCH v2] vhost: Expose virtio interrupt need on rte_vhost API
Performance tests with the OVS DPDK datapath have shown that the tx throughput over a vhostuser port into a VM with an interrupt-based virtio driver is limited by the overhead incurred by virtio interrupts. The OVS PMD spends up to 30% of its cycles in system calls kicking the eventfd. Also the core running the vCPU is heavily loaded with generating the virtio interrupts in KVM on the host and handling these interrupts in the virtio-net driver in the guest. This limits the throughput to about 500-700 Kpps with a single vCPU. OVS is trying to address this issue by batching packets to a vhostuser port for some time to limit the virtio interrupt frequency. With a 50 us batching period we have measured an iperf3 throughput increase by 15% and a PMD utilization decrease from 45% to 30%. On the other hand, guests using virtio PMDs do not profit from time-based tx batching. Instead they experience a 2-3% performance penalty and an average latency increase of 30-40 us. OVS therefore intends to apply time-based tx batching only for vhostuser tx queues that need to trigger virtio interrupts. Today this information is hidden inside the rte_vhost library and not accessible to users of the API. This patch adds a function to the API to query it. Signed-off-by: Jan Scheurich --- v1 -> v2: Fixed too long commit lines Fixed white-space errors and warnings lib/librte_vhost/rte_vhost.h | 12 lib/librte_vhost/vhost.c | 19 +++ 2 files changed, 31 insertions(+) diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h index 8c974eb..d62338b 100644 --- a/lib/librte_vhost/rte_vhost.h +++ b/lib/librte_vhost/rte_vhost.h @@ -444,6 +444,18 @@ int rte_vhost_get_vhost_vring(int vid, uint16_t vring_idx, */ uint32_t rte_vhost_rx_queue_count(int vid, uint16_t qid); +/** + * Does the virtio driver request interrupts for a vhost tx queue? + * + * @param vid + * vhost device ID + * @param qid + * virtio queue index in mq case + * @return + * 1 if true, 0 if false + */ +int rte_vhost_tx_interrupt_requested(int vid, uint16_t qid); + #ifdef __cplusplus } #endif diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c index 0b6aa1c..c6e636e 100644 --- a/lib/librte_vhost/vhost.c +++ b/lib/librte_vhost/vhost.c @@ -503,3 +503,22 @@ struct virtio_net * return *((volatile uint16_t *)&vq->avail->idx) - vq->last_avail_idx; } + +int rte_vhost_tx_interrupt_requested(int vid, uint16_t qid) +{ + struct virtio_net *dev; + struct vhost_virtqueue *vq; + + dev = get_device(vid); + if (dev == NULL) + return 0; + + vq = dev->virtqueue[qid]; + if (vq == NULL) + return 0; + + if (unlikely(vq->enabled == 0 || vq->avail == NULL)) + return 0; + + return !(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT); +}
[dpdk-dev] [PATCH v3] vhost: Expose virtio interrupt need on rte_vhost API
Performance tests with the OVS DPDK datapath have shown that the tx throughput over a vhostuser port into a VM with an interrupt-based virtio driver is limited by the overhead incurred by virtio interrupts. The OVS PMD spends up to 30% of its cycles in system calls kicking the eventfd. Also the core running the vCPU is heavily loaded with generating the virtio interrupts in KVM on the host and handling these interrupts in the virtio-net driver in the guest. This limits the throughput to about 500-700 Kpps with a single vCPU. OVS is trying to address this issue by batching packets to a vhostuser port for some time to limit the virtio interrupt frequency. With a 50 us batching period we have measured an iperf3 throughput increase by 15% and a PMD utilization decrease from 45% to 30%. On the other hand, guests using virtio PMDs do not profit from time-based tx batching. Instead they experience a 2-3% performance penalty and an average latency increase of 30-40 us. OVS therefore intends to apply time-based tx batching only for vhostuser tx queues that need to trigger virtio interrupts. Today this information is hidden inside the rte_vhost library and not accessible to users of the API. This patch adds a function to the API to query it. Signed-off-by: Jan Scheurich --- v2 -> v3: Fixed even more white-space errors and warnings v1 -> v2: Fixed too long commit lines Fixed white-space errors and warnings lib/librte_vhost/rte_vhost.h | 12 lib/librte_vhost/vhost.c | 19 +++ 2 files changed, 31 insertions(+) diff --git a/lib/librte_vhost/rte_vhost.h b/lib/librte_vhost/rte_vhost.h index 8c974eb..d62338b 100644 --- a/lib/librte_vhost/rte_vhost.h +++ b/lib/librte_vhost/rte_vhost.h @@ -444,6 +444,18 @@ int rte_vhost_get_vhost_vring(int vid, uint16_t vring_idx, */ uint32_t rte_vhost_rx_queue_count(int vid, uint16_t qid); +/** + * Does the virtio driver request interrupts for a vhost tx queue? + * + * @param vid + * vhost device ID + * @param qid + * virtio queue index in mq case + * @return + * 1 if true, 0 if false + */ +int rte_vhost_tx_interrupt_requested(int vid, uint16_t qid); + #ifdef __cplusplus } #endif diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c index 0b6aa1c..c6e636e 100644 --- a/lib/librte_vhost/vhost.c +++ b/lib/librte_vhost/vhost.c @@ -503,3 +503,22 @@ struct virtio_net * return *((volatile uint16_t *)&vq->avail->idx) - vq->last_avail_idx; } + +int rte_vhost_tx_interrupt_requested(int vid, uint16_t qid) +{ + struct virtio_net *dev; + struct vhost_virtqueue *vq; + + dev = get_device(vid); + if (dev == NULL) + return 0; + + vq = dev->virtqueue[qid]; + if (vq == NULL) + return 0; + + if (unlikely(vq->enabled == 0 || vq->avail == NULL)) + return 0; + + return !(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT); +}
[dpdk-dev] [PATCH] net/failsafe: fix calling device during RMV events
This commit prevents control path operations from failing after a sub device removal. Following are the failure steps: 1. The physical device is removed due to change in one of PF parameters (e.g. MTU) 2. The interrupt thread flags the device 3. Within 2 seconds Interrupt thread initializes the actual device removal, then every 2 seconds it tries to re-sync (plug in) the device. The trials fail as long as VF parameter mismatches the PF parameter. 4. A control thread initiates a control operation on failsafe which initiates this operation on the device. 5. A race condition occurs between the control thread and interrupt thread when accessing the device data structures. This commit prevents the race condition in step 5. Before this commit if a device was removed and then a control thread operation was initiated on failsafe - in some cases failsafe called the sub device operation instead of avoiding it. Such cases could lead to operations failures. This commit fixes failsafe criteria to determine when the device is removed such that it will avoid calling the sub device operations during that time and will only call them otherwise. Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD") Cc: sta...@dpdk.org Signed-off-by: Ophir Munk --- This is V2 patch is in reply to <20170911083117.gm21...@bidouze.vm.6wind.com> drivers/net/failsafe/failsafe_ether.c | 1 + drivers/net/failsafe/failsafe_ops.c | 31 +++ drivers/net/failsafe/failsafe_private.h | 26 +- 3 files changed, 37 insertions(+), 21 deletions(-) diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c index a3a8cce..1def110 100644 --- a/drivers/net/failsafe/failsafe_ether.c +++ b/drivers/net/failsafe/failsafe_ether.c @@ -378,6 +378,7 @@ failsafe_eth_dev_state_sync(struct rte_eth_dev *dev) i); goto err_remove; } + sdev->remove = 0; } } /* diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c index ff9ad15..721a48a 100644 --- a/drivers/net/failsafe/failsafe_ops.c +++ b/drivers/net/failsafe/failsafe_ops.c @@ -232,7 +232,6 @@ fs_dev_configure(struct rte_eth_dev *dev) dev->data->dev_conf.intr_conf.lsc = 0; } DEBUG("Configuring sub-device %d", i); - sdev->remove = 0; ret = rte_eth_dev_configure(PORT_ID(sdev), dev->data->nb_rx_queues, dev->data->nb_tx_queues, @@ -310,7 +309,7 @@ fs_dev_set_link_up(struct rte_eth_dev *dev) uint8_t i; int ret; - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { + FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev) { DEBUG("Calling rte_eth_dev_set_link_up on sub_device %d", i); ret = rte_eth_dev_set_link_up(PORT_ID(sdev)); if (ret) { @@ -329,7 +328,7 @@ fs_dev_set_link_down(struct rte_eth_dev *dev) uint8_t i; int ret; - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { + FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev) { DEBUG("Calling rte_eth_dev_set_link_down on sub_device %d", i); ret = rte_eth_dev_set_link_down(PORT_ID(sdev)); if (ret) { @@ -517,7 +516,7 @@ fs_promiscuous_enable(struct rte_eth_dev *dev) struct sub_device *sdev; uint8_t i; - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) + FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev) rte_eth_promiscuous_enable(PORT_ID(sdev)); } @@ -527,7 +526,7 @@ fs_promiscuous_disable(struct rte_eth_dev *dev) struct sub_device *sdev; uint8_t i; - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) + FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev) rte_eth_promiscuous_disable(PORT_ID(sdev)); } @@ -537,7 +536,7 @@ fs_allmulticast_enable(struct rte_eth_dev *dev) struct sub_device *sdev; uint8_t i; - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) + FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev) rte_eth_allmulticast_enable(PORT_ID(sdev)); } @@ -547,7 +546,7 @@ fs_allmulticast_disable(struct rte_eth_dev *dev) struct sub_device *sdev; uint8_t i; - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) + FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev) rte_eth_allmulticast_disable(PORT_ID(sdev)); } @@ -559,7 +558,7 @@ fs_link_update(struct rte_eth_dev *dev, uint8_t i; int ret; - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { + FOREACH_SUBDEV_ACTIVE_SAFE(sdev, i, dev) { DEBUG("Calling link_update on sub_device %d", i); ret = (SUBOPS(sdev, link_update))(ETH(sdev), wait_to_complete); if (ret && ret
Re: [dpdk-dev] [PATCH] doc: update failsafe feature list
Hi Ferruh > -Original Message- > From: Ferruh Yigit [mailto:ferruh.yi...@intel.com] > Sent: Friday, September 22, 2017 1:32 PM > To: Matan Azrad ; Gaetan Rivet > > Cc: dev@dpdk.org; john.mcnam...@intel.com > Subject: Re: [dpdk-dev] [PATCH] doc: update failsafe feature list > > On 9/19/2017 12:39 PM, Matan Azrad wrote: > > Hi Ferruh > > > >> -Original Message- > >> From: Ferruh Yigit [mailto:ferruh.yi...@intel.com] > >> Sent: Tuesday, September 19, 2017 2:00 PM > >> To: Matan Azrad ; Gaetan Rivet > >> > >> Cc: dev@dpdk.org; john.mcnam...@intel.com > >> Subject: Re: [dpdk-dev] [PATCH] doc: update failsafe feature list > >> > >> On 9/19/2017 11:04 AM, Matan Azrad wrote: > >>> > >>> Hi Ferruh > >>> > -Original Message- > From: Ferruh Yigit [mailto:ferruh.yi...@intel.com] > Sent: Tuesday, September 19, 2017 12:27 PM > To: Matan Azrad ; Gaetan Rivet > > Cc: dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH] doc: update failsafe feature list > > On 9/14/2017 4:32 PM, Matan Azrad wrote: > > Add supported failsafe features to feature list. > > Remove stats per queue feature from failsafe feature list since > > queue_stats_mapping_set dev op has not implemented yet. > > > > Signed-off-by: Matan Azrad > > --- > > doc/guides/nics/features/failsafe.ini | 15 ++- > > 1 file changed, 14 insertions(+), 1 deletion(-) > > > > diff --git a/doc/guides/nics/features/failsafe.ini > > b/doc/guides/nics/features/failsafe.ini > > index a42e344..9f48455 100644 > > --- a/doc/guides/nics/features/failsafe.ini > > +++ b/doc/guides/nics/features/failsafe.ini > > @@ -4,20 +4,33 @@ > > ; Refer to default.ini for the full list of available PMD features. > > ; > > [Features] > > +Speed capabilities = Y > > Link status = Y > > Link status event= Y > > MTU update = Y > > Jumbo frame = Y > > +Scattered Rx = Y > > +LRO = Y > > +TSO = Y > > Promiscuous mode = Y > > Allmulticast mode= Y > > Unicast MAC filter = Y > > Multicast MAC filter = Y > > VLAN filter = Y > > +Ethertype filter = Y > > +N-tuple filter = Y > > +SYN filter = Y > > +Tunnel filter= Y > > +Flexible filter = Y > > +Hash filter = Y > > +Flow director= Y > > Flow control = Y > > Flow API = Y > > +QinQ offload = Y > > +L3 checksum offload = Y > > +L4 checksum offload = Y > > Packet type parsing = Y > > Basic stats = Y > > -Stats per queue = Y > > ARMv7= Y > > ARMv8= Y > > Power8 = Y > > I am not sure if claiming support for these features is correct. > Failsafe itself doesn't provide these features, but relies > underlying hardware which we don't really know what they supports > or not in this > >> stage. > > >>> > >>> Don't you think that almost all failsafe features rely underlying > >>> hardware or > >> sub PMDs? > >> > >> You are right, perhaps we should remove all. This is helpful to show > >> what device features are supported. For failsafe, is this information > useful? > >> > > Since there are features that failsafe cannot support without any sub > > PMD dependences (for example "Stats per queue") it is useful. > > Sorry, I missed your point. > > Device feature list documentation is good for: > - End user can easily see what to expect from a device/driver. > - To trace what features implemented for a device. > - To find out which device has a specific desired feature. > > For failsafe, it is a virtual overlay device on other physical devices. > > The supported architectures and provided documents features can be > useful. But why/how NIC related features can be useful since all they are > coming form underlay devices? > My point is that someone can understand from this list all failsafe PMD features which are not supported even if the failsafe sub devices PMD may support them. Please read section 31.1 in failsafe documentation: http://dpdk.org/doc/guides/nics/fail_safe.html Actually, the failsafe supported features is the logical AND between all its sub devices supported features and failsafe default features. I think this table should reflect the failsafe default features. It is very useful for failsafe user to compare failsafe features and sub devices features to infer which feature is going to be supported with failsafe combination. In addition, Even if the PMD part is only to set capability bit (NIC related features capability) user must know that failsafe is going to set it. > > > >>> > OK for dropping "Stats per queue" > > > > >>> > >
[dpdk-dev] [dpdk-announce] DPDK Community Survey 2017
Hi All, As part of our ongoing efforts to improve DPDK, we'd like to hear your feedback! We have created a number of DPDK-related questions here. https://www.surveymonkey.com/r/DPDK_Community_Survey_2017 and want to hear your views!! The survey will close at midnight GMT on Sunday October 1st, 2017. Thanks in advance for your feedback - the more responses we get the more data we have to drive further features, improvements, etc. So please respond!!! Regards Deepak