Re: [dpdk-dev] [PATCH v2 06/10] net/virtio: fix queue setup consistency
Hi Zhihong,

On 02/09/2018 06:44 AM, Wang, Zhihong wrote:
> Hi Olivier,
>
> Given that the vec path can now be selected silently once the conditions
> are met, this issue theoretically impacts the whole virtio PMD.
> If you plan to fix it in the next release, do you want to do a temporary
> workaround to disable the vec path selection till then?

That may be the least bad solution if we don't fix it quickly.
Reverting the patch isn't trivial, so this is not an option.

I'm trying to reproduce it now, I'll let you know if I find something.

Cheers,
Maxime

> Thanks
> -Zhihong
>
> -----Original Message-----
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Olivier Matz
> Sent: Thursday, February 8, 2018 6:01 AM
> To: Xu, Qian Q
> Cc: Yao, Lei A; dev@dpdk.org; y...@fridaylinux.org; maxime.coque...@redhat.com; Thomas Monjalon; sta...@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 06/10] net/virtio: fix queue setup consistency
>
> Hi,
>
> It's in my short-term plans, but unfortunately some other high priority
> tasks were inserted before. Honestly, I'm not sure I'll be able to make
> it for the release, but I'll do my best.
>
> Olivier
>
> On Wed, Feb 07, 2018 at 08:31:07AM +, Xu, Qian Q wrote:
> Any update, Olivier?
> We are near to release, and the bug-fix is important for the virtio
> vector path usage. Thanks.
>
> -----Original Message-----
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Olivier Matz
> Sent: Thursday, February 1, 2018 4:28 PM
> To: Yao, Lei A
> Cc: dev@dpdk.org; y...@fridaylinux.org; maxime.coque...@redhat.com; Thomas Monjalon; sta...@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 06/10] net/virtio: fix queue setup consistency
>
> Hi Lei,
>
> It's on my todo list, I'll check this as soon as possible.
>
> Olivier
>
> On Thu, Feb 01, 2018 at 03:14:15AM +, Yao, Lei A wrote:
> Hi, Olivier
>
> This is Lei from the DPDK validation team at Intel. During our DPDK
> 18.02-rc1 test, I found that the following patch causes a serious issue
> with the virtio vector path: the traffic can't resume after stopping and
> starting the virtio device.
>
> The steps are as follows:
> 1. Launch vhost-user port using testpmd at Host
> 2. Launch VM with virtio device, mergeable is off
> 3. Bind the virtio device to pmd driver, launch testpmd, let the tx/rx
>    use vector path
>        virtio_xmit_pkts_simple
>        virtio_recv_pkts_vec
> 4. Send traffic to virtio device from vhost side, then stop the virtio
>    device
> 5. Start the virtio device again
>
> After step 5, the traffic can't resume.
>
> Could you help check this and give a fix? This issue will impact the
> virtio pmd user experience heavily. By the way, this patch is already
> included in v17.11. It looks like we need to provide a patch for this
> LTS version. Thanks a lot!
>
> BRs
> Lei
>
> -----Original Message-----
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Olivier Matz
> Sent: Thursday, September 7, 2017 8:14 PM
> To: dev@dpdk.org; y...@fridaylinux.org; maxime.coque...@redhat.com
> Cc: step...@networkplumber.org; sta...@dpdk.org
> Subject: [dpdk-dev] [PATCH v2 06/10] net/virtio: fix queue setup consistency
>
> In rx/tx queue setup functions, some code is executed only if
> use_simple_rxtx == 1. The value of this variable can change depending
> on the offload flags or sse support. If Rx queue setup is called
> before Tx queue setup, it can result in an invalid configuration:
>
> - dev_configure is called: use_simple_rxtx is initialized to 0
> - rx queue setup is called: queues are initialized without simple path
>   support
> - tx queue setup is called: use_simple_rxtx switches to 1, and simple
>   Rx/Tx handlers are selected
>
> Fix this by postponing a part of Rx/Tx queue initialization in
> dev_start(), as it was the case in the initial implementation.
>
> Fixes: 48cec290a3d2 ("net/virtio: move queue configure code to proper place")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Olivier Matz
> ---
>  drivers/net/virtio/virtio_ethdev.c | 13 +++++++++++++
>  drivers/net/virtio/virtio_ethdev.h |  6 ++++++
>  drivers/net/virtio/virtio_rxtx.c   | 40 ++++++++++++++++++++++++++++++++--------
>  3 files changed, 51 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
> index 8eee3ff80..c7888f103 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -1737,6 +1737,19 @@ virtio_dev_start(struct rte_eth_dev *dev)
>  	struct virtnet_rx *rxvq;
>  	struct virtnet_tx *txvq __rte_unused;
>  	struct virtio_hw *hw = dev->data->dev_private;
> +	int ret;
> +
> +	/* Finish the initialization of the queues */
> +	for (i = 0; i < dev->data->nb_rx_queues; i++) {
> +		ret = virtio_dev_rx_queue_setup_finish(dev, i);
> +		if (ret < 0)
> +			return ret;
> +	}
> +	for (i = 0; i < dev->data->nb_tx_queues; i++) {
> +		ret = virtio_dev_tx_queue_setup_finish(dev, i);
> +		if (ret < 0)
> +			return ret;
> +	}
>
>  	/* check if lsc interrupt feature is enabled */
>  	if (dev->data->dev_conf.intr_conf.lsc) {
>  d
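
The ordering problem described in the quoted commit message can be illustrated with a small standalone sketch. Everything below is hypothetical (the toy_* names are not the virtio PMD API); it only models why the path-dependent part of queue initialization is safer in the start step, once the use_simple_rxtx flag can no longer change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state; not struct virtio_hw. */
struct toy_dev {
	bool use_simple_rxtx;   /* may still change until all queues are set up */
	bool rxq_simple_layout; /* how the Rx ring was actually initialized */
};

/* Configure: the simple path is off until a later setup call enables it. */
static void toy_dev_configure(struct toy_dev *dev)
{
	dev->use_simple_rxtx = false;
	dev->rxq_simple_layout = false;
}

/* Rx queue setup only records parameters; no path-dependent work here. */
static void toy_rx_queue_setup(struct toy_dev *dev)
{
	(void)dev;
}

/* Tx queue setup may flip the flag, e.g. depending on offload flags. */
static void toy_tx_queue_setup(struct toy_dev *dev, bool simple_ok)
{
	if (simple_ok)
		dev->use_simple_rxtx = true;
}

/* Start: the flag is final now, so finish the path-dependent part. */
static int toy_dev_start(struct toy_dev *dev)
{
	assert(dev != NULL);
	dev->rxq_simple_layout = dev->use_simple_rxtx;
	return 0;
}

/* True when the ring layout matches the handlers that will be selected. */
static bool toy_dev_consistent(const struct toy_dev *dev)
{
	return dev->rxq_simple_layout == dev->use_simple_rxtx;
}
```

Calling configure, Rx setup, Tx setup (with the simple path allowed) and then start leaves the toy device consistent, whatever the order of the two setup calls.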
Re: [dpdk-dev] [PATCH v3] doc: update the usage for shared library
> -----Original Message-----
> From: Varghese, Vipin
> Sent: Thursday, February 8, 2018 6:20 PM
> To: dev@dpdk.org; Mcnamara, John
> Cc: Kovacevic, Marko; Varghese, Vipin
> Subject: [PATCH v3] doc: update the usage for shared library
>
> Add a note about the use of option '-d' for shared libraries in DPDK
> applications.
>

Acked-by: John McNamara
Re: [dpdk-dev] [PATCH v2 06/10] net/virtio: fix queue setup consistency
On 02/09/2018 09:59 AM, Maxime Coquelin wrote:
> Hi Zhihong,
>
> On 02/09/2018 06:44 AM, Wang, Zhihong wrote:
>> Hi Olivier,
>>
>> Given that the vec path can now be selected silently once the conditions
>> are met, this issue theoretically impacts the whole virtio PMD.
>> If you plan to fix it in the next release, do you want to do a temporary
>> workaround to disable the vec path selection till then?
>
> That may be the least bad solution if we don't fix it quickly.
> Reverting the patch isn't trivial, so this is not an option.
>
> I'm trying to reproduce it now, I'll let you know if I find something.

Hmm, I followed Tiwei's instructions, and in my case, Vhost's testpmd
crashes because Virtio-user makes it do an out-of-bounds access.

Could you provide a patch to disable vector path selection?
I'll continue to debug, but we can start reviewing it so that it is
ready if we need it.

Thanks,
Maxime

> Cheers,
> Maxime
>
>> Thanks
>> -Zhihong
>>
>> -----Original Message-----
>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Olivier Matz
>> Sent: Thursday, February 8, 2018 6:01 AM
>> To: Xu, Qian Q
>> Cc: Yao, Lei A; dev@dpdk.org; y...@fridaylinux.org; maxime.coque...@redhat.com; Thomas Monjalon; sta...@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v2 06/10] net/virtio: fix queue setup consistency
>>
>> Hi,
>>
>> It's in my short-term plans, but unfortunately some other high priority
>> tasks were inserted before. Honestly, I'm not sure I'll be able to make
>> it for the release, but I'll do my best.
>>
>> Olivier
>>
>> On Wed, Feb 07, 2018 at 08:31:07AM +, Xu, Qian Q wrote:
>> Any update, Olivier?
>> We are near to release, and the bug-fix is important for the virtio
>> vector path usage. Thanks.
[dpdk-dev] [PATCH] doc: fix ethdev API port_id parameter size
Fix rte_eth_dev_get_sec_ctx() parameter port_id storage size, from uint8_t
to uint16_t

Signed-off-by: Ferruh Yigit
---
Cc: Boris Pismenny
Cc: Aviad Yehezkel
Cc: Radu Nicolau
Cc: Declan Doherty
Cc: Hemant Agrawal
---
 doc/guides/rel_notes/deprecation.rst | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index d59ad5988..bbd9456a7 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -45,6 +45,10 @@ Deprecation Notices
   Target release for removal of the legacy API will be defined once most
   PMDs have switched to rte_flow.

+* ethdev: rte_eth_dev_get_sec_ctx() fix port id storage
+  rte_eth_dev_get_sec_ctx() is using uint8_t for port_id, which should be
+  uint16_t.
+
 * i40e: The default flexible payload configuration which extracts the first 16
   bytes of the payload for RSS will be deprecated starting from 18.02. If
   required the previous behavior can be configured using existing flow
-- 
2.14.3
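
As a side note on why the parameter width matters, here is a minimal standalone sketch. The toy_* helpers are hypothetical and are not DPDK functions; they only model what happens to a port id above 255 when it passes through an 8-bit parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Models an API that stores the port id in 8 bits (hypothetical helper). */
static uint16_t toy_port_via_u8(uint16_t requested)
{
	uint8_t port_id = (uint8_t)requested; /* wraps modulo 256 */

	return port_id;
}

/* Models the same API with a 16-bit port id (hypothetical helper). */
static uint16_t toy_port_via_u16(uint16_t requested)
{
	uint16_t port_id = requested; /* value preserved */

	return port_id;
}

/* Usage: port 300 silently becomes port 44 with the narrow type. */
static void toy_port_id_demo(void)
{
	assert(toy_port_via_u8(300) == 44);
	assert(toy_port_via_u16(300) == 300);
}
```

With the narrow type, the call would operate on a different, valid-looking port instead of failing, which is presumably why the mismatch is worth a deprecation notice.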
Re: [dpdk-dev] [PATCH] doc: add virtio GUEST ANNOUNCE to release notes
> -----Original Message-----
> From: Wang, Xiao W
> Sent: Friday, February 9, 2018 2:28 PM
> To: dev@dpdk.org
> Cc: Mcnamara, John; tho...@monjalon.net; Wang, Xiao W
> Subject: [PATCH] doc: add virtio GUEST ANNOUNCE to release notes
>
> Signed-off-by: Xiao Wang
> ---
>  doc/guides/rel_notes/release_18_02.rst | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/doc/guides/rel_notes/release_18_02.rst b/doc/guides/rel_notes/release_18_02.rst
> index c42a1d64b..d9718c174 100644
> --- a/doc/guides/rel_notes/release_18_02.rst
> +++ b/doc/guides/rel_notes/release_18_02.rst
> @@ -230,6 +230,12 @@ New Features
>    current build system using ``make``. For instructions on how to do a DPDK build
>    using the new system, see the instructions in ``doc/build-sdk-meson.txt``.
>
> +* **Added VIRTIO_NET_F_GUEST_ANNOUNCE feature support in virtio pmd.**
> +
> +  In scenario where the vhost backend doesn't have the ability to
> +  generate RARP packet, the VM running virtio pmd can still be live
> +  migrated if VIRTIO_NET_F_GUEST_ANNOUNCE feature is negotiated.
> +
>  .. note::
>
>      This new build system support is incomplete at this point and is added

The text has been added between the previous section and a note belonging
to the previous section.

However, we can fix this in the final revision of the release notes, so:

Acked-by: John McNamara
Re: [dpdk-dev] [PATCH] doc: add VxLAN GRO to release notes
> -----Original Message-----
> From: Hu, Jiayu
> Sent: Friday, February 9, 2018 5:29 AM
> To: dev@dpdk.org
> Cc: Mcnamara, John; tho...@monjalon.net; Hu, Jiayu
> Subject: [PATCH] doc: add VxLAN GRO to release notes
>
> Signed-off-by: Jiayu Hu
> ---
>  doc/guides/rel_notes/release_18_02.rst | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/doc/guides/rel_notes/release_18_02.rst b/doc/guides/rel_notes/release_18_02.rst
> index c42a1d6..c2078b5 100644
> --- a/doc/guides/rel_notes/release_18_02.rst
> +++ b/doc/guides/rel_notes/release_18_02.rst
> @@ -230,6 +230,15 @@ New Features
>    current build system using ``make``. For instructions on how to do a DPDK build
>    using the new system, see the instructions in ``doc/build-sdk-meson.txt``.
>
> +* **Add GRO support for VxLAN-tunneled packets.**
> +
> +  Add GRO support for VxLAN-tunneled packets. Supported VxLAN packets
> +  must contain an outer IPv4 header and inner TCP/IPv4 headers. VxLAN
> +  GRO doesn't check if input packets have correct checksums and doesn't
> +  update checksums for output packets. Additionally, it assumes the
> +  packets are complete (i.e., MF==0 && frag_off==0), when IP
> +  fragmentation is possible (i.e., DF==0).
> +
>  .. note::
>
>      This new build system support is incomplete at this point and is added

The text has been added between the previous section and a note belonging
to the previous section.

However, we can fix this in the final revision of the release notes, so:

Acked-by: John McNamara
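
Since the quoted note says the library assumes complete packets (MF==0 && frag_off==0), an application that cannot guarantee this might pre-filter its input. The sketch below is only an illustration of that condition: the toy_* names are hypothetical, the field is assumed to be the 16-bit IPv4 flags/fragment-offset word already converted to host byte order, and the bit positions follow the IPv4 header layout (MF is 0x2000, the offset is the low 13 bits).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_IPV4_MF_FLAG     0x2000u /* "more fragments" bit */
#define TOY_IPV4_OFFSET_MASK 0x1fffu /* 13-bit fragment offset */

/*
 * Returns true when the packet is not a fragment: no further fragments
 * follow (MF == 0) and it does not start inside a datagram (offset == 0).
 * frag_field is expected in host byte order.
 */
static bool toy_ipv4_is_complete(uint16_t frag_field)
{
	return (frag_field & TOY_IPV4_MF_FLAG) == 0 &&
	       (frag_field & TOY_IPV4_OFFSET_MASK) == 0;
}

/* Usage: DF alone (0x4000) is still a complete packet. */
static void toy_ipv4_demo(void)
{
	assert(toy_ipv4_is_complete(0x4000));
	assert(!toy_ipv4_is_complete(0x2000));
}
```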
Re: [dpdk-dev] [PATCH] doc: add vhost-user live migration features to release notes
> -----Original Message-----
> From: Hu, Jiayu
> Sent: Friday, February 9, 2018 7:13 AM
> To: dev@dpdk.org
> Cc: Mcnamara, John; tho...@monjalon.net; Hu, Jiayu
> Subject: [PATCH] doc: add vhost-user live migration features to release notes
>
> Signed-off-by: Jiayu Hu
> ---
>  doc/guides/rel_notes/release_18_02.rst | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/doc/guides/rel_notes/release_18_02.rst b/doc/guides/rel_notes/release_18_02.rst
> index c42a1d6..116ca2a 100644
> --- a/doc/guides/rel_notes/release_18_02.rst
> +++ b/doc/guides/rel_notes/release_18_02.rst
> @@ -230,6 +230,16 @@ New Features
>    current build system using ``make``. For instructions on how to do a DPDK build
>    using the new system, see the instructions in ``doc/build-sdk-meson.txt``.
>
> +* **Add feature supports for live migration from vhost-net to vhost-user.**
> +
> +  To make live migration from vhost-net to vhost-user possible, added
> +  feature supports for vhost-user. The features include:
> +
> +  * VIRTIO_F_EVENT_IDX
> +  * VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_HOST_ECN
> +  * VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_HOST_UFO
> +  * VIRTIO_NET_F_GSO
> +
>  .. note::
>
>      This new build system support is incomplete at this point and is added

The text has been added between the previous section and a note belonging
to the previous section.

However, we can fix this in the final revision of the release notes, so:

Acked-by: John McNamara
Re: [dpdk-dev] [PATCH v2] doc: add vhost-user live migration features to release notes
> -----Original Message-----
> From: Hu, Jiayu
> Sent: Friday, February 9, 2018 7:47 AM
> To: dev@dpdk.org
> Cc: Mcnamara, John; tho...@monjalon.net; Wang, Zhihong; Hu, Jiayu
> Subject: [PATCH v2] doc: add vhost-user live migration features to release notes
>
> Signed-off-by: Jiayu Hu
> ---
> change in v2:
> - add VIRTIO_F_ANY_LAYOUT feature
>
>  doc/guides/rel_notes/release_18_02.rst | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/doc/guides/rel_notes/release_18_02.rst b/doc/guides/rel_notes/release_18_02.rst
> index c42a1d6..1d31329 100644
> --- a/doc/guides/rel_notes/release_18_02.rst
> +++ b/doc/guides/rel_notes/release_18_02.rst
> @@ -230,6 +230,17 @@ New Features
>    current build system using ``make``. For instructions on how to do a DPDK build
>    using the new system, see the instructions in ``doc/build-sdk-meson.txt``.
>
> +* **Add feature supports for live migration from vhost-net to vhost-user.**
> +
> +  To make live migration from vhost-net to vhost-user possible, added
> +  feature supports for vhost-user. The features include:
> +
> +  * VIRTIO_F_ANY_LAYOUT
> +  * VIRTIO_F_EVENT_IDX
> +  * VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_HOST_ECN
> +  * VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_HOST_UFO
> +  * VIRTIO_NET_F_GSO
> +
>  .. note::
>
>      This new build system support is incomplete at this point and is added

The text has been added between the previous section and a note belonging
to the previous section.

However, we can fix this in the final revision of the release notes, so:

Acked-by: John McNamara
Re: [dpdk-dev] [PATCH] doc: add VxLAN GRO to release notes
09/02/2018 10:56, Mcnamara, John:
> The text has been added between the previous section and a note belonging
> to the previous section.
>
> However, we can fix this in the final revision of the release notes, so:

I can fix it.
I also take care of keeping a logical order in release notes features.
For example, drivers are grouped together by class, and libraries come after.

General comment about release notes:
Please try to update release notes while updating the code.
It is better for 3 reasons:
	- less chance of forgetting
	- release notes are up-to-date in RC1
	- it is better linked in the git history
[dpdk-dev] [PATCH] doc: update ethdev APIs to return named opaque type
The ethdev APIs that add callbacks return the callback object as "void *";
update the return type to the actual object type
"struct rte_eth_rxtx_callback *"

Signed-off-by: Ferruh Yigit
---
Cc: Konstantin Ananyev
Cc: Stephen Hemminger
Cc: Bruce Richardson
Cc: Thomas Monjalon
---
 doc/guides/rel_notes/deprecation.rst | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index bbd9456a7..b6479cd5a 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -49,6 +49,13 @@ Deprecation Notices
   rte_eth_dev_get_sec_ctx() is using uint8_t for port_id, which should be
   uint16_t.

+* ethdev: add rx/tx callback functions return named opaque type
+  rte_eth_add_rx_callback(), rte_eth_add_first_rx_callback() and
+  rte_eth_add_tx_callback() functions currently return "void * " but APIs to
+  delete callbacks get "struct rte_eth_rxtx_callback * " as parameter. For
+  consistency functions adding callback will return "struct rte_eth_rxtx_callback * "
+  instead of "void * ".
+
 * i40e: The default flexible payload configuration which extracts the first 16
   bytes of the payload for RSS will be deprecated starting from 18.02. If
   required the previous behavior can be configured using existing flow
-- 
2.14.3
[dpdk-dev] [PATCH v2] doc: update ethdev APIs to return named opaque type
The ethdev APIs that add callbacks return the callback object as "void *";
update the return type to the actual object type
"struct rte_eth_rxtx_callback *"

Signed-off-by: Ferruh Yigit
---
Cc: Konstantin Ananyev
Cc: Stephen Hemminger
Cc: Bruce Richardson
Cc: Thomas Monjalon
---
 doc/guides/rel_notes/deprecation.rst | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index bbd9456a7..5cb5a00d2 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -49,6 +49,13 @@ Deprecation Notices
   rte_eth_dev_get_sec_ctx() is using uint8_t for port_id, which should be
   uint16_t.

+* ethdev: functions add rx/tx callback will return named opaque type
+  rte_eth_add_rx_callback(), rte_eth_add_first_rx_callback() and
+  rte_eth_add_tx_callback() functions currently return callback object as
+  "void \*" but APIs to delete callbacks get "struct rte_eth_rxtx_callback \*"
+  as parameter. For consistency functions adding callback will return
+  "struct rte_eth_rxtx_callback \*" instead of "void * ".
+
 * i40e: The default flexible payload configuration which extracts the first 16
   bytes of the payload for RSS will be deprecated starting from 18.02. If
   required the previous behavior can be configured using existing flow
-- 
2.14.3
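
To illustrate the consistency argument, here is a small standalone sketch of an add/remove pair built around a named opaque type. All toy_* names are hypothetical and the single static slot is a simplification; this is not the ethdev implementation, only a model of why returning the named type helps.

```c
#include <assert.h>
#include <stddef.h>

/* Callers only see this forward declaration (opaque handle). */
struct toy_rxtx_callback;

/* Hypothetical "library side" definition, normally hidden from callers. */
struct toy_rxtx_callback {
	int id;
	int active;
};

static struct toy_rxtx_callback toy_cb_slot;

/* Returning the named type lets the compiler check the later remove call. */
static const struct toy_rxtx_callback *toy_add_rx_callback(int id)
{
	toy_cb_slot.id = id;
	toy_cb_slot.active = 1;
	return &toy_cb_slot;
}

/* Takes the same named type that the add function returns. */
static int toy_remove_rx_callback(const struct toy_rxtx_callback *cb)
{
	if (cb != &toy_cb_slot || !toy_cb_slot.active)
		return -1;
	toy_cb_slot.active = 0;
	return 0;
}

/* Usage: the handle flows from add to remove without any cast. */
static void toy_callback_demo(void)
{
	const struct toy_rxtx_callback *cb = toy_add_rx_callback(7);

	assert(cb != NULL);
	assert(toy_remove_rx_callback(cb) == 0);
}
```

If the add function returned "void *" instead, any pointer could be handed to the remove function without a compiler diagnostic.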
[dpdk-dev] [PATCH] Improve the shaper accuracy for large packets
From: Alan Robertson

There were 2 issues, the first was time could be lost whilst updating
the traffic-class period, the second was a frame could be delayed if
not enough tokens were available for the full frame. By allowing the
shaper to borrow credit from the next period the throughput is improved.
---
 lib/librte_sched/rte_sched.c | 58 +++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 41 insertions(+), 17 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 634486c..e53a424 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -57,7 +57,7 @@ struct rte_sched_subport {
 	/* Traffic classes (TCs) */
 	uint64_t tc_time; /* time of next update */
 	uint32_t tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	int32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 	uint32_t tc_period;

 	/* TC oversubscription */
@@ -98,7 +98,7 @@ struct rte_sched_pipe {

 	/* Traffic classes (TCs) */
 	uint64_t tc_time; /* time of next update */
-	uint32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	int32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];

 	/* Weighted Round Robin (WRR) */
 	uint8_t wrr_tokens[RTE_SCHED_QUEUES_PER_PIPE];
@@ -1451,6 +1451,8 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_sched_pipe_profile *params = grinder->pipe_params;
 	uint64_t n_periods;
+	uint32_t tc;
+	uint64_t lapsed;

 	/* Subport TB */
 	n_periods = (port->time - subport->tb_time) / subport->tb_period;
@@ -1466,20 +1468,42 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)

 	/* Subport TCs */
 	if (unlikely(port->time >= subport->tc_time)) {
-		subport->tc_credits[0] = subport->tc_credits_per_period[0];
-		subport->tc_credits[1] = subport->tc_credits_per_period[1];
-		subport->tc_credits[2] = subport->tc_credits_per_period[2];
-		subport->tc_credits[3] = subport->tc_credits_per_period[3];
-		subport->tc_time = port->time + subport->tc_period;
+		for (tc = 0; tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc++) {
+			if (subport->tc_credits[tc] < 0)
+				subport->tc_credits[tc] +=
+					subport->tc_credits_per_period[tc];
+			else
+				subport->tc_credits[tc] =
+					subport->tc_credits_per_period[tc];
+		}
+		/* If we've run into the next period only update the clock to
+		 * the time + tc_period so we'll replenish the tc tokens early
+		 * in the next tc_period to compensate. */
+		lapsed = port->time - subport->tc_time;
+		if (lapsed < subport->tc_period)
+			subport->tc_time += subport->tc_period;
+		else
+			subport->tc_time = port->time + subport->tc_period;
 	}

 	/* Pipe TCs */
 	if (unlikely(port->time >= pipe->tc_time)) {
-		pipe->tc_credits[0] = params->tc_credits_per_period[0];
-		pipe->tc_credits[1] = params->tc_credits_per_period[1];
-		pipe->tc_credits[2] = params->tc_credits_per_period[2];
-		pipe->tc_credits[3] = params->tc_credits_per_period[3];
-		pipe->tc_time = port->time + params->tc_period;
+		for (tc = 0; tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc++) {
+			if (pipe->tc_credits[tc] < 0)
+				pipe->tc_credits[tc] +=
+					params->tc_credits_per_period[tc];
+			else
+				pipe->tc_credits[tc] =
+					params->tc_credits_per_period[tc];
+		}
+		/* If we've run into the next period only update the clock to
+		 * the time + tc_period so we'll replenish the tc tokens early
+		 * in the next tc_period to compensate. */
+		lapsed = port->time - pipe->tc_time;
+		if (lapsed < params->tc_period)
+			pipe->tc_time += params->tc_period;
+		else
+			pipe->tc_time = port->time + params->tc_period;
 	}
 }

@@ -1586,16 +1610,16 @@ grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
 	uint32_t tc_index = grinder->tc_index;
 	uint32_t pkt_len = pkt->pkt_len + port->frame_overhead;
 	uint32_t subport_tb_credits = subport->tb_credits;
-	uint32_t subport_tc_credits = subport->tc_credits[tc_index];
+	int32_t subport_tc_credits = subport->tc_credits[tc_ind
[dpdk-dev] [PATCH] Improve the shaper accuracy for large packets
From: Alan Robertson

There were 2 issues, the first was time could be lost whilst updating
the traffic-class period, the second was a frame could be delayed if
not enough tokens were available for the full frame. By allowing the
shaper to borrow credit from the next period the throughput is improved.

Signed-off-by: Alan Robertson
---
 lib/librte_sched/rte_sched.c | 60 +++++++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 43 insertions(+), 17 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 634486c..7b06b0b 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -57,7 +57,7 @@ struct rte_sched_subport {
 	/* Traffic classes (TCs) */
 	uint64_t tc_time; /* time of next update */
 	uint32_t tc_credits_per_period[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
-	uint32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	int32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 	uint32_t tc_period;

 	/* TC oversubscription */
@@ -98,7 +98,7 @@ struct rte_sched_pipe {

 	/* Traffic classes (TCs) */
 	uint64_t tc_time; /* time of next update */
-	uint32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
+	int32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];

 	/* Weighted Round Robin (WRR) */
 	uint8_t wrr_tokens[RTE_SCHED_QUEUES_PER_PIPE];
@@ -1451,6 +1451,8 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)
 	struct rte_sched_pipe *pipe = grinder->pipe;
 	struct rte_sched_pipe_profile *params = grinder->pipe_params;
 	uint64_t n_periods;
+	uint32_t tc;
+	uint64_t lapsed;

 	/* Subport TB */
 	n_periods = (port->time - subport->tb_time) / subport->tb_period;
@@ -1466,20 +1468,44 @@ grinder_credits_update(struct rte_sched_port *port, uint32_t pos)

 	/* Subport TCs */
 	if (unlikely(port->time >= subport->tc_time)) {
-		subport->tc_credits[0] = subport->tc_credits_per_period[0];
-		subport->tc_credits[1] = subport->tc_credits_per_period[1];
-		subport->tc_credits[2] = subport->tc_credits_per_period[2];
-		subport->tc_credits[3] = subport->tc_credits_per_period[3];
-		subport->tc_time = port->time + subport->tc_period;
+		for (tc = 0; tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc++) {
+			if (subport->tc_credits[tc] < 0)
+				subport->tc_credits[tc] +=
+					subport->tc_credits_per_period[tc];
+			else
+				subport->tc_credits[tc] =
+					subport->tc_credits_per_period[tc];
+		}
+		/* If we've run into the next period only update the clock to
+		 * the time + tc_period so we'll replenish the tc tokens early
+		 * in the next tc_period to compensate.
+		 */
+		lapsed = port->time - subport->tc_time;
+		if (lapsed < subport->tc_period)
+			subport->tc_time += subport->tc_period;
+		else
+			subport->tc_time = port->time + subport->tc_period;
 	}

 	/* Pipe TCs */
 	if (unlikely(port->time >= pipe->tc_time)) {
-		pipe->tc_credits[0] = params->tc_credits_per_period[0];
-		pipe->tc_credits[1] = params->tc_credits_per_period[1];
-		pipe->tc_credits[2] = params->tc_credits_per_period[2];
-		pipe->tc_credits[3] = params->tc_credits_per_period[3];
-		pipe->tc_time = port->time + params->tc_period;
+		for (tc = 0; tc < RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE; tc++) {
+			if (pipe->tc_credits[tc] < 0)
+				pipe->tc_credits[tc] +=
+					params->tc_credits_per_period[tc];
+			else
+				pipe->tc_credits[tc] =
+					params->tc_credits_per_period[tc];
+		}
+		/* If we've run into the next period only update the clock to
+		 * the time + tc_period so we'll replenish the tc tokens early
+		 * in the next tc_period to compensate.
+		 */
+		lapsed = port->time - pipe->tc_time;
+		if (lapsed < params->tc_period)
+			pipe->tc_time += params->tc_period;
+		else
+			pipe->tc_time = port->time + params->tc_period;
 	}
 }

@@ -1586,16 +1612,16 @@ grinder_credits_check(struct rte_sched_port *port, uint32_t pos)
 	uint32_t tc_index = grinder->tc_index;
 	uint32_t pkt_len = pkt->pkt_len + port->frame_overhead;
 	uint32_t subport_tb_credits = subport->tb_credits;
-	uint32_t subport_tc_credits = subport->tc_credits[tc_index
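
The replenish logic in the visible part of the diff can be tried in isolation with the sketch below. It is a simplified, hypothetical model (struct toy_tc and toy_tc_update are not rte_sched names) of a single traffic class; how the credits become negative is decided in grinder_credits_check(), which is truncated above and therefore not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single traffic class; not struct rte_sched_pipe. */
struct toy_tc {
	uint64_t tc_time;            /* time of next update */
	uint32_t tc_period;
	uint32_t credits_per_period;
	int32_t credits;             /* signed: negative means credit was borrowed */
};

/* Mirrors the per-class update shown in the patch for one class. */
static void toy_tc_update(struct toy_tc *tc, uint64_t now)
{
	uint64_t lapsed;

	assert(tc->tc_period > 0);
	if (now < tc->tc_time)
		return; /* period not over yet */

	if (tc->credits < 0)
		tc->credits += (int32_t)tc->credits_per_period; /* repay the loan */
	else
		tc->credits = (int32_t)tc->credits_per_period;  /* plain refill */

	lapsed = now - tc->tc_time;
	if (lapsed < tc->tc_period)
		tc->tc_time += tc->tc_period;       /* stay on the period grid */
	else
		tc->tc_time = now + tc->tc_period;  /* too late, restart from now */
}
```

For example, with a period of 100, 1000 credits per period, a balance of -200 and an update 30 time units late, the class ends up with 800 credits and the next update is scheduled exactly one period after the missed deadline rather than one period after the late update.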
[dpdk-dev] [PATCH] examples/ipsec-secgw: print correct crypto name
When AES-256 was used, aes-128 was printed on the console.

Fixes: fa9088849e12 ("examples/ipsec-secgw: support AES 256")

Signed-off-by: Radu Nicolau
---
 examples/ipsec-secgw/sa.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/examples/ipsec-secgw/sa.c b/examples/ipsec-secgw/sa.c
index 21239dd..d9dcc0e 100644
--- a/examples/ipsec-secgw/sa.c
+++ b/examples/ipsec-secgw/sa.c
@@ -631,7 +631,8 @@ print_one_sa_rule(const struct ipsec_sa *sa, int inbound)
 	printf("\tspi_%s(%3u):", inbound?"in":"out", sa->spi);

 	for (i = 0; i < RTE_DIM(cipher_algos); i++) {
-		if (cipher_algos[i].algo == sa->cipher_algo) {
+		if (cipher_algos[i].algo == sa->cipher_algo &&
+				cipher_algos[i].key_len == sa->cipher_key_len) {
 			printf("%s ", cipher_algos[i].keyword);
 			break;
 		}
-- 
2.7.5
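
The bug pattern is easy to reproduce outside the example application. The sketch below uses a hypothetical table (the toy_* names and the keyword strings are illustrative, not copied from sa.c) in which two entries share an algorithm id and differ only in key length, so a lookup on the id alone would always stop at the first one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical algorithm ids; not the rte_crypto enums. */
enum toy_cipher { TOY_CIPHER_NULL, TOY_CIPHER_AES_CBC };

struct toy_cipher_entry {
	const char *keyword;
	enum toy_cipher algo;
	unsigned int key_len; /* in bytes */
};

/* Two entries share the same algo and differ only by key length. */
static const struct toy_cipher_entry toy_cipher_algos[] = {
	{ "null",        TOY_CIPHER_NULL,    0 },
	{ "aes-128-cbc", TOY_CIPHER_AES_CBC, 16 },
	{ "aes-256-cbc", TOY_CIPHER_AES_CBC, 32 },
};

/* Match on algo AND key length; returns NULL when nothing matches. */
static const char *toy_cipher_name(enum toy_cipher algo, unsigned int key_len)
{
	size_t n = sizeof(toy_cipher_algos) / sizeof(toy_cipher_algos[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		if (toy_cipher_algos[i].algo == algo &&
				toy_cipher_algos[i].key_len == key_len)
			return toy_cipher_algos[i].keyword;
	}
	return NULL;
}

/* Usage: a 32-byte key now resolves to the 256-bit keyword. */
static void toy_cipher_demo(void)
{
	assert(strcmp(toy_cipher_name(TOY_CIPHER_AES_CBC, 32), "aes-256-cbc") == 0);
}
```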
Re: [dpdk-dev] [PATCH] doc: fix ethdev API port_id parameter size
> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Friday, February 9, 2018 9:54 AM
> To: Neil Horman; Mcnamara, John; Kovacevic, Marko
> Cc: dev@dpdk.org; Yigit, Ferruh; Thomas Monjalon; Boris Pismenny; Aviad Yehezkel; Nicolau, Radu; Doherty, Declan; Hemant Agrawal
> Subject: [PATCH] doc: fix ethdev API port_id parameter size
>
> Fix rte_eth_dev_get_sec_ctx() parameter port_id storage size, from uint8_t
> to uint16_t
>
> Signed-off-by: Ferruh Yigit
> ---

Acked-by: Radu Nicolau
Re: [dpdk-dev] [PATCH v2] doc: update ethdev APIs to return named opaque type
> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Friday, February 9, 2018 10:18 AM
> To: Neil Horman; Mcnamara, John; Kovacevic, Marko
> Cc: dev@dpdk.org; Yigit, Ferruh; Ananyev, Konstantin; Stephen Hemminger; Richardson, Bruce; Thomas Monjalon
> Subject: [PATCH v2] doc: update ethdev APIs to return named opaque type
>
> Ethdev APIs to add callback return the callback object as "void *",
> update return type to actual object type
> "struct rte_eth_rxtx_callback *"
>
> Signed-off-by: Ferruh Yigit
> ---
> Cc: Konstantin Ananyev
> Cc: Stephen Hemminger
> Cc: Bruce Richardson
> Cc: Thomas Monjalon
> ---
>  doc/guides/rel_notes/deprecation.rst | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index bbd9456a7..5cb5a00d2 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -49,6 +49,13 @@ Deprecation Notices
>    rte_eth_dev_get_sec_ctx() is using uint8_t for port_id, which should be
>    uint16_t.
>
> +* ethdev: functions add rx/tx callback will return named opaque type
> +  rte_eth_add_rx_callback(), rte_eth_add_first_rx_callback() and
> +  rte_eth_add_tx_callback() functions currently return callback object as
> +  "void \*" but APIs to delete callbacks get "struct rte_eth_rxtx_callback \*"
> +  as parameter. For consistency functions adding callback will return
> +  "struct rte_eth_rxtx_callback \*" instead of "void * ".
> +
>  * i40e: The default flexible payload configuration which extracts the first 16
>    bytes of the payload for RSS will be deprecated starting from 18.02. If
>    required the previous behavior can be configured using existing flow
> --

Acked-by: Konstantin Ananyev

> 2.14.3
[dpdk-dev] [PATCH 0/2] Vhost & Virtio fixes for -rc4
This two-patch series fixes two regressions met with virtio-user.
The first one is only reproduced when using the vector path; the
second one may be met whatever the Rx function.

Maxime Coquelin (2):
  virtio: fix resuming traffic with rx vector path
  vhost: don't take access_lock on VHOST_USER_RESET_OWNER

 drivers/net/virtio/virtio_rxtx.c        | 34 +++++++++++++++++++---------------
 drivers/net/virtio/virtio_rxtx_simple.c |  2 +-
 drivers/net/virtio/virtio_rxtx_simple.h |  2 +-
 lib/librte_vhost/vhost_user.c           | 10 +++++-----
 4 files changed, 26 insertions(+), 22 deletions(-)

-- 
2.14.3
[dpdk-dev] [PATCH 1/2] virtio: fix resuming traffic with rx vector path
This patch fixes traffic resuming issue seen when using Rx vector path. Fixes: efc83a1e7fc3 ("net/virtio: fix queue setup consistency") Signed-off-by: Tiwei Bie Signed-off-by: Maxime Coquelin --- drivers/net/virtio/virtio_rxtx.c| 34 ++--- drivers/net/virtio/virtio_rxtx_simple.c | 2 +- drivers/net/virtio/virtio_rxtx_simple.h | 2 +- 3 files changed, 21 insertions(+), 17 deletions(-) diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c index 854af399e..505283edd 100644 --- a/drivers/net/virtio/virtio_rxtx.c +++ b/drivers/net/virtio/virtio_rxtx.c @@ -30,6 +30,7 @@ #include "virtio_pci.h" #include "virtqueue.h" #include "virtio_rxtx.h" +#include "virtio_rxtx_simple.h" #ifdef RTE_LIBRTE_VIRTIO_DEBUG_DUMP #define VIRTIO_DUMP_PACKET(m, len) rte_pktmbuf_dump(stdout, m, len) @@ -446,25 +447,28 @@ virtio_dev_rx_queue_setup_finish(struct rte_eth_dev *dev, uint16_t queue_idx) &rxvq->fake_mbuf; } - while (!virtqueue_full(vq)) { - m = rte_mbuf_raw_alloc(rxvq->mpool); - if (m == NULL) - break; + if (hw->use_simple_rx) { + while (vq->vq_free_cnt >= RTE_VIRTIO_VPMD_RX_REARM_THRESH) { + virtio_rxq_rearm_vec(rxvq); + nbufs += RTE_VIRTIO_VPMD_RX_REARM_THRESH; + } + } else { + while (!virtqueue_full(vq)) { + m = rte_mbuf_raw_alloc(rxvq->mpool); + if (m == NULL) + break; - /* Enqueue allocated buffers */ - if (hw->use_simple_rx) - error = virtqueue_enqueue_recv_refill_simple(vq, m); - else + /* Enqueue allocated buffers */ error = virtqueue_enqueue_recv_refill(vq, m); - - if (error) { - rte_pktmbuf_free(m); - break; + if (error) { + rte_pktmbuf_free(m); + break; + } + nbufs++; } - nbufs++; - } - vq_update_avail_idx(vq); + vq_update_avail_idx(vq); + } PMD_INIT_LOG(DEBUG, "Allocated %d bufs", nbufs); diff --git a/drivers/net/virtio/virtio_rxtx_simple.c b/drivers/net/virtio/virtio_rxtx_simple.c index 7247a0822..0a79d1d5b 100644 --- a/drivers/net/virtio/virtio_rxtx_simple.c +++ b/drivers/net/virtio/virtio_rxtx_simple.c @@ -77,7 +77,7 @@ 
virtio_xmit_pkts_simple(void *tx_queue, struct rte_mbuf **tx_pkts, rte_compiler_barrier(); if (nb_used >= VIRTIO_TX_FREE_THRESH) - virtio_xmit_cleanup(vq); + virtio_xmit_cleanup_simple(vq); nb_commit = nb_pkts = RTE_MIN((vq->vq_free_cnt >> 1), nb_pkts); desc_idx = (uint16_t)(vq->vq_avail_idx & desc_idx_max); diff --git a/drivers/net/virtio/virtio_rxtx_simple.h b/drivers/net/virtio/virtio_rxtx_simple.h index 2d8e6b14a..303904d64 100644 --- a/drivers/net/virtio/virtio_rxtx_simple.h +++ b/drivers/net/virtio/virtio_rxtx_simple.h @@ -60,7 +60,7 @@ virtio_rxq_rearm_vec(struct virtnet_rx *rxvq) #define VIRTIO_TX_FREE_NR 32 /* TODO: vq->tx_free_cnt could mean num of free slots so we could avoid shift */ static inline void -virtio_xmit_cleanup(struct virtqueue *vq) +virtio_xmit_cleanup_simple(struct virtqueue *vq) { uint16_t i, desc_idx; uint32_t nb_free = 0; -- 2.14.3
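The difference between the two initial fill strategies in this patch can be modelled without any DPDK code. The sketch below is only an illustration: `ring_model`, `fill_batched`, `fill_single` and the `REARM_THRESH` value are hypothetical names, and it assumes that each vector rearm call consumes exactly one threshold's worth of free descriptors.

```c
#include <stdint.h>

/* Hypothetical batch size standing in for RTE_VIRTIO_VPMD_RX_REARM_THRESH. */
#define REARM_THRESH 32u

/* Minimal model of an Rx ring: only the free descriptor count matters here. */
struct ring_model {
	uint16_t free_cnt;
};

/* Vector-style fill: only whole batches are posted, the remainder stays free. */
static unsigned int
fill_batched(struct ring_model *r)
{
	unsigned int nbufs = 0;

	while (r->free_cnt >= REARM_THRESH) {
		r->free_cnt -= REARM_THRESH; /* models one rearm call */
		nbufs += REARM_THRESH;
	}
	return nbufs;
}

/* Scalar-style fill: one buffer at a time until the ring is full
 * or the (modelled) mempool runs dry. */
static unsigned int
fill_single(struct ring_model *r, unsigned int pool_avail)
{
	unsigned int nbufs = 0;

	while (r->free_cnt > 0 && pool_avail > 0) {
		r->free_cnt--;
		pool_avail--;
		nbufs++;
	}
	return nbufs;
}
```

With 100 free descriptors, the batched fill in this model posts 96 buffers and leaves 4 free, whereas the scalar fill keeps going until either the ring or the pool is exhausted.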
[dpdk-dev] [PATCH 2/2] vhost: don't take access_lock on VHOST_USER_RESET_OWNER
A deadlock happens when handling a VHOST_USER_RESET_OWNER request, for the same reason the lock is not taken for VHOST_USER_GET_VRING_BASE. It is safe not to take the lock, as the queues are no longer used by the application when the virtqueues and the device are reset. Fixes: a3688046995f ("vhost: protect active rings from async ring changes") Cc: sta...@dpdk.org Cc: Victor Kaplansky Signed-off-by: Maxime Coquelin --- lib/librte_vhost/vhost_user.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 65ee33919..90ed2112e 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -1348,16 +1348,16 @@ vhost_user_msg_handler(int vid, int fd) } /* -* Note: we don't lock all queues on VHOST_USER_GET_VRING_BASE, -* since it is sent when virtio stops and device is destroyed. -* destroy_device waits for queues to be inactive, so it is safe. -* Otherwise taking the access_lock would cause a dead lock. +* Note: we don't lock all queues on VHOST_USER_GET_VRING_BASE +* and VHOST_USER_RESET_OWNER, since it is sent when virtio stops +* and device is destroyed. destroy_device waits for queues to be +* inactive, so it is safe. Otherwise taking the access_lock +* would cause a dead lock. */ switch (msg.request.master) { case VHOST_USER_SET_FEATURES: case VHOST_USER_SET_PROTOCOL_FEATURES: case VHOST_USER_SET_OWNER: - case VHOST_USER_RESET_OWNER: case VHOST_USER_SET_MEM_TABLE: case VHOST_USER_SET_LOG_BASE: case VHOST_USER_SET_LOG_FD: -- 2.14.3
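The effect of this change on the lock decision can be sketched as a small predicate. Everything here is hypothetical (the `demo_request` enum, its values and `request_locks_queues()` are not part of librte_vhost); it only mirrors the shape of the switch statement in the patch, where requests that are sent while the device is being torn down skip the per-queue lock.

```c
#include <stdbool.h>

/* Hypothetical request codes; the real ones live in the vhost-user headers. */
enum demo_request {
	DEMO_SET_FEATURES,
	DEMO_SET_OWNER,
	DEMO_RESET_OWNER,
	DEMO_SET_MEM_TABLE,
	DEMO_GET_VRING_BASE,
};

/* Returns true when the handler should take every queue's access lock
 * before processing the request. */
static bool
request_locks_queues(enum demo_request req)
{
	switch (req) {
	case DEMO_SET_FEATURES:
	case DEMO_SET_OWNER:
	case DEMO_SET_MEM_TABLE:
		return true;
	case DEMO_RESET_OWNER:    /* no longer locked after this patch */
	case DEMO_GET_VRING_BASE: /* was already unlocked */
	default:
		return false;
	}
}
```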
Re: [dpdk-dev] [PATCH] doc: add ABI change notice for numa_node_count in eal
On Wed, Feb 07, 2018 at 03:40:20PM +0530, Jerin Jacob wrote: > -Original Message- > > Date: Tue, 23 Jan 2018 10:39:58 + > > From: "Mcnamara, John" > > To: "Burakov, Anatoly" , "dev@dpdk.org" > > > > CC: Neil Horman , "Kovacevic, Marko" > > > > Subject: Re: [dpdk-dev] [PATCH] doc: add ABI change notice for > > numa_node_count in eal > > > > > > > > > -Original Message- > > > From: Burakov, Anatoly > > > Sent: Tuesday, January 16, 2018 5:54 PM > > > To: dev@dpdk.org > > > Cc: Neil Horman ; Mcnamara, John > > > ; Kovacevic, Marko > > > Subject: [PATCH] doc: add ABI change notice for numa_node_count in eal > > > > > > There will be a new function added in v18.05 that will return number of > > > detected sockets, which will change the ABI. > > > > > > Signed-off-by: Anatoly Burakov > > > --- > > > doc/guides/rel_notes/deprecation.rst | 2 ++ > > > 1 file changed, 2 insertions(+) > > > > > > diff --git a/doc/guides/rel_notes/deprecation.rst > > > b/doc/guides/rel_notes/deprecation.rst > > > index 13e8543..9662150 100644 > > > --- a/doc/guides/rel_notes/deprecation.rst > > > +++ b/doc/guides/rel_notes/deprecation.rst > > > @@ -8,6 +8,8 @@ API and ABI deprecation notices are to be posted here. > > > Deprecation Notices > > > --- > > > > > > +* eal: new ``numa_node_count`` member will be added to ``rte_config`` > > > +structure in v18.05. > > > * eal: several API and ABI changes are planned for ``rte_devargs`` in > > > v18.02. > > > > In general it is best to leave a blank line between the bullet points. But > > that > > doesn't affect the rendering so: > > > > Acked-by: John McNamara > > Acked-by: Jerin Jacob > Acked-by: Bruce Richardson
Re: [dpdk-dev] [PATCH] doc: fix ethdev API port_id parameter size
> > > > Fix rte_eth_dev_get_sec_ctx() parameter port_id storage size, form > > uint8_t to uint16_t > > > > Signed-off-by: Ferruh Yigit > > --- > Acked-by: Radu Nicolau Acked-by: Hemant Agrawal
Re: [dpdk-dev] [PATCH] examples: update copyrights and license
Hi Lee, On 1/23/2018 10:16 PM, Lee Daly wrote: This updates the Intel, Cavium and Hasan Alayli license on files in examples to be the standard BSD-3-Clause license used for the rest of DPDK, bringing the files in compliance with the DPDK licensing policy. Please change the patch commit msg to examples/performance-thread: Signed-off-by: Lee Daly --- .../performance-thread/common/arch/x86/stack.h | 61 ++ 1 file changed, 5 insertions(+), 56 deletions(-) diff --git a/examples/performance-thread/common/arch/x86/stack.h b/examples/performance-thread/common/arch/x86/stack.h index 98723ba..2c31f7c 100644 --- a/examples/performance-thread/common/arch/x86/stack.h +++ b/examples/performance-thread/common/arch/x86/stack.h @@ -1,66 +1,15 @@ -/*- - * BSD LICENSE - * - * Copyright(c) 2015 Intel Corporation. All rights reserved. - * Copyright(c) Cavium, Inc. 2017. - * All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * - * * Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * * Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in - * the documentation and/or other materials provided with the - * distribution. - * * Neither the name of Intel Corporation nor the names of its - * contributors may be used to endorse or promote products derived - * from this software without specific prior written permission. - * - * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - * A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT - * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, - * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT - * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, - * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY - * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT - * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE - * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2015 Intel Corporation. + * Copyright(c) Cavium, Inc. 2017. + * All rights reserved */ /* * Some portions of this software is derived from the - * https://github.com/halayli/lthread which carrys the following license. - * + * https://github.com/halayli/lthread which carries the following license. * Copyright (C) 2012, Hasan Alayli I suggest the following: > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2015 Intel Corporation. > + * Copyright(c) Cavium, Inc. 2017. > + * All rights reserved > * Copyright (C) 2012, Hasan Alayli * Portions derived from: https://github.com/halayli/lthread * With permission from Hasan Alayli to use them as BSD-3-Clause > */ Regards, Hemant - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * 1. Redistributions of source code must retain the above copyright - *notice, this list of conditions and the following disclaimer. - * 2. Redistributions in binary form must reproduce the above copyright - *notice, this list of conditions and the following disclaimer in the - *documentation and/or other materials provided with the distribution. 
- * - * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE - * ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF - * SUCH DAMAGE. */ - #ifndef STACK_H #define STACK_H
[dpdk-dev] [PATCH v5] checkpatches.sh: Add checks for ABI symbol addition
Recently, some additional patches were added to allow for programmatic marking of C symbols as experimental. The addition of these markers is dependent on the manual addition of exported symbols to the EXPERIMENTAL section of the corresponding library's version map file. The consensus on review is that, in addition to mandating the addition of symbols to the EXPERIMENTAL version in the map, we need a mechanism to enforce our documented process of mandating that addition when they are introduced. To that end, I am proposing this change. It is an addition to the checkpatches script, which scans incoming patches for additions and removals of symbols to the map file, and warns the user appropriately. Signed-off-by: Neil Horman CC: tho...@monjalon.net CC: john.mcnam...@intel.com CC: bruce.richard...@intel.com CC: Ferruh Yigit CC: Stephen Hemminger --- Change notes v2) * Cleaned up and documented awk script (shemminger) * fixed sort/uniq usage (shemminger) * moved checking to new script (tmonjalon) * added maintainer entry (tmonjalon) * added license (tmonjalon) v3) * Changed symbol check script name (tmonjalon) * Trapped exit to clean temp file (tmonjalon) * Honored verbose command (tmonjalon) * Cleaned left over debug bits (tmonjalon) * Updated location in MAINTAINERS file (tmonjalon) v4) * Updated maintainers file (tmonjalon) v5) * undo V4 (tmonjalon) --- MAINTAINERS | 1 + devtools/check-symbol-change.sh | 146 devtools/checkpatches.sh| 23 ++- 3 files changed, 169 insertions(+), 1 deletion(-) create mode 100755 devtools/check-symbol-change.sh diff --git a/MAINTAINERS b/MAINTAINERS index acd056134..d9d2abff8 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -86,6 +86,7 @@ M: Neil Horman F: lib/librte_compat/ F: doc/guides/rel_notes/deprecation.rst F: devtools/validate-abi.sh +F: devtools/check-symbol-change.sh F: buildtools/check-experimental-syms.sh Driver information diff --git a/devtools/check-symbol-change.sh b/devtools/check-symbol-change.sh new file mode 100755 index 
0..22b17e6f2 --- /dev/null +++ b/devtools/check-symbol-change.sh @@ -0,0 +1,146 @@ +#!/bin/sh + +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2018 Neil Horman + +build_map_changes() +{ + local fname=$1 + local mapdb=$2 + + cat $fname | filterdiff -i *.map | awk ' + # Initialize our variables + BEGIN {map="";sym="";ar="";sec=""; in_sec=0} + + # Anything that starts with + or -, followed by an a + # and ends in the string .map is the name of our map file + # This may appear multiple times in a patch if multiple + # map files are altered, and all section/symbol names + # appearing between a triggering of this rule and the + # next trigger of this rule are associated with this file + /[-+] a\/.*\.map/ {map=$2} + + # Triggering this rule, which starts a line with a + and ends it + # with a { identifies a versioned section. The section name is + # the rest of the line with the + and { symbols removed. + # Triggering this rule sets in_sec to 1, which activates the + # symbol rule below + /+.*{/ {gsub("+","");sec=$1; in_sec=1} + + # This rule identifies the end of a section, and disables the + # symbol rule + /.*}/ {in_sec=0} + + # This rule matches on a + followed by any characters except a : + # (which denotes a global vs local segment), and ends with a ;. + # The semicolon is removed and the symbol is printed with its + # association file name and version section, along with an + # indicator that the symbol is a new addition. Note this rule + # only works if we have found a version section in the rule + # above (hence the in_sec check). Otherwise we flag it as an + # unknown section + /^+[^}].*[^:*];/ {gsub(";","");sym=$2; + if (in_sec == 1) { + print map " " sym " " sec " add" + } else { + print map " " sym " unknown add" + } + } + + # This is the same rule as above, but the rule matches on a + # leading - rather than a +, denoting that the symbol is being + # removed. 
+ /^-[^}].*[^:*];/ {gsub(";","");sym=$2; + if (in_sec == 1) { + print map " " sym " " sec " del" + } else { + print map " " sym " unknown del" + } + }' > ./$mapdb + + sort -u $mapdb > ./$mapdb.2 + mv -f $mapdb.2
[dpdk-dev] [PATCH] Add myself as driver information maintainer
I wrote pmdinfogen initially, and since there isn't a maintainer for it, I'll volunteer to take care of it Signed-off-by: Neil Horman CC: Thomas Monjalon --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index d9d2abff8..d1ef43479 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -90,6 +90,7 @@ F: devtools/check-symbol-change.sh F: buildtools/check-experimental-syms.sh Driver information +M: Neil Horman F: buildtools/pmdinfogen/ F: usertools/dpdk-pmdinfo.py F: doc/guides/tools/pmdinfo.rst -- 2.14.3
Re: [dpdk-dev] [PATCH] Add myself as driver information maintainer
09/02/2018 16:23, Neil Horman: > I wrote pmdinfogen initially, and since there isn't a maintainer for it, > I'll volunteer to take care of it > > Signed-off-by: Neil Horman Acked-by: Thomas Monjalon Applied, thanks Neil
Re: [dpdk-dev] [PATCH v1] doc: update mlx4 flow limitations
Hi Ophir, On Thu, Feb 08, 2018 at 06:55:54AM +, Ophir Munk wrote: > From: Moti Haimovsky Relatively minor, patch author differs from the only sign off below, I don't think it's on purpose. > This patch updates mlx4 documentation with flow > configuration limitations imposed by NIC hardware and > PMD implementation > > Signed-off-by: Ophir Munk Another nit, don't hesitate to spread commit logs to their maximum width of 75 chars. We're not writing poetry :) More comments below. > --- > doc/guides/nics/mlx4.rst | 77 > > 1 file changed, 77 insertions(+) > > diff --git a/doc/guides/nics/mlx4.rst b/doc/guides/nics/mlx4.rst > index 98b9716..b81a875 100644 > --- a/doc/guides/nics/mlx4.rst > +++ b/doc/guides/nics/mlx4.rst > @@ -515,3 +515,80 @@ devices managed by librte_pmd_mlx4. >Port 3 Link Up - speed 4 Mbps - full-duplex >Done >testpmd> > + > +Limitations > +--- > + > +Flow rules > +~~ While documenting flow rule limitations is a good idea, I think this approach is not ideal. rte_flow being unbounded, no PMD can support all possible combinations. It's much easier and useful to list what is implemented or more importantly, *tested* and known to work. Ideally it should be written in a format that can be reused by other PMDs for consistency and divided in sections sorted by order of usefulness, something like: 1. Overview of supported combinations of attributes, patterns and actions. 2. Detailed description of each supported pattern (item combinations). 3. Detailed description of all supported action combinations. 4. Description of each supported pattern item and its quirks. 5. Description of each supported action and its quirks. > + > +L2 (eth) > + You need to make clear you're documenting RTE_FLOW_ITEM_TYPE_ETH when used at the beginning of a flow rule, for instance it doesn't apply to an inner ETH used after VXLAN since mlx4 can't match those yet. That's why it's important to also describe the supported combinations. 
> + > +- Can only use real destination MAC > +- Source MAC is not taken into consideration Neither EtherType (please end all sentences with periods). > + > + For example using testpmd command - src mask must be 00:00:00:00:00:00 > + otherwise the following command will fail > + > +.. code-block:: console > + > + testpmd> flow create 1 ingress pattern eth > + src spec 00:16:3e:2b:e6:47 src mask FF:FF:FF:FF:FF:FF > + / end actions drop / end Remember you're documenting API support limitations primarily for application writers who are not necessarily familiar with testpmd. The problem here is also that "src" is in fact an attribute of either "spec", "last" or "mask", not the other way around, hence you should refer to struct rte_flow_item / rte_flow_item_eth and its fields instead of using a testpmd example. > + > +- Supports only full MASK > + > + For example the following testpmd command will fail > + > +.. code-block:: console > + > + testpmd> flow create 1 ingress pattern eth > + src spec 00:16:3e:2b:e6:47 > + dst spec 4A:11:6C:FA:60:D0 dst mask FF:00:FF:FF:FF:00 > + / end actions drop / end > + Providing spec but no mask for src is valid (mask remains 0), however it's certainly a trap for unsuspecting readers unfamiliar with the flow command. Also providing examples is not bad in itself but they should not appear in the middle of an enumeration list as it makes them unclear. > + > +- When configured to run in promiscuous or all-multicast modes does > + not support additional rules This wording is misleading, the actual limitation is you can't provide additional items in a pattern if you want to match any destination MAC (mask.dst == 0) or only multicast traffic (spec.dst & mask.dst == 01:00:00:00:00:00). > +- Does not support the explicit exclusion of all multicast traffic > +- Does not support partial VLAN TCI VID matching This last item actually documents RTE_FLOW_ITEM_TYPE_VLAN. > + > +L3 (ipv4) > +^ > + > +- Supports only 0 or full mask. 
Prerequisites: Need to have eth dst spec Matching all fields of IPv4 headers is not supported, only source and destination. Not a single word about the lack of IPv6 support? > + > +L4 (tcp/udp) > + > + > +- Supports only full mask Only on source and destination ports. Empty masks are also supported. > + For example the following testpmd command will fail > + > +.. code-block:: console > + > + testpmd> flow create 0 ingress pattern eth > + src spec e4:1d:2d:2d:8d:22 > + dst spec 00:15:5D:10:8D:00 dst mask FF:FF:FF:FF:FF:FF > + / ipv4 src spec 144.144.92.0 src prefix 16 > + / end actions drop / end Neither TCP nor UDP are part of this example. > + Prerequisites: Need to have eth dst spec and IPv4 before it with all > + its limitations > + > +Flow actions > +~
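The "empty or full mask only" rule that comes up several times in this review can be expressed as a check on the mask bytes rather than as a testpmd command. This is a sketch under assumptions: `mac_mask_is_empty_or_full()` is a hypothetical helper, not part of the mlx4 PMD or of rte_flow, and it only covers a 6-byte Ethernet address mask.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAC_LEN 6

/* True when every byte of the mask is 0x00 (field ignored) or every
 * byte is 0xff (exact match). Partial masks are rejected. */
static bool
mac_mask_is_empty_or_full(const uint8_t mask[DEMO_MAC_LEN])
{
	bool all_zero = true;
	bool all_ones = true;
	size_t i;

	for (i = 0; i < DEMO_MAC_LEN; i++) {
		if (mask[i] != 0x00)
			all_zero = false;
		if (mask[i] != 0xff)
			all_ones = false;
	}
	return all_zero || all_ones;
}
```

Under this check, the mask FF:00:FF:FF:FF:00 from the quoted example is rejected, while 00:00:00:00:00:00 and FF:FF:FF:FF:FF:FF are both accepted.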
Re: [dpdk-dev] [PATCH v1] doc: update mlx4 flow limitations
09/02/2018 17:21, Adrien Mazarguil:
> This section is titled "Limitations" but contains a mix of features,
> limitations and quirks, more like "Random thoughts regarding rte_flow
> support". I think this is not what users might expect from such a
> documentation which must be exhaustive and consistent. Getting there may
> involve tables.

+Cc Ferruh

> My suggestion is to first get everyone agree on a common rte_flow
> capabilities documentation format all PMDs could reuse and then fill in the
> blanks. What's your opinion?

I think it's better to have some random thoughts than nothing.
All the comments you gave in this thread deserve to be written in the
documentation as soon as possible.
Working on a better standardized documentation (longer term) should not
prevent us from writing some messy notes in the meantime.

Are there already some similar rte_flow notes in other PMD docs?

About the common documentation, are you thinking of a generated table
like the one used for other features?
Do you plan to provide a template or an example?
[dpdk-dev] [PATCH] doc: add change notice for mbuf sched field
Signed-off-by: Cristian Dumitrescu Acked-by: Jasvinder Singh Acked-by: Roy Fan Zhang Acked-by: Kevin Laatz --- doc/guides/rel_notes/deprecation.rst | 5 + 1 file changed, 5 insertions(+) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index d59ad59..db4fea3 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -59,3 +59,8 @@ Deprecation Notices be added between the producer and consumer structures. The size of the structure and the offset of the fields will remain the same on platforms with 64B cache line, but will change on other platforms. + +* mbuf: The opaque mbuf->hash.sched field will be updated to support generic + definition in line with the ethdev TM and MTR APIs. Currently, this field + is defined in librte_sched in a non-generic way. The new generic format + will contain: queue ID, traffic class, color. Field size will not change. -- 2.7.4
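To illustrate what "field size will not change" implies, here is one possible layout that carries a queue ID, a traffic class and a color in the same number of bytes as a two-word opaque field. This is purely a sketch: the struct names, member names and widths are hypothetical, and the 8-byte size of the current field is an assumption, not something stated in the notice.

```c
#include <stdint.h>

/* Hypothetical generic layout; names and widths are illustrative only. */
struct demo_sched_generic {
	uint32_t queue_id;      /* queue ID */
	uint8_t  traffic_class; /* traffic class */
	uint8_t  color;         /* packet color */
	uint16_t reserved;      /* padding to keep the size unchanged */
};

/* Stand-in for the current opaque field, assumed to be two 32-bit words. */
struct demo_sched_opaque {
	uint32_t lo;
	uint32_t hi;
};

/* Compile-time check that the generic layout keeps the same size. */
_Static_assert(sizeof(struct demo_sched_generic) ==
	       sizeof(struct demo_sched_opaque),
	       "generic layout must not change the field size");
```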
[dpdk-dev] [PATCH v2] vhost: fix check if cmsg is NULL
Fixes: 8f972312b8f4 ("vhost: support vhost-user") Cc: jianfeng@intel.com Cc: sta...@dpdk.org Signed-off-by: Pawel Wodkowski Signed-off-by: Tomasz Kulasek --- v2 changes: - Changed fixline to point right commit --- lib/librte_vhost/socket.c | 5 + 1 file changed, 5 insertions(+) diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c index 83befdced..8fd47a4d8 100644 --- a/lib/librte_vhost/socket.c +++ b/lib/librte_vhost/socket.c @@ -153,6 +153,11 @@ send_fd_message(int sockfd, char *buf, int buflen, int *fds, int fd_num) msgh.msg_control = control; msgh.msg_controllen = sizeof(control); cmsg = CMSG_FIRSTHDR(&msgh); + if (cmsg == NULL) { + RTE_LOG(ERR, VHOST_CONFIG, "cmsg == NULL\n"); + errno = EINVAL; + return -1; + } cmsg->cmsg_len = CMSG_LEN(fdsize); cmsg->cmsg_level = SOL_SOCKET; cmsg->cmsg_type = SCM_RIGHTS; -- 2.14.1
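For readers unfamiliar with the control-message macros, the pattern being hardened here looks roughly like the following outside of vhost. `demo_send_fd()` is a hypothetical helper that passes a single descriptor; only the POSIX socket calls and CMSG macros are real, and the NULL check on `CMSG_FIRSTHDR()` is the same guard this patch adds.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send one data byte plus one file descriptor over a UNIX socket.
 * Returns 0 on success, -1 on error (errno is set). */
static int
demo_send_fd(int sockfd, int fd)
{
	char byte = 'x';
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	/* The union keeps the control buffer aligned for struct cmsghdr. */
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} control;
	struct msghdr msgh;
	struct cmsghdr *cmsg;

	memset(&msgh, 0, sizeof(msgh));
	memset(&control, 0, sizeof(control));
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;
	msgh.msg_control = control.buf;
	msgh.msg_controllen = sizeof(control.buf);

	/* CMSG_FIRSTHDR() returns NULL when the control buffer is too
	 * small to hold a header, so check before dereferencing. */
	cmsg = CMSG_FIRSTHDR(&msgh);
	if (cmsg == NULL) {
		errno = EINVAL;
		return -1;
	}
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sockfd, &msgh, 0) < 0 ? -1 : 0;
}
```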
[dpdk-dev] [PATCH] vhost: fix close callfd on get vring base
This prevents the user device from being destroyed and recreated in an "incomplete" vring state. virtio_is_ready() was returning true for devices with vrings which did not have a valid callfd (their VHOST_USER_SET_VRING_CALL had not arrived yet). Fixes: 8f972312b8f4 ("vhost: support vhost-user") Cc: huawei@intel.com Cc: sta...@dpdk.org Signed-off-by: Dariusz Stojaczyk Signed-off-by: Tomasz Kulasek --- lib/librte_vhost/vhost_user.c | 5 + 1 file changed, 5 insertions(+) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 65ee33919..dd8682c09 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -897,6 +897,11 @@ vhost_user_get_vring_base(struct virtio_net *dev, vq->kickfd = VIRTIO_UNINITIALIZED_EVENTFD; + if (vq->callfd >= 0) + close(vq->callfd); + + vq->callfd = VIRTIO_UNINITIALIZED_EVENTFD; + if (dev->dequeue_zero_copy) free_zmbufs(vq); rte_free(vq->shadow_used_ring); -- 2.14.1
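The close-and-reset idiom applied to callfd here can be shown in isolation. In this sketch `demo_vq`, `demo_reset_eventfds()` and the sentinel value -2 are assumptions made for the example; the point is only that a descriptor is closed if it is open and then replaced by a marker that a later readiness check can recognise.

```c
#include <unistd.h>

/* Assumed sentinel meaning "no descriptor received yet". */
#define DEMO_UNINITIALIZED_EVENTFD (-2)

struct demo_vq {
	int kickfd;
	int callfd;
};

/* Close both event descriptors if open and mark them uninitialized,
 * so the queue is not considered ready until new ones arrive. */
static void
demo_reset_eventfds(struct demo_vq *vq)
{
	if (vq->kickfd >= 0)
		close(vq->kickfd);
	vq->kickfd = DEMO_UNINITIALIZED_EVENTFD;

	if (vq->callfd >= 0)
		close(vq->callfd);
	vq->callfd = DEMO_UNINITIALIZED_EVENTFD;
}
```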
[dpdk-dev] [PATCH] vhost: fix double free on shutdown
The vhost connection can be closed concurrently from 2 places: * the connection thread itself * rte_vhost_driver_unregister The connection thread will terminate the connection if any recv error occurred. The unregister function will terminate the connection together with the thread. However, there is no synchronization between those two. The connection thread runs in the background without any mutex. The rte_vhost_driver_unregister now signals the connection thread to terminate itself and waits until it's killed. Fixes: 65388b43f592 ("vhost: fix fd leaks for vhost-user server mode") Cc: yuanhan@linux.intel.com Cc: sta...@dpdk.org Signed-off-by: Dariusz Stojaczyk Signed-off-by: Tomasz Kulasek --- lib/librte_vhost/socket.c | 21 - 1 file changed, 8 insertions(+), 13 deletions(-) diff --git a/lib/librte_vhost/socket.c b/lib/librte_vhost/socket.c index 83befdced..46ac88efd 100644 --- a/lib/librte_vhost/socket.c +++ b/lib/librte_vhost/socket.c @@ -735,7 +735,7 @@ rte_vhost_driver_unregister(const char *path) { int i; int count; - struct vhost_user_connection *conn, *next; + struct vhost_user_connection *conn; pthread_mutex_lock(&vhost_user.mutex); @@ -752,22 +752,17 @@ rte_vhost_driver_unregister(const char *path) } pthread_mutex_lock(&vsocket->conn_mutex); - for (conn = TAILQ_FIRST(&vsocket->conn_list); -conn != NULL; -conn = next) { - next = TAILQ_NEXT(conn, next); - - fdset_del(&vhost_user.fdset, conn->connfd); - RTE_LOG(INFO, VHOST_CONFIG, - "free connfd = %d for device '%s'\n", - conn->connfd, path); + TAILQ_FOREACH(conn, &vsocket->conn_list, next) { close(conn->connfd); - vhost_destroy_device(conn->vid); - TAILQ_REMOVE(&vsocket->conn_list, conn, next); - free(conn); } pthread_mutex_unlock(&vsocket->conn_mutex); + do { + pthread_mutex_lock(&vsocket->conn_mutex); + conn = TAILQ_FIRST(&vsocket->conn_list); + pthread_mutex_unlock(&vsocket->conn_mutex); + } while (conn != NULL); + pthread_mutex_destroy(&vsocket->conn_mutex); free(vsocket->path); free(vsocket); -- 2.14.1
[dpdk-dev] [PATCH] vhost: fix realloc failure
When reallocation of guest pages fails, vhost_user_set_mem_table() also should fail. Fixes: e246896178e6 ("vhost: get guest/host physical address mappings") Cc: yuanhan@linux.intel.com Cc: sta...@dpdk.org Signed-off-by: Ziye Yang Signed-off-by: Tomasz Kulasek --- lib/librte_vhost/vhost_user.c | 29 +++-- 1 file changed, 23 insertions(+), 6 deletions(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index fc1f1a948..4357b88e0 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -525,7 +525,7 @@ vhost_user_set_vring_base(struct virtio_net *dev, return 0; } -static void +static int add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr, uint64_t host_phys_addr, uint64_t size) { @@ -535,6 +535,10 @@ add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr, dev->max_guest_pages *= 2; dev->guest_pages = realloc(dev->guest_pages, dev->max_guest_pages * sizeof(*page)); + if (!dev->guest_pages) { + RTE_LOG(ERR, VHOST_CONFIG, "cannot realloc guest_pages\n"); + return -1; + } } if (dev->nr_guest_pages > 0) { @@ -543,7 +547,7 @@ add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr, if (host_phys_addr == last_page->host_phys_addr + last_page->size) { last_page->size += size; - return; + return 0; } } @@ -551,9 +555,11 @@ add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr, page->guest_phys_addr = guest_phys_addr; page->host_phys_addr = host_phys_addr; page->size = size; + + return 0; } -static void +static int add_guest_pages(struct virtio_net *dev, struct rte_vhost_mem_region *reg, uint64_t page_size) { @@ -567,7 +573,9 @@ add_guest_pages(struct virtio_net *dev, struct rte_vhost_mem_region *reg, size = page_size - (guest_phys_addr & (page_size - 1)); size = RTE_MIN(size, reg_size); - add_one_guest_page(dev, guest_phys_addr, host_phys_addr, size); + if (add_one_guest_page(dev, guest_phys_addr, host_phys_addr, size) < 0) + return -1; + host_user_addr += size; 
guest_phys_addr += size; reg_size -= size; @@ -576,12 +584,16 @@ add_guest_pages(struct virtio_net *dev, struct rte_vhost_mem_region *reg, size = RTE_MIN(reg_size, page_size); host_phys_addr = rte_mem_virt2iova((void *)(uintptr_t) host_user_addr); - add_one_guest_page(dev, guest_phys_addr, host_phys_addr, size); + if (add_one_guest_page(dev, guest_phys_addr, host_phys_addr, + size) < 0) + return -1; host_user_addr += size; guest_phys_addr += size; reg_size -= size; } + + return 0; } #ifdef RTE_LIBRTE_VHOST_DEBUG @@ -734,7 +746,12 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg) mmap_offset; if (dev->dequeue_zero_copy) - add_guest_pages(dev, reg, alignment); + if (add_guest_pages(dev, reg, alignment) < 0) { + RTE_LOG(ERR, VHOST_CONFIG, + "adding guest pages to region %u failed.\n", + i); + goto err_mmap; + } RTE_LOG(INFO, VHOST_CONFIG, "guest memory region %u, size: 0x%" PRIx64 "\n" -- 2.14.1
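The error propagation in this patch follows a common pattern for growable arrays. The sketch below shows that pattern with hypothetical types (`demo_dev`, `demo_page`, `demo_add_page()`). Note that it is a variation and not what the patch does: it stores the realloc() result in a temporary first, so the old array is not lost if the allocation fails, whereas the patch assigns the result directly to dev->guest_pages.

```c
#include <stddef.h>
#include <stdlib.h>

struct demo_page {
	unsigned long long guest_phys_addr;
	unsigned long long host_phys_addr;
	unsigned long long size;
};

struct demo_dev {
	struct demo_page *pages;
	size_t nr_pages;
	size_t max_pages;
};

/* Append one page, doubling the array when it is full.
 * Returns 0 on success, -1 if the array could not grow. */
static int
demo_add_page(struct demo_dev *dev, struct demo_page page)
{
	if (dev->nr_pages == dev->max_pages) {
		size_t new_max = dev->max_pages ? dev->max_pages * 2 : 8;
		struct demo_page *tmp;

		tmp = realloc(dev->pages, new_max * sizeof(*tmp));
		if (tmp == NULL)
			return -1; /* dev->pages is untouched and still valid */
		dev->pages = tmp;
		dev->max_pages = new_max;
	}
	dev->pages[dev->nr_pages++] = page;
	return 0;
}
```

A caller can then stop at the first failure and unwind, in the same way add_guest_pages() and vhost_user_set_mem_table() do in the patch.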
[dpdk-dev] [PATCH] vhost: fix remove macro name conflict
LOG_DEBUG is a symbol defined by POSIX, so if sys/log.h is included the symbols conflict. This patch changes LOG_DEBUG to VHOST_LOG_DEBUG. Fixes: 1c01d52392d5 ("vhost: add debug print") Cc: huawei@intel.com Cc: sta...@dpdk.org Signed-off-by: Ben Walker Signed-off-by: Tomasz Kulasek --- lib/librte_vhost/vhost.h | 13 +++-- lib/librte_vhost/vhost_user.c | 10 +- lib/librte_vhost/virtio_net.c | 16 3 files changed, 20 insertions(+), 19 deletions(-) diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h index d947bc9e3..319cc6620 100644 --- a/lib/librte_vhost/vhost.h +++ b/lib/librte_vhost/vhost.h @@ -296,8 +296,9 @@ vhost_log_used_vring(struct virtio_net *dev, struct vhost_virtqueue *vq, #ifdef RTE_LIBRTE_VHOST_DEBUG #define VHOST_MAX_PRINT_BUFF 6072 -#define LOG_LEVEL RTE_LOG_DEBUG -#define LOG_DEBUG(log_type, fmt, args...) RTE_LOG(DEBUG, log_type, fmt, ##args) +#define VHOST_LOG_LEVEL RTE_LOG_DEBUG +#define VHOST_LOG_DEBUG(log_type, fmt, args...) \ + RTE_LOG(DEBUG, log_type, fmt, ##args) #define PRINT_PACKET(device, addr, size, header) do { \ char *pkt_addr = (char *)(addr); \ unsigned int index; \ @@ -313,11 +314,11 @@ vhost_log_used_vring(struct virtio_net *dev, struct vhost_virtqueue *vq, } \ snprintf(packet + strnlen(packet, VHOST_MAX_PRINT_BUFF), VHOST_MAX_PRINT_BUFF - strnlen(packet, VHOST_MAX_PRINT_BUFF), "\n"); \ \ - LOG_DEBUG(VHOST_DATA, "%s", packet); \ + VHOST_LOG_DEBUG(VHOST_DATA, "%s", packet); \ } while (0) #else -#define LOG_LEVEL RTE_LOG_INFO -#define LOG_DEBUG(log_type, fmt, args...) do {} while (0) +#define VHOST_LOG_LEVEL RTE_LOG_INFO +#define VHOST_LOG_DEBUG(log_type, fmt, args...) 
do {} while (0) #define PRINT_PACKET(device, addr, size, header) do {} while (0) #endif @@ -411,7 +412,7 @@ vhost_vring_call(struct virtio_net *dev, struct vhost_virtqueue *vq) uint16_t old = vq->signalled_used; uint16_t new = vq->last_used_idx; - LOG_DEBUG(VHOST_DATA, "%s: used_event_idx=%d, old=%d, new=%d\n", + VHOST_LOG_DEBUG(VHOST_DATA, "%s: used_event_idx=%d, old=%d, new=%d\n", __func__, vhost_used_event(vq), old, new); diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 65ee33919..dc38cdeb2 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -181,7 +181,7 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features) } else { dev->vhost_hlen = sizeof(struct virtio_net_hdr); } - LOG_DEBUG(VHOST_CONFIG, + VHOST_LOG_DEBUG(VHOST_CONFIG, "(%d) mergeable RX buffers %s, virtio 1 %s\n", dev->vid, (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ? "on" : "off", @@ -461,13 +461,13 @@ translate_ring_addresses(struct virtio_net *dev, int vq_index) vq->log_guest_addr = addr->log_guest_addr; - LOG_DEBUG(VHOST_CONFIG, "(%d) mapped address desc: %p\n", + VHOST_LOG_DEBUG(VHOST_CONFIG, "(%d) mapped address desc: %p\n", dev->vid, vq->desc); - LOG_DEBUG(VHOST_CONFIG, "(%d) mapped address avail: %p\n", + VHOST_LOG_DEBUG(VHOST_CONFIG, "(%d) mapped address avail: %p\n", dev->vid, vq->avail); - LOG_DEBUG(VHOST_CONFIG, "(%d) mapped address used: %p\n", + VHOST_LOG_DEBUG(VHOST_CONFIG, "(%d) mapped address used: %p\n", dev->vid, vq->used); - LOG_DEBUG(VHOST_CONFIG, "(%d) log_guest_addr: %" PRIx64 "\n", + VHOST_LOG_DEBUG(VHOST_CONFIG, "(%d) log_guest_addr: %" PRIx64 "\n", dev->vid, vq->log_guest_addr); return dev; diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c index 700aca7ce..ed7198dbb 100644 --- a/lib/librte_vhost/virtio_net.c +++ b/lib/librte_vhost/virtio_net.c @@ -295,7 +295,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id, uint16_t used_idx; uint32_t i, sz; - 
LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__); + VHOST_LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__); if (unlikely(!is_valid_virt_queue_idx(queue_id, 0, dev->nr_vring))) { RTE_LOG(ERR, VHOST_DATA, "(%d) %s: invalid virtqueue idx %d.\n", dev->vid, __func__, queue_id); @@ -327,7 +327,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id, if (count == 0) goto out; - LOG_DEBUG(VHOST_DATA, "(%d) start_idx %d | end_idx %d\n", + VHOST_LOG_DEBUG(VHOST_DATA, "(%d) start_idx %d | end_idx %d\n", dev->vid, start_idx, start_idx + count); vq->batch_copy_nb_elems = 0; @@ -524,7 +524,7 @@ copy_mbuf_to_desc_mergeable(struct virtio_net *dev, struct vhost_virtqueue *vq, hdr_phys_addr = buf_vec[vec_
[dpdk-dev] [PATCH] vhost: fix return avail ring position in get vring base
According to the "Vhost-user Protocol" document, VHOST_USER_GET_VRING_BASE
should get the available vring base offset.

Fixes: 8f972312b8f4 ("vhost: support vhost-user")
Cc: huawei@intel.com
Cc: sta...@dpdk.org

Signed-off-by: Pawel Wodkowski
Signed-off-by: Tomasz Kulasek
---
 lib/librte_vhost/vhost_user.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 65ee33919..04eee3a3a 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -881,8 +881,8 @@ vhost_user_get_vring_base(struct virtio_net *dev,
 
 	dev->flags &= ~VIRTIO_DEV_READY;
 
-	/* Here we are safe to get the last used index */
-	msg->payload.state.num = vq->last_used_idx;
+	/* Here we are safe to get the last avail index */
+	msg->payload.state.num = vq->last_avail_idx;
 
 	RTE_LOG(INFO, VHOST_CONFIG,
 		"vring base idx:%d file:%d\n", msg->payload.state.index,
-- 
2.14.1
[dpdk-dev] [PATCH] vhost: fix wait for valid descriptor
Each virtqueue's kickfd and callfd can be in one of two invalid states:
VIRTIO_UNINITIALIZED_EVENTFD and VIRTIO_INVALID_EVENTFD. Don't mark the
virtqueue as ready until valid descriptors have been received for both.
This is safe for polling mode drivers in the guest OS: the backend vhost
process will not post notifications to the interrupt vector when the guest
runs a PMD, but the interrupt vector is still valid.

Fixes: e049ca6d10e0 ("vhost-user: prepare multiple queue setup")
Cc: yuanhan@linux.intel.com
Cc: sta...@dpdk.org

Signed-off-by: Changpeng Liu
Signed-off-by: Tomasz Kulasek
---
 lib/librte_vhost/vhost_user.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 65ee33919..4508f697b 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -766,7 +766,9 @@ vq_is_ready(struct vhost_virtqueue *vq)
 {
 	return vq && vq->desc && vq->avail && vq->used &&
 	       vq->kickfd != VIRTIO_UNINITIALIZED_EVENTFD &&
-	       vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD;
+	       vq->callfd != VIRTIO_UNINITIALIZED_EVENTFD &&
+	       vq->kickfd != VIRTIO_INVALID_EVENTFD &&
+	       vq->callfd != VIRTIO_INVALID_EVENTFD;
 }
 
 static int
-- 
2.14.1
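
For readers without the vhost sources at hand, the readiness rule in this patch can be sketched on its own. The struct, the helper names and the sentinel values (-1 and -2) below are assumptions made for illustration; they are not copied from lib/librte_vhost. The only point is that a queue counts as ready when neither descriptor is in either invalid state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed sentinel values standing in for the two invalid states. */
#define SKETCH_INVALID_EVENTFD		(-1)
#define SKETCH_UNINITIALIZED_EVENTFD	(-2)

/* Hypothetical, trimmed-down stand-in for struct vhost_virtqueue. */
struct sketch_vq {
	void *desc;
	void *avail;
	void *used;
	int kickfd;
	int callfd;
};

/* A descriptor is usable only if it is in neither invalid state. */
static bool
sketch_fd_is_valid(int fd)
{
	return fd != SKETCH_UNINITIALIZED_EVENTFD &&
	       fd != SKETCH_INVALID_EVENTFD;
}

/* Ready means: all rings are mapped and both eventfds are valid. */
static bool
sketch_vq_is_ready(const struct sketch_vq *vq)
{
	return vq != NULL && vq->desc != NULL && vq->avail != NULL &&
	       vq->used != NULL &&
	       sketch_fd_is_valid(vq->kickfd) &&
	       sketch_fd_is_valid(vq->callfd);
}
```

Folding both comparisons into one helper keeps the kickfd and callfd checks symmetric, which is what the patch does by hand with four comparisons.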
[dpdk-dev] [PATCH] vhost: reduce size of coredump file
If an application dumps core while vhost-user devices are connected to it,
the generated coredump file is huge. To limit its size, this patch adds a
call to madvise() with MADV_DONTDUMP on memory regions mapped from the VM.

Signed-off-by: Sebastian Basierski
Signed-off-by: Tomasz Kulasek
---
 lib/librte_vhost/vhost_user.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 65ee33919..fc1f1a948 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -723,6 +723,11 @@ vhost_user_set_mem_table(struct virtio_net *dev, struct VhostUserMsg *pmsg)
 			goto err_mmap;
 		}
 
+		if (madvise(mmap_addr, mmap_size, MADV_DONTDUMP) != 0) {
+			RTE_LOG(INFO, VHOST_CONFIG,
+				"MADV_DONTDUMP advice setting failed.\n");
+		}
+
 		reg->mmap_addr = mmap_addr;
 		reg->mmap_size = mmap_size;
 		reg->host_user_addr = (uint64_t)(uintptr_t)mmap_addr +
-- 
2.14.1
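
A standalone sketch of the same technique, for trying it outside vhost. It assumes Linux (MADV_DONTDUMP is Linux-specific) and uses an anonymous mapping in place of guest memory mapped from a file descriptor; the helper name is hypothetical. As in the patch, a failing madvise() is reported but not treated as fatal.

```c
#define _GNU_SOURCE	/* for MAP_ANONYMOUS and MADV_DONTDUMP on glibc */
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map `len` bytes of anonymous memory and ask the
 * kernel to leave that range out of core dumps. Returns NULL only if the
 * mapping itself fails.
 */
static void *
map_without_coredump(size_t len)
{
	void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (addr == MAP_FAILED)
		return NULL;

	/* Advisory only: the mapping stays usable even if this fails. */
	if (madvise(addr, len, MADV_DONTDUMP) != 0)
		perror("madvise(MADV_DONTDUMP)");

	return addr;
}
```

The memory behaves normally for reads and writes; only its presence in a core dump changes.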
Re: [dpdk-dev] IXGBE, IOMMU DMAR DRHD handling fault issue
On Thu, Feb 8, 2018 at 3:20 AM, Burakov, Anatoly wrote:

> On 06-Feb-18 5:55 PM, Ravi Kerur wrote:
>
>>
>> Hi Anatoly,
>>
>> I am actually confused with the state of vIOMMU + DPDK. Can you please
>> help me clarify?
>>
>> I tested following DPDK versions
>>
>> (1) DPDK 17.11, exhibits the issue (IOMMU width as reported by RedHat and
>> solution is to prevent using the patch)
>> (2) DPDK 17.05.02 (stable release) using 'testpmd' I was able to bind a
>> device in VM with VFIO driver and no DMAR error message on host
>> (3) DPDK 16.11.02 (stable release) using 'testpmd' I was able to bind a
>> device in VM with VFIO driver and no DMAR error message on host
>>
>> Clearly issue seen in 17.11 without the patch you mentioned is a
>> regression or the issue was masked in earlier DPDK version? I did not test
>> traffic with any DPDK version because I wanted to first get DMAR errors on
>> host gone.
>>
>> Our application 'v1' is integrated with DPDK 16.11 and 'v2' is integrated
>> with DPDK 17.05.01. In both 'v1' and 'v2' cases I don't see IOMMU width
>> error messages on VM, however, DMAR error messages are seen host. I am not
>> able to relate what causes DMAR error messages on host?
>>
>>
>> Thanks.
>>
>>
> Hi Ravi,
>
> vIOMMU support is out of our hands, really - we can only make use of
> hardware (or emulation of it) that is available. 39-bit wide address *can*
> work, you just have to be lucky and get PA addresses that would fit into 39
> bits (so under 512G limit), because we set up IOVA addresses to be 1:1 to
> physical addresses. We could, in principle, set up IOVA addresses to go
> from zero instead of them being a 1:1 mapping to physical addresses, but
> that would introduce need to translate addresses between IOVA and physical
> in some cases (e.g. KNI).
>
> I'm not aware of any changes between 16.11 and 17.11 (and indeed 18.02)
> that would make or break support for 39-bit wide PA addresses for IOMMU. It
> is possible that VF/PF drivers do something differently which results in
> DMAR errors showing up sooner rather than later, but as far as VFIO support
> itself is concerned, there were no major changes in those releases.
>
>

Hi Anatoly,

Thank you for your explanation. I would like to ask one more thing, as I
need to get vIOMMU + DPDK working in a VM. Can you please tell me what
determines 'Host Address Width'? I know my question has nothing to do with
DPDK and this is a DPDK list, but if you have any information, please share
it.

I googled and found a couple of ways to get 'Host Address Width = 46' in
the guest as well (since DPDK + IOMMU works fine on the host, and DMAR on
the host reports an address width of 46):

(1) QEMU has a boolean CPU parameter 'host-phys-bits'; when set to true, it
copies the value from the host.
(2) QEMU has an integer parameter 'phys-bits'; setting it to '46' should
influence the guest.

Using the above options when instantiating a VM doesn't help; the guest VM
still ends up with 'Host address width = 39'.

(3) There is another QEMU option, 'x-aw-bits', which is for VT-d and can be
set to '39' or '48'. This doesn't help either.

Thanks.

--
> Thanks,
> Anatoly
>
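
The 512G figure quoted above follows from 2^39 bytes = 512 GiB. As a rough illustration of what "fitting into 39 bits" means when IOVA addresses are set up 1:1 with physical addresses, the check can be sketched as below. The helper name is hypothetical and this is not DPDK or VFIO code; it only handles widths from 1 to 63 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the range [addr, addr + len) can be
 * expressed with `width` address bits. With a 1:1 IOVA mapping, addr is
 * the physical address itself.
 */
static bool
iova_fits_width(uint64_t addr, uint64_t len, unsigned int width)
{
	uint64_t limit;

	if (width == 0 || width > 63)
		return false;	/* outside what this sketch handles */

	limit = UINT64_C(1) << width;	/* first address that does NOT fit */
	return addr < limit && len <= limit - addr;
}
```

A page just below the 512 GiB mark passes the check for a width of 39, while the same page placed at or above that mark passes only for a wider setting such as 46.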
[dpdk-dev] [PATCH] doc: add missing SFN8xxx adapters to the list of supported
Signed-off-by: Andrew Rybchenko
---
 doc/guides/nics/sfc_efx.rst | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index 8e0782c..ccdf5ff 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -192,8 +192,14 @@ Supported NICs
 
    - Solarflare SFN8522 Dual Port SFP+ Server Adapter
 
+   - Solarflare SFN8522M Dual Port SFP+ Server Adapter
+
+   - Solarflare SFN8042 Dual Port QSFP+ Server Adapter
+
    - Solarflare SFN8542 Dual Port QSFP+ Server Adapter
 
+   - Solarflare SFN8722 Dual Port SFP+ OCP Server Adapter
+
    - Solarflare SFN7002F Dual Port SFP+ Server Adapter
 
    - Solarflare SFN7004F Quad Port SFP+ Server Adapter
-- 
2.7.4
Re: [dpdk-dev] [RFC v1 1/1] lib/cryptodev: add support of asymmetric crypto
Hi Shally,

Comments below.

> -----Original Message-----
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Shally Verma
> Sent: Tuesday, January 23, 2018 9:54 AM
> To: Doherty, Declan
> Cc: dev@dpdk.org; pathr...@caviumnetworks.com; nmur...@caviumnetworks.com;
> ss...@caviumnetworks.com; agu...@caviumnetworks.com; Shally Verma
>
> Subject: [dpdk-dev] [RFC v1 1/1] lib/cryptodev: add support of asymmetric
> crypto
>
> From: Shally Verma
>
> Add support for asymmetric crypto operations in DPDK lib cryptodev
>
> Key feature include:
> - Only session based asymmetric crypto operations
> - new get and set APIs for symmetric and asymmetric session private
>   data and other informations
> - APIs to create, configure and attach queue pair to asymmetric sessions
> - new capabilities in struct device_info to indicate
>   -- number of dedicated queue pairs available for symmetric and
>      asymmetric operations, if any
>   -- number of asymmetric sessions possible per qp
>

[Fiona] Though it's probably premature to include on the API, have you
considered providing for pre-loaded keys in future, i.e. the ability to wrap
keys or refer to keys already stored securely on the device, as an
alternative to passing the keys on the API?

> Proposed asymmetric cryptographic operations are:
> - rsa
> - dsa
> - Diffie-Hellman key pair generation and shared key computation
> - EC Diffie-Hellman
> - fundamental elliptic curve operations
> - elliptic curve DSA
> - modular exponentiation and inversion
>
> This patch primarily defines PMD operations and device capabilities
> to perform asymmetric crypto ops on queue pairs and intend to
> invite feedbacks on current proposal so as to ensure it encompass
> all kind of crypto devices with different capabilities and queue
> pair management.
>
> List of TBDs:
> - Currently, patch only updated for RSA xform and associated params.
>   Other algorithms to be added in subsequent versions.
> - per-service stats update
>
> Signed-off-by: Shally Verma
> ---
>
> It is derivative of RFC v2 asymmetric crypto patch series initiated by
> Umesh Kartha(mailto:umesh.kar...@caviumnetworks.com):
>
> http://dpdk.org/dev/patchwork/patch/24245/
> http://dpdk.org/dev/patchwork/patch/24246/
> http://dpdk.org/dev/patchwork/patch/24247/
>
> And inclusive of all review comments given on RFC v2.
> ( See complete discussion thread here:
> http://dev.dpdk.narkive.com/yqTFFLHw/dpdk-dev-rfc-specifications-for-asymmetric-crypto-algorithms#post12)
>
> Some of the RFCv2 Review comments pending for closure:
>
> " [Fiona] The count fn isn't used at all for sym - probably no need to add
> for asym better instead to remove the sym fn."
>
> It is still present in dpdk-next-crypto for sym, so what has been
> decision on it?

[Fiona] No change. The rte_cryptodev_ops fn is still not called so useless
and should be removed. rte_cryptodev_queue_pair_count() returns the num_qps
configured in rte_cryptodev_configure(), but never calls the PMD
dev_ops.queue_pair_count(). So cryptodev_sym_queue_pair_count_t should be
deprecated. And no point in adding one for asym.

>
> "[Fiona] if each qp can handle only a specific service, i.e. a subset of
> the capabilities indicated by the device capability list, there's a need
> for a new API to query the capability of a qp."
>
> Current proposal doesn't distinguish between device capability and qp
> capability.
> It rather leave such differences handling internal to PMDs. Thus no
> capability or API added for qp in current version. It is subject to
> revisit based on review feedback on current proposal.

[Fiona] This would not work for some devices, comments below.

>
> - Sessionless Support.
> Current proposal only support Session-based because:
> 1. All one-time setup i.e. algos and associated params, such as,
>    public-private keys or modulus length can be done in control path
>    using session-init API
> 2. it's an easier way to dedicate qp to do specific service (using
>    queue_pair_attach()) which cannot be case in sessionless
> 3. Couldn't find any significant advantage going sessionless way. Also
>    existing most of PMDs are session-based.
>
> It could be added in subsequent versions, if requirement is identified,
> based on review comment on this RFC.

[Fiona] Our preference would be for sessionless, as it would need fewer API
calls (no session_create/session_clear) and e.g. DH and ECDH sessions are
likely to be only used for a single op. However this is not a blocker for
this API, we can POC it later and propose an extension to the API if it
gives a performance improvement.

>
> Summary
> ---
>
> This section provides an overview of key feature enabled in current
> specification.
> It comprise of key design challenges as have been identified on RFCv2 and
> summary description of new interfaces and definitions added to handle same.
[dpdk-dev] [PATCH 2/2] lib: fix bitmap scanning
From: "Charles (Chas) Williams"

index2 is used inconsistently, both as an array offset and as a bit
offset. Fix usage to be consistent as a bit offset. Additionally, offset1
needs to be shifted with the size of the array1 slab entries, not the
cache line sizes.

__rte_bitmap_scan_read() needs to examine the current array2 slab bit by
bit to find the next set bit.

The unit tests for rte_bitmap_scan() aren't correct. If a slab isn't
empty, there is no reason to expect rte_bitmap_scan() to advance to the
next slab. Change the slab magic values so that rte_bitmap_scan() will
advance on reading a bit from each slab and verify it is the bit position
we expect.

Fixes: de3cfa2c9823a ("sched: initial import")

Signed-off-by: Chas Williams
---
 lib/librte_eal/common/include/rte_bitmap.h | 31 +++---
 test/test/test_bitmap.c                    | 12 ++--
 2 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_bitmap.h b/lib/librte_eal/common/include/rte_bitmap.h
index 7d4935f..27084d9 100644
--- a/lib/librte_eal/common/include/rte_bitmap.h
+++ b/lib/librte_eal/common/include/rte_bitmap.h
@@ -94,7 +94,8 @@ __rte_bitmap_mask1_get(struct rte_bitmap *bmp)
 static inline void
 __rte_bitmap_index2_set(struct rte_bitmap *bmp)
 {
-	bmp->index2 = (((bmp->index1 << RTE_BITMAP_SLAB_BIT_SIZE_LOG2) + bmp->offset1) << RTE_BITMAP_CL_SLAB_SIZE_LOG2);
+	bmp->index2 = ((bmp->index1 << RTE_BITMAP_SLAB_BIT_SIZE_LOG2) +
+		bmp->offset1) << RTE_BITMAP_SLAB_BIT_SIZE_LOG2;
 }
 
 #if RTE_BITMAP_OPTIMIZATIONS
@@ -172,7 +173,6 @@ __rte_bitmap_scan_init(struct rte_bitmap *bmp)
 	bmp->index1 = bmp->array1_size - 1;
 	bmp->offset1 = RTE_BITMAP_SLAB_BIT_SIZE - 1;
 	__rte_bitmap_index2_set(bmp);
-	bmp->index2 += RTE_BITMAP_CL_SLAB_SIZE;
 
 	bmp->go2 = 0;
 }
@@ -338,7 +338,8 @@ rte_bitmap_set(struct rte_bitmap *bmp, uint32_t pos)
 	index2 = pos >> RTE_BITMAP_SLAB_BIT_SIZE_LOG2;
 	offset2 = pos & RTE_BITMAP_SLAB_BIT_MASK;
 	index1 = pos >> (RTE_BITMAP_SLAB_BIT_SIZE_LOG2 + RTE_BITMAP_CL_BIT_SIZE_LOG2);
-	offset1 = (pos >> RTE_BITMAP_CL_BIT_SIZE_LOG2) & RTE_BITMAP_SLAB_BIT_MASK;
+	offset1 = (pos >> RTE_BITMAP_SLAB_BIT_SIZE_LOG2) &
+		RTE_BITMAP_SLAB_BIT_MASK;
 	slab2 = bmp->array2 + index2;
 	slab1 = bmp->array1 + index1;
 
@@ -365,7 +366,8 @@ rte_bitmap_set_slab(struct rte_bitmap *bmp, uint32_t pos, uint64_t slab)
 	/* Set bits in array2 slab and set bit in array1 slab */
 	index2 = pos >> RTE_BITMAP_SLAB_BIT_SIZE_LOG2;
 	index1 = pos >> (RTE_BITMAP_SLAB_BIT_SIZE_LOG2 + RTE_BITMAP_CL_BIT_SIZE_LOG2);
-	offset1 = (pos >> RTE_BITMAP_CL_BIT_SIZE_LOG2) & RTE_BITMAP_SLAB_BIT_MASK;
+	offset1 = (pos >> RTE_BITMAP_SLAB_BIT_SIZE_LOG2) &
+		RTE_BITMAP_SLAB_BIT_MASK;
 	slab2 = bmp->array2 + index2;
 	slab1 = bmp->array1 + index1;
 
@@ -422,7 +424,8 @@ rte_bitmap_clear(struct rte_bitmap *bmp, uint32_t pos)
 
 	/* The array2 cache line is all-zeros, so clear bit in array1 slab */
 	index1 = pos >> (RTE_BITMAP_SLAB_BIT_SIZE_LOG2 + RTE_BITMAP_CL_BIT_SIZE_LOG2);
-	offset1 = (pos >> RTE_BITMAP_CL_BIT_SIZE_LOG2) & RTE_BITMAP_SLAB_BIT_MASK;
+	offset1 = (pos >> RTE_BITMAP_SLAB_BIT_SIZE_LOG2) &
+		RTE_BITMAP_SLAB_BIT_MASK;
 	slab1 = bmp->array1 + index1;
 	*slab1 &= ~(1lu << offset1);
 
@@ -471,15 +474,14 @@ __rte_bitmap_scan_read(struct rte_bitmap *bmp, uint32_t *pos, uint64_t *slab)
 {
 	uint64_t *slab2;
 
-	slab2 = bmp->array2 + bmp->index2;
-	for ( ; bmp->go2 ; bmp->index2 ++, slab2 ++, bmp->go2 = bmp->index2 & RTE_BITMAP_CL_SLAB_MASK) {
-		if (*slab2) {
-			*pos = bmp->index2 << RTE_BITMAP_SLAB_BIT_SIZE_LOG2;
-			*slab = *slab2;
+	slab2 = bmp->array2 + (bmp->index2 >> RTE_BITMAP_SLAB_BIT_SIZE_LOG2);
+	for ( ; bmp->go2 ; bmp->index2++,
+		bmp->go2 = bmp->index2 & RTE_BITMAP_SLAB_BIT_MASK) {
+		uint32_t offset2 = bmp->index2 & RTE_BITMAP_SLAB_BIT_MASK;
 
-			bmp->index2 ++;
-			slab2 ++;
-			bmp->go2 = bmp->index2 & RTE_BITMAP_CL_SLAB_MASK;
+		if (*slab2 & (1lu << offset2)) {
+			*pos = bmp->index2++;
+			*slab = *slab2;
 			return 1;
 		}
 	}
@@ -518,8 +520,7 @@ rte_bitmap_scan(struct rte_bitmap *bmp, uint32_t *pos, uint64_t *slab)
 	/* Look for non-empty array2 line */
 	if (__rte_bitmap_scan_search(bmp)) {
 		__rte_bitmap_scan_read_init(bmp);
-		__rte_bitmap_scan_read(bmp, pos, slab);
-		return 1;
+		return __rte_bitmap_scan_read(bmp,
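
The commit message above says that __rte_bitmap_scan_read() has to examine the current array2 slab bit by bit to find the next set bit. That idea on its own, outside rte_bitmap, can be sketched as follows. The helper name is hypothetical and a 64-bit slab is assumed; this is not the rte_bitmap API and takes no position on the index arithmetic changed by the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the position of the first set bit in `slab`
 * at or after bit `start`, or -1 if no such bit exists.
 */
static int
slab_next_set_bit(uint64_t slab, uint32_t start)
{
	uint32_t bit;

	for (bit = start; bit < 64; bit++) {
		if (slab & (UINT64_C(1) << bit))
			return (int)bit;
	}
	return -1;
}
```

Calling it repeatedly with `start` set to the previous result plus one walks every set bit of a slab in ascending order, which matches the behaviour the added unit tests in this series look for.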
[dpdk-dev] [PATCH 1/2] test/bitmap: add additional tests
From: "Charles (Chas) Williams"

Add additional unit tests for the rte_bitmap_scan() routine. Test that we
are getting the bits returned that we expect.

Signed-off-by: Chas Williams
---
 test/test/test_bitmap.c | 68 +
 1 file changed, 68 insertions(+)

diff --git a/test/test/test_bitmap.c b/test/test/test_bitmap.c
index 05d547e..f498c02 100644
--- a/test/test/test_bitmap.c
+++ b/test/test/test_bitmap.c
@@ -74,6 +74,71 @@ test_bitmap_scan_operations(struct rte_bitmap *bmp)
 }
 
 static int
+test_bitmap_scan_operations2(struct rte_bitmap *bmp)
+{
+	uint32_t pos = 0;
+	uint64_t out_slab = 0;
+	uint32_t pattern[] = { 10, 11, 100, 101, 500, 501 };
+	uint32_t pattern2[] = { 11, 101, 501 };
+	uint32_t i;
+
+	rte_bitmap_reset(bmp);
+
+	/* Setup the initial bitmap. */
+	for (i = 0; i < sizeof(pattern)/sizeof(uint32_t); i++)
+		rte_bitmap_set(bmp, pattern[i]);
+
+	/* Iterate. */
+	for (i = 0; i < sizeof(pattern)/sizeof(uint32_t); i++) {
+		if (!rte_bitmap_scan(bmp, &pos, &out_slab) ||
+			pos != pattern[i]) {
+			printf("Failed to get pos %d from bitmap.\n",
+				pattern[i]);
+			return TEST_FAILED;
+		}
+	}
+
+	/* Check wrap around. */
+	if (!rte_bitmap_scan(bmp, &pos, &out_slab) || pos != pattern[0]) {
+		printf("Failed to get pos %d from bitmap.\n", pattern[0]);
+		return TEST_FAILED;
+	}
+
+	/* Delete half the entries in the slabs. */
+	rte_bitmap_clear(bmp, 10);
+	rte_bitmap_clear(bmp, 100);
+	rte_bitmap_clear(bmp, 500);
+
+	/* Iterate. */
+	for (i = 0; i < sizeof(pattern2)/sizeof(uint32_t); i++) {
+		if (!rte_bitmap_scan(bmp, &pos, &out_slab) ||
+			pos != pattern2[i]) {
+			printf("Failed to get pos %d from bitmap.\n",
+				pattern2[i]);
+			return TEST_FAILED;
+		}
+	}
+
+	/* Check wrap around. */
+	if (!rte_bitmap_scan(bmp, &pos, &out_slab) || pos != pattern2[0]) {
+		printf("Failed to get pos %d from bitmap.\n", pattern2[0]);
+		return TEST_FAILED;
+	}
+
+	rte_bitmap_clear(bmp, 11);
+	rte_bitmap_clear(bmp, 101);
+	rte_bitmap_clear(bmp, 501);
+
+	/* Ensure bitmap is empty. */
+	if (rte_bitmap_scan(bmp, &pos, &out_slab)) {
+		printf("Found pos %d in empty bitmap.\n", pos);
+		return TEST_FAILED;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static int
 test_bitmap_slab_set_get(struct rte_bitmap *bmp)
 {
 	uint32_t pos = 0;
@@ -158,6 +223,9 @@ test_bitmap(void)
 	if (test_bitmap_scan_operations(bmp) < 0)
 		return TEST_FAILED;
 
+	if (test_bitmap_scan_operations2(bmp) < 0)
+		return TEST_FAILED;
+
 	return TEST_SUCCESS;
 }
 
-- 
2.9.5
Re: [dpdk-dev] [PATCH v1] net/mlx: control netdevices through ioctl only
On Thu, Feb 08, 2018 at 05:37:06PM +0100, Adrien Mazarguil wrote: > Several control operations implemented by these PMDs affect netdevices > through sysfs, itself subject to file system permission checks enforced by > the kernel, which limits their use for most purposes to applications > running with root privileges. > > Since performing the same operations through ioctl() requires fewer > capabilities (only CAP_NET_ADMIN) and given the remaining operations are > already implemented this way, this patch standardizes on ioctl() and gets > rid of redundant code. > > Signed-off-by: Adrien Mazarguil Reviewed-by: Marcelo Ricardo Leitner > --- > drivers/net/mlx4/mlx4_ethdev.c | 192 ++- > drivers/net/mlx5/mlx5.h| 2 - > drivers/net/mlx5/mlx5_ethdev.c | 255 > drivers/net/mlx5/mlx5_stats.c | 28 +++- > 4 files changed, 63 insertions(+), 414 deletions(-) > > diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c > index 3bc692731..fbeef16c8 100644 > --- a/drivers/net/mlx4/mlx4_ethdev.c > +++ b/drivers/net/mlx4/mlx4_ethdev.c > @@ -132,167 +132,6 @@ mlx4_get_ifname(const struct priv *priv, char > (*ifname)[IF_NAMESIZE]) > } > > /** > - * Read from sysfs entry. > - * > - * @param[in] priv > - * Pointer to private structure. > - * @param[in] entry > - * Entry name relative to sysfs path. > - * @param[out] buf > - * Data output buffer. > - * @param size > - * Buffer size. > - * > - * @return > - * Number of bytes read on success, negative errno value otherwise and > - * rte_errno is set. 
> - */ > -static int > -mlx4_sysfs_read(const struct priv *priv, const char *entry, > - char *buf, size_t size) > -{ > - char ifname[IF_NAMESIZE]; > - FILE *file; > - int ret; > - > - ret = mlx4_get_ifname(priv, &ifname); > - if (ret) > - return ret; > - > - MKSTR(path, "%s/device/net/%s/%s", priv->ctx->device->ibdev_path, > - ifname, entry); > - > - file = fopen(path, "rb"); > - if (file == NULL) { > - rte_errno = errno; > - return -rte_errno; > - } > - ret = fread(buf, 1, size, file); > - if ((size_t)ret < size && ferror(file)) { > - rte_errno = EIO; > - ret = -rte_errno; > - } else { > - ret = size; > - } > - fclose(file); > - return ret; > -} > - > -/** > - * Write to sysfs entry. > - * > - * @param[in] priv > - * Pointer to private structure. > - * @param[in] entry > - * Entry name relative to sysfs path. > - * @param[in] buf > - * Data buffer. > - * @param size > - * Buffer size. > - * > - * @return > - * Number of bytes written on success, negative errno value otherwise and > - * rte_errno is set. > - */ > -static int > -mlx4_sysfs_write(const struct priv *priv, const char *entry, > - char *buf, size_t size) > -{ > - char ifname[IF_NAMESIZE]; > - FILE *file; > - int ret; > - > - ret = mlx4_get_ifname(priv, &ifname); > - if (ret) > - return ret; > - > - MKSTR(path, "%s/device/net/%s/%s", priv->ctx->device->ibdev_path, > - ifname, entry); > - > - file = fopen(path, "wb"); > - if (file == NULL) { > - rte_errno = errno; > - return -rte_errno; > - } > - ret = fwrite(buf, 1, size, file); > - if ((size_t)ret < size || ferror(file)) { > - rte_errno = EIO; > - ret = -rte_errno; > - } else { > - ret = size; > - } > - fclose(file); > - return ret; > -} > - > -/** > - * Get unsigned long sysfs property. > - * > - * @param priv > - * Pointer to private structure. > - * @param[in] name > - * Entry name relative to sysfs path. > - * @param[out] value > - * Value output buffer. > - * > - * @return > - * 0 on success, negative errno value otherwise and rte_errno is set. 
> - */ > -static int > -mlx4_get_sysfs_ulong(struct priv *priv, const char *name, unsigned long > *value) > -{ > - int ret; > - unsigned long value_ret; > - char value_str[32]; > - > - ret = mlx4_sysfs_read(priv, name, value_str, (sizeof(value_str) - 1)); > - if (ret < 0) { > - DEBUG("cannot read %s value from sysfs: %s", > - name, strerror(rte_errno)); > - return ret; > - } > - value_str[ret] = '\0'; > - errno = 0; > - value_ret = strtoul(value_str, NULL, 0); > - if (errno) { > - rte_errno = errno; > - DEBUG("invalid %s value `%s': %s", name, value_str, > - strerror(rte_errno)); > - return -rte_errno; > - } > - *value = value_ret; > - return 0; > -} > - > -/** > - * Set unsigned long sysfs property. > - * > - * @param priv > - * Pointer to private structure. > - * @param[in] name > - * Entry name relative to sysfs path. > - * @param value > - * Value to set. > - * > - * @return > - *