Re: [dpdk-dev] [RFC v2 00/23] Dynamic memory allocation for DPDK

2018-02-14 Thread Thomas Monjalon
Hi Anatoly,

19/12/2017 12:14, Anatoly Burakov:
>  * Memory tagging. This is related to previous item. Right now, we can only
>    ask malloc to allocate memory by page size, but one could potentially have
>    different memory regions backed by pages of similar sizes (for example,
>    locked 1G pages, to completely avoid TLB misses, alongside regular 1G
>    pages), and it would be good to have that kind of mechanism to distinguish
>    between different memory types available to a DPDK application. One could,
>    for example, tag memory by "purpose" (i.e. "fast", "slow"), or in other
>    ways.

How do you imagine memory tagging?
Should it be a parameter when requesting some memory from rte_malloc
or rte_mempool?
Could it be a bit-field that allows combining several properties?
Does it make sense to have "DMA" as one of the purposes?

How can we transparently allocate the best memory for the NIC?
You take care of the NUMA socket property, but there can be more
requirements, like getting memory from the NIC itself.

+Cc more people (6WIND, Cavium, Chelsio, Mellanox, Netronome, NXP, Solarflare)
in order to trigger a discussion about the ideal requirements.


Re: [dpdk-dev] [PATCH v2] net/tap: fix promiscuous rules double insersions

2018-02-14 Thread Pascal Mazon
Hi Ophir,

Typo in title: s/insersions/insertions/

I'm OK in principle; I have just a few comments inline.

Regards,
Pascal

On 13/02/2018 19:35, Ophir Munk wrote:
> Running testpmd command "port stop all" followed by command "port start
> all" may result in a TAP error:
> PMD: Kernel refused TC filter rule creation (17): File exists
>
> Root cause analysis: during the execution of "port start all" command
> testpmd calls rte_eth_promiscuous_enable() while during the execution
> of "port stop all" command testpmd does not call
> rte_eth_promiscuous_enable().
Shouldn't it be rte_eth_promiscuous_disable()?
> As a result the TAP PMD is trying to add tc (traffic control command)
> promiscuous rules to the remote netvsc device consecutively. From the
> kernel point of view it is seen as an attempt to add the same rule more
> than once. In recent kernels (e.g. version 4.13) this attempt is rejected
> with a "File exists" error. In less recent kernels (e.g. version 4.4) the
> same rule may have been accepted twice successfully, which is undesirable.
>
> In the corrupted code every tc promiscuous rule included a different
> handle number parameter. If instead an identical handle number parameter is
> used for all tc promiscuous rules - all kernels will reject the second
> rule with a "File exists" error, which is easy to identify and to silently
> ignore.
>
> Fixes: 2bc06869cd94 ("net/tap: add remote netdevice traffic capture")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Ophir Munk 
> ---
> v2: add detailed commit message
>
>  drivers/net/tap/tap_flow.c | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/net/tap/tap_flow.c b/drivers/net/tap/tap_flow.c
> index 65657f0..d1f4a52 100644
> --- a/drivers/net/tap/tap_flow.c
> +++ b/drivers/net/tap/tap_flow.c
> @@ -123,6 +123,7 @@ enum key_status_e {
>  };
>  
>  #define ISOLATE_HANDLE 1
> +#define REMOTE_PROMISCUOUS_HANDLE 2
>  
>  struct rte_flow {
>   LIST_ENTRY(rte_flow) next; /* Pointer to the next rte_flow structure */
> @@ -1692,9 +1693,15 @@ int tap_flow_implicit_create(struct pmd_internals *pmd,
>* The ISOLATE rule is always present and must have a static handle, as
>* the action is changed whether the feature is enabled (DROP) or
>* disabled (PASSTHRU).
> +  * There is just one REMOTE_PROMISCUOUS rule in all cases. It should
> +  * have a static handle such that adding it twice will fail with EEXIST
> +  * with any kernel version. Remark: old kernels may falsely accept the
> +  * same REMOTE_PREMISCUOUS rules if they had different handles.
s/PREMISCUOUS/PROMISCUOUS/
>*/
>   if (idx == TAP_ISOLATE)
>   remote_flow->msg.t.tcm_handle = ISOLATE_HANDLE;
> + else if (idx == TAP_REMOTE_PROMISC)
> + remote_flow->msg.t.tcm_handle = REMOTE_PROMISCUOUS_HANDLE;
>   else
>   tap_flow_set_handle(remote_flow);
>   if (priv_flow_process(pmd, attr, items, actions, NULL,
> @@ -1709,12 +1716,16 @@ int tap_flow_implicit_create(struct pmd_internals *pmd,
>   }
>   err = tap_nl_recv_ack(pmd->nlsk_fd);
>   if (err < 0) {
> + /* Silently ignore re-entering remote promiscuous rule */
> + if (errno == EEXIST && idx == TAP_REMOTE_PROMISC)
> + goto success;
>   RTE_LOG(ERR, PMD,
>   "Kernel refused TC filter rule creation (%d): %s\n",
>   errno, strerror(errno));
>   goto fail;
>   }
>   LIST_INSERT_HEAD(&pmd->implicit_flows, remote_flow, next);
Are we sure the previous rule is still in the registered implicit flows?
> +success:
>   return 0;
>  fail:
>   if (remote_flow)
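
The idea behind the patch (a fixed handle plus treating EEXIST as success) can be sketched in isolation. This is only an illustration, not the TAP PMD code: rule_install_fn, install_promisc_rule and the two fake installers are hypothetical names, and the fake installers stand in for the netlink request to the kernel.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Fixed handle, mirroring REMOTE_PROMISCUOUS_HANDLE in the patch. */
#define PROMISC_HANDLE 2

/*
 * Hypothetical installer callback: returns 0 on success,
 * -1 with errno set on failure.
 */
typedef int (*rule_install_fn)(unsigned int handle);

/* Install the promiscuous rule; a second insertion is not an error. */
static int
install_promisc_rule(rule_install_fn install)
{
	if (install(PROMISC_HANDLE) < 0) {
		int err = errno;

		if (err == EEXIST)
			return 0; /* rule already present: silently ignore */
		fprintf(stderr, "rule creation refused (%d): %s\n",
			err, strerror(err));
		return -1;
	}
	return 0;
}

/* Fake kernel that rejects a duplicate handle, as recent kernels do. */
static int fake_installed;

static int
fake_install(unsigned int handle)
{
	(void)handle;
	if (fake_installed) {
		errno = EEXIST;
		return -1;
	}
	fake_installed = 1;
	return 0;
}

/* Fake kernel that always refuses, to show other errors still fail. */
static int
fake_install_denied(unsigned int handle)
{
	(void)handle;
	errno = EPERM;
	return -1;
}
```

Calling install_promisc_rule(fake_install) twice returns 0 both times while only one rule is recorded. The sketch does not cover the question raised above about whether the earlier rule is still tracked in the implicit flows list.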



Re: [dpdk-dev] [RFC v2 00/23] Dynamic memory allocation for DPDK

2018-02-14 Thread Burakov, Anatoly

On 14-Feb-18 2:01 AM, Yongseok Koh wrote:



On Feb 5, 2018, at 2:03 AM, Burakov, Anatoly  wrote:

Thanks for your feedback, good to hear we're on the right track. I already have 
a prototype implementation of this working, due for v1 submission :)


Anatoly,

One more suggestion. Currently, when populating mempool, there's a chance to
have multiple chunks if system memory is highly fragmented. However, with your
new design, it is unlikely to happen unless the system is really low on memory.
Allocation will be dynamic and page by page. With your v2, you seemed to make
minimal changes on mempool. If allocation fails, it will still try to gather
fragments from malloc_heap until it acquires enough objects and the resultant
mempool will have multiple chunks. But like I mentioned, it is very unlikely and
this will only happen when the system is short of memory. Is my understanding
correct?

If so, how about making a change to drop the case where mempool has multiple
chunks?

Thanks
Yongseok



Hi Yongseok,

I would still like to keep it, as it may impact low memory cases such as 
containers.


--
Thanks,
Anatoly


Re: [dpdk-dev] [PATCH] usertools/dpdk-devbind.py: add support for wind river avp device

2018-02-14 Thread Burakov, Anatoly

On 14-Feb-18 12:48 AM, Zhang, Xiaohua wrote:

Hi Yigit and Anatoly,
I checked nics-17.11.pdf; the description reads as follows:
"The Accelerated Virtual Port (AVP) device is a shared memory based device only
available on virtualization platforms from Wind River Systems. The Wind River
Systems virtualization platform currently uses QEMU/KVM as its hypervisor and
as such provides support for all of the QEMU supported virtual and/or emulated
devices (e.g., virtio, e1000, etc.). The platform offers the virtio device type
as the default device when launching a virtual machine or creating a virtual
machine port. The AVP device is a specialized device available to customers
that require increased throughput and decreased latency to meet the demands of
their performance focused applications."

I am afraid that just "memory_device" would be misleading.
Could we call it "avp device (shared memory based)"?




Hi,

Well, from the AVP PMD documentation, it seems that AVP is classified as a
NIC. Can't we just add it to the list of NICs, even if it's not Ethernet
class 02xx? Pattern-matching in devbind should work either way. For
example, you can see there's "cavium_pkx" already classified as a NIC,
even though its class is 08xx, not 02xx. So why not this one?


Alternatively, if you think that would be confusing, how about calling it
"other devices" instead of "memory devices", for cases which don't fit
into one of the DPDK categories?




BR.
Xiaohua Zhang

-Original Message-
From: Ferruh Yigit [mailto:ferruh.yi...@intel.com]
Sent: Tuesday, February 13, 2018 7:07 PM
To: BURAKOV, ANATOLY; Zhang, Xiaohua; dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] usertools/dpdk-devbind.py: add support for wind river avp device

On 2/13/2018 10:06 AM, Burakov, Anatoly wrote:

On 13-Feb-18 1:43 AM, Zhang, Xiaohua wrote:

Hi Anatoly,
AVP is a virtual NIC type, so you are right.

When using the AVP device, you will see the following information from lspci
(example):
Slot:   :00:05.0
Class:  Unclassified device [00ff]
Vendor:   Red Hat, Inc [1af4]
Device:Virtio memory balloon [1002]
SVendor:  Red Hat, Inc [1af4]
SDevice:   Device [0005]
PhySlot:5
Driver:virtio-pci

It is a little different from the standard "Ethernet" controller, such as
"Class:  Ethernet controller [0200]".
Theoretically, the AVP is a memory based device. That's the reason I put it in
a separate category.



OK, fair enough. Is there any way we can make this category
not-WindRiver AVP specific? Are there other similar devices out there
that could potentially fit into this category?


Can we call it "memory_devices" instead of "avp_devices" ?





BR.
Xiaohua Zhang

-Original Message-






Is there any particular reason why this device appears in its own category, 
rather than being added to one of the existing device classes?
I'm not familiar with AVP but it looks like it's a NIC, so shouldn't it be in 
network_devices category?

--
Thanks,
Anatoly









--
Thanks,
Anatoly


Re: [dpdk-dev] [RFC v2 00/23] Dynamic memory allocation for DPDK

2018-02-14 Thread Burakov, Anatoly

On 14-Feb-18 8:04 AM, Thomas Monjalon wrote:

Hi Anatoly,

19/12/2017 12:14, Anatoly Burakov:

  * Memory tagging. This is related to previous item. Right now, we can only
    ask malloc to allocate memory by page size, but one could potentially have
    different memory regions backed by pages of similar sizes (for example,
    locked 1G pages, to completely avoid TLB misses, alongside regular 1G
    pages), and it would be good to have that kind of mechanism to distinguish
    between different memory types available to a DPDK application. One could,
    for example, tag memory by "purpose" (i.e. "fast", "slow"), or in other
    ways.


How do you imagine memory tagging?
Should it be a parameter when requesting some memory from rte_malloc
or rte_mempool?


We can't make it a parameter for mempool without making it a parameter 
for rte_malloc, as every memory allocation in DPDK works through 
rte_malloc. So at the very least, rte_malloc will have it. And as long 
as rte_malloc has it, there's no reason why memzones and mempools 
couldn't - not much code to add.



Could it be a bit-field that allows combining several properties?
Does it make sense to have "DMA" as one of the purposes?


Something like a bitfield would be my preference, yes. That way we could
classify memory in certain ways and allocate based on that. Which
"certain ways" these are, I'm not sure. For example, in addition to
tagging memory as "DMA-capable" (which I think is a given), one might
tag certain memory as "non-default", as in, never allocate from this
chunk of memory unless explicitly asked to do so. This could be useful
for types of memory that are a precious resource.


Then again, it is likely that we won't have many types of memory in 
DPDK, and any other type would be implementation-specific, so maybe just 
stringly-typing it is OK (maybe we can finally make use of "type" 
parameter in rte_malloc!).
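
As a rough illustration of the bit-field idea, including the "non-default" tag, here is a minimal C sketch. None of these names exist in DPDK: enum mem_tag, struct mem_region, region_matches and pick_region are all hypothetical, and the sketch only shows how tags could be combined and matched, not how an allocator would use them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag bits; these are assumptions, not DPDK definitions. */
enum mem_tag {
	MEM_TAG_DMA         = 1u << 0, /* usable for device DMA */
	MEM_TAG_LOCKED      = 1u << 1, /* backed by locked pages */
	MEM_TAG_NON_DEFAULT = 1u << 2, /* handed out only on explicit request */
};

/* Hypothetical description of one memory region and its tags. */
struct mem_region {
	const char *name;
	uint32_t tags;
};

/*
 * A region satisfies a request when it carries every requested tag.
 * Regions tagged NON_DEFAULT are skipped unless the request names that tag.
 */
static int
region_matches(const struct mem_region *r, uint32_t wanted)
{
	if ((r->tags & wanted) != wanted)
		return 0;
	if ((r->tags & MEM_TAG_NON_DEFAULT) && !(wanted & MEM_TAG_NON_DEFAULT))
		return 0;
	return 1;
}

/* Return the first matching region, or NULL when none fits. */
static const struct mem_region *
pick_region(const struct mem_region *regions, size_t n, uint32_t wanted)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (region_matches(&regions[i], wanted))
			return &regions[i];
	return NULL;
}
```

With a table holding a region tagged DMA plus NON_DEFAULT and another tagged DMA only, a plain DMA request would skip the first and pick the second, while a request that also names NON_DEFAULT would get the first.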




How can we transparently allocate the best memory for the NIC?
You take care of the NUMA socket property, but there can be more
requirements, like getting memory from the NIC itself.


I would think that we can't make it generic enough to cover all cases,
so it's best to expose some APIs and let PMDs handle this themselves.




+Cc more people (6WIND, Cavium, Chelsio, Mellanox, Netronome, NXP, Solarflare)
in order to trigger a discussion about the ideal requirements.





--
Thanks,
Anatoly


[dpdk-dev] XL710: [Q] traffic steering under DPDK.

2018-02-14 Thread Anton Grichina
Hello,
I am working with Arkady on VLAN steering. I have a few questions about it on
top of what was asked before.
7.4.8.4 "VEB/VEPA Switching Algorithm" states that filtering happens by
MAC+VLAN. It is impossible to filter by VLAN only; as I understand it, this is
a HW limitation.

XL710 has something called "7.4.8.2 S-comp Forwarding Algorithm" which looks
like exactly what we want. It forwards packets to a specific VSI based on the
S-tag.
The S-tag is part of the "7.4.9.5.5.1 Add VSI (0x0210)" command. I guess we can
configure several VSIs for the expected S-tags.
Do you know anything about this algorithm?

As I understand it, the "port virtualizer" (7.4.2.4.3 "Cascaded VEB and port
virtualizers") is supposed to be configured to use this switching algorithm.
In the i40e driver (kernel or DPDK) I do not see anything related to
configuring "port virtualizers", except the "i40e_aq_add_pvirt" function, which
is not used anywhere.
Is it possible to configure a "port virtualizer" with the existing i40e driver?

Another question is about the "i40e_aq_set_vsi_uc_promisc_on_vlan" function in
the i40e driver (7.4.9.5.9.5 "Set VSI Promiscuous Modes"). It enables
promiscuous mode for unicast packets with a specific VLAN, so all packets with
that VLAN will be replicated to the configured VSI. In the XL710 datasheet I've
found that it works only in the "Cloud VEB algorithm" (7.4.8.6). Can we somehow
enable this algorithm with the existing i40e drivers?

Thanks


Re: [dpdk-dev] [PATCH] doc: add kernel version deprecation notice

2018-02-14 Thread Olivier Matz
On Wed, Feb 14, 2018 at 12:58:34AM +0100, Thomas Monjalon wrote:
> 31/01/2018 16:27, Stephen Hemminger:
> > Notify users of upcoming change in kernel requirement.
> > Encourage users to use current LTS kernel version.
> > 
> > Signed-off-by: Stephen Hemminger 
> > ---
> > +* linux: Linux kernel version 3.2 (which is the current minimum required
> > +  version for the DPDK) will be end of life in May 2018. Therefore the planned
> > +  minimum required kernel version for DPDK 18.5 will be next oldest Long
> > +  Term Stable (LTS) version which is 3.10. The recommended kernel version is
> > +  the latest LTS kernel which currently is 4.14.
> 
> We could print a warning at EAL init if kernel version does not satisfy the
> minimal requirement.
> 
> Acked-by: Thomas Monjalon 

Acked-by: Olivier Matz 
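
The warning suggested above could look roughly like the following C sketch. It is not EAL code: parse_kernel_release, kernel_older_than and warn_if_kernel_too_old are hypothetical helpers, and only uname() and sscanf() are real library calls. It assumes the release string starts with "major.minor".

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Parse "major.minor[...]" from a release string; returns 0 on success. */
static int
parse_kernel_release(const char *release, int *major, int *minor)
{
	return sscanf(release, "%d.%d", major, minor) == 2 ? 0 : -1;
}

/* Return 1 when (major, minor) is older than (min_major, min_minor). */
static int
kernel_older_than(int major, int minor, int min_major, int min_minor)
{
	return major < min_major ||
	       (major == min_major && minor < min_minor);
}

/* Warn, but do not fail, when the running kernel is below the minimum. */
static void
warn_if_kernel_too_old(int min_major, int min_minor)
{
	struct utsname u;
	int major, minor;

	if (uname(&u) != 0 ||
	    parse_kernel_release(u.release, &major, &minor) != 0)
		return; /* version unknown: stay quiet */
	if (kernel_older_than(major, minor, min_major, min_minor))
		fprintf(stderr,
			"WARNING: kernel %s is older than the minimum %d.%d\n",
			u.release, min_major, min_minor);
}
```

A caller would invoke, for example, warn_if_kernel_too_old(3, 10) once during initialization. Because the check only warns, a distribution kernel that keeps an old version number but carries backported fixes would still be allowed to run.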


Re: [dpdk-dev] [PATCH] doc: add kernel version deprecation notice

2018-02-14 Thread Luca Boccassi
On Wed, 2018-02-14 at 00:58 +0100, Thomas Monjalon wrote:
> 31/01/2018 16:27, Stephen Hemminger:
> > Notify users of upcoming change in kernel requirement.
> > Encourage users to use current LTS kernel version.
> > 
> > Signed-off-by: Stephen Hemminger 
> > ---
> > +* linux: Linux kernel version 3.2 (which is the current minimum
> > required
> > +  version for the DPDK) will be end of life in May 2018. Therefore
> > the planned
> > +  minimum required kernel version for DPDK 18.5 will be next
> > oldest Long
> > +  Term Stable (LTS) version which is 3.10. The recommended kernel
> > version is
> > +  the latest LTS kernel which currently is 4.14.
> 
> We could print a warning at EAL init if kernel version does not
> satisfy the
> minimal requirement.
> 
> Acked-by: Thomas Monjalon 

Note that 3.10 is dead as well since last year (as I discovered with
immense joy when I had to backport meltdown fixes...), the next LTS is
3.16, which will be maintained until 04/2020.

-- 
Kind regards,
Luca Boccassi


Re: [dpdk-dev] [PATCH] usertools/dpdk-devbind.py: add support for wind river avp device

2018-02-14 Thread Bruce Richardson
On Wed, Feb 14, 2018 at 09:57:25AM +, Burakov, Anatoly wrote:
> On 14-Feb-18 12:48 AM, Zhang, Xiaohua wrote:
> > Hi Yigit and Anatoly,
> > I checked nics-17.11.pdf; the description reads as follows:
> > "The Accelerated Virtual Port (AVP) device is a shared memory based device
> > only available on virtualization platforms from Wind River Systems. The
> > Wind River Systems virtualization platform currently uses QEMU/KVM as its
> > hypervisor and as such provides support for all of the QEMU supported
> > virtual and/or emulated devices (e.g., virtio, e1000, etc.). The platform
> > offers the virtio device type as the default device when launching a
> > virtual machine or creating a virtual machine port. The AVP device is a
> > specialized device available to customers that require increased
> > throughput and decreased latency to meet the demands of their performance
> > focused applications."
> > 
> > I am afraid that just "memory_device" would be misleading.
> > Could we call it "avp device (shared memory based)"?
> > 
> > 
> 
> Hi,
> 
> Well, from the AVP PMD documentation, it seems that AVP is classified as a
> NIC. Can't we just add it to the list of NICs, even if it's not Ethernet
> class 02xx? Pattern-matching in devbind should work either way. For example,
> you can see there's "cavium_pkx" already classified as a NIC, even though its
> class is 08xx, not 02xx. So why not this one?
> 

Definite +1.

It's used for packet IO into a VM, like virtio, and its driver is in
drivers/net.

"If it looks like a NIC, and quacks like a NIC, then it probably is a
NIC". [Alternatively if it looks and quacks like a duck, I'm not sure
what it's doing in DPDK!]

/Bruce



Re: [dpdk-dev] [PATCH] doc: add kernel version deprecation notice

2018-02-14 Thread Maxime Coquelin



On 01/31/2018 04:27 PM, Stephen Hemminger wrote:

Notify users of upcoming change in kernel requirement.
Encourage users to use current LTS kernel version.

Signed-off-by: Stephen Hemminger 
---
  doc/guides/rel_notes/deprecation.rst | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index d59ad598862b..31d64b27ba17 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -59,3 +59,9 @@ Deprecation Notices
be added between the producer and consumer structures. The size of the
structure and the offset of the fields will remain the same on
platforms with 64B cache line, but will change on other platforms.
+
+* linux: Linux kernel version 3.2 (which is the current minimum required
+  version for the DPDK) will be end of life in May 2018. Therefore the planned
+  minimum required kernel version for DPDK 18.5 will be next oldest Long
+  Term Stable (LTS) version which is 3.10. The recommended kernel version is
+  the latest LTS kernel which currently is 4.14.



Acked-by: Maxime Coquelin 

Maxime


[dpdk-dev] Mac-learning pipeline

2018-02-14 Thread sharanya k
Hi all,

I want to create an ip pipeline application where the upstream learns
the source MAC address along with the port, and the downstream forwards
packets based on that learned MAC table. Can I use one of the existing
pipeline applications as-is for this MAC learning? If not, what should
I do instead?
Can anyone help me with this?

Regards,
Sharanya


Re: [dpdk-dev] [PATCH] doc: add kernel version deprecation notice

2018-02-14 Thread Thomas Monjalon
14/02/2018 11:31, Luca Boccassi:
> On Wed, 2018-02-14 at 00:58 +0100, Thomas Monjalon wrote:
> > 31/01/2018 16:27, Stephen Hemminger:
> > > Notify users of upcoming change in kernel requirement.
> > > Encourage users to use current LTS kernel version.
> > > 
> > > Signed-off-by: Stephen Hemminger 
> > > ---
> > > +* linux: Linux kernel version 3.2 (which is the current minimum
> > > required
> > > +  version for the DPDK) will be end of life in May 2018. Therefore
> > > the planned
> > > +  minimum required kernel version for DPDK 18.5 will be next
> > > oldest Long
> > > +  Term Stable (LTS) version which is 3.10. The recommended kernel
> > > version is
> > > +  the latest LTS kernel which currently is 4.14.
> > 
> > We could print a warning at EAL init if kernel version does not
> > satisfy the
> > minimal requirement.
> > 
> > Acked-by: Thomas Monjalon 
> 
> Note that 3.10 is dead as well since last year (as I discovered with
> immense joy when I had to backport meltdown fixes...), the next LTS is
> 3.16, which will be maintained until 04/2020.

I am fine with 3.16.
Can we ack 3.16, and I make the change when applying?


Re: [dpdk-dev] [PATCH] doc: add kernel version deprecation notice

2018-02-14 Thread Maxime Coquelin



On 02/14/2018 11:31 AM, Luca Boccassi wrote:

On Wed, 2018-02-14 at 00:58 +0100, Thomas Monjalon wrote:

31/01/2018 16:27, Stephen Hemminger:

Notify users of upcoming change in kernel requirement.
Encourage users to use current LTS kernel version.

Signed-off-by: Stephen Hemminger 
---
+* linux: Linux kernel version 3.2 (which is the current minimum
required
+  version for the DPDK) will be end of life in May 2018. Therefore
the planned
+  minimum required kernel version for DPDK 18.5 will be next
oldest Long
+  Term Stable (LTS) version which is 3.10. The recommended kernel
version is
+  the latest LTS kernel which currently is 4.14.


We could print a warning at EAL init if kernel version does not
satisfy the
minimal requirement.

Acked-by: Thomas Monjalon 


Note that 3.10 is dead as well since last year (as I discovered with
immense joy when I had to backport meltdown fixes...), the next LTS is
3.16, which will be maintained until 04/2020.



In this case we should differentiate upstream kernel versions from
downstream ones. For example, RHEL7/CentOS7 are based on v3.10 and are
still maintained.


Re: [dpdk-dev] Multi-driver support for Fortville

2018-02-14 Thread Nitin Katiyar
Hi Beilei,
Thanks for answering the queries. We have been referring to the following patches:
https://dpdk.org/dev/patchwork/patch/34945/
https://dpdk.org/dev/patchwork/patch/34946/
https://dpdk.org/dev/patchwork/patch/34947/
https://dpdk.org/dev/patchwork/patch/34948/

Are these the final versions, and have they been merged into the DPDK branch?
If not, where can I find the latest patches?

Regards,
Nitin




-Original Message-
From: Xing, Beilei [mailto:beilei.x...@intel.com] 
Sent: Wednesday, February 14, 2018 6:50 AM
To: Nitin Katiyar ; dev@dpdk.org
Cc: Venkatesan Pradeep 
Subject: RE: Multi-driver support for Fortville

Hi Nitin,

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Nitin Katiyar
> Sent: Tuesday, February 13, 2018 11:48 AM
> To: dev@dpdk.org
> Cc: Venkatesan Pradeep 
> Subject: [dpdk-dev] Multi-driver support for Fortville
> 
> Hi,
> Resending the queries with change in subject line.
> 1) With these patches, we have 2 different values for some of the 
> global registers depending upon whether single driver or multi-driver 
> is using all ports of the NIC. Does it impact any 
> functionality/performance if we use DPDK drivers in single driver vs 
> multi-driver support?

Yes. When multi-driver support is enabled:
For functionality, some configurations are not supported, including flow
director flexible payload, RSS input set, RSS bit mask, hash function,
symmetric hash, FDIR input set, TPID, flow control watermark, GRE tunnel key
length configuration, QinQ parser and QinQ cloud filter support.
For performance, the PF uses INT0 instead of INTN when multi-driver support is
enabled, so many interrupts will cost CPU cycles while receiving packets.

> 2) Why can't we have same settings for both the cases? i.e 
> Unconditionally programming the global registers in DPDK driver with 
> the same values as in Kernel driver. That way we don't have to care for extra 
> parameter.

The reason is the same as above.

> 3) Does this issue need any update for kernel driver also?

As far as I know, there's no need to update the kernel driver.

> 
> Regards,
> Nitin
> 
> -Original Message-
> From: Nitin Katiyar
> Sent: Monday, February 12, 2018 11:32 AM
> To: dev@dpdk.org
> Cc: Venkatesan Pradeep 
> Subject: RE: dev Digest, Vol 180, Issue 152
> 
> Hi Beilei,
> I was looking at the patches and have few queries regarding 
> support-multi-driver.
> 1) With these patches, we have 2 different values for some of the 
> global registers depending upon whether single driver or multi-driver 
> is using all ports of the NIC. Does it impact any 
> functionality/performance if we use DPDK drivers in single driver vs 
> multi-driver support?
> 2) Why can't we have same settings for both the cases? That way we 
> don't have to care for extra parameter.
> 3) Does this issue need any update for kernel driver also?
> 
> 
> Regards,
> Nitin
> 
> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of 
> dev-requ...@dpdk.org
> Sent: Friday, February 02, 2018 5:55 PM
> To: dev@dpdk.org
> Subject: dev Digest, Vol 180, Issue 152
> 
> 
> Today's Topics:
> 
>1. [PATCH v3 2/4] net/i40e: add debug logs when writing global
>   registers (Beilei Xing)
>2. [PATCH v3 3/4] net/i40e: fix multiple driver support issue
>   (Beilei Xing)
>3. [PATCH v3 4/4] net/i40e: fix interrupt conflict when using
>   multi-driver (Beilei Xing)
> 
> 
> --
> 
> Message: 1
> Date: Fri,  2 Feb 2018 20:25:08 +0800
> From: Beilei Xing 
> To: dev@dpdk.org, jingjing...@intel.com
> Cc: sta...@dpdk.org
> Subject: [dpdk-dev] [PATCH v3 2/4] net/i40e: add debug logs when
>   writing global registers
> Message-ID: <1517574310-93096-3-git-send-email-beilei.x...@intel.com>
> 
> Add debug logs when writing global registers.
> 
> Signed-off-by: Beilei Xing 
> Cc: sta...@dpdk.org
> ---
>  drivers/net/i40e/i40e_ethdev.c | 127
> +
>  drivers/net/i40e/i40e_ethdev.h |   8 +++
>  2 files changed, 87 insertions(+), 48 deletions(-)
> 
> diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
> index 44821f2..ef23241 100644
> --- a/drivers/net/i40e/i40e_ethdev.c
> +++ b/drivers/net/i40e/i40e_ethdev.c
> @@ -716,6 +716,15 @@ rte_i40e_dev_atomic_write_link_status(struct rte_eth_dev *dev,
>   return 0;
>  }
> 
> +static inline void
> +i40e_write_global_rx_ctl(struct i40e_hw *hw, u32 reg_addr, u32 reg_val)
> +{
> + i40e_write_rx_ctl(hw, reg_addr, reg

Re: [dpdk-dev] [PATCH] doc: add kernel version deprecation notice

2018-02-14 Thread Luca Boccassi
On Wed, 2018-02-14 at 11:38 +0100, Maxime Coquelin wrote:
> 
> On 02/14/2018 11:31 AM, Luca Boccassi wrote:
> > On Wed, 2018-02-14 at 00:58 +0100, Thomas Monjalon wrote:
> > > 31/01/2018 16:27, Stephen Hemminger:
> > > > Notify users of upcoming change in kernel requirement.
> > > > Encourage users to use current LTS kernel version.
> > > > 
> > > > Signed-off-by: Stephen Hemminger 
> > > > ---
> > > > +* linux: Linux kernel version 3.2 (which is the current
> > > > minimum
> > > > required
> > > > +  version for the DPDK) will be end of life in May 2018.
> > > > Therefore
> > > > the planned
> > > > +  minimum required kernel version for DPDK 18.5 will be next
> > > > oldest Long
> > > > +  Term Stable (LTS) version which is 3.10. The recommended
> > > > kernel
> > > > version is
> > > > +  the latest LTS kernel which currently is 4.14.
> > > 
> > > We could print a warning at EAL init if kernel version does not
> > > satisfy the
> > > minimal requirement.
> > > 
> > > Acked-by: Thomas Monjalon 
> > 
> > Note that 3.10 is dead as well since last year (as I discovered
> > with immense joy when I had to backport meltdown fixes...), the
> > next LTS is 3.16, which will be maintained until 04/2020.
> > 
> 
> In this case we should differentiate upstream kernel versions from
> downstream ones. For example, RHEL7/CentOS7 are based on v3.10 and
> are still maintained.

Ubuntu does 3.13 as well - I think the problem is that if we want to
support distro-specific LTS kernel versions, we need volunteers to do
the work for them :-)

-- 
Kind regards,
Luca Boccassi


Re: [dpdk-dev] [PATCH v1] doc: update deprecation notice of rte_devargs

2018-02-14 Thread Thomas Monjalon
14/02/2018 00:51, Thomas Monjalon:
> 13/02/2018 12:26, Ferruh Yigit:
> > On 2/7/2018 12:41 PM, Shreyansh Jain wrote:
> > > On Wednesday 07 February 2018 02:56 PM, Gaetan Rivet wrote:
> > >> The declaration and identification of devices will change in v18.05.
> > >>
> > >> Remove the precedent deprecation notice
> > >>
> > >> Add new one reflecting the planned changes more accurately,
> > >> updated for v18.05.
> > >>
> > >> Signed-off-by: Gaetan Rivet 
> > > 
> > > Acked-By: Shreyansh Jain 
> > 
> > Acked-by: Ferruh Yigit 
> 
> Acked-by: Thomas Monjalon 

Applied



Re: [dpdk-dev] [PATCH 1/1] doc: announce API change to lcore role function

2018-02-14 Thread Thomas Monjalon
14/02/2018 01:09, Thomas Monjalon:
> 12/01/2018 21:45, Erik Gabriel Carrillo:
> > This is an API/ABI change notice for DPDK 18.05 announcing a change in
> > the meaning of the return values of the rte_lcore_has_role() function.
> > 
> > Signed-off-by: Erik Gabriel Carrillo 
> > ---
> > +* eal: The semantics of the return value for the ``rte_lcore_has_role`` function
> > +  are planned to change in v18.05. The function currently returns 0 and <0 for
> > +  success and failure, respectively.  This will change to 1 and 0 for true and
> > +  false, respectively, to make use of the function more intuitive.
> 
> It will introduce some subtle bugs in applications.
> We must clearly advertise this API change in the release notes.
> 
> Acked-by: Thomas Monjalon 

Applied
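
To make the "subtle bugs" concern concrete, the following C sketch contrasts the two return conventions. The functions here are stand-ins written for illustration only; they are not the real rte_lcore_has_role() and take a simplified argument.

```c
/*
 * Stand-ins for the two behaviours described in the notice.
 * The argument says whether the lcore really has the role.
 */
static int
has_role_old(int lcore_has_it)
{
	return lcore_has_it ? 0 : -1; /* current: 0 on success, <0 on failure */
}

static int
has_role_new(int lcore_has_it)
{
	return lcore_has_it ? 1 : 0; /* planned: 1 for true, 0 for false */
}

/* Caller written against the current semantics: 0 means "has the role". */
static int
caller_old_style(int ret)
{
	return ret == 0;
}

/* Caller written against the planned semantics: non-zero means "has it". */
static int
caller_new_style(int ret)
{
	return ret != 0;
}
```

A caller that keeps the old `== 0` check compiles unchanged against the new behaviour but gets the opposite answer in both cases, which is why the change needs to be advertised clearly.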



Re: [dpdk-dev] [PATCH v2] net/tap: fix promiscuous rules double insersions

2018-02-14 Thread Ophir Munk
Please see inline.
I will send an updated v3.

> -Original Message-
> From: Pascal Mazon [mailto:pascal.ma...@6wind.com]
> Sent: Wednesday, February 14, 2018 10:51 AM
> To: Ophir Munk ; dev@dpdk.org
> Cc: Thomas Monjalon ; Olga Shern
> ; sta...@dpdk.org
> Subject: Re: [PATCH v2] net/tap: fix promiscuous rules double insersions
> 
> Hi Ophir,
> 
> Typo in title: s/insersions/insertions/
> 

Fixed in v3

> I'm ok on principle, I have just a few comments inline.
> 
> Regards,
> Pascal
> 
> On 13/02/2018 19:35, Ophir Munk wrote:
> > Running testpmd command "port stop all" followed by command "port
> > start all" may result in a TAP error:
> > PMD: Kernel refused TC filter rule creation (17): File exists
> >
> > Root cause analysis: during the execution of "port start all" command
> > testpmd calls rte_eth_promiscuous_enable() while during the execution
> > of "port stop all" command testpmd does not call
> > rte_eth_promiscuous_enable().
> Shouldn't it be rte_eth_promiscuous_disable()?

Yes it should. Fixed in v3

> > As a result the TAP PMD is trying to add tc (traffic control command)
> > promiscuous rules to the remote netvsc device consecutively. From the
> > kernel point of view it is seen as an attempt to add the same rule
> > more than once. In recent kernels (e.g. version 4.13) this attempt is
> > rejected with a "File exists" error. In less recent kernels (e.g.
> > version 4.4) the same rule may have been accepted twice successfully,
> which is undesirable.
> >
> > In the corrupted code every tc promiscuous rule included a different
> > handle number parameter. If instead an identical handle number
> > parameter is used for all tc promiscuous rules - all kernels will
> > reject the second rule with a "File exists" error, which is easy to
> > identify and to silently ignore.
> >
> > Fixes: 2bc06869cd94 ("net/tap: add remote netdevice traffic capture")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Ophir Munk 
> > ---
> > v2: add detailed commit message
> >
> >  drivers/net/tap/tap_flow.c | 11 +++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/net/tap/tap_flow.c b/drivers/net/tap/tap_flow.c
> > index 65657f0..d1f4a52 100644
> > --- a/drivers/net/tap/tap_flow.c
> > +++ b/drivers/net/tap/tap_flow.c
> > @@ -123,6 +123,7 @@ enum key_status_e {  };
> >
> >  #define ISOLATE_HANDLE 1
> > +#define REMOTE_PROMISCUOUS_HANDLE 2
> >
> >  struct rte_flow {
> > LIST_ENTRY(rte_flow) next; /* Pointer to the next rte_flow structure */
> > @@ -1692,9 +1693,15 @@ int tap_flow_implicit_create(struct pmd_internals *pmd,
> >  * The ISOLATE rule is always present and must have a static handle, as
> >  * the action is changed whether the feature is enabled (DROP) or
> >  * disabled (PASSTHRU).
> > +* There is just one REMOTE_PROMISCUOUS rule in all cases. It should
> > +* have a static handle such that adding it twice will fail with EEXIST
> > +* with any kernel version. Remark: old kernels may falsely accept the
> > +* same REMOTE_PREMISCUOUS rules if they had different handles.
> s/PREMISCUOUS/PROMISCUOUS/
> >  */
> > if (idx == TAP_ISOLATE)
> > remote_flow->msg.t.tcm_handle = ISOLATE_HANDLE;
> > +   else if (idx == TAP_REMOTE_PROMISC)
> > +   remote_flow->msg.t.tcm_handle = REMOTE_PROMISCUOUS_HANDLE;
> > else
> > tap_flow_set_handle(remote_flow);
> > if (priv_flow_process(pmd, attr, items, actions, NULL,
> > @@ -1709,12 +1716,16 @@ int tap_flow_implicit_create(struct pmd_internals *pmd,
> > }
> > err = tap_nl_recv_ack(pmd->nlsk_fd);
> > if (err < 0) {
> > +   /* Silently ignore re-entering remote promiscuous rule */
> > +   if (errno == EEXIST && idx == TAP_REMOTE_PROMISC)
> > +   goto success;
> > RTE_LOG(ERR, PMD,
> > "Kernel refused TC filter rule creation (%d): %s\n",
> > errno, strerror(errno));
> > goto fail;
> > }
> > LIST_INSERT_HEAD(&pmd->implicit_flows, remote_flow, next);
> Are we sure the previous rule is still in the registered implicit flows?

I will run tests to verify that.

> > +success:
> > return 0;
> >  fail:
> > if (remote_flow)



[dpdk-dev] [PATCH v3] net/tap: fix promiscuous rules double insertions

2018-02-14 Thread Ophir Munk
Running testpmd command "port stop all" followed by command "port start
all" may result in a TAP error:
PMD: Kernel refused TC filter rule creation (17): File exists

Root cause analysis: during the execution of "port start all" command
testpmd calls rte_eth_promiscuous_enable() while during the execution
of "port stop all" command testpmd does not call
rte_eth_promiscuous_disable().
As a result the TAP PMD is trying to add tc (traffic control command)
promiscuous rules to the remote netvsc device consecutively. From the
kernel point of view it is seen as an attempt to add the same rule more
than once. In recent kernels (e.g. version 4.13) this attempt is rejected
with a "File exists" error. In less recent kernels (e.g. version 4.4) the
same rule may have been successfully accepted twice, which is undesirable.

In the corrupted code every tc promiscuous rule included a different
handle number parameter. If instead an identical handle number is
used for all tc promiscuous rules - all kernels will reject the second
identical rule with a "File exists" error, which is easy to identify and
to silently ignore.

Fixes: 2bc06869cd94 ("net/tap: add remote netdevice traffic capture")
Cc: sta...@dpdk.org

Signed-off-by: Ophir Munk 
---
v1: initial version
v2: add detailed commit message
v3: textual fixes to commit message and code comments

 drivers/net/tap/tap_flow.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/tap/tap_flow.c b/drivers/net/tap/tap_flow.c
index 65657f0..551b2d8 100644
--- a/drivers/net/tap/tap_flow.c
+++ b/drivers/net/tap/tap_flow.c
@@ -123,6 +123,7 @@ enum key_status_e {
 };
 
 #define ISOLATE_HANDLE 1
+#define REMOTE_PROMISCUOUS_HANDLE 2
 
 struct rte_flow {
LIST_ENTRY(rte_flow) next; /* Pointer to the next rte_flow structure */
@@ -1692,9 +1693,15 @@ int tap_flow_implicit_create(struct pmd_internals *pmd,
 * The ISOLATE rule is always present and must have a static handle, as
 * the action is changed whether the feature is enabled (DROP) or
 * disabled (PASSTHRU).
+* There is just one REMOTE_PROMISCUOUS rule in all cases. It should
+* have a static handle such that adding it twice will fail with EEXIST
+* with any kernel version. Remark: old kernels may falsely accept the
+* same REMOTE_PROMISCUOUS rules if they had different handles.
 */
if (idx == TAP_ISOLATE)
remote_flow->msg.t.tcm_handle = ISOLATE_HANDLE;
+   else if (idx == TAP_REMOTE_PROMISC)
+   remote_flow->msg.t.tcm_handle = REMOTE_PROMISCUOUS_HANDLE;
else
tap_flow_set_handle(remote_flow);
if (priv_flow_process(pmd, attr, items, actions, NULL,
@@ -1709,12 +1716,16 @@ int tap_flow_implicit_create(struct pmd_internals *pmd,
}
err = tap_nl_recv_ack(pmd->nlsk_fd);
if (err < 0) {
+   /* Silently ignore re-entering remote promiscuous rule */
+   if (errno == EEXIST && idx == TAP_REMOTE_PROMISC)
+   goto success;
RTE_LOG(ERR, PMD,
"Kernel refused TC filter rule creation (%d): %s\n",
errno, strerror(errno));
goto fail;
}
LIST_INSERT_HEAD(&pmd->implicit_flows, remote_flow, next);
+success:
return 0;
 fail:
if (remote_flow)
-- 
2.7.4
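
The idea in the commit message above (a fixed handle makes the second insert
fail with EEXIST, and that one error is then ignored) can be sketched in
isolation. This is only an illustration under stated assumptions: toy_kernel,
toy_kernel_add() and toy_promisc_enable() are hypothetical names that do not
exist in the TAP PMD, and the "kernel" here is just an in-memory table.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_HANDLES 8
#define TOY_PROMISC_HANDLE 2 /* fixed handle, same value as in the patch */

/* Hypothetical stand-in for the kernel's tc filter table. */
struct toy_kernel {
	bool used[TOY_MAX_HANDLES];
};

/* Mimics a kernel that refuses a second rule with the same handle. */
static int
toy_kernel_add(struct toy_kernel *k, unsigned int handle)
{
	assert(k != NULL && handle < TOY_MAX_HANDLES);
	if (k->used[handle]) {
		errno = EEXIST;
		return -1;
	}
	k->used[handle] = true;
	return 0;
}

/* Driver side: a duplicate promiscuous rule is not treated as an error. */
static int
toy_promisc_enable(struct toy_kernel *k)
{
	if (toy_kernel_add(k, TOY_PROMISC_HANDLE) < 0 && errno != EEXIST)
		return -1;
	return 0;
}
```

Calling toy_promisc_enable() twice on the same table succeeds both times,
while the table still holds a single rule. With a different handle per call
the duplicate would go unnoticed, which is the behaviour the patch removes.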



Re: [dpdk-dev] [PATCH] doc: add kernel version deprecation notice

2018-02-14 Thread Bruce Richardson
On Wed, Feb 14, 2018 at 10:54:44AM +, Luca Boccassi wrote:
> On Wed, 2018-02-14 at 11:38 +0100, Maxime Coquelin wrote:
> > 
> > On 02/14/2018 11:31 AM, Luca Boccassi wrote:
> > > On Wed, 2018-02-14 at 00:58 +0100, Thomas Monjalon wrote:
> > > > 31/01/2018 16:27, Stephen Hemminger:
> > > > > Notify users of upcoming change in kernel requirement.
> > > > > Encourage users to use current LTS kernel version.
> > > > > 
> > > > > Signed-off-by: Stephen Hemminger 
> > > > > > ---
> > > > > > +* linux: Linux kernel version 3.2 (which is the current minimum required
> > > > > > +  version for the DPDK) will be end of life in May 2018.  Therefore the planned
> > > > > > +  minimum required kernel version for DPDK 18.5 will be next oldest Long
> > > > > > +  Term Stable (LTS) version which is 3.10. The recommended kernel version is
> > > > > > +  the latest LTS kernel which currently is 4.14.
> > > > 
> > > > We could print a warning at EAL init if kernel version does not
> > > > satisfy the minimal requirement.
> > > > 
> > > > Acked-by: Thomas Monjalon 
> > > 
> > > Note that 3.10 is dead as well since last year (as I discovered
> > > with immense joy when I had to backport meltdown fixes...), the
> > > > next LTS is 3.16 which will be maintained until 04/2020.
> > > 
> > 
> > In this case we should differentiate upstream Kernel versions from
> > > downstream ones. For example, RHEL7/CentOS7 are based on v3.10 and
> > still maintained.
> 
> Ubuntu does 3.13 as well - I think the problem is that if we want to
> support distro-specific LTS kernel versions, we need volunteers to do
> the work for them :-)
> 
> -- 
I think our kernel support plans need to be two-fold:

1) we need to support a minimum "kernel.org" kernel version, which is
what the deprecation notice is about.
2) we also will be supporting LTS distributions, e.g. I would expect us to
always support the latest RHEL, so that should be noted explicitly in the
GSG IMHO.

/Bruce
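
The warning at EAL init suggested earlier in this thread would need a version
comparison along the following lines. This is a minimal sketch, not EAL code:
toy_kernel_at_least() is a hypothetical helper, and it assumes the release
string has the usual "major.minor..." form reported in the release field of
uname(2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Return true if a release string such as "3.10.0-693.el7" names a kernel
 * at least as new as min_major.min_minor. Unparseable strings return false.
 */
static bool
toy_kernel_at_least(const char *release, int min_major, int min_minor)
{
	int major, minor;

	assert(release != NULL);
	if (sscanf(release, "%d.%d", &major, &minor) != 2)
		return false;
	if (major != min_major)
		return major > min_major;
	return minor >= min_minor;
}
```

Note that such a check only sees the upstream version number, so a
distribution kernel with backports (the RHEL7 case mentioned above) would be
judged by its 3.10 base and nothing else.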



Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()

2018-02-14 Thread Ananyev, Konstantin
Hi Yongseok,

> > On Feb 13, 2018, at 2:45 PM, Yongseok Koh  wrote:
> >
> > Hi Olivier
> >
> > I'm wondering why rte_pktmbuf_prefree_seg() checks m->next instead of
> > m->nb_segs? As 'next' is in the 2nd cacheline, checking nb_segs seems
> > beneficial to the cases where almost all mbufs have a single segment.
> >
> > A customer reported a high rate of cache misses in the code and I thought the
> > following patch could be helpful. I haven't had them try it yet but just
> > wanted to hear from you.
> >
> > I'd appreciate if you can review this idea.
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 62740254d..96edbcb9e 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -1398,7 +1398,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> >if (RTE_MBUF_INDIRECT(m))
> >rte_pktmbuf_detach(m);
> >
> > -   if (m->next != NULL) {
> > +   if (m->nb_segs > 1) {
> >m->next = NULL;
> >m->nb_segs = 1;
> >}
> > @@ -1410,7 +1410,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> >if (RTE_MBUF_INDIRECT(m))
> >rte_pktmbuf_detach(m);
> >
> > -   if (m->next != NULL) {
> > +   if (m->nb_segs > 1) {
> >m->next = NULL;
> >m->nb_segs = 1;
> >}
> 
> Well, m->pool in the 2nd cacheline has to be accessed anyway in order to put
> it back to the mempool.
> It looks like the cache miss is unavoidable.

As a thought: in theory the PMD can store the pool pointer together with each
mbuf it has to free, then it could be something like:

if (rte_pktmbuf_prefree_seg(m[x]) != NULL)
   rte_mempool_put(pool[x], m[x]);

Then what you suggested above might help.
Konstantin
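
A self-contained sketch of that idea, for illustration only. It does not use
the DPDK API: toy_mbuf, toy_pool, toy_prefree_seg() and toy_tx_free() are
hypothetical stand-ins, and the refcount handling is reduced to the bare
minimum. The point is that the free loop takes the pool from the driver's own
array and never reads m->pool.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pool {
	unsigned int free_count;
};

struct toy_mbuf {
	unsigned int refcnt;
	struct toy_pool *pool; /* imagine this living in the 2nd cacheline */
};

/* Stand-in for prefree: returns the mbuf if it can be recycled now. */
static struct toy_mbuf *
toy_prefree_seg(struct toy_mbuf *m)
{
	assert(m != NULL && m->refcnt > 0);
	return (--m->refcnt == 0) ? m : NULL;
}

/* Stand-in for returning an object to its pool. */
static void
toy_pool_put(struct toy_pool *p, struct toy_mbuf *m)
{
	(void)m;
	p->free_count++;
}

/* TX completion: the pool comes from the driver's array, not from m->pool. */
static void
toy_tx_free(struct toy_mbuf **m, struct toy_pool **pool, unsigned int n)
{
	unsigned int x;

	for (x = 0; x < n; x++)
		if (toy_prefree_seg(m[x]) != NULL)
			toy_pool_put(pool[x], m[x]);
}
```

The cost is one extra pointer per TX descriptor slot in the driver's software
ring, which is the trade-off behind "in theory" above.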



[dpdk-dev] [PATCH v2] net/i40e: fix link_state update for i40e_ethdev_vf drv

2018-02-14 Thread Tushar Mulkar
The boolean check was taking unwanted bits into account when computing the
truth value. In DPDK, bool is a typedef of unsigned int, but only the Least
Significant Bit (LSB) is meaningful. The conditional expression, however,
evaluated all bits, and bits other than the LSB sometimes hold values, which
leads to incorrect results. To fix this, only the LSB of the bool value is
taken into account, which is easily done by ANDing the value with true.

Signed-off-by: Tushar Mulkar 
---
 drivers/net/i40e/i40e_ethdev_vf.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index b96d77a0c..d23dff044 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -2095,8 +2095,8 @@ i40evf_dev_link_update(struct rte_eth_dev *dev,
}
/* full duplex only */
new_link.link_duplex = ETH_LINK_FULL_DUPLEX;
-   new_link.link_status = vf->link_up ? ETH_LINK_UP :
-ETH_LINK_DOWN;
+   new_link.link_status = (vf->link_up & true) ? 
+ETH_LINK_UP : ETH_LINK_DOWN;
new_link.link_autoneg =
dev->data->dev_conf.link_speeds & ETH_LINK_SPEED_FIXED;
 
--
2.11.0
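
The difference between the two forms of the condition can be shown outside
the driver. This is a sketch under assumptions: toy_bool is a hypothetical
typedef standing in for a "bool" that is really an unsigned int, and the
TOY_LINK_* values are placeholders, not the ethdev constants.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_LINK_DOWN 0
#define TOY_LINK_UP   1

/* Hypothetical: a "bool" that is really a full-width unsigned int. */
typedef unsigned int toy_bool;

/* Masking with true only isolates the LSB because true is 1. */
static_assert(true == 1, "true must be 1 for the mask to select the LSB");

/* Any nonzero bit makes the condition true. */
static int
toy_status_plain(toy_bool link_up)
{
	return link_up ? TOY_LINK_UP : TOY_LINK_DOWN;
}

/* Only the least significant bit is considered, as in the patch. */
static int
toy_status_masked(toy_bool link_up)
{
	return (link_up & true) ? TOY_LINK_UP : TOY_LINK_DOWN;
}
```

For a value such as 0x2 the plain form reports the link as up while the
masked form reports it as down. Whether stray upper bits can actually reach
vf->link_up is a separate question that this sketch does not answer.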



Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()

2018-02-14 Thread Ananyev, Konstantin


> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ananyev, Konstantin
> Sent: Wednesday, February 14, 2018 11:48 AM
> To: Yongseok Koh ; Olivier Matz 
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()
> 
> Hi Yongseok,
> 
> > > On Feb 13, 2018, at 2:45 PM, Yongseok Koh  wrote:
> > >
> > > Hi Olivier
> > >
> > > I'm wondering why rte_pktmbuf_prefree_seg() checks m->next instead of
> > > m->nb_segs? As 'next' is in the 2nd cacheline, checking nb_segs seems
> > > beneficial to the cases where almost all mbufs have a single segment.
> > >
> > > A customer reported a high rate of cache misses in the code and I thought
> > > the following patch could be helpful. I haven't had them try it yet but
> > > just wanted to hear from you.
> > >
> > > I'd appreciate if you can review this idea.
> > >
> > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > index 62740254d..96edbcb9e 100644
> > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > @@ -1398,7 +1398,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > >if (RTE_MBUF_INDIRECT(m))
> > >rte_pktmbuf_detach(m);
> > >
> > > -   if (m->next != NULL) {
> > > +   if (m->nb_segs > 1) {
> > >m->next = NULL;
> > >m->nb_segs = 1;
> > >}
> > > @@ -1410,7 +1410,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > >if (RTE_MBUF_INDIRECT(m))
> > >rte_pktmbuf_detach(m);
> > >
> > > -   if (m->next != NULL) {
> > > +   if (m->nb_segs > 1) {
> > >m->next = NULL;
> > >m->nb_segs = 1;
> > >}
> >
> > Well, m->pool in the 2nd cacheline has to be accessed anyway in order to
> > put it back to the mempool.
> > It looks like the cache miss is unavoidable.
> 
> As a thought: in theory the PMD can store the pool pointer together with each
> mbuf it has to free, then it could be something like:
>
> if (rte_pktmbuf_prefree_seg(m[x]) != NULL)
>    rte_mempool_put(pool[x], m[x]);
> 
> Then what you suggested above might help.

After another thought - we have to check m->next, not m->nb_segs.
There could be situations where nb_segs == 1 but m->next != NULL
(the 2nd segment of a 3-segment packet, for example).
So we probably have to keep it as it is.
Sorry for the noise
Konstantin
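
The case described above can be reproduced with a toy chain. This is only an
illustration: toy_mbuf and the helpers are hypothetical, and the sketch
assumes, as the message does, that only the first segment of a packet carries
the real segment count while the others keep nb_segs at 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_mbuf {
	uint16_t nb_segs;       /* meaningful only in the first segment */
	struct toy_mbuf *next;
};

/* Link segs[0..n-1] into one packet; only the head carries the count. */
static void
toy_chain(struct toy_mbuf *segs, uint16_t n)
{
	uint16_t i;

	assert(segs != NULL && n > 0);
	for (i = 0; i < n; i++) {
		segs[i].nb_segs = 1;
		segs[i].next = (i + 1 < n) ? &segs[i + 1] : NULL;
	}
	segs[0].nb_segs = n;
}

/* The two candidate "needs reset" tests from the thread. */
static int
toy_check_next(const struct toy_mbuf *m)
{
	return m->next != NULL;
}

static int
toy_check_nb_segs(const struct toy_mbuf *m)
{
	return m->nb_segs > 1;
}
```

For a 3-segment chain the two checks agree on the first and last segments but
disagree on the middle one: its next pointer is set while its nb_segs is 1,
so a free routine testing nb_segs would leave a stale next pointer behind.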



Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()

2018-02-14 Thread Bruce Richardson
On Wed, Feb 14, 2018 at 12:03:55PM +, Ananyev, Konstantin wrote:
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ananyev, Konstantin
> > Sent: Wednesday, February 14, 2018 11:48 AM
> > To: Yongseok Koh ; Olivier Matz 
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()
> > 
> > Hi Yongseok,
> > 
> > > > On Feb 13, 2018, at 2:45 PM, Yongseok Koh  wrote:
> > > >
> > > > Hi Olivier
> > > >
> > > > I'm wondering why rte_pktmbuf_prefree_seg() checks m->next instead of
> > > > m->nb_segs? As 'next' is in the 2nd cacheline, checking nb_segs seems
> > > > beneficial to the cases where almost all mbufs have a single segment.
> > > >
> > > > A customer reported a high rate of cache misses in the code and I thought
> > > > the following patch could be helpful. I haven't had them try it yet but
> > > > just wanted to hear from you.
> > > >
> > > > I'd appreciate if you can review this idea.
> > > >
> > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > > index 62740254d..96edbcb9e 100644
> > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > @@ -1398,7 +1398,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > >if (RTE_MBUF_INDIRECT(m))
> > > >rte_pktmbuf_detach(m);
> > > >
> > > > -   if (m->next != NULL) {
> > > > +   if (m->nb_segs > 1) {
> > > >m->next = NULL;
> > > >m->nb_segs = 1;
> > > >}
> > > > @@ -1410,7 +1410,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > >if (RTE_MBUF_INDIRECT(m))
> > > >rte_pktmbuf_detach(m);
> > > >
> > > > -   if (m->next != NULL) {
> > > > +   if (m->nb_segs > 1) {
> > > >m->next = NULL;
> > > >m->nb_segs = 1;
> > > >}
> > >
> > > Well, m->pool in the 2nd cacheline has to be accessed anyway in order to
> > > put it back to the mempool.
> > > It looks like the cache miss is unavoidable.
> > 
> > As a thought: in theory the PMD can store the pool pointer together with
> > each mbuf it has to free, then it could be something like:
> >
> > if (rte_pktmbuf_prefree_seg(m[x]) != NULL)
> >    rte_mempool_put(pool[x], m[x]);
> > 
> > Then what you suggested above might help.
> 
> After another thought - we have to check m->next, not m->nb_segs.
> There could be situations where nb_segs == 1 but m->next != NULL
> (the 2nd segment of a 3-segment packet, for example).
> So probably we have to keep it as it is.
> Sorry for the noise
> Konstantin

It's still worth considering as an option. We could check nb_segs for
the first segment of a packet and thereafter iterate using the next
pointer. It means that your idea of storing the pool pointer for each
mbuf becomes useful for single-segment packets.

/Bruce


[dpdk-dev] [PATCH v1] doc: update release notes for 18.02

2018-02-14 Thread John McNamara
Fix grammar, spelling and formatting of DPDK 18.02 release notes.

Signed-off-by: John McNamara 
---
 doc/guides/rel_notes/release_18_02.rst | 194 +++--
 1 file changed, 64 insertions(+), 130 deletions(-)

diff --git a/doc/guides/rel_notes/release_18_02.rst b/doc/guides/rel_notes/release_18_02.rst
index 04202ba..fa41207 100644
--- a/doc/guides/rel_notes/release_18_02.rst
+++ b/doc/guides/rel_notes/release_18_02.rst
@@ -41,7 +41,7 @@ New Features
  Also, make sure to start the actual text at the margin.
  =
 
-* **Add function to allow releasing internal EAL resources on exit**
+* **Added function to allow releasing internal EAL resources on exit.**
 
   During ``rte_eal_init()`` EAL allocates memory from hugepages to enable its
   core libraries to perform their tasks. The ``rte_eal_cleanup()`` function
@@ -50,32 +50,12 @@ New Features
   exiting. Not calling this function could result in leaking hugepages, leading
   to failure during initialization of secondary processes.
 
-* **Added the ixgbe ethernet driver to support RSS with flow API.**
+* **Added igb, ixgbe and i40e ethernet driver to support RSS with flow API.**
 
-  Rte_flow actually defined to include RSS, but till now, RSS is out of
-  rte_flow. This patch is to support igb and ixgbe NIC with existing RSS
-  configuration using rte_flow API.
+  Added support for igb, ixgbe and i40e NICs with existing RSS configuration
+  using the ``rte_flow`` API.
 
-* **Add MAC loopback support for i40e.**
-
-  Add MAC loopback support for i40e in order to support test task asked by
-  users. According to the device configuration, it will setup TX->RX loopback
-  link or not.
-
-* **Add the support of run time determination of number of queues per i40e VF**
-
-  The number of queue per VF is determined by its host PF. If the PCI address
-  of an i40e PF is :bb.cc, the number of queues per VF can be configured
-  with EAL parameter like -w :bb.cc,queue-num-per-vf=n. The value n can be
-  1, 2, 4, 8 or 16. If no such parameter is configured, the number of queues
-  per VF is 4 by default.
-
-* **Added the i40e ethernet driver to support RSS with flow API.**
-
-  Rte_flow actually defined to include RSS, but till now, RSS is out of
-  rte_flow. This patch is to support i40e NIC with existing RSS
-  configuration using rte_flow API.It also enable queue region configuration
-  using flow API for i40e.
+  Also enabled queue region configuration using the ``rte_flow`` API for i40e.
 
 * **Updated i40e driver to support PPPoE/PPPoL2TP.**
 
@@ -83,6 +63,20 @@ New Features
   profiles which can be programmed by dynamic device personalization (DDP)
   process.
 
+* **Added MAC loopback support for i40e.**
+
+  Added MAC loopback support for i40e in order to support test tasks requested
+  by users. It will setup ``Tx -> Rx`` loopback link according to the device
+  configuration.
+
+* **Added support of run time determination of number of queues per i40e VF.**
+
+  The number of queue per VF is determined by its host PF. If the PCI address
+  of an i40e PF is ``:bb.cc``, the number of queues per VF can be
+  configured with EAL parameter like ``-w :bb.cc,queue-num-per-vf=n``. The
+  value n can be 1, 2, 4, 8 or 16. If no such parameter is configured, the
+  number of queues per VF is 4 by default.
+
 * **Updated mlx5 driver.**
 
   Updated the mlx5 driver including the following changes:
@@ -117,16 +111,10 @@ New Features
   * Added tunneled packets classification.
   * Added inner checksum offload.
 
-* **Added the igb ethernet driver to support RSS with flow API.**
-
-  Rte_flow actually defined to include RSS, but till now, RSS is out of
-  rte_flow. This patch is to support igb NIC with existing RSS configuration
-  using rte_flow API.
-
-* **Add AVF (Adaptive Virtual Function) net PMD.**
+* **Added AVF (Adaptive Virtual Function) net PMD.**
 
-  A new net PMD has been added, which supports Intel® Ethernet Adaptive
-  Virtual Function (AVF) with features list below:
+  Added a new net PMD called AVF (Adaptive Virtual Function), which supports
+  Intel® Ethernet Adaptive Virtual Function (AVF) with features such as:
 
   * Basic Rx/Tx burst
   * SSE vectorized Rx/Tx burst
@@ -140,16 +128,16 @@ New Features
   * Rx/Tx descriptor status
   * Link status update/event
 
-* **Add feature supports for live migration from vhost-net to vhost-user.**
+* **Added feature supports for live migration from vhost-net to vhost-user.**
 
-  To make live migration from vhost-net to vhost-user possible, added
-  feature supports for vhost-user. The features include:
+  Added feature supports for vhost-user to make live migration from vhost-net
+  to vhost-user possible. The features include:
 
-  * VIRTIO_F_ANY_LAYOUT
-  * VIRTIO_F_EVENT_IDX
-  * VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_HOST_ECN
-  * VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_HOST_UFO
-  * VIRTIO_NET_F_GSO
+  * ``VIRTIO

[dpdk-dev] [PATCH] doc: announce ABI change to support VF representors

2018-02-14 Thread Shahaf Shuler
This is following the RFC being discussed and targets 18.05

http://dpdk.org/ml/archives/dev/2018-January/085716.html

Cc: declan.dohe...@intel.com
Cc: mohammad.abdul.a...@intel.com
Cc: ferruh.yi...@intel.com
Cc: remy.hor...@intel.com

Signed-off-by: Shahaf Shuler 
---
 doc/guides/rel_notes/deprecation.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index d59ad5988..f6151de63 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -59,3 +59,9 @@ Deprecation Notices
   be added between the producer and consumer structures. The size of the
   structure and the offset of the fields will remain the same on
   platforms with 64B cache line, but will change on other platforms.
+
+* ethdev: Work is planned for 18.05 to expose VF port representors
+  as a means to perform control and data path operations on the different VFs.
+  As a VF representor is an ethdev port, new fields are needed in order to map
+  between the VF representor and the VF or the parent PF. Those new fields
+  are to be included in the ``rte_eth_dev_info`` struct.
-- 
2.12.0



Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()

2018-02-14 Thread Ananyev, Konstantin


> -Original Message-
> From: Richardson, Bruce
> Sent: Wednesday, February 14, 2018 12:12 PM
> To: Ananyev, Konstantin 
> Cc: Yongseok Koh ; Olivier Matz ; dev@dpdk.org
> Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()
> 
> On Wed, Feb 14, 2018 at 12:03:55PM +, Ananyev, Konstantin wrote:
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ananyev, Konstantin
> > > Sent: Wednesday, February 14, 2018 11:48 AM
> > > > To: Yongseok Koh ; Olivier Matz
> > > > Cc: dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()
> > >
> > > Hi Yongseok,
> > >
> > > > > On Feb 13, 2018, at 2:45 PM, Yongseok Koh  wrote:
> > > > >
> > > > > Hi Olivier
> > > > >
> > > > > I'm wondering why rte_pktmbuf_prefree_seg() checks m->next instead of
> > > > > > m->nb_segs? As 'next' is in the 2nd cacheline, checking nb_segs seems
> > > > > > beneficial to the cases where almost all mbufs have a single segment.
> > > > > >
> > > > > > A customer reported a high rate of cache misses in the code and I
> > > > > > thought the following patch could be helpful. I haven't had them try it
> > > > > > yet but just wanted to hear from you.
> > > > >
> > > > > I'd appreciate if you can review this idea.
> > > > >
> > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > > > index 62740254d..96edbcb9e 100644
> > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > @@ -1398,7 +1398,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > >if (RTE_MBUF_INDIRECT(m))
> > > > >rte_pktmbuf_detach(m);
> > > > >
> > > > > -   if (m->next != NULL) {
> > > > > +   if (m->nb_segs > 1) {
> > > > >m->next = NULL;
> > > > >m->nb_segs = 1;
> > > > >}
> > > > > @@ -1410,7 +1410,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > >if (RTE_MBUF_INDIRECT(m))
> > > > >rte_pktmbuf_detach(m);
> > > > >
> > > > > -   if (m->next != NULL) {
> > > > > +   if (m->nb_segs > 1) {
> > > > >m->next = NULL;
> > > > >m->nb_segs = 1;
> > > > >}
> > > >
> > > > > Well, m->pool in the 2nd cacheline has to be accessed anyway in order
> > > > > to put it back to the mempool.
> > > > > It looks like the cache miss is unavoidable.
> > >
> > > > As a thought: in theory the PMD can store the pool pointer together with
> > > > each mbuf it has to free, then it could be something like:
> > > >
> > > > if (rte_pktmbuf_prefree_seg(m[x]) != NULL)
> > > >    rte_mempool_put(pool[x], m[x]);
> > >
> > > Then what you suggested above might help.
> >
> > > After another thought - we have to check m->next, not m->nb_segs.
> > > There could be situations where nb_segs == 1 but m->next != NULL
> > > (the 2nd segment of a 3-segment packet, for example).
> > So probably we have to keep it as it is.
> > Sorry for the noise
> > Konstantin
> 
> It's still worth considering as an option. We could check nb_segs for
> the first segment of a packet and thereafter iterate using the next
> pointer.

In the multi-seg case the PMD frees segments (not packets).
It could happen that the first segment has already been freed while the second
has not.

> It means that your idea of storing the pool pointer for each
> mbuf becomes useful for single-segment packets.

But then we'll have to support 2 different flavors of prefree_seg().
An alternative would be to change the multi-seg TX path of all PMDs so that
when the first segment is about to be freed we update nb_segs for the second,
and so on.
Both options seem like too much hassle.

Konstantin


Re: [dpdk-dev] [PATCH v3] net/tap: fix promiscuous rules double insertions

2018-02-14 Thread Pascal Mazon
Good job. Looks ok to me.

Acked-by: Pascal Mazon 

On 14/02/2018 12:32, Ophir Munk wrote:
> Running testpmd command "port stop all" followed by command "port start
> all" may result in a TAP error:
> PMD: Kernel refused TC filter rule creation (17): File exists
>
> Root cause analysis: during the execution of "port start all" command
> testpmd calls rte_eth_promiscuous_enable() while during the execution
> of "port stop all" command testpmd does not call
> rte_eth_promiscuous_disable().
> As a result the TAP PMD is trying to add tc (traffic control command)
> promiscuous rules to the remote netvsc device consecutively. From the
> kernel point of view it is seen as an attempt to add the same rule more
> than once. In recent kernels (e.g. version 4.13) this attempt is rejected
> with a "File exists" error. In less recent kernels (e.g. version 4.4) the
> same rule may have been successfully accepted twice, which is undesirable.
>
> In the corrupted code every tc promiscuous rule included a different
> handle number parameter. If instead an identical handle number is
> used for all tc promiscuous rules - all kernels will reject the second
> identical rule with a "File exists" error, which is easy to identify and
> to silently ignore.
>
> Fixes: 2bc06869cd94 ("net/tap: add remote netdevice traffic capture")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Ophir Munk 
> ---
> v1: initial version
> v2: add detailed commit message
> v3: textual fixes to commit message and code comments
>
>  drivers/net/tap/tap_flow.c | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/net/tap/tap_flow.c b/drivers/net/tap/tap_flow.c
> index 65657f0..551b2d8 100644
> --- a/drivers/net/tap/tap_flow.c
> +++ b/drivers/net/tap/tap_flow.c
> @@ -123,6 +123,7 @@ enum key_status_e {
>  };
>  
>  #define ISOLATE_HANDLE 1
> +#define REMOTE_PROMISCUOUS_HANDLE 2
>  
>  struct rte_flow {
>   LIST_ENTRY(rte_flow) next; /* Pointer to the next rte_flow structure */
> @@ -1692,9 +1693,15 @@ int tap_flow_implicit_create(struct pmd_internals *pmd,
>* The ISOLATE rule is always present and must have a static handle, as
>* the action is changed whether the feature is enabled (DROP) or
>* disabled (PASSTHRU).
> +  * There is just one REMOTE_PROMISCUOUS rule in all cases. It should
> +  * have a static handle such that adding it twice will fail with EEXIST
> +  * with any kernel version. Remark: old kernels may falsely accept the
> +  * same REMOTE_PROMISCUOUS rules if they had different handles.
>*/
>   if (idx == TAP_ISOLATE)
>   remote_flow->msg.t.tcm_handle = ISOLATE_HANDLE;
> + else if (idx == TAP_REMOTE_PROMISC)
> + remote_flow->msg.t.tcm_handle = REMOTE_PROMISCUOUS_HANDLE;
>   else
>   tap_flow_set_handle(remote_flow);
>   if (priv_flow_process(pmd, attr, items, actions, NULL,
> @@ -1709,12 +1716,16 @@ int tap_flow_implicit_create(struct pmd_internals *pmd,
>   }
>   err = tap_nl_recv_ack(pmd->nlsk_fd);
>   if (err < 0) {
> + /* Silently ignore re-entering remote promiscuous rule */
> + if (errno == EEXIST && idx == TAP_REMOTE_PROMISC)
> + goto success;
>   RTE_LOG(ERR, PMD,
>   "Kernel refused TC filter rule creation (%d): %s\n",
>   errno, strerror(errno));
>   goto fail;
>   }
>   LIST_INSERT_HEAD(&pmd->implicit_flows, remote_flow, next);
> +success:
>   return 0;
>  fail:
>   if (remote_flow)



Re: [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors

2018-02-14 Thread Thomas Monjalon
14/02/2018 13:32, Shahaf Shuler:
> This is following the RFC being discussed and targets 18.05
> 
> http://dpdk.org/ml/archives/dev/2018-January/085716.html
> 
> Cc: declan.dohe...@intel.com
> Cc: mohammad.abdul.a...@intel.com
> Cc: ferruh.yi...@intel.com
> Cc: remy.hor...@intel.com
> 
> Signed-off-by: Shahaf Shuler 

Acked-by: Thomas Monjalon 


[dpdk-dev] [PATCH v2] doc: update release notes for 18.02

2018-02-14 Thread John McNamara
Fix grammar, spelling and formatting of DPDK 18.02 release notes.

Signed-off-by: John McNamara 
---
 doc/guides/rel_notes/release_18_02.rst | 199 -
 1 file changed, 69 insertions(+), 130 deletions(-)

diff --git a/doc/guides/rel_notes/release_18_02.rst b/doc/guides/rel_notes/release_18_02.rst
index 04202ba..bc08118 100644
--- a/doc/guides/rel_notes/release_18_02.rst
+++ b/doc/guides/rel_notes/release_18_02.rst
@@ -41,7 +41,7 @@ New Features
  Also, make sure to start the actual text at the margin.
  =
 
-* **Add function to allow releasing internal EAL resources on exit**
+* **Added function to allow releasing internal EAL resources on exit.**
 
   During ``rte_eal_init()`` EAL allocates memory from hugepages to enable its
   core libraries to perform their tasks. The ``rte_eal_cleanup()`` function
@@ -50,32 +50,12 @@ New Features
   exiting. Not calling this function could result in leaking hugepages, leading
   to failure during initialization of secondary processes.
 
-* **Added the ixgbe ethernet driver to support RSS with flow API.**
+* **Added igb, ixgbe and i40e ethernet driver to support RSS with flow API.**
 
-  Rte_flow actually defined to include RSS, but till now, RSS is out of
-  rte_flow. This patch is to support igb and ixgbe NIC with existing RSS
-  configuration using rte_flow API.
+  Added support for igb, ixgbe and i40e NICs with existing RSS configuration
+  using the ``rte_flow`` API.
 
-* **Add MAC loopback support for i40e.**
-
-  Add MAC loopback support for i40e in order to support test task asked by
-  users. According to the device configuration, it will setup TX->RX loopback
-  link or not.
-
-* **Add the support of run time determination of number of queues per i40e VF**
-
-  The number of queue per VF is determined by its host PF. If the PCI address
-  of an i40e PF is :bb.cc, the number of queues per VF can be configured
-  with EAL parameter like -w :bb.cc,queue-num-per-vf=n. The value n can be
-  1, 2, 4, 8 or 16. If no such parameter is configured, the number of queues
-  per VF is 4 by default.
-
-* **Added the i40e ethernet driver to support RSS with flow API.**
-
-  Rte_flow actually defined to include RSS, but till now, RSS is out of
-  rte_flow. This patch is to support i40e NIC with existing RSS
-  configuration using rte_flow API.It also enable queue region configuration
-  using flow API for i40e.
+  Also enabled queue region configuration using the ``rte_flow`` API for i40e.
 
 * **Updated i40e driver to support PPPoE/PPPoL2TP.**
 
@@ -83,6 +63,20 @@ New Features
   profiles which can be programmed by dynamic device personalization (DDP)
   process.
 
+* **Added MAC loopback support for i40e.**
+
+  Added MAC loopback support for i40e in order to support test tasks requested
+  by users. It will setup ``Tx -> Rx`` loopback link according to the device
+  configuration.
+
+* **Added support of run time determination of number of queues per i40e VF.**
+
+  The number of queue per VF is determined by its host PF. If the PCI address
+  of an i40e PF is ``:bb.cc``, the number of queues per VF can be
+  configured with EAL parameter like ``-w :bb.cc,queue-num-per-vf=n``. The
+  value n can be 1, 2, 4, 8 or 16. If no such parameter is configured, the
+  number of queues per VF is 4 by default.
+
 * **Updated mlx5 driver.**
 
   Updated the mlx5 driver including the following changes:
@@ -117,16 +111,10 @@ New Features
   * Added tunneled packets classification.
   * Added inner checksum offload.
 
-* **Added the igb ethernet driver to support RSS with flow API.**
-
-  Rte_flow actually defined to include RSS, but till now, RSS is out of
-  rte_flow. This patch is to support igb NIC with existing RSS configuration
-  using rte_flow API.
-
-* **Add AVF (Adaptive Virtual Function) net PMD.**
+* **Added AVF (Adaptive Virtual Function) net PMD.**
 
-  A new net PMD has been added, which supports Intel® Ethernet Adaptive
-  Virtual Function (AVF) with features list below:
+  Added a new net PMD called AVF (Adaptive Virtual Function), which supports
+  Intel® Ethernet Adaptive Virtual Function (AVF) with features such as:
 
   * Basic Rx/Tx burst
   * SSE vectorized Rx/Tx burst
@@ -140,17 +128,22 @@ New Features
   * Rx/Tx descriptor status
   * Link status update/event
 
-* **Add feature supports for live migration from vhost-net to vhost-user.**
+* **Added feature supports for live migration from vhost-net to vhost-user.**
 
-  To make live migration from vhost-net to vhost-user possible, added
-  feature supports for vhost-user. The features include:
+  Added feature supports for vhost-user to make live migration from vhost-net
+  to vhost-user possible. The features include:
 
-  * VIRTIO_F_ANY_LAYOUT
-  * VIRTIO_F_EVENT_IDX
-  * VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_HOST_ECN
-  * VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_HOST_UFO
-  * VIRTIO_NET_F_GSO
+  * ``VIRTIO

Re: [dpdk-dev] [PATCH] net/failsafe: fix Rx interrupt reinstallation

2018-02-14 Thread Gaëtan Rivet
Hi Matan,

On Tue, Feb 13, 2018 at 10:59:32PM +, Matan Azrad wrote:
> Fail-safe dev_start() operation can be called by both the application
> and the hot-plug alarm mechanism.
> 
> The installation of Rx interrupt are triggered from dev_start() in any
> time it is called while actually the Rx interrupt should be installed
> only by the application calls.
> 
> So, each plug-in event causes reinstallation which causes memory leak.
> 
> Trigger the Rx interrupt installation only for application calls.
> 
> Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")
> 
> Signed-off-by: Matan Azrad 
> ---
>  drivers/net/failsafe/failsafe_ops.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/failsafe/failsafe_ops.c 
> b/drivers/net/failsafe/failsafe_ops.c
> index 057e435..bbbd335 100644
> --- a/drivers/net/failsafe/failsafe_ops.c
> +++ b/drivers/net/failsafe/failsafe_ops.c
> @@ -181,10 +181,12 @@
>   int ret;
>  
>   fs_lock(dev, 0);
> - ret = failsafe_rx_intr_install(dev);
> - if (ret) {
> - fs_unlock(dev, 0);
> - return ret;
> + if (PRIV(dev)->alarm_lock == 0) {

I dislike having to rely on unrelated context of execution to decide a
code-path.

I'd prefer to make interrupt installation dependent on the interrupt
state instead.

I think it should be possible to forbid reinstallation within
failsafe_rx_intr_install directly, e.g.

diff --git a/drivers/net/failsafe/failsafe_intr.c 
b/drivers/net/failsafe/failsafe_intr.c
index f6ff04dc8..46c3aa5f2 100644
--- a/drivers/net/failsafe/failsafe_intr.c
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -523,7 +523,8 @@ failsafe_rx_intr_install(struct rte_eth_dev *dev)
const struct rte_intr_conf *const intr_conf =
&priv->dev->data->dev_conf.intr_conf;

-   if (intr_conf->rxq == 0)
+   if (intr_conf->rxq == 0 ||
+   dev->intr_handle != NULL)
return 0;
if (fs_rx_intr_vec_install(priv) < 0)
return -rte_errno;

This way the logic is self-dependent and the check limited to this
component.

There might be a better way to do this; it's only an example to explain my
point.
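
To make the suggestion above concrete, here is a minimal, self-contained sketch of an install routine that guards itself against reinstallation. It does not use the real fail-safe structures: `struct demo_dev`, `demo_rx_intr_install()` and the `install_count` field are hypothetical stand-ins, and the only idea carried over from the proposed diff is the "already installed" check on the interrupt handle.

```c
#include <stddef.h>

/* Hypothetical stand-in for the fail-safe device state (not the real struct). */
struct demo_dev {
    int rxq_intr_enabled; /* plays the role of intr_conf->rxq != 0 */
    void *intr_handle;    /* non-NULL once Rx interrupts are installed */
    int install_count;    /* counts real installations, for illustration */
};

/* Dummy object whose address serves as the "installed" handle. */
static int demo_handle_storage;

/*
 * Install Rx interrupts at most once, whoever the caller is.
 * Returns 0 both when nothing has to be done and on success.
 */
static int
demo_rx_intr_install(struct demo_dev *dev)
{
    /* Nothing to do: interrupts not requested, or already installed. */
    if (!dev->rxq_intr_enabled || dev->intr_handle != NULL)
        return 0;
    dev->intr_handle = &demo_handle_storage;
    dev->install_count++;
    return 0;
}
```

With this shape, calling the routine from both the application path and the hot-plug alarm path is harmless: the second call returns early, so the caller does not need to know which context it runs in.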

> + ret = failsafe_rx_intr_install(dev);
> + if (ret) {
> + fs_unlock(dev, 0);
> + return ret;
> + }
>   }
>   FOREACH_SUBDEV(sdev, i, dev) {
>   if (sdev->state != DEV_ACTIVE)
> -- 
> 1.9.5
> 

-- 
Gaëtan Rivet
6WIND


Re: [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors

2018-02-14 Thread Doherty, Declan

On 14/02/2018 12:32 PM, Shahaf Shuler wrote:

This is following the RFC being discussed and targets 18.05

http://dpdk.org/ml/archives/dev/2018-January/085716.html

Cc: declan.dohe...@intel.com
Cc: mohammad.abdul.a...@intel.com
Cc: ferruh.yi...@intel.com
Cc: remy.hor...@intel.com

Signed-off-by: Shahaf Shuler 
---
  doc/guides/rel_notes/deprecation.rst | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index d59ad5988..f6151de63 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -59,3 +59,9 @@ Deprecation Notices
    be added between the producer and consumer structures. The size of the
    structure and the offset of the fields will remain the same on
    platforms with 64B cache line, but will change on other platforms.
+
+* ethdev: Work is planned for 18.05 to expose VF port representors
+  as a means to perform control and data path operations on the different VFs.
+  As VF representor is an ethdev port, new fields are needed in order to map
+  between the VF representor and the VF or the parent PF. Those new fields
+  are to be included in ``rte_eth_dev_info`` struct.


Acked-by: Declan Doherty 


Re: [dpdk-dev] [PATCH] net/failsafe: fix Rx interrupt reinstallation

2018-02-14 Thread Matan Azrad
Hi Gaetan

Agree, will send V2.

> -Original Message-
> From: Gaëtan Rivet [mailto:gaetan.ri...@6wind.com]
> Sent: Wednesday, February 14, 2018 3:52 PM
> To: Matan Azrad 
> Cc: dev@dpdk.org
> Subject: Re: [PATCH] net/failsafe: fix Rx interrupt reinstallation
> 
> Hi Matan,
> 
> On Tue, Feb 13, 2018 at 10:59:32PM +, Matan Azrad wrote:
> > Fail-safe dev_start() operation can be called by both the application
> > and the hot-plug alarm mechanism.
> >
> > The installation of Rx interrupt are triggered from dev_start() in any
> > time it is called while actually the Rx interrupt should be installed
> > only by the application calls.
> >
> > So, each plug-in event causes reinstallation which causes memory leak.
> >
> > Trigger the Rx interrupt installation only for application calls.
> >
> > Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")
> >
> > Signed-off-by: Matan Azrad 
> > ---
> >  drivers/net/failsafe/failsafe_ops.c | 10 ++
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/net/failsafe/failsafe_ops.c
> > b/drivers/net/failsafe/failsafe_ops.c
> > index 057e435..bbbd335 100644
> > --- a/drivers/net/failsafe/failsafe_ops.c
> > +++ b/drivers/net/failsafe/failsafe_ops.c
> > @@ -181,10 +181,12 @@
> > int ret;
> >
> > fs_lock(dev, 0);
> > -   ret = failsafe_rx_intr_install(dev);
> > -   if (ret) {
> > -   fs_unlock(dev, 0);
> > -   return ret;
> > +   if (PRIV(dev)->alarm_lock == 0) {
> 
> I dislike having to rely on unrelated context of execution to decide a code-
> path.
> 
> I'd prefer to make interrupt installation dependent on the interrupt state
> instead.
> 
> I think it should be possible to forbid reinstallation within
> failsafe_rx_intr_install directly, e.g.
> 
> diff --git a/drivers/net/failsafe/failsafe_intr.c
> b/drivers/net/failsafe/failsafe_intr.c
> index f6ff04dc8..46c3aa5f2 100644
> --- a/drivers/net/failsafe/failsafe_intr.c
> +++ b/drivers/net/failsafe/failsafe_intr.c
> @@ -523,7 +523,8 @@ failsafe_rx_intr_install(struct rte_eth_dev *dev)
> const struct rte_intr_conf *const intr_conf =
> &priv->dev->data->dev_conf.intr_conf;
> 
> -   if (intr_conf->rxq == 0)
> +   if (intr_conf->rxq == 0 ||
> +   dev->intr_handle != NULL)
> return 0;
> if (fs_rx_intr_vec_install(priv) < 0)
> return -rte_errno;
> 
> This way the logic is self-dependent and the check limited to this component.
> 
> There might be better way to do this, it's only an example to explain my
> point.
> 
> > +   ret = failsafe_rx_intr_install(dev);
> > +   if (ret) {
> > +   fs_unlock(dev, 0);
> > +   return ret;
> > +   }
> > }
> > FOREACH_SUBDEV(sdev, i, dev) {
> > if (sdev->state != DEV_ACTIVE)
> > --
> > 1.9.5
> >
> 
> --
> Gaëtan Rivet
> 6WIND


Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()

2018-02-14 Thread Ananyev, Konstantin


> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ananyev, Konstantin
> Sent: Wednesday, February 14, 2018 12:35 PM
> To: Richardson, Bruce 
> Cc: Yongseok Koh ; Olivier Matz ; 
> dev@dpdk.org
> Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()
> 
> 
> 
> > -Original Message-
> > From: Richardson, Bruce
> > Sent: Wednesday, February 14, 2018 12:12 PM
> > To: Ananyev, Konstantin 
> > Cc: Yongseok Koh ; Olivier Matz 
> > ; dev@dpdk.org
> > Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()
> >
> > On Wed, Feb 14, 2018 at 12:03:55PM +, Ananyev, Konstantin wrote:
> > >
> > >
> > > > -Original Message-
> > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ananyev, Konstantin
> > > > Sent: Wednesday, February 14, 2018 11:48 AM
> > > > To: Yongseok Koh ; Olivier Matz 
> > > > 
> > > > Cc: dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] Accessing 2nd cacheline in 
> > > > rte_pktmbuf_prefree_seg()
> > > >
> > > > Hi Yongseok,
> > > >
> > > > > > On Feb 13, 2018, at 2:45 PM, Yongseok Koh  
> > > > > > wrote:
> > > > > >
> > > > > > Hi Olivier
> > > > > >
> > > > > > I'm wondering why rte_pktmbuf_prefree_seg() checks m->next instead of
> > > > > > m->nb_segs? As 'next' is in the 2nd cacheline, checking nb_segs seems
> > > > > > beneficial to the cases where almost all mbufs have a single segment.
> > > > > >
> > > > > > A customer reported high rate of cache misses in the code and I 
> > > > > > thought the
> > > > > > following patch could be helpful. I haven't had them try it yet but 
> > > > > > just wanted
> > > > > > to hear from you.
> > > > > >
> > > > > > I'd appreciate if you can review this idea.
> > > > > >
> > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > > > > index 62740254d..96edbcb9e 100644
> > > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > > @@ -1398,7 +1398,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > > >if (RTE_MBUF_INDIRECT(m))
> > > > > >rte_pktmbuf_detach(m);
> > > > > >
> > > > > > -   if (m->next != NULL) {
> > > > > > +   if (m->nb_segs > 1) {
> > > > > >m->next = NULL;
> > > > > >m->nb_segs = 1;
> > > > > >}
> > > > > > @@ -1410,7 +1410,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > > >if (RTE_MBUF_INDIRECT(m))
> > > > > >rte_pktmbuf_detach(m);
> > > > > >
> > > > > > -   if (m->next != NULL) {
> > > > > > +   if (m->nb_segs > 1) {
> > > > > >m->next = NULL;
> > > > > >m->nb_segs = 1;
> > > > > >}
> > > > >
> > > > > Well, m->pool in the 2nd cacheline has to be accessed anyway in order 
> > > > > to put it back to the mempool.
> > > > > It looks like the cache miss is unavoidable.
> > > >
> > > > As a thought: in theory PMD can store pool pointer together with each 
> > > > mbuf it has to free,
> > > > then it could be something like:
> > > >
> > > > if (rte_pktmbuf_prefree_seg(m[x]) != NULL)
> > > >rte_mempool_put(pool[x], m[x]);
> > > >
> > > > Then what you suggested above might help.
> > >
> > > After another thought - we have to check m->next not m->nb_segs.
> > > There could be situations where nb_segs==1, but m->next != NULL
> > > (2-nd segment of the 3 segment packet for example).
> > > So probably we have to keep it as it is.
> > > Sorry for the noise
> > > Konstantin
> >
> > It's still worth considering as an option. We could check nb_segs for
> > the first segment of a packet and thereafter iterate using the next
> > pointer.
> 
> In multi-seg case PMD frees segments (not packets).
> It could happen that first segment would be already freed while the second
> still not.
> 
> > It means that your idea of storing the pool pointer for each
> > mbuf becomes useful for single-segment packets.
> 
> But then we'll have to support 2 different flavors of prefree_seg().
> Alternative would be to change all PMDs multi-seg TX so when first segment is
> going to be freed we update nb_segs for the second and so on.
> Both options seems like too much hassle.
> 

As a side thought, here is what could probably be done to minimize access
to the mbuf's 2nd cache line at PMD Tx free.
Introduce something like this:
static __rte_always_inline struct rte_mempool *
xxx_prefree_seg(struct rte_mbuf *m)
{
    if (rte_mbuf_refcnt_read(m) == 1 && RTE_MBUF_DIRECT(m)) {
        if (m->next != NULL) {
            m->next = NULL;
            m->nb_segs = 1;
        }
        return m->pool;
    }
    return NULL;
}

Then at tx_burst(), before doing the actual TX, the PMD can call that function
and store its return value along with the mbuf:
..
m[x] = pkt;
pool[x] = xxx_prefree_seg(m[x]);
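
The two points of this thread (why m->next cannot simply be replaced by nb_segs, and how a prefree helper could hand back the pool pointer) can be illustrated with a small standalone sketch. It deliberately avoids the real DPDK types: `struct demo_mbuf`, `struct demo_pool`, `demo_chain3()` and `demo_prefree_seg()` are hypothetical simplifications, indirect mbufs are ignored, and the refcnt handling is reduced to the "refcnt == 1" fast path only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for rte_mempool and rte_mbuf. */
struct demo_pool {
    int id;
};

struct demo_mbuf {
    /* fields that would sit in the first cache line */
    uint16_t refcnt;
    uint16_t nb_segs;
    /* fields that would sit in the second cache line */
    struct demo_pool *pool;
    struct demo_mbuf *next;
};

/*
 * Build a 3-segment packet the usual way: only the head segment carries
 * the total segment count, the following segments keep nb_segs == 1.
 */
static void
demo_chain3(struct demo_mbuf seg[3], struct demo_pool *pool)
{
    for (int i = 0; i < 3; i++) {
        seg[i].refcnt = 1;
        seg[i].nb_segs = 1;
        seg[i].pool = pool;
        seg[i].next = (i < 2) ? &seg[i + 1] : NULL;
    }
    seg[0].nb_segs = 3;
}

/*
 * Fast-path prefree: if the segment can be freed right away, reset its
 * chaining fields and return its pool so the caller can remember it next
 * to the mbuf pointer. Return NULL when the fast path does not apply.
 */
static struct demo_pool *
demo_prefree_seg(struct demo_mbuf *m)
{
    if (m->refcnt == 1) {
        if (m->next != NULL) {
            m->next = NULL;
            m->nb_segs = 1;
        }
        return m->pool;
    }
    return NULL;
}
```

In the chain built by `demo_chain3()`, the middle segment has nb_segs == 1 while its next pointer is still set, which is exactly the case where a check on nb_segs alone would leave a stale next pointer in a freed segment.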

Re: [dpdk-dev] [PATCH v2] net/tap: fix promiscuous rules double insersions

2018-02-14 Thread Ophir Munk
Hi,
Regarding your question:
> Are we sure the previous rule is still in the registered implicit flows?

It is confirmed. 

After running several "port stop/start" commands in testpmd, I executed
testpmd> flow isolate  1 
and noticed that the promiscuous rule was removed from the remote device.

Regards,
Ophir

> -Original Message-
> From: Ophir Munk
> Sent: Wednesday, February 14, 2018 1:24 PM
> To: 'Pascal Mazon' ; dev@dpdk.org
> Cc: Thomas Monjalon ; Olga Shern
> ; sta...@dpdk.org
> Subject: RE: [PATCH v2] net/tap: fix promiscuous rules double insersions
> 
> Please see inline.
> I will send updated v3
> 
> > -Original Message-
> > From: Pascal Mazon [mailto:pascal.ma...@6wind.com]
> > Sent: Wednesday, February 14, 2018 10:51 AM
> > To: Ophir Munk ; dev@dpdk.org
> > Cc: Thomas Monjalon ; Olga Shern
> > ; sta...@dpdk.org
> > Subject: Re: [PATCH v2] net/tap: fix promiscuous rules double
> > insersions
> >
> > Hi Ophir,
> >
> > Typo in title: s/insersions/insertions/
> >
> 
> Fixed in v3
> 
> > I'm ok on principle, I have just a few comments inline.
> >
> > Regards,
> > Pascal
> >
> > On 13/02/2018 19:35, Ophir Munk wrote:
> > > Running testpmd command "port stop all" followed by command "port
> > > start all" may result in a TAP error:
> > > PMD: Kernel refused TC filter rule creation (17): File exists
> > >
> > > Root cause analysis: during the execution of "port start all"
> > > command testpmd calls rte_eth_promiscuous_enable() while during the
> > > execution of "port stop all" command testpmd does not call
> > > rte_eth_promiscuous_enable().
> > Shouldn't it be rte_eth_promiscuous_disable()?
> 
> Yes it should. Fixed in v3
> 
> > > As a result the TAP PMD is trying to add tc (traffic control
> > > command) promiscuous rules to the remote netvsc device
> > > consecutively. From the kernel point of view it is seen as an
> > > attempt to add the same rule more than once. In recent kernels (e.g.
> > > version 4.13) this attempt is rejected with a "File exists" error. In less
> recent kernels (e.g.
> > > version 4.4) the same rule may have been accepted twice
> > > successfully,
> > which is undesirable.
> > >
> > > In the corrupted code every tc promiscuous rule included a different
> > > handle number parameter. If instead an identical handle number
> > > parameter is used for all tc promiscuous rules - all kernels will
> > > reject the second rule with a "File exists" error, which is easy to
> > > identify and to silently ignore.
> > >
> > > Fixes: 2bc06869cd94 ("net/tap: add remote netdevice traffic
> > > capture")
> > > Cc: sta...@dpdk.org
> > >
> > > Signed-off-by: Ophir Munk 
> > > ---
> > > v2: add detailed commit message
> > >
> > >  drivers/net/tap/tap_flow.c | 11 +++
> > >  1 file changed, 11 insertions(+)
> > >
> > > diff --git a/drivers/net/tap/tap_flow.c b/drivers/net/tap/tap_flow.c
> > > index 65657f0..d1f4a52 100644
> > > --- a/drivers/net/tap/tap_flow.c
> > > +++ b/drivers/net/tap/tap_flow.c
> > > @@ -123,6 +123,7 @@ enum key_status_e {  };
> > >
> > >  #define ISOLATE_HANDLE 1
> > > +#define REMOTE_PROMISCUOUS_HANDLE 2
> > >
> > >  struct rte_flow {
> > >   LIST_ENTRY(rte_flow) next; /* Pointer to the next rte_flow
> > > structure */ @@ -1692,9 +1693,15 @@ int
> > > tap_flow_implicit_create(struct
> > pmd_internals *pmd,
> > >* The ISOLATE rule is always present and must have a static
> > > handle,
> > as
> > >* the action is changed whether the feature is enabled (DROP) or
> > >* disabled (PASSTHRU).
> > > +  * There is just one REMOTE_PROMISCUOUS rule in all cases. It
> > should
> > > +  * have a static handle such that adding it twice will fail with EEXIST
> > > +  * with any kernel version. Remark: old kernels may falsely accept the
> > > +  * same REMOTE_PREMISCUOUS rules if they had different handles.
> > s/PREMISCUOUS/PROMISCUOUS/
> > >*/
> > >   if (idx == TAP_ISOLATE)
> > >   remote_flow->msg.t.tcm_handle = ISOLATE_HANDLE;
> > > + else if (idx == TAP_REMOTE_PROMISC)
> > > + remote_flow->msg.t.tcm_handle =
> > REMOTE_PROMISCUOUS_HANDLE;
> > >   else
> > >   tap_flow_set_handle(remote_flow);
> > >   if (priv_flow_process(pmd, attr, items, actions, NULL, @@ -1709,12
> > > +1716,16 @@ int tap_flow_implicit_create(struct pmd_internals *pmd,
> > >   }
> > >   err = tap_nl_recv_ack(pmd->nlsk_fd);
> > >   if (err < 0) {
> > > + /* Silently ignore re-entering remote promiscuous rule */
> > > + if (errno == EEXIST && idx == TAP_REMOTE_PROMISC)
> > > + goto success;
> > >   RTE_LOG(ERR, PMD,
> > >   "Kernel refused TC filter rule creation (%d): %s\n",
> > >   errno, strerror(errno));
> > >   goto fail;
> > >   }
> > >   LIST_INSERT_HEAD(&pmd->implicit_flows, remote_flow, next);
> > Are we sure the previous rule is still in the registered implicit flows?
> 
> I will run tests to verify that.
> 
> > > +success:
> > >   return 0;
> > >  fail:
> > >   if (re

Re: [dpdk-dev] [PATCH] doc: add ABI change notice for numa_node_count in eal

2018-02-14 Thread Thomas Monjalon
14/02/2018 01:04, Thomas Monjalon:
> > > > > There will be a new function added in v18.05 that will return number 
> > > > > of
> > > > > detected sockets, which will change the ABI.
> > > > > 
> > > > > Signed-off-by: Anatoly Burakov 
> > > > > ---
> > > > > +* eal: new ``numa_node_count`` member will be added to ``rte_config``
> > > > > +structure in v18.05.
> > > > 
> > > > Acked-by: John McNamara 
> > > 
> > > Acked-by: Jerin Jacob 
> > >
> > Acked-by: Bruce Richardson 
> 
> Acked-by: Thomas Monjalon 

Applied


Re: [dpdk-dev] [PATCH v3] net/tap: fix promiscuous rules double insertions

2018-02-14 Thread Thomas Monjalon
14/02/2018 14:13, Pascal Mazon:
> On 14/02/2018 12:32, Ophir Munk wrote:
> > Running testpmd command "port stop all" followed by command "port start
> > all" may result in a TAP error:
> > PMD: Kernel refused TC filter rule creation (17): File exists
> >
> > Root cause analysis: during the execution of "port start all" command
> > testpmd calls rte_eth_promiscuous_enable() while during the execution
> > of "port stop all" command testpmd does not call
> > rte_eth_promiscuous_disable().
> > As a result the TAP PMD is trying to add tc (traffic control command)
> > promiscuous rules to the remote netvsc device consecutively. From the
> > kernel point of view it is seen as an attempt to add the same rule more
> > than once. In recent kernels (e.g. version 4.13) this attempt is rejected
> > with a "File exists" error. In less recent kernels (e.g. version 4.4) the
> > same rule may have been successfully accepted twice, which is undesirable.
> >
> > In the corrupted code every tc promiscuous rule included a different
> > handle number parameter. If instead an identical handle number is
> > used for all tc promiscuous rules - all kernels will reject the second
> > identical rule with a "File exists" error, which is easy to identify and
> > to silently ignore.
> >
> > Fixes: 2bc06869cd94 ("net/tap: add remote netdevice traffic capture")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Ophir Munk 
> Acked-by: Pascal Mazon 

Applied, thanks
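
For readers following the thread, the approach described in the commit message (one fixed handle for the promiscuous rule, and a duplicate insertion treated as success) can be sketched without netlink at all. Everything below is a hypothetical model: `struct demo_tc_table` and `demo_tc_add()` stand in for the kernel's filter table, and `DEMO_PROMISC_HANDLE` only mirrors the idea of a static handle; none of these names exist in the TAP PMD.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_RULES 8
#define DEMO_PROMISC_HANDLE 2 /* fixed handle, same idea as a static define */

/* Hypothetical in-memory stand-in for the kernel's TC filter table. */
struct demo_tc_table {
    uint32_t handle[DEMO_MAX_RULES];
    size_t count;
};

/* Model of the kernel side: a second rule with the same handle is refused. */
static int
demo_tc_add(struct demo_tc_table *t, uint32_t handle)
{
    for (size_t i = 0; i < t->count; i++) {
        if (t->handle[i] == handle) {
            errno = EEXIST;
            return -1;
        }
    }
    if (t->count == DEMO_MAX_RULES) {
        errno = ENOSPC;
        return -1;
    }
    t->handle[t->count++] = handle;
    return 0;
}

/* Driver side: re-adding the promiscuous rule is silently accepted. */
static int
demo_promisc_rule_create(struct demo_tc_table *t)
{
    if (demo_tc_add(t, DEMO_PROMISC_HANDLE) < 0) {
        if (errno == EEXIST)
            return 0; /* rule already present, nothing to do */
        return -1;    /* any other error is still reported */
    }
    return 0;
}
```

Because the handle is constant, a "port stop all" followed by "port start all" maps to two calls of `demo_promisc_rule_create()` and leaves exactly one rule in the table.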


[dpdk-dev] [PATCH] doc: add tested platforms with Mellanox NICs

2018-02-14 Thread Raslan Darawsheh
Signed-off-by: Raslan Darawsheh 
---
 doc/guides/rel_notes/release_18_02.rst | 146 +
 1 file changed, 146 insertions(+)

diff --git a/doc/guides/rel_notes/release_18_02.rst 
b/doc/guides/rel_notes/release_18_02.rst
index 04202ba..8f681f0 100644
--- a/doc/guides/rel_notes/release_18_02.rst
+++ b/doc/guides/rel_notes/release_18_02.rst
@@ -477,3 +477,149 @@ Tested Platforms
* Device id (pf/vf): 8086:1521 / 8086:1520
* Driver version: 5.3.0-k (igb)
 
+* Intel(R) platforms with Mellanox(R) NICs combinations
+
+   * Platform details:
+ * Intel(R) Xeon(R) CPU E5-2697A v4 @ 2.60GHz
+ * Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz
+ * Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
+ * Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
+ * Intel(R) Xeon(R) CPU E5-2640 @ 2.50GHz
+ * Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
+
+   * OS:
+ * Red Hat Enterprise Linux Server release 7.5 Beta (Maipo)
+ * Red Hat Enterprise Linux Server release 7.4 (Maipo)
+ * Red Hat Enterprise Linux Server release 7.3 (Maipo)
+ * Red Hat Enterprise Linux Server release 7.2 (Maipo)
+ * Ubuntu 17.10
+ * Ubuntu 16.10
+ * Ubuntu 16.04
+
+   * MLNX_OFED: 4.2-1.0.0.0
+   * MLNX_OFED: 4.3-0.1.6.0
+
+   * NICs:
+
+ * Mellanox(R) ConnectX(R)-3 Pro 40G MCX354A-FCC_Ax (2x40G)
+
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1007
+   * Firmware version: 2.42.5000
+
+ * Mellanox(R) ConnectX(R)-4 10G MCX4111A-XCAT (1x10G)
+
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1013
+   * Firmware version: 12.21.1000 and above
+
+ * Mellanox(R) ConnectX(R)-4 10G MCX4121A-XCAT (2x10G)
+
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1013
+   * Firmware version: 12.21.1000 and above
+
+ * Mellanox(R) ConnectX(R)-4 25G MCX4111A-ACAT (1x25G)
+
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1013
+   * Firmware version: 12.21.1000 and above
+
+ * Mellanox(R) ConnectX(R)-4 25G MCX4121A-ACAT (2x25G)
+
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1013
+   * Firmware version: 12.21.1000 and above
+
+ * Mellanox(R) ConnectX(R)-4 40G MCX4131A-BCAT/MCX413A-BCAT (1x40G)
+
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1013
+   * Firmware version: 12.21.1000 and above
+
+ * Mellanox(R) ConnectX(R)-4 40G MCX415A-BCAT (1x40G)
+
+   * Host interface: PCI Express 3.0 x16
+   * Device ID: 15b3:1013
+   * Firmware version: 12.21.1000 and above
+
+ * Mellanox(R) ConnectX(R)-4 50G MCX4131A-GCAT/MCX413A-GCAT (1x50G)
+
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1013
+   * Firmware version: 12.21.1000 and above
+
+ * Mellanox(R) ConnectX(R)-4 50G MCX414A-BCAT (2x50G)
+
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1013
+   * Firmware version: 12.21.1000 and above
+
+ * Mellanox(R) ConnectX(R)-4 50G MCX415A-GCAT/MCX416A-BCAT/MCX416A-GCAT (2x50G)
+
+   * Host interface: PCI Express 3.0 x16
+   * Device ID: 15b3:1013
+   * Firmware version: 12.21.1000 and above
+   * Firmware version: 12.21.1000 and above
+
+ * Mellanox(R) ConnectX(R)-4 50G MCX415A-CCAT (1x100G)
+
+   * Host interface: PCI Express 3.0 x16
+   * Device ID: 15b3:1013
+   * Firmware version: 12.21.1000 and above
+
+ * Mellanox(R) ConnectX(R)-4 100G MCX416A-CCAT (2x100G)
+
+  * Host interface: PCI Express 3.0 x16
+  * Device ID: 15b3:1013
+  * Firmware version: 12.21.1000 and above
+
+ * Mellanox(R) ConnectX(R)-4 Lx 10G MCX4121A-XCAT (2x10G)
+
+  * Host interface: PCI Express 3.0 x8
+  * Device ID: 15b3:1015
+  * Firmware version: 14.21.1000 and above
+
+ * Mellanox(R) ConnectX(R)-4 Lx 25G MCX4121A-ACAT (2x25G)
+
+  * Host interface: PCI Express 3.0 x8
+  * Device ID: 15b3:1015
+  * Firmware version: 14.21.1000 and above
+
+ * Mellanox(R) ConnectX(R)-5 100G MCX556A-ECAT (2x100G)
+
+   * Host interface: PCI Express 3.0 x16
+   * Device ID: 15b3:1017
+   * Firmware version: 16.21.1000 and above
+
+ * Mellanox(R) ConnectX-5 Ex EN 100G MCX516A-CDAT (2x100G)
+
+   * Host interface: PCI Express 4.0 x16
+   * Device ID: 15b3:1019
+   * Firmware version: 16.21.1000 and above
+
+* ARM platforms with Mellanox(R) NICs combinations
+
+   * Platform details:
+
+ * Qualcomm ARM 1.1 2500MHz
+
+   * OS:
+
+ * Ubuntu 16.04
+
+   * MLNX_OFED: 4.2-1.0.0.0
+
+   * NICs:
+
+ * Mellanox(R) ConnectX(R)-4 Lx 25G MCX4121A-ACAT (2x25G)
+
+   * Host interface: PCI Express 3.0 x8
+   * Device ID: 15b3:1015
+   * Firmware version: 14.21.1000
+
+ * Mellanox(R) ConnectX(R)-5 100G MCX556A-ECAT (2x100G)
+
+   * Host interface: PCI Express 3.0 x16
+   * Device ID: 15b3:1017
+   * Firmware version: 16.21.1000
-- 
2.7.4



[dpdk-dev] [PATCH v2] net/failsafe: fix Rx interrupt reinstallation

2018-02-14 Thread Matan Azrad
The fail-safe dev_start() operation can be called by both the application
and the hot-plug alarm mechanism.

The installation of the Rx interrupt is triggered from dev_start() every
time it is called, while the Rx interrupt should actually be installed
only by the application calls.

So, each plug-in event causes a reinstallation, which causes a memory leak
and spoils the fail-safe Rx interrupt mechanism.

Trigger the Rx interrupt installation only when it does not exist.

Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")

Signed-off-by: Matan Azrad 
---
 drivers/net/failsafe/failsafe_intr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/failsafe/failsafe_intr.c 
b/drivers/net/failsafe/failsafe_intr.c
index f6ff04d..6b7f9c1 100644
--- a/drivers/net/failsafe/failsafe_intr.c
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -523,7 +523,7 @@ void failsafe_rx_intr_uninstall_subdevice(struct sub_device 
*sdev)
const struct rte_intr_conf *const intr_conf =
&priv->dev->data->dev_conf.intr_conf;
 
-   if (intr_conf->rxq == 0)
+   if (intr_conf->rxq == 0 || dev->intr_handle != NULL)
return 0;
if (fs_rx_intr_vec_install(priv) < 0)
return -rte_errno;
-- 
1.9.5



Re: [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory hotplug changes

2018-02-14 Thread Thomas Monjalon
07/02/2018 11:11, Jerin Jacob:
> -Original Message-
> > Date: Mon, 5 Feb 2018 11:47:42 +
> > From: Bruce Richardson 
> > To: Anatoly Burakov 
> > CC: dev@dpdk.org, Neil Horman , John McNamara
> >  , Marko Kovacevic 
> > Subject: Re: [dpdk-dev] [PATCH v2] doc: add deprecation notice for memory
> >  hotplug changes
> > User-Agent: Mutt/1.9.1 (2017-09-22)
> > 
> > On Thu, Jan 18, 2018 at 10:32:28AM +, Anatoly Burakov wrote:
> > > Due to coming changes outlined in memory hotplug RFC, there will
> > > be several API/ABI changes.
> > > 
> > > Signed-off-by: Anatoly Burakov 
> > > ---
> > Acked-by: Bruce Richardson 
> 
> Acked-by: Jerin Jacob 

Applied


Re: [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors

2018-02-14 Thread Remy Horton


On 14/02/2018 12:32, Shahaf Shuler wrote:
[..]

Signed-off-by: Shahaf Shuler 
---
 doc/guides/rel_notes/deprecation.rst | 6 ++
 1 file changed, 6 insertions(+)


Acked-by: Remy Horton 


[dpdk-dev] [PATCH 0/4] add to support for virtio-user server mode

2018-02-14 Thread Zhiyong Yang
When vhost-user/ovs-dpdk restarts, virtio-user is expected to stay alive
so that vhost-user can reconnect to it successfully and continue to exchange
packets.

This series adds support for that feature and targets the 18.05 release.

Virtio-user in server mode creates the socket file and then waits, in
blocking mode, for the first connection from vhost-user in client mode.

Virtio-user in server mode supports repeated vhost reconnections with the
same configuration.

Virtio-user supports only one connection at a time in server/client
mode.
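
As an illustration of the server-mode start-up described above, here is a minimal sketch using plain POSIX sockets. It is not the code of this series: `demo_server_listen()` and `demo_server_accept()` are hypothetical helper names, and the sketch only shows the "create the socket file, then block until one peer connects" sequence.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Create a listening UNIX stream socket at `path`; returns the fd or -1. */
static int
demo_server_listen(const char *path)
{
    struct sockaddr_un un;
    int fd;

    if (strlen(path) >= sizeof(un.sun_path))
        return -1;
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    memset(&un, 0, sizeof(un));
    un.sun_family = AF_UNIX;
    strcpy(un.sun_path, path);
    unlink(path); /* drop a stale socket file left by a previous run */
    if (bind(fd, (struct sockaddr *)&un, sizeof(un)) < 0 ||
        listen(fd, 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Block until one peer connects; returns the connection fd or -1. */
static int
demo_server_accept(int listen_fd)
{
    return accept(listen_fd, NULL, NULL);
}
```

A reconnection after a peer restart would then amount to closing the old connection fd and calling `demo_server_accept()` again on the same listening fd, which stays open for the lifetime of the server side.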

How to test?
for example:

./x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 -m 256,0 --no-pci \
--file-prefix=testpmd0 --vdev=net_virtio_user0,mac=00:11:22:33:44:10, \
path=/tmp/sock0,server=1,queues=1 -- -i --rxq=1 --txq=1 --no-numa

./x86_64-native-linuxapp-gcc/app/testpmd -c 0x3e000 -n 4 --socket-mem 256,0 \
--vdev 'net_vhost0,iface=/tmp/sock0,client=0,queues=1' -- -i --rxq=1 --txq=1 \
--nb-cores=1 --no-numa

Zhiyong Yang (4):
  vhost: move fdset functions from fd_man.c to fd_man.h
  net/virtio-user: add data members to support server mode
  net/virtio-user: support server mode
  net/vhost: add memory checking to support client mode

 drivers/net/vhost/rte_eth_vhost.c|   9 +
 drivers/net/virtio/virtio_ethdev.c   |   9 +-
 drivers/net/virtio/virtio_user/vhost_user.c  |  77 ++-
 drivers/net/virtio/virtio_user/virtio_user_dev.c |  44 ++--
 drivers/net/virtio/virtio_user/virtio_user_dev.h |   8 +
 drivers/net/virtio/virtio_user_ethdev.c  |  81 ++-
 lib/librte_vhost/Makefile|   3 +-
 lib/librte_vhost/fd_man.c| 274 ---
 lib/librte_vhost/fd_man.h| 258 -
 9 files changed, 456 insertions(+), 307 deletions(-)
 delete mode 100644 lib/librte_vhost/fd_man.c

-- 
2.13.3



[dpdk-dev] [PATCH 1/4] vhost: move fdset functions from fd_man.c to fd_man.h

2018-02-14 Thread Zhiyong Yang
The patch moves fdset related functions from fd_man.c to fd_man.h in
order to reuse these functions from the perspective of PMDs.

Signed-off-by: Zhiyong Yang 
---
 lib/librte_vhost/Makefile |   3 +-
 lib/librte_vhost/fd_man.c | 274 --
 lib/librte_vhost/fd_man.h | 258 +--
 3 files changed, 253 insertions(+), 282 deletions(-)
 delete mode 100644 lib/librte_vhost/fd_man.c

diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 5d6c6abae..e201df79c 100644
--- a/lib/librte_vhost/Makefile
+++ b/lib/librte_vhost/Makefile
@@ -21,10 +21,11 @@ endif
 LDLIBS += -lrte_eal -lrte_mempool -lrte_mbuf -lrte_ethdev -lrte_net
 
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := fd_man.c iotlb.c socket.c vhost.c \
+SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := iotlb.c socket.c vhost.c \
vhost_user.c virtio_net.c
 
 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_vhost.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += fd_man.h
 
 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_vhost/fd_man.c b/lib/librte_vhost/fd_man.c
deleted file mode 100644
index 181711c2a..0
--- a/lib/librte_vhost/fd_man.c
+++ /dev/null
@@ -1,274 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright(c) 2010-2014 Intel Corporation
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
-#include 
-
-#include "fd_man.h"
-
-#define FDPOLLERR (POLLERR | POLLHUP | POLLNVAL)
-
-static int
-get_last_valid_idx(struct fdset *pfdset, int last_valid_idx)
-{
-   int i;
-
-   for (i = last_valid_idx; i >= 0 && pfdset->fd[i].fd == -1; i--)
-   ;
-
-   return i;
-}
-
-static void
-fdset_move(struct fdset *pfdset, int dst, int src)
-{
-   pfdset->fd[dst]= pfdset->fd[src];
-   pfdset->rwfds[dst] = pfdset->rwfds[src];
-}
-
-static void
-fdset_shrink_nolock(struct fdset *pfdset)
-{
-   int i;
-   int last_valid_idx = get_last_valid_idx(pfdset, pfdset->num - 1);
-
-   for (i = 0; i < last_valid_idx; i++) {
-   if (pfdset->fd[i].fd != -1)
-   continue;
-
-   fdset_move(pfdset, i, last_valid_idx);
-   last_valid_idx = get_last_valid_idx(pfdset, last_valid_idx - 1);
-   }
-   pfdset->num = last_valid_idx + 1;
-}
-
-/*
- * Find deleted fd entries and remove them
- */
-static void
-fdset_shrink(struct fdset *pfdset)
-{
-   pthread_mutex_lock(&pfdset->fd_mutex);
-   fdset_shrink_nolock(pfdset);
-   pthread_mutex_unlock(&pfdset->fd_mutex);
-}
-
-/**
- * Returns the index in the fdset for a given fd.
- * @return
- *   index for the fd, or -1 if fd isn't in the fdset.
- */
-static int
-fdset_find_fd(struct fdset *pfdset, int fd)
-{
-   int i;
-
-   for (i = 0; i < pfdset->num && pfdset->fd[i].fd != fd; i++)
-   ;
-
-   return i == pfdset->num ? -1 : i;
-}
-
-static void
-fdset_add_fd(struct fdset *pfdset, int idx, int fd,
-   fd_cb rcb, fd_cb wcb, void *dat)
-{
-   struct fdentry *pfdentry = &pfdset->fd[idx];
-   struct pollfd *pfd = &pfdset->rwfds[idx];
-
-   pfdentry->fd  = fd;
-   pfdentry->rcb = rcb;
-   pfdentry->wcb = wcb;
-   pfdentry->dat = dat;
-
-   pfd->fd = fd;
-   pfd->events  = rcb ? POLLIN : 0;
-   pfd->events |= wcb ? POLLOUT : 0;
-   pfd->revents = 0;
-}
-
-void
-fdset_init(struct fdset *pfdset)
-{
-   int i;
-
-   if (pfdset == NULL)
-   return;
-
-   for (i = 0; i < MAX_FDS; i++) {
-   pfdset->fd[i].fd = -1;
-   pfdset->fd[i].dat = NULL;
-   }
-   pfdset->num = 0;
-}
-
-/**
- * Register the fd in the fdset with read/write handler and context.
- */
-int
-fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb wcb, void *dat)
-{
-   int i;
-
-   if (pfdset == NULL || fd == -1)
-   return -1;
-
-   pthread_mutex_lock(&pfdset->fd_mutex);
-   i = pfdset->num < MAX_FDS ? pfdset->num++ : -1;
-   if (i == -1) {
-   fdset_shrink_nolock(pfdset);
-   i = pfdset->num < MAX_FDS ? pfdset->num++ : -1;
-   if (i == -1) {
-   pthread_mutex_unlock(&pfdset->fd_mutex);
-   return -2;
-   }
-   }
-
-   fdset_add_fd(pfdset, i, fd, rcb, wcb, dat);
-   pthread_mutex_unlock(&pfdset->fd_mutex);
-
-   return 0;
-}
-
-/**
- *  Unregister the fd from the fdset.
- *  Returns context of a given fd or NULL.
- */
-void *
-fdset_del(struct fdset *pfdset, int fd)
-{
-   int i;
-   void *dat = NULL;
-
-   if (pfdset == NULL || fd == -1)
-   return NULL;
-
-   do {
-   pthread_mutex_lock(&pfdset->fd_mutex);
-
-   i = fdset_find_fd(pfdset, fd);
-   if (i != -1 && pfdset->fd[i].busy == 0) {
-   /*
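
For readers following the fdset code above: the shrink step compacts the
array by moving the last valid entry into each free slot. Below is a minimal
standalone sketch of that idea on a plain int array. The names (SLOT_FREE,
compact_fds) are hypothetical and not part of fd_man.c, and unlike the
original this sketch also marks the vacated slot as free.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_FREE (-1) /* hypothetical marker for a deleted entry */

/* Return the highest index <= from that holds a valid entry, or -1. */
static int
last_valid_idx(const int *fds, int from)
{
	int i;

	for (i = from; i >= 0 && fds[i] == SLOT_FREE; i--)
		;
	return i;
}

/* Fill free slots from the tail and return the new entry count. */
static int
compact_fds(int *fds, int num)
{
	int i;
	int last;

	assert(fds != NULL && num >= 0);
	last = last_valid_idx(fds, num - 1);
	for (i = 0; i < last; i++) {
		if (fds[i] != SLOT_FREE)
			continue;
		/* Move the last valid entry into the hole. */
		fds[i] = fds[last];
		fds[last] = SLOT_FREE;
		last = last_valid_idx(fds, last - 1);
	}
	return last + 1;
}
```

The order of entries is not preserved, which is fine for a poll set where
only membership matters.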

[dpdk-dev] [PATCH 2/4] net/virtio-user: add data members to support server mode

2018-02-14 Thread Zhiyong Yang
Add data members so as to support server mode.

Signed-off-by: Zhiyong Yang 
---
 drivers/net/virtio/virtio_user/virtio_user_dev.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.h b/drivers/net/virtio/virtio_user/virtio_user_dev.h
index 64467b4f9..e640a3438 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.h
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.h
@@ -6,13 +6,21 @@
 #define _VIRTIO_USER_DEV_H
 
 #include 
+#include 
 #include "../virtio_pci.h"
 #include "../virtio_ring.h"
 #include "vhost.h"
+#include "fd_man.h"
 
 struct virtio_user_dev {
/* for vhost_user backend */
int vhostfd;
+   int listenfd;   /* listening fd  */
+   bool connected;  /* connection status */
+
+   /* support for server/client mode */
+   bool is_server;
+   struct fdset fdset;
 
/* for vhost_kernel backend */
char *ifname;
-- 
2.13.3



[dpdk-dev] [PATCH 3/4] net/virtio-user: support server mode

2018-02-14 Thread Zhiyong Yang
Add support for server mode to virtio user.

In server mode, virtio user creates the socket file and then blocks while
waiting for the first connection from a vhost user in client mode.

Server mode virtio user supports repeated vhost reconnections with the
same configuration.

Only one connection at a time is supported in server mode.

Signed-off-by: Zhiyong Yang 
---
 drivers/net/virtio/virtio_ethdev.c   |  9 ++-
 drivers/net/virtio/virtio_user/vhost_user.c  | 77 --
 drivers/net/virtio/virtio_user/virtio_user_dev.c | 44 +
 drivers/net/virtio/virtio_user_ethdev.c  | 81 ++--
 4 files changed, 186 insertions(+), 25 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 884f74ad0..44d037d6b 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1273,9 +1273,13 @@ static void
 virtio_notify_peers(struct rte_eth_dev *dev)
 {
struct virtio_hw *hw = dev->data->dev_private;
-   struct virtnet_rx *rxvq = dev->data->rx_queues[0];
+   struct virtnet_rx *rxvq = NULL;
struct rte_mbuf *rarp_mbuf;
 
+   if (!dev->data->rx_queues)
+   return;
+
+   rxvq = dev->data->rx_queues[0];
rarp_mbuf = rte_net_make_rarp_packet(rxvq->mpool,
(struct ether_addr *)hw->mac_addr);
if (rarp_mbuf == NULL) {
@@ -1333,7 +1337,8 @@ virtio_interrupt_handler(void *param)
 
if (isr & VIRTIO_NET_S_ANNOUNCE) {
virtio_notify_peers(dev);
-   virtio_ack_link_announce(dev);
+   if (hw->cvq)
+   virtio_ack_link_announce(dev);
}
 }
 
diff --git a/drivers/net/virtio/virtio_user/vhost_user.c b/drivers/net/virtio/virtio_user/vhost_user.c
index 91c6449bb..fd806e106 100644
--- a/drivers/net/virtio/virtio_user/vhost_user.c
+++ b/drivers/net/virtio/virtio_user/vhost_user.c
@@ -378,6 +378,55 @@ vhost_user_sock(struct virtio_user_dev *dev,
return 0;
 }
 
+static void
+virtio_user_set_block(int fd, bool enabled)
+{
+   int f;
+
+   f = fcntl(fd, F_GETFL);
+   if (enabled)
+   fcntl(fd, F_SETFL, f & ~O_NONBLOCK);
+   else
+   fcntl(fd, F_SETFL, f | O_NONBLOCK);
+}
+
+#define MAX_VIRTIO_USER_BACKLOG 128
+static int
+virtio_user_start_server(struct virtio_user_dev *dev, struct sockaddr_un *un)
+{
+   int ret;
+   int fd = dev->listenfd;
+   int connectfd;
+
+   ret = bind(fd, (struct sockaddr *)un, sizeof(*un));
+   if (ret < 0) {
+   PMD_DRV_LOG(ERR, "failed to bind to %s: %s; remove it and try again\n",
+   dev->path, strerror(errno));
+   goto err;
+   }
+   ret = listen(fd, MAX_VIRTIO_USER_BACKLOG);
+   if (ret < 0)
+   goto err;
+
+   virtio_user_set_block(fd, true);
+   PMD_DRV_LOG(NOTICE, "virtio user server mode is waiting for connection from vhost user.");
+   while (1) {
+   connectfd = accept(fd, NULL, NULL);
+   if (connectfd >= 0) {
+   dev->connected = true;
+   break;
+   }
+   }
+
+   dev->vhostfd = connectfd;
+   virtio_user_set_block(connectfd, true);
+
+   return 0;
+err:
+   close(fd);
+   return -1;
+}
+
 /**
  * Set up environment to talk with a vhost user backend.
  *
@@ -390,6 +439,7 @@ vhost_user_setup(struct virtio_user_dev *dev)
 {
int fd;
int flag;
+   int ret;
struct sockaddr_un un;
 
fd = socket(AF_UNIX, SOCK_STREAM, 0);
@@ -405,13 +455,30 @@ vhost_user_setup(struct virtio_user_dev *dev)
memset(&un, 0, sizeof(un));
un.sun_family = AF_UNIX;
snprintf(un.sun_path, sizeof(un.sun_path), "%s", dev->path);
-   if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
-   PMD_DRV_LOG(ERR, "connect error, %s", strerror(errno));
-   close(fd);
-   return -1;
+
+   if (dev->is_server) {
+   static pthread_t fdset_tid;
+
+   dev->listenfd = fd;
+   if (fdset_tid == 0) {
+   ret = pthread_create(&fdset_tid, NULL,
+fdset_event_dispatch,
+&dev->fdset);
+   if (ret != 0)
+   PMD_DRV_LOG(ERR, "failed to create fdset handling thread");
+   }
+   return virtio_user_start_server(dev, &un);
+
+   } else {
+   dev->vhostfd = fd;
+   if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0) {
+   PMD_DRV_LOG(ERR, "connect error, %s", strerror(errno));
+   close(fd);
+   return -1;
+   }
+   dev->connected = true;
}
 
-   dev->vhostfd = fd;
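
A note on the accept loop in virtio_user_start_server() above: as written it
retries on every accept() failure. The following is a minimal sketch, not
part of the patch, of a variant that retries only when the call is
interrupted by a signal and reports any other error to the caller. The
function name accept_retry_eintr is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Accept one connection on listen_fd.
 * Retries only on EINTR; returns 0 and stores the new fd in *conn_fd on
 * success, or a negative errno value on any other failure.
 */
static int
accept_retry_eintr(int listen_fd, int *conn_fd)
{
	int fd;

	assert(conn_fd != NULL);
	do {
		fd = accept(listen_fd, NULL, NULL);
	} while (fd < 0 && errno == EINTR);

	if (fd < 0)
		return -errno;

	*conn_fd = fd;
	return 0;
}
```

With this shape a permanent failure such as an invalid listening fd ends the
wait instead of spinning forever.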

[dpdk-dev] [PATCH 4/4] net/vhost: add memory checking to support client mode

2018-02-14 Thread Zhiyong Yang
When the vhost user PMD works in client mode to connect or reconnect to
virtio user in server mode, a new thread may sometimes reach new_device()
before queue_setup has completed, so it has to wait until memory allocation
is done.

Signed-off-by: Zhiyong Yang 
---
 drivers/net/vhost/rte_eth_vhost.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index 3aae01c39..cd67bc7c5 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -580,6 +580,15 @@ new_device(int vid)
eth_dev->data->numa_node = newnode;
 #endif
 
+   /* The thread may run here before eth_dev->data->rx_queues or
+* eth_dev->data->tx_queues have gotten valid memory, so have to
+* wait until memory allocation is done.
+*/
+   while (!eth_dev->data->rx_queues ||
+  !eth_dev->data->tx_queues) {
+   ;
+   }
+
for (i = 0; i < eth_dev->data->nb_rx_queues; i++) {
vq = eth_dev->data->rx_queues[i];
if (vq == NULL)
-- 
2.13.3
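
The wait in new_device() above polls plain pointers without a bound. As an
illustration only, here is a sketch of a bounded wait on a pointer that is
published with C11 atomics. It assumes the setup path can publish the queue
array with a release store. The names queues_ptr, publish_queues and
wait_for_queues are hypothetical and do not exist in the vhost PMD.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the queue array pointer set by queue setup. */
static _Atomic(void *) queues_ptr;

/* Setup side: make the allocated array visible to other threads. */
static void
publish_queues(void *queues)
{
	assert(queues != NULL);
	atomic_store_explicit(&queues_ptr, queues, memory_order_release);
}

/*
 * Waiting side: poll at most max_polls times.
 * Returns the published pointer, or NULL if it never appeared.
 */
static void *
wait_for_queues(unsigned int max_polls)
{
	void *q;

	while (max_polls-- != 0) {
		q = atomic_load_explicit(&queues_ptr, memory_order_acquire);
		if (q != NULL)
			return q;
	}
	return NULL;
}
```

The acquire load pairs with the release store, so a waiter that sees the
pointer also sees the memory it points to, and the bound lets the caller
fail cleanly if setup never happens.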



Re: [dpdk-dev] [PATCH v2] doc: remove eal API for default mempool ops name

2018-02-14 Thread Thomas Monjalon
13/02/2018 12:28, Ferruh Yigit:
> On 2/2/2018 2:01 PM, Olivier Matz wrote:
> > On Fri, Feb 02, 2018 at 02:01:42PM +0530, Hemant Agrawal wrote:
> >> Signed-off-by: Hemant Agrawal 
> >> ---
> >> +* eal: a new set of mbuf mempool ops name APIs for user, platform and best
> >> +  mempool names have been defined in ``rte_mbuf`` in v18.02. The uses of
> >> +  ``rte_eal_mbuf_default_mempool_ops`` shall be replaced by
> >> +  ``rte_mbuf_best_mempool_ops``.
> >> +  The following function is now redundant and it is target to be 
> >> deprecated in
> >> +  18.05:
> >> +
> >> +  - ``rte_eal_mbuf_default_mempool_ops``
> > 
> > Acked-by: Olivier Matz 
> 
> Acked-by: Ferruh Yigit 

Applied



Re: [dpdk-dev] [PATCH v2] net/failsafe: fix Rx interrupt reinstallation

2018-02-14 Thread Gaëtan Rivet
On Wed, Feb 14, 2018 at 02:47:26PM +, Matan Azrad wrote:
> Fail-safe dev_start() operation can be called by both the application
> and the hot-plug alarm mechanism.
> 
> The installation of Rx interrupt are triggered from dev_start() in any
> time it is called while actually the Rx interrupt should be installed
> only by the application calls.
> 
> So, each plug-in event causes reinstallation which causes memory leak
> and spoils the fail-safe Rx interrupt mechanism.
> 
> Trigger the Rx interrupt installation only when it does not exist.
> 
> Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")
> 
> Signed-off-by: Matan Azrad 

Acked-by: Gaetan Rivet 

-- 
Gaëtan Rivet
6WIND


Re: [dpdk-dev] [PATCH v2] net/failsafe: fix Rx interrupt reinstallation

2018-02-14 Thread Gaëtan Rivet
On Wed, Feb 14, 2018 at 04:00:13PM +0100, Gaëtan Rivet wrote:
> On Wed, Feb 14, 2018 at 02:47:26PM +, Matan Azrad wrote:
> > Fail-safe dev_start() operation can be called by both the application
> > and the hot-plug alarm mechanism.
> > 
> > The installation of Rx interrupt are triggered from dev_start() in any
> > time it is called while actually the Rx interrupt should be installed
> > only by the application calls.
> > 
> > So, each plug-in event causes reinstallation which causes memory leak
> > and spoils the fail-safe Rx interrupt mechanism.
> > 
> > Trigger the Rx interrupt installation only when it does not exist.
> > 
> > Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")
> > 
> > Signed-off-by: Matan Azrad 
> 
> Acked-by: Gaetan Rivet 

Actually no!

There is a mistake in the patch, you disabled the uninstall, instead of
the installation.

-- 
Gaëtan Rivet
6WIND


[dpdk-dev] [PATCH] net/mlx5: fix flow creation with a single target queue

2018-02-14 Thread Nelio Laranjeiro
Adding a pattern targeting a single queue wrongly behaves as if it were an
RSS request, ending up creating several Verbs flow rules to match the RSS
configuration.

Fixes: 8086cf08b2f0 ("net/mlx5: handle RSS hash configuration in RSS flow")
Cc: sta...@dpdk.org

Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/mlx5_flow.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 26002c4b9..42381c578 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -912,6 +912,15 @@ priv_flow_convert_finalise(struct priv *priv, struct mlx5_flow_parse *parser)
unsigned int i;
 
(void)priv;
+   /* Remove any other flow not matching the pattern. */
+   if (parser->queues_n == 1) {
+   for (i = 0; i != hash_rxq_init_n; ++i) {
+   if (i == parser->layer || !parser->queue[i].ibv_attr)
+   continue;
+   rte_free(parser->queue[i].ibv_attr);
+   parser->queue[i].ibv_attr = NULL;
+   }
+   }
if (parser->layer == HASH_RXQ_ETH) {
goto fill;
} else {
-- 
2.11.0



Re: [dpdk-dev] [PATCH v2] net/failsafe: fix Rx interrupt reinstallation

2018-02-14 Thread Matan Azrad
Hi Gaetan

From: Gaëtan Rivet, Sent: Wednesday, February 14, 2018 5:01 PM
> On Wed, Feb 14, 2018 at 04:00:13PM +0100, Gaëtan Rivet wrote:
> > On Wed, Feb 14, 2018 at 02:47:26PM +, Matan Azrad wrote:
> > > Fail-safe dev_start() operation can be called by both the
> > > application and the hot-plug alarm mechanism.
> > >
> > > The installation of Rx interrupt are triggered from dev_start() in
> > > any time it is called while actually the Rx interrupt should be
> > > installed only by the application calls.
> > >
> > > So, each plug-in event causes reinstallation which causes memory
> > > leak and spoils the fail-safe Rx interrupt mechanism.
> > >
> > > Trigger the Rx interrupt installation only when it does not exist.
> > >
> > > Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")
> > >
> > > Signed-off-by: Matan Azrad 
> >
> > Acked-by: Gaetan Rivet 
> 
> Actually no!
> 
> There is a mistake in the patch, you disabled the uninstall, instead of the
> installation.
>
No Gaetan, I think it is in the install.
Please recheck maybe by applying.
 
> --
> Gaëtan Rivet
> 6WIND


Re: [dpdk-dev] [PATCH v2] net/failsafe: fix Rx interrupt reinstallation

2018-02-14 Thread Gaëtan Rivet
On Wed, Feb 14, 2018 at 04:01:29PM +0100, Gaëtan Rivet wrote:
> On Wed, Feb 14, 2018 at 04:00:13PM +0100, Gaëtan Rivet wrote:
> > On Wed, Feb 14, 2018 at 02:47:26PM +, Matan Azrad wrote:
> > > Fail-safe dev_start() operation can be called by both the application
> > > and the hot-plug alarm mechanism.
> > > 
> > > The installation of Rx interrupt are triggered from dev_start() in any
> > > time it is called while actually the Rx interrupt should be installed
> > > only by the application calls.
> > > 
> > > So, each plug-in event causes reinstallation which causes memory leak
> > > and spoils the fail-safe Rx interrupt mechanism.
> > > 
> > > Trigger the Rx interrupt installation only when it does not exist.
> > > 
> > > Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")
> > > 
> > > Signed-off-by: Matan Azrad 
> > 
> > Acked-by: Gaetan Rivet 
> 
> Actually no!
> 
> There is a mistake in the patch, you disabled the uninstall, instead of
> the installation.

Okay, this is weird.

> > > Fail-safe dev_start() operation can be called by both the application
> > > and the hot-plug alarm mechanism.
> > > 
> > > The installation of Rx interrupt are triggered from dev_start() in any
> > > time it is called while actually the Rx interrupt should be installed
> > > only by the application calls.
> > > 
> > > So, each plug-in event causes reinstallation which causes memory leak
> > > and spoils the fail-safe Rx interrupt mechanism.
> > > 
> > > Trigger the Rx interrupt installation only when it does not exist.
> > > 
> > > Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")
> > > 
> > > Signed-off-by: Matan Azrad 
> > > ---
> > >  drivers/net/failsafe/failsafe_intr.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/net/failsafe/failsafe_intr.c 
> > > b/drivers/net/failsafe/failsafe_intr.c
> > > index f6ff04d..6b7f9c1 100644
> > > --- a/drivers/net/failsafe/failsafe_intr.c
> > > +++ b/drivers/net/failsafe/failsafe_intr.c
> > > @@ -523,7 +523,7 @@ void failsafe_rx_intr_uninstall_subdevice(struct 
> > > sub_device *sdev)

Here the context is incorrect, this is not within 
failsafe_rx_intr_uninstall_subdevice,
so the fix is correct. Confirming my ack then, this seems like a
format-patch bug or something.

> > > const struct rte_intr_conf *const intr_conf =
> > > &priv->dev->data->dev_conf.intr_conf;
> > > 
> > > -   if (intr_conf->rxq == 0)
> > > +   if (intr_conf->rxq == 0 || dev->intr_handle != NULL)
> > > return 0;
> > > if (fs_rx_intr_vec_install(priv) < 0)
> > > return -rte_errno;
> > > --
> > > 1.9.5



-- 
Gaëtan Rivet
6WIND


Re: [dpdk-dev] [PATCH] doc: announce API/ABI changes for mempool

2018-02-14 Thread Thomas Monjalon
> >>> An API/ABI changes are planned for 18.05 [1]:
> >>>
> >>>   * Allow to customize how mempool objects are stored in memory.
> >>>   * Deprecate mempool XMEM API.
> >>>   * Add mempool driver ops to get information from mempool driver and
> >>> dequeue contiguous blocks of objects if driver supports it.
> >>>
> >>> [1] http://dpdk.org/ml/archives/dev/2018-January/088698.html
> >>>
> >>> Signed-off-by: Andrew Rybchenko 
> >>
> >> Acked-by: Olivier Matz 
> > 
> > Acked-by: Jerin Jacob 
> > 
> Acked-by: Hemant Agrawal 

Applied



Re: [dpdk-dev] [PATCH v2] net/failsafe: fix Rx interrupt reinstallation

2018-02-14 Thread Thomas Monjalon
14/02/2018 16:00, Gaëtan Rivet:
> On Wed, Feb 14, 2018 at 02:47:26PM +, Matan Azrad wrote:
> > Fail-safe dev_start() operation can be called by both the application
> > and the hot-plug alarm mechanism.
> > 
> > The installation of Rx interrupt are triggered from dev_start() in any
> > time it is called while actually the Rx interrupt should be installed
> > only by the application calls.
> > 
> > So, each plug-in event causes reinstallation which causes memory leak
> > and spoils the fail-safe Rx interrupt mechanism.
> > 
> > Trigger the Rx interrupt installation only when it does not exist.
> > 
> > Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")
> > 
> > Signed-off-by: Matan Azrad 
> 
> Acked-by: Gaetan Rivet 

Applied, thanks


Re: [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors

2018-02-14 Thread Boccassi, Luca
On Wed, 2018-02-14 at 14:32 +0200, Shahaf Shuler wrote:
> This is following the RFC being discussed and targets 18.05
> 
> http://dpdk.org/ml/archives/dev/2018-January/085716.html
> 
> Cc: declan.dohe...@intel.com
> Cc: mohammad.abdul.a...@intel.com
> Cc: ferruh.yi...@intel.com
> Cc: remy.hor...@intel.com
> 
> Signed-off-by: Shahaf Shuler 
> ---
>  doc/guides/rel_notes/deprecation.rst | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index d59ad5988..f6151de63 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -59,3 +59,9 @@ Deprecation Notices
>    be added between the producer and consumer structures. The size of
> the
>    structure and the offset of the fields will remain the same on
>    platforms with 64B cache line, but will change on other platforms.
> +
> +* ethdev: A work is being planned for 18.05 to expose VF port
> representors
> +  as a mean to perform control and data path operation on the
> different VFs.
> +  As VF representor is an ethdev port, new fields are needed in
> order to map
> +  between the VF representor and the VF or the parent PF. Those new
> fields
> +  are to be included in ``rte_eth_dev_info`` struct.

Acked-by: Luca Boccassi 
Acked-by: Alex Zelezniak 

Acking on behalf of my colleague Alex as well, who replied privately.

-- 
Kind regards,
Luca Boccassi

[dpdk-dev] [PATCH v1 0/4] doc: announce API changes for flow rules

2018-02-14 Thread Adrien Mazarguil
Series of API/ABI change announcements for rte_flow to enable proper
encap/decap support and address various design issues that can't be
addressed without ABI impact.

Adrien Mazarguil (4):
  doc: announce API change for flow actions
  doc: announce API change for flow RSS action
  doc: announce API change for flow RSS/RAW actions
  doc: announce API change for flow VLAN pattern item

 doc/guides/rel_notes/deprecation.rst | 23 +++
 1 file changed, 23 insertions(+)

-- 
2.11.0


[dpdk-dev] [PATCH v1 1/4] doc: announce API change for flow actions

2018-02-14 Thread Adrien Mazarguil
This announcement is related to the discussion regarding TEP and the need
for encap/decap support in rte_flow [1].

It's now clear that PMD support for chaining multiple non-terminating flow
rules of varying priority levels is prohibitively difficult to implement
compared to simply allowing multiple identical actions performed in a
defined order by a single flow rule.

[1] http://dpdk.org/ml/archives/dev/2017-December/084676.html

Signed-off-by: Adrien Mazarguil 
---
 doc/guides/rel_notes/deprecation.rst | 8 
 1 file changed, 8 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index d59ad5988..663550acb 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -59,3 +59,11 @@ Deprecation Notices
   be added between the producer and consumer structures. The size of the
   structure and the offset of the fields will remain the same on
   platforms with 64B cache line, but will change on other platforms.
+
+* rte_flow: flow rule action semantics will be modified to enable support
+  for encap/decap. All actions part of a flow rule will be taken into
+  account; not only the last one in case of repeated actions. Their order
+  will matter. This change will make the DUP action redundant, and the
+  (non-)terminating property of actions will be discarded. Instead, flow
+  rules themselves will be considered terminating by default unless a
+  PASSTHRU action is also specified.
-- 
2.11.0


[dpdk-dev] [PATCH v1 2/4] doc: announce API change for flow RSS action

2018-02-14 Thread Adrien Mazarguil
Since its inception, the rte_flow RSS action has been relying in part on
struct rte_eth_rss_conf for compatibility with the legacy RSS API. This
structure lacks parameters such as the hash function to use, and more
recently, a method to tell which layer RSS should be performed on [1].

Given struct rte_eth_rss_conf will never be flexible enough to represent a
complete RSS configuration (e.g. RETA table), struct rte_flow_action_rss
will be extended instead.

Depending on the outcome of RSS level implementation work, this deprecation
notice may either cancel or come in conjunction with [1].

[1] http://dpdk.org/ml/archives/dev/2018-February/090359.html

Signed-off-by: Adrien Mazarguil 
---
 doc/guides/rel_notes/deprecation.rst | 5 +
 1 file changed, 5 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 663550acb..40b76b391 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -67,3 +67,8 @@ Deprecation Notices
   (non-)terminating property of actions will be discarded. Instead, flow
   rules themselves will be considered terminating by default unless a
   PASSTHRU action is also specified.
+
+* rte_flow: RSS action will stop relying on the legacy ``struct
+  rte_eth_rss_conf`` due to its limitations. All parameters, including the
+  currently missing hash function to use will be made part of ``struct
+  rte_flow_action_rss`` directly.
-- 
2.11.0


[dpdk-dev] [PATCH v1 3/4] doc: announce API change for flow RSS/RAW actions

2018-02-14 Thread Adrien Mazarguil
C99-style flexible arrays were a bad idea for this API. This announces a
minor API/ABI change to remove them.

Signed-off-by: Adrien Mazarguil 
---
 doc/guides/rel_notes/deprecation.rst | 5 +
 1 file changed, 5 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 40b76b391..77390ce9f 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -72,3 +72,8 @@ Deprecation Notices
   rte_eth_rss_conf`` due to its limitations. All parameters, including the
   currently missing hash function to use will be made part of ``struct
   rte_flow_action_rss`` directly.
+
+* rte_flow: C99-style flexible arrays in ``struct rte_flow_action_rss`` and
+  ``struct rte_flow_item_raw`` will be replaced by standard pointers to the
+  same data. They proved difficult to use in the field (e.g. no possibility
+  of static initialization) and are unsuitable for C++ applications.
-- 
2.11.0
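
To illustrate the static initialization point made in the notice above, here
is a small standalone sketch. Both structures are hypothetical stand-ins and
not the real rte_flow definitions. A structure ending in a flexible array
needs its storage allocated past the end of the struct, while the
pointer-based form can refer to a separately defined constant array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a struct ending in a C99 flexible array. */
struct rss_flex {
	uint16_t num;
	uint16_t queue[]; /* storage must be allocated past the struct */
};

/* Hypothetical pointer-based equivalent. */
struct rss_ptr {
	uint16_t num;
	const uint16_t *queue; /* points to caller-owned storage */
};

/* The pointer form can be initialized statically in standard C. */
static const uint16_t example_queues[] = { 0, 1, 2, 3 };
static const struct rss_ptr example_rss = {
	.num = sizeof(example_queues) / sizeof(example_queues[0]),
	.queue = example_queues,
};

/* Sum the queue indices, just to show the data is reachable. */
static unsigned int
rss_ptr_queue_sum(const struct rss_ptr *rss)
{
	unsigned int sum = 0;
	uint16_t i;

	assert(rss != NULL);
	for (i = 0; i != rss->num; ++i)
		sum += rss->queue[i];
	return sum;
}
```

Standard C offers no way to write the equivalent initializer for struct
rss_flex, since the flexible member cannot be initialized.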


[dpdk-dev] [PATCH v1 4/4] doc: announce API change for flow VLAN pattern item

2018-02-14 Thread Adrien Mazarguil
This will finally bring consistency to the VLAN pattern item definition,
particularly when attempting to match QinQ traffic. Applications relying on
TCI and no QinQ shouldn't notice a difference.

On the other hand, applications relying on EtherType matching will have to
adapt their patterns so they match from outermost to innermost (as on the
wire) instead of the current mess (innermost, then outermost to innermost
in case of QinQ).

Signed-off-by: Adrien Mazarguil 
---
 doc/guides/rel_notes/deprecation.rst | 5 +
 1 file changed, 5 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 77390ce9f..5cd337807 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -77,3 +77,8 @@ Deprecation Notices
   ``struct rte_flow_item_raw`` will be replaced by standard pointers to the
   same data. They proved difficult to use in the field (e.g. no possibility
   of static initialization) and are unsuitable for C++ applications.
+
+* rte_flow: VLAN pattern item (``struct rte_flow_item_vlan``) will be
+  redefined more logically with TCI followed by inner EtherType (wire order)
+  instead of outer TPID followed by TCI (with inner EtherType part of the
+  previous pattern item), as the latter results in much confusion.
-- 
2.11.0
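
As a rough illustration of the wire order mentioned above (TCI followed by
inner EtherType), here is a sketch that fills such an item from a
single-tagged Ethernet frame. The structure and function names are
hypothetical and not the rte_flow definitions, and fields are kept in host
byte order for readability.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical item in wire order: TCI, then inner EtherType. */
struct vlan_item_sketch {
	uint16_t tci;        /* PCP (3 bits), DEI (1 bit), VID (12 bits) */
	uint16_t inner_type; /* EtherType of the encapsulated payload */
};

#define ETH_ADDRS_LEN 12u  /* destination + source MAC addresses */
#define TPID_8021Q 0x8100u /* TPID announcing a single 802.1Q tag */

/* Read a big-endian 16-bit value. */
static uint16_t
rd16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Fill item from a single-tagged frame; return 0 on success, -1 otherwise. */
static int
vlan_item_from_frame(const uint8_t *frame, size_t len,
		     struct vlan_item_sketch *item)
{
	assert(frame != NULL && item != NULL);
	if (len < ETH_ADDRS_LEN + 6 ||
	    rd16(frame + ETH_ADDRS_LEN) != TPID_8021Q)
		return -1;
	item->tci = rd16(frame + ETH_ADDRS_LEN + 2);
	item->inner_type = rd16(frame + ETH_ADDRS_LEN + 4);
	return 0;
}

/* Extract the 12-bit VLAN ID from the TCI. */
static uint16_t
vlan_item_vid(const struct vlan_item_sketch *item)
{
	return item->tci & 0x0fff;
}
```

Here the TPID belongs to whatever precedes the VLAN item, and the item
itself carries the two fields that follow it on the wire.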


Re: [dpdk-dev] [PATCH] net/mlx5: fix flow creation with a single target queue

2018-02-14 Thread Adrien Mazarguil
On Wed, Feb 14, 2018 at 04:04:45PM +0100, Nelio Laranjeiro wrote:
> Adding a pattern targeting a single queues wrongly behaves as it is an RSS
> request, ending by creating several Verbs flows rules to match the RSS
> configuration.
> 
> Fixes: 8086cf08b2f0 ("net/mlx5: handle RSS hash configuration in RSS flow")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Nelio Laranjeiro 

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND


Re: [dpdk-dev] [PATCH] doc: announce control mbuf removal

2018-02-14 Thread Thomas Monjalon
14/02/2018 01:02, Thomas Monjalon:
> 29/01/2018 10:30, Olivier Matz:
> > Link: http://dpdk.org/ml/archives/dev/2017-July/069813.html
> > Link: http://dpdk.org/dev/patchwork/patch/32041/
> > 
> > Signed-off-by: Olivier Matz 
> > ---
> > +* mbuf: The control mbuf API will be removed in v18.05. The impacted
> > +  functions and macros are:
> > +
> > +  - ``rte_ctrlmbuf_init()``
> > +  - ``rte_ctrlmbuf_alloc()``
> > +  - ``rte_ctrlmbuf_free()``
> > +  - ``rte_ctrlmbuf_data()``
> > +  - ``rte_ctrlmbuf_len()``
> > +  - ``rte_is_ctrlmbuf()``
> > +  - ``CTRL_MBUF_FLAG``
> > +
> > +  The packet mbuf API should be used as a replacement.
> 
> Acked-by: Thomas Monjalon 

Applied



Re: [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors

2018-02-14 Thread Jerin Jacob
-Original Message-
> Date: Wed, 14 Feb 2018 15:27:50 +
> From: "Boccassi, Luca" 
> To: "shah...@mellanox.com" , "tho...@monjalon.net"
>  , "nhor...@tuxdriver.com" 
> CC: "remy.hor...@intel.com" ,
>  "mohammad.abdul.a...@intel.com" ,
>  "declan.dohe...@intel.com" ,
>  "ferruh.yi...@intel.com" , "dev@dpdk.org"
>  
> Subject: Re: [dpdk-dev] [PATCH] doc: announce ABI change to support VF
>  representors
> 
> On Wed, 2018-02-14 at 14:32 +0200, Shahaf Shuler wrote:
> > This is following the RFC being discussed and targets 18.05
> > 
> > http://dpdk.org/ml/archives/dev/2018-January/085716.html
> > 
> > Cc: declan.dohe...@intel.com
> > Cc: mohammad.abdul.a...@intel.com
> > Cc: ferruh.yi...@intel.com
> > Cc: remy.hor...@intel.com
> > 
> > Signed-off-by: Shahaf Shuler 
> > ---
> >  doc/guides/rel_notes/deprecation.rst | 6 ++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/doc/guides/rel_notes/deprecation.rst
> > b/doc/guides/rel_notes/deprecation.rst
> > index d59ad5988..f6151de63 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -59,3 +59,9 @@ Deprecation Notices
> >    be added between the producer and consumer structures. The size of
> > the
> >    structure and the offset of the fields will remain the same on
> >    platforms with 64B cache line, but will change on other platforms.
> > +
> > +* ethdev: A work is being planned for 18.05 to expose VF port
> > representors
> > +  as a mean to perform control and data path operation on the
> > different VFs.
> > +  As VF representor is an ethdev port, new fields are needed in
> > order to map
> > +  between the VF representor and the VF or the parent PF. Those new
> > fields
> > +  are to be included in ``rte_eth_dev_info`` struct.
> 
> Acked-by: Luca Boccassi 
> Acked-by: Alex Zelezniak 

Acked-by: Jerin Jacob 

> 
> Acking on behalf of my colleague Alex as well, who replied privately.
> 
> -- 
> Kind regards,
> Luca Boccassi


Re: [dpdk-dev] [PATCH v1 0/4] doc: announce API changes for flow rules

2018-02-14 Thread Nélio Laranjeiro
On Wed, Feb 14, 2018 at 04:37:26PM +0100, Adrien Mazarguil wrote:
> Series of API/ABI change announcements for rte_flow to enable proper
> encap/decap support and address various design issues that can't be
> addressed without ABI impact.
> 
> Adrien Mazarguil (4):
>   doc: announce API change for flow actions
>   doc: announce API change for flow RSS action
>   doc: announce API change for flow RSS/RAW actions
>   doc: announce API change for flow VLAN pattern item
> 
>  doc/guides/rel_notes/deprecation.rst | 23 +++
>  1 file changed, 23 insertions(+)
> 
> -- 
> 2.11.0

For the series,

Acked-by: Nelio Laranjeiro 

-- 
Nélio Laranjeiro
6WIND


Re: [dpdk-dev] [PATCH] doc: add change notice for mbuf sched field

2018-02-14 Thread Thomas Monjalon
> > > > Signed-off-by: Cristian Dumitrescu 
> > > > Acked-by: Jasvinder Singh 
> > > > Acked-by: Roy Fan Zhang 
> > > > Acked-by: Kevin Laatz 
> > > 
> > > Acked-by: Jerin Jacob 
> > 
> > Acked-by: Hemant Agrawal 
> 
> Acked-by: Olivier Matz 

Applied



Re: [dpdk-dev] [PATCH v1 0/4] doc: announce API changes for flow rules

2018-02-14 Thread Andrew Rybchenko

On 02/14/2018 06:55 PM, Nélio Laranjeiro wrote:

On Wed, Feb 14, 2018 at 04:37:26PM +0100, Adrien Mazarguil wrote:

Series of API/ABI change announcements for rte_flow to enable proper
encap/decap support and address various design issues that can't be
addressed without ABI impact.

Adrien Mazarguil (4):
   doc: announce API change for flow actions
   doc: announce API change for flow RSS action
   doc: announce API change for flow RSS/RAW actions
   doc: announce API change for flow VLAN pattern item

  doc/guides/rel_notes/deprecation.rst | 23 +++
  1 file changed, 23 insertions(+)

--
2.11.0

For the series,

Acked-by: Nelio Laranjeiro 


For the series,

Acked-by: Andrew Rybchenko 


[dpdk-dev] [PATCH] doc/gsg: remove reference to old distros

2018-02-14 Thread Harry van Haaren
Remove reference to Fedora 18 which is EOL-ed, reword
surrounding sentences to read correctly.

Signed-off-by: Harry van Haaren 

---

@Thomas, perhaps consider for 18.02 - to get outdated references cleaned up?

---
 doc/guides/linux_gsg/sys_reqs.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst
index e582f63..e2230f3 100644
--- a/doc/guides/linux_gsg/sys_reqs.rst
+++ b/doc/guides/linux_gsg/sys_reqs.rst
@@ -34,8 +34,8 @@ Compilation of the DPDK
 
 .. note::
 
-Testing has been performed using Fedora 18. The setup commands and installed packages needed on other systems may be different.
-For details on other Linux distributions and the versions tested, please consult the DPDK Release Notes.
+The setup commands and installed packages needed on various systems may be different.
+For details on Linux distributions and the versions tested, please consult the DPDK Release Notes.
 
 *   GNU ``make``.
 
-- 
2.7.4



Re: [dpdk-dev] [PATCH v2] doc: announce ABI change for RSS configuration structure

2018-02-14 Thread Thomas Monjalon
> > >> Update deprecation notice for the new rss_level field of 
> > >> rte_eth_rss_conf.
> > >>
> > >> Link: http://www.dpdk.org/dev/patchwork/patch/31891
> > >>
> > >> Signed-off-by: Xueming Li 
> > >> ---
> > >> +* ethdev: A new rss level field planned in 18.05.
> > >> +  The new API add rss_level field to ``rte_eth_rss_conf`` to enable a
> > >> +choice
> > >> +  of RSS hash calculation on outer or inner header of tunneled packet.
> > > 
> > > Acked-By: Shahaf Shuler 
> > 
> > Acked-by: Ferruh Yigit 
> 
> Acked-by: Jerin Jacob 

Applied



Re: [dpdk-dev] [PATCH v3] doc: add preferred burst size support

2018-02-14 Thread Thomas Monjalon
> > > rte_eth_rx_burst(..,nb_pkts) function has semantic that if return value is
> > > smaller than requested, application can consider it end of packet stream.
> > > Some hardware can only support smaller burst sizes which need to be
> > > advertised. Similar is the case for Tx burst.
> > > 
> > > This patch adds deprecation notice for rte_eth_dev_info structure as new
> > > members, for preferred Rx and Tx burst and ring size would be added -
> > > impacting the size of the structure.
> > > 
> > > Signed-off-by: Shreyansh Jain 
> > > Acked-by: Hemant Agrawal 
> > > Acked-by: Andrew Rybchenko 
> > > Acked-by: Bruce Richardson 
> > 
> > Acked-by: Zhiyong Yang 
> 
> Maybe that we want to re-use the same struct to define min, max
> and preferred sizes.
> 
> For the global idea,
> Acked-by: Thomas Monjalon 

Applied
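
The motivation quoted above (a short return from the Rx burst call is read as
"end of packet stream", so a device with a small hardware burst limit can
mislead the application) can be sketched as below. This is only an
illustration under stated assumptions: struct mock_rxq, mock_rx_burst() and
drain_until_short() are hypothetical stand-ins written for this note, not
DPDK structures or functions, and the burst limit of 8 is an invented value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a receive queue; not a DPDK structure. */
struct mock_rxq {
	unsigned int pending;   /* packets currently waiting in the queue */
	unsigned int max_burst; /* largest burst the mock "hardware" returns */
};

/*
 * Mimics the shape of a burst receive call: returns how many packets were
 * delivered, never more than nb_pkts, the device limit, or what is pending.
 */
static uint16_t mock_rx_burst(struct mock_rxq *q, uint16_t nb_pkts)
{
	unsigned int n = nb_pkts;

	assert(q != NULL);
	if (n > q->max_burst)
		n = q->max_burst;
	if (n > q->pending)
		n = q->pending;
	q->pending -= n;
	return (uint16_t)n;
}

/*
 * Polls until a burst comes back shorter than requested, which the
 * application interprets as "queue drained". Returns the packets received.
 */
static unsigned int drain_until_short(struct mock_rxq *q, uint16_t burst)
{
	unsigned int total = 0;
	uint16_t got;

	if (burst == 0)
		return 0;
	do {
		got = mock_rx_burst(q, burst);
		total += got;
	} while (got == burst);

	return total;
}
```

With 20 packets pending and a device limit of 8, asking for 32 stops after the
first call with 12 packets still queued, while asking for the device's own
limit of 8 drains all 20. An advertised preferred burst size is meant to
remove that mismatch.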



Re: [dpdk-dev] [PATCH] doc: announce ABI change to support VF representors

2018-02-14 Thread Thomas Monjalon
> > > This is following the RFC being discussed and targets 18.05
> > > 
> > > http://dpdk.org/ml/archives/dev/2018-January/085716.html
> > > 
> > > Cc: declan.dohe...@intel.com
> > > Cc: mohammad.abdul.a...@intel.com
> > > Cc: ferruh.yi...@intel.com
> > > Cc: remy.hor...@intel.com
> > > 
> > > Signed-off-by: Shahaf Shuler 
> > > ---
> > > +* ethdev: A work is being planned for 18.05 to expose VF port representors
> > > +  as a mean to perform control and data path operation on the different VFs.
> > > +  As VF representor is an ethdev port, new fields are needed in order to map
> > > +  between the VF representor and the VF or the parent PF. Those new fields
> > > +  are to be included in ``rte_eth_dev_info`` struct.
> > 
> > Acked-by: Luca Boccassi 
> > Acked-by: Alex Zelezniak 
> 
> Acked-by: Jerin Jacob 

Applied





Re: [dpdk-dev] [RFC v2] doc compression API for DPDK

2018-02-14 Thread Ahmed Mansour
On 2/14/2018 12:41 AM, Verma, Shally wrote:
> Hi Ahmed
>
>> -----Original Message-----
>> From: Ahmed Mansour [mailto:ahmed.mans...@nxp.com]
>> Sent: 02 February 2018 02:20
>> To: Trahe, Fiona ; Verma, Shally 
>> ; dev@dpdk.org
>> Cc: Athreya, Narayana Prasad ; Gupta, 
>> Ashish ; Sahu, Sunila
>> ; De Lara Guarch, Pablo 
>> ; Challa, Mahipal
>> ; Jain, Deepak K ; 
>> Hemant Agrawal ; Roy
>> Pledge ; Youri Querry 
>> Subject: Re: [RFC v2] doc compression API for DPDK
>>
> [Fiona] I propose if BFINAL bit is detected before end of input
> the decompression should stop. In this case consumed will be < src.length.
> produced will be < dst buffer size. Do we need an extra STATUS response?
> STATUS_BFINAL_DETECTED  ?
 [Shally] @fiona, I assume you mean here decompressor stop after processing 
 Final block right?
>>> [Fiona] Yes.
>>>
>>>  And if yes,
 and if it can process that final block successfully/unsuccessfully, then 
 status could simply be
 SUCCESS/FAILED.
 I don't see need of specific return code for this use case. Just to share, 
 in past, we have practically run into
 such cases with boost lib, and decompressor has simply worked this way.
>>> [Fiona] I'm ok with this.
>>>
> The only thing I don't like about this is that it can impact performance, as normally
> we can just look for STATUS == SUCCESS. Anything else should be an 
> exception.
> Now the application would have to check for SUCCESS || BFINAL_DETECTED 
> every time.
> Do you have a suggestion on how we should handle this?
>
>> [Ahmed] This makes sense. So in all cases the PMD should assume that it
>> should stop as soon as a BFINAL is observed.
>>
>> A question. What happens in stateful vs stateless modes when
>> decompressing an op that encompasses multiple BFINALs. I assume the
>> caller in that case will use the consumed=x bytes to find out how far in
>> to the input is the end of the first stream and start from the next
>> byte. Is this correct?
> [Shally] As per my understanding, each op can be tied to only one stream,
> as we have only one stream pointer per op, and one stream can have only one
> BFINAL (as a stream is one complete unit of compressed data). But it looks like
> you're suggesting a case where one op can carry multiple independent streams,
> and thus multiple BFINALs, such as the op below pointing to more than one
> stream:
>
> 
> op --> |stream1|stream2| |stream3|
>
>
> Could you confirm if I understand your question correctly?
[Ahmed] Correct. We found that in some storage applications the user
does not know where exactly the BFINAL is. They rely on zlib software
today. zlib.net software halts at the first BFINAL. Users put multiple
streams in one op and rely on zlib to  stop and inform them of the end
location of the first stream.
>
> Thanks
> Shally
>
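
The "stop at the first BFINAL and report how much input was consumed"
behaviour discussed above can be sketched as follows. This is a minimal
illustration, not the proposed compression API and not zlib: it only walks
raw DEFLATE streams made of stored (uncompressed) blocks, whose layout is a
header byte carrying BFINAL and BTYPE followed by LEN, NLEN and LEN data
bytes. The function name consume_first_stream() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walk a raw DEFLATE stream consisting only of stored (BTYPE=00) blocks and
 * stop right after the block whose BFINAL bit is set.
 *
 * Returns the number of input bytes consumed, or -1 if the input is
 * truncated, malformed, or uses a block type this sketch does not decode.
 */
static long consume_first_stream(const uint8_t *src, size_t len)
{
	size_t pos = 0;

	assert(src != NULL);
	for (;;) {
		uint16_t blen, nlen;
		int bfinal, btype;

		/* Need the header byte plus the LEN and NLEN fields. */
		if (len - pos < 5)
			return -1;

		bfinal = src[pos] & 1;       /* bit 0: last block of the stream */
		btype = (src[pos] >> 1) & 3; /* bits 1-2: block type */
		if (btype != 0)
			return -1;           /* only stored blocks handled here */

		blen = (uint16_t)(src[pos + 1] | (src[pos + 2] << 8));
		nlen = (uint16_t)(src[pos + 3] | (src[pos + 4] << 8));
		if ((uint16_t)~blen != nlen)
			return -1;           /* NLEN must be the complement of LEN */

		pos += 5;
		if (len - pos < blen)
			return -1;           /* block data is truncated */
		pos += blen;

		if (bfinal)
			return (long)pos;    /* consumed may be < len */
	}
}
```

A caller holding several streams in one buffer would call the function once,
then restart at src + consumed for the next stream, which is the usage
pattern Ahmed describes for storage applications.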



Re: [dpdk-dev] [PATCH] doc: announce PMD API change for set default MAC

2018-02-14 Thread Thomas Monjalon
14/02/2018 01:00, Thomas Monjalon:
> > > > >> +* ethdev: The prototype and the behavior of
> > > > >> +  ``dev_ops->eth_mac_addr_set()`` will change in v18.05. A return code
> > > > >> +  will be added to notify the caller if an error occurred in the PMD. In
> > > > >> +  ``rte_eth_dev_default_mac_addr_set()``, the new default MAC address
> > > > >> +  will be copied in ``dev->data->mac_addrs[0]`` only if the operation is
> > > > >> +  successful. This modification will only impact the PMDs, not the
> > > > >> +  applications.
> > > >
> > > > Acked-by: Andrew Rybchenko 
> > > 
> > > Acked-by: Ferruh Yigit 
> > 
> > Acked-by: Shahaf Shuler 
> 
> Acked-by: Thomas Monjalon 

Applied
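
The announced behaviour (the driver hook returns a status, and the stored
default MAC address is updated only on success) could look roughly like the
sketch below. It assumes a much simplified device: struct mock_dev, its
mac_addr_set hook, pmd_accept() and pmd_reject() are hypothetical names
invented for this note and do not reproduce the real ethdev structures or
prototypes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical, simplified device; not the real rte_eth_dev layout. */
struct mock_dev {
	uint8_t mac_addrs[1][MAC_LEN];
	/* Driver hook: returns 0 on success, a negative value on failure. */
	int (*mac_addr_set)(struct mock_dev *dev, const uint8_t *addr);
};

/*
 * Generic-layer helper: ask the driver first, and copy the new address into
 * slot 0 only when the driver reports success.
 */
static int default_mac_addr_set(struct mock_dev *dev, const uint8_t *addr)
{
	int ret;

	assert(dev != NULL && addr != NULL && dev->mac_addr_set != NULL);
	ret = dev->mac_addr_set(dev, addr);
	if (ret < 0)
		return ret; /* stored address stays untouched */

	memcpy(dev->mac_addrs[0], addr, MAC_LEN);
	return 0;
}

/* Mock driver hook that always succeeds. */
static int pmd_accept(struct mock_dev *dev, const uint8_t *addr)
{
	(void)dev;
	(void)addr;
	return 0;
}

/* Mock driver hook that always fails. */
static int pmd_reject(struct mock_dev *dev, const uint8_t *addr)
{
	(void)dev;
	(void)addr;
	return -1;
}
```

With the pmd_reject() hook installed, the stored address keeps its old value
and the caller sees the error, which is the point of adding a return code.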


Re: [dpdk-dev] [PATCH v2] doc: update ethdev APIs to return named opaque type

2018-02-14 Thread Thomas Monjalon
> >>> Ethdev APIs to add callback return the callback object as "void *",
> >>> update return type to actual object type
> >>> "struct rte_eth_rxtx_callback *"
> >>>
> >>> Signed-off-by: Ferruh Yigit 
> >>> ---
> >>> +* ethdev: functions add rx/tx callback will return named opaque type
> >>> +  rte_eth_add_rx_callback(), rte_eth_add_first_rx_callback() and
> >>> +  rte_eth_add_tx_callback() functions currently return callback object as
> >>> +  "void \*" but APIs to delete callbacks get "struct rte_eth_rxtx_callback \*"
> >>> +  as parameter. For consistency functions adding callback will return
> >>> +  "struct rte_eth_rxtx_callback \*" instead of "void * ".
> >>> +
> >>>   * i40e: The default flexible payload configuration which extracts the first 16
> >>> bytes of the payload for RSS will be deprecated starting from 18.02. If
> >>> required the previous behavior can be configured using existing flow
> >>> --
> >>
> >> Acked-by: Konstantin Ananyev 
> > 
> > Acked-by: Jerin Jacob 
> > 
> Acked-by: Hemant Agrawal 

Applied


[dpdk-dev] [PATCH] doc: fix outdated link

2018-02-14 Thread Pablo de Lara
Fixes: 924e84f87306 ("aesni_mb: add driver for multi buffer based crypto")
Cc: sta...@dpdk.org

Signed-off-by: Pablo de Lara 
---
 doc/guides/cryptodevs/aesni_mb.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/guides/cryptodevs/aesni_mb.rst b/doc/guides/cryptodevs/aesni_mb.rst
index 888b87950..3950daae0 100644
--- a/doc/guides/cryptodevs/aesni_mb.rst
+++ b/doc/guides/cryptodevs/aesni_mb.rst
@@ -8,7 +8,7 @@ AESN-NI Multi Buffer Crypto Poll Mode Driver
 The AESNI MB PMD (**librte_pmd_aesni_mb**) provides poll mode crypto driver
 support for utilizing Intel multi buffer library, see the white paper
 `Fast Multi-buffer IPsec Implementations on Intel® Architecture Processors
-`_.
+`_.
 
 The AES-NI MB PMD has current only been tested on Fedora 21 64-bit with gcc.
 
-- 
2.14.3



Re: [dpdk-dev] [PATCH v3] doc: ethdev ABI change deprecation notice

2018-02-14 Thread Thomas Monjalon
14/02/2018 01:14, Thomas Monjalon:
> > > >> Signed-off-by: Kirill Rybalchenko 
> > > >>
> > > >> Acked-by: Marko Kovacevic 
> > > >> ---
> > > >> +* ethdev: announce ABI change
> > > >> +  The size of variables flow_types_mask in rte_eth_fdir_info structure,
> > > >> +  sym_hash_enable_mask and valid_bit_mask in rte_eth_hash_global_conf structure
> > > >> +  will be increased from 32 to 64 bits to fulfill hardware requirements.
> > > >> +  This change will break existing ABI as size of the structures will increase.
> > > >> +
> > > > Acked-by: Neil Horman 
> > > 
> > > Acked-by: Ferruh Yigit 
> > 
> > Acked-by: Olivier Matz 
> 
> Acked-by: Thomas Monjalon 
> 
> I would prefer you drop the legacy code to keep only rte_flow.

Applied
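
Why widening a mask from 32 to 64 bits breaks the ABI can be seen with two
reduced layouts. The structures below are hypothetical: only the field name
flow_types_mask comes from the notice above, while mode and trailer are
invented placeholders, so the exact sizes do not match the real
rte_eth_fdir_info.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced layout with the current 32-bit mask. */
struct fdir_info_v32 {
	uint32_t mode;            /* placeholder field */
	uint32_t flow_types_mask; /* 32 flow types at most */
	uint32_t trailer;         /* placeholder for the fields that follow */
};

/* Same layout after widening the mask to 64 bits. */
struct fdir_info_v64 {
	uint32_t mode;
	uint64_t flow_types_mask; /* room for flow types 32..63 */
	uint32_t trailer;
};

/* Returns the mask bit for a flow type index, valid for 0..63. */
static uint64_t flow_type_bit64(unsigned int idx)
{
	assert(idx < 64);
	return UINT64_C(1) << idx;
}

/*
 * Returns 1 when a binary built against the old layout would misread the
 * new one: either the total size or the offset of a later field moved.
 */
static int abi_layout_changed(void)
{
	return sizeof(struct fdir_info_v32) != sizeof(struct fdir_info_v64) ||
	       offsetof(struct fdir_info_v32, trailer) !=
	       offsetof(struct fdir_info_v64, trailer);
}
```

An application compiled against the 32-bit layout would read trailer at the
old offset and allocate the old size, which is why the change needs a
deprecation notice and a rebuild.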


Re: [dpdk-dev] [PATCH] doc/gsg: remove reference to old distros

2018-02-14 Thread Mcnamara, John


> -----Original Message-----
> From: Van Haaren, Harry
> Sent: Wednesday, February 14, 2018 4:16 PM
> To: dev@dpdk.org
> Cc: tho...@monjalon.net; Mcnamara, John ; Van
> Haaren, Harry 
> Subject: [PATCH] doc/gsg: remove reference to old distros
> 
> Remove reference to Fedora 18 which is EOL-ed, reword surrounding
> sentences to read correctly.
> 
> Signed-off-by: Harry van Haaren 

Acked-by: John McNamara 




Re: [dpdk-dev] [PATCH] doc: fix ethdev API port_id parameter size

2018-02-14 Thread Thomas Monjalon
> > > > Fix rte_eth_dev_get_sec_ctx() parameter port_id storage size, from
> > > > uint8_t to uint16_t
> > > >
> > > > Signed-off-by: Ferruh Yigit 
> > > > ---
> > > Acked-by: Radu Nicolau 
> > 
> > Acked-by: Hemant Agrawal 
> 
> Acked-by: Jerin Jacob 

Applied


Re: [dpdk-dev] [PATCH] doc/gsg: remove reference to old distros

2018-02-14 Thread Thomas Monjalon
> > Remove reference to Fedora 18 which is EOL-ed, reword surrounding
> > sentences to read correctly.
> > 
> > Signed-off-by: Harry van Haaren 
> 
> Acked-by: John McNamara 

Applied, thanks



Re: [dpdk-dev] [PATCH] doc: fix outdated link

2018-02-14 Thread Thomas Monjalon
14/02/2018 18:14, Pablo de Lara:
> Fixes: 924e84f87306 ("aesni_mb: add driver for multi buffer based crypto")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Pablo de Lara 

Applied, thanks


Re: [dpdk-dev] [PATCH] doc: add tested platforms with Mellanox NICs

2018-02-14 Thread Thomas Monjalon
14/02/2018 15:30, Raslan Darawsheh:
> Signed-off-by: Raslan Darawsheh 

Applied with a few rst fixes, thanks



Re: [dpdk-dev] [PATCH] doc: add virtio GUEST ANNOUNCE to release notes

2018-02-14 Thread Thomas Monjalon
14/02/2018 00:31, Thomas Monjalon:
> 09/02/2018 10:56, Mcnamara, John:
> > From: Wang, Xiao W
> > > +* **Added VIRTIO_NET_F_GUEST_ANNOUNCE feature support in virtio pmd.**
> > > +
> > > +  In scenario where the vhost backend doesn't have the ability to
> > > + generate RARP  packet, the VM running virtio pmd can still be live
> > > + migrated if  VIRTIO_NET_F_GUEST_ANNOUNCE feature is negotiated.
> > > +
> > >  .. note::
> > > 
> > >  This new build system support is incomplete at this point and is
> > > added
> > 
> > The text has been added between the previous section and a note belonging
> > to the previous section.
> 
> Please, can you merge this text with the item added by Jiayu about migration?

Applied as it is reworded in the final release notes patch by John.


Re: [dpdk-dev] [PATCH v2] doc: update release notes for 18.02

2018-02-14 Thread Thomas Monjalon
14/02/2018 14:50, John McNamara:
> Fix grammar, spelling and formatting of DPDK 18.02 release notes.
> 
> Signed-off-by: John McNamara 

Applied, thanks



Re: [dpdk-dev] [PATCH] doc: add kernel version deprecation notice

2018-02-14 Thread Stephen Hemminger
On Wed, 14 Feb 2018 11:44:15 +
Bruce Richardson  wrote:

> On Wed, Feb 14, 2018 at 10:54:44AM +, Luca Boccassi wrote:
> > On Wed, 2018-02-14 at 11:38 +0100, Maxime Coquelin wrote:  
> > > 
> > > On 02/14/2018 11:31 AM, Luca Boccassi wrote:  
> > > > On Wed, 2018-02-14 at 00:58 +0100, Thomas Monjalon wrote:  
> > > > > 31/01/2018 16:27, Stephen Hemminger:  
> > > > > > Notify users of upcoming change in kernel requirement.
> > > > > > Encourage users to use current LTS kernel version.
> > > > > > 
> > > > > > Signed-off-by: Stephen Hemminger 
> > > > > > > ---
> > > > > > > +* linux: Linux kernel version 3.2 (which is the current minimum required
> > > > > > > +  version for the DPDK) will be end of life in May 2018.  Therefore the planned
> > > > > > > +  minimum required kernel version for DPDK 18.5 will be next oldest Long
> > > > > > > +  Term Stable (LTS) version which is 3.10. The recommended kernel version is
> > > > > > > +  the latest LTS kernel which currently is 4.14.
> > > > > 
> > > > > We could print a warning at EAL init if kernel version does not
> > > > > satisfy the minimal requirement.
> > > > > 
> > > > > Acked-by: Thomas Monjalon   
> > > > 
> > > > > Note that 3.10 is dead as well since last year (as I discovered
> > > > > with immense joy when I had to backport meltdown fixes...), the
> > > > > next LTS is 3.16 which will be maintained until 04/2020.
> > > >   
> > > 
> > > In this case we should differentiate upstream Kernel versions from
> > > > downstream ones. For example, RHEL7/CentOS7 are based on v3.10 and
> > > still maintained.  
> > 
> > Ubuntu does 3.13 as well - I think the problem is that if we want to
> > support distro-specific LTS kernel versions, we need volunteers to do
> > the work for them :-)
> > 
> > --   
> I think our kernel support plans need to be two-fold:
> 
> 1) we need to support a minimum "kernel.org" kernel version, which is
> what the deprecation notice is about.
> 2) we also will be supporting LTS distributions, e.g. I would expect us to
> always support the latest RHEL, so that should be noted explicitly in the
> GSG IMHO.

For distributions, we need to have a maintainer. Ideally, from the vendor or
project doing the distribution.
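
Thomas's suggestion to print a warning at init when the running kernel is
older than the minimum could be sketched as below. This is not EAL code:
kernel_at_least() and warn_if_kernel_too_old() are hypothetical helpers, the
minimum is passed in by the caller because the thread has not settled on a
value, and only the upstream major.minor prefix of the release string is
compared (distribution backports are ignored).

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Compare the leading "major.minor" of a kernel release string against a
 * minimum. Returns 1 if new enough, 0 if older, -1 if it cannot be parsed.
 */
static int kernel_at_least(const char *release, int min_major, int min_minor)
{
	int major, minor;

	assert(release != NULL);
	if (sscanf(release, "%d.%d", &major, &minor) != 2)
		return -1;
	if (major != min_major)
		return major > min_major;
	return minor >= min_minor;
}

/*
 * Query the running kernel with uname(2) and warn on stderr if it is older
 * than the requested minimum. Returns the kernel_at_least() result, or -1
 * if uname() fails.
 */
static int warn_if_kernel_too_old(int min_major, int min_minor)
{
	struct utsname u;
	int ok;

	if (uname(&u) != 0)
		return -1;

	ok = kernel_at_least(u.release, min_major, min_minor);
	if (ok == 0)
		fprintf(stderr,
			"WARNING: kernel %s is older than the minimum %d.%d\n",
			u.release, min_major, min_minor);
	return ok;
}
```

As the replies point out, a plain version comparison says nothing about
distribution kernels such as the 3.10-based RHEL7/CentOS7 ones, so a check
like this would only cover the "kernel.org" half of the support policy.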


Re: [dpdk-dev] [RFC v2 00/23] Dynamic memory allocation for DPDK

2018-02-14 Thread Yongseok Koh


> On Feb 14, 2018, at 1:32 AM, Burakov, Anatoly  
> wrote:
> 
> On 14-Feb-18 2:01 AM, Yongseok Koh wrote:
>>> On Feb 5, 2018, at 2:03 AM, Burakov, Anatoly  
>>> wrote:
>>> 
>>> Thanks for your feedback, good to hear we're on the right track. I already 
>>> have a prototype implementation of this working, due for v1 submission :)
>> Anatoly,
>> One more suggestion. Currently, when populating mempool, there's a chance to
>> have multiple chunks if system memory is highly fragmented. However, with 
>> your
>> new design, it is unlikely to happen unless the system is really low on 
>> memory.
>> Allocation will be dynamic and page by page. With your v2, you seemed to make
>> minimal changes on mempool. If allocation fails, it will still try to gather
>> fragments from malloc_heap until it acquires enough objects and the resultant
>> mempool will have multiple chunks. But like I mentioned, it is very unlikely 
>> and
>> this will only happen when the system is short of memory. Is my understanding
>> correct?
>> If so, how about making a change to drop the case where mempool has multiple
>> chunks?
>> Thanks
>> Yongseok
> 
> Hi Yongseok,
> 
> I would still like to keep it, as it may impact low memory cases such as 
> containers.

Agreed. I overlooked that kind of use-cases.

Thanks,
Yongseok

[dpdk-dev] [PATCH] net/i40evf: regression fix - reenable interrupts in handler

2018-02-14 Thread Konrad Jankowski
Commit 66b8304f removed the rte_intr_enable() call from
i40evf_dev_interrupt_handler() as a "bonus". On one of my systems this causes
the AdminQ messages to stop being delivered to the VF. This results in an
inability to initialize and use the port. With this patch it works again.

System in question:
Wind River OVP6 running kernel 3.10.58-ovp-rt58-WR6.0.0.13_preempt-rt

Signed-off-by: Konrad Jankowski 
---
 drivers/net/i40e/i40e_ethdev_vf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index fd003fe..b927a35 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1404,6 +1404,7 @@ i40evf_dev_interrupt_handler(void *param)
 
 done:
i40evf_enable_irq0(hw);
+   rte_intr_enable(dev->intr_handle);
 }
 
 static int
-- 
2.5.5




[dpdk-dev] [dpdk-announce] DPDK 18.02 released

2018-02-14 Thread Thomas Monjalon
A new major release is available:
http://fast.dpdk.org/rel/dpdk-18.02.tar.xz

Special attention was paid to not break the ABI in this release.
It means 18.02 could replace 17.11 without rebuilding the applications.
However it is advised to keep using 17.11 LTS for long term deployments.

Some highlights:
- new license header (SPDX tag)
- bbdev (Wireless Base Band) device class
- rawdev device class
- ethdev probe notifications and port ownership
- Hyper-V platform driver
- AVF (Adaptive Virtual Function) ethdev driver
- IPsec offload in DPAA
- DPAA eventdev driver
- OPDL (Ordered Packet Distribution Library) eventdev driver
- experimental tags and automatic check
- meson build system (beta)

More details in the release notes:
http://dpdk.org/doc/guides/rel_notes/release_18_02.html

The statistics are similar to previous release:
1315 patches from 145 authors
2316 files changed, 100569 insertions(+), 77209 deletions(-)

There are 46 new contributors
(including authors, reviewers and testers):
Thanks to Aleksey Baulin, Amr Mokhtar, Andrea Grandi, Andrew Jackson,
Anoob Joseph, Avi Kivity, Bao-Long Tran, Bharat Mota, Cheryl Houser,
Ciara Power, David Coyle, Dustin Lundquist, Erik Gabriel Carrillo,
George Wilkie, Georgios Katsikas, Gong Deli, Hyong Youb Kim,
Jerry Lilijun, Jun Yang, Junjie Chen, Kefu Chai, Kevin Laatz,
Laszlo Ersek, Liang Ma, Mallesh Koujalagi, Martin Klozik,
Matthew Smith, Michael McConville, Natalie Samsonov, Nikhil Agarwal,
Peter Mccarthy, Prashant Bhole, Rafal Kozik, Rosen Xu, Roy Franz,
Sharmila Podury, Stefan Hajnoczi, Sunil Kumar Kori, Thomas Speier,
Tomasz Jozwiak, Vijay Srivastava, Wisam Jaddo, Xin Long, Yang Zhang,
Yanglong Wu and Zhike Wang.

Below is the number of patches per company (accuracy not perfect):
463 Intel (57)
213 Mellanox (11)
132 NXP (7)
131 Cavium (9)
102 6WIND (8)
 83 Solarflare (6)
 27 Broadcom (2)
 24 RedHat (5)
 21 Semihalf (3)
 20 Microsoft (2)
 17 Cisco (3)
 16 OKTET Labs (2)
  9 AT&T (4)
  6 Marvell (1)
  5 Netronome (1)
  5 IBM (2)
  4 ZTE (1)
  4 Linaro (1)
  4 HXT Semiconductor (1)
  4 ARM (2)

The new features for 18.05 must be submitted before the next month,
in order to be reviewed and integrated during March.
The next release is expected to happen at the beginning of May.

Thanks everyone

PS: Like last year, this release is done during Valentine's day.
It is an opportunity to stop working and offer a day to your Valentine!


[dpdk-dev] [PATCH v6] checkpatches.sh: Add checks for ABI symbol addition

2018-02-14 Thread Neil Horman
Recently, some additional patches were added to allow for programmatic
marking of C symbols as experimental.  The addition of these markers is
dependent on the manual addition of exported symbols to the EXPERIMENTAL
section of the corresponding library's version map file.  The consensus
on review is that, in addition to mandating the addition of symbols to
the EXPERIMENTAL version in the map, we need a mechanism to enforce our
documented process of mandating that addition when they are introduced.
To that end, I am proposing this change.  It is an addition to the
checkpatches script, which scans incoming patches for additions and
removals of symbols to the map file, and warns the user appropriately.

Signed-off-by: Neil Horman 
CC: tho...@monjalon.net
CC: john.mcnam...@intel.com
CC: bruce.richard...@intel.com
CC: Ferruh Yigit 
CC: Stephen Hemminger 

---
Change notes

v2)
 * Cleaned up and documented awk script (shemminger)
 * fixed sort/uniq usage (shemminger)
 * moved checking to new script (tmonjalon)
 * added maintainer entry (tmonjalon)
 * added license (tmonjalon)

v3)
 * Changed symbol check script name (tmonjalon)
 * Trapped exit to clean temp file (tmonjalon)
 * Honored verbose command (tmonjalon)
 * Cleaned left over debug bits (tmonjalon)
 * Updated location in MAINTAINERS file (tmonjalon)

v4)
 * Updated maintainers file (tmonjalon)

v5)
 * undo V4 (tmonjalon)

v6)
 * Cleaning up more nits (tmonjalon)
 * Combining some lines (tmonjalon)
 * Fixing error print condition (tmonjalon)
 * Redirect stdin to a file to allow rewinding for
   Multiple passes on tools (nhorman)
---
 MAINTAINERS |   1 +
 devtools/check-symbol-change.sh | 146 
 devtools/checkpatches.sh|  46 +++--
 3 files changed, 188 insertions(+), 5 deletions(-)
 create mode 100755 devtools/check-symbol-change.sh

diff --git a/MAINTAINERS b/MAINTAINERS
index a646ca3e1..f83b9ab33 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -87,6 +87,7 @@ M: Neil Horman 
 F: lib/librte_compat/
 F: doc/guides/rel_notes/deprecation.rst
 F: devtools/validate-abi.sh
+F: devtools/check-symbol-change.sh
 F: buildtools/check-experimental-syms.sh
 
 Driver information
diff --git a/devtools/check-symbol-change.sh b/devtools/check-symbol-change.sh
new file mode 100755
index 0..22b17e6f2
--- /dev/null
+++ b/devtools/check-symbol-change.sh
@@ -0,0 +1,146 @@
+#!/bin/sh
+
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018 Neil Horman 
+
+build_map_changes()
+{
+   local fname=$1
+   local mapdb=$2
+
+   cat $fname | filterdiff -i *.map | awk '
+   # Initialize our variables
+   BEGIN {map="";sym="";ar="";sec=""; in_sec=0}
+
+   # Anything that starts with + or -, followed by an a
+   # and ends in the string .map is the name of our map file
+   # This may appear multiple times in a patch if multiple
+   # map files are altered, and all section/symbol names
+   # appearing between a triggering of this rule and the
+   # next trigger of this rule are associated with this file
+   /[-+] a\/.*\.map/ {map=$2}
+
+   # Triggering this rule, which starts a line with a + and ends it
+   # with a { identifies a versioned section.  The section name is
+   # the rest of the line with the + and { symbols removed.
+   # Triggering this rule sets in_sec to 1, which activates the
+   # symbol rule below
+   /+.*{/ {gsub("+","");sec=$1; in_sec=1}
+
+   # This rule identifies the end of a section, and disables the
+   # symbol rule
+   /.*}/ {in_sec=0}
+
+   # This rule matches on a + followed by any characters except a :
+   # (which denotes a global vs local segment), and ends with a ;.
+   # The semicolon is removed and the symbol is printed with its
+   # association file name and version section, along with an
+   # indicator that the symbol is a new addition.  Note this rule
+   # only works if we have found a version section in the rule
+   # above (hence the in_sec check).  Otherwise we flag it as an
+   # unknown section
+   /^+[^}].*[^:*];/ {gsub(";","");sym=$2;
+   if (in_sec == 1) {
+   print map " " sym " " sec " add"
+   } else {
+   print map " " sym " unknown add"
+   }
+   }
+
+   # This is the same rule as above, but the rule matches on a
+   # leading - rather than a +, denoting that the symbol is being
+   # removed.
+   /^-[^}].*[^:*];/ {gsub(";","");sym=$2;
+   if (in_sec == 1) {
+   print map " " sym " " sec " del"
+

Re: [dpdk-dev] IXGBE, IOMMU DMAR DRHD handling fault issue

2018-02-14 Thread Ravi Kerur
On Tue, Feb 13, 2018 at 6:31 AM, Burakov, Anatoly  wrote:

> On 12-Feb-18 10:00 PM, Ravi Kerur wrote:
>
>>
>> Let me just give you what has been tested and working/nonworking
>> scenarios. Some of your questions might get answered as well.
>> Test bed is very simple with 2 VF's created under IXGBE PF on
>> host with one VF interface added to ovs-bridge on host and
>> another VF interface given to guest. Test connectivity between
>> VF's via ping.
>>
>> Host and guest -- Kernel 4.9
>> Host -- Qemu 2.11.50 (tried both released 2.11 and tip of the
>> git (2.11.50))
>> DPDK -- 17.05.1 on host and guest
>> Host and guest -- booted with GRUB intel_iommu=on (which enables
>> IOMMU). Have tried with "iommu=on and intel_iommu=on" as well,
>> but iommu=on is not needed when intel_iommu=on is set.
>>
>> Test-scenario-1: Host -- ixgbe_vf driver, Guest ixgbe_vf driver
>> ping works
>> Test-scenario-2: Host -- DPDK vfio-pci driver, Guest ixgbe_vf
>> driver ping works
>> Test-scenario-3: Host -- DPDK vfio-pci driver, Guest DPDK
>> vfio-pci driver, DMAR errors seen on host, ping doesn't work
>>
>>
>> OK, that makes it clearer, thanks. Does the third scenario work in
>> other DPDK versions?
>>
>>
>> No. Tried 16.11 same issue on guest and works fine on host.
>>
>>
>> So now we've moved from "this worked on 16.11" to "this never worked".
>
> It would be good to see output of rte_dump_physmem_layout() on both host
> and guest, and check which address triggers the DMAR error (i.e. if the
> physical address is present in mappings for either DPDK process).
>
> --
>

Earlier I was focusing only on DMAR errors and I might have said 'it
worked' when I didn't notice them on host when dpdk was started on guest.
When trying to send packets out of that interface from guest I did see DMAR
errors. I am attaching information you requested.  I have enabled
log-level=8 and files contain dpdk EAL/PMD logs as well.

Snippets below

on host, DMAR fault address from dmesg

[351576.998109] DMAR: DRHD: handling fault status reg 702
[351576.998113] DMAR: [DMA Read] Request device [04:10.0] fault addr
257617000 [fault reason 06] PTE Read access is not set

on guest (dump phys_mem_layout)

Segment 235: phys:0x25760, len:2097152, virt:0x7fce87e0,
socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0
...
PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7fce87e0f4c0
sw_sc_ring=0x7fce87e07380 hw_ring=0x7fce87e17600 dma_addr=0x257617600
PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7fce89c67d40
sw_sc_ring=0x7fce89c5fc00 hw_ring=0x7fce89c6fe80 dma_addr=0x25406fe80
...

Thanks,
Ravi



> Thanks,
> Anatoly
>
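
One way to relate the two log snippets above is to compare page bases. The
sketch below assumes that the DMAR fault address is printed in hexadecimal
without a 0x prefix and that the IOMMU reports faults at 4 KiB granularity;
both are assumptions of this note, and the ring length does not appear in
the logs, so len is a caller-supplied placeholder. page_base() and
fault_in_region() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: faults are reported with 4 KiB page granularity. */
#define IOMMU_PAGE_SIZE UINT64_C(4096)

/* Round an address down to the start of its page. */
static uint64_t page_base(uint64_t addr)
{
	return addr & ~(IOMMU_PAGE_SIZE - 1);
}

/*
 * Returns 1 if the faulting page overlaps the DMA region
 * [dma_addr, dma_addr + len), 0 otherwise.
 */
static int fault_in_region(uint64_t fault_addr, uint64_t dma_addr, uint64_t len)
{
	assert(len > 0);
	return page_base(fault_addr) >= page_base(dma_addr) &&
	       page_base(fault_addr) <= page_base(dma_addr + len - 1);
}
```

Under those assumptions, the page base of the first queue's
dma_addr=0x257617600 is 0x257617000, the same value as the reported fault
address, while the second queue's dma_addr=0x25406fe80 falls in a different
page.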


Re: [dpdk-dev] [PATCH] usertools/dpdk-devbind.py: add support for wind river avp device

2018-02-14 Thread Zhang, Xiaohua
It's no problem for me to move it to the "network" category.
Should I generate a new patch?


BR.
Xiaohua Zhang

-----Original Message-----
From: Bruce Richardson [mailto:bruce.richard...@intel.com] 
Sent: Wednesday, February 14, 2018 6:32 PM
To: BURAKOV, ANATOLY
Cc: Zhang, Xiaohua; YIGIT, FERRUH; dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] usertools/dpdk-devbind.py: add support for wind river avp device

On Wed, Feb 14, 2018 at 09:57:25AM +, Burakov, Anatoly wrote:
> On 14-Feb-18 12:48 AM, Zhang, Xiaohua wrote:
> > Hi Yigit and Anatoly,
> > I checked the nics-17.11.pdf, the following is description:
> > "The Accelerated Virtual Port (AVP) device is a shared memory based 
> > device only available on virtualization platforms from Wind River 
> > Systems. The Wind River Systems virtualization platform currently 
> > uses QEMU/KVM as its hypervisor and as such provides support for all 
> > of the QEMU supported virtual and/or emulated devices (e.g., virtio, 
> > e1000, etc.). The platform offers the virtio device type as the 
> > default device when launching a virtual machine or creating a 
> > virtual machine port. The AVP device is a specialized device available to 
> > customers that require increased throughput and decreased latency to meet 
> > the demands of their performance focused applications."
> > 
> > I am afraid  just "memory_device" will have some misunderstanding.
> > Could we put it as "avp device (shared memory based)"?
> > 
> > 
> 
> Hi,
> 
> Well, from AVP PMD documentation, it seems that AVP is classified as a NIC.
> Can't we just add it to the list of NICs, even if it's not Ethernet 
> class 0x20xx? Pattern-matching in devbind should work either way. For 
> example, you can see there's "cavium_pkx" already classified as a NIC, 
> even though its class is 08xx, not 02xx. So why not this one?
> 

Definite +1.

It's used for packet IO into a vm, like virtio, and its driver is in 
drivers/net.

"If it looks like a NIC, and quacks like a NIC, then it probably is a NIC". 
[Alternatively if it looks and quacks like a duck, I'm not sure what it's doing 
in DPDK!]

/Bruce



[dpdk-dev] [dpdk-announce] DPDK Bangalore Summit 2016, Agenda announced

2018-02-14 Thread Tibrewala, Sujata
The agenda [1] for DPDK Summit Bangalore 2018 (March 9th, Leela Palace,
Bangalore, India) has been announced.

The agenda covers many interesting topics such as Hardware assist with DPDK, VM 
optimizations, Memzone Monitor, Data Plane Corruption, Service Function 
Chaining, OVN, OVS hardware offload, SPDK, NVMe-oF target service via 
SPDK/DPDK, Open Transport Layer Protocols in the Cloud Networking etc.  There 
will be a demo zone where you will see various interesting demos by the 
community and our sponsors.

We are looking forward to seeing you there. Please register here [2].

Reminder: This is a community conference - so let's try to avoid blatant 
product and/or vendor sales pitches.

Question on registrations? Contact us at eve...@dpdk.org

[1] https://dpdkbangalore2018.sched.com/
[2] 
https://www.regonline.com/registration/checkin.aspx?EventId=2151252&RegTypeID=489278

Thanks
Sujata Tibrewala @sujatatibre
Community Development Manager 
Intel Developer Zone
https://software.intel.com/networking
NPG Marketing Training PM (DOT)



[dpdk-dev] [dpdk-announce] DPDK hands on lab Bangalore March 10th

2018-02-14 Thread Tibrewala, Sujata
Hi,

Agenda for the DPDK hands on lab  Bangalore has been published. Please apply to 
attend at [1].
If you want to stay in touch with our local Bangalore DPDK events do not forget 
to join our meet up group at [2].

DPDK hands on Lab
What new features does DPDK bring in 2018?
We will provide an overview of the new features of the latest DPDK release,
including source code browsing and an API listing for its top two new features.
On top of that, there will be a hands-on lab, on servers based on the
Intel(R) microarchitecture code-named Skylake, to learn how getting started
with DPDK is becoming much simpler and more powerful.

NFVI Enabling Kit Demo/Lab
An easy-to-use, automatic, self-contained toolkit to accelerate ODM* 
benchmarking NFVi-ready server designs on Whitley platform based on golden 
benchmark to characterize baseline performance test on DPDK, QAT and OVS, 
running on a single Xeon SP server.



Centralized Emergency Traffic Optimizer NEV SDK

We will be showcasing our CETO (Centralized Emergency Traffic Optimizer), a V2X 
and connected cars use case utilizing mobile edge computing framework using 
edge and centralized computing and analytics engine. This use case will 
showcase how edge traffic control engine is used to find the shortest path and 
create fastest traffic route for emergency vehicles by clearing the traffic of 
each traffic junction before the emergency vehicle arrives at the junction. To 
calculate the path, it considers the current density of each traffic junction 
and predicted density on each junction on the emergency vehicle suggested using 
the analytics engine running on the edge node. Assuming all cars are connected 
cars, It also connects to each car to suggest an alternative route to their 
destination if the car is on the same path as ambulance to reduce traffic 
congestion and faster route for all the vehicles at the same time. 

There are three ways to showcase it:
1) Using our cloud RAN, MME, UE and Intel's MEC, which will be deployed on their
network. The challenge in this approach is that we are still not very clear on the
connectivity part during the hands-on session, i.e., connectivity of the
laptop at the premises to the server that will run remotely in your New Mexico
lab. Once we test this, we will be sure.
2) Complete our own setup, including MEC, on our own laptop. This will be the
backup, with very limited features.


[1] http://bit.ly/2mU41YZ
[2] 
https://www.meetup.com/Out-of-the-Box-Network-Developers-Bangalore/events/246703001/


Thanks
Sujata Tibrewala @sujatatibre
Community Development Manager 
Intel Developer Zone
https://software.intel.com/networking
NPG Marketing Training PM (DOT)




Re: [dpdk-dev] [RFC v1 1/1] lib/cryptodev: add support of asymmetric crypto

2018-02-14 Thread Verma, Shally
Hi Fiona

Thanks for your feedback. Response below.

>-----Original Message-----
>From: Trahe, Fiona [mailto:fiona.tr...@intel.com]
>Sent: 09 February 2018 23:43
>To: dev@dpdk.org; Athreya, Narayana Prasad; Murthy, Nidadavolu; Sahu, Sunila;
>Gupta, Ashish; Verma, Shally; Doherty, Declan; Keating, Brian A; Griffin, John
>Cc: Trahe, Fiona; De Lara Guarch, Pablo
>Subject: RE: [dpdk-dev] [RFC v1 1/1] lib/cryptodev: add support of asymmetric crypto
>
>Hi Shally,
>Comments below.
>
>> -----Original Message-----
>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Shally Verma
>> Sent: Tuesday, January 23, 2018 9:54 AM
>> To: Doherty, Declan
>> Cc: dev@dpdk.org; pathr...@caviumnetworks.com; nmur...@caviumnetworks.com;
>> ss...@caviumnetworks.com; agu...@caviumnetworks.com; Shally Verma
>> Subject: [dpdk-dev] [RFC v1 1/1] lib/cryptodev: add support of asymmetric crypto
>>
>> From: Shally Verma 
>>
>> Add support for asymmetric crypto operations in DPDK lib cryptodev
>>
>> Key features include:
>> - Only session based asymmetric crypto operations
>> - new get and set APIs for symmetric and asymmetric session private
>>   data and other information
>> - APIs to create, configure and attach queue pair to asymmetric sessions
>> - new capabilities in struct device_info to indicate
>>   -- number of dedicated queue pairs available for symmetric and
>>  asymmetric operations, if any
>>   -- number of asymmetric sessions possible per qp
>>
>[Fiona] Though it's probably premature to include on the API, have you considered
>providing for pre-loaded keys in future, i.e. the ability to wrap keys or refer to keys
>already stored securely on the device, as an alternative to passing the keys on the API?
>
[Shally] The current intention of the DPDK asym spec is to expose HW capabilities 
to the application so that it can offload these compute-intensive ops. 
Your use case is very much a practical requirement, but to achieve it I believe 
we need a layer above this spec, with different interfaces: for example, a 
public/secure key library that provides interfaces to generate, store and 
perform ops using opaque key handles, and that can internally use DPDK to do 
part of the work. For such a case, I envision that DPDK (or the library above 
it) will likely run on some secure processor. Thus, I have kept this use case 
out of the scope of the current spec.

>
>> Proposed asymmetric cryptographic operations are:
>> - rsa
>> - dsa
>> - Diffie-Hellman key pair generation and shared key computation
>> - EC Diffie-Hellman
>> - fundamental elliptic curve operations
>> - elliptic curve DSA
>> - modular exponentiation and inversion
>>
>> This patch primarily defines PMD operations and device capabilities
>> to perform asymmetric crypto ops on queue pairs, and intends to
>> invite feedback on the current proposal so as to ensure it encompasses
>> all kinds of crypto devices with different capabilities and queue
>> pair management.
>>
>> List of TBDs:
>> - Currently, patch only updated for RSA xform and associated params.
>>   Other algorithms to be added in subsequent versions.
>> - per-service stats update
>>
>> Signed-off-by: Shally Verma 
>> ---
>>
>> It is derivative of RFC v2 asymmetric crypto patch series initiated by
>> Umesh Kartha(mailto:umesh.kar...@caviumnetworks.com):
>>
>>  http://dpdk.org/dev/patchwork/patch/24245/
>>  http://dpdk.org/dev/patchwork/patch/24246/
>>  http://dpdk.org/dev/patchwork/patch/24247/
>>
>> And inclusive of all review comments given on RFC v2.
>>  ( See complete discussion thread here:
>> http://dev.dpdk.narkive.com/yqTFFLHw/dpdk-dev-rfc-specifications-for-asymmetric-crypto-algorithms#post12)
>>
>> Some of the RFCv2 Review comments pending for closure:
>> > " [Fiona] The count fn isn't used at all for sym - probably no need to add for asym
>>  better instead to remove the sym fn."
>>
>>  It is still present in dpdk-next-crypto for sym, so what has been the decision
>>  on it?
>[Fiona] No change. The rte_cryptodev_ops fn is still not called, so it is useless
>and should be removed.
>rte_cryptodev_queue_pair_count() returns the num_qps configured in
>rte_cryptodev_configure(), but never calls the PMD dev_ops.queue_pair_count().
>So cryptodev_sym_queue_pair_count_t should be deprecated.
>And no point in adding one for asym.
>

[Shally] Ok

>>
>> >"[Fiona] if each qp can handle only a specific service, i.e. a subset of the capabilities
>> indicated by the device capability list, there's a need for a new API to query
>> the capability of a qp."
>>
>> Current proposal doesn’t distinguish between device capability and qp capability.
>> It rather leaves the handling of such differences internal to PMDs. Thus no capability
>> or API is added for qp in the current version. It is subject to revisit based on review
>> feedback on the current proposal.
>[Fiona] This would not work for some devices, comments below.
>
>>
>> - Sessionless Support.
>> Current proposal only support Ses

Re: [dpdk-dev] [RFC v2] doc compression API for DPDK

2018-02-14 Thread Verma, Shally


>-----Original Message-----
>From: Ahmed Mansour [mailto:ahmed.mans...@nxp.com]
>Sent: 14 February 2018 22:25
>To: Verma, Shally; Trahe, Fiona; dev@dpdk.org
>Cc: Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila; De Lara Guarch, Pablo;
>Challa, Mahipal; Jain, Deepak K; Hemant Agrawal; Roy Pledge; Youri Querry
>Subject: Re: [RFC v2] doc compression API for DPDK
>
>On 2/14/2018 12:41 AM, Verma, Shally wrote:
>> Hi Ahmed
>>
>>> -----Original Message-----
>>> From: Ahmed Mansour [mailto:ahmed.mans...@nxp.com]
>>> Sent: 02 February 2018 02:20
>>> To: Trahe, Fiona; Verma, Shally; dev@dpdk.org
>>> Cc: Athreya, Narayana Prasad; Gupta, Ashish; Sahu, Sunila; De Lara Guarch, Pablo;
>>> Challa, Mahipal; Jain, Deepak K; Hemant Agrawal; Roy Pledge; Youri Querry
>>> Subject: Re: [RFC v2] doc compression API for DPDK
>>>
>>>
>> [Fiona] I propose if BFINAL bit is detected before end of input
>> the decompression should stop. In this case consumed will be < src.length.
>> produced will be < dst buffer size. Do we need an extra STATUS response?
>> STATUS_BFINAL_DETECTED  ?
> [Shally] @fiona, I assume you mean here that the decompressor stops after
> processing the final block, right?
 [Fiona] Yes.

> And if yes, and if it can process that final block successfully/unsuccessfully,
> then status could simply be SUCCESS/FAILED.
> I don't see the need for a specific return code for this use case. Just to share,
> in the past we have practically run into such cases with boost lib, and the
> decompressor has simply worked this way.
 [Fiona] I'm ok with this.

>> The only thing I don't like about this is that it can impact performance, as
>> normally we can just look for STATUS == SUCCESS. Anything else should be an
>> exception. Now the application would have to check for
>> SUCCESS || BFINAL_DETECTED every time.
>> Do you have a suggestion on how we should handle this?
>>
>>> [Ahmed] This makes sense. So in all cases the PMD should assume that it
>>> should stop as soon as a BFINAL is observed.
>>>
>>> A question. What happens in stateful vs stateless modes when
>>> decompressing an op that encompasses multiple BFINALs? I assume the
>>> caller in that case will use the consumed=x bytes to find out how far
>>> into the input the end of the first stream is, and start from the next
>>> byte. Is this correct?
>> [Shally] As per my understanding, each op can be tied to only one stream, as we
>> have only one stream pointer per op, and one stream can have only one BFINAL
>> (as a stream is one complete compressed data). But it looks like you're
>> suggesting a case where one op can carry multiple independent streams, and
>> thus multiple BFINALs? For example, below is an op pointing to more than one
>> stream:
>>
>> 
>> op --> |stream1|stream2| |stream3|
>>
>>
>> Could you confirm if I understand your question correctly?
>[Ahmed] Correct. We found that in some storage applications the user
>does not know where exactly the BFINAL is. They rely on zlib software
>today. zlib.net software halts at the first BFINAL. Users put multiple
>streams in one op and rely on zlib to stop and inform them of the end
>location of the first stream.

[Shally] Then this is practically a possible case on the decompressor, and the 
decompressor doesn't regard the flush flag. So in that case, I expect the PMD to 
internally reset itself (say, in the case of zlib, by going through a cycle of 
inflateEnd and inflateInit, or inflateReset) and return with status = SUCCESS 
with updated produced and consumed. Now in such a case, if the previous stream 
also has some footer followed by the start of the next stream, then I am not 
sure how the PMD / lib can support that case. Have you practically run such a 
use case on zlib? If yes, how does such an application handle it in your 
experience? 
I can imagine that for such input zlib would return Z_STREAM_END to the user 
after the 1st BFINAL is processed, with the application then doing 
inflateReset() or an End-Init cycle before starting with the next stream. But 
if the next input doesn't start with a valid zlib header, then it will likely 
throw an error.
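
For reference, the behaviour discussed here can be tried from Python's zlib 
bindings, which wrap the same zlib library. The sketch below is not DPDK code 
and the helper names are hypothetical; it only shows a decompressor halting 
at the end of the first stream, the caller deriving "consumed" from the 
leftover input, and a fresh decompressor (the equivalent of a reset) handling 
the next stream. It assumes zlib-format streams, so the footer of each stream 
is consumed before the next stream starts.

```python
import zlib


def inflate_first_stream(buf):
    """Decompress only the first zlib stream found in buf.

    Returns (produced_bytes, consumed), where consumed counts the input
    bytes that belonged to the first stream, footer included.
    """
    d = zlib.decompressobj()
    out = d.decompress(buf)
    if not d.eof:
        # The final block was never seen: the input is truncated.
        raise ValueError("input ended before the end of the stream")
    # Everything after the first stream is left untouched in unused_data.
    consumed = len(buf) - len(d.unused_data)
    return out, consumed


def inflate_all_streams(buf):
    """Walk a buffer that holds several back-to-back streams."""
    outputs = []
    offset = 0
    while offset < len(buf):
        # A new decompressor per stream plays the role of the reset cycle.
        out, consumed = inflate_first_stream(buf[offset:])
        outputs.append(out)
        offset += consumed
    return outputs


stream1 = zlib.compress(b"stream one")
stream2 = zlib.compress(b"stream two")
produced, consumed = inflate_first_stream(stream1 + stream2)
```

If the bytes following the first stream do not begin with a valid zlib 
header, the next call raises zlib.error instead of returning data, which 
matches the error case described above.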

>>
>> Thanks
>> Shally
>>