[dpdk-dev] PKT_RX_VLAN_PKT when VLAN stripping is disabled

2016-04-26 Thread John Daley (johndale)
Hi Olivier and Ananyev,

I like the new packet types and how they work the same for VLAN and QINQ. Just 
so I understand your suggestion, X710 (as it seems to work today) would not set 
RTE_PTYPE_L2_ETHER_VLAN  in dev_supported_ptypes_get() because it does not know 
how to determine that packet type in the receive path if stripping is disabled? 
But if stripping was enabled, the application could still trust m->vlan_tci if 
the flag was set?
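
A minimal sketch of how an application might consume the semantics being discussed (flag means "stripped, TCI is in the mbuf"; packet type means "tag is still in the packet"). The struct, the DEMO_* constants and the function name are hypothetical stand-ins for illustration only; they are not the rte_mbuf definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the rte_mbuf definitions. */
#define DEMO_RX_VLAN_STRIPPED    (1ULL << 0)
#define DEMO_PTYPE_L2_ETHER      1u
#define DEMO_PTYPE_L2_ETHER_VLAN 2u

struct demo_pkt {
	uint64_t flags;      /* offload flags set by the driver */
	uint32_t ptype;      /* packet type reported by the driver */
	uint16_t vlan_tci;   /* valid only when the stripped flag is set */
	const uint8_t *data; /* start of the Ethernet frame */
	size_t len;          /* frame length in bytes */
};

/* Store the TCI in *tci and return 1 if the packet has a VLAN tag, else 0. */
static int
demo_get_vlan_tci(const struct demo_pkt *p, uint16_t *tci)
{
	if (p->flags & DEMO_RX_VLAN_STRIPPED) {
		/* Hardware removed the tag and saved it in the metadata. */
		*tci = p->vlan_tci;
		return 1;
	}
	if (p->ptype == DEMO_PTYPE_L2_ETHER_VLAN && p->len >= 16) {
		/* Tag is still in the frame: TCI follows the TPID at offset 12. */
		*tci = (uint16_t)((p->data[14] << 8) | p->data[15]);
		return 1;
	}
	return 0;
}
```

With this split the application does not need to know the port's stripping configuration to interpret either field.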

Re changing the meaning of the PKT_RX_VLAN_PKT flag: I think it could cause 
hard-to-find errors and confusion. I would rather see the flag deprecated and a 
new one defined. Perhaps the flag could be called PKT_RX_VLAN_STRIPPED*.

Maybe another less elegant but more compatible solution would be to just keep 
the Niantic behavior and fix the other PMDs to match it. For X710, with VLAN 
stripping disabled, this might mean looking at each packet's Ethernet type and 
setting the flag accordingly. It might not be too expensive since the Ethernet 
type is in the first cacheline and hopefully prefetched. Thoughts?
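
To make the per-packet cost concrete, the check could look roughly like the sketch below. The function and macro names are made up for illustration and are not taken from any PMD; the only facts assumed are the standard Ethernet layout (EtherType at byte offset 12) and the 802.1Q TPID value 0x8100.

```c
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_OFFSET 12      /* after 6-byte dst + 6-byte src MAC */
#define ETHERTYPE_VLAN   0x8100  /* IEEE 802.1Q tag protocol identifier */

/* Return 1 if the frame carries an 802.1Q tag right after the MAC addresses. */
static int
frame_has_vlan_tag(const uint8_t *frame, size_t len)
{
	uint16_t ether_type;

	if (len < ETHERTYPE_OFFSET + 2)
		return 0;
	/* EtherType is big-endian on the wire. */
	ether_type = (uint16_t)((frame[ETHERTYPE_OFFSET] << 8) |
				frame[ETHERTYPE_OFFSET + 1]);
	return ether_type == ETHERTYPE_VLAN;
}
```

This is one load and one compare on data in the first cacheline of the packet, which is why the cost may be acceptable in the receive path.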

*In the future perhaps another flag could be added called 
PKT_RX_VLAN_TCI_VALID. This may not be the same as PKT_RX_VLAN_STRIPPED: enic 
and maybe some other NICs are able to set vlan_tci even when not stripping VLAN 
tags, and this feature could be exposed with this separate flag.

-john

> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> Sent: Monday, April 25, 2016 6:51 AM
> To: Ananyev, Konstantin ; John Daley
> (johndale) ; dev at dpdk.org
> Subject: Re: [dpdk-dev] PKT_RX_VLAN_PKT when VLAN stripping is disabled
> 
> Hi,
> 
> On 04/25/2016 02:02 PM, Ananyev, Konstantin wrote:
> > Hi John,
> > From rte_mbuf.h:
> > #define PKT_RX_VLAN_PKT  (1ULL << 0)  /**< RX packet is a 802.1q VLAN
> packet. */
> > So yes, in theory it should be set up for vlan packet with both stripping
> on/off.
> > The problem is that (as far as I know) when VLAN stripping is disabled
> > FVL RXD doesn't contain information does that packet contain a VLAN or
> not.
> > Don't really know what is the best option in that case: keep things as
> > it is or change the meaning of the VLAN_PKT flag to indicate is
> mbuf.vlan_tci field is valid or not.
> > Konstantin
> 
> It seems the meaning of the PKT_RX_VLAN_PKT bit depends on the port
> configuration:
> - if vlan stripping is configured, it means VLAN is present in vlan_tci
>   mbuf field
> - if not configured, it means a VLAN is present in the packet
> 
> I don't think this is a good behavior since the application has to know the 
> port
> configuration to properly interpret the meaning of the flag.
> 
> I suggest to change the meaning of this flag to: "vlan was stripped by
> hardware, and vlan tag is now located in m->vlan_tci".
> 
> The same could apply to PKT_RX_QINQ_PKT and m->outer_vlan_tci.
> 
> We could add a new packet_type to tell if the mbuf is a VLAN/QinQ is
> detected in the packet but not stripped.
> 
> Example:
> 
> - vlan stripping is disabled
> 
>   - vlan packet recvd: flags=0, ptype=RTE_PTYPE_L2_ETHER_VLAN
>   - qinq packet recvd: flags=0, ptype=RTE_PTYPE_L2_ETHER_QINQ
> 
> - vlan stripping is enabled
> 
>   - vlan packet recvd: flags=PKT_RX_VLAN_PKT,
> ptype=RTE_PTYPE_L2_ETHER,
> m->vlan_tci=id
>   - qinq packet recvd: flags=PKT_RX_VLAN_PKT|PKT_RX_QINQ_PKT,
> ptype=RTE_PTYPE_L2_ETHER, m->vlan_tci=id, m->vlan_tci_outer=id
> 
> 
> Thoughts?
> 
> 
> 
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of John Daley
> >> (johndale)
> >> Sent: Friday, April 22, 2016 12:37 AM
> >> To: dev at dpdk.org
> >> Subject: [dpdk-dev] PKT_RX_VLAN_PKT when VLAN stripping is disabled
> >>
> >> Hi,
> >>
> >> When VLAN stripping is disabled, X710 and 82599ES act differently for
> >> me in this case when receiving VLAN tagged packets. On 82599ES the flag
> is set but on X710 the flag not set.
> >>
> >> Do I maybe have old X710 firmware? Or is it not set for X710 on
> >> purpose in this case and instead the flag is used to indicate if vlan_tci 
> >> is
> valid? Right now the enic pmd does what my X710 does, which I think is
> incorrect and I want to fix it.
> >>
> >> Thanks,
> >> John
> >


[dpdk-dev] [PATCH] i40evf: add ops for rx queue and tx queue

2016-04-26 Thread Wu, Jingjing


On 4/23/2016 7:29 PM, Xing, Beilei wrote:
> Add 3 vf ops: rx_queue_count, rxq_info_get and
> txq_info_get. They can reuse corresponding pv APIs.
>
> Signed-off-by: Beilei Xing 
> ---
>   drivers/net/i40e/i40e_ethdev_vf.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/i40e/i40e_ethdev_vf.c 
> b/drivers/net/i40e/i40e_ethdev_vf.c
> index 2bce69b..87d6a64 100644
> --- a/drivers/net/i40e/i40e_ethdev_vf.c
> +++ b/drivers/net/i40e/i40e_ethdev_vf.c
> @@ -214,6 +214,9 @@ static const struct eth_dev_ops i40evf_eth_dev_ops = {
>   .rx_descriptor_done   = i40e_dev_rx_descriptor_done,
>   .tx_queue_setup   = i40e_dev_tx_queue_setup,
>   .tx_queue_release = i40e_dev_tx_queue_release,
> + .rx_queue_count   = i40e_dev_rx_queue_count,
> + .rxq_info_get = i40e_rxq_info_get,
> + .txq_info_get = i40e_txq_info_get,
>   .mac_addr_add = i40evf_add_mac_addr,
>   .mac_addr_remove  = i40evf_del_mac_addr,
>   .reta_update  = i40evf_dev_rss_reta_update,

Acked-by: Jingjing Wu <jingjing.wu at intel.com>



[dpdk-dev] [PATCH] i40evf: add ops for rx queue and tx queue

2016-04-26 Thread Wu, Jingjing


On 4/23/2016 7:29 PM, Xing, Beilei wrote:
> Add 3 vf ops: rx_queue_count, rxq_info_get and
> txq_info_get. They can reuse corresponding pv APIs.
a typo here? pv -> pf ?
>
> Signed-off-by: Beilei Xing 
> ---
>   drivers/net/i40e/i40e_ethdev_vf.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/net/i40e/i40e_ethdev_vf.c 
> b/drivers/net/i40e/i40e_ethdev_vf.c
> index 2bce69b..87d6a64 100644
> --- a/drivers/net/i40e/i40e_ethdev_vf.c
> +++ b/drivers/net/i40e/i40e_ethdev_vf.c
> @@ -214,6 +214,9 @@ static const struct eth_dev_ops i40evf_eth_dev_ops = {
>   .rx_descriptor_done   = i40e_dev_rx_descriptor_done,
>   .tx_queue_setup   = i40e_dev_tx_queue_setup,
>   .tx_queue_release = i40e_dev_tx_queue_release,
> + .rx_queue_count   = i40e_dev_rx_queue_count,
> + .rxq_info_get = i40e_rxq_info_get,
> + .txq_info_get = i40e_txq_info_get,
>   .mac_addr_add = i40evf_add_mac_addr,
>   .mac_addr_remove  = i40evf_del_mac_addr,
>   .reta_update  = i40evf_dev_rss_reta_update,



[dpdk-dev] [PATCH] i40e: configure MTU

2016-04-26 Thread Wu, Jingjing


On 4/23/2016 7:26 PM, Xing, Beilei wrote:
> This patch enables configuring MTU for i40e.
> Since changing MTU needs to reconfigure queue, stop port first
> before configuring MTU.
>
> Signed-off-by: Beilei Xing 
> ---
>   drivers/net/i40e/i40e_ethdev.c | 49 
> ++
>   1 file changed, 49 insertions(+)
>
> diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
> index bc28d3c..29259b9 100644
> --- a/drivers/net/i40e/i40e_ethdev.c
> +++ b/drivers/net/i40e/i40e_ethdev.c
> @@ -447,6 +447,8 @@ static int i40e_get_eeprom(struct rte_eth_dev *dev,
>   static void i40e_set_default_mac_addr(struct rte_eth_dev *dev,
> struct ether_addr *mac_addr);
>   
> +static int i40e_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
> +
>   static const struct rte_pci_id pci_id_i40e_map[] = {
>   #define RTE_PCI_DEV_ID_DECL_I40E(vend, dev) {RTE_PCI_DEVICE(vend, dev)},
>   #include "rte_pci_dev_ids.h"
> @@ -520,6 +522,7 @@ static const struct eth_dev_ops i40e_eth_dev_ops = {
>   .get_eeprom_length= i40e_get_eeprom_length,
>   .get_eeprom   = i40e_get_eeprom,
>   .mac_addr_set = i40e_set_default_mac_addr,
> + .mtu_set  = i40e_dev_mtu_set,
>   };
>   
>   /* store statistics names and its offset in stats structure */
> @@ -9104,3 +9107,49 @@ static void i40e_set_default_mac_addr(struct 
> rte_eth_dev *dev,
>   /* Flags: 0x3 updates port address */
>   i40e_aq_mac_address_write(hw, 0x3, mac_addr->addr_bytes, NULL);
>   }
> +
> +static int i40e_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
According to the coding style, at function definition, "The function 
type should be on a line by itself preceding the function."
> +{
> + struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
> + struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> + struct rte_eth_dev_data *dev_data = pf->dev_data;
> + struct rte_eth_dev_info dev_info;
> + struct i40e_rx_queue *rxq;
> + int i;
> + uint16_t len;
> + uint32_t frame_size = mtu + ETHER_HDR_LEN + ETHER_CRC_LEN;
> + int ret = 0;
> +
> + i40e_dev_info_get(dev, &dev_info);
> +
> + /* check if mtu is within the allowed range */
> + if ((mtu < ETHER_MIN_MTU) || (frame_size > dev_info.max_rx_pktlen))
> + return -EINVAL;
> +
The dev_info.max_rx_pktlen queried by i40e_dev_info_get is 
"I40E_FRAME_SIZE_MAX".
No need to call the API to get dev info in driver.



[dpdk-dev] [PATCH] vhost: Fix linkage of vhost PMD

2016-04-26 Thread Tetsuya Mukawa
Currently, vhost PMD doesn't have linkage for librte_vhost, even though
it depends on librte_vhost APIs. This causes a linkage error if all of the
conditions below are met.

 - DPDK libraries are compiled as shared libraries.
 - DPDK application doesn't link librte_vhost.
 - Above application tries to link vhost PMD using '-d' DPDK option.

The patch adds linkage for librte_vhost to the vhost PMD to avoid the
above error.

Acked-by: Panu Matilainen 
Signed-off-by: Tetsuya Mukawa 
---
 drivers/net/vhost/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/vhost/Makefile b/drivers/net/vhost/Makefile
index f49a69b..30b91a0 100644
--- a/drivers/net/vhost/Makefile
+++ b/drivers/net/vhost/Makefile
@@ -38,6 +38,7 @@ LIB = librte_pmd_vhost.a

 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS)
+LDLIBS += -lrte_vhost

 EXPORT_MAP := rte_pmd_vhost_version.map

-- 
2.5.0



[dpdk-dev] [PATCH] virtio: fix modify drv_flags for specific device

2016-04-26 Thread Jianfeng Tan
Issue: virtio's drv_flags are decided by the device type (modern vs legacy),
by which kernel driver is used, and by the features negotiated with the
backend (especially VIRTIO_NET_F_STATUS), so multiple virtio devices can
need different drv_flags values. However, this variable is currently
shared by all virtio devices.

How to fix: use dev_flags, a device-specific variable, to store this info.
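
The shared-versus-per-device distinction can be sketched as follows. This is a simplified illustration, not the virtio PMD code: the DEMO_* flags, the two structs and demo_device_init() are hypothetical, and the sketch starts each device from the driver's flags purely to show that clearing a bit on one device no longer affects the others.

```c
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
#define DEMO_DRV_DETACHABLE 0x1u
#define DEMO_DRV_INTR_LSC   0x2u

/* One instance per driver: shared by every device bound to it. */
struct demo_driver {
	uint32_t drv_flags;
};

/* One instance per device: carries its own copy of the flags. */
struct demo_device {
	const struct demo_driver *driver;
	uint32_t dev_flags;
};

static void
demo_device_init(struct demo_device *dev, const struct demo_driver *drv,
		 int has_status_feature)
{
	dev->driver = drv;
	/* Start from the driver defaults, then adjust the private copy. */
	dev->dev_flags = drv->drv_flags;
	if (!has_status_feature)
		dev->dev_flags &= ~DEMO_DRV_INTR_LSC;
}
```

Because the driver structure is only read, a device without the status feature cannot disable link-state interrupts for a sibling device that does support it.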

Fixes: da978dfdc43 ("virtio: use port IO to get PCI resource")

Reported-by: David Marchand 
Suggested-by: David Marchand 
Signed-off-by: Jianfeng Tan 
---
 drivers/net/virtio/virtio_ethdev.c | 27 ---
 drivers/net/virtio/virtio_pci.c| 13 +++--
 drivers/net/virtio/virtio_pci.h|  3 ++-
 3 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 63a368a..b144a58 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -59,6 +59,7 @@
 #include "virtqueue.h"
 #include "virtio_rxtx.h"

+#define VIRTIO_DRV_FLAGS   RTE_PCI_DRV_DETACHABLE

 static int eth_virtio_dev_init(struct rte_eth_dev *eth_dev);
 static int eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev);
@@ -491,7 +492,6 @@ static void
 virtio_dev_close(struct rte_eth_dev *dev)
 {
struct virtio_hw *hw = dev->data->dev_private;
-   struct rte_pci_device *pci_dev = dev->pci_dev;

PMD_INIT_LOG(DEBUG, "virtio_dev_close");

@@ -499,7 +499,7 @@ virtio_dev_close(struct rte_eth_dev *dev)
virtio_dev_stop(dev);

/* reset the NIC */
-   if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
+   if (dev->data->dev_flags & RTE_PCI_DRV_INTR_LSC)
vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR);
vtpci_reset(hw);
virtio_dev_free_mbufs(dev);
@@ -1034,6 +1034,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
struct virtio_net_config *config;
struct virtio_net_config local_config;
struct rte_pci_device *pci_dev;
+   uint32_t dev_flags = VIRTIO_DRV_FLAGS;
int ret;

RTE_BUILD_BUG_ON(RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr));
@@ -1057,7 +1058,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)

pci_dev = eth_dev->pci_dev;

-   ret = vtpci_init(pci_dev, hw);
+   ret = vtpci_init(pci_dev, hw, &dev_flags);
if (ret)
return ret;

@@ -1074,9 +1075,15 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)

/* If host does not support status then disable LSC */
if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS))
-   pci_dev->driver->drv_flags &= ~RTE_PCI_DRV_INTR_LSC;
+   dev_flags &= ~RTE_PCI_DRV_INTR_LSC;

rte_eth_copy_pci_info(eth_dev, pci_dev);
+   /* For virtio devices, dev_flags are decided according to feature
+* negotiation, aka if VIRTIO_NET_F_STATUS is set, and which kernel
+* driver is used, dynamically. And we should keep drv_flags shared
+* and unvaried.
+*/
+   eth_dev->data->dev_flags = dev_flags;

rx_func_get(eth_dev);

@@ -1155,7 +1162,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev)
pci_dev->id.device_id);

/* Setup interrupt callback  */
-   if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
+   if (eth_dev->data->dev_flags & RTE_PCI_DRV_INTR_LSC)
rte_intr_callback_register(&pci_dev->intr_handle,
   virtio_interrupt_handler, eth_dev);

@@ -1190,7 +1197,7 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
eth_dev->data->mac_addrs = NULL;

/* reset interrupt callback  */
-   if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
+   if (eth_dev->data->dev_flags & RTE_PCI_DRV_INTR_LSC)
rte_intr_callback_unregister(&pci_dev->intr_handle,
virtio_interrupt_handler,
eth_dev);
@@ -1205,7 +1212,7 @@ static struct eth_driver rte_virtio_pmd = {
.pci_drv = {
.name = "rte_virtio_pmd",
.id_table = pci_id_virtio_map,
-   .drv_flags = RTE_PCI_DRV_DETACHABLE,
+   .drv_flags = VIRTIO_DRV_FLAGS,
},
.eth_dev_init = eth_virtio_dev_init,
.eth_dev_uninit = eth_virtio_dev_uninit,
@@ -1240,7 +1247,6 @@ virtio_dev_configure(struct rte_eth_dev *dev)
 {
const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode;
struct virtio_hw *hw = dev->data->dev_private;
-   struct rte_pci_device *pci_dev = dev->pci_dev;

PMD_INIT_LOG(DEBUG, "configure");

@@ -1258,7 +1264,7 @@ virtio_dev_configure(struct rte_eth_dev *dev)
return -ENOTSUP;
}

-   if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
+   if (dev->data->dev_flags & RTE_PCI_DRV_INTR_LSC)
if (vtpci_irq_config(hw, 0) == V

[dpdk-dev] [PATCH v2] virtio: fix segfault when transmit pkts

2016-04-26 Thread Tan, Jianfeng
Hi Yuanhan,

> -----Original Message-----
> From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> Sent: Tuesday, April 26, 2016 11:43 AM
> To: Tan, Jianfeng
> Cc: dev at dpdk.org; Xie, Huawei
> Subject: Re: [PATCH v2] virtio: fix segfault when transmit pkts
> 
> On Mon, Apr 25, 2016 at 02:37:45AM +, Jianfeng Tan wrote:
> > Issue: when using virtio nic to transmit pkts, it causes segment fault.
> >
> > How to reproduce:
> > Basically, we need to construct a case with vm send packets to vhost-user,
> > and this issue does not happen when transmitting packets using indirect
> > desc. Besides, make sure all descriptors are exhausted before vhost
> > dequeues any packets.
> >
> > a. start testpmd with vhost.
> >   $ testpmd -c 0x3 -n 4 --socket-mem 1024,0 --no-pci \
> > --vdev 'eth_vhost0,iface=/tmp/sock0,queues=1' -- -i --nb-cores=1
> >
> > b. start a qemu with a virtio nic connected with the vhost-user port, just
> > make sure mrg_rxbuf is enabled.
> >
> > c. enable testpmd on the host.
> >   testpmd> set fwd io
> >   testpmd> start (better without start vhost-user)
> >
> > d. start testpmd in VM.
> >   $testpmd -c 0x3 -n 4 -m 1024 -- -i --disable-hw-vlan-filter 
> > --txqflags=0xf01
> >   testpmd> set fwd txonly
> >   testpmd> start
> >
> > How to fix: this bug is because inside virtqueue_enqueue_xmit(), the flag
> of
>   ^^^
> > desc has been updated inside the do {} while (), not necessary to update
> after
> > the loop.
> 
> That's not a right "because": you were stating a fact of the right way
> to do setup desc flags, but not the cause of this bug.
> 
> > (And if we do that after the loop, if all descs could have run out,
> > idx is VQ_RING_DESC_CHAIN_END (32768), use this idx to reference the
> start_dp
> > array will lead to segment fault.)
> 
> And that's the cause. So, you should state the cause first, then the fix
> (which we already have), but not in the verse order you just did.
> 
> So, I'd like to reword the commit log a bit, to something like following.
> What do you think of it? If no objection, I could merge it soon. Thanks
> for the fix, BTW!
> 

Your refinement sounds much better, thanks!

Jianfeng


[dpdk-dev] [PATCH] virtio: fix segfault when transmit pkts

2016-04-26 Thread Tan, Jianfeng
Hi Stephen,

On 4/26/2016 12:48 PM, Stephen Hemminger wrote:
> On Thu, 21 Apr 2016 12:36:10 +
> Jianfeng Tan  wrote:
>
>> Issue: when using virtio nic to transmit pkts, it causes segment fault.
>>
>> How to reproduce:
>> a. start testpmd with vhost.
>> $testpmd -c 0x3 -n 4 --socket-mem 1024,0 --no-pci \
>>--vdev 'eth_vhost0,iface=/tmp/sock0,queues=1' -- -i --nb-cores=1
>> b. start a qemu with a virtio nic connected with the vhost-user port.
>> $qemu -smp cores=2,sockets=1 -cpu host -enable-kvm vm-0.img -vnc :5 -m 4G \
>>-object memory-backend-file,id=mem,size=4096M,mem-path=,share=on \
>>-numa node,memdev=mem -mem-prealloc \
>>-chardev socket,id=char1,path=$sock_vhost \
>>-netdev type=vhost-user,id=net1,chardev=char1 \
>>-device virtio-net-pci,netdev=net1,mac=00:01:02:03:04:05
>> c. enable testpmd on the host.
>> testpmd> set fwd io
>> testpmd> start
>> d. start testpmd in VM.
>> $testpmd -c 0x3 -n 4 -m 1024 -- -i --disable-hw-vlan-filter --txqflags=0xf01
>> testpmd> set fwd txonly
>> testpmd> start
>>
>> How to fix: this bug is because inside virtqueue_enqueue_xmit(), the flag of
>> desc has been updated inside the do {} while (); and after the loop, all 
>> descs
>> could have run out, so idx is VQ_RING_DESC_CHAIN_END (32768), use this idx to
>> reference the start_dp array will lead to segment fault.
>>
>> Signed-off-by: Jianfeng Tan 
>> ---
>>   drivers/net/virtio/virtio_rxtx.c | 2 --
>>   1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/net/virtio/virtio_rxtx.c 
>> b/drivers/net/virtio/virtio_rxtx.c
>> index ef21d8e..432aeab 100644
>> --- a/drivers/net/virtio/virtio_rxtx.c
>> +++ b/drivers/net/virtio/virtio_rxtx.c
>> @@ -271,8 +271,6 @@ virtqueue_enqueue_xmit(struct virtqueue *txvq, struct 
>> rte_mbuf *cookie,
>>  idx = start_dp[idx].next;
>>  } while ((cookie = cookie->next) != NULL);
>>   
>> -start_dp[idx].flags &= ~VRING_DESC_F_NEXT;
>> -
>>  if (use_indirect)
>>  idx = txvq->vq_ring.desc[head_idx].next;
>>   
> At this point in the code idx is the index past the current set of ring
> descriptors. So yes this is a real bug.
>
> I think the description meta-data needs work to explain it better.
>
>
Yes, please see v2. Yuanhan has already helped refine it.

Thanks,
Jianfeng
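
The failure mode described in this thread can be reproduced with a small stand-alone sketch. The names below are hypothetical and the ring is reduced to the two fields that matter; only the end-of-chain value 32768 is taken from the commit log. The point shown is that the NEXT flag is already settled inside the loop, and that after the loop idx refers to the entry past the chain, which is the end marker when every descriptor has been used.

```c
#include <stdint.h>

#define DEMO_RING_SIZE   4
#define DEMO_CHAIN_END   32768  /* end-of-chain value named in the commit log */
#define DEMO_DESC_F_NEXT 1

struct demo_desc {
	uint16_t flags;
	uint16_t next;
};

/* Fill nsegs chained descriptors starting at head; return idx after the loop. */
static uint16_t
demo_enqueue(struct demo_desc *ring, uint16_t head, int nsegs)
{
	uint16_t idx = head;
	int i;

	for (i = 0; i < nsegs; i++) {
		/* The flag is decided here: the last segment gets no NEXT bit. */
		ring[idx].flags = (i + 1 < nsegs) ? DEMO_DESC_F_NEXT : 0;
		idx = ring[idx].next;
	}
	/*
	 * idx now points past the chain. If the free list is exhausted it is
	 * DEMO_CHAIN_END, so ring[idx] must not be touched here.
	 */
	return idx;
}
```

Indexing the ring with the returned value when all descriptors are consumed would read far outside the array, which matches the segfault in the report.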



[dpdk-dev] [RFC] Link ibrte_vhost to librte_pmd_vhost

2016-04-26 Thread Tetsuya Mukawa
On 2016/04/26 12:47, Yuanhan Liu wrote:
> On Mon, Apr 25, 2016 at 12:28:37PM +0300, Panu Matilainen wrote:
>  >
>>> Another way is applying a below patch.
>>> --- a/drivers/net/vhost/Makefile
>>> +++ b/drivers/net/vhost/Makefile
>>> @@ -38,6 +38,7 @@ LIB = librte_pmd_vhost.a
>>>
>>>  CFLAGS += -O3
>>>  CFLAGS += $(WERROR_FLAGS)
>>> +LDLIBS += -lrte_vhost
>>>
>>>  EXPORT_MAP := rte_pmd_vhost_version.map
>>>
>>> This is same way to link libpcap to librte_pmd_pcap.
>>> What do you think about adding it to vhost PMD?
>> Yes, this is absolutely the right thing to do.
>>
>> Ultimately this should be done for all dependencies in all libraries, but
>> missing dependencies are even more pronounced in plugins so the sooner this
>> goes in, the better.
>>
>> Acked-by: Panu Matilainen 
> Panu, thanks for the input.
>
> Tetsuya, please submit a formal patch so that I can merge.
>
>   --yliu

Hi Yuanhan,

Oh sorry, I forgot to add "--in-reply-to" while sending the patch, so
you may have missed it.
Also, the order of Acked-by and Signed-off-by was wrong in the above patch,
so I will send v2 soon.

Thanks,
Tetsuya


[dpdk-dev] [PATCH v2] vhost: Fix linkage of vhost PMD

2016-04-26 Thread Tetsuya Mukawa
Currently, vhost PMD doesn't have linkage for librte_vhost, even though
it depends on librte_vhost APIs. This causes a linkage error if all of the
conditions below are met.

 - DPDK libraries are compiled as shared libraries.
 - DPDK application doesn't link librte_vhost.
 - Above application tries to link vhost PMD using '-d' DPDK option.

The patch adds linkage for librte_vhost to the vhost PMD to avoid the
above error.

Signed-off-by: Tetsuya Mukawa 
Acked-by: Panu Matilainen 
---
 drivers/net/vhost/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/vhost/Makefile b/drivers/net/vhost/Makefile
index f49a69b..30b91a0 100644
--- a/drivers/net/vhost/Makefile
+++ b/drivers/net/vhost/Makefile
@@ -38,6 +38,7 @@ LIB = librte_pmd_vhost.a

 CFLAGS += -O3
 CFLAGS += $(WERROR_FLAGS)
+LDLIBS += -lrte_vhost

 EXPORT_MAP := rte_pmd_vhost_version.map

-- 
2.5.0



[dpdk-dev] [RFC] Link ibrte_vhost to librte_pmd_vhost

2016-04-26 Thread Tetsuya Mukawa
On 2016/04/26 14:48, Yuanhan Liu wrote:
> On Tue, Apr 26, 2016 at 02:37:37PM +0900, Tetsuya Mukawa wrote:
>> On 2016/04/26 12:47, Yuanhan Liu wrote:
>>> On Mon, Apr 25, 2016 at 12:28:37PM +0300, Panu Matilainen wrote:
>>>  >
>>>>> Another way is applying a below patch.
>>>>> --- a/drivers/net/vhost/Makefile
>>>>> +++ b/drivers/net/vhost/Makefile
>>>>> @@ -38,6 +38,7 @@ LIB = librte_pmd_vhost.a
>>>>>
>>>>>  CFLAGS += -O3
>>>>>  CFLAGS += $(WERROR_FLAGS)
>>>>> +LDLIBS += -lrte_vhost
>>>>>
>>>>>  EXPORT_MAP := rte_pmd_vhost_version.map
>>>>>
>>>>> This is same way to link libpcap to librte_pmd_pcap.
>>>>> What do you think about adding it to vhost PMD?
>>>> Yes, this is absolutely the right thing to do.
>>>>
>>>> Ultimately this should be done for all dependencies in all libraries, but
>>>> missing dependencies are even more pronounced in plugins so the sooner this
>>>> goes in, the better.
>>>>
>>>> Acked-by: Panu Matilainen 
>>> Panu, thanks for the input.
>>>
>>> Tetsuya, please submit a formal patch so that I can merge.
>>>
>>> --yliu
>> Hi Yuanhan,
>>
>> Oh sorry, I forgot to add "--in-reply-to" while sending the patch, so
>> you may miss it.
>> Also the order of Acked-by and Signed-off-by was wrong in above patch.
>> So I will send v2 soon.
> No worry. I could fix them, and I don't think there is a very strict
> rule for both of them :) BTW, FYI, I just found my linux.intel.com
> seems to be broken (no idea why), thus I made late response.
>
> (well, I found you just send a v2; I will apply that tomorrow).
>
>   --yliu

Thank you so much!

Tetsuya


[dpdk-dev] [PATCH] kni: add chained mbufs support

2016-04-26 Thread Zhang, Helin
Have you tested it?
I think we need to test it for a longer time, e.g. 1 hour.
My comments are inline.

Thanks,
Helin

> -----Original Message-----
> From: Yigit, Ferruh
> Sent: Tuesday, April 26, 2016 12:11 AM
> To: dev at dpdk.org
> Cc: Zhang, Helin; Yigit, Ferruh
> Subject: [PATCH] kni: add chained mbufs support
> 
> rx_q fifo may have chained mbufs, merge them into single skb before
> handing to the network stack.
> 
> Signed-off-by: Ferruh Yigit 
> ---
>  .../linuxapp/eal/include/exec-env/rte_kni_common.h |  4 +-
>  lib/librte_eal/linuxapp/kni/kni_net.c  | 83 
> --
>  2 files changed, 64 insertions(+), 23 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> index 7e5e598..2acdfd9 100644
> --- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> @@ -113,7 +113,9 @@ struct rte_kni_mbuf {
>   void *buf_addr __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
>   char pad0[10];
>   uint16_t data_off;  /**< Start address of data in segment buffer.
> */
> - char pad1[4];
> + char pad1[2];
> + uint8_t nb_segs;/**< Number of segments. */
> + char pad4[1];
>   uint64_t ol_flags;  /**< Offload features. */
>   char pad2[4];
>   uint32_t pkt_len;   /**< Total pkt len: sum of all segment data_len.
> */
> diff --git a/lib/librte_eal/linuxapp/kni/kni_net.c
> b/lib/librte_eal/linuxapp/kni/kni_net.c
> index cfa8339..570de71 100644
> --- a/lib/librte_eal/linuxapp/kni/kni_net.c
> +++ b/lib/librte_eal/linuxapp/kni/kni_net.c
> @@ -156,7 +156,8 @@ kni_net_rx_normal(struct kni_dev *kni)
>   /* Transfer received packets to netif */
>   for (i = 0; i < num_rx; i++) {
>   kva = (void *)va[i] - kni->mbuf_va + kni->mbuf_kva;
> - len = kva->data_len;
> + len = kva->pkt_len;
> +
>   data_kva = kva->buf_addr + kva->data_off - kni->mbuf_va
>   + kni->mbuf_kva;
> 
> @@ -165,22 +166,41 @@ kni_net_rx_normal(struct kni_dev *kni)
>   KNI_ERR("Out of mem, dropping pkts\n");
>   /* Update statistics */
>   kni->stats.rx_dropped++;
> + continue;
>   }
> - else {
> - /* Align IP on 16B boundary */
> - skb_reserve(skb, 2);
> +
> + /* Align IP on 16B boundary */
> + skb_reserve(skb, 2);
> +
> + if (kva->nb_segs == 0) {
I guess it should compare nb_segs with 1, but not 0. Am I wrong?

>   memcpy(skb_put(skb, len), data_kva, len);
> - skb->dev = dev;
> - skb->protocol = eth_type_trans(skb, dev);
> - skb->ip_summed = CHECKSUM_UNNECESSARY;
> + } else {
> + int nb_segs;
> + int kva_nb_segs = kva->nb_segs;
> 
> - /* Call netif interface */
> - netif_rx_ni(skb);
> + for (nb_segs = 0; nb_segs < kva_nb_segs; nb_segs++)
kva_nb_segs might not be needed at all; use kva->nb_segs directly?

> {
> + memcpy(skb_put(skb, kva->data_len),
> + data_kva, kva->data_len);
> 
> - /* Update statistics */
> - kni->stats.rx_bytes += len;
> - kni->stats.rx_packets++;
> + if (!kva->next)
> + break;
> +
> + kva = kva->next - kni->mbuf_va + kni-
> >mbuf_kva;
> + data_kva = kva->buf_addr + kva->data_off
> + - kni->mbuf_va + kni->mbuf_kva;
> + }
>   }
> +
> + skb->dev = dev;
> + skb->protocol = eth_type_trans(skb, dev);
> + skb->ip_summed = CHECKSUM_UNNECESSARY;
> +
> + /* Call netif interface */
> + netif_rx_ni(skb);
> +
> + /* Update statistics */
> + kni->stats.rx_bytes += len;
> + kni->stats.rx_packets++;
>   }
> 
>   /* Burst enqueue mbufs into free_q */
> @@ -317,7 +337,7 @@ kni_net_rx_lo_fifo_skb(struct kni_dev *kni)
>   /* Copy mbufs to sk buffer and then call tx interface */
>   for (i = 0; i < num; i++) {
>   kva = (void *)va[i] - kni->mbuf_va + kni->mbuf_kva;
> - len = kva->data_len;
> + len = kva->pkt_len;
>   data_kva = kva->buf_addr + kva->data_off - kni->mbuf_va +
>   kni->mbuf_kva;
> 
> @@ -338,20 +358,39 @@ kni_net_rx_lo_fifo_skb(struct kni_dev *kni)
>   if (skb == NULL) {
>   KNI_ERR("Out of mem, dropping pkts\n");

[dpdk-dev] [RFC] eal: provide option to set vhost_user socket owner/permissions

2016-04-26 Thread Christian Ehrhardt
Thanks,
great that you added more on CC for a wider discussion - I think that is
the only right way to go.

Just to "defend" a bit - solution a) was created under the special
circumstance that I wanted a workaround that would work today.
But that is/was special to what I package with DPDK 2.2 + OVS 2.5 as of
today - and therefore was the right place for a fast interim fix for me.
I totally agree that the A in EAL was meant for abstraction and we might
want to avoid vhost-specific things in there in the long run.

I like your suggestion of a new API as a proper long term solution, but I
don't feel deeply enough involved yet on the API level to give it any
judgement.
So I look forward to more opinions on it.
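
For discussion purposes, the operation such an API would ultimately perform on the socket file can be sketched with plain POSIX calls. The helper name and signature are hypothetical and are not an existing or proposed DPDK interface; only chmod() and chown() are real, and the sketch assumes the caller already knows the socket path.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: apply mode and ownership to an existing socket path.
 * Returns 0 on success, -1 on failure (errno is set by chmod/chown).
 */
static int
demo_set_socket_perms(const char *path, mode_t mode, uid_t uid, gid_t gid)
{
	if (chmod(path, mode) != 0)
		return -1;
	/* Passing (uid_t)-1 or (gid_t)-1 leaves that ID unchanged. */
	if (chown(path, uid, gid) != 0)
		return -1;
	return 0;
}
```

Since it acts on a path rather than at registration time, a call like this could be repeated whenever the application wants to change the permissions, which is the flexibility asked for in the quoted reply below.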

P.S. the patch bot hit me hard with 2 pages of space/bracket issues, sorry
for that - but it was only meant as RFC after all :-)


Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd

On Tue, Apr 26, 2016 at 6:16 AM, Yuanhan Liu 
wrote:

> On Mon, Apr 25, 2016 at 11:18:16AM +0200, Christian Ehrhardt wrote:
> > The API doesn't hold a way to specify a owner/permission set for
> vhost_user
> > created sockets.
>
> Yes, it's kind of a known issue. So, thanks for bringing it up, with
> a solution, for discussion (cc'ed more people).
>
> > I don't even think an API change would make that much sense.
> >
> > Projects consuming DPDK start to do 'their own workarounds' like
> openvswitch
> > https://patchwork.ozlabs.org/patch/559043/
> > https://patchwork.ozlabs.org/patch/559045/
> > But for this specific example they are blocked/stalled behind a bigger
> > rework (https://patchwork.ozlabs.org/patch/604898/).
> > Also one could ask why each project would need their own workaround.
> >
> > At the same time - as I want it for existing code linking against DPDK I
> > wanted to avoid changing API/ABI. That way I want to provide something
> existing
> > users could utilize. So I created a DPDK EAL commandline option based
> ideas in
> > the former patches.
> >
> > For myself I consider this a nice interim solution for existing released
> > Openvswitch+DPDK solution. And I intend to put it as delta into the DPDK
> 2.2
> > currently packaged in Ubuntu to get it working more smoothly with
> > openvswitch 2.5.
> >
> > But I'd be interested if DPDK in general would be interested in:
> > a) an approach like this?
>
> You were trying to add vhost-specific stuff as an EAL command option,
> which is something we should probably try to avoid.
>
> > b) would prefer a change of the API?
>
> Adding a new option to the current register API might not work well,
> either. It gives you no ability to do a dynamic change later. I mean,
> taking OVS as an example, OVS provides you the flexible ability to do all
> kinds of configuration in a dynamic way, say number of rx queues. If we
> do the permissions setup in the register time, there would be no way to
> change it later, right?
>
> So, I'm thinking that we could add a new API for that? It then would
> allow applications to change it at anytime.
>
> > c) consider it an issue of consuming projects and let them take care?
>
> It's not exactly an issue of consuming projects; we created the socket
> file after all.
>
> And I'd like to hear what others would say.
>
> Thanks.
>
> --yliu
>


[dpdk-dev] [PATCH] eal: out-of-bounds write

2016-04-26 Thread Slawomir Mrozowicz
Fix issue reported by Coverity.

Coverity ID 13282: Out-of-bounds write
overrun-local: Overrunning array mcfg->memseg of 256 44-byte elements
at element index 257 using index j.

Fixes: af75078fece3 ("first public release")

Signed-off-by: Slawomir Mrozowicz 
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 5b9132c..1e737e4 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1333,7 +1333,7 @@ rte_eal_hugepage_init(void)

if (new_memseg) {
j += 1;
-   if (j == RTE_MAX_MEMSEG)
+   if (j >= RTE_MAX_MEMSEG)
break;

mcfg->memseg[j].phys_addr = hugepage[i].physaddr;
-- 
1.9.1
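
The reasoning behind preferring ">=" over "==" can be shown in isolation. This is a reduced sketch with hypothetical names, not the code in rte_eal_hugepage_init(); the array size of 256 is the one quoted in the Coverity report above.

```c
#include <stddef.h>

#define DEMO_MAX_MEMSEG 256  /* array size quoted in the Coverity report */

/*
 * Advance to the next segment slot. Returns the new index, or -1 when no
 * slot is left. Using >= also catches an index that is already past the
 * limit, which an equality test would let through.
 */
static int
demo_memseg_next(int j)
{
	j += 1;
	if (j >= DEMO_MAX_MEMSEG)
		return -1;
	return j;
}
```

The caller writes to the array only when the returned index is not -1.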



[dpdk-dev] [PATCH] kni: add chained mbufs support

2016-04-26 Thread Ferruh Yigit
On 4/26/2016 7:49 AM, Zhang, Helin wrote:
> Have you tested with it?
Yes, it has been tested.

> I think we need to test it in a longer time, e.g. 1 hour
I will run a longevity test before sending the next patch.

> My commetns inlined.
> 
> Thanks,
> Helin
> 
>> -----Original Message-----
>> From: Yigit, Ferruh
>> Sent: Tuesday, April 26, 2016 12:11 AM
>> To: dev at dpdk.org
>> Cc: Zhang, Helin; Yigit, Ferruh
>> Subject: [PATCH] kni: add chained mbufs support
>>
>> rx_q fifo may have chained mbufs, merge them into single skb before
>> handing to the network stack.
>>
>> Signed-off-by: Ferruh Yigit 
>> ---
>>  .../linuxapp/eal/include/exec-env/rte_kni_common.h |  4 +-
>>  lib/librte_eal/linuxapp/kni/kni_net.c  | 83 
>> --
>>  2 files changed, 64 insertions(+), 23 deletions(-)
>>
>> diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
>> b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
>> index 7e5e598..2acdfd9 100644
>> --- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
>> +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
>> @@ -113,7 +113,9 @@ struct rte_kni_mbuf {
>>  void *buf_addr __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
>>  char pad0[10];
>>  uint16_t data_off;  /**< Start address of data in segment buffer.
>> */
>> -char pad1[4];
>> +char pad1[2];
>> +uint8_t nb_segs;/**< Number of segments. */
>> +char pad4[1];
>>  uint64_t ol_flags;  /**< Offload features. */
>>  char pad2[4];
>>  uint32_t pkt_len;   /**< Total pkt len: sum of all segment data_len.
>> */
>> diff --git a/lib/librte_eal/linuxapp/kni/kni_net.c
>> b/lib/librte_eal/linuxapp/kni/kni_net.c
>> index cfa8339..570de71 100644
>> --- a/lib/librte_eal/linuxapp/kni/kni_net.c
>> +++ b/lib/librte_eal/linuxapp/kni/kni_net.c
>> @@ -156,7 +156,8 @@ kni_net_rx_normal(struct kni_dev *kni)
>>  /* Transfer received packets to netif */
>>  for (i = 0; i < num_rx; i++) {
>>  kva = (void *)va[i] - kni->mbuf_va + kni->mbuf_kva;
>> -len = kva->data_len;
>> +len = kva->pkt_len;
>> +
>>  data_kva = kva->buf_addr + kva->data_off - kni->mbuf_va
>>  + kni->mbuf_kva;
>>
>> @@ -165,22 +166,41 @@ kni_net_rx_normal(struct kni_dev *kni)
>>  KNI_ERR("Out of mem, dropping pkts\n");
>>  /* Update statistics */
>>  kni->stats.rx_dropped++;
>> +continue;
>>  }
>> -else {
>> -/* Align IP on 16B boundary */
>> -skb_reserve(skb, 2);
>> +
>> +/* Align IP on 16B boundary */
>> +skb_reserve(skb, 2);
>> +
>> +if (kva->nb_segs == 0) {
> I guess it should compare nb_segs with 1, but not 0. Am I wrong?
> 

Right, this needs to be 1, I will send a new patch.

>>  memcpy(skb_put(skb, len), data_kva, len);
>> -skb->dev = dev;
>> -skb->protocol = eth_type_trans(skb, dev);
>> -skb->ip_summed = CHECKSUM_UNNECESSARY;
>> +} else {
>> +int nb_segs;
>> +int kva_nb_segs = kva->nb_segs;
>>
>> -/* Call netif interface */
>> -netif_rx_ni(skb);
>> +for (nb_segs = 0; nb_segs < kva_nb_segs; nb_segs++)
> Kva_nb_segs might not needed at all, use kva->nb_segs directly?
> 

It is needed: kva is updated inside the loop, so the number of segments
has to be saved from the first mbuf before the loop starts.

>> {
>> +memcpy(skb_put(skb, kva->data_len),
>> +data_kva, kva->data_len);
>>
>> -/* Update statistics */
>> -kni->stats.rx_bytes += len;
>> -kni->stats.rx_packets++;
>> +if (!kva->next)
>> +break;
>> +
>> +kva = kva->next - kni->mbuf_va + kni-
>>> mbuf_kva;
>> +data_kva = kva->buf_addr + kva->data_off
>> +- kni->mbuf_va + kni->mbuf_kva;
>> +}
>>  }
>> +
>> +skb->dev = dev;
>> +skb->protocol = eth_type_trans(skb, dev);
>> +skb->ip_summed = CHECKSUM_UNNECESSARY;
>> +
>> +/* Call netif interface */
>> +netif_rx_ni(skb);
>> +
>> +/* Update statistics */
>> +kni->stats.rx_bytes += len;
>> +kni->stats.rx_packets++;
>>  }
>>
>>  /* Burst enqueue mbufs into free_q */
>> @@ -317,7 +337,7 @@ kni_net_rx_lo_fifo_skb(struct kni_dev *kni)
>>  /* Copy mbufs to sk buffer and then call tx interface */
>>  for (i = 0; i < num; i++) {
>>  kva = (void *)va[i] - kni->mbuf_va + kni->mbuf_kva;
>> -len = kva->data_len;
>> +
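
The point about saving the segment count before walking the chain can be
sketched in plain C. This is a minimal user-space sketch, not the KNI code:
struct example_seg and example_linearize() are hypothetical names, and the
assumption is that only the first segment carries valid nb_segs and pkt_len.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one mbuf segment. */
struct example_seg {
	const uint8_t *data;       /* start of this segment's payload */
	uint16_t data_len;         /* bytes in this segment */
	uint8_t nb_segs;           /* segments in the chain (first segment only) */
	uint32_t pkt_len;          /* total bytes in the chain (first segment only) */
	struct example_seg *next;  /* next segment, or NULL */
};

/*
 * Copy a chain into one flat buffer.
 * Returns the number of bytes copied, or 0 if dst is too small or the
 * chain is longer than pkt_len claims.
 */
static uint32_t
example_linearize(const struct example_seg *seg, uint8_t *dst, uint32_t dst_cap)
{
	/* Save the totals first: seg is advanced below, and only the first
	 * segment is assumed to carry valid nb_segs/pkt_len. */
	const int nb_segs = seg->nb_segs;
	const uint32_t pkt_len = seg->pkt_len;
	uint32_t off = 0;
	int i;

	if (pkt_len > dst_cap)
		return 0;

	for (i = 0; i < nb_segs && seg != NULL; i++) {
		if (off + seg->data_len > pkt_len)
			return 0;
		memcpy(dst + off, seg->data, seg->data_len);
		off += seg->data_len;
		seg = seg->next;
	}

	return off;
}
```

A single-segment packet (nb_segs == 1) goes through the same loop once, so
no separate fast path is needed in this sketch.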

[dpdk-dev] [PATCH 1/4] ixgbe: rearrange vector PMD code for x86

2016-04-26 Thread Jianbo Liu
On 26 April 2016 at 00:35, Bruce Richardson  
wrote:
> On Wed, Apr 20, 2016 at 09:44:59PM +0800, Jianbo Liu wrote:
>> move SSE-dependent code to new file "ixgbe_rxtx_vec_sse.h"
>>
>> Signed-off-by: Jianbo Liu 
>> ---
>>  drivers/net/ixgbe/ixgbe_rxtx_vec.c | 369 +
>>  drivers/net/ixgbe/ixgbe_rxtx_vec_sse.h | 408 
>> +
>>  2 files changed, 409 insertions(+), 368 deletions(-)
>>  create mode 100644 drivers/net/ixgbe/ixgbe_rxtx_vec_sse.h
>>
> Hi Jianbo,
>
> functionally I've given this a quick sanity test and see no issues with
> performance on the x86(_64) side of things.
>
> However, in terms of how the driver split is done in this set of patches, I
> think it might be better to reverse what goes in the header files and in the
> .c files. Rather than having the common code in the .c file and the arch
> specific code in the header file, I think the common code should be in a
> header file and the arch specific code in a .c file.
>
> The reason for this is the need for possibly different compiler flags to be
> passed for the vector drivers from the makefile e.g. as is done by my patchset
> for i40e [http://dpdk.org/dev/patchwork/patch/12082/]. This would be a bit
> more awkward if that one C file is shared by multiple architectures, as we'd
> have architecture specific branches in both makefile and C file. As well as
> that, the possibility exists of multiple vector drivers for one architecture,
> e.g. an SSE and AVX driver for x86_64 with selection of code path at runtime
> as done by the ACL library. In that case, you want multiple vector code paths
> compiled with different CFLAG overrides, which necessitates different C files.
>
> Therefore, I think using a C file per instruction set/architecture, rather
> than a header file per arch may be more expandable in future.
>

Good suggestion. I will submit v2 later.

Thanks!
Jianbo
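
The runtime selection between code paths mentioned above could look roughly
like the following. This is only a sketch under stated assumptions: all
example_* names are hypothetical, the per-instruction-set functions are
stand-ins that would each live in their own C file built with their own
CFLAGS, and the CPU check uses the GCC/clang builtin __builtin_cpu_supports()
rather than anything from the ixgbe driver or the ACL library.

```c
#include <stddef.h>
#include <stdint.h>

/* Common code: in the proposed layout this would sit in a shared header. */
typedef uint32_t (*example_burst_fn)(const uint32_t *v, size_t n);

static inline uint32_t
example_sum_common(const uint32_t *v, size_t n)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += v[i];
	return sum;
}

/* Generic fallback, always available. */
static uint32_t
example_burst_scalar(const uint32_t *v, size_t n)
{
	return example_sum_common(v, n);
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Stand-in for a file such as example_vec_sse.c, built with SSE flags. */
static uint32_t
example_burst_sse(const uint32_t *v, size_t n)
{
	return example_sum_common(v, n);
}

/* Stand-in for a file such as example_vec_avx2.c, built with AVX2 flags. */
static uint32_t
example_burst_avx2(const uint32_t *v, size_t n)
{
	return example_sum_common(v, n);
}
#endif

/* Pick the best available path once, at init time. */
static example_burst_fn
example_select_burst(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
	__builtin_cpu_init();
	if (__builtin_cpu_supports("avx2"))
		return example_burst_avx2;
	if (__builtin_cpu_supports("sse4.2"))
		return example_burst_sse;
#endif
	return example_burst_scalar;
}
```

With one C file per instruction set, the makefile only needs a per-file CFLAGS
override, and the common header stays free of architecture branches.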


[dpdk-dev] [PATCH v2] virtio: fix segfault when transmit pkts

2016-04-26 Thread Thomas Monjalon
Talking about wording,

2016-04-25 20:43, Yuanhan Liu:
> ---
> Subject: virtio: fix segfault on Tx desc flags setup

I think the English word "crash" is better than "segfault".


[dpdk-dev] [RFC] eal: provide option to set vhost_user socket owner/permissions

2016-04-26 Thread Thomas Monjalon
2016-04-25 21:16, Yuanhan Liu:
> On Mon, Apr 25, 2016 at 11:18:16AM +0200, Christian Ehrhardt wrote:
> > The API doesn't provide a way to specify an owner/permission set for
> > vhost_user created sockets.
> 
> Yes, it's kind of a known issue. So, thanks for bringing it up, with
> a solution, for discussion (cc'ed more people).
[...]
> > But I'd be interested if DPDK in general would be interested in:
> > a) an approach like this?
> 
> You were trying to add vhost-specific stuff as an EAL command option,
> which is something we should probably try to avoid.

Yes, -1

> > b) would prefer a change of the API?
> 
> Adding a new option to the current register API might not work well,
> either. It gives you no ability to do a dynamic change later. I mean,
> taking OVS as an example, OVS provides you the flexible ability to do all
> kinds of configuration in a dynamic way, say number of rx queues. If we
> do the permissions setup in the register time, there would be no way to
> change it later, right?
> 
> So, I'm thinking that we could maybe add a new API for that? It would then
> allow applications to change it at any time.

A vhost API in the library?
And for vhost PMD? What about devargs parameters?

> > c) consider it an issue of consuming projects and let them take care?
> 
> It's not exactly an issue of consuming projects; we created the socket
> file after all.

Yes
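
For illustration, a dedicated call that an application could invoke at any
time after the socket file exists might look like this. It is a minimal
sketch only: example_vhost_socket_set_perms() is a hypothetical name, not an
existing librte_vhost function, and it simply wraps POSIX chown()/chmod() on
the socket path.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: set owner, group and mode of an existing
 * vhost-user socket file. Returns 0 on success, negative errno otherwise.
 */
static int
example_vhost_socket_set_perms(const char *path, uid_t uid, gid_t gid,
			       mode_t mode)
{
	if (path == NULL)
		return -EINVAL;

	/* Change ownership first, then tighten or relax the mode bits. */
	if (chown(path, uid, gid) != 0)
		return -errno;
	if (chmod(path, mode) != 0)
		return -errno;

	return 0;
}
```

Because the call is separate from registration, a controller such as OVS could
re-apply it whenever its configuration changes.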


[dpdk-dev] [PATCH] eal: out-of-bounds write

2016-04-26 Thread Bruce Richardson
On Tue, Apr 26, 2016 at 09:44:47AM +0200, Slawomir Mrozowicz wrote:
> Fix issue reported by Coverity.
> 
> Coverity ID 13282: Out-of-bounds write
> overrun-local: Overrunning array mcfg->memseg of 256 44-byte elements
> at element index 257 using index j.
> 
> Fixes: af75078fece3 ("first public release")
> 
> Signed-off-by: Slawomir Mrozowicz 
> ---
>  lib/librte_eal/linuxapp/eal/eal_memory.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
> b/lib/librte_eal/linuxapp/eal/eal_memory.c
> index 5b9132c..1e737e4 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> @@ -1333,7 +1333,7 @@ rte_eal_hugepage_init(void)
>  
>   if (new_memseg) {
>   j += 1;
> - if (j == RTE_MAX_MEMSEG)
> + if (j >= RTE_MAX_MEMSEG)
>   break;
>  
>   mcfg->memseg[j].phys_addr = hugepage[i].physaddr;
> -- 

This does appear to be a valid fix for the issue. However, looking at the code,
it appears that the only way we could actually hit the problem is if
j == RTE_MAX_MEMSEG on exiting the previous loop. Would a check there be a
better fix for this issue (or perhaps we want both fixes).

Thoughts?

/Bruce
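
To make the two candidate checks concrete, here is a minimal stand-alone
sketch. It is not the eal_memory.c code: example_fill_segs(), struct
example_seg and EXAMPLE_MAX_SEG are hypothetical, and the "find the first free
slot, then step back one" idiom is an assumption about how the previous loop
leaves j.

```c
#include <stddef.h>
#include <stdio.h>

#define EXAMPLE_MAX_SEG 4

/* Hypothetical segment descriptor; len == 0 marks a free slot. */
struct example_seg {
	size_t len;
};

/*
 * Merge pages into segments. new_seg[i] != 0 means page i starts a new
 * segment. Returns the index one past the last slot written, or -1 if
 * the table is (or becomes) full.
 */
static int
example_fill_segs(struct example_seg *segs, const size_t *lens,
		  const int *new_seg, int n_pages)
{
	int i, j;

	/* Previous loop: find the first free slot and step back one. */
	for (j = 0; j < EXAMPLE_MAX_SEG; j++) {
		if (segs[j].len == 0) {
			j -= 1;
			break;
		}
	}

	/* Check 1: no free slot at all leaves j == EXAMPLE_MAX_SEG here. */
	if (j == EXAMPLE_MAX_SEG) {
		fprintf(stderr, "all %d segments already in use\n",
			EXAMPLE_MAX_SEG);
		return -1;
	}

	for (i = 0; i < n_pages; i++) {
		if (i == 0 || new_seg[i]) {
			j += 1;
			/* Check 2: '>=' also catches an index that has
			 * already stepped past the limit. */
			if (j >= EXAMPLE_MAX_SEG) {
				fprintf(stderr, "reached max segments (%d)\n",
					EXAMPLE_MAX_SEG);
				return -1;
			}
			segs[j].len = lens[i];
		} else {
			segs[j].len += lens[i];
		}
	}

	return j + 1;
}
```

With check 1 in place, check 2 can only fire when the pages themselves need
more segments than are free, so the two checks cover different failure cases.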


[dpdk-dev] ixgbe : query regarding your code changes for VF mac add

2016-04-26 Thread santosh
Hi

Looks like there is a bug in API "ixgbevf_remove_mac_addr()". To delete
a given MAC it deletes all MACs (including the permanent MAC), and when
it adds them back (a few statements later), it skips the permanent
MAC.

ixgbevf_remove_mac_addr(struct rte_eth_dev *dev, uint32_t index)
{
...

/*
 * The IXGBE_VF_SET_MACVLAN command of the ixgbe-pf driver does
 * not support the deletion of a given MAC address.
 * Instead, it imposes to delete all MAC addresses, then to add again
 * all MAC addresses with the exception of the one to be deleted.
 */


(void) ixgbevf_set_uc_addr_vf(hw, 0, NULL);
...

2.   The following modification to "ixgbevf_remove_mac_addr()" also
helped to make ping work

$ git diff  dpdk/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
--- a/dpdk/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
+++ b/dpdk/lib/librte_pmd_ixgbe/ixgbe_ethdev.c

 @@ -3530,10 +3557,7 @@ ixgbevf_remove_mac_addr(struct
rte_eth_dev *dev, uint32_t index)
/* Skip NULL MAC addresses */
if (is_zero_ether_addr(mac_addr))
continue;
-   /* Skip the permanent MAC address */
-   if (memcmp(perm_addr, mac_addr, sizeof(struct ether_addr)) == 0)
-   continue;

On Mon, Apr 25, 2016 at 7:05 PM, santosh  wrote:
> Hi Ivan
>
>  ixgbevf_set_default_mac_addr()   could not find in our code base.
> put traces at other places as suggested by you.
>  Log at "eth_ixgbevf_dev_init"  never displayed
> Rest logs displayed as shown below.  I re-build the  driver module and
> loaded on our virtual router and rebooted the system.
>
> 1.
> Logs at the time of boot up:
> -
>
> INIT: Initializing NIC port 0 RX queue 0 ...
> INIT: Initializing NIC port 0 TX queue 0 ...
> Santosh ixgbevf_add_mac_addr portid=0 mac=00:50:56:A0:10:C2
> .
> 
>
> Santosh ixgbevf_add_mac_addr portid=0 mac=00:50:56:A0:10:C2
> RUNTIME: Detected port  0 status changed to UP.
>
> 2.
> a.
> Configured  a new MAC at Virtual Router CLI
>
> root at mx86-bgl-2-r1# set interfaces ge-0/0/0 mac 00:50:56:a0:a0:c3
>
> [edit]
> root at mx86-bgl-2-r1# commit
> commit complete
>
> [edit]
> root at mx86-bgl-2-r1# show interfaces
> ge-0/0/0 {
> mac 00:50:56:a0:a0:c3;
> unit 0 {
> family inet {
> address 10.1.1.102/24;
> }
> }
> }
>
> [edit]
> root at mx86-bgl-2-r1# run ping 10.1.1.101
> PING 10.1.1.101 (10.1.1.101): 56 data bytes
> 64 bytes from 10.1.1.101: icmp_seq=0 ttl=64 time=2.142 ms
> 64 bytes from 10.1.1.101: icmp_seq=1 ttl=64 time=8.465 ms
> ^C
> --- 10.1.1.101 ping statistics ---
> 2 packets transmitted, 2 packets received, 0% packet loss
> round-trip min/avg/max/stddev = 2.142/5.303/8.465/3.162 ms
>
> [edit]
> root at mx86-bgl-2-r1#
>
> b.
>Logs at console for above config:
>-
>
> root at localhost:/home/pfe# Santosh ixgbevf_add_mac_addr portid=0
> mac=00:50:56:A0:A0:C3
>
> root at localhost:/home/pfe#
>
>
> 3.
> a. Deleted above config MAC
> root at mx86-bgl-2-r1# delete interfaces ge-0/0/0 mac
>
> [edit]
> root at mx86-bgl-2-r1# commit
> commit complete
>
> [edit]
> root at mx86-bgl-2-r1# show interfaces
> ge-0/0/0 {
> unit 0 {
> family inet {
> address 10.1.1.102/24;
> }
> }
> }
>
> [edit]
> root at mx86-bgl-2-r1# run ping 10.1.1.101
> PING 10.1.1.101 (10.1.1.101): 56 data bytes
> ^C
> --- 10.1.1.101 ping statistics ---
> 3 packets transmitted, 0 packets received, 100% packet loss
>
> [edit]
> root at mx86-bgl-2-r1#
>
> b.
>  Logs at console for above config
>  -
>
>
> root at localhost:/home/pfe# ixgbevf_remove_mac_addr santosh portid=0
> index=1 mac=00:50:56:A0:A0:C3
>
>
>
> Thanks
> Santosh
>
>
> On Fri, Apr 22, 2016 at 12:51 PM, Ivan Boule  wrote:
>> Hi Santosh,
>>
>> My job at 6WIND does not consist in answering to DPDK questions. In general,
>> I have other priorities, including vacations...
>> In the meantime, nobody prevents you to add traces in the code to really
>> understand what happens, as suggested in my last answer.
>>
>> Regards,
>> Ivan
>>
>>
>> On 04/21/2016 07:55 AM, santosh wrote:
>>>
>>> Hi Ivan and team,
>>>
>>> Please respond to my last mail and  let me know if there is any
>>> alternate way to handle this.
>>> Our release is in pending due to this issue.
>>>
>>>
>>> Thanks & Regards
>>> Santosh
>>>
>>> On Wed, Apr 20, 2016 at 2:35 PM, santosh  wrote:

 Hi Ivan,

 Thanks for your response.

 Let me explain you the issue that we are facing on our virtual router
 in VMware environment.

 1. We are using ixgbe driver , SRIOV enabled .
  root at localhost:~# lspci
 "Intel Corporation 82599 Ethernet Controller Virtual
 Function"

 2.  "mx86-bgl-1-r1"  is our router under testing and  R2 is a standard
 router.

mx86-bgl-1-r1 is connected t

[dpdk-dev] ixgbe : query regarding your code changes for VF mac add

2016-04-26 Thread Ivan Boule
Hi Santosh,

Things are a little bit more complex.
When the permanent MAC address of a VF is assigned by the PF driver, it 
cannot be changed or deleted by the VF through commands issued to the 
PF driver. In this case, the function ixgbevf_remove_mac_addr() must not 
add the permanent MAC address again, which is the current behavior.

Conversely, when the permanent MAC address of a VF is assigned by the VF 
itself, it is also deleted by the call to 
ixgbevf_set_uc_addr_vf(hw, 0, NULL) in the function 
ixgbevf_remove_mac_addr() (like the other MAC addresses previously 
added through calls to the function ixgbevf_add_mac_addr()).
In this case, the function ixgbevf_remove_mac_addr() must indeed add 
the permanent MAC address again.

To summarize, the ixgbevf PMD must record in a new dedicated flag whether 
the permanent MAC address of a VF has been assigned by the PF driver, 
and test this flag in the function ixgbevf_remove_mac_addr() to decide 
whether it must add the permanent MAC address of the VF again.
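
As a rough sketch of that decision (not the PMD code): struct example_vf,
the perm_addr_set_by_pf flag and example_slots_to_readd() are hypothetical
names, and the MAC table is reduced to a small array.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MAC_LEN  6
#define EXAMPLE_MAX_MACS 4

/* Hypothetical, simplified per-VF state. */
struct example_vf {
	uint8_t macs[EXAMPLE_MAX_MACS][EXAMPLE_MAC_LEN];
	uint8_t perm_addr[EXAMPLE_MAC_LEN];
	bool perm_addr_set_by_pf; /* the proposed new flag */
};

static bool
example_is_zero_mac(const uint8_t *mac)
{
	static const uint8_t zero[EXAMPLE_MAC_LEN];

	return memcmp(mac, zero, EXAMPLE_MAC_LEN) == 0;
}

/*
 * After "delete all MAC addresses", decide which table slots must be
 * added again when slot 'removed' is being deleted.
 * Returns a bitmask with bit i set if slot i has to be re-added.
 */
static unsigned int
example_slots_to_readd(const struct example_vf *vf, unsigned int removed)
{
	unsigned int i, mask = 0;

	for (i = 0; i < EXAMPLE_MAX_MACS; i++) {
		const uint8_t *mac = vf->macs[i];

		if (i == removed)
			continue;
		/* Skip NULL MAC addresses */
		if (example_is_zero_mac(mac))
			continue;
		/* Skip the permanent MAC only when the PF owns it: in that
		 * case the VF could not have deleted it in the first place. */
		if (vf->perm_addr_set_by_pf &&
		    memcmp(vf->perm_addr, mac, EXAMPLE_MAC_LEN) == 0)
			continue;
		mask |= 1u << i;
	}

	return mask;
}
```

With the flag cleared (VF-assigned permanent MAC), the permanent address is
included in the re-add set; with the flag set it is left alone.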

Regards,
Ivan

On 04/26/2016 07:36 AM, santosh wrote:
> Hi
>
> Looks like there is a bug in API "ixgbevf_remove_mac_addr()". To delete
> a given MAC it deletes all MACs (including the permanent MAC), and when
> it adds them back (a few statements later), it skips the permanent
> MAC.
>
> ixgbevf_remove_mac_addr(struct rte_eth_dev *dev, uint32_t index)
> {
> ...
>
>  /*
>   * The IXGBE_VF_SET_MACVLAN command of the ixgbe-pf driver does
>   * not support the deletion of a given MAC address.
>   * Instead, it imposes to delete all MAC addresses, then to add again
>   * all MAC addresses with the exception of the one to be deleted.
>   */
>
>
>  (void) ixgbevf_set_uc_addr_vf(hw, 0, NULL);
> ...
>
> 2.   The following modification to "ixgbevf_remove_mac_addr()" also
> helped to make ping work
>
> $ git diff  dpdk/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> --- a/dpdk/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> +++ b/dpdk/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
>
>   @@ -3530,10 +3557,7 @@ ixgbevf_remove_mac_addr(struct
> rte_eth_dev *dev, uint32_t index)
>  /* Skip NULL MAC addresses */
>  if (is_zero_ether_addr(mac_addr))
>  continue;
> -   /* Skip the permanent MAC address */
> -   if (memcmp(perm_addr, mac_addr, sizeof(struct ether_addr)) == 
> 0)
> -   continue;
>
> On Mon, Apr 25, 2016 at 7:05 PM, santosh  wrote:
>> Hi Ivan
>>
>>   ixgbevf_set_default_mac_addr()   could not find in our code base.
>> put traces at other places as suggested by you.
>>   Log at "eth_ixgbevf_dev_init"  never displayed
>> Rest logs displayed as shown below.  I re-build the  driver module and
>> loaded on our virtual router and rebooted the system.
>>
>> 1.
>> Logs at the time of boot up:
>> -
>>
>> INIT: Initializing NIC port 0 RX queue 0 ...
>> INIT: Initializing NIC port 0 TX queue 0 ...
>> Santosh ixgbevf_add_mac_addr portid=0 mac=00:50:56:A0:10:C2
>> .
>> 
>>
>> Santosh ixgbevf_add_mac_addr portid=0 mac=00:50:56:A0:10:C2
>> RUNTIME: Detected port  0 status changed to UP.
>>
>> 2.
>> a.
>> Configured  a new MAC at Virtual Router CLI
>>
>> root at mx86-bgl-2-r1# set interfaces ge-0/0/0 mac 00:50:56:a0:a0:c3
>>
>> [edit]
>> root at mx86-bgl-2-r1# commit
>> commit complete
>>
>> [edit]
>> root at mx86-bgl-2-r1# show interfaces
>> ge-0/0/0 {
>>  mac 00:50:56:a0:a0:c3;
>>  unit 0 {
>>  family inet {
>>  address 10.1.1.102/24;
>>  }
>>  }
>> }
>>
>> [edit]
>> root at mx86-bgl-2-r1# run ping 10.1.1.101
>> PING 10.1.1.101 (10.1.1.101): 56 data bytes
>> 64 bytes from 10.1.1.101: icmp_seq=0 ttl=64 time=2.142 ms
>> 64 bytes from 10.1.1.101: icmp_seq=1 ttl=64 time=8.465 ms
>> ^C
>> --- 10.1.1.101 ping statistics ---
>> 2 packets transmitted, 2 packets received, 0% packet loss
>> round-trip min/avg/max/stddev = 2.142/5.303/8.465/3.162 ms
>>
>> [edit]
>> root at mx86-bgl-2-r1#
>>
>> b.
>> Logs at console for above config:
>> -
>>
>> root at localhost:/home/pfe# Santosh ixgbevf_add_mac_addr portid=0
>> mac=00:50:56:A0:A0:C3
>>
>> root at localhost:/home/pfe#
>>
>>
>> 3.
>> a. Deleted above config MAC
>> root at mx86-bgl-2-r1# delete interfaces ge-0/0/0 mac
>>
>> [edit]
>> root at mx86-bgl-2-r1# commit
>> commit complete
>>
>> [edit]
>> root at mx86-bgl-2-r1# show interfaces
>> ge-0/0/0 {
>>  unit 0 {
>>  family inet {
>>  address 10.1.1.102/24;
>>  }
>>  }
>> }
>>
>> [edit]
>> root at mx86-bgl-2-r1# run ping 10.1.1.101
>> PING 10.1.1.101 (10.1.1.101): 56 data bytes
>> ^C
>> --- 10.1.1.101 ping statistics ---
>> 3 packets transmitted, 0 packets received, 100% packet loss
>>
>> [edit]
>> root at mx86-bgl-2-r1#
>>
>> b.
>>   Logs at console for above config
>>   -
>>
>>
>> root at localhost:/home/pfe# ixgbevf_remove_mac_addr s

[dpdk-dev] [PATCH v2] vhost: Fix linkage of vhost PMD

2016-04-26 Thread Panu Matilainen
On 04/26/2016 08:39 AM, Tetsuya Mukawa wrote:
> Currently, vhost PMD doesn't have linkage for librte_vhost, even though
> it depends on librte_vhost APIs. This causes a linkage error if below
> conditions are fulfilled.
>
>  - DPDK libraries are compiled as shared libraries.
>  - DPDK application doesn't link librte_vhost.
>  - Above application tries to link vhost PMD using '-d' DPDK option.
>
> The patch adds linkage for librte_vhost to vhost PMD not to cause an
> above error.
>
> Signed-off-by: Tetsuya Mukawa 
> Acked-by: Panu Matilainen 
> ---
>  drivers/net/vhost/Makefile | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/vhost/Makefile b/drivers/net/vhost/Makefile
> index f49a69b..30b91a0 100644
> --- a/drivers/net/vhost/Makefile
> +++ b/drivers/net/vhost/Makefile
> @@ -38,6 +38,7 @@ LIB = librte_pmd_vhost.a
>
>  CFLAGS += -O3
>  CFLAGS += $(WERROR_FLAGS)
> +LDLIBS += -lrte_vhost
>
>  EXPORT_MAP := rte_pmd_vhost_version.map
>
>

Hmm, turns out this isn't quite enough, simply because it's the first of 
its kind (first internal dependency between libraries), at least I'm 
getting:

== Build drivers/net/vhost
gcc -m64 
-Wl,--version-script=/srv/work/dist/dpdk/dpdk-16.04/drivers/net/vhost/rte_pmd_vhost_version.map
 
  -shared rte_eth_vhost.o -lrte_vhost -Wl,-soname,librte_pmd_vhost.so.1 
-o librte_pmd_vhost.so.1
/usr/bin/ld: cannot find -lrte_vhost
collect2: error: ld returned 1 exit status
/srv/work/dist/dpdk/dpdk-16.04/mk/rte.lib.mk:127: recipe for target 
'librte_pmd_vhost.so.1' failed

So it'll need something like this as a pre-requisite to add the internal 
libraries to the linker path:

diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
index 8f7e021..f5d7b04 100644
--- a/mk/rte.lib.mk
+++ b/mk/rte.lib.mk
@@ -86,7 +86,7 @@ O_TO_A_DO = @set -e; \
 $(O_TO_A) && \
 echo $(O_TO_A_CMD) > $(call exe2cmd,$(@))

-O_TO_S = $(LD) $(_CPU_LDFLAGS) $(EXTRA_LDFLAGS) -shared $(OBJS-y) 
$(LDLIBS) \
+O_TO_S = $(LD) -L$(RTE_OUTPUT)/lib $(_CPU_LDFLAGS) $(EXTRA_LDFLAGS) 
-shared $(OBJS-y) $(LDLIBS) \
  -Wl,-soname,$(LIB) -o $(LIB)
  O_TO_S_STR = $(subst ','\'',$(O_TO_S)) #'# fix syntax highlight
  O_TO_S_DISP = $(if $(V),"$(O_TO_S_STR)","  LD $(@)")


I can submit an official patch for this later but I'm not exactly 
feeling like the sharpest knife in the drawer today so if somebody beats 
me to it, feel free.

- Panu -


[dpdk-dev] [PATCH] eal: out-of-bounds write

2016-04-26 Thread Sergio Gonzalez Monroy
On 26/04/2016 09:53, Bruce Richardson wrote:
> On Tue, Apr 26, 2016 at 09:44:47AM +0200, Slawomir Mrozowicz wrote:
>> Fix issue reported by Coverity.
>>
>> Coverity ID 13282: Out-of-bounds write
>> overrun-local: Overrunning array mcfg->memseg of 256 44-byte elements
>> at element index 257 using index j.
>>
>> Fixes: af75078fece3 ("first public release")
>>
>> Signed-off-by: Slawomir Mrozowicz 
>> ---
>>   lib/librte_eal/linuxapp/eal/eal_memory.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
>> b/lib/librte_eal/linuxapp/eal/eal_memory.c
>> index 5b9132c..1e737e4 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
>> @@ -1333,7 +1333,7 @@ rte_eal_hugepage_init(void)
>>   
>>  if (new_memseg) {
>>  j += 1;
>> -if (j == RTE_MAX_MEMSEG)
>> +if (j >= RTE_MAX_MEMSEG)
>>  break;
>>   
>>  mcfg->memseg[j].phys_addr = hugepage[i].physaddr;
>> -- 
> This does appear to be a valid fix for the issue. However, looking at the
> code, it appears that the only way we could actually hit the problem is if
> j == RTE_MAX_MEMSEG on exiting the previous loop. Would a check there be a
> better fix for this issue (or perhaps we want both fixes).
>
> Thoughts?

It doesn't make sense to go into the loop if we don't have free memsegs.
Either way we should print the error indicating that we reached MAX_MEMSEG.

Sergio


> /Bruce



[dpdk-dev] [PATCH] bond: inherit maximum rx packet length

2016-04-26 Thread Declan Doherty
On 14/04/16 18:23, Eric Kinzie wrote:
>Instead of a hard-coded maximum receive length, allow the bond interface
>to inherit this limit from the first slave added.  This allows
>an application that uses jumbo frames to pass realistic values to
>rte_eth_dev_configure without causing an error.
>
> Signed-off-by: Eric Kinzie 
> ---
...
>

Hey Eric, just one small thing: I think it probably makes sense to 
return the max rx pktlen across all slaves, so as we add each slave we just 
check whether that slave's value is larger than the current value.

@@ -385,6 +389,10 @@ __eth_bond_slave_add_lock_free(uint8_t 
bonded_port_id, uint8_t slave_port_id)
 internals->tx_offload_capa &= dev_info.tx_offload_capa;
 internals->flow_type_rss_offloads &= 
dev_info.flow_type_rss_offloads;

+   /* If new slave's max rx packet size is larger than 
current value then override */
+   if (dev_info.max_rx_pktlen > internals->max_rx_pktlen)
+   internals->max_rx_pktlen = dev_info.max_rx_pktlen;
+

Declan


[dpdk-dev] [PATCH] examples/l3fwd: report error when no vector engine is available

2016-04-26 Thread Jan Viktorin
If neither SSE nor NEON is available, l3fwd should complain loudly so that
the reason can be found quickly.

Signed-off-by: Jan Viktorin 
---
It happened to me once when I accidentally built a GCC without NEON
support. It was confusing, as at first I thought it was a bug... It is not;
an error message telling the reason is just missing.
---
 examples/l3fwd/l3fwd_em.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
index fc59243..259094d 100644
--- a/examples/l3fwd/l3fwd_em.c
+++ b/examples/l3fwd/l3fwd_em.c
@@ -259,6 +259,8 @@ em_mask_key(void *key, xmm_t mask)

return vandq_s32(data, mask);
 }
+#else
+#error No vector engine (SSE, NEON) available, check your toolchain
 #endif

 static inline uint8_t
-- 
2.8.0
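
For context, the overall shape of such a guard can be sketched as a
self-contained file. The compiler macros __SSE2__ and __ARM_NEON used here are
assumptions chosen for the sketch (the l3fwd source may test different
macros), and the example_* names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
typedef __m128i example_xmm_t;

/* AND a 16-byte key with a mask using SSE2. */
static inline example_xmm_t
example_mask_key(const void *key, example_xmm_t mask)
{
	__m128i data = _mm_loadu_si128((const __m128i *)key);

	return _mm_and_si128(data, mask);
}
#elif defined(__ARM_NEON)
#include <arm_neon.h>
typedef int32x4_t example_xmm_t;

/* AND a 16-byte key with a mask using NEON. */
static inline example_xmm_t
example_mask_key(const void *key, example_xmm_t mask)
{
	int32x4_t data = vld1q_s32((const int32_t *)key);

	return vandq_s32(data, mask);
}
#else
/* Fail the build with a clear reason instead of an obscure error later. */
#error No vector engine (SSE2, NEON) available, check your toolchain
#endif

/* Convenience wrapper working on plain 32-bit words. */
static inline void
example_mask_words(const int32_t key[4], const int32_t mask[4],
		   int32_t out[4])
{
	example_xmm_t m, r;

	memcpy(&m, mask, sizeof(m));
	r = example_mask_key(key, m);
	memcpy(out, &r, sizeof(r));
}
```

Without the #else branch, a toolchain lacking both engines would fail later
with an undefined-function error that does not point at the real cause.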



[dpdk-dev] [PATCH] nfp: modifying guide about using uio modules

2016-04-26 Thread Alejandro Lucero
 - Removing dependency on nfp_uio kernel module. The igb_uio
   kernel modules can be used instead.

Fixes: 80bc1752f16e ("nfp: add guide")

Signed-off-by: Alejandro Lucero 
---
 doc/guides/nics/nfp.rst | 47 ---
 1 file changed, 16 insertions(+), 31 deletions(-)

diff --git a/doc/guides/nics/nfp.rst b/doc/guides/nics/nfp.rst
index dfc3683..e4ebc71 100644
--- a/doc/guides/nics/nfp.rst
+++ b/doc/guides/nics/nfp.rst
@@ -61,9 +61,8 @@ instructions.

 DPDK runs in userspace and PMDs uses the Linux kernel UIO interface to
 allow access to physical devices from userspace. The NFP PMD requires
-a separate UIO driver, **nfp_uio**, to perform correct
-initialization. This driver is part of Netronome's BSP and it is
-equivalent to Intel's igb_uio driver.
+the **igb_uio** UIO driver, available with DPDK, to perform correct
+initialization.

 Building the software
 -
@@ -201,27 +200,18 @@ Using the NFP PMD is not different to using other PMDs. 
Usual steps are:

The module should now be listed by the lsmod command.

-#. **To install the nfp_uio kernel module (manually):** This module supports
-   NFP-6xxx devices through the UIO interface.
-
-   This module is part of Netronome's BSP and it should be available when the
-   BSP is installed.
+#. **To install the igb_uio kernel module (manually):** This module is part
+   of DPDK sources and configured by default (CONFIG_RTE_EAL_IGB_UIO=y).

.. code-block:: console

-  modprobe nfp_uio.ko
+  modprobe igb_uio.ko

The module should now be listed by the lsmod command.

-   Depending on which NFP modules are loaded, nfp_uio may be automatically
-   bound to the NFP PCI devices by the system. Otherwise the binding needs
-   to be done explicitly. This is the case when nfp_netvf, the Linux kernel
-   driver for NFP VFs, was loaded when VFs were created. As described later
-   in this document this configuration may also be performed using scripts
-   provided by the Netronome's BSP.
-
-   First the device needs to be unbound, for example from the nfp_netvf
-   driver:
+   Depending on which NFP modules are loaded, it could be necessary to
+   detach NFP devices from the nfp_netvf module. If this is the case the
+   device needs to be unbound, for example:

.. code-block:: console

@@ -232,30 +222,25 @@ Using the NFP PMD is not different to using other PMDs. 
Usual steps are:
The output of lspci should now show that :03:08.0 is not bound to
any driver.

-   The next step is to add the NFP PCI ID to the NFP UIO driver:
+   The next step is to add the NFP PCI ID to the IGB UIO driver:

.. code-block:: console

-  echo 19ee 6003 > /sys/bus/pci/drivers/nfp_uio/new_id
+  echo 19ee 6003 > /sys/bus/pci/drivers/igb_uio/new_id

-   And then to bind the device to the nfp_uio driver:
+   And then to bind the device to the igb_uio driver:

.. code-block:: console

-  echo :03:08.0 > /sys/bus/pci/drivers/nfp_uio/bind
+  echo :03:08.0 > /sys/bus/pci/drivers/igb_uio/bind

   lspci -d19ee: -k

-   lspci should show that device bound to nfp_uio driver.
-
-#. **Using tools from Netronome's BSP to install and bind modules:** DPDK 
provides
-   scripts which are useful for installing the UIO modules and for binding the
-   right device to those modules avoiding doing so manually. However, these 
scripts
-   have not support for Netronome's UIO driver. Along with drivers, the BSP 
installs
-   those DPDK scripts slightly modified with support for Netronome's UIO 
driver.
+   lspci should show that device bound to igb_uio driver.

-   Those specific scripts can be found in Netronome's BSP installation 
directory.
-   Refer to BSP documentation for more information.
+#. **Using scripts to install and bind modules:** DPDK provides scripts which 
are
+   useful for installing the UIO modules and for binding the right device to 
those
+   modules avoiding doing so manually:

* **setup.sh**
* **dpdk_nic_bind.py**
-- 
1.9.1



[dpdk-dev] [PATCH] eal: out-of-bounds write

2016-04-26 Thread Mrozowicz, SlawomirX


>-Original Message-
>From: Gonzalez Monroy, Sergio
>Sent: Tuesday, April 26, 2016 11:44 AM
>To: Richardson, Bruce ; Mrozowicz, SlawomirX
>
>Cc: david.marchand at 6wind.com; dev at dpdk.org
>Subject: Re: [dpdk-dev] [PATCH] eal: out-of-bounds write
>
>On 26/04/2016 09:53, Bruce Richardson wrote:
>> On Tue, Apr 26, 2016 at 09:44:47AM +0200, Slawomir Mrozowicz wrote:
>>> Fix issue reported by Coverity.
>>>
>>> Coverity ID 13282: Out-of-bounds write
>>> overrun-local: Overrunning array mcfg->memseg of 256 44-byte elements
>>> at element index 257 using index j.
>>>
>>> Fixes: af75078fece3 ("first public release")
>>>
>>> Signed-off-by: Slawomir Mrozowicz 
>>> ---
>>>   lib/librte_eal/linuxapp/eal/eal_memory.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c
>>> b/lib/librte_eal/linuxapp/eal/eal_memory.c
>>> index 5b9132c..1e737e4 100644
>>> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
>>> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
>>> @@ -1333,7 +1333,7 @@ rte_eal_hugepage_init(void)
>>>
>>> if (new_memseg) {
>>> j += 1;
>>> -   if (j == RTE_MAX_MEMSEG)
>>> +   if (j >= RTE_MAX_MEMSEG)
>>> break;
>>>
>>> mcfg->memseg[j].phys_addr = hugepage[i].physaddr;
>>> --
>> This does appear to be a valid fix for the issue. However, looking at
>> the code, it appears that the only way we could actually hit the
>> problem is if j == RTE_MAX_MEMSEG on exiting the previous loop. Would
>> a check there be a better fix for this issue (or perhaps we want both fixes).
>>
>> Thoughts?
>
>It doesn't make sense to go into the loop if we don't have free memsegs.
>Either way we should print the error indicating that we reached
>MAX_MEMSEG.
>
>Sergio
>

It is possible to add an additional check for an available memseg before the loop. 
In that case the condition would be checked twice: before the loop and inside it. 
In my opinion that is not necessary.

Anyway, it would be valuable to add an error message at line 1336 if MAX_MEMSEG is 
reached.

Sławomir

>
>> /Bruce



[dpdk-dev] [PATCH] virtio: fix modify drv_flags for specific device

2016-04-26 Thread David Marchand
On Tue, Apr 26, 2016 at 4:24 AM, Jianfeng Tan  wrote:
> Issue: virtio's drv_flags are decided by the device type (modern vs legacy),
> which kernel driver is used, and the features negotiated (especially
> VIRTIO_NET_STATUS) with the backend. This makes it possible for multiple
> virtio devices to have different versions of drv_flags, but this variable
> is currently shared by all virtio devices.
>
> How to fix: dev_flags is a device-specific variable to store this info.
>
> Fixes: da978dfdc43 ("virtio: use port IO to get PCI resource")
>
> Reported-by: David Marchand 
> Suggested-by: David Marchand 
> Signed-off-by: Jianfeng Tan 

- ethdev dev_flags is supposed to be filled with
RTE_ETH_DEV_DETACHABLE, RTE_ETH_DEV_INTR_LSC etc... not pci macros.

- I would have kept the init code as it is until the
rte_eth_copy_pci_info() step, then sanitise the dev_flags, but this
might be a matter of taste.


-- 
David Marchand


[dpdk-dev] [PATCH] nfp: avoiding concurrency when hardware reconfig

2016-04-26 Thread Alejandro Lucero
Apps that call reconfig-triggering functions from different threads at
the same time can cause reconfig problems. The reconfig mechanism is
based on a hardware queue where incrementing a counter signals the
firmware to do the reconfig. If there are two increments before the
first one has been processed, the firmware will stop and a device
reset is necessary.

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c | 8 
 drivers/net/nfp/nfp_net_pmd.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index bc0a3d8..ba0ee04 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -58,6 +58,7 @@
 #include "nfp_net_pmd.h"
 #include "nfp_net_logs.h"
 #include "nfp_net_ctrl.h"
+#include 

 /* Prototypes */
 static void nfp_net_close(struct rte_eth_dev *dev);
@@ -407,6 +408,8 @@ nfp_net_reconfig(struct nfp_net_hw *hw, uint32_t ctrl, 
uint32_t update)
PMD_DRV_LOG(DEBUG, "nfp_net_reconfig: ctrl=%08x update=%08x\n",
ctrl, update);

+   rte_spinlock_lock(&hw->reconfig_lock);
+
nn_cfg_writel(hw, NFP_NET_CFG_CTRL, ctrl);
nn_cfg_writel(hw, NFP_NET_CFG_UPDATE, update);

@@ -414,6 +417,8 @@ nfp_net_reconfig(struct nfp_net_hw *hw, uint32_t ctrl, 
uint32_t update)

err = __nfp_net_reconfig(hw, update);

+   rte_spinlock_unlock(&hw->reconfig_lock);
+
if (!err)
return 0;

@@ -2399,6 +2404,9 @@ nfp_net_init(struct rte_eth_dev *eth_dev)
PMD_INIT_LOG(INFO, "max_rx_queues: %u, max_tx_queues: %u\n",
 hw->max_rx_queues, hw->max_tx_queues);

+   /* Initializing spinlock for reconfigs */
+   rte_spinlock_init(&hw->reconfig_lock);
+
/* Allocating memory for mac addr */
eth_dev->data->mac_addrs = rte_zmalloc("mac_addr", ETHER_ADDR_LEN, 0);
if (eth_dev->data->mac_addrs == NULL) {
diff --git a/drivers/net/nfp/nfp_net_pmd.h b/drivers/net/nfp/nfp_net_pmd.h
index 232ce5c..c180972 100644
--- a/drivers/net/nfp/nfp_net_pmd.h
+++ b/drivers/net/nfp/nfp_net_pmd.h
@@ -406,6 +406,7 @@ struct nfp_net_hw {
int stride_tx;

uint8_t *qcp_cfg;
+   rte_spinlock_t reconfig_lock;

uint32_t max_tx_queues;
uint32_t max_rx_queues;
-- 
1.9.1
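
For illustration only, the locking pattern used in this patch can be
sketched outside DPDK as shown below. This is a minimal sketch, not the
driver code: struct fake_hw, fake_reconfig() and the pending/max_pending
counters are hypothetical stand-ins, and a C11 atomic_flag plays the role
of rte_spinlock_t. The idea it shows is that the register writes and the
counter increment form one critical section, so a second request cannot
be queued before the first one has been processed.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the device state; not struct nfp_net_hw. */
struct fake_hw {
	atomic_flag reconfig_lock;  /* plays the role of rte_spinlock_t */
	uint32_t ctrl;              /* last value written to the ctrl word */
	uint32_t update;            /* last value written to the update word */
	unsigned int pending;       /* requests the "firmware" has not handled */
	unsigned int max_pending;   /* highest value 'pending' ever reached */
};

static void fake_hw_init(struct fake_hw *hw)
{
	atomic_flag_clear(&hw->reconfig_lock);
	hw->ctrl = 0;
	hw->update = 0;
	hw->pending = 0;
	hw->max_pending = 0;
}

/* Busy-wait until the flag is acquired. */
static void fake_lock(atomic_flag *lock)
{
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;
}

static void fake_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Write both words and signal the "firmware" as one critical section. */
static int fake_reconfig(struct fake_hw *hw, uint32_t ctrl, uint32_t update)
{
	fake_lock(&hw->reconfig_lock);

	hw->ctrl = ctrl;
	hw->update = update;

	hw->pending++;              /* counter increment signals the firmware */
	if (hw->pending > hw->max_pending)
		hw->max_pending = hw->pending;
	hw->pending--;              /* simulated: firmware handled the request */

	fake_unlock(&hw->reconfig_lock);
	return 0;
}
```

With the lock held across the whole sequence, max_pending can never
exceed 1, which is the property the patch is after.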



[dpdk-dev] [PATCH] mk: cleanup leftover references to librte_malloc

2016-04-26 Thread Panu Matilainen
librte_malloc was long since merged into librte_eal, mop up the
leftovers from rarer drivers.

Fixes: 2f9d47013e4d ("mem: move librte_malloc to eal/common")

Signed-off-by: Panu Matilainen 
---
 drivers/net/cxgbe/Makefile| 2 +-
 drivers/net/ena/Makefile  | 2 +-
 drivers/net/mpipe/Makefile| 2 +-
 drivers/net/nfp/Makefile  | 2 +-
 drivers/net/szedata2/Makefile | 1 -
 5 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/cxgbe/Makefile b/drivers/net/cxgbe/Makefile
index 0711976..e2ff412 100644
--- a/drivers/net/cxgbe/Makefile
+++ b/drivers/net/cxgbe/Makefile
@@ -82,6 +82,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += t4_hw.c
 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += lib/librte_eal lib/librte_ether
 DEPDIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += lib/librte_mempool lib/librte_mbuf
-DEPDIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += lib/librte_net lib/librte_malloc
+DEPDIRS-$(CONFIG_RTE_LIBRTE_CXGBE_PMD) += lib/librte_net

 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/ena/Makefile b/drivers/net/ena/Makefile
index ac2b55d..a0d3358 100644
--- a/drivers/net/ena/Makefile
+++ b/drivers/net/ena/Makefile
@@ -54,7 +54,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_ENA_PMD) += ena_eth_com.c
 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_ENA_PMD) += lib/librte_eal lib/librte_ether
 DEPDIRS-$(CONFIG_RTE_LIBRTE_ENA_PMD) += lib/librte_mempool lib/librte_mbuf
-DEPDIRS-$(CONFIG_RTE_LIBRTE_ENA_PMD) += lib/librte_net lib/librte_malloc
+DEPDIRS-$(CONFIG_RTE_LIBRTE_ENA_PMD) += lib/librte_net

 CFLAGS += $(INCLUDES)

diff --git a/drivers/net/mpipe/Makefile b/drivers/net/mpipe/Makefile
index 46f046d..846e2e0 100644
--- a/drivers/net/mpipe/Makefile
+++ b/drivers/net/mpipe/Makefile
@@ -42,6 +42,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += mpipe_tilegx.c

 DEPDIRS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += lib/librte_eal lib/librte_ether
 DEPDIRS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += lib/librte_mempool lib/librte_mbuf
-DEPDIRS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += lib/librte_net lib/librte_malloc
+DEPDIRS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += lib/librte_net

 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/nfp/Makefile b/drivers/net/nfp/Makefile
index 11f..4cadd13 100644
--- a/drivers/net/nfp/Makefile
+++ b/drivers/net/nfp/Makefile
@@ -53,6 +53,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += nfp_net.c
 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += lib/librte_eal lib/librte_ether
 DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += lib/librte_mempool lib/librte_mbuf
-DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += lib/librte_net lib/librte_malloc
+DEPDIRS-$(CONFIG_RTE_LIBRTE_NFP_PMD) += lib/librte_net

 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/net/szedata2/Makefile b/drivers/net/szedata2/Makefile
index 963a8d6..ee4986c 100644
--- a/drivers/net/szedata2/Makefile
+++ b/drivers/net/szedata2/Makefile
@@ -57,7 +57,6 @@ SYMLINK-y-include +=
 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += lib/librte_mbuf
 DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += lib/librte_ether
-DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += lib/librte_malloc
 DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += lib/librte_kvargs

 include $(RTE_SDK)/mk/rte.lib.mk
-- 
2.5.5



[dpdk-dev] [PATCH] virtio: fix memory leak of virtqueue memzones

2016-04-26 Thread Jianfeng Tan
Issue: When virtio was first added to DPDK, there was no API to free
memzones. This has changed since rte_memzone_free() was implemented by
commit ff909fe21f.

This patch makes sure that memzones in struct virtqueue, such as mz and
virtio_net_hdr_mz, are freed when a queue is released or its setup fails.

Signed-off-by: Jianfeng Tan 
---
 drivers/net/virtio/virtio_ethdev.c | 69 --
 drivers/net/virtio/virtio_ethdev.h |  2 +-
 drivers/net/virtio/virtio_rxtx.c   |  4 +--
 3 files changed, 40 insertions(+), 35 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 63a368a..54eacf6 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -261,12 +261,18 @@ virtio_set_multiple_queues(struct rte_eth_dev *dev, 
uint16_t nb_queues)
 }

 void
-virtio_dev_queue_release(struct virtqueue *vq) {
+virtio_dev_queue_release(struct virtqueue *vq, int io_related)
+{
struct virtio_hw *hw;

if (vq) {
hw = vq->hw;
-   hw->vtpci_ops->del_queue(hw, vq);
+   if (io_related)
+   hw->vtpci_ops->del_queue(hw, vq);
+
+   rte_memzone_free(vq->mz);
+   if (vq->virtio_net_hdr_mz)
+   rte_memzone_free(vq->virtio_net_hdr_mz);

rte_free(vq->sw_ring);
rte_free(vq);
@@ -286,6 +292,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
unsigned int vq_size, size;
struct virtio_hw *hw = dev->data->dev_private;
struct virtqueue *vq = NULL;
+   const char *queue_names[] = {"rvq", "txq", "cvq"};

PMD_INIT_LOG(DEBUG, "setting up queue: %u", vtpci_queue_idx);

@@ -305,34 +312,34 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
return -EINVAL;
}

-   if (queue_type == VTNET_RQ) {
-   snprintf(vq_name, sizeof(vq_name), "port%d_rvq%d",
-   dev->data->port_id, queue_idx);
-   vq = rte_zmalloc(vq_name, sizeof(struct virtqueue) +
-   vq_size * sizeof(struct vq_desc_extra), 
RTE_CACHE_LINE_SIZE);
-   vq->sw_ring = rte_zmalloc_socket("rxq->sw_ring",
-   (RTE_PMD_VIRTIO_RX_MAX_BURST + vq_size) *
-   sizeof(vq->sw_ring[0]), RTE_CACHE_LINE_SIZE, socket_id);
-   } else if (queue_type == VTNET_TQ) {
-   snprintf(vq_name, sizeof(vq_name), "port%d_tvq%d",
-   dev->data->port_id, queue_idx);
-   vq = rte_zmalloc(vq_name, sizeof(struct virtqueue) +
-   vq_size * sizeof(struct vq_desc_extra), 
RTE_CACHE_LINE_SIZE);
-   } else if (queue_type == VTNET_CQ) {
-   snprintf(vq_name, sizeof(vq_name), "port%d_cvq",
-   dev->data->port_id);
-   vq = rte_zmalloc(vq_name, sizeof(struct virtqueue) +
-   vq_size * sizeof(struct vq_desc_extra),
-   RTE_CACHE_LINE_SIZE);
+   if (queue_type < VTNET_RQ || queue_type > VTNET_CQ) {
+   PMD_INIT_LOG(ERR, "invalid queue type: %d", queue_type);
+   return -EINVAL;
}
+
+   snprintf(vq_name, sizeof(vq_name), "port%d_%s%d",
+dev->data->port_id, queue_names[queue_type], queue_idx);
+   vq = rte_zmalloc(vq_name, sizeof(struct virtqueue) +
+vq_size * sizeof(struct vq_desc_extra),
+RTE_CACHE_LINE_SIZE);
if (vq == NULL) {
PMD_INIT_LOG(ERR, "Can not allocate virtqueue");
return -ENOMEM;
}
-   if (queue_type == VTNET_RQ && vq->sw_ring == NULL) {
-   PMD_INIT_LOG(ERR, "Can not allocate RX soft ring");
-   rte_free(vq);
-   return -ENOMEM;
+
+   if (queue_type == VTNET_RQ) {
+   size_t sz_sw;
+
+   sz_sw = (RTE_PMD_VIRTIO_RX_MAX_BURST + vq_size) *
+   sizeof(vq->sw_ring[0]);
+   vq->sw_ring = rte_zmalloc_socket("rxq->sw_ring", sz_sw,
+RTE_CACHE_LINE_SIZE,
+socket_id);
+   if (!vq->sw_ring) {
+   PMD_INIT_LOG(ERR, "Can not allocate RX soft ring");
+   virtio_dev_queue_release(vq, 0);
+   return -ENOMEM;
+   }
}

vq->hw = hw;
@@ -358,7 +365,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
if (rte_errno == EEXIST)
mz = rte_memzone_lookup(vq_name);
if (mz == NULL) {
-   rte_free(vq);
+   virtio_dev_queue_release(vq, 0);
return -ENOMEM;
}
}
@@ -370,7 +377,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
 */
if ((mz->phys_addr + vq->vq_ri

[dpdk-dev] [PATCH v2] kni: add chained mbufs support

2016-04-26 Thread Ferruh Yigit
The rx_q fifo may contain chained mbufs; merge them into a single skb
before handing it to the network stack.

Signed-off-by: Ferruh Yigit 
---
 .../linuxapp/eal/include/exec-env/rte_kni_common.h |  4 +-
 lib/librte_eal/linuxapp/kni/kni_net.c  | 83 --
 2 files changed, 64 insertions(+), 23 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h 
b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index 7e5e598..2acdfd9 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -113,7 +113,9 @@ struct rte_kni_mbuf {
void *buf_addr __attribute__((__aligned__(RTE_CACHE_LINE_SIZE)));
char pad0[10];
uint16_t data_off;  /**< Start address of data in segment buffer. */
-   char pad1[4];
+   char pad1[2];
+   uint8_t nb_segs;/**< Number of segments. */
+   char pad4[1];
uint64_t ol_flags;  /**< Offload features. */
char pad2[4];
uint32_t pkt_len;   /**< Total pkt len: sum of all segment 
data_len. */
diff --git a/lib/librte_eal/linuxapp/kni/kni_net.c 
b/lib/librte_eal/linuxapp/kni/kni_net.c
index cfa8339..44f49cc 100644
--- a/lib/librte_eal/linuxapp/kni/kni_net.c
+++ b/lib/librte_eal/linuxapp/kni/kni_net.c
@@ -156,7 +156,8 @@ kni_net_rx_normal(struct kni_dev *kni)
/* Transfer received packets to netif */
for (i = 0; i < num_rx; i++) {
kva = (void *)va[i] - kni->mbuf_va + kni->mbuf_kva;
-   len = kva->data_len;
+   len = kva->pkt_len;
+
data_kva = kva->buf_addr + kva->data_off - kni->mbuf_va
+ kni->mbuf_kva;

@@ -165,22 +166,41 @@ kni_net_rx_normal(struct kni_dev *kni)
KNI_ERR("Out of mem, dropping pkts\n");
/* Update statistics */
kni->stats.rx_dropped++;
+   continue;
}
-   else {
-   /* Align IP on 16B boundary */
-   skb_reserve(skb, 2);
+
+   /* Align IP on 16B boundary */
+   skb_reserve(skb, 2);
+
+   if (kva->nb_segs == 1) {
memcpy(skb_put(skb, len), data_kva, len);
-   skb->dev = dev;
-   skb->protocol = eth_type_trans(skb, dev);
-   skb->ip_summed = CHECKSUM_UNNECESSARY;
+   } else {
+   int nb_segs;
+   int kva_nb_segs = kva->nb_segs;

-   /* Call netif interface */
-   netif_rx_ni(skb);
+   for (nb_segs = 0; nb_segs < kva_nb_segs; nb_segs++) {
+   memcpy(skb_put(skb, kva->data_len),
+   data_kva, kva->data_len);

-   /* Update statistics */
-   kni->stats.rx_bytes += len;
-   kni->stats.rx_packets++;
+   if (!kva->next)
+   break;
+
+   kva = kva->next - kni->mbuf_va + kni->mbuf_kva;
+   data_kva = kva->buf_addr + kva->data_off
+   - kni->mbuf_va + kni->mbuf_kva;
+   }
}
+
+   skb->dev = dev;
+   skb->protocol = eth_type_trans(skb, dev);
+   skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+   /* Call netif interface */
+   netif_rx_ni(skb);
+
+   /* Update statistics */
+   kni->stats.rx_bytes += len;
+   kni->stats.rx_packets++;
}

/* Burst enqueue mbufs into free_q */
@@ -317,7 +337,7 @@ kni_net_rx_lo_fifo_skb(struct kni_dev *kni)
/* Copy mbufs to sk buffer and then call tx interface */
for (i = 0; i < num; i++) {
kva = (void *)va[i] - kni->mbuf_va + kni->mbuf_kva;
-   len = kva->data_len;
+   len = kva->pkt_len;
data_kva = kva->buf_addr + kva->data_off - kni->mbuf_va +
kni->mbuf_kva;

@@ -338,20 +358,39 @@ kni_net_rx_lo_fifo_skb(struct kni_dev *kni)
if (skb == NULL) {
KNI_ERR("Out of mem, dropping pkts\n");
kni->stats.rx_dropped++;
+   continue;
}
-   else {
-   /* Align IP on 16B boundary */
-   skb_reserve(skb, 2);
+
+   /* Align IP on 16B boundary */
+   skb_reserve(skb, 2);
+
+   if (kva->nb_segs == 1) {
memcpy(skb_put(skb, len), data_kva, len);
-   skb->dev = dev;
-   skb->ip_summed = CHECKSUM_UNNECESSARY;
+
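
As a rough illustration of the merge described in the commit message
above, the sketch below copies a chain of segments into one contiguous
buffer. It is not the KNI code: struct seg and flatten_chain() are
hypothetical names, and a plain byte buffer stands in for the skb.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical segment descriptor; the real code uses struct rte_kni_mbuf. */
struct seg {
	const uint8_t *data;  /* start of this segment's payload */
	uint16_t data_len;    /* number of bytes in this segment only */
	struct seg *next;     /* next segment, or NULL for the last one */
};

/*
 * Copy every segment of the chain into 'dst', back to back.
 * Returns the number of bytes copied, or -1 if 'dst' is too small.
 */
static long flatten_chain(const struct seg *head, uint8_t *dst, size_t cap)
{
	size_t off = 0;
	const struct seg *s;

	for (s = head; s != NULL; s = s->next) {
		if (s->data_len > cap - off)
			return -1;
		memcpy(dst + off, s->data, s->data_len);
		off += s->data_len;
	}
	return (long)off;
}
```

The returned total corresponds to pkt_len (the sum of all data_len
values), which is why the patch sizes the copy by pkt_len rather than by
the first segment's data_len.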

[dpdk-dev] [PATCH] eal: add option --avail-cores to detect lcores

2016-04-26 Thread Tan, Jianfeng
Hi,

Since several people have asked about the status of this patch, I'd like
to ask whether anyone still has concerns.
The current conclusion is to go with the --avail-cores option.

Thanks,
Jianfeng

On 3/4/2016 6:05 PM, Jianfeng Tan wrote:
> This patch adds option, --avail-cores, to use lcores which are available
> by calling pthread_getaffinity_np() to narrow down detected cores before
> parsing coremask (-c), corelist (-l), and coremap (--lcores).
>
> Test example:
> $ taskset 0xc ./examples/helloworld/build/helloworld \
>   --avail-cores -m 1024
>
> Signed-off-by: Jianfeng Tan 
> Acked-by: Neil Horman 
> ---
>   lib/librte_eal/common/eal_common_options.c | 52 
> ++
>   lib/librte_eal/common/eal_options.h|  2 ++
>   2 files changed, 54 insertions(+)
>
> diff --git a/lib/librte_eal/common/eal_common_options.c 
> b/lib/librte_eal/common/eal_common_options.c
> index 29942ea..dc4882d 100644
> --- a/lib/librte_eal/common/eal_common_options.c
> +++ b/lib/librte_eal/common/eal_common_options.c
> @@ -95,6 +95,7 @@ eal_long_options[] = {
>   {OPT_VFIO_INTR, 1, NULL, OPT_VFIO_INTR_NUM},
>   {OPT_VMWARE_TSC_MAP,0, NULL, OPT_VMWARE_TSC_MAP_NUM   },
>   {OPT_XEN_DOM0,  0, NULL, OPT_XEN_DOM0_NUM },
> + {OPT_AVAIL_CORES,   0, NULL, OPT_AVAIL_CORES_NUM  },
>   {0, 0, NULL, 0}
>   };
>   
> @@ -681,6 +682,37 @@ err:
>   }
>   
>   static int
> +eal_parse_avail_cores(void)
> +{
> + int i, count;
> + pthread_t tid;
> + rte_cpuset_t cpuset;
> + struct rte_config *cfg = rte_eal_get_configuration();
> +
> + tid = pthread_self();
> + if (pthread_getaffinity_np(tid, sizeof(rte_cpuset_t), &cpuset) != 0)
> + return -1;
> +
> + for (i = 0, count = 0; i < RTE_MAX_LCORE; i++) {
> + if (lcore_config[i].detected && !CPU_ISSET(i, &cpuset)) {
> + RTE_LOG(DEBUG, EAL, "Flag lcore %u as undetected\n", i);
> + lcore_config[i].detected = 0;
> + lcore_config[i].core_index = -1;
> + cfg->lcore_role[i] = ROLE_OFF;
> + count++;
> + }
> + }
> + cfg->lcore_count -= count;
> + if (cfg->lcore_count == 0) {
> + RTE_LOG(ERR, EAL, "No lcores available\n");
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> +
> +static int
>   eal_parse_syslog(const char *facility, struct internal_config *conf)
>   {
>   int i;
> @@ -754,6 +786,10 @@ eal_parse_proc_type(const char *arg)
>   return RTE_PROC_INVALID;
>   }
>   
> +static int param_coremask;
> +static int param_corelist;
> +static int param_coremap;
> +
>   int
>   eal_parse_common_option(int opt, const char *optarg,
>   struct internal_config *conf)
> @@ -775,6 +811,7 @@ eal_parse_common_option(int opt, const char *optarg,
>   break;
>   /* coremask */
>   case 'c':
> + param_coremask = 1;
>   if (eal_parse_coremask(optarg) < 0) {
>   RTE_LOG(ERR, EAL, "invalid coremask\n");
>   return -1;
> @@ -782,6 +819,7 @@ eal_parse_common_option(int opt, const char *optarg,
>   break;
>   /* corelist */
>   case 'l':
> + param_corelist = 1;
>   if (eal_parse_corelist(optarg) < 0) {
>   RTE_LOG(ERR, EAL, "invalid core list\n");
>   return -1;
> @@ -890,12 +928,25 @@ eal_parse_common_option(int opt, const char *optarg,
>   break;
>   }
>   case OPT_LCORES_NUM:
> + param_coremap = 1;
>   if (eal_parse_lcores(optarg) < 0) {
>   RTE_LOG(ERR, EAL, "invalid parameter for --"
>   OPT_LCORES "\n");
>   return -1;
>   }
>   break;
> + case OPT_AVAIL_CORES_NUM:
> + if (param_coremask || param_corelist || param_coremap) {
> + RTE_LOG(ERR, EAL, "should put --" OPT_AVAIL_CORES
> + " before -c, -l and --" OPT_LCORES "\n");
> + return -1;
> + }
> + if (eal_parse_avail_cores() < 0) {
> + RTE_LOG(ERR, EAL, "failed to use --"
> + OPT_AVAIL_CORES "\n");
> + return -1;
> + }
> + break;
>   
>   /* don't know what to do, leave this to caller */
>   default:
> @@ -990,6 +1041,7 @@ eal_common_usage(void)
>  "  ',' is used for single number 
> separator.\n"
>  "  '( )' can be omitted for single element 
> group,\n"
>  "  '@' can be omitted if cpus and lcores 
> have the same value\n"
> +"  --"OPT_AVAIL_CORES"   Use pthread_getaffinity_np() to 
> detect cores 
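
The narrowing step that eal_parse_avail_cores() performs in the quoted
patch can be sketched in isolation as below. This is only an
approximation under stated assumptions: narrow_detected() is a
hypothetical helper, and a 64-bit mask stands in for the rte_cpuset_t
that the patch obtains from pthread_getaffinity_np().

```c
#include <stdint.h>

/*
 * Clear the 'detected' flag of every core that is not allowed by
 * 'affinity_mask' (bit i set means CPU i is allowed). Cores with an
 * index of 64 or more are treated as not allowed in this sketch.
 * Returns the number of cores that remain detected.
 */
static unsigned int narrow_detected(uint8_t *detected, unsigned int n,
				    uint64_t affinity_mask)
{
	unsigned int i, left = 0;

	for (i = 0; i < n; i++) {
		int allowed = i < 64 && ((affinity_mask >> i) & 1);

		if (!detected[i])
			continue;
		if (allowed)
			left++;
		else
			detected[i] = 0;  /* flag the core as undetected */
	}
	return left;
}
```

With the mask 0xc from the taskset example in the patch description, only
cores 2 and 3 stay detected. A return value of 0 corresponds to the
"No lcores available" error path.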

[dpdk-dev] [PATCH v6 2/8] qede: Add base driver

2016-04-26 Thread Bruce Richardson
On Mon, Apr 25, 2016 at 10:13:00PM -0700, Rasesh Mody wrote:
> The base driver is the backend module for the QLogic FastLinQ QL4
> 25G/40G CNA family of adapters as well as their virtual functions (VF)
> in SR-IOV context.
> 
> The purpose of the base module is to:
>  - provide all the common code that will be shared between the various
>drivers that would be used with said line of products. Flows such as
>chip initialization and de-initialization fall under this category.
>  - abstract the protocol-specific HW & FW components, allowing the
>protocol drivers to have clean APIs, which are detached in its
>slowpath configuration from the actual Hardware Software Interface(HSI).
> 
> This patch adds a base module without any protocol-specific bits.
> I.e., this adds a basic implementation that almost entirely falls under
> the first category.
> 
> Signed-off-by: Harish Patil 
> Signed-off-by: Rasesh Mody 
> Signed-off-by: Sony Chacko 
> ---



> +#
> +# CLANG VERSION
> +#
> +IS_CLANG_GT_362 := $(shell \
> + CLANG_MAJOR=`echo | clang -dM -E - 2>/dev/null | grep 
> clang_major | cut -d" " -f3`; \
> + CLANG_MINOR=`echo | clang -dM -E - 2>/dev/null | grep 
> clang_minor | cut -d" " -f3`; \
> + CLANG_PATCH=`echo | clang -dM -E - 2>/dev/null | grep 
> clang_patch | cut -d" " -f3`; \
> + if [ "0$$CLANG_MAJOR" -gt "03" ]; then \
> + echo 1; \
> + elif [ "0$$CLANG_MAJOR" -eq "03" -a "0$$CLANG_MINOR" 
> -gt "06" ]; then \
> + echo 1; \
> + elif [ "0$$CLANG_MAJOR" -eq "03" -a "0$$CLANG_MINOR" 
> -eq "06" -a "0$$CLANG_PATCH" -gt "02" ]; then \
> + echo 1; \
> + fi)
> +

While knowing the clang version might be generally useful, this seems a
long-winded way just to find out which compiler warning flag you need to
set. How about simply testing with clang to see whether you get an error
with the new flag or not?

For example, on Fedora 23 (clang 3.7):

  bruce@Fedora:dpdk-next-net$ clang -Wno-shift-negative-value -Werror -E - < /dev/null > /dev/null 2>&1
  bruce@Fedora:dpdk-next-net$ echo $?
  0

While the same commands on FreeBSD 10.3 (clang 3.4):

  bruce@bsd10:~$ clang -Wno-shift-negative-value -Werror -E - < /dev/null > /dev/null 2>&1
  bruce@bsd10:~$ echo $?
  1

> +#
> +# CFLAGS
> +#
> +CFLAGS_BASE_DRIVER = -Wno-unused-parameter
> +CFLAGS_BASE_DRIVER += -Wno-unused-value
> +CFLAGS_BASE_DRIVER += -Wno-sign-compare
> +CFLAGS_BASE_DRIVER += -Wno-missing-prototypes
> +CFLAGS_BASE_DRIVER += -Wno-cast-qual
> +CFLAGS_BASE_DRIVER += -Wno-unused-function
> +CFLAGS_BASE_DRIVER += -Wno-unused-variable
> +CFLAGS_BASE_DRIVER += -Wno-strict-aliasing
> +CFLAGS_BASE_DRIVER += -Wno-missing-prototypes
> +CFLAGS_BASE_DRIVER += -Wno-format-nonliteral
> +ifeq ($(OS_TYPE),Linux)
> +ifeq ($(IS_CLANG_GT_362),1)
> +CFLAGS_BASE_DRIVER += -Wno-shift-negative-value # Support added after clang 
> 3.6
> +else
> +CFLAGS_BASE_DRIVER += -Wno-shift-sign-overflow
> +endif
> +endif
> +




[dpdk-dev] [PATCH v2] nfp: avoiding concurrency when hardware reconfig

2016-04-26 Thread Alejandro Lucero
Applications that call reconfig-triggering functions from different
threads at the same time can cause reconfig problems. The reconfig
mechanism is based on a hardware queue where incrementing a counter
signals the firmware to do the reconfig. If the counter is incremented
twice before the first request has been processed, the firmware stops
and a device reset is necessary.

 - v2: moved the header include to the right place

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c | 8 
 drivers/net/nfp/nfp_net_pmd.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index bc0a3d8..559ebe6 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -54,6 +54,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "nfp_net_pmd.h"
 #include "nfp_net_logs.h"
@@ -407,6 +408,8 @@ nfp_net_reconfig(struct nfp_net_hw *hw, uint32_t ctrl, 
uint32_t update)
PMD_DRV_LOG(DEBUG, "nfp_net_reconfig: ctrl=%08x update=%08x\n",
ctrl, update);

+   rte_spinlock_lock(&hw->reconfig_lock);
+
nn_cfg_writel(hw, NFP_NET_CFG_CTRL, ctrl);
nn_cfg_writel(hw, NFP_NET_CFG_UPDATE, update);

@@ -414,6 +417,8 @@ nfp_net_reconfig(struct nfp_net_hw *hw, uint32_t ctrl, 
uint32_t update)

err = __nfp_net_reconfig(hw, update);

+   rte_spinlock_unlock(&hw->reconfig_lock);
+
if (!err)
return 0;

@@ -2399,6 +2404,9 @@ nfp_net_init(struct rte_eth_dev *eth_dev)
PMD_INIT_LOG(INFO, "max_rx_queues: %u, max_tx_queues: %u\n",
 hw->max_rx_queues, hw->max_tx_queues);

+   /* Initializing spinlock for reconfigs */
+   rte_spinlock_init(&hw->reconfig_lock);
+
/* Allocating memory for mac addr */
eth_dev->data->mac_addrs = rte_zmalloc("mac_addr", ETHER_ADDR_LEN, 0);
if (eth_dev->data->mac_addrs == NULL) {
diff --git a/drivers/net/nfp/nfp_net_pmd.h b/drivers/net/nfp/nfp_net_pmd.h
index 232ce5c..c180972 100644
--- a/drivers/net/nfp/nfp_net_pmd.h
+++ b/drivers/net/nfp/nfp_net_pmd.h
@@ -406,6 +406,7 @@ struct nfp_net_hw {
int stride_tx;

uint8_t *qcp_cfg;
+   rte_spinlock_t reconfig_lock;

uint32_t max_tx_queues;
uint32_t max_rx_queues;
-- 
1.9.1



[dpdk-dev] [PATCH v6 1/8] qede: Add maintainers, documentation and license

2016-04-26 Thread Bruce Richardson
On Mon, Apr 25, 2016 at 10:12:59PM -0700, Rasesh Mody wrote:
> Signed-off-by: Harish Patil 
> Signed-off-by: Rasesh Mody 
> Signed-off-by: Sony Chacko 
> ---
>  MAINTAINERS   |7 +
>  doc/guides/nics/index.rst |1 +
>  doc/guides/nics/overview.rst  |   86 +-
>  doc/guides/nics/qede.rst  |  314 
> +
>  drivers/net/qede/LICENSE.qede_pmd |   28 
>  5 files changed, 393 insertions(+), 43 deletions(-)
>  create mode 100644 doc/guides/nics/qede.rst
>  create mode 100644 drivers/net/qede/LICENSE.qede_pmd
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1953ea2..ba4053a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -332,6 +332,13 @@ M: Rasesh Mody 
>  F: drivers/net/bnx2x/
>  F: doc/guides/nics/bnx2x.rst
>  
> +QLogic qede PMD
> +M: Harish Patil 
> +M: Rasesh Mody 
> +M: Sony Chacko 
> +F: drivers/net/qede/
> +F: doc/guides/nics/qede.rst
> +
>  RedHat virtio
>  M: Huawei Xie 
>  M: Yuanhan Liu 
> diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
> index 769f677..d67f056 100644
> --- a/doc/guides/nics/index.rst
> +++ b/doc/guides/nics/index.rst
> @@ -53,6 +53,7 @@ Network Interface Controller Drivers
>  vhost
>  vmxnet3
>  pcap_ring
> +qede

Minor nit - this addition should be between "nfp" and "szedata2" as the list
is in alphabetical order apart from the last entry which covers two virtual
NICs in the one section.

/Bruce


[dpdk-dev] [PATCH] nfp: fixing a bug when gather

2016-04-26 Thread Alejandro Lucero
mbufs were not properly released when chained.

Fixes: b812daadad0d ("nfp: add Rx and Tx")

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 559ebe6..1259d2c 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -323,7 +323,7 @@ nfp_net_tx_queue_release_mbufs(struct nfp_net_txq *txq)

for (i = 0; i < txq->tx_count; i++) {
if (txq->txbufs[i].mbuf) {
-   rte_pktmbuf_free_seg(txq->txbufs[i].mbuf);
+   rte_pktmbuf_free(txq->txbufs[i].mbuf);
txq->txbufs[i].mbuf = NULL;
}
}
@@ -1976,11 +1976,16 @@ nfp_net_xmit_pkts(void *tx_queue, struct rte_mbuf 
**tx_pkts, uint16_t nb_pkts)
 */
pkt_size = pkt->pkt_len;

-   while (pkt_size) {
-   /* Releasing mbuf which was prefetched above */
-   if (*lmbuf)
-   rte_pktmbuf_free_seg(*lmbuf);
+   /* Releasing mbuf which was prefetched above */
+   if (*lmbuf)
+   rte_pktmbuf_free(*lmbuf);
+   /*
+* Linking mbuf with descriptor for being released
+* next time descriptor is used
+*/
+   *lmbuf = pkt;

+   while (pkt_size) {
dma_size = pkt->data_len;
dma_addr = rte_mbuf_data_dma_addr(pkt);
PMD_TX_LOG(DEBUG, "Working with mbuf at dma address:"
@@ -1994,12 +1999,6 @@ nfp_net_xmit_pkts(void *tx_queue, struct rte_mbuf 
**tx_pkts, uint16_t nb_pkts)
ASSERT(free_descs > 0);
free_descs--;

-   /*
-* Linking mbuf with descriptor for being released
-* next time descriptor is used
-*/
-   *lmbuf = pkt;
-
txq->wr_p++;
txq->tail++;
if (unlikely(txq->tail == txq->tx_count)) /* wrapping?*/
-- 
1.9.1
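
To make the difference between the two free calls concrete, here is a
minimal sketch using a hypothetical stand-in type. struct fake_mbuf,
fake_free_seg(), fake_free_chain() and count_leaked() are illustrative
names, not DPDK APIs; the 'freed' flag replaces returning a buffer to its
mempool. In DPDK, rte_pktmbuf_free_seg() releases a single segment while
rte_pktmbuf_free() releases the mbuf and all segments chained to it.

```c
#include <stddef.h>

/* Hypothetical mbuf stand-in: 'freed' replaces the return to the pool. */
struct fake_mbuf {
	struct fake_mbuf *next;  /* next segment in the chain, or NULL */
	int freed;               /* set once the segment has been released */
};

/* Analogous to rte_pktmbuf_free_seg(): release this segment only. */
static void fake_free_seg(struct fake_mbuf *m)
{
	m->freed = 1;
}

/* Analogous to rte_pktmbuf_free(): release every segment of the chain. */
static void fake_free_chain(struct fake_mbuf *m)
{
	while (m != NULL) {
		struct fake_mbuf *next = m->next;

		fake_free_seg(m);
		m = next;
	}
}

/* Count segments of the chain that were never released. */
static unsigned int count_leaked(const struct fake_mbuf *m)
{
	unsigned int leaked = 0;

	for (; m != NULL; m = m->next)
		if (!m->freed)
			leaked++;
	return leaked;
}
```

Releasing only the head of a three-segment chain leaves two segments
unreleased, which is the kind of leak this patch addresses.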



[dpdk-dev] [PATCH] nfp: add flag for enabling device hotplug

2016-04-26 Thread Alejandro Lucero
RTE_PCI_DRV_DETACHABLE is required for detaching a device
during execution.

Signed-off-by: Alejandro Lucero 
---
 drivers/net/nfp/nfp_net.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 1259d2c..ea5a2a3 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -2466,7 +2466,8 @@ static struct eth_driver rte_nfp_net_pmd = {
{
.name = "rte_nfp_net_pmd",
.id_table = pci_id_nfp_net_map,
-   .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+   .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
+RTE_PCI_DRV_DETACHABLE,
},
.eth_dev_init = nfp_net_init,
.dev_private_size = sizeof(struct nfp_net_adapter),
-- 
1.9.1



[dpdk-dev] [PATCH v6 2/8] qede: Add base driver

2016-04-26 Thread Bruce Richardson
On Mon, Apr 25, 2016 at 10:13:00PM -0700, Rasesh Mody wrote:
> The base driver is the backend module for the QLogic FastLinQ QL4
> 25G/40G CNA family of adapters as well as their virtual functions (VF)
> in SR-IOV context.
> 
> The purpose of the base module is to:
>  - provide all the common code that will be shared between the various
>drivers that would be used with said line of products. Flows such as
>chip initialization and de-initialization fall under this category.
>  - abstract the protocol-specific HW & FW components, allowing the
>protocol drivers to have clean APIs, which are detached in its
>slowpath configuration from the actual Hardware Software Interface(HSI).
> 
> This patch adds a base module without any protocol-specific bits.
> I.e., this adds a basic implementation that almost entirely falls under
> the first category.
> 
> Signed-off-by: Harish Patil 
> Signed-off-by: Rasesh Mody 
> Signed-off-by: Sony Chacko 
> ---

Checkpatch complains about a few things when I run it, notably this typo:

WARNING:TYPO_SPELLING: 'DIDNT' may be misspelled - perhaps 'DIDN'T'?
#13900: FILE: drivers/net/qede/base/ecore_hsi_tools.h:913:
+   DBG_STATUS_DATA_DIDNT_TRIGGER,

and a few complaints about:

WARNING:UNSPECIFIED_INT: Prefer 'unsigned int' to bare use of 'unsigned'
#223: FILE: drivers/net/qede/base/bcm_osal.c:90:
+   unsigned socket_id;

Regards,
/Bruce


[dpdk-dev] [PATCH v6 6/8] qede: Add attention support

2016-04-26 Thread Bruce Richardson
On Mon, Apr 25, 2016 at 10:13:04PM -0700, Rasesh Mody wrote:
> Physical link is handled by the management Firmware.
> This patch lays the infrastructure for attention handling in the driver,
> as link change notifications arrive via async attentions, as well as the
> handling of such notifications. It adds async event notification handler
> interfaces to the PMD.
> 
> Signed-off-by: Harish Patil 
> Signed-off-by: Rasesh Mody 
> Signed-off-by: Sony Chacko 
> ---
>  drivers/net/qede/base/ecore_attn_values.h |13287 
> +
>  drivers/net/qede/base/ecore_dev.c |   51 +
>  drivers/net/qede/base/ecore_int.c | 1131 +++
>  3 files changed, 14469 insertions(+)
>  create mode 100644 drivers/net/qede/base/ecore_attn_values.h
> 
I'm not familiar with the term "attentions" or "attention handling".
Would "interrupt handling" or "async event handling" not be better to use in the
title and commit message to make things more understandable to readers?

If you do want to use the term attentions in the commit message body, please
explain the term first. [I don't believe the term should be used in the commit
title, though]

Regards,
/Bruce


[dpdk-dev] [RFC] eal: provide option to set vhost_user socket owner/permissions

2016-04-26 Thread Aaron Conole
Thomas Monjalon  writes:

> 2016-04-25 21:16, Yuanhan Liu:
>> On Mon, Apr 25, 2016 at 11:18:16AM +0200, Christian Ehrhardt wrote:
>> > The API doesn't provide a way to specify an owner/permission set for
>> > vhost_user created sockets.
>> 
>> Yes, it's kind of a known issue. So, thanks for bringing it up, with
>> a solution, for discussion (cc'ed more people).
> [...]
>> > But I'd be interested if DPDK in general would be interested in:
>> > a) an approach like this?
>> 
>> You were trying to add vhost-specific stuff as an EAL command option,
>> which is something we should probably try to avoid.
>
> Yes, -1
>
>> > b) would prefer a change of the API?
>> 
>> Adding a new option to the current register API might not work well,
>> either. It gives you no ability to do a dynamic change later. I mean,
>> taking OVS as an example, OVS provides you the flexible ability to do all
>> kinds of configuration in a dynamic way, say number of rx queues. If we
>> do the permissions setup at register time, there would be no way to
>> change it later, right?
>> 
>> So, I'm thinking that we could add a new API for that? It would then
>> allow applications to change it at any time.
>
> A vhost API in the library?
> And for vhost PMD? What about devargs parameters?

I don't know the most sane solution here, other than to echo the
sentiment that a new API for this is probably appropriate. Where that
API lives, and how it looks should be hashed out. For now, I'm working
on a solution in OVS because no such API or facility exists in DPDK.

Actually, there are a number of edge cases with vhost-user sockets. I
don't want to get into all of them, but since we're discussing the API a
bit here, I'd like to also bring up the following:

  What is the desired behavior w.r.t. file cleanup when the application
  crashes, restarts, and tells DPDK to use that file again (which hasn't
  been cleaned up due to the crash)?
  At present, the vhost-user code errors out. But how does the
  application correct the situation without deleting arbitrary files on
  the filesystem?

>> > c) consider it an issue of consuming projects and let them take care?
>> 
>> It's not exactly an issue of consuming projects; we created the socket
>> file after all.
>
> Yes

Just want to reiterate at present there is no solution, so projects will
invent their own. I can point to Ubuntu and Red Hat customer bugs which
require silly workarounds like "after you started a bunch of stuff, go
to the directory and run chmod/chown."

I'm actually not opposed to any solution that seems sane. If DPDK takes
the stance that the file is specified by the application, and therefore
"file management" activities (removal, permissions, ownership, etc.) are
the responsibility of the application, so be it. If the stance is that
DPDK owns the management of the file, so be that as well. I think the
first case is easier for the library maintainers (do nothing), the
second is easier for the applications (use these semantics).

If it really is the responsibility of DPDK, then I think the only sane
approach is an API for managing this. That may require an additional
library framework to link the vhost-user PMD and rte_ethdev facilities
so that a common API could be provided.

Just my $.02.

Thanks,
Aaron


[dpdk-dev] [PATCH v2 1/4] ixgbe: rearrange vector PMD code for x86

2016-04-26 Thread Jianbo Liu
Move common code to a new file, ixgbe_rxtx_vec_common.h.
The vPMD for x86 remains implemented in ixgbe_rxtx_vec.c.

Signed-off-by: Jianbo Liu 
Suggested-by: Bruce Richardson 
---
 drivers/net/ixgbe/ixgbe_rxtx_vec.c| 256 +--
 drivers/net/ixgbe/ixgbe_rxtx_vec_common.h | 325 ++
 2 files changed, 333 insertions(+), 248 deletions(-)
 create mode 100644 drivers/net/ixgbe/ixgbe_rxtx_vec_common.h

diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec.c 
b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
index 5040704..b704a57 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx_vec.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec.c
@@ -37,6 +37,7 @@

 #include "ixgbe_ethdev.h"
 #include "ixgbe_rxtx.h"
+#include "ixgbe_rxtx_vec_common.h"

 #include 

@@ -414,69 +415,6 @@ ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf 
**rx_pkts,
return _recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, NULL);
 }

-static inline uint16_t
-reassemble_packets(struct ixgbe_rx_queue *rxq, struct rte_mbuf **rx_bufs,
-   uint16_t nb_bufs, uint8_t *split_flags)
-{
-   struct rte_mbuf *pkts[nb_bufs]; /*finished pkts*/
-   struct rte_mbuf *start = rxq->pkt_first_seg;
-   struct rte_mbuf *end =  rxq->pkt_last_seg;
-   unsigned pkt_idx, buf_idx;
-
-   for (buf_idx = 0, pkt_idx = 0; buf_idx < nb_bufs; buf_idx++) {
-   if (end != NULL) {
-   /* processing a split packet */
-   end->next = rx_bufs[buf_idx];
-   rx_bufs[buf_idx]->data_len += rxq->crc_len;
-
-   start->nb_segs++;
-   start->pkt_len += rx_bufs[buf_idx]->data_len;
-   end = end->next;
-
-   if (!split_flags[buf_idx]) {
-   /* it's the last packet of the set */
-   start->hash = end->hash;
-   start->ol_flags = end->ol_flags;
-   /* we need to strip crc for the whole packet */
-   start->pkt_len -= rxq->crc_len;
-   if (end->data_len > rxq->crc_len)
-   end->data_len -= rxq->crc_len;
-   else {
-   /* free up last mbuf */
-   struct rte_mbuf *secondlast = start;
-
-   start->nb_segs--;
-   while (secondlast->next != end)
-   secondlast = secondlast->next;
-   secondlast->data_len -= (rxq->crc_len -
-   end->data_len);
-   secondlast->next = NULL;
-   rte_pktmbuf_free_seg(end);
-   end = secondlast;
-   }
-   pkts[pkt_idx++] = start;
-   start = end = NULL;
-   }
-   } else {
-   /* not processing a split packet */
-   if (!split_flags[buf_idx]) {
-   /* not a split packet, save and skip */
-   pkts[pkt_idx++] = rx_bufs[buf_idx];
-   continue;
-   }
-   end = start = rx_bufs[buf_idx];
-   rx_bufs[buf_idx]->data_len += rxq->crc_len;
-   rx_bufs[buf_idx]->pkt_len += rxq->crc_len;
-   }
-   }
-
-   /* save the partial packet for next time */
-   rxq->pkt_first_seg = start;
-   rxq->pkt_last_seg = end;
-   memcpy(rx_bufs, pkts, pkt_idx * (sizeof(*pkts)));
-   return pkt_idx;
-}
-
 /*
  * vPMD receive routine that reassembles scattered packets
  *
@@ -539,72 +477,6 @@ vtx(volatile union ixgbe_adv_tx_desc *txdp,
vtx1(txdp, *pkt, flags);
 }

-static inline int __attribute__((always_inline))
-ixgbe_tx_free_bufs(struct ixgbe_tx_queue *txq)
-{
-   struct ixgbe_tx_entry_v *txep;
-   uint32_t status;
-   uint32_t n;
-   uint32_t i;
-   int nb_free = 0;
-   struct rte_mbuf *m, *free[RTE_IXGBE_TX_MAX_FREE_BUF_SZ];
-
-   /* check DD bit on threshold descriptor */
-   status = txq->tx_ring[txq->tx_next_dd].wb.status;
-   if (!(status & IXGBE_ADVTXD_STAT_DD))
-   return 0;
-
-   n = txq->tx_rs_thresh;
-
-   /*
-* first buffer to free from S/W ring is at index
-* tx_next_dd - (tx_rs_thresh-1)
-*/
-   txep = &txq->sw_ring_v[txq->tx_next_dd - (n - 1)];
-   m = __rte_pktmbuf_prefree_seg(txep[0].mbuf);
-   if (likely(m != NULL)) {
-   free[0] = m;
-   nb_free = 1;
-   for (i = 1; i < n; i++) {
- 

[dpdk-dev] [PATCH v2 2/4] ixgbe: implement vector PMD for arm architecture

2016-04-26 Thread Jianbo Liu
use ARM NEON intrinsic to implement ixgbe vPMD

Signed-off-by: Jianbo Liu 
---
 drivers/net/ixgbe/Makefile  |   4 +
 drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c | 556 
 2 files changed, 560 insertions(+)
 create mode 100644 drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c

diff --git a/drivers/net/ixgbe/Makefile b/drivers/net/ixgbe/Makefile
index 50bf51c..b1c7a60 100644
--- a/drivers/net/ixgbe/Makefile
+++ b/drivers/net/ixgbe/Makefile
@@ -108,7 +108,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe_fdir.c
 SRCS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe_pf.c
+ifeq ($(CONFIG_RTE_ARCH_ARM64),y)
+SRCS-$(CONFIG_RTE_IXGBE_INC_VECTOR) += ixgbe_rxtx_vec_neon.c
+else
 SRCS-$(CONFIG_RTE_IXGBE_INC_VECTOR) += ixgbe_rxtx_vec.c
+endif

 ifeq ($(CONFIG_RTE_NIC_BYPASS),y)
 SRCS-$(CONFIG_RTE_LIBRTE_IXGBE_PMD) += ixgbe_bypass.c
diff --git a/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c 
b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
new file mode 100644
index 000..2d63490
--- /dev/null
+++ b/drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c
@@ -0,0 +1,556 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+
+#include "ixgbe_ethdev.h"
+#include "ixgbe_rxtx.h"
+#include "ixgbe_rxtx_vec_common.h"
+
+#include 
+
+#pragma GCC diagnostic ignored "-Wcast-qual"
+
+static inline void
+ixgbe_rxq_rearm(struct ixgbe_rx_queue *rxq)
+{
+   int i;
+   uint16_t rx_id;
+   volatile union ixgbe_adv_rx_desc *rxdp;
+   struct ixgbe_rx_entry *rxep = &rxq->sw_ring[rxq->rxrearm_start];
+   struct rte_mbuf *mb0, *mb1;
+   uint64x2_t dma_addr0, dma_addr1;
+   uint64x2_t zero = vdupq_n_u64(0);
+   uint64_t paddr;
+   uint8x8_t p;
+
+   rxdp = rxq->rx_ring + rxq->rxrearm_start;
+
+   /* Pull 'n' more MBUFs into the software ring */
+   if (unlikely(rte_mempool_get_bulk(rxq->mb_pool,
+ (void *)rxep,
+ RTE_IXGBE_RXQ_REARM_THRESH) < 0)) {
+   if (rxq->rxrearm_nb + RTE_IXGBE_RXQ_REARM_THRESH >=
+   rxq->nb_rx_desc) {
+   for (i = 0; i < RTE_IXGBE_DESCS_PER_LOOP; i++) {
+   rxep[i].mbuf = &rxq->fake_mbuf;
+   vst1q_u64((uint64_t *)&rxdp[i].read,
+ zero);
+   }
+   }
+   rte_eth_devices[rxq->port_id].data->rx_mbuf_alloc_failed +=
+   RTE_IXGBE_RXQ_REARM_THRESH;
+   return;
+   }
+
+   p = vld1_u8((uint8_t *)&rxq->mbuf_initializer);
+
+   /* Initialize the mbufs in vector, process 2 mbufs in one loop */
+   for (i = 0; i < RTE_IXGBE_RXQ_REARM_THRESH; i += 2, rxep += 2) {
+   mb0 = rxep[0].mbuf;
+   mb1 = rxep[1].mbuf;
+
+   /*
+* Flush mbuf with pkt template.
+* Data to be rearmed is 6 bytes long.
+* Though, RX will overwrite ol_flags that are coming next
+* anyway. So overwrite whole 8 bytes with one load:
+* 6 bytes of rearm_data plus first 2 bytes of ol_flags.
+*/
+ 

[dpdk-dev] [PATCH v2 3/4] ixgbe: enable ixgbe vector PMD on ARMv8a platform

2016-04-26 Thread Jianbo Liu
Signed-off-by: Jianbo Liu 
---
 config/defconfig_arm64-armv8a-linuxapp-gcc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc 
b/config/defconfig_arm64-armv8a-linuxapp-gcc
index 9abeca4..98cc054 100644
--- a/config/defconfig_arm64-armv8a-linuxapp-gcc
+++ b/config/defconfig_arm64-armv8a-linuxapp-gcc
@@ -42,7 +42,6 @@ CONFIG_RTE_FORCE_INTRINSICS=y
 CONFIG_RTE_TOOLCHAIN="gcc"
 CONFIG_RTE_TOOLCHAIN_GCC=y

-CONFIG_RTE_IXGBE_INC_VECTOR=n
 CONFIG_RTE_LIBRTE_IVSHMEM=n
 CONFIG_RTE_LIBRTE_FM10K_PMD=n
 CONFIG_RTE_LIBRTE_I40E_PMD=n
-- 
1.8.3.1



[dpdk-dev] [PATCH v2 4/4] maintainers: claim responsibility for ixgbe vector PMD on ARM

2016-04-26 Thread Jianbo Liu
Signed-off-by: Jianbo Liu 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 1953ea2..20158e3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -142,6 +142,7 @@ F: lib/librte_eal/common/include/arch/arm/*_64.h
 F: lib/librte_acl/acl_run_neon.*
 F: lib/librte_lpm/rte_lpm_neon.h
 F: lib/librte_hash/rte*_arm64.h
+F: drivers/net/ixgbe/ixgbe_rxtx_vec_neon.c

 EZchip TILE-Gx
 M: Zhigang Lu 
-- 
1.8.3.1



[dpdk-dev] [PATCH] version: 16.07-rc0

2016-04-26 Thread Thomas Monjalon
> > After having removed the deprecated stuff, we can start pushing
> > new fixes and features in the version 16.07.
> > 
> > Signed-off-by: Thomas Monjalon 
> Acked-by: Bruce Richardson 

Applied


[dpdk-dev] [PATCH v6 8/8] qede: Enable PMD build

2016-04-26 Thread Bruce Richardson
On Mon, Apr 25, 2016 at 10:13:06PM -0700, Rasesh Mody wrote:
> This patch enables the QEDE PMD build.
> 
> Signed-off-by: Harish Patil 
> Signed-off-by: Rasesh Mody 
> Signed-off-by: Sony Chacko 
> ---
>  config/common_base   |   12 
>  drivers/net/Makefile |1 +
>  mk/rte.app.mk|2 ++
>  3 files changed, 15 insertions(+)
> 
Without any other changes, this patch can be placed two patches earlier in the
series, i.e. before the attention support and DCBX features are added in.
Ideally, it would be good to have this patch just after the core driver is added
in patch 3, but I suspect that that may be infeasible, so please just move it to
the earliest point in the series where the code can reasonably compile.

thanks,
/Bruce


[dpdk-dev] [PATCH v6 1/8] qede: Add maintainers, documentation and license

2016-04-26 Thread Bruce Richardson
On Mon, Apr 25, 2016 at 10:12:59PM -0700, Rasesh Mody wrote:
> Signed-off-by: Harish Patil 
> Signed-off-by: Rasesh Mody 
> Signed-off-by: Sony Chacko 
> ---
>  MAINTAINERS   |7 +
>  doc/guides/nics/index.rst |1 +
>  doc/guides/nics/overview.rst  |   86 +-
>  doc/guides/nics/qede.rst  |  314 
> +
>  drivers/net/qede/LICENSE.qede_pmd |   28 
>  5 files changed, 393 insertions(+), 43 deletions(-)
>  create mode 100644 doc/guides/nics/qede.rst
>  create mode 100644 drivers/net/qede/LICENSE.qede_pmd
> 
Hi,

While it's great to see the documentation for each driver coming in the same
patchset as the driver itself, can you perhaps see about merging some of the doc
changes here in with the other patches? For example, the license and maintainers
changes can probably go with the base code drop, while the NIC overview
documentation should go with the core code for the driver.

Ideally, the feature matrix would be updated as each patch adds new features,
but since this is a new driver, I'm ok with just having this as part of that
core driver patch too - with a note in the commit stating that the following
commits contain the code for the features not implemented in that one.

Thomas, John McNamara, as keen viewers and maintainers of our documentation,
any comments or strong objection to this?

Regards,
/Bruce


[dpdk-dev] [PATCH v6 1/8] qede: Add maintainers, documentation and license

2016-04-26 Thread Mcnamara, John


> -Original Message-
> From: Richardson, Bruce
> Sent: Tuesday, April 26, 2016 4:04 PM
> To: Rasesh Mody 
> Cc: thomas.monjalon at 6wind.com; dev at dpdk.org; ameen.rahman at qlogic.com;
> Harish Patil ; Sony Chacko
> ; Mcnamara, John 
> Subject: Re: [PATCH v6 1/8] qede: Add maintainers, documentation and
> license
> 
> Ideally, the feature matrix would be updated as each patch adds new
> features, but since this is a new driver, I'm ok with just having this as
> part of that core driver patch too - with a note in the commit stating
> that the following commits contain the code for the features not
> implemented in that one.
> 
> Thomas, John McNamara, as keen viewers and maintainers of our
> documentation, any comments or strong objection to this?

Hi,

No objection from me. That sounds reasonable.

John


[dpdk-dev] [PATCH v6 1/8] qede: Add maintainers, documentation and license

2016-04-26 Thread Thomas Monjalon
2016-04-26 15:19, Mcnamara, John:
> > From: Richardson, Bruce
> > Ideally, the feature matrix would be updated as each patch adds new
> > features, but since this is a new driver, I'm ok with just having this as
> > part of that core driver patch too - with a note in the commit stating
> > that the following commits contain the code for the features not
> > implemented in that one.
> > 
> > Thomas, John McNamara, as keen viewers and maintainers of our
> > documentation, any comments or strong objection to this?
> 
> Hi,
> 
> No objection from me. That sounds reasonable.

Yes, that sounds like a reasonable requirement.
Having everything split into per-feature patches would be perfect,
so do your best. Thanks


[dpdk-dev] [PATCH] examples/performance-thread: fix segfault with gcc 5.x

2016-04-26 Thread Tomasz Kulasek
It seems that with gcc >= 5.x, -O2/-O3 optimization breaks the packet grouping
algorithm in the l3fwd-thread application, causing a segfault.

When the last packet pointer "lp" and the "pnum->u64" buffer point into the
same memory, high optimization levels can cause unpredictable results. It seems
that the assignment of precalculated group sizes may interfere with the
initialization of the new group size when lp points to a value inside the
current group that should not be changed.

With gcc >= 5.x and optimization enabled, we cannot be sure which assignment
will be done first, so the group size can be counted incorrectly, causing the
segfault.

This patch eliminates intersection of assignment of initial group size
(lp[0] = 1) and precalculated group sizes when gptbl[v].idx < 4.

Fixes: d48415e1fee3 ("examples/performance-thread: add l3fwd-thread app")

Signed-off-by: Tomasz Kulasek 
---
 examples/performance-thread/l3fwd-thread/main.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/performance-thread/l3fwd-thread/main.c 
b/examples/performance-thread/l3fwd-thread/main.c
index 15c0a4d..3417fd5 100644
--- a/examples/performance-thread/l3fwd-thread/main.c
+++ b/examples/performance-thread/l3fwd-thread/main.c
@@ -1658,9 +1658,9 @@ port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp, 
__m128i dp1, __m128i dp2)

/* if dest port value has changed. */
if (v != GRPMSK) {
-   lp = pnum->u16 + gptbl[v].idx;
-   lp[0] = 1;
pnum->u64 = gptbl[v].pnum;
+   pnum->u16[FWDSTEP] = 1;
+   lp = pnum->u16 + gptbl[v].idx;
}

return lp;
-- 
1.7.9.5
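
For readers following the reordering in the diff above, here is a minimal standalone sketch of the patched store order. It is not the l3fwd-thread code: `union pnum_view`, `struct group_entry`, `pack_sizes()` and `regroup()` are hypothetical names, and the table entry is a stand-in for one gptbl[] element. It only illustrates that the 64-bit store covers slots 0..3, while the write of 1 goes to slot FWDSTEP, which the 64-bit word never overlaps.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FWDSTEP 4

/* Four 16-bit group sizes share storage with one 64-bit word;
 * a fifth 16-bit slot follows that word. */
union pnum_view {
	uint16_t u16[FWDSTEP + 1];
	uint64_t u64;
};

/* Hypothetical stand-in for one gptbl[] entry. */
struct group_entry {
	uint64_t pnum; /* precalculated sizes for slots 0..3 */
	uint16_t idx;  /* slot where the next group starts (0..FWDSTEP) */
};

/* Pack four sizes into a 64-bit word in memory order (endian-neutral). */
static uint64_t pack_sizes(const uint16_t sizes[FWDSTEP])
{
	uint64_t v;

	memcpy(&v, sizes, sizeof(v));
	return v;
}

/* Same order as the patched code: bulk store first, then slot FWDSTEP,
 * then compute the pointer without writing through it. */
static uint16_t *regroup(union pnum_view *pnum, const struct group_entry *e)
{
	assert(e->idx <= FWDSTEP);
	pnum->u64 = e->pnum;       /* touches slots 0..3 only */
	pnum->u16[FWDSTEP] = 1;    /* slot 4 lies outside the 64-bit word */
	return pnum->u16 + e->idx; /* no store through this pointer here */
}
```

Because neither store in `regroup()` can land on bytes written by the other, the result does not depend on the order in which the compiler emits them.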



[dpdk-dev] [PATCH v6 1/8] qede: Add maintainers, documentation and license

2016-04-26 Thread Mcnamara, John
Hi,

Thanks for the documentation. 

In general you should generate and view the HTML output to make sure
everything is okay:

make doc-guides-html 
firefox build/doc/html/guides/nics/qede.html &

Other comments below.


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Rasesh Mody
> Sent: Tuesday, April 26, 2016 6:13 AM
> To: thomas.monjalon at 6wind.com; Richardson, Bruce
> 
> Cc: dev at dpdk.org; ameen.rahman at qlogic.com; Rasesh Mody
> ; Harish Patil ; Sony
> Chacko 
> Subject: [dpdk-dev] [PATCH v6 1/8] qede: Add maintainers, documentation
> and license
> 
> ...
>
> +
> +Prerequisites
> +-
> +
> +- Requires firmware version **8.7.x. ** and management

Omit the space before the second "**" bold or it won't render properly.


> +  firmware version **8.7.x or higher**. Firmware may be available
> +  inbox in certain newer Linux distros under the standard directory
> +  E.g. /lib/firmware/qed/qed_init_values_zipped-8.7.7.0.bin

Paths should be wrapped in fixed-width (double backtick) quotes.


> +
> +- If the required firmware files are not available then visit
> +  `QLogic Driver Download Center `

The link requires a _ at the end to render correctly, and a full stop at
the end would be better as well.




> +- ``CONFIG_RTE_LIBRTE_QEDE_FW`` (default **""**)
> +
> +  Gives absolute path of firmware file.
> +  Eg: "/lib/firmware/qed/qed_init_values_zipped-8.7.7.0.bin"

Paths should be wrapped in fixed-width (double backtick) quotes.



> +  Empty string indicates driver will pick up the firmware file  from
> + the default location.
> +
> +Driver Compilation
> +~~
> +
> +To compile QEDE PMD for Linux x86_64 gcc target, run the following "make"
> +command::

Commands like "make" and constants should be wrapped in fixed-width (double
backtick) quotes.


> +
> +#. Bind the QLogic 4 adapters to ``igb_uio`` loaded in the
> +   previous step::
> +
> +   .. code-block:: console
> +  ./tools/dpdk_nic_bind.py --bind igb_uio :84:00.0 :84:00.1 \


The :: after step is overriding/confusing the ::console directive. Use one
or the other. Also there should be a blank line between ::console and the
text.


> +
> +**Note**: librte_pmd_qede will be used to bind to SR-IOV VF device and
> +Linux native kernel driver (QEDE) will function as SR-IOV PF

The second line of the note shouldn't be indented or else it doesn't render
correctly.

Alternatively, you could use a real RST note:

  .. Note::

 Some text here indent 3 spaces.



+
> +   Assign MAC address to the VF using iproute2 utility. The syntax is::
> +  ip link set  vf  mac 
> +

There should be a blank line between :: and the text.

John.
-- 



[dpdk-dev] [PATCH v1] doc: changed nic overview table to clarify supported features

2016-04-26 Thread Thomas Monjalon
2016-04-11 23:27, John McNamara:
> Changed symbol on NIC overview table from X to Y to help
> clarify the indicated features are supported. The X caused
> confusion for some readers.
> 
> Also, added * character to indicate partially supported
> features. This can be used in the future to direct the reader
> to more specific details in the individual NIC guides.
> 
> Signed-off-by: John McNamara 

Applied, thanks


[dpdk-dev] [PATCH] virtio: check if virtio net header could fit in mbuf headroom

2016-04-26 Thread Yuanhan Liu
On Mon, Apr 25, 2016 at 10:21:32PM +0800, Huawei Xie wrote:
> check merge-able header as it is supported.
> previously we don't support merge-able feature, so non merge-able
> header is checked.

Signed-off-by is missing here. Otherwise, this patch looks good to me

--yliu


[dpdk-dev] [PATCH v2] virtio: fix segfault when transmit pkts

2016-04-26 Thread Yuanhan Liu
On Tue, Apr 26, 2016 at 10:43:35AM +0200, Thomas Monjalon wrote:
> Talking about wording,
> 
> 2016-04-25 20:43, Yuanhan Liu:
> > ---
> > Subject: virtio: fix segfault on Tx desc flags setup
> 
> I think the english word "crash" is better than "segfault".

Yes.

Acked-by: Yuanhan Liu 

And, applied to dpdk-next-virtio, with the commit log rewording.

Thanks.

--yliu


[dpdk-dev] [RFC PATCH 0/4]: Implement module information export

2016-04-26 Thread Neil Horman
Hey-
So a few days ago we were reviewing David's patch series to introduce the
ability to dump hardware support from pmd DSOs in a human-readable format.
That effort encountered some problems, most notably the fact that stripping a
DSO removed the required information that the proposed tool keyed off, as well
as the need to dead-reckon offsets between symbols that may not be constant
(dependent on architecture).

I was going to start looking into the possibility of creating a modinfo
section in a similar fashion to what kernel modules do in Linux or BSD.  I
decided to propose this solution instead though, because the kernel-style
solution requires a significant amount of infrastructure that I think we can
maybe avoid maintaining, if we accept some minor caveats.

To do this we emit a set of well-known marker symbols for each DSO that an
external application can search for (in this case I called them
this_pmd_driver<n>, where n is a counter macro).  These marker symbols are
exported by PMDs for external access.  External tools can then access these
symbols via the dlopen/dlsym API (or via elfutils libraries).

The symbols above alias the rte_driver struct for each PMD, and the external
application can then interrogate the registered driver information.  

I also add a pointer to the pci id table struct for each PMD so that we can
export hardware support.
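
As a rough illustration of the lookup described above, an external tool could probe for the markers along these lines. This is only a sketch under assumptions: `marker_name()` and `count_markers()` are hypothetical helpers, not part of the series, and the loop assumes the markers are numbered consecutively from 0. Only the standard dlsym() call is used.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

#define MARKER_PREFIX "this_pmd_driver"

/* Build the marker name for index n, e.g. "this_pmd_driver0".
 * Returns 0 on success, -1 if the buffer is too small. */
static int marker_name(char *buf, size_t len, unsigned int n)
{
	int w = snprintf(buf, len, MARKER_PREFIX "%u", n);

	return (w < 0 || (size_t)w >= len) ? -1 : 0;
}

/* Count consecutive marker symbols visible through a dlopen() handle
 * (assumption: indices start at 0 and have no gaps). */
static unsigned int count_markers(void *handle)
{
	char name[64];
	unsigned int n = 0;

	assert(handle != NULL);
	while (marker_name(name, sizeof(name), n) == 0 &&
	       dlsym(handle, name) != NULL)
		n++;
	return n;
}
```

A caller would pass the handle returned by dlopen() on the PMD shared object; on older glibc versions the program may also need to link with -ldl.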

This approach has a few pros and cons:

pros:
1) It's simple, and doesn't require extra infrastructure to implement.  E.g. we
don't need a new tool to extract driver information and emit the C code to build
the binary data for the special section, nor do we need a custom linker script
to link said special section in place.

2) It's stable.  Because the marker symbols are explicitly exported, this
approach is resilient against stripping.

cons:
1) It creates an artifact in that PMD_REGISTER_DRIVER has to be used in one
compilation unit per DSO.  As an example em and igb effectively merge two
drivers into one DSO, and the uses of PMD_REGISTER_DRIVER occur in two separate
C files for the same single linked DSO.  Because of the use of the __COUNTER__
macro we get multiple definitions of the same marker symbols.  

I would make the argument that the downside of the above artifact isn't that big
a deal.  Traditionally in other projects a unit like a module (or DSO in our
case) only ever codifies a single driver (e.g. module_init() in the linux kernel
is only ever used once per built module).  If we have code like igb/em that
shares some core code, we should build the shared code to object files and link
them twice, once to an em.so pmd and again to an igb.so pmd.

But regardless, I thought I would propose this to see what you all thought of
it.

FWIW, here's sample output of the pmdinfo tool from this series probing the
librte_pmd_ena.so module:

[nhorman at hmsreliant dpdk]$ ./build/app/pmdinfo
~/git/dpdk/build/lib/librte_pmd_ena.so
PMD 0 Information:
Driver Name: ena_driver
Driver Type: PCI
|PCI Table|
| VENDOR ID | DEVICE ID | SUBVENDOR ID | SUBDEVICE ID |
|-|
|   1d0f|   ec20|  |  |
|   1d0f|   ec21|  |  |
|-|
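
For context on how a table like the one above could be produced from the exported pci id pointer, here is a small sketch. It makes several assumptions: `struct demo_pci_id` is a hypothetical mirror of a PCI ID entry (the real structure and field names may differ), the table is assumed to end with a zero vendor ID, and 0xffff is assumed to stand for "any" in the subsystem fields. The two vendor/device pairs are the ones shown in the sample output.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical mirror of a PCI ID entry; not the DPDK definition. */
struct demo_pci_id {
	uint16_t vendor_id;
	uint16_t device_id;
	uint16_t subsystem_vendor_id;
	uint16_t subsystem_device_id;
};

/* Print one row per entry and return the number of entries, assuming a
 * zero vendor ID terminates the table. */
static unsigned int dump_pci_table(const struct demo_pci_id *tbl, FILE *out)
{
	unsigned int n = 0;

	assert(tbl != NULL && out != NULL);
	for (; tbl[n].vendor_id != 0; n++)
		fprintf(out, "| %04x | %04x | %04x | %04x |\n",
			(unsigned int)tbl[n].vendor_id,
			(unsigned int)tbl[n].device_id,
			(unsigned int)tbl[n].subsystem_vendor_id,
			(unsigned int)tbl[n].subsystem_device_id);
	return n;
}

/* IDs taken from the sample output; 0xffff is an assumed wildcard. */
static const struct demo_pci_id demo_ena_table[] = {
	{ 0x1d0f, 0xec20, 0xffff, 0xffff },
	{ 0x1d0f, 0xec21, 0xffff, 0xffff },
	{ 0, 0, 0, 0 }, /* sentinel */
};
```

Calling `dump_pci_table(demo_ena_table, stdout)` walks the two entries and stops at the sentinel.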





[dpdk-dev] [RFC PATCH 1/4] pmd: Modify PMD_REGISTER_DRIVER to emit a marker symbol

2016-04-26 Thread Neil Horman
Modify PMD_REGISTER_DRIVER so that, when building as a DSO, PMDs emit an
additional set of symbols named this_pmd_driver<n>, where <n> is an incrementing
counter.  This gives well-known symbol names that external apps can search for
when looking up PMD information.  These new symbols are aliased to the passed-in
rte_driver struct, which future apps can use to interrogate the PMDs for useful
information.

Also modify the rte_driver struct to add a union that can hold pmd type specific
information.  Currently, only PMD_PDEV uses this to store a pointer to the PMD's
pci id table.

Signed-off-by: Neil Horman 
CC: David Marchand 
CC: Stephen Hemminger 
CC: "Richardson, Bruce" 
CC: Panu Matilainen 
CC: Thomas Monjalon 
---
 lib/librte_eal/common/include/rte_dev.h | 21 +
 1 file changed, 21 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_dev.h 
b/lib/librte_eal/common/include/rte_dev.h
index f1b5507..a81d901 100644
--- a/lib/librte_eal/common/include/rte_dev.h
+++ b/lib/librte_eal/common/include/rte_dev.h
@@ -50,6 +50,9 @@ extern "C" {
 #include 

 #include 
+#include 
+#include 
+#include 

 __attribute__((format(printf, 2, 0)))
 static inline void
@@ -131,6 +134,9 @@ struct rte_driver {
const char *name;   /**< Driver name. */
rte_dev_init_t *init;  /**< Device init. function. */
rte_dev_uninit_t *uninit;  /**< Device uninit. function. */
+   union {
+   const struct rte_pci_id *pci_table;
+   };
 };

 /**
@@ -178,12 +184,27 @@ int rte_eal_vdev_init(const char *name, const char *args);
  */
 int rte_eal_vdev_uninit(const char *name);

+#ifdef RTE_BUILD_SHARED_LIB 
+#define DRIVER_EXPORT_NAME(name, idx) name##idx
+#define DECLARE_DRIVER_EXPORT(src, idx)\
+extern struct rte_driver DRIVER_EXPORT_NAME(this_pmd_driver, idx)\
+ __attribute__((alias(RTE_STR(src))))
+
+#define PMD_REGISTER_DRIVER(d)\
+void devinitfn_ ##d(void);\
+void __attribute__((constructor, used)) devinitfn_ ##d(void)\
+{\
+   rte_eal_driver_register(&d);\
+}\
+DECLARE_DRIVER_EXPORT(d, __COUNTER__)
+#else
 #define PMD_REGISTER_DRIVER(d)\
 void devinitfn_ ##d(void);\
 void __attribute__((constructor, used)) devinitfn_ ##d(void)\
 {\
rte_eal_driver_register(&d);\
 }
+#endif

 #ifdef __cplusplus
 }
-- 
2.5.5
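
The aliasing mechanism used by the macro above can be tried in isolation with a sketch like the following. It assumes GCC or Clang on an ELF target (the alias attribute is not available everywhere). `struct demo_driver`, the DEMO_* macros and `marker_matches()` are hypothetical stand-ins, and the index is written as a literal 0 here where the patch passes __COUNTER__.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_driver. */
struct demo_driver {
	const char *name;
};

#define DEMO_STR_(x) #x
#define DEMO_STR(x) DEMO_STR_(x)
#define DEMO_EXPORT_NAME(name, idx) name##idx

/* Declare a well-known global symbol that aliases the driver object. */
#define DEMO_DECLARE_EXPORT(src, idx) \
	extern struct demo_driver DEMO_EXPORT_NAME(this_pmd_driver, idx) \
	__attribute__((alias(DEMO_STR(src))))

/* The driver object itself can stay file-local, as in the PMDs. */
static struct demo_driver demo_drv = { .name = "demo_driver" };

/* Expands to: extern struct demo_driver this_pmd_driver0
 *             __attribute__((alias("demo_drv"))); */
DEMO_DECLARE_EXPORT(demo_drv, 0);

/* Read the driver through the marker symbol rather than the static name. */
static int marker_matches(const char *expected)
{
	assert(expected != NULL);
	return strcmp(this_pmd_driver0.name, expected) == 0;
}
```

Both names refer to the same object, so a tool that resolves this_pmd_driver0 sees exactly what the driver registered.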



[dpdk-dev] [RFC PATCH 2/4] pmds: export this_pmd_driver* symbols

2016-04-26 Thread Neil Horman
Because the DPDK DSO's are opt-in for symbol export, we need to add the symbols
that the modified PMD_REGISTER_DRIVER macro creates so that external
applications can see them

Signed-off-by: Neil Horman 
CC: David Marchand 
CC: Stephen Hemminger 
CC: "Richardson, Bruce" 
CC: Panu Matilainen 
CC: Thomas Monjalon 
---
 drivers/crypto/aesni_gcm/rte_pmd_aesni_gcm_version.map | 1 +
 drivers/crypto/aesni_mb/rte_pmd_aesni_version.map  | 1 +
 drivers/crypto/null/rte_pmd_null_crypto_version.map| 1 +
 drivers/crypto/qat/rte_pmd_qat_version.map | 3 ++-
 drivers/crypto/snow3g/rte_pmd_snow3g_version.map   | 1 +
 drivers/net/af_packet/rte_pmd_af_packet_version.map| 2 +-
 drivers/net/bnx2x/rte_pmd_bnx2x_version.map| 1 +
 drivers/net/bonding/rte_eth_bond_version.map   | 1 +
 drivers/net/cxgbe/rte_pmd_cxgbe_version.map| 2 +-
 drivers/net/e1000/rte_pmd_e1000_version.map| 2 +-
 drivers/net/ena/rte_pmd_ena_version.map| 1 +
 drivers/net/enic/rte_pmd_enic_version.map  | 1 +
 drivers/net/fm10k/rte_pmd_fm10k_version.map| 1 +
 drivers/net/i40e/rte_pmd_i40e_version.map  | 1 +
 drivers/net/ixgbe/rte_pmd_ixgbe_version.map| 2 +-
 drivers/net/mlx4/rte_pmd_mlx4_version.map  | 1 +
 drivers/net/mlx5/rte_pmd_mlx5_version.map  | 1 +
 drivers/net/mpipe/rte_pmd_mpipe_version.map| 1 +
 drivers/net/nfp/rte_pmd_nfp_version.map| 1 +
 drivers/net/null/rte_pmd_null_version.map  | 2 +-
 drivers/net/pcap/rte_pmd_pcap_version.map  | 2 +-
 drivers/net/szedata2/rte_pmd_szedata2_version.map  | 1 +
 drivers/net/vhost/rte_pmd_vhost_version.map| 1 +
 drivers/net/virtio/rte_pmd_virtio_version.map  | 2 +-
 drivers/net/vmxnet3/rte_pmd_vmxnet3_version.map| 2 +-
 25 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/drivers/crypto/aesni_gcm/rte_pmd_aesni_gcm_version.map 
b/drivers/crypto/aesni_gcm/rte_pmd_aesni_gcm_version.map
index dc4d417..62341f9 100644
--- a/drivers/crypto/aesni_gcm/rte_pmd_aesni_gcm_version.map
+++ b/drivers/crypto/aesni_gcm/rte_pmd_aesni_gcm_version.map
@@ -1,3 +1,4 @@
 DPDK_16.04 {
+   global: this_pmd_driver*;
local: *;
 };
diff --git a/drivers/crypto/aesni_mb/rte_pmd_aesni_version.map 
b/drivers/crypto/aesni_mb/rte_pmd_aesni_version.map
index ad607bb..6f727b0 100644
--- a/drivers/crypto/aesni_mb/rte_pmd_aesni_version.map
+++ b/drivers/crypto/aesni_mb/rte_pmd_aesni_version.map
@@ -1,3 +1,4 @@
 DPDK_2.2 {
+   global: this_pmd_driver*;
local: *;
 };
diff --git a/drivers/crypto/null/rte_pmd_null_crypto_version.map 
b/drivers/crypto/null/rte_pmd_null_crypto_version.map
index dc4d417..62341f9 100644
--- a/drivers/crypto/null/rte_pmd_null_crypto_version.map
+++ b/drivers/crypto/null/rte_pmd_null_crypto_version.map
@@ -1,3 +1,4 @@
 DPDK_16.04 {
+   global: this_pmd_driver*;
local: *;
 };
diff --git a/drivers/crypto/qat/rte_pmd_qat_version.map 
b/drivers/crypto/qat/rte_pmd_qat_version.map
index bbaf1c8..6f727b0 100644
--- a/drivers/crypto/qat/rte_pmd_qat_version.map
+++ b/drivers/crypto/qat/rte_pmd_qat_version.map
@@ -1,3 +1,4 @@
 DPDK_2.2 {
+   global: this_pmd_driver*;
local: *;
-};
\ No newline at end of file
+};
diff --git a/drivers/crypto/snow3g/rte_pmd_snow3g_version.map 
b/drivers/crypto/snow3g/rte_pmd_snow3g_version.map
index dc4d417..62341f9 100644
--- a/drivers/crypto/snow3g/rte_pmd_snow3g_version.map
+++ b/drivers/crypto/snow3g/rte_pmd_snow3g_version.map
@@ -1,3 +1,4 @@
 DPDK_16.04 {
+   global: this_pmd_driver*;
local: *;
 };
diff --git a/drivers/net/af_packet/rte_pmd_af_packet_version.map 
b/drivers/net/af_packet/rte_pmd_af_packet_version.map
index ef35398..55e2bb1 100644
--- a/drivers/net/af_packet/rte_pmd_af_packet_version.map
+++ b/drivers/net/af_packet/rte_pmd_af_packet_version.map
@@ -1,4 +1,4 @@
 DPDK_2.0 {
-
+   global: this_pmd_driver*;
local: *;
 };
diff --git a/drivers/net/bnx2x/rte_pmd_bnx2x_version.map 
b/drivers/net/bnx2x/rte_pmd_bnx2x_version.map
index bd8138a..0fccfa3 100644
--- a/drivers/net/bnx2x/rte_pmd_bnx2x_version.map
+++ b/drivers/net/bnx2x/rte_pmd_bnx2x_version.map
@@ -1,4 +1,5 @@
 DPDK_2.1 {
+   global: this_pmd_driver*;

local: *;
 };
diff --git a/drivers/net/bonding/rte_eth_bond_version.map 
b/drivers/net/bonding/rte_eth_bond_version.map
index 22bd920..1071960 100644
--- a/drivers/net/bonding/rte_eth_bond_version.map
+++ b/drivers/net/bonding/rte_eth_bond_version.map
@@ -17,6 +17,7 @@ DPDK_2.0 {
rte_eth_bond_slaves_get;
rte_eth_bond_xmit_policy_get;
rte_eth_bond_xmit_policy_set;
+   this_pmd_driver*;

local: *;
 };
diff --git a/drivers/net/cxgbe/rte_pmd_cxgbe_version.map 
b/drivers/net/cxgbe/rte_pmd_cxgbe_version.map
index bd8138a..6d92937 100644
--- a/drivers/net/cxgbe/rte_pmd_cxgbe_version.map
+++ b/drivers/net/cxgbe/rte_pmd_cxgbe_versio

[dpdk-dev] [RFC PATCH 3/4] pmd: Modify drivers to export appropriate information

2016-04-26 Thread Neil Horman
For the PMDs which support PCI devices, add the appropriate PCI table pointer
to the rte_driver structure so external applications can find it.

Also note some modifications to the em/igb and i40e drivers.  These are done to
support an artifact of the use of the __COUNTER__ macro in PMD_REGISTER_DRIVER.
The __COUNTER__ macro evaluates to an integer that gets incremented each
time it expands.  While that offers a great predictable indexing feature, it
resets for each compilation unit (as would be expected).  However, the two
aforementioned DSOs register multiple drivers in multiple compilation
units, which leads to multiple definitions of variables.  I would make the
argument that a single DSO should support only a single driver (which is
analogous to how kernel modules work in Linux and BSD), and if there is common
code between multiple modules, that common code should be built into an archive
and linked to each separate DSO.  However, that is a significant amount of work,
and so here I instead modified the affected DSOs to simply register all
drivers in the same compilation unit.

Signed-off-by: Neil Horman 
CC: David Marchand 
CC: Stephen Hemminger 
CC: "Richardson, Bruce" 
CC: Panu Matilainen 
CC: Thomas Monjalon 
---
 drivers/crypto/qat/rte_qat_cryptodev.c  |  1 +
 drivers/net/bnx2x/bnx2x_ethdev.c|  2 ++
 drivers/net/cxgbe/cxgbe_ethdev.c|  1 +
 drivers/net/e1000/Makefile  |  1 +
 drivers/net/e1000/em_ethdev.c   | 13 +++--
 drivers/net/e1000/igb_ethdev.c  | 24 ++---
 drivers/net/e1000/pmds.c| 48 +
 drivers/net/ena/ena_ethdev.c|  1 +
 drivers/net/enic/enic_ethdev.c  |  1 +
 drivers/net/fm10k/fm10k_ethdev.c|  1 +
 drivers/net/i40e/i40e_ethdev.c  | 14 ++
 drivers/net/i40e/i40e_ethdev_vf.c   | 14 --
 drivers/net/ixgbe/ixgbe_ethdev.c|  2 ++
 drivers/net/mlx4/mlx4.c |  1 +
 drivers/net/mlx5/mlx5.c |  1 +
 drivers/net/nfp/nfp_net.c   |  1 +
 drivers/net/szedata2/rte_eth_szedata2.c |  1 +
 drivers/net/virtio/virtio_ethdev.c  |  1 +
 drivers/net/vmxnet3/vmxnet3_ethdev.c|  1 +
 19 files changed, 95 insertions(+), 34 deletions(-)
 create mode 100644 drivers/net/e1000/pmds.c

diff --git a/drivers/crypto/qat/rte_qat_cryptodev.c 
b/drivers/crypto/qat/rte_qat_cryptodev.c
index a7912f5..20426e8 100644
--- a/drivers/crypto/qat/rte_qat_cryptodev.c
+++ b/drivers/crypto/qat/rte_qat_cryptodev.c
@@ -135,6 +135,7 @@ rte_qat_pmd_init(const char *name __rte_unused, const char 
*params __rte_unused)
 static struct rte_driver pmd_qat_drv = {
.type = PMD_PDEV,
.init = rte_qat_pmd_init,
+   .pci_table = pci_id_qat_map,
 };

 PMD_REGISTER_DRIVER(pmd_qat_drv);
diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c b/drivers/net/bnx2x/bnx2x_ethdev.c
index 071b44f..ba7d009 100644
--- a/drivers/net/bnx2x/bnx2x_ethdev.c
+++ b/drivers/net/bnx2x/bnx2x_ethdev.c
@@ -543,11 +543,13 @@ static int rte_bnx2xvf_pmd_init(const char *name 
__rte_unused, const char *param
 static struct rte_driver rte_bnx2x_driver = {
.type = PMD_PDEV,
.init = rte_bnx2x_pmd_init,
+   .pci_table = pci_id_bnx2x_map,
 };

 static struct rte_driver rte_bnx2xvf_driver = {
.type = PMD_PDEV,
.init = rte_bnx2xvf_pmd_init,
+   .pci_table = pci_id_bnx2xvf_map,
 };

 PMD_REGISTER_DRIVER(rte_bnx2x_driver);
diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
index 04eddaf..c047096 100644
--- a/drivers/net/cxgbe/cxgbe_ethdev.c
+++ b/drivers/net/cxgbe/cxgbe_ethdev.c
@@ -892,6 +892,7 @@ static struct rte_driver rte_cxgbe_driver = {
.name = "cxgbe_driver",
.type = PMD_PDEV,
.init = rte_cxgbe_pmd_init,
+   .pci_table = cxgb4_pci_tbl,
 };

 PMD_REGISTER_DRIVER(rte_cxgbe_driver);
diff --git a/drivers/net/e1000/Makefile b/drivers/net/e1000/Makefile
index f4879e6..058faff 100644
--- a/drivers/net/e1000/Makefile
+++ b/drivers/net/e1000/Makefile
@@ -93,6 +93,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_IGB_PMD) += igb_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_IGB_PMD) += igb_pf.c
 SRCS-$(CONFIG_RTE_LIBRTE_EM_PMD) += em_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_EM_PMD) += em_rxtx.c
+SRCS-y += pmds.c

 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_E1000_PMD) += lib/librte_eal lib/librte_ether
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 653be09..62bd811 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -60,7 +60,8 @@

 #define PMD_ROUNDUP(x,y)   (((x) + (y) - 1)/(y) * (y))

-
+int
+rte_em_pmd_init(const char *name __rte_unused, const char *params 
__rte_unused);
 static int eth_em_configure(struct rte_eth_dev *dev);
 static int eth_em_start(struct rte_eth_dev *dev);
 static void eth_em_stop(struct rte_eth_dev *dev);
@@ -136,7 +137,7 @@ static enum e1000_fc_mode em_fc_setting = e1000_fc_full;
 /*
  * 

[dpdk-dev] [RFC PATCH 4/4] pmdinfo: Add application to extract pmd driver info

2016-04-26 Thread Neil Horman
This tool uses the prior infrastructure to provide human readable information to
a user about the devices which a given pmd DSO supports.

Usage:
pmdinfo /path/to/driver/pmd

pmdinfo dlopens the specified file, then iteratively looks up the
this_pmd_driver symbol.  For each symbol found in the DSO, it prints the
information found in the corresponding rte_driver struct.
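
As a rough illustration of the lookup loop described above, here is a
minimal sketch. It assumes the exported symbols carry a numeric suffix
(this_pmd_driver0, this_pmd_driver1, ...); that naming, and every
sketch_* name below, is hypothetical and not taken from the patch. The
lookup is passed in as a function pointer with the same shape as dlsym(3),
and a small table-based stand-in is included so the loop can be exercised
without a real DSO.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Same shape as dlsym(3): resolve 'name' in the object behind 'handle'. */
typedef void *(*sketch_lookup_fn)(void *handle, const char *name);

/* Called once for every driver symbol that resolves. */
typedef void (*sketch_driver_cb)(int idx, void *driver, void *arg);

/*
 * Look up this_pmd_driver0, this_pmd_driver1, ... and stop at the first
 * name that does not resolve. The numeric suffix is an assumption made
 * for this sketch. Returns the number of symbols found.
 */
static int
sketch_walk_pmd_drivers(void *handle, sketch_lookup_fn lookup,
			sketch_driver_cb cb, void *arg)
{
	char name[64];
	int idx;

	for (idx = 0; ; idx++) {
		void *drv;

		snprintf(name, sizeof(name), "this_pmd_driver%d", idx);
		drv = lookup(handle, name);
		if (drv == NULL)
			break;
		if (cb != NULL)
			cb(idx, drv, arg);
	}
	return idx;
}

/* Stand-in for a DSO: a NULL-terminated table of name/address pairs. */
struct sketch_sym {
	const char *name;
	void *addr;
};

/* Stand-in for dlsym(): 'handle' points at a struct sketch_sym table. */
static void *
sketch_table_lookup(void *handle, const char *name)
{
	const struct sketch_sym *s;

	for (s = handle; s->name != NULL; s++) {
		if (strcmp(s->name, name) == 0)
			return s->addr;
	}
	return NULL;
}
```

With a real DSO, the handle would come from dlopen(path, RTLD_LAZY) and
dlsym itself could be passed as the lookup function, with the callback
printing the fields of each struct found.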

Signed-off-by: Neil Horman 
CC: David Marchand 
CC: Stephen Hemminger 
CC: "Richardson, Bruce" 
CC: Panu Matilainen 
CC: Thomas Monjalon 
---
 app/Makefile  |  1 +
 app/pmdinfo/Makefile  | 55 +
 app/pmdinfo/pmdinfo.c | 96 +++
 3 files changed, 152 insertions(+)
 create mode 100644 app/pmdinfo/Makefile
 create mode 100644 app/pmdinfo/pmdinfo.c

diff --git a/app/Makefile b/app/Makefile
index 1151e09..42ea130 100644
--- a/app/Makefile
+++ b/app/Makefile
@@ -37,5 +37,6 @@ DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += test-pipeline
 DIRS-$(CONFIG_RTE_TEST_PMD) += test-pmd
 DIRS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_test
 DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += proc_info
+DIRS-$(CONFIG_RTE_BUILD_SHARED_LIB) += pmdinfo

 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/app/pmdinfo/Makefile b/app/pmdinfo/Makefile
new file mode 100644
index 000..eb38aab
--- /dev/null
+++ b/app/pmdinfo/Makefile
@@ -0,0 +1,55 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
+
+#
+# library name
+#
+APP = pmdinfo
+
+CFLAGS += -Os -g
+CFLAGS += $(WERROR_FLAGS)
+
+LDLIBS := -ldl
+
+DEPDIRS-y += lib
+
+#
+# all source are stored in SRCS-y
+#
+SRCS-y := pmdinfo.c
+
+include $(RTE_SDK)/mk/rte.app.mk
+
+endif
diff --git a/app/pmdinfo/pmdinfo.c b/app/pmdinfo/pmdinfo.c
new file mode 100644
index 000..7db61b2
--- /dev/null
+++ b/app/pmdinfo/pmdinfo.c
@@ -0,0 +1,96 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+
+
+static void dump_pci_table(struct rte_driver *driver)
+{
+   int i;
+
+   if (!driver->pci_table) {
+   printf(" No PCI Table defined for this driver\n");
+   return;
+   }
+
+   printf("|PCI Table|\n");
+   printf("| VENDOR ID | DEVICE ID | SUBVENDOR ID | SUBDEVICE ID |\n");
+   printf("|-|\n");
+   for (i=0; driver->pci_table[i].vendor_id != 0; i++) {
+   printf("|%11x|%11x|%14x|%14x|\n",
+   driver->pci_table[i].vendor_id, driver->pci_table[i].device_id,
+   driver->pci_table[i].subsystem_vendor_id,
+   driver->pci_table[i].subsystem_device_id);
+   }
+   printf("|-|\n");
+}
+
+static void dump_driver_info(int idx, struct rte_driver *driver)
+{
+   printf("PMD %d Information:\n", idx);
+   printf("Driver Name: %s\n", driver->name);
+
+   switch (driver->type) {
+   case PMD_VDEV:
+   printf("Driver Type: Virtual\n");
+   break;
+   case PMD_PDEV:
+   printf("Driver Type: PCI\n");
+   dump_pci_table(driver);
+   break;
+   default:
+   printf("Driver Type: UNKNOWN (%d)\n", driver->type);
+   break;
+   }
+}
+
+int main(int argc,

[dpdk-dev] [PATCH v6 1/8] qede: Add maintainers, documentation and license

2016-04-26 Thread Rasesh Mody
> From: Bruce Richardson [mailto:bruce.richardson at intel.com]
> Sent: Tuesday, April 26, 2016 6:03 AM
> 
> On Mon, Apr 25, 2016 at 10:12:59PM -0700, Rasesh Mody wrote:
> > Signed-off-by: Harish Patil 
> > Signed-off-by: Rasesh Mody 
> > Signed-off-by: Sony Chacko 
> > ---
> >  MAINTAINERS   |7 +
> >  doc/guides/nics/index.rst |1 +
> >  doc/guides/nics/overview.rst  |   86 +-
> >  doc/guides/nics/qede.rst  |  314
> +
> >  drivers/net/qede/LICENSE.qede_pmd |   28 
> >  5 files changed, 393 insertions(+), 43 deletions(-)  create mode
> > 100644 doc/guides/nics/qede.rst  create mode 100644
> > drivers/net/qede/LICENSE.qede_pmd
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS index 1953ea2..ba4053a 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -332,6 +332,13 @@ M: Rasesh Mody 
> >  F: drivers/net/bnx2x/
> >  F: doc/guides/nics/bnx2x.rst
> >
> > +QLogic qede PMD
> > +M: Harish Patil 
> > +M: Rasesh Mody 
> > +M: Sony Chacko 
> > +F: drivers/net/qede/
> > +F: doc/guides/nics/qede.rst
> > +
> >  RedHat virtio
> >  M: Huawei Xie 
> >  M: Yuanhan Liu  diff --git
> > a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst index
> > 769f677..d67f056 100644
> > --- a/doc/guides/nics/index.rst
> > +++ b/doc/guides/nics/index.rst
> > @@ -53,6 +53,7 @@ Network Interface Controller Drivers
> >  vhost
> >  vmxnet3
> >  pcap_ring
> > +qede
> 
> Minor nit - this addition should be between "nfp" and "szedata2" as the list 
> is
> in alphabetical order apart from the last entry which covers two virtual NICs
> in the one section.
> 
> /Bruce

During our v5 submission we received a comment to keep the logical order by
adding qede below bnx2x, hence the change. We can place it between "nfp" and
"szedata2" to maintain the alphabetical order.

Thanks!
Rasesh


[dpdk-dev] [PATCH v6 1/8] qede: Add maintainers, documentation and license

2016-04-26 Thread Thomas Monjalon
2016-04-26 18:27, Rasesh Mody:
> > From: Bruce Richardson [mailto:bruce.richardson at intel.com]
> > Sent: Tuesday, April 26, 2016 6:03 AM
> > 
> > On Mon, Apr 25, 2016 at 10:12:59PM -0700, Rasesh Mody wrote:
> > > Signed-off-by: Harish Patil 
> > > Signed-off-by: Rasesh Mody 
> > > Signed-off-by: Sony Chacko 
> > > ---
> > >  MAINTAINERS   |7 +
> > >  doc/guides/nics/index.rst |1 +
> > >  doc/guides/nics/overview.rst  |   86 +-
> > >  doc/guides/nics/qede.rst  |  314
> > +
> > >  drivers/net/qede/LICENSE.qede_pmd |   28 
> > >  5 files changed, 393 insertions(+), 43 deletions(-)  create mode
> > > 100644 doc/guides/nics/qede.rst  create mode 100644
> > > drivers/net/qede/LICENSE.qede_pmd
> > >
> > > diff --git a/MAINTAINERS b/MAINTAINERS index 1953ea2..ba4053a 100644
> > > --- a/MAINTAINERS
> > > +++ b/MAINTAINERS
> > > @@ -332,6 +332,13 @@ M: Rasesh Mody 
> > >  F: drivers/net/bnx2x/
> > >  F: doc/guides/nics/bnx2x.rst
> > >
> > > +QLogic qede PMD
> > > +M: Harish Patil 
> > > +M: Rasesh Mody 
> > > +M: Sony Chacko 
> > > +F: drivers/net/qede/
> > > +F: doc/guides/nics/qede.rst
> > > +
> > >  RedHat virtio
> > >  M: Huawei Xie 
> > >  M: Yuanhan Liu  diff --git
> > > a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst index
> > > 769f677..d67f056 100644
> > > --- a/doc/guides/nics/index.rst
> > > +++ b/doc/guides/nics/index.rst
> > > @@ -53,6 +53,7 @@ Network Interface Controller Drivers
> > >  vhost
> > >  vmxnet3
> > >  pcap_ring
> > > +qede
> > 
> > Minor nit - this addition should be between "nfp" and "szedata2" as the 
> > list is
> > in alphabetical order apart from the last entry which covers two virtual 
> > NICs
> > in the one section.
> > 
> > /Bruce
> 
> During our v5 submission we received a comment to keep the logic order by 
> adding qede below bnx2x, hence the change. We can place it under "nfp" and 
> "szedata2" to maintain the alphabetical order.

I think you are talking about 2 different things:
the MAINTAINERS file and doc/guides/nics.
For MAINTAINERS, this patch is right.
For doc, it is not.


[dpdk-dev] [PATCH v6 1/8] qede: Add maintainers, documentation and license

2016-04-26 Thread Rasesh Mody
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, April 26, 2016 12:34 PM
> 
> 2016-04-26 18:27, Rasesh Mody:
> > > From: Bruce Richardson [mailto:bruce.richardson at intel.com]
> > > Sent: Tuesday, April 26, 2016 6:03 AM
> > >
> > > On Mon, Apr 25, 2016 at 10:12:59PM -0700, Rasesh Mody wrote:
> > > > Signed-off-by: Harish Patil 
> > > > Signed-off-by: Rasesh Mody 
> > > > Signed-off-by: Sony Chacko 
> > > > ---
> > > >  MAINTAINERS   |7 +
> > > >  doc/guides/nics/index.rst |1 +
> > > >  doc/guides/nics/overview.rst  |   86 +-
> > > >  doc/guides/nics/qede.rst  |  314
> > > +
> > > >  drivers/net/qede/LICENSE.qede_pmd |   28 
> > > >  5 files changed, 393 insertions(+), 43 deletions(-)  create mode
> > > > 100644 doc/guides/nics/qede.rst  create mode 100644
> > > > drivers/net/qede/LICENSE.qede_pmd
> > > >
> > > > diff --git a/MAINTAINERS b/MAINTAINERS index 1953ea2..ba4053a
> > > > 100644
> > > > --- a/MAINTAINERS
> > > > +++ b/MAINTAINERS
> > > > @@ -332,6 +332,13 @@ M: Rasesh Mody 
> > > >  F: drivers/net/bnx2x/
> > > >  F: doc/guides/nics/bnx2x.rst
> > > >
> > > > +QLogic qede PMD
> > > > +M: Harish Patil 
> > > > +M: Rasesh Mody 
> > > > +M: Sony Chacko 
> > > > +F: drivers/net/qede/
> > > > +F: doc/guides/nics/qede.rst
> > > > +
> > > >  RedHat virtio
> > > >  M: Huawei Xie 
> > > >  M: Yuanhan Liu  diff --git
> > > > a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst index
> > > > 769f677..d67f056 100644
> > > > --- a/doc/guides/nics/index.rst
> > > > +++ b/doc/guides/nics/index.rst
> > > > @@ -53,6 +53,7 @@ Network Interface Controller Drivers
> > > >  vhost
> > > >  vmxnet3
> > > >  pcap_ring
> > > > +qede
> > >
> > > Minor nit - this addition should be between "nfp" and "szedata2" as
> > > the list is in alphabetical order apart from the last entry which
> > > covers two virtual NICs in the one section.
> > >
> > > /Bruce
> >
> > During our v5 submission we received a comment to keep the logic order
> by adding qede below bnx2x, hence the change. We can place it under "nfp"
> and "szedata2" to maintain the alphabetical order.
> 
> I think you are talking of 2 different things:
> MAINTAINERS file et doc/guides/nics.
> For MAINTAINERS, this patch is right.
> For doc, it is not.

You are right, I overlooked it, will modify it.
Thanks!
Rasesh


[dpdk-dev] [PATCH] enic: fix 'imissed' to count drops due to no RX buffers

2016-04-26 Thread John Daley
Fixes: 7182d3e7d177 ("enic: expose Rx missed packets counter")
Signed-off-by: John Daley 
---
 drivers/net/enic/enic_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 60fe765..be4e9e5 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -243,10 +243,10 @@ void enic_dev_stats_get(struct enic *enic, struct 
rte_eth_stats *r_stats)
r_stats->ibytes = stats->rx.rx_bytes_ok;
r_stats->obytes = stats->tx.tx_bytes_ok;

-   r_stats->ierrors = stats->rx.rx_errors;
+   r_stats->ierrors = stats->rx.rx_errors + stats->rx.rx_drop;
r_stats->oerrors = stats->tx.tx_errors;

-   r_stats->imissed = stats->rx.rx_drop;
+   r_stats->imissed = stats->rx.rx_no_bufs;

r_stats->rx_nombuf = stats->rx.rx_no_bufs;
 }
-- 
2.7.0



[dpdk-dev] [PATCH] enic: fix misalignment of Rx mbuf data

2016-04-26 Thread John Daley
Data DMA used m->data_off of uninitialized mbufs instead of
RTE_PKTMBUF_HEADROOM, potentially causing Rx data to be
placed at the wrong alignment in the mbuf.
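
To make the intent of the fix concrete, here is a small stand-alone sketch
of the descriptor arithmetic. The sketch_* types and SKETCH_HEADROOM are
stand-ins invented for this illustration (the real code uses struct
rte_mbuf and RTE_PKTMBUF_HEADROOM); the value 128 is only an assumed
headroom size.

```c
#include <stdint.h>

/* Stand-in for RTE_PKTMBUF_HEADROOM; 128 is an assumed value. */
#define SKETCH_HEADROOM 128

/* Reduced stand-in for the mbuf fields involved in the fix. */
struct sketch_mbuf {
	uint64_t buf_physaddr;	/* physical address of the buffer start */
	uint16_t buf_len;	/* total buffer length, headroom included */
	uint16_t data_off;	/* not yet meaningful on a fresh mbuf */
};

/* Reduced stand-in for a receive descriptor. */
struct sketch_rx_desc {
	uint64_t address;
	uint16_t length;
};

/*
 * Program the descriptor from the fixed headroom rather than from
 * data_off, and shrink the advertised length by the same amount so the
 * NIC cannot write past the end of the buffer.
 */
static void
sketch_fill_rx_desc(struct sketch_rx_desc *d, const struct sketch_mbuf *m)
{
	d->address = m->buf_physaddr + SKETCH_HEADROOM;
	d->length = (uint16_t)(m->buf_len - SKETCH_HEADROOM);
}
```

Whatever data_off happens to contain, the descriptor always points at the
same offset into the buffer, which matches the data_off value assigned when
the rest of the mbuf is filled in on receive.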

Fixes: 947d860c821f ("enic: improve Rx performance")
Signed-off-by: John Daley 
---
 drivers/net/enic/enic_main.c | 5 +++--
 drivers/net/enic/enic_rx.c   | 6 --
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index be4e9e5..646d87f 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -354,10 +354,11 @@ enic_alloc_rx_queue_mbufs(struct enic *enic, struct 
vnic_rq *rq)
return -ENOMEM;
}

-   dma_addr = (dma_addr_t)(mb->buf_physaddr + mb->data_off);
+   dma_addr = (dma_addr_t)(mb->buf_physaddr
+  + RTE_PKTMBUF_HEADROOM);

rq_enet_desc_enc(rqd, dma_addr, RQ_ENET_TYPE_ONLY_SOP,
-mb->buf_len);
+mb->buf_len - RTE_PKTMBUF_HEADROOM);
rq->mbuf_ring[i] = mb;
}

diff --git a/drivers/net/enic/enic_rx.c b/drivers/net/enic/enic_rx.c
index 232987a..39bb55c 100644
--- a/drivers/net/enic/enic_rx.c
+++ b/drivers/net/enic/enic_rx.c
@@ -314,9 +314,11 @@ enic_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 + rx_id);

/* Push descriptor for newly allocated mbuf */
-   dma_addr = (dma_addr_t)(nmb->buf_physaddr + nmb->data_off);
+   dma_addr = (dma_addr_t)(nmb->buf_physaddr
+  + RTE_PKTMBUF_HEADROOM);
rqd_ptr->address = rte_cpu_to_le_64(dma_addr);
-   rqd_ptr->length_type = cpu_to_le16(nmb->buf_len);
+   rqd_ptr->length_type = cpu_to_le16(nmb->buf_len
+  - RTE_PKTMBUF_HEADROOM);

/* Fill in the rest of the mbuf */
rxmb->data_off = RTE_PKTMBUF_HEADROOM;
-- 
2.7.0



[dpdk-dev] [PATCH] enic: Optimization of Tx path to reduce Host CPU overhead, cleanup

2016-04-26 Thread John Daley
Optimizations and cleanup:
- flatten packet send path
- flatten mbuf free path
- disable CQ entry writing and use CQ messages instead
- use rte_mempool_put_bulk() to bulk return freed mbufs
- remove unnecessary fields from the vnic_bufs struct and use a contiguous
  array of cache aligned divisible elements with no next pointers
- use local variables inside the per-packet loop instead of fields in structs
- factor bookkeeping out of the per-packet tx loop where possible
  (removed several conditionals)
- put Tx and Rx code in 1 file (enic_rxtx.c)
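
The bulk-return item in the list above can be pictured roughly as follows.
This is only a sketch of the batching idea, not the code in this patch: the
sketch_* names and the batch size are hypothetical, and the pool "put" is a
function pointer so the example stays self-contained. In the driver that
role would be played by rte_mempool_put_bulk(). The sketch also assumes all
buffers belong to one pool and ignores reference counting.

```c
#include <stddef.h>

/* Assumed batch size for this sketch. */
#define SKETCH_FREE_BATCH 4

/* Stand-in with the same argument order as a bulk put on a pool. */
typedef void (*sketch_bulk_put_fn)(void *pool, void * const *objs,
				   unsigned int n);

/* Completed buffers are collected here before being returned. */
struct sketch_free_batch {
	void *objs[SKETCH_FREE_BATCH];
	unsigned int count;
};

/* Return everything collected so far with a single bulk call. */
static void
sketch_batch_flush(struct sketch_free_batch *b, void *pool,
		   sketch_bulk_put_fn put)
{
	if (b->count == 0)
		return;
	put(pool, b->objs, b->count);
	b->count = 0;
}

/* Queue one buffer; flush automatically when the batch is full. */
static void
sketch_batch_add(struct sketch_free_batch *b, void *pool,
		 sketch_bulk_put_fn put, void *obj)
{
	b->objs[b->count++] = obj;
	if (b->count == SKETCH_FREE_BATCH)
		sketch_batch_flush(b, pool, put);
}

/* Counting stub used in place of a real pool for illustration. */
struct sketch_pool_stats {
	unsigned int calls;
	unsigned int objs;
};

static void
sketch_count_put(void *pool, void * const *objs, unsigned int n)
{
	struct sketch_pool_stats *st = pool;

	(void)objs;
	st->calls++;
	st->objs += n;
}
```

The point of the batching is that the per-buffer cost of returning to the
pool is paid once per batch instead of once per packet; a final flush after
the completion loop returns any remainder.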

Reviewed-by: Nelson Escobar 
Signed-off-by: John Daley 
---
 drivers/net/enic/Makefile|   2 +-
 drivers/net/enic/base/enic_vnic_wq.h |  79 --
 drivers/net/enic/base/vnic_cq.h  |  37 +--
 drivers/net/enic/base/vnic_rq.h  |   2 +-
 drivers/net/enic/base/vnic_wq.c  |  89 +++---
 drivers/net/enic/base/vnic_wq.h  | 113 +---
 drivers/net/enic/enic.h  |  27 +-
 drivers/net/enic/enic_ethdev.c   |  67 +
 drivers/net/enic/enic_main.c | 132 +++--
 drivers/net/enic/enic_res.h  |  81 +-
 drivers/net/enic/enic_rx.c   | 361 -
 drivers/net/enic/enic_rxtx.c | 505 +++
 12 files changed, 635 insertions(+), 860 deletions(-)
 delete mode 100644 drivers/net/enic/base/enic_vnic_wq.h
 delete mode 100644 drivers/net/enic/enic_rx.c
 create mode 100644 drivers/net/enic/enic_rxtx.c

diff --git a/drivers/net/enic/Makefile b/drivers/net/enic/Makefile
index f316274..3926b79 100644
--- a/drivers/net/enic/Makefile
+++ b/drivers/net/enic/Makefile
@@ -53,7 +53,7 @@ VPATH += $(SRCDIR)/src
 #
 SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_main.c
-SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_rx.c
+SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_clsf.c
 SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += enic_res.c
 SRCS-$(CONFIG_RTE_LIBRTE_ENIC_PMD) += base/vnic_cq.c
diff --git a/drivers/net/enic/base/enic_vnic_wq.h 
b/drivers/net/enic/base/enic_vnic_wq.h
deleted file mode 100644
index b019109..000
--- a/drivers/net/enic/base/enic_vnic_wq.h
+++ /dev/null
@@ -1,79 +0,0 @@
-/*
- * Copyright 2008-2015 Cisco Systems, Inc.  All rights reserved.
- * Copyright 2007 Nuova Systems, Inc.  All rights reserved.
- *
- * Copyright (c) 2015, Cisco Systems, Inc.
- * All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- *
- * 1. Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- *
- * 2. Redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in
- * the documentation and/or other materials provided with the
- * distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
- * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
- * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
- * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
- * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
- * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
- * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
- * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
- * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
- * POSSIBILITY OF SUCH DAMAGE.
- *
- */
-
-#ifndef _ENIC_VNIC_WQ_H_
-#define _ENIC_VNIC_WQ_H_
-
-#include "vnic_dev.h"
-#include "vnic_cq.h"
-
-static inline void enic_vnic_post_wq_index(struct vnic_wq *wq)
-{
-   struct vnic_wq_buf *buf = wq->to_use;
-
-   /* Adding write memory barrier prevents compiler and/or CPU
-* reordering, thus avoiding descriptor posting before
-* descriptor is initialized. Otherwise, hardware can read
-* stale descriptor fields.
-   */
-   wmb();
-   iowrite32(buf->index, &wq->ctrl->posted_index);
-}
-
-static inline void enic_vnic_post_wq(struct vnic_wq *wq,
-void *os_buf, dma_addr_t dma_addr,
-unsigned int len, int sop,
-uint8_t desc_skip_cnt, uint8_t cq_entry,
-uint8_t compressed_send, uint64_t wrid)
-{
-   struct vnic_wq_buf *buf = wq->to_use;
-
-   buf->sop = sop;
-   buf->cq_entry = cq_entry;
-   buf->compressed_send = compressed_send;
-   buf->desc_skip_cnt = desc_skip_cnt;
-   buf->os_buf = os_buf;
-   buf->dma_addr = dma_addr;
-   buf->len = len;
-   buf->wr_id = wrid;
-
-   buf = buf-