[dpdk-dev] [PATCH v7 0/4] Fix vhost enqueue/dequeue issue

2015-06-09 Thread Ouyang Changchun
Fix enqueue/dequeue so that they handle chained vring descriptors;
Remove unnecessary updates of the vring descriptor length;
Add support for copying scattered mbufs to the vring;

Changchun Ouyang (4):
  lib_vhost: Fix enqueue/dequeue can't handle chained vring descriptors
  lib_vhost: Refine code style
  lib_vhost: Extract function
  lib_vhost: Remove unnecessary vring descriptor length updating

 lib/librte_vhost/vhost_rxtx.c | 201 +++---
 1 file changed, 111 insertions(+), 90 deletions(-)

-- 
1.8.4.2



[dpdk-dev] [PATCH v7 2/4] lib_vhost: Refine code style

2015-06-09 Thread Ouyang Changchun
Remove unnecessary new line.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_vhost/vhost_rxtx.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index b887e0b..1f145bf 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -265,8 +265,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t 
res_base_idx,
 * (guest physical addr -> vhost virtual addr)
 */
vq = dev->virtqueue[VIRTIO_RXQ];
-   vb_addr =
-   gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
+   vb_addr = gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
vb_hdr_addr = vb_addr;

/* Prefetch buffer address. */
@@ -284,8 +283,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t 
res_base_idx,

seg_avail = rte_pktmbuf_data_len(pkt);
vb_offset = vq->vhost_hlen;
-   vb_avail =
-   vq->buf_vec[vec_idx].buf_len - vq->vhost_hlen;
+   vb_avail = vq->buf_vec[vec_idx].buf_len - vq->vhost_hlen;

entry_len = vq->vhost_hlen;

@@ -308,8 +306,7 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t 
res_base_idx,
}

vec_idx++;
-   vb_addr =
-   gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);
+   vb_addr = gpa_to_vva(dev, vq->buf_vec[vec_idx].buf_addr);

/* Prefetch buffer address. */
rte_prefetch0((void *)(uintptr_t)vb_addr);
-- 
1.8.4.2



[dpdk-dev] [PATCH v7 1/4] lib_vhost: Fix enqueue/dequeue can't handle chained vring descriptors

2015-06-09 Thread Ouyang Changchun
Vring enqueue needs to consider 2 cases:
 1. Separate descriptors hold the virtio header and the actual data, e.g. the
    first descriptor is for the virtio header and is followed by descriptors
    for the actual data.
 2. The virtio header and some data are put together in one descriptor, e.g.
    the first descriptor contains both the virtio header and part of the
    actual data, and is followed by more descriptors for the rest of the
    packet data. The current DPDK based virtio-net pmd implementation is this
    case.

The same applies to vring dequeue: it should not assume that the vring
descriptor is chained or not chained, it should use desc->flags to check
whether it is chained. This patch also fixes the TX corruption issue that
occurs when vhost works with a virtio-net driver that by default uses one
single vring descriptor (header and data in one descriptor) for the virtio
tx process.
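
As a minimal sketch of the two cases above (simplified types, not the actual
librte_vhost code), the decision of where packet data starts could look like
this. The names struct data_start and find_data_start are hypothetical and
only serve the illustration; the flag value is the one defined by the virtio
specification.

```c
#include <assert.h>
#include <stdint.h>

#define VRING_DESC_F_NEXT 1 /* "chained" flag value from the virtio spec */

/* Simplified descriptor with the same fields as struct vring_desc. */
struct desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

/* Hypothetical result type: where the first byte of packet data goes. */
struct data_start {
	uint16_t desc_idx; /* descriptor that receives the first data byte */
	uint32_t offset;   /* byte offset inside that descriptor */
};

static struct data_start
find_data_start(const struct desc *table, uint16_t head, uint32_t hdr_len)
{
	const struct desc *d = &table[head];
	struct data_start s;

	/* The first descriptor must at least hold the virtio header. */
	assert(d->len >= hdr_len);

	if ((d->flags & VRING_DESC_F_NEXT) && d->len == hdr_len) {
		/* Case 1: the header has its own descriptor. */
		s.desc_idx = d->next;
		s.offset = 0;
	} else {
		/* Case 2: header and data share the first descriptor. */
		s.desc_idx = head;
		s.offset = hdr_len;
	}
	return s;
}
```

With a 12-byte header, a chained head descriptor of exactly 12 bytes yields
the next descriptor at offset 0, while any other head descriptor yields the
head itself at offset 12.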

Changes in v6
  - move desc->len change to here to increase code readability

Changes in v5
  - support a virtio header with partial data in the first descriptor,
    followed by descriptors for the rest of the data

Changes in v4
  - remove unnecessary check for mbuf 'next' pointer
  - refine packet copying completeness check

Changes in v3
  - support scattered mbuf: check whether the mbuf has a 'next' pointer and
    copy all segments to the vring buffer.

Changes in v2
  - drop the incomplete packet
  - refine code logic

Signed-off-by: Changchun Ouyang 
---
 lib/librte_vhost/vhost_rxtx.c | 90 ++-
 1 file changed, 71 insertions(+), 19 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 4809d32..b887e0b 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -46,7 +46,8 @@
  * This function adds buffers to the virtio devices RX virtqueue. Buffers can
  * be received from the physical port or from another virtio device. A packet
  * count is returned to indicate the number of packets that are succesfully
- * added to the RX queue. This function works when mergeable is disabled.
+ * added to the RX queue. This function works when the mbuf is scattered, but
+ * it doesn't support the mergeable feature.
  */
 static inline uint32_t __attribute__((always_inline))
 virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
@@ -59,7 +60,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 0, 0}, 0};
uint64_t buff_addr = 0;
uint64_t buff_hdr_addr = 0;
-   uint32_t head[MAX_PKT_BURST], packet_len = 0;
+   uint32_t head[MAX_PKT_BURST];
uint32_t head_idx, packet_success = 0;
uint16_t avail_idx, res_cur_idx;
uint16_t res_base_idx, res_end_idx;
@@ -113,6 +114,10 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
rte_prefetch0(&vq->desc[head[packet_success]]);

while (res_cur_idx != res_end_idx) {
+   uint32_t offset = 0, vb_offset = 0;
+   uint32_t pkt_len, len_to_cpy, data_len, total_copied = 0;
+   uint8_t hdr = 0, uncompleted_pkt = 0;
+
/* Get descriptor from available ring */
desc = &vq->desc[head[packet_success]];

@@ -125,39 +130,81 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,

/* Copy virtio_hdr to packet and increment buffer address */
buff_hdr_addr = buff_addr;
-   packet_len = rte_pktmbuf_data_len(buff) + vq->vhost_hlen;

/*
 * If the descriptors are chained the header and data are
 * placed in separate buffers.
 */
-   if (desc->flags & VRING_DESC_F_NEXT) {
-   desc->len = vq->vhost_hlen;
+   if ((desc->flags & VRING_DESC_F_NEXT) &&
+   (desc->len == vq->vhost_hlen)) {
desc = &vq->desc[desc->next];
/* Buffer address translation. */
buff_addr = gpa_to_vva(dev, desc->addr);
-   desc->len = rte_pktmbuf_data_len(buff);
} else {
-   buff_addr += vq->vhost_hlen;
-   desc->len = packet_len;
+   vb_offset += vq->vhost_hlen;
+   hdr = 1;
}

+   pkt_len = rte_pktmbuf_pkt_len(buff);
+   data_len = rte_pktmbuf_data_len(buff);
+   len_to_cpy = RTE_MIN(data_len,
+   hdr ? desc->len - vq->vhost_hlen : desc->len);
+   while (total_copied < pkt_len) {
+   /* Copy mbuf data to buffer */
+   rte_memcpy((void *)(uintptr_t)(buff_addr + vb_offset),
+   (const void *)(rte_pktmbuf_mtod(buff, const 
char *) + offset),
+   len_to_cpy);
+   PRINT_PACKET(dev, (uintptr_t)(buff_addr + vb_offset),
+   len_to_cpy, 0);

[dpdk-dev] [PATCH v7 3/4] lib_vhost: Extract function

2015-06-09 Thread Ouyang Changchun
Extract code into 2 common functions:
update_secure_len, which accumulates the buffer length of the vring
descriptors, and fill_buf_vec, which fills struct buf_vec.

Changes in v5
  - merge fill_buf_vec into update_secure_len
  - do both tasks in one-time loop
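
The single-pass idea can be sketched in isolation as follows. This is a
simplified, self-contained illustration and not the code from the patch: the
name collect_chain, the BUF_VEC_MAX bound and the reduced structs are
assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define VRING_DESC_F_NEXT 1 /* "chained" flag value from the virtio spec */
#define BUF_VEC_MAX 8       /* hypothetical capacity of the vector */

/* Simplified stand-ins for struct vring_desc and struct buf_vec. */
struct desc { uint64_t addr; uint32_t len; uint16_t flags; uint16_t next; };
struct buf_vec { uint64_t buf_addr; uint32_t buf_len; uint32_t desc_idx; };

/*
 * Walk one descriptor chain starting at 'head'. Each descriptor is recorded
 * in 'vec' and its length is added to the running total in the same loop, so
 * the chain is traversed only once. Returns the total length of the chain.
 */
static uint32_t
collect_chain(const struct desc *table, uint16_t head,
	      struct buf_vec *vec, uint32_t *vec_idx)
{
	uint32_t total = 0;
	uint16_t idx = head;

	for (;;) {
		assert(*vec_idx < BUF_VEC_MAX);
		vec[*vec_idx].buf_addr = table[idx].addr;
		vec[*vec_idx].buf_len = table[idx].len;
		vec[*vec_idx].desc_idx = idx;
		(*vec_idx)++;
		total += table[idx].len;

		if (!(table[idx].flags & VRING_DESC_F_NEXT))
			break;
		idx = table[idx].next;
	}
	return total;
}
```

A caller that needs room for a packet would keep calling this for successive
available ring entries until the accumulated total covers the packet length.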

Signed-off-by: Changchun Ouyang 
---
 lib/librte_vhost/vhost_rxtx.c | 85 ++-
 1 file changed, 36 insertions(+), 49 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 1f145bf..aaf77ed 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -436,6 +436,34 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t 
res_base_idx,
return entry_success;
 }

+static inline void __attribute__((always_inline))
+update_secure_len(struct vhost_virtqueue *vq, uint32_t id,
+   uint32_t *secure_len, uint32_t *vec_idx)
+{
+   uint16_t wrapped_idx = id & (vq->size - 1);
+   uint32_t idx = vq->avail->ring[wrapped_idx];
+   uint8_t next_desc;
+   uint32_t len = *secure_len;
+   uint32_t vec_id = *vec_idx;
+
+   do {
+   next_desc = 0;
+   len += vq->desc[idx].len;
+   vq->buf_vec[vec_id].buf_addr = vq->desc[idx].addr;
+   vq->buf_vec[vec_id].buf_len = vq->desc[idx].len;
+   vq->buf_vec[vec_id].desc_idx = idx;
+   vec_id++;
+
+   if (vq->desc[idx].flags & VRING_DESC_F_NEXT) {
+   idx = vq->desc[idx].next;
+   next_desc = 1;
+   }
+   } while (next_desc);
+
+   *secure_len = len;
+   *vec_idx = vec_id;
+}
+
 /*
  * This function works for mergeable RX.
  */
@@ -445,8 +473,8 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t 
queue_id,
 {
struct vhost_virtqueue *vq;
uint32_t pkt_idx = 0, entry_success = 0;
-   uint16_t avail_idx, res_cur_idx;
-   uint16_t res_base_idx, res_end_idx;
+   uint16_t avail_idx;
+   uint16_t res_base_idx, res_cur_idx;
uint8_t success = 0;

LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_merge_rx()\n",
@@ -462,17 +490,16 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t 
queue_id,
return 0;

for (pkt_idx = 0; pkt_idx < count; pkt_idx++) {
-   uint32_t secure_len = 0;
-   uint16_t need_cnt;
-   uint32_t vec_idx = 0;
uint32_t pkt_len = pkts[pkt_idx]->pkt_len + vq->vhost_hlen;
-   uint16_t i, id;

do {
/*
 * As many data cores may want access to available
 * buffers, they need to be reserved.
 */
+   uint32_t secure_len = 0;
+   uint32_t vec_idx = 0;
+
res_base_idx = vq->last_used_idx_res;
res_cur_idx = res_base_idx;

@@ -486,22 +513,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t 
queue_id,
dev->device_fh);
return pkt_idx;
} else {
-   uint16_t wrapped_idx =
-   (res_cur_idx) & (vq->size - 1);
-   uint32_t idx =
-   vq->avail->ring[wrapped_idx];
-   uint8_t next_desc;
-
-   do {
-   next_desc = 0;
-   secure_len += vq->desc[idx].len;
-   if (vq->desc[idx].flags &
-   VRING_DESC_F_NEXT) {
-   idx = 
vq->desc[idx].next;
-   next_desc = 1;
-   }
-   } while (next_desc);
-
+   update_secure_len(vq, res_cur_idx, 
&secure_len, &vec_idx);
res_cur_idx++;
}
} while (pkt_len > secure_len);
@@ -512,33 +524,8 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t 
queue_id,
res_cur_idx);
} while (success == 0);

-   id = res_base_idx;
-   need_cnt = res_cur_idx - res_base_idx;
-
-   for (i = 0; i < need_cnt; i++, id++) {
-   uint16_t wrapped_idx = id & (vq->size - 1);
-   uint32_t idx = vq->avail->ring[wrapped_idx];
-   uint8_t next_desc;
-   do {
-   next

[dpdk-dev] [PATCH v7 4/4] lib_vhost: Remove unnecessary vring descriptor length updating

2015-06-09 Thread Ouyang Changchun
Remove the unnecessary updates of the vring descriptor length; vhost should
not change these values.
The virtio front end should assign the value of desc.len for both rx and tx.
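
A minimal sketch of that division of labour, assuming simplified stand-ins for
the vring structures: the back end treats the descriptor table as read-only
and reports the number of bytes it wrote through the used ring element. The
name report_used and the RING_SIZE value are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 4 /* hypothetical ring size, must be a power of two */

/* Simplified stand-ins for struct vring_desc and struct vring_used_elem. */
struct desc { uint64_t addr; uint32_t len; uint16_t flags; uint16_t next; };
struct used_elem { uint32_t id; uint32_t len; };

/*
 * Report a completed chain to the guest. The descriptor table is const here:
 * desc.len stays as the front end set it, and only the used ring element
 * carries the number of bytes the back end wrote.
 */
static void
report_used(const struct desc *table, struct used_elem *used_ring,
	    uint16_t used_idx, uint16_t head, uint32_t bytes_written)
{
	struct used_elem *e;

	assert(table && used_ring);

	e = &used_ring[used_idx & (RING_SIZE - 1)];
	e->id = head;
	e->len = bytes_written;
}
```

Publishing the new used index to the guest, including the required memory
barrier, is left out of this sketch.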

Signed-off-by: Changchun Ouyang 
---
 lib/librte_vhost/vhost_rxtx.c | 17 +
 1 file changed, 1 insertion(+), 16 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index aaf77ed..07bc16c 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -290,7 +290,6 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t 
res_base_idx,
if (vb_avail == 0) {
uint32_t desc_idx =
vq->buf_vec[vec_idx].desc_idx;
-   vq->desc[desc_idx].len = vq->vhost_hlen;

if ((vq->desc[desc_idx].flags
& VRING_DESC_F_NEXT) == 0) {
@@ -374,7 +373,6 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t 
res_base_idx,
 */
uint32_t desc_idx =
vq->buf_vec[vec_idx].desc_idx;
-   vq->desc[desc_idx].len = vb_offset;

if ((vq->desc[desc_idx].flags &
VRING_DESC_F_NEXT) == 0) {
@@ -409,26 +407,13 @@ copy_from_mbuf_to_vring(struct virtio_net *dev, uint16_t 
res_base_idx,
/*
 * This whole packet completes.
 */
-   uint32_t desc_idx =
-   vq->buf_vec[vec_idx].desc_idx;
-   vq->desc[desc_idx].len = vb_offset;
-
-   while (vq->desc[desc_idx].flags &
-   VRING_DESC_F_NEXT) {
-   desc_idx = vq->desc[desc_idx].next;
-vq->desc[desc_idx].len = 0;
-   }
-
/* Update used ring with desc information */
vq->used->ring[cur_idx & (vq->size - 1)].id
= vq->buf_vec[vec_idx].desc_idx;
vq->used->ring[cur_idx & (vq->size - 1)].len
= entry_len;
-   entry_len = 0;
-   cur_idx++;
entry_success++;
-   seg_avail = 0;
-   cpy_len = RTE_MIN(vb_avail, seg_avail);
+   break;
}
}
}
-- 
1.8.4.2



[dpdk-dev] [PATCH 1/3] fm10k: update VLAN filter

2015-06-09 Thread Chen, Jing D
Hi,

> -----Original Message-----
> From: He, Shaopeng
> Sent: Tuesday, June 02, 2015 10:59 AM
> To: dev at dpdk.org
> Cc: Chen, Jing D; Qiu, Michael; He, Shaopeng
> Subject: [PATCH 1/3] fm10k: update VLAN filter
> 
> VLAN filter was updated to add/delete one static entry in MAC table for each
> combination of VLAN and MAC address. More sanity checks were added.
> 
> Signed-off-by: Shaopeng He 
> ---
>  drivers/net/fm10k/fm10k.h| 23 +
>  drivers/net/fm10k/fm10k_ethdev.c | 55
> +---
>  2 files changed, 75 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
> index ad7a7d1..3b95b72 100644
> --- a/drivers/net/fm10k/fm10k.h
> +++ b/drivers/net/fm10k/fm10k.h
> @@ -109,11 +109,31 @@
> 
>  #define FM10K_VLAN_TAG_SIZE 4
> 
> +/* Maximum number of MAC addresses per PF/VF */
> +#define FM10K_MAX_MACADDR_NUM   1
> +
> +#define FM10K_UINT32_BIT_SIZE  (CHAR_BIT * sizeof(uint32_t))
> +#define FM10K_VFTA_SIZE(4096 / FM10K_UINT32_BIT_SIZE)
> +
> +/* vlan_id is a 12 bit number.
> + * The VFTA array is actually a 4096 bit array, 128 of 32bit elements.
> + * 2^5 = 32. The val of lower 5 bits specifies the bit in the 32bit element.
> + * The higher 7 bit val specifies VFTA array index.
> + */
> +#define FM10K_VFTA_BIT(vlan_id)(1 << ((vlan_id) & 0x1F))
> +#define FM10K_VFTA_IDX(vlan_id)((vlan_id) >> 5)
> +
> +struct fm10k_macvlan_filter_info {
> + uint16_t vlan_num;   /* Total VLAN number */
> + uint32_t vfta[FM10K_VFTA_SIZE];/* VLAN bitmap */
> +};
> +
>  struct fm10k_dev_info {
>   volatile uint32_t enable;
>   volatile uint32_t glort;
>   /* Protect the mailbox to avoid race condition */
>   rte_spinlock_tmbx_lock;
> + struct fm10k_macvlan_filter_infomacvlan;
>  };
> 
>  /*
> @@ -137,6 +157,9 @@ struct fm10k_adapter {
>  #define FM10K_DEV_PRIVATE_TO_MBXLOCK(adapter) \
>   (&(((struct fm10k_adapter *)adapter)->info.mbx_lock))
> 
> +#define FM10K_DEV_PRIVATE_TO_MACVLAN(adapter) \
> + (&(((struct fm10k_adapter *)adapter)->info.macvlan))
> +
>  struct fm10k_rx_queue {
>   struct rte_mempool *mp;
>   struct rte_mbuf **sw_ring;
> diff --git a/drivers/net/fm10k/fm10k_ethdev.c
> b/drivers/net/fm10k/fm10k_ethdev.c
> index 3a26480..d2f3e44 100644
> --- a/drivers/net/fm10k/fm10k_ethdev.c
> +++ b/drivers/net/fm10k/fm10k_ethdev.c
> @@ -819,15 +819,61 @@ fm10k_dev_infos_get(struct rte_eth_dev *dev,
>  static int
>  fm10k_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
>  {
> - struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data-
> >dev_private);
> + s32 result;
> + uint32_t vid_idx, vid_bit, mac_index;
> + struct fm10k_hw *hw;
> + struct fm10k_macvlan_filter_info *macvlan;
> + struct rte_eth_dev_data *data = dev->data;
> 
> - PMD_INIT_FUNC_TRACE();
> + hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> + macvlan = FM10K_DEV_PRIVATE_TO_MACVLAN(dev->data-
> >dev_private);
> 
>   /* @todo - add support for the VF */
>   if (hw->mac.type != fm10k_mac_pf)
>   return -ENOTSUP;
> 
> - return fm10k_update_vlan(hw, vlan_id, 0, on);
> + if (vlan_id > ETH_VLAN_ID_MAX) {
> + PMD_INIT_LOG(ERR, "Invalid vlan_id: must be < 4096");
> + return (-EINVAL);
> + }
> +
> + vid_idx = FM10K_VFTA_IDX(vlan_id);
> + vid_bit = FM10K_VFTA_BIT(vlan_id);
> + /* this VLAN ID is already in the VLAN filter table, return SUCCESS */
> + if (on && (macvlan->vfta[vid_idx] & vid_bit))
> + return 0;
> + /* this VLAN ID is NOT in the VLAN filter table, cannot remove */
> + if (!on && !(macvlan->vfta[vid_idx] & vid_bit)) {
> + PMD_INIT_LOG(ERR, "Invalid vlan_id: not existing "
> + "in the VLAN filter table");
> + return (-EINVAL);
> + }
> +
> + fm10k_mbx_lock(hw);
> + result = fm10k_update_vlan(hw, vlan_id, 0, on);
> + if (FM10K_SUCCESS == result) {
> + if (on) {
> + macvlan->vlan_num++;
> + macvlan->vfta[vid_idx] |= vid_bit;
> + } else {
> + macvlan->vlan_num--;
> + macvlan->vfta[vid_idx] &= ~vid_bit;
> + }
> +
> + for (mac_index = 0; mac_index <
> FM10K_MAX_MACADDR_NUM;
> + mac_index++) {
> + if (is_zero_ether_addr(&data-
> >mac_addrs[mac_index]))
> + continue;
> + fm10k_update_uc_addr(hw, hw->mac.dglort_map,
> + data->mac_addrs[mac_index].addr_bytes,
> + vlan_id, on, 0);


Should this be result = fm10k_update_uc_addr()? If it hits any error, the loop
should break. Also, I think the if (on)...else... block above should be moved
after the loop.
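
To illustrate the suggestion, here is a self-contained sketch with the
hardware call replaced by a callback. Everything in it (struct filter_state,
update_fn, the stub callbacks, MAX_MACADDR) is hypothetical and only mirrors
the shape of the patch; it is not the fm10k code.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_MACADDR 4 /* hypothetical number of MAC slots */

struct filter_state {
	uint16_t vlan_num;   /* number of VLANs in the table */
	uint32_t vfta[128];  /* 4096-bit VLAN bitmap */
};

/* Hypothetical stand-in for the per-MAC hardware update call. */
typedef int (*update_fn)(unsigned int mac_index, uint16_t vlan_id, int on);

/* Stub callbacks for demonstration only. */
static int update_ok(unsigned int i, uint16_t vlan_id, int on)
{
	(void)i; (void)vlan_id; (void)on;
	return 0;
}

static int update_fail_second(unsigned int i, uint16_t vlan_id, int on)
{
	(void)vlan_id; (void)on;
	return i == 1 ? -5 : 0;
}

static int
vlan_filter_set(struct filter_state *st, update_fn update,
		uint16_t vlan_id, int on)
{
	uint32_t idx = vlan_id >> 5;                    /* array index */
	uint32_t bit = UINT32_C(1) << (vlan_id & 0x1F); /* bit in element */
	unsigned int i;
	int result = 0;

	assert(vlan_id < 4096);

	/* Stop at the first failing update and keep its error code. */
	for (i = 0; i < MAX_MACADDR; i++) {
		result = update(i, vlan_id, on);
		if (result != 0)
			break;
	}
	if (result != 0)
		return result;

	/* Bookkeeping happens after the loop, only if every update worked. */
	if (on) {
		st->vlan_num++;
		st->vfta[idx] |= bit;
	} else {
		st->vlan_num--;
		st->vfta[idx] &= ~bit;
	}
	return 0;
}
```

With this ordering the software table is never marked as updated when one of
the hardware calls failed. Undoing entries that were already written before
the failure is a separate question that this sketch does not address.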

> + }
> + }
> + fm10k_mbx_unlock(hw);

[dpdk-dev] [PATCH 2/3] fm10k: add MAC filter

2015-06-09 Thread Chen, Jing D
Hi,

> -----Original Message-----
> From: He, Shaopeng
> Sent: Tuesday, June 02, 2015 10:59 AM
> To: dev at dpdk.org
> Cc: Chen, Jing D; Qiu, Michael; He, Shaopeng
> Subject: [PATCH 2/3] fm10k: add MAC filter
> 
> MAC filter function was newly added, each PF and VF can have up to 64 MAC
> addresses. VF filter needs support from PF host, which is not available now.
> 
> Signed-off-by: Shaopeng He 
> ---
>  drivers/net/fm10k/fm10k.h|  3 +-
>  drivers/net/fm10k/fm10k_ethdev.c | 90
> 
>  2 files changed, 85 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
> index 3b95b72..f5be5f8 100644
> --- a/drivers/net/fm10k/fm10k.h
> +++ b/drivers/net/fm10k/fm10k.h
> @@ -110,7 +110,7 @@
>  #define FM10K_VLAN_TAG_SIZE 4
> 
>  /* Maximum number of MAC addresses per PF/VF */
> -#define FM10K_MAX_MACADDR_NUM   1
> +#define FM10K_MAX_MACADDR_NUM   64
> 
>  #define FM10K_UINT32_BIT_SIZE  (CHAR_BIT * sizeof(uint32_t))
>  #define FM10K_VFTA_SIZE(4096 / FM10K_UINT32_BIT_SIZE)
> @@ -125,6 +125,7 @@
> 
>  struct fm10k_macvlan_filter_info {
>   uint16_t vlan_num;   /* Total VLAN number */
> + uint16_t mac_num;/* Total mac number */
>   uint32_t vfta[FM10K_VFTA_SIZE];/* VLAN bitmap */
>  };
> 
> diff --git a/drivers/net/fm10k/fm10k_ethdev.c
> b/drivers/net/fm10k/fm10k_ethdev.c
> index d2f3e44..4f23bf1 100644
> --- a/drivers/net/fm10k/fm10k_ethdev.c
> +++ b/drivers/net/fm10k/fm10k_ethdev.c
> @@ -54,6 +54,10 @@
>  #define BIT_MASK_PER_UINT32 ((1 << CHARS_PER_UINT32) - 1)
> 
>  static void fm10k_close_mbx_service(struct fm10k_hw *hw);
> +static int
> +fm10k_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on);
> +static void
> +fm10k_MAC_filter_set(struct rte_eth_dev *dev, const u8 *mac, bool add);
> 
>  static void
>  fm10k_mbx_initlock(struct fm10k_hw *hw)
> @@ -668,14 +672,11 @@ fm10k_dev_start(struct rte_eth_dev *dev)
>   }
> 
>   if (hw->mac.default_vid && hw->mac.default_vid <=
> ETHER_MAX_VLAN_ID) {
> - fm10k_mbx_lock(hw);
>   /* Update default vlan */
> - hw->mac.ops.update_vlan(hw, hw->mac.default_vid, 0,
> true);
> + fm10k_vlan_filter_set(dev, hw->mac.default_vid, true);
> 
>   /* Add default mac/vlan filter to PF/Switch manger */
> - hw->mac.ops.update_uc_addr(hw, hw->mac.dglort_map,
> hw->mac.addr,
> - hw->mac.default_vid, true, 0);
> - fm10k_mbx_unlock(hw);
> + fm10k_MAC_filter_set(dev, hw->mac.addr, true);
>   }
> 
>   return 0;
> @@ -781,7 +782,7 @@ fm10k_dev_infos_get(struct rte_eth_dev *dev,
>   dev_info->max_rx_pktlen  = FM10K_MAX_PKT_SIZE;
>   dev_info->max_rx_queues  = hw->mac.max_queues;
>   dev_info->max_tx_queues  = hw->mac.max_queues;
> - dev_info->max_mac_addrs  = 1;
> + dev_info->max_mac_addrs  = FM10K_MAX_MACADDR_NUM;
>   dev_info->max_hash_mac_addrs = 0;
>   dev_info->max_vfs= FM10K_MAX_VF_NUM;
>   dev_info->max_vmdq_pools = ETH_64_POOLS;
> @@ -820,6 +821,7 @@ static int
>  fm10k_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
>  {
>   s32 result;
> + uint16_t mac_num = 0;
>   uint32_t vid_idx, vid_bit, mac_index;
>   struct fm10k_hw *hw;
>   struct fm10k_macvlan_filter_info *macvlan;
> @@ -864,9 +866,15 @@ fm10k_vlan_filter_set(struct rte_eth_dev *dev,
> uint16_t vlan_id, int on)
>   mac_index++) {
>   if (is_zero_ether_addr(&data-
> >mac_addrs[mac_index]))
>   continue;
> + if (mac_num > macvlan->mac_num - 1) {
> + PMD_INIT_LOG(ERR, "MAC address number
> "
> + "not match");
> + break;
> + }
>   fm10k_update_uc_addr(hw, hw->mac.dglort_map,
>   data->mac_addrs[mac_index].addr_bytes,
>   vlan_id, on, 0);
> + mac_num++;
>   }
>   }
>   fm10k_mbx_unlock(hw);
> @@ -876,6 +884,71 @@ fm10k_vlan_filter_set(struct rte_eth_dev *dev,
> uint16_t vlan_id, int on)
>   return (-EIO);
>  }
> 
> +/* Add/Remove a MAC address, and update filters */
> +static void
> +fm10k_MAC_filter_set(struct rte_eth_dev *dev, const u8 *mac, bool add)
> +{
> + uint32_t i, j, k;
> + struct fm10k_hw *hw;
> + struct fm10k_macvlan_filter_info *macvlan;
> +
> + hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> + macvlan = FM10K_DEV_PRIVATE_TO_MACVLAN(dev->data-
> >dev_private);
> +
> + /* @todo - add support for the VF */
> + if (hw->mac.type != fm10k_mac_pf)
> + return;
> +

Since this only supports the PF, it would be better to make that clear in the log.

> 

[dpdk-dev] [PATCH 3/3] fm10k: update VLAN offload features

2015-06-09 Thread Chen, Jing D
Hi,


> -----Original Message-----
> From: He, Shaopeng
> Sent: Tuesday, June 02, 2015 10:59 AM
> To: dev at dpdk.org
> Cc: Chen, Jing D; Qiu, Michael; He, Shaopeng
> Subject: [PATCH 3/3] fm10k: update VLAN offload features
> 
> Fm10k PF/VF does not support QinQ; VLAN strip and filter are always on
> for PF/VF ports.
> 
> Signed-off-by: Shaopeng He 
> ---
>  drivers/net/fm10k/fm10k_ethdev.c | 22 ++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/drivers/net/fm10k/fm10k_ethdev.c
> b/drivers/net/fm10k/fm10k_ethdev.c
> index 4f23bf1..9b198a7 100644
> --- a/drivers/net/fm10k/fm10k_ethdev.c
> +++ b/drivers/net/fm10k/fm10k_ethdev.c
> @@ -884,6 +884,27 @@ fm10k_vlan_filter_set(struct rte_eth_dev *dev,
> uint16_t vlan_id, int on)
>   return (-EIO);
>  }
> 
> +static void
> +fm10k_vlan_offload_set(__rte_unused struct rte_eth_dev *dev, int mask)
> +{
> + if (mask & ETH_VLAN_STRIP_MASK) {
> + if (!dev->data->dev_conf.rxmode.hw_vlan_strip)
> + PMD_INIT_LOG(ERR, "VLAN stripping is "
> + "always on in fm10k");
> + }
> +
> + if (mask & ETH_VLAN_EXTEND_MASK) {
> + if (dev->data->dev_conf.rxmode.hw_vlan_extend)
> + PMD_INIT_LOG(ERR, "VLAN QinQ is not "
> + "supported in fm10k");
> + }
> +
> + if (mask & ETH_VLAN_FILTER_MASK) {
> + if (!dev->data->dev_conf.rxmode.hw_vlan_filter)
> + PMD_INIT_LOG(ERR, "VLAN filter is always on in
> fm10k");
> + }
> +}
> +

Should fm10k_dev_infos_get() be updated to set the above options to the expected values?

>  /* Add/Remove a MAC address, and update filters */
>  static void
>  fm10k_MAC_filter_set(struct rte_eth_dev *dev, const u8 *mac, bool add)
> @@ -1801,6 +1822,7 @@ static const struct eth_dev_ops
> fm10k_eth_dev_ops = {
>   .link_update= fm10k_link_update,
>   .dev_infos_get  = fm10k_dev_infos_get,
>   .vlan_filter_set= fm10k_vlan_filter_set,
> + .vlan_offload_set   = fm10k_vlan_offload_set,
>   .mac_addr_add   = fm10k_macaddr_add,
>   .mac_addr_remove= fm10k_macaddr_remove,
>   .rx_queue_start = fm10k_dev_rx_queue_start,
> --
> 1.9.3



[dpdk-dev] [PATCH 0/2] vhost: numa aware allocation of virtio_net device and vhost virt queue

2015-06-09 Thread Long, Thomas
Acked-by: Tommy Long 

-----Original Message-----
From: Xie, Huawei 
Sent: Friday, June 5, 2015 4:13 AM
To: dev at dpdk.org
Cc: Long, Thomas
Subject: [PATCH 0/2] vhost: numa aware allocation of virtio_net device and 
vhost virt queue

The virtio_net device and vhost virt queue should be allocated on the same numa
node as the vring descriptors.
When we first allocate the virtio_net device and vhost virt queue, we don't
know the numa node of the vring descriptors.
When we receive the VHOST_SET_VRING_ADDR message, we learn the numa node of the
vring descriptors, so we try to reallocate virtio_net and the vhost virt queue
on that numa node.
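
The "try to reallocate" step can be sketched as below. This is an
illustration under stated assumptions, not the code from the series:
struct vq_obj, alloc_on_node and move_to_node are hypothetical names, and
alloc_on_node simply wraps malloc() where a real implementation would use a
node-aware allocator such as DPDK's rte_malloc_socket().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object that remembers which numa node it lives on. */
struct vq_obj {
	int node;
	unsigned int size;
};

/* Hypothetical stand-in for a node-aware allocator. */
static void *alloc_on_node(size_t len, int node)
{
	(void)node; /* a real allocator would honour the node here */
	return malloc(len);
}

/*
 * Move 'old' to 'target_node' if it is not already there. If the new
 * allocation fails, the existing object is kept so the device keeps working.
 */
static struct vq_obj *
move_to_node(struct vq_obj *old, int target_node)
{
	struct vq_obj *new_obj;

	assert(old != NULL);
	if (old->node == target_node)
		return old;

	new_obj = alloc_on_node(sizeof(*new_obj), target_node);
	if (new_obj == NULL)
		return old;

	memcpy(new_obj, old, sizeof(*new_obj));
	new_obj->node = target_node;
	free(old);
	return new_obj;
}
```

The caller has to replace every stored pointer to the old object with the
returned one, since the old memory is freed on a successful move.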

Huawei Xie (2):
  use rte_malloc/free for virtio_net and virt_queue memory data allocation/free
  When we get the address of vring descriptor table, will try to reallocate 
virtio_net device and virtqueue to the same numa node.

 config/common_linuxapp|   1 +
 lib/librte_vhost/Makefile |   4 ++
 lib/librte_vhost/virtio-net.c | 112 ++
 mk/rte.app.mk |   3 ++
 4 files changed, 111 insertions(+), 9 deletions(-)

-- 
1.8.1.4



[dpdk-dev] [PATCH v4 4/7] move rte_eth_dev_check_mq_mode() logic to driver

2015-06-09 Thread Wu, Jingjing


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Pawel Wodkowski
> Sent: Thursday, February 19, 2015 11:55 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 4/7] move rte_eth_dev_check_mq_mode()
> logic to driver
> 
> Function rte_eth_dev_check_mq_mode() is driver specific. It should be
> done in PF configuration phase. This patch move igb/ixgbe driver specific mq
> check and SRIOV configuration code to driver part. Also rewriting log
> messages to be shorter and more descriptive.
> 
> Signed-off-by: Pawel Wodkowski 
> ---
>  lib/librte_ether/rte_ethdev.c   | 197 ---
>  lib/librte_pmd_e1000/igb_ethdev.c   |  43 
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 105 ++-
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.h |   5 +-
>  lib/librte_pmd_ixgbe/ixgbe_pf.c | 202
> +++-
>  5 files changed, 327 insertions(+), 225 deletions(-)
> 
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> index 02b9cda..8e9da3b 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
> @@ -863,7 +863,8 @@ eth_ixgbe_dev_init(__attribute__((unused)) struct
> eth_driver *eth_drv,
>   "Failed to allocate %u bytes needed to store "
>   "MAC addresses",
>   ETHER_ADDR_LEN * hw->mac.num_rar_entries);
> - return -ENOMEM;
> + diag = -ENOMEM;
> + goto error;
>   }
>   /* Copy the permanent MAC address */
>   ether_addr_copy((struct ether_addr *) hw->mac.perm_addr, @@ -
> 876,7 +877,8 @@ eth_ixgbe_dev_init(__attribute__((unused)) struct
> eth_driver *eth_drv,
>   PMD_INIT_LOG(ERR,
>   "Failed to allocate %d bytes needed to store MAC
> addresses",
>   ETHER_ADDR_LEN * IXGBE_VMDQ_NUM_UC_MAC);
> - return -ENOMEM;
> + diag = -ENOMEM;
> + goto error;
>   }
> 
>   /* initialize the vfta */
> @@ -886,7 +888,13 @@ eth_ixgbe_dev_init(__attribute__((unused)) struct
> eth_driver *eth_drv,
>   memset(hwstrip, 0, sizeof(*hwstrip));
> 
>   /* initialize PF if max_vfs not zero */
> - ixgbe_pf_host_init(eth_dev);
> + diag = ixgbe_pf_host_init(eth_dev);
> + if (diag < 0) {
> + PMD_INIT_LOG(ERR,
> + "Failed to allocate %d bytes needed to store MAC
> addresses",
> + ETHER_ADDR_LEN * IXGBE_VMDQ_NUM_UC_MAC);
> + goto error;
> + }
> 
>   ctrl_ext = IXGBE_READ_REG(hw, IXGBE_CTRL_EXT);
>   /* let hardware know driver is loaded */ @@ -918,6 +926,11 @@
> eth_ixgbe_dev_init(__attribute__((unused)) struct eth_driver *eth_drv,
>   ixgbe_enable_intr(eth_dev);
> 
>   return 0;
> +
> +error:
> + rte_free(eth_dev->data->hash_mac_addrs);
> + rte_free(eth_dev->data->mac_addrs);
> + return diag;
>  }
> 
> 
> @@ -1434,7 +1447,93 @@ ixgbe_dev_configure(struct rte_eth_dev *dev)
>   struct ixgbe_interrupt *intr =
>   IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
> 
> + struct rte_eth_conf *dev_conf = &dev->data->dev_conf;
> + struct rte_eth_dev_info dev_info;
> + int retval;
> +
>   PMD_INIT_FUNC_TRACE();
> + retval = ixgbe_pf_configure_mq_sriov(dev);
> + if (retval <= 0)
> + return retval;
> +
> + uint16_t nb_rx_q = dev->data->nb_rx_queues;
> + uint16_t nb_tx_q = dev->data->nb_rx_queues;
> +
> + /* For DCB we need to obtain maximum number of queues
> dinamically,
> +  * as this depends on max VF exported in PF. */
> + if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB) ||
> + (dev_conf->txmode.mq_mode == ETH_MQ_TX_DCB)) {
> + /* Use dev_infos_get field as this might be pointer to PF or
> VF. */
> + (*dev->dev_ops->dev_infos_get)(dev, &dev_info);
Why not call ixgbe_dev_info_get directly? Also, it looks like only max_rx_queues
and max_tx_queues are used below, so maybe hw->mac.max_rx_queues and
hw->mac.max_tx_queues can be used instead of calling a function.

> + }
> +
> + /* For vmdq+dcb mode check our configuration before we go further
> */
> + if (dev_conf->rxmode.mq_mode == ETH_MQ_RX_VMDQ_DCB) {
> + const struct rte_eth_vmdq_dcb_conf *conf;
> +
> + if (nb_rx_q != ETH_VMDQ_DCB_NUM_QUEUES) {
> + PMD_INIT_LOG(ERR, " VMDQ+DCB,
> nb_rx_q != %d\n",
> + ETH_VMDQ_DCB_NUM_QUEUES);
> + return (-EINVAL);
> + }
> + conf = &(dev_conf->rx_adv_conf.vmdq_dcb_conf);
> + if (conf->nb_queue_pools != ETH_16_POOLS &&
> + conf->nb_queue_pools != ETH_32_POOLS) {
> + PMD_INIT_LOG(ERR, " VMDQ+DCB selected, "
> + "number of RX queue pools must
> be %d or %d

[dpdk-dev] [PATCH 00/26] update ixgbe base driver

2015-06-09 Thread Zhang, Helin
Acked-by: Helin Zhang 

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Wenzhuo Lu
> Sent: Friday, June 5, 2015 1:22 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 00/26] update ixgbe base driver
> 
> Short summary:
> *update copyright and readme
> *fix code comment, double from
> *fix typo error in code comment
> *check return value after calling
> *allow tunneled UDP and TCP frames to reach their destination
> *erase ixgbe_get_hi_status
> *provide unlocked I2C methods
> *reduce I2C retry count on X550 devices
> *issue firmware command when coming up
> *add logic to reset CS4227 when needed
> *restore ESDP settings after MAC reset
> *disable FEC to save power
> *set lan_id for non-PCIe devices
> *add SFP+ dual-speed support
> *add SW based LPLU support
> *fix flow control for KR backplane
> *new simplified x550em init flow
> *move I2C MUX function from ixgbe_x540.c to ixgbe_x550.c
> *change return value for ixgbe_setup_internal_phy_t_x550em
> *ixgbe_setup_internal_phy_x550em function clean-up
> *add x550em Auto neg Flow Control support
> *add x550em PHY interrupt and forced 1G/10G support
> *add link check support for x550em PHY
> *set lan_id before first I2C access
> *added x550em PHY reset function
> *block EEE setup on the interfaces which don't support EEE
> 
> Wenzhuo Lu (26):
>   ixgbe/base: update copyright and readme
>   ixgbe/base: fix code comment, double from
>   ixgbe/base: fix typo error in code comment
>   ixgbe/base: check return value after calling
>   ixgbe/base: allow tunneled UDP and TCP frames to reach their
> destination
>   ixgbe/base: erase ixgbe_get_hi_status
>   ixgbe/base: provide unlocked I2C methods
>   ixgbe/base: reduce I2C retry count on X550 devices
>   ixgbe/base: issue firmware command when coming up
>   ixgbe/base: add logic to reset CS4227 when needed
>   ixgbe/base: restore ESDP settings after MAC reset
>   ixgbe/base: disable FEC(Forward Error Correction) to save power
>   ixgbe/base: set lan_id for non-PCIe devices
>   ixgbe/base: add SFP+ dual-speed support
>   ixgbe/base: add SW based LPLU support
>   ixgbe/base: fix flow control for KR backplane
>   ixgbe/base: new simplified x550em init flow
>   ixgbe/base: move I2C MUX function from ixgbe_x540.c to ixgbe_x550.c
>   ixgbe/base: change return value for ixgbe_setup_internal_phy_t_x550em
>   ixgbe/base: ixgbe_setup_internal_phy_x550em function clean-up
>   ixgbe/base: add x550em Auto neg Flow Control support
>   ixgbe/base: add x550em PHY interrupt and forced 1G/10G support
>   ixgbe/base: add link check support for x550em PHY
>   ixgbe/base: set lan_id before first I2C access
>   ixgbe/base: added x550em PHY reset function
>   ixgbe/base: block EEE(Energy Efficient Ethernet) setup on the
> interfaces that don't support EEE
> 
>  drivers/net/ixgbe/base/README|4 +-
>  drivers/net/ixgbe/base/ixgbe_82598.c |7 +-
>  drivers/net/ixgbe/base/ixgbe_82598.h |2 +-
>  drivers/net/ixgbe/base/ixgbe_82599.c |  191 +-
>  drivers/net/ixgbe/base/ixgbe_82599.h |7 +-
>  drivers/net/ixgbe/base/ixgbe_api.c   |  141 +++-
>  drivers/net/ixgbe/base/ixgbe_api.h   |   16 +-
>  drivers/net/ixgbe/base/ixgbe_common.c|  270 +++-
>  drivers/net/ixgbe/base/ixgbe_common.h|9 +-
>  drivers/net/ixgbe/base/ixgbe_dcb.c   |2 +-
>  drivers/net/ixgbe/base/ixgbe_dcb.h   |2 +-
>  drivers/net/ixgbe/base/ixgbe_dcb_82598.c |2 +-
>  drivers/net/ixgbe/base/ixgbe_dcb_82598.h |2 +-
>  drivers/net/ixgbe/base/ixgbe_dcb_82599.c |2 +-
>  drivers/net/ixgbe/base/ixgbe_dcb_82599.h |2 +-
>  drivers/net/ixgbe/base/ixgbe_mbx.c   |2 +-
>  drivers/net/ixgbe/base/ixgbe_mbx.h   |2 +-
>  drivers/net/ixgbe/base/ixgbe_osdep.h |2 +-
>  drivers/net/ixgbe/base/ixgbe_phy.c   |  215 ++-
>  drivers/net/ixgbe/base/ixgbe_phy.h   |   23 +-
>  drivers/net/ixgbe/base/ixgbe_type.h  |   70 +-
>  drivers/net/ixgbe/base/ixgbe_vf.c|3 +-
>  drivers/net/ixgbe/base/ixgbe_vf.h|2 +-
>  drivers/net/ixgbe/base/ixgbe_x540.c  |   32 +-
>  drivers/net/ixgbe/base/ixgbe_x540.h  |2 +-
>  drivers/net/ixgbe/base/ixgbe_x550.c  | 1029
> ++
>  drivers/net/ixgbe/base/ixgbe_x550.h  |   20 +-
>  27 files changed, 1646 insertions(+), 415 deletions(-)
> 
> --
> 1.9.3



[dpdk-dev] [PATCH] vhost: enable live migration

2015-06-09 Thread Long, Thomas


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Thursday, June 4, 2015 2:00 PM
> To: Xie, Huawei
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] vhost: enable live migration

> 2015-06-01 04:47, Ouyang, Changchun:
> >  From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Huawei Xie
> > > When we migrate VM, without this feature, qemu will report error :
> > > "migrate: Migration disabled: vhost lacks VHOST_F_LOG_ALL feature".
> > 
> > Is this enough for vhost to support  migrate VM?
> > I remember Claire has another patch, possibly need refer to that patch.

> Indeed, there were some patches which do not build:
>   http://dpdk.org/ml/archives/dev/2014-August/005050.html
> And there was no answer.

The log name is incorrect in Claire's patch: "CONFIG" needs to be replaced with 
"VHOST_CONFIG". The patch also only supported migration for vhost_cuse. Claire left 
the org around the same time and the patch was not picked up.



> [...]
> > > + (1ULL << VHOST_F_LOG_ALL))

> Please check if this line is sufficient.

This should be sufficient to enable migration for vhost-user. 

The previous patch with the CONFIG log fix enables migration for vhost-cuse. 
The behavior of qemu with vhost_cuse when migrating is to turn on the migration 
flag before migration regardless of what the backend advertises as being 
supported and to disable it again once migration has been completed. This is 
why Claire's patch ignores the VHOST_F_LOG_ALL setting although I don't think 
this is the right way to implement this.  If the current patch is combined with 
the vhost_net_ioctl modifications of the original patch then it should enable 
migration for both vhost-cuse and vhost-user. 

> Thanks
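
For readers following the feature-bit discussion above, here is a minimal sketch of how a bit such as VHOST_F_LOG_ALL is advertised and checked. The bit number 26 is the value defined in <linux/vhost.h>. The mask name and both helper functions are hypothetical illustrations and are not taken from either patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit position of the dirty-page logging feature, as in <linux/vhost.h>. */
#define VHOST_F_LOG_ALL 26

/* Hypothetical mask for a backend that advertises only logging support. */
#define SKETCH_SUPPORTED_FEATURES (1ULL << VHOST_F_LOG_ALL)

/* Return true when feature bit `bit` is set in the 64-bit mask `features`. */
static inline bool
sketch_feature_is_set(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return ((features >> bit) & 1ULL) != 0;
}

/* Features usable by both sides: the intersection of the two masks. */
static inline uint64_t
sketch_negotiate_features(uint64_t backend, uint64_t requested)
{
	return backend & requested;
}
```

If the backend mask lacks the bit, a check like sketch_feature_is_set() fails on the qemu side, which corresponds to the "vhost lacks VHOST_F_LOG_ALL feature" error quoted earlier in the thread.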


[dpdk-dev] Dpdk 2.0 with vmware-workstation ubunut guest: so many error msg " EAL: Error reading from file descriptor "

2015-06-09 Thread Mo Jia
~/Git/dpdk$ uname -a
Linux engine 3.16.0-31-generic #43-Ubuntu SMP Tue Mar 10 17:37:36 UTC
2015 x86_64 x86_64 x86_64 GNU/Linux

git log last commit:
commit c1715402df8f7fdb2392e12703d5b6f81fd5f447
Author: Helin Zhang 
Date:   Thu Jun 4 14:54:32 2015 +0800

i40evf: fix jumbo frame support



After configuring and compiling, I tested the helloworld example.

1 If I don't bind to igb_uio:

sudo ./examples/helloworld/build/helloworld -c 3 -n 1

EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 2 lcore(s)
EAL: internal_config.no_hugetlbfs : 0
EAL: internal_config.process_type : 0
EAL: internal_config.xen_dom0_support : 0
EAL: VFIO modules not all loaded, skip VFIO support...
EAL: Setting up memory...
EAL: Ask a virtual area of 0x380 bytes
EAL: Virtual area found at 0x7f6808a0 (size = 0x380)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7f680860 (size = 0x20)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7f680820 (size = 0x20)
EAL: Ask a virtual area of 0x340 bytes
EAL: Virtual area found at 0x7f6804c0 (size = 0x340)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7f680480 (size = 0x20)
EAL: Requesting 57 pages of size 2MB from socket 0
EAL: TSC frequency is ~2599680 KHz
EAL: Master lcore 0 is ready (tid=dcf0900;cpuset=[0])
EAL: lcore 1 is ready (tid=47ff700;cpuset=[1])
EAL: PCI device :02:01.0 on NUMA socket -1
EAL:   probe driver: 8086:100f rte_em_pmd
EAL:   Not managed by a supported kernel driver, skipped
EAL: PCI device :02:05.0 on NUMA socket -1
EAL:   probe driver: 8086:100f rte_em_pmd
EAL:   Not managed by a supported kernel driver, skipped
EAL: PCI device :02:06.0 on NUMA socket -1
EAL:   probe driver: 8086:100f rte_em_pmd
EAL:   Not managed by a supported kernel driver, skipped
hello from core 1
hello from core 0

2 After I bind it:

Network devices using DPDK-compatible driver

:02:05.0 '82545EM Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=
:02:06.0 '82545EM Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=

Network devices using kernel driver
===
:02:01.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth0
drv=e1000 unused=igb_uio *Active*

Other network devices
=


engine@engine:~/Git/dpdk$ sudo ./examples/helloworld/build/helloworld -c 3 -n 1
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 2 lcore(s)
EAL: internal_config.no_hugetlbfs : 0
EAL: internal_config.process_type : 0
EAL: internal_config.xen_dom0_support : 0
EAL: VFIO modules not all loaded, skip VFIO support...
EAL: Setting up memory...
EAL: Ask a virtual area of 0x380 bytes
EAL: Virtual area found at 0x7ff64720 (size = 0x380)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7ff646e0 (size = 0x20)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7ff646a0 (size = 0x20)
EAL: Ask a virtual area of 0x340 bytes
EAL: Virtual area found at 0x7ff64340 (size = 0x340)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7ff64300 (size = 0x20)
EAL: Requesting 57 pages of size 2MB from socket 0
EAL: TSC frequency is ~2599681 KHz
EAL: Master lcore 0 is ready (tid=4c5f8900;cpuset=[0])
EAL: lcore 1 is ready (tid=42fff700;cpuset=[1])
EAL: PCI device :02:01.0 on NUMA socket -1
EAL:   probe driver: 8086:100f rte_em_pmd
EAL:   Not managed by a supported kernel driver, skipped
EAL: PCI device :02:05.0 on NUMA socket -1
EAL:   probe driver: 8086:100f rte_em_pmd
EAL:   PCI memory mapped at 0x7ff64aa0
EAL:   PCI memory mapped at 0x7ff64aa2
PMD: eth_em_dev_init(): port_id 0 vendorID=0x8086 deviceID=0x100f
EAL: PCI device :02:06.0 on NUMA socket -1
EAL:   probe driver: 8086:100f rte_em_pmd
EAL:   PCI memory mapped at 0x7ff64aa3
EAL:   PCI memory mapped at 0x7ff64aa5
EAL: Error reading from file descriptor 13: Input/output error
EAL: Error reading from file descriptor 13: Input/output error
EAL: Error reading from file descriptor 13: Input/output error
EAL: Error reading from file descriptor 13: Input/output error
EAL: Error reading from file descriptor 13: Input/output error
EAL: Error reading from file descriptor 13: Input/output error
EAL: Error reading from file descriptor 13: Input/output error
EAL: Error reading from file descriptor 13: Input/output error
EAL: Error reading from file descriptor 13: Input/output error
EAL: Error reading from file descriptor 13: Input/output error
EAL: Error reading from file descriptor 13: Input/output error
EAL: Error reading from file descriptor 13: Input/output error
EAL: Error reading from file descriptor 13: Input/out

[dpdk-dev] Build broken with COMBINE_LIBS=y

2015-06-09 Thread Li Wei
Hi list,

After the drivers separation, I hit the build error below. It seems the
build system builds lib/ first and links it into libintel_dpdk.a, and only
then compiles drivers/, so the symbols in drivers/ never get linked
into libintel_dpdk.a.

I guess we need to make the libintel_dpdk.a target depend on drivers/,
but I'm not familiar with the build system :(

Error messages as follows:

[...]
== Build app
== Build app/test
  CC commands.o
  CC test.o
  CC test_pci.o
  CC test_prefetch.o
  CC test_byteorder.o
  CC test_per_lcore.o
  CC test_atomic.o
  CC test_malloc.o
  CC test_cycles.o
  CC test_spinlock.o
  CC test_memory.o
  CC test_memzone.o
  CC test_ring.o
  CC test_ring_perf.o
  CC test_pmd_perf.o
  CC test_table.o
  CC test_table_pipeline.o
  CC test_table_tables.o
  CC test_table_ports.o
  CC test_table_combined.o
  CC test_table_acl.o
  CC test_rwlock.o
  CC test_timer.o
  CC test_timer_perf.o
  CC test_mempool.o
  CC test_mempool_perf.o
  CC test_mbuf.o
  CC test_logs.o
  CC test_memcpy.o
  CC test_memcpy_perf.o
  CC test_hash.o
  CC test_hash_perf.o
  CC test_lpm.o
  CC test_lpm6.o
  CC test_debug.o
  CC test_errno.o
  CC test_tailq.o
  CC test_string_fns.o
  CC test_cpuflags.o
  CC test_mp_secondary.o
  CC test_eal_flags.o
  CC test_eal_fs.o
  CC test_alarm.o
  CC test_interrupts.o
  CC test_version.o
  CC test_func_reentrancy.o
  CC test_cmdline.o
  CC test_cmdline_num.o
  CC test_cmdline_etheraddr.o
  CC test_cmdline_portlist.o
  CC test_cmdline_ipaddr.o
  CC test_cmdline_cirbuf.o
  CC test_cmdline_string.o
  CC test_cmdline_lib.o
  CC test_red.o
  CC test_sched.o
  CC test_meter.o
  CC test_kni.o
  CC test_power.o
  CC test_power_acpi_cpufreq.o
  CC test_power_kvm_vm.o
  CC test_common.o
  CC test_distributor.o
  CC test_distributor_perf.o
  CC test_reorder.o
  CC test_devargs.o
  CC virtual_pmd.o
  CC packet_burst_generator.o
  CC test_acl.o
  CC test_link_bonding.o
  CC test_link_bonding_mode4.o
  CC test_pmd_ring.o
  CC test_kvargs.o
  LD test
test_link_bonding.o: In function `test_add_slave_to_bonded_device':
test_link_bonding.c:(.text+0x7ca): undefined reference to `rte_eth_bond_slave_add'
test_link_bonding.c:(.text+0x7e2): undefined reference to `rte_eth_bond_slaves_get'
test_link_bonding.c:(.text+0x807): undefined reference to `rte_eth_bond_active_slaves_get'
test_link_bonding.o: In function `test_remove_slave_from_bonded_device':
test_link_bonding.c:(.text+0x8cf): undefined reference to `rte_eth_bond_slave_remove'
test_link_bonding.c:(.text+0x8ed): undefined reference to `rte_eth_bond_slaves_get'
test_link_bonding.o: In function `test_get_slaves_from_bonded_device':
test_link_bonding.c:(.text+0xa1f): undefined reference to `rte_eth_bond_slaves_get'
test_link_bonding.c:(.text+0xa3e): undefined reference to `rte_eth_bond_active_slaves_get'
test_link_bonding.c:(.text+0xa59): undefined reference to `rte_eth_bond_slaves_get'
test_link_bonding.c:(.text+0xa79): undefined reference to `rte_eth_bond_active_slaves_get'
test_link_bonding.c:(.text+0xa90): undefined reference to `rte_eth_bond_slaves_get'
test_link_bonding.c:(.text+0xaac): undefined reference to `rte_eth_bond_active_slaves_get'
test_link_bonding.o: In function `test_set_bonded_port_initialization_mac_assignment':
test_link_bonding.c:(.text+0xbd0): undefined reference to `rte_eth_bond_create'
test_link_bonding.c:(.text+0xc7e): undefined reference to `rte_eth_bond_slave_add'
test_link_bonding.c:(.text+0xca0): undefined reference to `rte_eth_bond_slaves_get'
test_link_bonding.c:(.text+0xcbf): undefined reference to `rte_eth_bond_mac_address_set'
test_link_bonding.c:(.text+0xddb): undefined reference to `rte_eth_bond_primary_set'
test_link_bonding.c:(.text+0xed3): undefined reference to `rte_eth_bond_slave_remove'
test_link_bonding.c:(.text+0xef5): undefined reference to `rte_eth_bond_slaves_get'
test_link_bonding.o: In function `test_add_already_bonded_slave_to_bonded_device':
test_link_bonding.c:(.text+0x11fa): undefined reference to `rte_eth_bond_slaves_get'
test_link_bonding.c:(.text+0x1244): undefined reference to `rte_eth_bond_create'
test_link_bonding.c:(.text+0x1269): undefined reference to `rte_eth_bond_slave_add'
test_link_bonding.o: In function `test_create_bonded_device_with_invalid_params':
test_link_bonding.c:(.text+0x131d): undefined reference to `rte_eth_bond_create'
test_link_bonding.c:(.text+0x1341): undefined reference to `rte_eth_bond_create'
test_link_bonding.c:(.text+0x1362): undefined reference to `rte_eth_bond_create'
test_link_bonding.o: In function `test_create_bonded_device':
test_link_bonding.c:(.text+0x13e9): undefined reference to `rte_eth_bond_mode_set'
test_link_bonding.c:(.text+0x1405): undefined reference to `rte_eth_bond_slaves_get'
test_link_bonding.c:(.text+0x1421): undefined reference to `rte_eth_bond_active_slaves_get'
test_link_bonding.c:(.text+0x144d): undefined reference to `rte_eth_bond_create'
test_link_bonding.o: In function `test_stop_bonded_device':

[dpdk-dev] [PATCH] vhost: flush used->idx update before reading avail->flags

2015-06-09 Thread Linhaifeng


On 2015/4/24 15:27, Luke Gorrie wrote:
> On 24 April 2015 at 03:01, Linhaifeng  wrote:
> 
>> If not add memory fence what would happen? Packets loss or interrupt
>> loss?How to test it ?
>>
> 
> You should be able to test it like this:
> 
> 1. Boot two Linux kernel (e.g. 3.13) guests.
> 2. Connect them via vhost switch.
> 3. Run continuous traffic between them (e.g. iperf).
> 
> I would expect that within a reasonable timeframe (< 1 hour) one of the
> guests' network interfaces will hang indefinitely due to a missed interrupt.
> 
> You won't be able to reproduce this using DPDK guests because they are not
> using the same interrupt suppression method.
> 
> This is a serious real-world problem. I wouldn't deploy the vhost
> implementation without this fix.
> 
> Cheers,
> -Luke
> 

I think this patch can't resolve the problem; we would still miss 
interrupts.

After adding rte_mb(), the case we want is:
1. write used->idx; ring is full or empty.
2. virtio_net enables interrupts.
3. read avail->flags.

But this case (missed interrupt) can still happen:
1. write used->idx; ring is full or empty.
2. read avail->flags.
3. virtio_net enables interrupts.




[dpdk-dev] [PATCH 1/3] fm10k: update VLAN filter

2015-06-09 Thread He, Shaopeng
> -Original Message-
> From: Chen, Jing D
> Sent: Tuesday, June 09, 2015 10:54 AM
> To: He, Shaopeng; dev at dpdk.org
> Cc: Qiu, Michael
> Subject: RE: [PATCH 1/3] fm10k: update VLAN filter
> 
> Hi,
> 
> > -Original Message-
> > From: He, Shaopeng
> > Sent: Tuesday, June 02, 2015 10:59 AM
> > To: dev at dpdk.org
> > Cc: Chen, Jing D; Qiu, Michael; He, Shaopeng
> > Subject: [PATCH 1/3] fm10k: update VLAN filter
> >
> > VLAN filter was updated to add/delete one static entry in MAC table
> > for each combination of VLAN and MAC address. More sanity checks were
> added.
> >
> > Signed-off-by: Shaopeng He 
> > ---
> >  drivers/net/fm10k/fm10k.h| 23 +
> >  drivers/net/fm10k/fm10k_ethdev.c | 55
> > +---
> >  2 files changed, 75 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
> > index ad7a7d1..3b95b72 100644
> > --- a/drivers/net/fm10k/fm10k.h
> > +++ b/drivers/net/fm10k/fm10k.h
> > @@ -109,11 +109,31 @@
> >
> >  #define FM10K_VLAN_TAG_SIZE 4
> >
> > +/* Maximum number of MAC addresses per PF/VF */
> > +#define FM10K_MAX_MACADDR_NUM   1
> > +
> > +#define FM10K_UINT32_BIT_SIZE  (CHAR_BIT * sizeof(uint32_t))
> > +#define FM10K_VFTA_SIZE(4096 / FM10K_UINT32_BIT_SIZE)
> > +
> > +/* vlan_id is a 12 bit number.
> > + * The VFTA array is actually a 4096 bit array, 128 of 32bit elements.
> > + * 2^5 = 32. The val of lower 5 bits specifies the bit in the 32bit 
> > element.
> > + * The higher 7 bit val specifies VFTA array index.
> > + */
> > +#define FM10K_VFTA_BIT(vlan_id)(1 << ((vlan_id) & 0x1F))
> > +#define FM10K_VFTA_IDX(vlan_id)((vlan_id) >> 5)
> > +
> > +struct fm10k_macvlan_filter_info {
> > +   uint16_t vlan_num;   /* Total VLAN number */
> > +   uint32_t vfta[FM10K_VFTA_SIZE];/* VLAN bitmap */
> > +};
> > +
> >  struct fm10k_dev_info {
> > volatile uint32_t enable;
> > volatile uint32_t glort;
> > /* Protect the mailbox to avoid race condition */
> > rte_spinlock_tmbx_lock;
> > +   struct fm10k_macvlan_filter_infomacvlan;
> >  };
> >
> >  /*
> > @@ -137,6 +157,9 @@ struct fm10k_adapter {  #define
> > FM10K_DEV_PRIVATE_TO_MBXLOCK(adapter) \
> > (&(((struct fm10k_adapter *)adapter)->info.mbx_lock))
> >
> > +#define FM10K_DEV_PRIVATE_TO_MACVLAN(adapter) \
> > +   (&(((struct fm10k_adapter *)adapter)->info.macvlan))
> > +
> >  struct fm10k_rx_queue {
> > struct rte_mempool *mp;
> > struct rte_mbuf **sw_ring;
> > diff --git a/drivers/net/fm10k/fm10k_ethdev.c
> > b/drivers/net/fm10k/fm10k_ethdev.c
> > index 3a26480..d2f3e44 100644
> > --- a/drivers/net/fm10k/fm10k_ethdev.c
> > +++ b/drivers/net/fm10k/fm10k_ethdev.c
> > @@ -819,15 +819,61 @@ fm10k_dev_infos_get(struct rte_eth_dev *dev,
> > static int  fm10k_vlan_filter_set(struct rte_eth_dev *dev, uint16_t
> > vlan_id, int on)  {
> > -   struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data-
> > >dev_private);
> > +   s32 result;
> > +   uint32_t vid_idx, vid_bit, mac_index;
> > +   struct fm10k_hw *hw;
> > +   struct fm10k_macvlan_filter_info *macvlan;
> > +   struct rte_eth_dev_data *data = dev->data;
> >
> > -   PMD_INIT_FUNC_TRACE();
> > +   hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> > +   macvlan = FM10K_DEV_PRIVATE_TO_MACVLAN(dev->data-
> > >dev_private);
> >
> > /* @todo - add support for the VF */
> > if (hw->mac.type != fm10k_mac_pf)
> > return -ENOTSUP;
> >
> > -   return fm10k_update_vlan(hw, vlan_id, 0, on);
> > +   if (vlan_id > ETH_VLAN_ID_MAX) {
> > +   PMD_INIT_LOG(ERR, "Invalid vlan_id: must be < 4096");
> > +   return (-EINVAL);
> > +   }
> > +
> > +   vid_idx = FM10K_VFTA_IDX(vlan_id);
> > +   vid_bit = FM10K_VFTA_BIT(vlan_id);
> > +   /* this VLAN ID is already in the VLAN filter table, return SUCCESS */
> > +   if (on && (macvlan->vfta[vid_idx] & vid_bit))
> > +   return 0;
> > +   /* this VLAN ID is NOT in the VLAN filter table, cannot remove */
> > +   if (!on && !(macvlan->vfta[vid_idx] & vid_bit)) {
> > +   PMD_INIT_LOG(ERR, "Invalid vlan_id: not existing "
> > +   "in the VLAN filter table");
> > +   return (-EINVAL);
> > +   }
> > +
> > +   fm10k_mbx_lock(hw);
> > +   result = fm10k_update_vlan(hw, vlan_id, 0, on);
> > +   if (FM10K_SUCCESS == result) {
> > +   if (on) {
> > +   macvlan->vlan_num++;
> > +   macvlan->vfta[vid_idx] |= vid_bit;
> > +   } else {
> > +   macvlan->vlan_num--;
> > +   macvlan->vfta[vid_idx] &= ~vid_bit;
> > +   }
> > +
> > +   for (mac_index = 0; mac_index <
> > FM10K_MAX_MACADDR_NUM;
> > +   mac_index++) {
> > +   if (is_zero_ether_addr(&data-
> > >mac_addrs[mac_index]))
> > +   continue;
> > +   fm10k_update_uc_addr(hw, hw->mac.dglort_ma

[dpdk-dev] [PATCH] vhost: flush used->idx update before reading avail->flags

2015-06-09 Thread Luke Gorrie
On 9 June 2015 at 09:04, Linhaifeng  wrote:

> On 2015/4/24 15:27, Luke Gorrie wrote:
> > You should be able to test it like this:
> >
> > 1. Boot two Linux kernel (e.g. 3.13) guests.
> > 2. Connect them via vhost switch.
> > 3. Run continuous traffic between them (e.g. iperf).
> >
> > I would expect that within a reasonable timeframe (< 1 hour) one of the
> > guests' network interfaces will hang indefinitely due to a missed
> interrupt.
> >
> > You won't be able to reproduce this using DPDK guests because they are
> not
> > using the same interrupt suppression method.
>
> I think this patch can't resolve the problem; we would still miss
> interrupts.
>

For what it is worth, we were able to reproduce the problem as described
above with older Snabb Switch releases and we were also able to verify that
inserting a memory barrier fixes this problem.

This is the relevant commit in the snabbswitch repo for reference:
https://github.com/SnabbCo/snabbswitch/commit/c33cdd8704246887e11d7c353f773f7b488a47f2

In a nutshell, we added an MFENCE instruction after writing used->idx and
before checking VRING_F_NO_INTERRUPT.

I have not tested this case under DPDK myself and so I am not really
certain which memory barrier operations are sufficient/insufficient in that
context. I hope that our experience is relevant/helpful though and I am
happy to explain more about that if I have missed any important details.

Cheers,
-Luke
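
The ordering described above (write used->idx, full barrier, then read the interrupt-suppression flag) can be sketched with C11 atomics. This is an illustration under stated assumptions, not the Snabb Switch or DPDK code: the struct and function names are hypothetical, the ring is reduced to the two fields involved, and the flag is given the name and value (VRING_AVAIL_F_NO_INTERRUPT, 1) used by the virtio ring headers. A sequentially consistent fence is used where the message mentions MFENCE.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag the guest sets in avail->flags to suppress interrupts. */
#define VRING_AVAIL_F_NO_INTERRUPT 1

/* Hypothetical, reduced view of the shared ring fields in the race. */
struct sketch_host_ring {
	_Atomic uint16_t used_idx;    /* written by host, read by guest */
	_Atomic uint16_t avail_flags; /* written by guest, read by host */
};

/*
 * Publish a new used index, then decide whether to interrupt the guest.
 * The full fence keeps the used_idx store from being reordered after the
 * avail_flags load (store-load ordering).
 * Returns true when the guest should be interrupted.
 */
static inline bool
sketch_host_publish_used(struct sketch_host_ring *vr, uint16_t new_used_idx)
{
	uint16_t flags;

	assert(vr != NULL);
	atomic_store_explicit(&vr->used_idx, new_used_idx, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	flags = atomic_load_explicit(&vr->avail_flags, memory_order_relaxed);
	return !(flags & VRING_AVAIL_F_NO_INTERRUPT);
}
```

The fence only helps if the guest uses a matching barrier between its own write of avail->flags and its read of used->idx.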


[dpdk-dev] Build broken with COMBINE_LIBS=y

2015-06-09 Thread Gonzalez Monroy, Sergio
On 09/06/2015 07:31, Li Wei wrote:
> Hi list,
>
> After the drivers separation, I hit the build error below. It seems the
> build system builds lib/ first and links it into libintel_dpdk.a, and only
> then compiles drivers/, so the symbols in drivers/ never get linked
> into libintel_dpdk.a.
>
> I guess we need to make the libintel_dpdk.a target depend on drivers/,
> but I'm not familiar with the build system :(
>
>
That is exactly the issue.
Working on a patch.

Sergio


[dpdk-dev] [PATCH 2/3] fm10k: add MAC filter

2015-06-09 Thread He, Shaopeng
> -Original Message-
> From: Chen, Jing D
> Sent: Tuesday, June 09, 2015 11:25 AM
> To: He, Shaopeng; dev at dpdk.org
> Cc: Qiu, Michael
> Subject: RE: [PATCH 2/3] fm10k: add MAC filter
> 
> Hi,
> 
> > -Original Message-
> > From: He, Shaopeng
> > Sent: Tuesday, June 02, 2015 10:59 AM
> > To: dev at dpdk.org
> > Cc: Chen, Jing D; Qiu, Michael; He, Shaopeng
> > Subject: [PATCH 2/3] fm10k: add MAC filter
> >
> > MAC filter function was newly added, each PF and VF can have up to 64
> > MAC addresses. VF filter needs support from PF host, which is not available
> now.
> >
> > Signed-off-by: Shaopeng He 
> > ---
> >  drivers/net/fm10k/fm10k.h|  3 +-
> >  drivers/net/fm10k/fm10k_ethdev.c | 90
> > 
> >  2 files changed, 85 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
> > index 3b95b72..f5be5f8 100644
> > --- a/drivers/net/fm10k/fm10k.h
> > +++ b/drivers/net/fm10k/fm10k.h
> > @@ -110,7 +110,7 @@
> >  #define FM10K_VLAN_TAG_SIZE 4
> >
> >  /* Maximum number of MAC addresses per PF/VF */
> > -#define FM10K_MAX_MACADDR_NUM   1
> > +#define FM10K_MAX_MACADDR_NUM   64
> >
> >  #define FM10K_UINT32_BIT_SIZE  (CHAR_BIT * sizeof(uint32_t))
> >  #define FM10K_VFTA_SIZE(4096 / FM10K_UINT32_BIT_SIZE)
> > @@ -125,6 +125,7 @@
> >
> >  struct fm10k_macvlan_filter_info {
> > uint16_t vlan_num;   /* Total VLAN number */
> > +   uint16_t mac_num;/* Total mac number */
> > uint32_t vfta[FM10K_VFTA_SIZE];/* VLAN bitmap */
> >  };
> >
> > diff --git a/drivers/net/fm10k/fm10k_ethdev.c
> > b/drivers/net/fm10k/fm10k_ethdev.c
> > index d2f3e44..4f23bf1 100644
> > --- a/drivers/net/fm10k/fm10k_ethdev.c
> > +++ b/drivers/net/fm10k/fm10k_ethdev.c
> > @@ -54,6 +54,10 @@
> >  #define BIT_MASK_PER_UINT32 ((1 << CHARS_PER_UINT32) - 1)
> >
> >  static void fm10k_close_mbx_service(struct fm10k_hw *hw);
> > +static int
> > +fm10k_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int
> > +on); static void fm10k_MAC_filter_set(struct rte_eth_dev *dev, const
> > +u8 *mac, bool add);
> >
> >  static void
> >  fm10k_mbx_initlock(struct fm10k_hw *hw) @@ -668,14 +672,11 @@
> > fm10k_dev_start(struct rte_eth_dev *dev)
> > }
> >
> > if (hw->mac.default_vid && hw->mac.default_vid <=
> > ETHER_MAX_VLAN_ID) {
> > -   fm10k_mbx_lock(hw);
> > /* Update default vlan */
> > -   hw->mac.ops.update_vlan(hw, hw->mac.default_vid, 0,
> > true);
> > +   fm10k_vlan_filter_set(dev, hw->mac.default_vid, true);
> >
> > /* Add default mac/vlan filter to PF/Switch manger */
> > -   hw->mac.ops.update_uc_addr(hw, hw->mac.dglort_map,
> > hw->mac.addr,
> > -   hw->mac.default_vid, true, 0);
> > -   fm10k_mbx_unlock(hw);
> > +   fm10k_MAC_filter_set(dev, hw->mac.addr, true);
> > }
> >
> > return 0;
> > @@ -781,7 +782,7 @@ fm10k_dev_infos_get(struct rte_eth_dev *dev,
> > dev_info->max_rx_pktlen  = FM10K_MAX_PKT_SIZE;
> > dev_info->max_rx_queues  = hw->mac.max_queues;
> > dev_info->max_tx_queues  = hw->mac.max_queues;
> > -   dev_info->max_mac_addrs  = 1;
> > +   dev_info->max_mac_addrs  = FM10K_MAX_MACADDR_NUM;
> > dev_info->max_hash_mac_addrs = 0;
> > dev_info->max_vfs= FM10K_MAX_VF_NUM;
> > dev_info->max_vmdq_pools = ETH_64_POOLS;
> > @@ -820,6 +821,7 @@ static int
> >  fm10k_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int
> > on)  {
> > s32 result;
> > +   uint16_t mac_num = 0;
> > uint32_t vid_idx, vid_bit, mac_index;
> > struct fm10k_hw *hw;
> > struct fm10k_macvlan_filter_info *macvlan; @@ -864,9 +866,15 @@
> > fm10k_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int
> > on)
> > mac_index++) {
> > if (is_zero_ether_addr(&data-
> > >mac_addrs[mac_index]))
> > continue;
> > +   if (mac_num > macvlan->mac_num - 1) {
> > +   PMD_INIT_LOG(ERR, "MAC address number
> > "
> > +   "not match");
> > +   break;
> > +   }
> > fm10k_update_uc_addr(hw, hw->mac.dglort_map,
> > data->mac_addrs[mac_index].addr_bytes,
> > vlan_id, on, 0);
> > +   mac_num++;
> > }
> > }
> > fm10k_mbx_unlock(hw);
> > @@ -876,6 +884,71 @@ fm10k_vlan_filter_set(struct rte_eth_dev *dev,
> > uint16_t vlan_id, int on)
> > return (-EIO);
> >  }
> >
> > +/* Add/Remove a MAC address, and update filters */ static void
> > +fm10k_MAC_filter_set(struct rte_eth_dev *dev, const u8 *mac, bool
> > +add) {
> > +   uint32_t i, j, k;
> > +   struct fm10k_hw *hw;
> > +   struct fm10k_macvlan_filter_info *ma

[dpdk-dev] [PATCH] vhost: flush used->idx update before reading avail->flags

2015-06-09 Thread Michael S. Tsirkin
On Tue, Jun 09, 2015 at 03:04:02PM +0800, Linhaifeng wrote:
> 
> 
> On 2015/4/24 15:27, Luke Gorrie wrote:
> > On 24 April 2015 at 03:01, Linhaifeng  wrote:
> > 
> >> If not add memory fence what would happen? Packets loss or interrupt
> >> loss?How to test it ?
> >>
> > 
> > You should be able to test it like this:
> > 
> > 1. Boot two Linux kernel (e.g. 3.13) guests.
> > 2. Connect them via vhost switch.
> > 3. Run continuous traffic between them (e.g. iperf).
> > 
> > I would expect that within a reasonable timeframe (< 1 hour) one of the
> > guests' network interfaces will hang indefinitely due to a missed interrupt.
> > 
> > You won't be able to reproduce this using DPDK guests because they are not
> > using the same interrupt suppression method.
> > 
> > This is a serious real-world problem. I wouldn't deploy the vhost
> > implementation without this fix.
> > 
> > Cheers,
> > -Luke
> > 
> 
> I think this patch can't resolve the problem; we would still miss 
> interrupts.
> 
> After adding rte_mb(), the case we want is:
> 1. write used->idx; ring is full or empty.
> 2. virtio_net enables interrupts.
> 3. read avail->flags.
> 
> But this case (missed interrupt) can still happen:
> 1. write used->idx; ring is full or empty.
> 2. read avail->flags.
> 3. virtio_net enables interrupts.
> 

That's why a correct guest, after detecting an empty used ring, must always
re-check used idx at least once after writing avail->flags.

By the way, similarly, host side must re-check avail idx after writing
used flags. I don't see where snabbswitch does it - is that a bug
in snabbswitch?

-- 
MST
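
The guest-side rule stated above (re-check used idx at least once after writing avail->flags) can be sketched as follows. This is a hedged illustration with hypothetical names, not the virtio_net driver code. The ring is reduced to two fields, and writing 0 to avail_flags stands for clearing the interrupt-suppression flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced guest view of the shared ring fields. */
struct sketch_guest_ring {
	_Atomic uint16_t used_idx;    /* written by host, read by guest */
	_Atomic uint16_t avail_flags; /* written by guest, read by host */
};

/*
 * Re-enable interrupts after the guest found the used ring empty.
 * last_seen_used is the used index the guest has consumed up to.
 * Returns true when it is safe to wait for an interrupt, false when the
 * host published entries in the meantime and the guest must keep polling.
 */
static inline bool
sketch_guest_enable_interrupts(struct sketch_guest_ring *vr,
			       uint16_t last_seen_used)
{
	assert(vr != NULL);
	/* Clear the suppression flag: the host may interrupt from now on. */
	atomic_store_explicit(&vr->avail_flags, 0, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	/* Re-check: entries published before the flag write raised no interrupt. */
	return atomic_load_explicit(&vr->used_idx, memory_order_relaxed)
	       == last_seen_used;
}
```

With this re-check, the second interleaving from the quoted message no longer loses work: the host may skip the interrupt, but the guest then sees the updated used index and processes the entries itself.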


[dpdk-dev] [PATCH 3/3] fm10k: update VLAN offload features

2015-06-09 Thread He, Shaopeng
> -Original Message-
> From: Chen, Jing D
> Sent: Tuesday, June 09, 2015 11:27 AM
> To: He, Shaopeng; dev at dpdk.org
> Cc: Qiu, Michael
> Subject: RE: [PATCH 3/3] fm10k: update VLAN offload features
> 
> Hi,
> 
> 
> > -Original Message-
> > From: He, Shaopeng
> > Sent: Tuesday, June 02, 2015 10:59 AM
> > To: dev at dpdk.org
> > Cc: Chen, Jing D; Qiu, Michael; He, Shaopeng
> > Subject: [PATCH 3/3] fm10k: update VLAN offload features
> >
> > Fm10k PF/VF does not support QinQ; VLAN strip and filter are always on
> > for PF/VF ports.
> >
> > Signed-off-by: Shaopeng He 
> > ---
> >  drivers/net/fm10k/fm10k_ethdev.c | 22 ++
> >  1 file changed, 22 insertions(+)
> >
> > diff --git a/drivers/net/fm10k/fm10k_ethdev.c
> > b/drivers/net/fm10k/fm10k_ethdev.c
> > index 4f23bf1..9b198a7 100644
> > --- a/drivers/net/fm10k/fm10k_ethdev.c
> > +++ b/drivers/net/fm10k/fm10k_ethdev.c
> > @@ -884,6 +884,27 @@ fm10k_vlan_filter_set(struct rte_eth_dev *dev,
> > uint16_t vlan_id, int on)
> > return (-EIO);
> >  }
> >
> > +static void
> > +fm10k_vlan_offload_set(__rte_unused struct rte_eth_dev *dev, int
> > +mask) {
> > +   if (mask & ETH_VLAN_STRIP_MASK) {
> > +   if (!dev->data->dev_conf.rxmode.hw_vlan_strip)
> > +   PMD_INIT_LOG(ERR, "VLAN stripping is "
> > +   "always on in fm10k");
> > +   }
> > +
> > +   if (mask & ETH_VLAN_EXTEND_MASK) {
> > +   if (dev->data->dev_conf.rxmode.hw_vlan_extend)
> > +   PMD_INIT_LOG(ERR, "VLAN QinQ is not "
> > +   "supported in fm10k");
> > +   }
> > +
> > +   if (mask & ETH_VLAN_FILTER_MASK) {
> > +   if (!dev->data->dev_conf.rxmode.hw_vlan_filter)
> > +   PMD_INIT_LOG(ERR, "VLAN filter is always on in
> > fm10k");
> > +   }
> > +}
> > +
> 
> Update fm10k_dev_infos_get() to configure above options to expected
> values?
Thank you for the reminder. I will update the values of rx_offload_capa 
and tx_offload_capa in fm10k_dev_infos_get() in the next version.


[dpdk-dev] [PATCH v3 00/10] Add a VXLAN sample

2015-06-09 Thread Liu, Yong
Tested-by: Yong Liu 

- Tested Commit: c1715402df8f7fdb2392e12703d5b6f81fd5f447
- OS: Fedora20 3.15.5
- GCC: gcc version 4.8.3 20140911
- CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
- NIC: Intel Corporation Device XL710 [8086:1584] Firmware 4.33
- Default x86_64-native-linuxapp-gcc configuration
- Prerequisites: set up the dpdk vhost-user running environment and
allocate enough hugepages for both the vxlan sample and the virtual machine
- Total 5 cases, 5 passed, 0 failed

- Prerequisites command / instruction:
  Update qemu-system-x86_64 to version 2.2.0, which supports hugepage based memory
  Prepare the modules required by vhost-user
modprobe fuse
modprobe cuse
insmod lib/librte_vhost/eventfd_link/eventfd_link.ko
  Allocate 4096*2M hugepages for vm and dpdk
echo 4096 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

- Case: vxlan_sample_encap
  Description: check that the vxlan sample encap function works
  Command / instruction:
Start vxlan sample with only encapsulation enabled
  tep_termination -c 0xf -n 3 --socket-mem 2048,2048 -- -p 0x1 \
--udp-port 4789 --nb-devices 2 --filter-type 3 --tx-checksum 0 \
--encap 1 --decap 0
Wait for the vhost-net socket device to be created and this message to be dumped.
  VHOST_CONFIG: bind to vhost-net
Start virtual machine with hugepage based memory and two vhost-user devices
  qemu-system-x86_64 -name vm0 -enable-kvm -daemonize \
-cpu host -smp 4 -m 4096 \
-object memory-backend-file,id=mem,size=4096M,mem-path=/mnt/huge,share=on \
-numa node,memdev=mem -mem-prealloc \
-chardev socket,id=char0,path=./dpdk/vhost-net \
-netdev type=vhost-user,id=netdev0,chardev=char0,vhostforce \
-device virtio-net-pci,netdev=netdev0,mac=00:00:20:00:00:20 \
-chardev socket,id=char1,path=./dpdk/vhost-net \
-netdev type=vhost-user,id=netdev1,chardev=char1,vhostforce \
-device virtio-net-pci,netdev=netdev1,mac=00:00:20:00:00:21 \
-drive file=/storage/vm-image/vm0.img -vnc :1
Log in to the virtual machine and start testpmd with additional arguments
  testpmd -c f -n 3 -- -i --txqflags=0xf00 --disable-hw-vlan
Start packet forwarding in testpmd and transmit several packets for mac learning
  testpmd> set fwd mac
  testpmd> start tx_first
Make sure the virtIO ports registered normally.
  VHOST_CONFIG: virtio is now ready for processing.
  VHOST_DATA: (1) Device has been added to data core 56
  VHOST_DATA: (1) MAC_ADDRESS 00:00:20:00:00:21 and VNI 1000 registered
  VHOST_DATA: (0) MAC_ADDRESS 00:00:20:00:00:20 and VNI 1000 registered
Send a normal udp packet to the PF device, with the packet dmac matching the PF device
Verify the packet has been received on virtIO port0 and forwarded by port1
  testpmd> show port stats all
Verify encapsulated packet received on PF device

- Case: vxlan_sample_decap
  Description: check that the vxlan sample decap function works
  Command / instruction:
Start vxlan sample with only de-capsulation enabled
  tep_termination -c 0xf -n 3 --socket-mem 2048,2048 -- -p 0x1 \
--udp-port 4789 --nb-devices 2 --filter-type 3 --tx-checksum 0 \
--encap 0 --decap 1
Start vhost-user test environment like case vxlan_sample_encap
Send vxlan packet Ether(dst=PF mac)/IP/UDP/vni(1000)/
  Ether(dst=virtIO port0)/IP/UDP to PF device
Verify that the packet is received by virtIO port0 and forwarded by virtIO port1.
  testpmd> show port stats all
Verify that the PF received a packet identical to the inner packet
Send vxlan packet Ether(dst=PF mac)/IP/UDP/vni(1000)/
  Ether(dst=virtIO port1)/IP/UDP to PF device
Verify that the packet is received by virtIO port1 and forwarded by virtIO port0.
  testpmd> show port stats all
Make sure the PF received the inner packet with its mac addresses reversed.

- Case: vxlan_sample_encap_and_decap
  Description: check that vxlan sample decap and encap work at the same time
  Command / instruction:
Start vxlan sample with both encapsulation and de-capsulation enabled
  tep_termination -c 0xf -n 3 --socket-mem 2048,2048 -- -p 0x1 \
--udp-port 4789 --nb-devices 2 --filter-type 3 --tx-checksum 0 \
--encap 1 --decap 1
Start vhost-user test environment like case vxlan_sample_encap
Send vxlan packet Ether(dst=PF mac)/IP/UDP/vni(1000)/
  Ether(dst=virtIO port0)/IP/UDP to PF device
Verify that packet received by virtIO port0 and forwarded by virtIO 
port1.
  testpmd> show port stats all
Verify encapsulated packet received on PF device.
Verify that the inner packet's src and dst MAC addresses have been swapped.

- Case: vxlan_sample_chksum 
  Description: check vxlan sample transmit checksum work fine
  Command / instruction:
Start vxlan sample with only decapsulation enabled
  tep_termination -c 0xf -n 3 --socket-mem 2048,2048 -- -p 0x1 \
--ud

[dpdk-dev] Dpdk 2.0 with vmware-workstation ubuntu guest: so many error msg " EAL: Error reading from file descriptor "

2015-06-09 Thread Mo Jia
Hi, has anybody else met this error? You can modify the vmx file and change
the network device from "e1000" to "vmxnet3"; the error then goes away.
I don't understand the underlying reason, but it works now.

2015-06-09 12:42 GMT+08:00 Mo Jia :
> ~/Git/dpdk$ uname -a
> Linux engine 3.16.0-31-generic #43-Ubuntu SMP Tue Mar 10 17:37:36 UTC
> 2015 x86_64 x86_64 x86_64 GNU/Linux
>
> git log last commit:
> commit c1715402df8f7fdb2392e12703d5b6f81fd5f447
> Author: Helin Zhang 
> Date:   Thu Jun 4 14:54:32 2015 +0800
>
> i40evf: fix jumbo frame support
>
>
>
> After config and compile then test the helloworld.
>
> 1 If I don't bind to igb_uio:
>
> sudo ./examples/helloworld/build/helloworld -c 3 -n 1
>
> EAL: Detected lcore 0 as core 0 on socket 0
> EAL: Detected lcore 1 as core 1 on socket 0
> EAL: Support maximum 128 logical core(s) by configuration.
> EAL: Detected 2 lcore(s)
> EAL: internal_config.no_hugetlbfs : 0 EAL:
> internal_config.process_type : 0 EAL: internal_config.xen_dom0_support
> : 0EAL: VFIO modules not all loaded, skip VFIO support...
> EAL: Setting up memory...
> EAL: Ask a virtual area of 0x380 bytes
> EAL: Virtual area found at 0x7f6808a0 (size = 0x380)
> EAL: Ask a virtual area of 0x20 bytes
> EAL: Virtual area found at 0x7f680860 (size = 0x20)
> EAL: Ask a virtual area of 0x20 bytes
> EAL: Virtual area found at 0x7f680820 (size = 0x20)
> EAL: Ask a virtual area of 0x340 bytes
> EAL: Virtual area found at 0x7f6804c0 (size = 0x340)
> EAL: Ask a virtual area of 0x20 bytes
> EAL: Virtual area found at 0x7f680480 (size = 0x20)
> EAL: Requesting 57 pages of size 2MB from socket 0
> EAL: TSC frequency is ~2599680 KHz
> EAL: Master lcore 0 is ready (tid=dcf0900;cpuset=[0])
> EAL: lcore 1 is ready (tid=47ff700;cpuset=[1])
> EAL: PCI device :02:01.0 on NUMA socket -1
> EAL:   probe driver: 8086:100f rte_em_pmd
> EAL:   Not managed by a supported kernel driver, skipped
> EAL: PCI device :02:05.0 on NUMA socket -1
> EAL:   probe driver: 8086:100f rte_em_pmd
> EAL:   Not managed by a supported kernel driver, skipped
> EAL: PCI device :02:06.0 on NUMA socket -1
> EAL:   probe driver: 8086:100f rte_em_pmd
> EAL:   Not managed by a supported kernel driver, skipped
> hello from core 1
> hello from core 0
>
> 2 after I bind it
>
> Network devices using DPDK-compatible driver
> 
> :02:05.0 '82545EM Gigabit Ethernet Controller (Copper)' drv=igb_uio 
> unused=
> :02:06.0 '82545EM Gigabit Ethernet Controller (Copper)' drv=igb_uio 
> unused=
>
> Network devices using kernel driver
> ===
> :02:01.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth0
> drv=e1000 unused=igb_uio *Active*
>
> Other network devices
> =
> 
>
> engine at engine:~/Git/dpdk$ sudo ./examples/helloworld/build/helloworld -c 3 
> -n 1
> EAL: Detected lcore 0 as core 0 on socket 0
> EAL: Detected lcore 1 as core 1 on socket 0
> EAL: Support maximum 128 logical core(s) by configuration.
> EAL: Detected 2 lcore(s)
> EAL: internal_config.no_hugetlbfs : 0 EAL:
> internal_config.process_type : 0 EAL: internal_config.xen_dom0_support
> : 0EAL: VFIO modules not all loaded, skip VFIO support...
> EAL: Setting up memory...
> EAL: Ask a virtual area of 0x380 bytes
> EAL: Virtual area found at 0x7ff64720 (size = 0x380)
> EAL: Ask a virtual area of 0x20 bytes
> EAL: Virtual area found at 0x7ff646e0 (size = 0x20)
> EAL: Ask a virtual area of 0x20 bytes
> EAL: Virtual area found at 0x7ff646a0 (size = 0x20)
> EAL: Ask a virtual area of 0x340 bytes
> EAL: Virtual area found at 0x7ff64340 (size = 0x340)
> EAL: Ask a virtual area of 0x20 bytes
> EAL: Virtual area found at 0x7ff64300 (size = 0x20)
> EAL: Requesting 57 pages of size 2MB from socket 0
> EAL: TSC frequency is ~2599681 KHz
> EAL: Master lcore 0 is ready (tid=4c5f8900;cpuset=[0])
> EAL: lcore 1 is ready (tid=42fff700;cpuset=[1])
> EAL: PCI device :02:01.0 on NUMA socket -1
> EAL:   probe driver: 8086:100f rte_em_pmd
> EAL:   Not managed by a supported kernel driver, skipped
> EAL: PCI device :02:05.0 on NUMA socket -1
> EAL:   probe driver: 8086:100f rte_em_pmd
> EAL:   PCI memory mapped at 0x7ff64aa0
> EAL:   PCI memory mapped at 0x7ff64aa2
> PMD: eth_em_dev_init(): port_id 0 vendorID=0x8086 deviceID=0x100f
> EAL: PCI device :02:06.0 on NUMA socket -1
> EAL:   probe driver: 8086:100f rte_em_pmd
> EAL:   PCI memory mapped at 0x7ff64aa3
> EAL:   PCI memory mapped at 0x7ff64aa5
> EAL: Error reading from file descriptor 13: Input/output error
> EAL: Error reading from file descriptor 13: Input/output error
> EAL: Error reading from file descriptor 13: Input/output error
> EAL: Error reading from file descriptor 13: Input/output error
> EAL: Error reading from file descriptor 13: Input/output error
> EAL: Error reading from file descriptor 13: Input/output error
> EAL: Error reading fro

[dpdk-dev] [PATCH] mk: fix combined library building

2015-06-09 Thread Sergio Gonzalez Monroy
The combined lib was being created after building the lib root dir.
With the new directory hierarchy, it should be created after the
drivers root dir instead.

Fixes: 980ed498eb1dd0 ("drivers: create new directory")

Signed-off-by: Sergio Gonzalez Monroy 
---
 mk/rte.sdkbuild.mk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mk/rte.sdkbuild.mk b/mk/rte.sdkbuild.mk
index 3154457..09bb60a 100644
--- a/mk/rte.sdkbuild.mk
+++ b/mk/rte.sdkbuild.mk
@@ -93,7 +93,7 @@ $(ROOTDIRS-y):
@[ -d $(BUILDDIR)/$@ ] || mkdir -p $(BUILDDIR)/$@
@echo "== Build $@"
$(Q)$(MAKE) S=$@ -f $(RTE_SRCDIR)/$@/Makefile -C $(BUILDDIR)/$@ all
-   @if [ $@ = lib -a $(RTE_BUILD_COMBINE_LIBS) = y ]; then \
+   @if [ $@ = drivers -a $(RTE_BUILD_COMBINE_LIBS) = y ]; then \
$(MAKE) -f $(RTE_SDK)/lib/Makefile sharelib; \
fi

-- 
1.9.3



[dpdk-dev] Shared library build broken

2015-06-09 Thread Gonzalez Monroy, Sergio
On 08/06/2015 22:29, Thomas F Herbert wrote:
> Sorry,
>
> I apologize on behalf of my fingers. I meant combined library build is 
> broken when PMD_BOND is selected.
>
>
> On 6/8/15 4:14 PM, Thomas F Herbert wrote:
>> I just noticed that shared library build is broken. I am building
>> current master. I had to make this change to get it to build:
>>
>> -CONFIG_RTE_LIBRTE_PMD_BOND=y
>> +CONFIG_RTE_LIBRTE_PMD_BOND=n
>>
>>
>> One of the recent bonding commits broke some dependencies I think but I
>> didn't investigate further.
>>
>> test_link_bonding.o: In function `test_add_slave_to_bonded_device':
>> test_link_bonding.c:(.text+0x44a): undefined reference to
>> `rte_eth_bond_slave_add'
>> test_link_bonding.c:(.text+0x462): undefined reference to
>> `rte_eth_bond_slaves_get'
>> test_link_bonding.c:(.text+0x487): undefined reference to
>> `rte_eth_bond_active_slaves_get
>> 
>>
>> --TFH
I just sent a patch to fix the issue.
Drivers (PMDs) were not being archived in the combined library.

Sergio


[dpdk-dev] [PATCH] support jumbo frames for pcap vdev

2015-06-09 Thread Maxim Uvarov
PCAP PMD vdev is used mostly for testing. Increase the snapshot length
passed to pcap_open_live() so that packets larger than 4096 bytes are
accepted (jumbo frame support for the pcap PMD).

Signed-off-by: Maxim Uvarov 
---
 lib/librte_pmd_pcap/rte_eth_pcap.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/lib/librte_pmd_pcap/rte_eth_pcap.c 
b/lib/librte_pmd_pcap/rte_eth_pcap.c
index eebe768..978c137 100644
--- a/lib/librte_pmd_pcap/rte_eth_pcap.c
+++ b/lib/librte_pmd_pcap/rte_eth_pcap.c
@@ -47,7 +47,6 @@
 #include 

 #define RTE_ETH_PCAP_SNAPSHOT_LEN 65535
-#define RTE_ETH_PCAP_SNAPLEN 4096
 #define RTE_ETH_PCAP_PROMISC 1
 #define RTE_ETH_PCAP_TIMEOUT -1
 #define ETH_PCAP_RX_PCAP_ARG  "rx_pcap"
@@ -468,7 +467,7 @@ open_tx_pcap(const char *key __rte_unused, const char 
*value, void *extra_args)
  */
 static inline int
 open_iface_live(const char *iface, pcap_t **pcap) {
-   *pcap = pcap_open_live(iface, RTE_ETH_PCAP_SNAPLEN,
+   *pcap = pcap_open_live(iface, RTE_ETH_PCAP_SNAPSHOT_LEN,
RTE_ETH_PCAP_PROMISC, RTE_ETH_PCAP_TIMEOUT, errbuf);

if (*pcap == NULL) {
-- 
1.9.1



[dpdk-dev] [PATCH] dpdk1.7.1 rte.app.mk add options not not build targerts

2015-06-09 Thread Maxim Uvarov
Inherit build variables only so that this file can be included
from other projects.

Signed-off-by: Maxim Uvarov 
---
 mk/rte.app.mk | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 34dff2a..b75925d 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -222,6 +222,7 @@ endif # ifeq ($(NO_AUTOLIBS),)

 LDLIBS += $(CPU_LDLIBS)

+ifneq ($(DPRK_APP_MK_SKIP_BUILD_TARGETS),1)
 .PHONY: all
 all: install

@@ -232,6 +233,7 @@ _postinstall: build

 .PHONY: build
 build: _postbuild
+endif

 exe2cmd = $(strip $(call dotfile,$(patsubst %,%.cmd,$(1

@@ -306,6 +308,7 @@ $(RTE_OUTPUT)/app/$(APP).map: $(APP)
@[ -d $(RTE_OUTPUT)/app ] || mkdir -p $(RTE_OUTPUT)/app
$(Q)cp -f $(APP).map $(RTE_OUTPUT)/app

+ifneq ($(DPRK_APP_MK_SKIP_BUILD_TARGETS), 1)
 #
 # Clean all generated files
 #
@@ -317,7 +320,7 @@ clean: _postclean
 doclean:
$(Q)rm -rf $(APP) $(OBJS-all) $(DEPS-all) $(DEPSTMP-all) \
  $(CMDS-all) $(INSTALL-FILES-all) .$(APP).cmd
-
+endif

 include $(RTE_SDK)/mk/internal/rte.compile-post.mk
 include $(RTE_SDK)/mk/internal/rte.install-post.mk
-- 
1.9.1



[dpdk-dev] [PATCH] mk: fix combined library building

2015-06-09 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Sergio Gonzalez
> Monroy
> Sent: Tuesday, June 09, 2015 10:37 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] mk: fix combined library building
> 
> The combined lib was being created after building the lib root dir.
> With the new directory hierarchy, it should be created after the
> drivers root dir instead.
> 
> Fixes: 980ed498eb1dd0 ("drivers: create new directory")
> 
> Signed-off-by: Sergio Gonzalez Monroy

Acked-by: Pablo de Lara 


[dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh

2015-06-09 Thread Ananyev, Konstantin


> -Original Message-
> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
> Sent: Wednesday, June 03, 2015 6:47 PM
> To: Ananyev, Konstantin; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
> 
> 
> 
> On 02/06/15 18:35, Ananyev, Konstantin wrote:
> >
> >
> >> -Original Message-
> >> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
> >> Sent: Tuesday, June 02, 2015 4:08 PM
> >> To: Ananyev, Konstantin; dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
> >>
> >>
> >>
> >> On 02/06/15 14:31, Ananyev, Konstantin wrote:
> >>> Hi Zoltan,
> >>>
> >>>> -Original Message-
> >>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
> >>>> Sent: Monday, June 01, 2015 5:16 PM
> >>>> To: dev at dpdk.org
> >>>> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
> >>>>
> >>>> Hi,
> >>>>
> >>>> Anyone would like to review this patch? Venky sent a NAK, but I've
> >>>> explained to him why it is a bug.
> >>>
> >>>
> >>> Well, I think Venky is right here.
> >> I think the comments above rte_eth_tx_burst() definition are quite clear
> >> about what tx_free_thresh means, e1000 and i40e use it that way, but not
> >> ixgbe.
> >>
> >>> Indeed that fix, will cause more often unsuccessful checks for DD bits 
> >>> and might cause a
> >>> slowdown for TX fast-path.
> >> Not if the applications set tx_free_thresh according to the definition
> >> of this value. But we can change the default value from 32 to something
> >> higher, e.g I'm using nb_desc/2, and it works out well.
> >
> > Sure we can, as I said below, we can unify it one way or another.
> > One way would be to make fast-path TX to free TXDs when number of occupied 
> > TXDs raises above tx_free_thresh
> > (what rte_ethdev.h comments say and what full-featured TX is doing).
> > Though in that case we have to change default value for tx_free_thresh, and 
> > all existing apps that
> > using tx_free_thresh==32 and fast-path TX will probably experience a 
> > slowdown.
> 
> They are in trouble already, because i40e and e1000 uses it as defined.

In fact, i40e has exactly the same problem as ixgbe:
fast-path and full-featured TX  code treat  tx_free_thresh in a different way.
igb just ignores input tx_free_thresh, while em has only full featured path.

What I am saying is that an existing app that uses the TX fast-path and sets
tx_free_thresh=32 (as we did in our examples in previous versions) will
experience a slowdown if we make all TX functions behave like the
full-featured ones (txq->nb_tx_desc - txq->nb_tx_free > txq->tx_free_thresh).

On the other hand, if an app uses the full-featured TX path and sets
tx_free_thresh=32, then it already has a possible slowdown, because TXDs are
checked too often. So, if we change the tx_free_thresh semantics to what the
fast-path uses, it shouldn't see any slowdown; in fact it might see some
improvement.
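
The two readings of tx_free_thresh discussed in this thread can be put side by
side in a small C sketch. The struct and function names below are hypothetical
and not taken from any PMD; only the two comparisons and the
"nb_txd - curr_value" conversion come from the discussion itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified TX queue state; real PMD queues carry more fields. */
struct txq_sketch {
	uint16_t nb_tx_desc;     /* ring size */
	uint16_t nb_tx_free;     /* descriptors currently free */
	uint16_t tx_free_thresh; /* threshold as configured by the application */
};

/* Full-featured reading: free when the number of occupied TXDs exceeds the threshold. */
static bool
should_free_full_featured(const struct txq_sketch *q)
{
	assert(q->nb_tx_free <= q->nb_tx_desc);
	return (uint16_t)(q->nb_tx_desc - q->nb_tx_free) > q->tx_free_thresh;
}

/* Fast-path reading: free when the number of free TXDs drops below the threshold. */
static bool
should_free_fast_path(const struct txq_sketch *q)
{
	assert(q->nb_tx_free <= q->nb_tx_desc);
	return q->nb_tx_free < q->tx_free_thresh;
}

/* Translate a threshold from one reading to the other (nb_txd - curr_value). */
static uint16_t
convert_thresh(uint16_t nb_tx_desc, uint16_t thresh)
{
	assert(thresh <= nb_tx_desc);
	return (uint16_t)(nb_tx_desc - thresh);
}
```

With a 512-entry ring and a threshold of 32, the full-featured reading triggers
cleanup as soon as more than 32 descriptors are occupied, while the fast-path
reading waits until fewer than 32 are free. Converting the threshold with
convert_thresh() makes the two checks agree.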

> But I guess most apps are going with 0, which sets the drivers default.
> Others have to change the value to nb_txd - curr_value to have the same
> behaviour
> 
> > Another way would be to make all TX functions to treat 
> > tx_conf->tx_free_thresh as fast-path TX functions do
> > (free TXDs when number of free TXDs drops below  tx_free_thresh) and update 
> >  rte_ethdev.h comments.
> And i40e and e1000e code as well. I don't see what difference it makes
> which way of definition you use, what I care is that it should be used
> consistently.

Yes, both ways are possible; the concern is how to minimise the impact for
existing apps.
That's why I am leaning towards the fast-path way.

> >
> > Though, I am not sure that it really worth all these changes.
> >  From one side, whatever tx_free_thresh would be,
> > the app should still assume that the worst case might happen,
> > and up to nb_tx_desc mbufs can be consumed by the queue.
> >  From other side, I think the default value should work well for most cases.
> > So I am still for graceful deprecation of that config parameter, see below.
> >
> >>
> >>> Anyway, with current PMD implementation, you can't guarantee that at any 
> >>> moment
> >>> TX queue wouldn't use more than tx_free_thresh mbufs.
> >>
> >>
> >>> There could be situations (low speed, or link is down for some short 
> >>> period, etc), when
> >>> much more than tx_free_thresh TXDs are in use and none of them could be 
> >>> freed by HW right now.
> >>> So your app better be prepared, that up to (nb_tx_desc * 
> >>> num_of_TX_queues) could be in use
> >>> by TX path at any given moment.
> >>>
> >>> Though yes,  there is an inconsistency how different ixgbe TX functions 
> >>> treat tx_conf->tx_free_thresh parameter.
> >>> That probably creates wrong expectations and confusion.
> >> Yes, ixgbe_xmit_pkts() use it the way it's defined, this two function
> >> doesn't.
> >>
> >>> We might try to unify it's usage one way or another, but I personally 
> >>> don't see much point in it.
> >>> After all, tx_fre

[dpdk-dev] [PATCH] dpdk1.7.1 rte.app.mk add options not not build targerts

2015-06-09 Thread Olivier MATZ
Hello Maxim,

On 06/09/2015 12:15 PM, Maxim Uvarov wrote:
> Inherit build variables only so that this file can be included
> from other projects.
>
> Signed-off-by: Maxim Uvarov 

Can you detail a bit more what you want to do?
Why do you need to include rte.app.mk? This file is
internal to the dpdk framework.

By the way, the title is not understandable:
- why dpdk1.7.1 ?
- targerts -> targets
- not not ?

Regards,
Olivier


> ---
>   mk/rte.app.mk | 5 -
>   1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index 34dff2a..b75925d 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -222,6 +222,7 @@ endif # ifeq ($(NO_AUTOLIBS),)
>
>   LDLIBS += $(CPU_LDLIBS)
>
> +ifneq ($(DPRK_APP_MK_SKIP_BUILD_TARGETS),1)
>   .PHONY: all
>   all: install
>
> @@ -232,6 +233,7 @@ _postinstall: build
>
>   .PHONY: build
>   build: _postbuild
> +endif
>
>   exe2cmd = $(strip $(call dotfile,$(patsubst %,%.cmd,$(1
>
> @@ -306,6 +308,7 @@ $(RTE_OUTPUT)/app/$(APP).map: $(APP)
>   @[ -d $(RTE_OUTPUT)/app ] || mkdir -p $(RTE_OUTPUT)/app
>   $(Q)cp -f $(APP).map $(RTE_OUTPUT)/app
>
> +ifneq ($(DPRK_APP_MK_SKIP_BUILD_TARGETS), 1)
>   #
>   # Clean all generated files
>   #
> @@ -317,7 +320,7 @@ clean: _postclean
>   doclean:
>   $(Q)rm -rf $(APP) $(OBJS-all) $(DEPS-all) $(DEPSTMP-all) \
> $(CMDS-all) $(INSTALL-FILES-all) .$(APP).cmd
> -
> +endif
>
>   include $(RTE_SDK)/mk/internal/rte.compile-post.mk
>   include $(RTE_SDK)/mk/internal/rte.install-post.mk
>



[dpdk-dev] [PATCH] support jumbo frames for pcap vdev

2015-06-09 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maxim Uvarov
> Sent: Tuesday, June 9, 2015 11:15 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] support jumbo frames for pcap vdev
> 
> PCAP PMD vdev is used mostly for testing. Increase the snapshot length
> passed to pcap_open_live() to accept packets larger than 4096 bytes
> (jumbo frame support for the pcap PMD).

Hi,

Thanks for the submission.

There is already an existing patch for jumbo frame support in the PCAP pmd.

http://dpdk.org/dev/patchwork/patch/3792/

Could you review/try that and see if it is suitable for your purposes.

Regards,

John.

> 
> Signed-off-by: Maxim Uvarov 
> ---
>  lib/librte_pmd_pcap/rte_eth_pcap.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/lib/librte_pmd_pcap/rte_eth_pcap.c
> b/lib/librte_pmd_pcap/rte_eth_pcap.c
> index eebe768..978c137 100644
> --- a/lib/librte_pmd_pcap/rte_eth_pcap.c
> +++ b/lib/librte_pmd_pcap/rte_eth_pcap.c
> @@ -47,7 +47,6 @@
>  #include 
> 
>  #define RTE_ETH_PCAP_SNAPSHOT_LEN 65535
> -#define RTE_ETH_PCAP_SNAPLEN 4096
>  #define RTE_ETH_PCAP_PROMISC 1
>  #define RTE_ETH_PCAP_TIMEOUT -1
>  #define ETH_PCAP_RX_PCAP_ARG  "rx_pcap"
> @@ -468,7 +467,7 @@ open_tx_pcap(const char *key __rte_unused, const char
> *value, void *extra_args)
>   */
>  static inline int
>  open_iface_live(const char *iface, pcap_t **pcap) {
> - *pcap = pcap_open_live(iface, RTE_ETH_PCAP_SNAPLEN,
> + *pcap = pcap_open_live(iface, RTE_ETH_PCAP_SNAPSHOT_LEN,
>   RTE_ETH_PCAP_PROMISC, RTE_ETH_PCAP_TIMEOUT, errbuf);
> 
>   if (*pcap == NULL) {
> --
> 1.9.1



[dpdk-dev] [PATCH] support jumbo frames for pcap vdev

2015-06-09 Thread Maxim Uvarov
On 06/09/15 15:15, Mcnamara, John wrote:
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Maxim Uvarov
>> Sent: Tuesday, June 9, 2015 11:15 AM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH] support jumbo frames for pcap vdev
>>
>> PCAP PMD vdev is used mostly for testing. Increase the snapshot length
>> passed to pcap_open_live() to accept packets larger than 4096 bytes
>> (jumbo frame support for the pcap PMD).
> Hi,
>
> Thanks for the submission.
>
> There is already an existing patch for jumbo frame support in the PCAP pmd.
>
>  http://dpdk.org/dev/patchwork/patch/3792/
>
> Could you review/try that and see if it is suitable for your purposes.
>
> Regards,
>
> John.
Thanks, I did not see that patch. I see that your patch also supports
segmentation. I will test it in my environment.

Maxim.


[dpdk-dev] [PATCH v2] mk: remove "u" modifier from "ar" command

2015-06-09 Thread Bruce Richardson
On Fedora 22, the "ar" binary operates by default in deterministic mode,
making the "u" parameter irrelevant, and leading to warning messages
getting printed in the build output like below.

  INSTALL-LIB librte_kvargs.a
ar: `u' modifier ignored since `D' is the default (see `U')

There are two options to remove these warnings:
* add in the "U" flag to make "ar" non-deterministic again
* remove the "u" flag to have all objects always updated

This patch takes the second approach. It also explicitly adds in the "D"
flag to make behaviour consistent across different distributions which
may have different defaults.

Signed-off-by: Bruce Richardson 

---
V2 Changes: Add in "D" flag for consistency across distros.
---
 mk/rte.lib.mk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
index 0d7482d..25aa989 100644
--- a/mk/rte.lib.mk
+++ b/mk/rte.lib.mk
@@ -70,7 +70,7 @@ else
 _CPU_LDFLAGS := $(CPU_LDFLAGS)
 endif

-O_TO_A = $(AR) crus $(LIB) $(OBJS-y)
+O_TO_A = $(AR) crDs $(LIB) $(OBJS-y)
 O_TO_A_STR = $(subst ','\'',$(O_TO_A)) #'# fix syntax highlight
 O_TO_A_DISP = $(if $(V),"$(O_TO_A_STR)","  AR $(@)")
 O_TO_A_CMD = "cmd_$@ = $(O_TO_A_STR)"
-- 
2.4.2



[dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

2015-06-09 Thread Bruce Richardson
On Mon, Jun 08, 2015 at 04:57:22PM +0200, Olivier Matz wrote:
> In __rte_pktmbuf_prefree_seg(), there was an optimization to avoid using
> a costly atomic operation when updating the mbuf reference counter if
> its value is 1. Indeed, it means that we are the only owner of the mbuf,
> and therefore nobody can change it at the same time.
> 
> We can generalize this optimization directly in rte_mbuf_refcnt_update()
> so the other callers of this function, like rte_pktmbuf_attach(), can
> also take advantage of this optimization.
> 
> Signed-off-by: Olivier Matz 

Acked-by: Bruce Richardson 
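
As a rough illustration of the optimization described in the quoted commit
message, the sketch below skips the atomic read-modify-write when the counter
is 1. It uses C11 atomics and hypothetical names (mbuf_sketch,
refcnt_update_sketch); it is not the DPDK implementation, which has its own
atomic helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the mbuf reference counter field. */
struct mbuf_sketch {
	_Atomic uint16_t refcnt;
};

/*
 * Add `value` to the reference counter and return the new count.
 * If the counter is 1, the caller is the only owner, so nobody else can
 * change it concurrently and a plain store is enough. Otherwise fall back
 * to an atomic add.
 */
static uint16_t
refcnt_update_sketch(struct mbuf_sketch *m, int16_t value)
{
	assert(m != NULL);

	if (atomic_load_explicit(&m->refcnt, memory_order_relaxed) == 1) {
		uint16_t next = (uint16_t)(1 + value);

		atomic_store_explicit(&m->refcnt, next, memory_order_relaxed);
		return next;
	}

	/* atomic_fetch_add returns the old value, so add `value` again. */
	return (uint16_t)(atomic_fetch_add(&m->refcnt, (uint16_t)value) +
			  (uint16_t)value);
}
```

The fast path is only valid under the single-owner assumption stated in the
commit message; a caller that shares the buffer must already hold a count
above 1, which routes it to the atomic branch.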



[dpdk-dev] [PATCH] dpdk1.7.1 rte.app.mk add options not not build targerts

2015-06-09 Thread Maxim Uvarov
On 06/09/15 15:05, Olivier MATZ wrote:
> Hello Maxim,
>
> On 06/09/2015 12:15 PM, Maxim Uvarov wrote:
>> Inherit build variables only so that this file can be included
>> from other projects.
>>
>> Signed-off-by: Maxim Uvarov 
>
> Can you detail a bit more what you want to do?
> Why do you need to include rte.app.mk? This file is
> internal to the dpdk framework.
>
> By the way, the title is not understandable:
> - why dpdk1.7.1 ?
> - targerts -> targets
> - not not ?
>
> Regards,
> Olivier

Sorry, it was a quick patch with some typos. I intended to discuss the
idea of what I need and whether it might be useful for others.
I did an ODP implementation with dpdk as the back end, and stayed on v1.7.1.
But that patch should be good for the latest git; if not, I can update it.

So my environment is: I build a library which calls dpdk functions. That
library is used to build applications. I need to take the CFLAGS, LDFLAGS
and build script from dpdk for my library and example apps. So I just
point to where dpdk is, and my library build system should inherit the same
environment that dpdk used. One reason is optimization; the second reason is
to compile in the dpdk PMD drivers the same way dpdk does.

So in my Makefile I do: include $dpdk/mk/rte.app.mk

Is that needed by anybody else?

Thanks,
Maxim.

>
>
>> ---
>>   mk/rte.app.mk | 5 -
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
>> index 34dff2a..b75925d 100644
>> --- a/mk/rte.app.mk
>> +++ b/mk/rte.app.mk
>> @@ -222,6 +222,7 @@ endif # ifeq ($(NO_AUTOLIBS),)
>>
>>   LDLIBS += $(CPU_LDLIBS)
>>
>> +ifneq ($(DPRK_APP_MK_SKIP_BUILD_TARGETS),1)
>>   .PHONY: all
>>   all: install
>>
>> @@ -232,6 +233,7 @@ _postinstall: build
>>
>>   .PHONY: build
>>   build: _postbuild
>> +endif
>>
>>   exe2cmd = $(strip $(call dotfile,$(patsubst %,%.cmd,$(1
>>
>> @@ -306,6 +308,7 @@ $(RTE_OUTPUT)/app/$(APP).map: $(APP)
>>   @[ -d $(RTE_OUTPUT)/app ] || mkdir -p $(RTE_OUTPUT)/app
>>   $(Q)cp -f $(APP).map $(RTE_OUTPUT)/app
>>
>> +ifneq ($(DPRK_APP_MK_SKIP_BUILD_TARGETS), 1)
>>   #
>>   # Clean all generated files
>>   #
>> @@ -317,7 +320,7 @@ clean: _postclean
>>   doclean:
>>   $(Q)rm -rf $(APP) $(OBJS-all) $(DEPS-all) $(DEPSTMP-all) \
>> $(CMDS-all) $(INSTALL-FILES-all) .$(APP).cmd
>> -
>> +endif
>>
>>   include $(RTE_SDK)/mk/internal/rte.compile-post.mk
>>   include $(RTE_SDK)/mk/internal/rte.install-post.mk
>>
>



[dpdk-dev] Shared library build broken

2015-06-09 Thread Thomas F Herbert


On 6/9/15 5:40 AM, Gonzalez Monroy, Sergio wrote:
> On 08/06/2015 22:29, Thomas F Herbert wrote:
>> Sorry,
>>
>> I apologize on behalf of my fingers. I meant combined library build is
>> broken when PMD_BOND is selected.
>>
>>
>> On 6/8/15 4:14 PM, Thomas F Herbert wrote:
>>> I just noticed that shared library build is broken. I am building
>>> current master. I had to make this change to get it to build:
>>>
>>> -CONFIG_RTE_LIBRTE_PMD_BOND=y
>>> +CONFIG_RTE_LIBRTE_PMD_BOND=n
>>>
>>>
>>> One of the recent bonding commits broke some dependencies I think but I
>>> didn't investigate further.
>>>
>>> test_link_bonding.o: In function `test_add_slave_to_bonded_device':
>>> test_link_bonding.c:(.text+0x44a): undefined reference to
>>> `rte_eth_bond_slave_add'
>>> test_link_bonding.c:(.text+0x462): undefined reference to
>>> `rte_eth_bond_slaves_get'
>>> test_link_bonding.c:(.text+0x487): undefined reference to
>>> `rte_eth_bond_active_slaves_get
>>> 
>>>
>>> --TFH
> I just sent a patch to fix the issue.
Thanks. It is fixed.
> Drivers (PMDs) were not being archived in the combined library.
>
> Sergio


[dpdk-dev] [PATCH] vhost: flush used->idx update before reading avail->flags

2015-06-09 Thread Xie, Huawei
On 6/9/2015 4:47 PM, Michael S. Tsirkin wrote:
> On Tue, Jun 09, 2015 at 03:04:02PM +0800, Linhaifeng wrote:
>>
>> On 2015/4/24 15:27, Luke Gorrie wrote:
>>> On 24 April 2015 at 03:01, Linhaifeng  wrote:
>>>
>>>> If the memory fence is not added, what would happen? Packet loss or
>>>> interrupt loss? How to test it?
>>>>
>>> You should be able to test it like this:
>>>
>>> 1. Boot two Linux kernel (e.g. 3.13) guests.
>>> 2. Connect them via vhost switch.
>>> 3. Run continuous traffic between them (e.g. iperf).
>>>
>>> I would expect that within a reasonable timeframe (< 1 hour) one of the
>>> guests' network interfaces will hang indefinitely due to a missed interrupt.
>>>
>>> You won't be able to reproduce this using DPDK guests because they are not
>>> using the same interrupt suppression method.
>>>
>>> This is a serious real-world problem. I wouldn't deploy the vhost
>>> implementation without this fix.
>>>
>>> Cheers,
>>> -Luke
>>>
>> I think this patch can't resolve this problem. We would still miss
>> interrupts.
>>
>> After adding rte_mb(), the case we want is:
>> 1. write used->idx. ring is full or empty.
>> 2. virtio_net enables interrupts.
>> 3. read avail->flags.
>>
>> but this case (missed interrupt) could still happen:
>> 1. write used->idx. ring is full or empty.
>> 2. read avail->flags.
>> 3. virtio_net enables interrupts.
>>
> That's why a correct guest, after detecting an empty used ring, must always
> re-check used idx at least once after writing avail->flags.
>
> By the way, similarly, host side must re-check avail idx after writing
> used flags. I don't see where snabbswitch does it - is that a bug
> in snabbswitch?
>
Yes, both host and guest should recheck whether more work has been added
after they toggle the flag.
For DPDK vHost, as it runs in polling mode, we will recheck avail idx
soon anyway, so we don't need an explicit recheck.
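
The ordering being discussed can be sketched in C11 atomics as follows. This is
a minimal model under assumptions: the struct and function names are
hypothetical, the ring is reduced to the two fields in question, and
atomic_thread_fence(memory_order_seq_cst) stands in for the rte_mb() call the
patch adds. VRING_AVAIL_F_NO_INTERRUPT is the flag bit from the virtio ring
ABI.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VRING_AVAIL_F_NO_INTERRUPT 1 /* guest asks not to be interrupted */

/* Hypothetical, pared-down view of the two shared fields in the thread. */
struct vring_sketch {
	_Atomic uint16_t used_idx;    /* written by host, read by guest */
	_Atomic uint16_t avail_flags; /* written by guest, read by host */
};

/*
 * Host side: publish the new used index, then decide whether to notify.
 * The full fence keeps the used_idx store ahead of the avail_flags load.
 * Returns true if the guest should be interrupted.
 */
static bool
host_publish_used(struct vring_sketch *vr, uint16_t new_used_idx)
{
	assert(vr != NULL);
	atomic_store_explicit(&vr->used_idx, new_used_idx, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return !(atomic_load_explicit(&vr->avail_flags, memory_order_relaxed) &
		 VRING_AVAIL_F_NO_INTERRUPT);
}

/*
 * Guest side: re-enable interrupts, then recheck the used index once.
 * Returns true if work arrived in the window and must be processed now
 * instead of waiting for an interrupt that may never come.
 */
static bool
guest_enable_irq_and_recheck(struct vring_sketch *vr, uint16_t last_seen_used)
{
	assert(vr != NULL);
	atomic_store_explicit(&vr->avail_flags, 0, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&vr->used_idx, memory_order_relaxed) !=
	       last_seen_used;
}
```

With both fences in place, at least one side observes the other's write: either
the host sees interrupts enabled and notifies, or the guest's recheck sees the
new used index. The fence on the host side alone does not cover the second
interleaving listed in the quoted message, which is why the recheck is needed.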


[dpdk-dev] [PATCH v2] mk: remove "u" modifier from "ar" command

2015-06-09 Thread Olivier MATZ
Hi Bruce,

On 06/09/2015 02:51 PM, Bruce Richardson wrote:
> On Fedora 22, the "ar" binary operates by default in deterministic mode,
> making the "u" parameter irrelevant, and leading to warning messages
> getting printed in the build output like below.
>
>INSTALL-LIB librte_kvargs.a
> ar: `u' modifier ignored since `D' is the default (see `U')
>
> There are two options to remove these warnings:
> * add in the "U" flag to make "ar" non-deterministic again
> * remove the "u" flag to have all objects always updated
>
> This patch takes the second approach. It also explicitly adds in the "D"
> flag to make behaviour consistent across different distributions which
> may have different defaults.
>
> Signed-off-by: Bruce Richardson 

Acked-by: Olivier Matz 


>
> ---
> V2 Changes: Add in "D" flag for consistency across distros.
> ---
>   mk/rte.lib.mk | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mk/rte.lib.mk b/mk/rte.lib.mk
> index 0d7482d..25aa989 100644
> --- a/mk/rte.lib.mk
> +++ b/mk/rte.lib.mk
> @@ -70,7 +70,7 @@ else
>   _CPU_LDFLAGS := $(CPU_LDFLAGS)
>   endif
>
> -O_TO_A = $(AR) crus $(LIB) $(OBJS-y)
> +O_TO_A = $(AR) crDs $(LIB) $(OBJS-y)
>   O_TO_A_STR = $(subst ','\'',$(O_TO_A)) #'# fix syntax highlight
>   O_TO_A_DISP = $(if $(V),"$(O_TO_A_STR)","  AR $(@)")
>   O_TO_A_CMD = "cmd_$@ = $(O_TO_A_STR)"
>



[dpdk-dev] [PATCH] dpdk1.7.1 rte.app.mk add options not not build targerts

2015-06-09 Thread Olivier MATZ
Hi Maxim,

On 06/09/2015 02:59 PM, Maxim Uvarov wrote:
> On 06/09/15 15:05, Olivier MATZ wrote:
>> Hello Maxim,
>>
>> On 06/09/2015 12:15 PM, Maxim Uvarov wrote:
>>> Inherit build variables only so that this file can be included
>>> from other projects.
>>>
>>> Signed-off-by: Maxim Uvarov 
>>
>> Can you detail a bit more what you want to do?
>> Why do you need to include rte.app.mk? This file is
>> internal to the dpdk framework.
>>
>> By the way, the title is not understandable:
>> - why dpdk1.7.1 ?
>> - targerts -> targets
>> - not not ?
>>
>> Regards,
>> Olivier
>
> Sorry, it was a quick patch with some typos. I intended to discuss the
> idea of what I need and whether it might be useful for others.
> I did an ODP implementation with dpdk as the back end, and stayed on v1.7.1.
> But that patch should be good for the latest git; if not, I can update it.
>
> So my environment is: I build a library which calls dpdk functions. That
> library is used to build applications. I need to take the CFLAGS, LDFLAGS
> and build script from dpdk for my library and example apps. So I just
> point to where dpdk is, and my library build system should inherit the same
> environment that dpdk used. One reason is optimization; the second reason is
> to compile in the dpdk PMD drivers the same way dpdk does.
>
> So in my Makefile I do: include $dpdk/mk/rte.app.mk
>
> Is that needed by anybody else?

Maybe you can use rte.extapp.mk and rte.extlib.mk instead?

There is no example for rte.extlib.mk, but it works the same
as rte.extapp.mk. You can start from an example in dpdk/examples
directory (for instance skeleton):
- remove the main()
- change "APP = basicfwd" to "LIB = basicfwd.a"
- change "include $(RTE_SDK)/mk/rte.extapp.mk" to
   "include $(RTE_SDK)/mk/rte.extlib.mk"

Then:
   cd examples/skeleton
   make RTE_SDK=/path/to/dpdk \
 RTE_TARGET=x86_64-native-linuxapp-gcc \
 O=/path/to/dstdir

This should generate a static lib that you can use in another
application example.

If you cannot use this model, another solution would be to generate
a pkg-config file in dpdk framework that could be used by other
build frameworks.

Regards,
Olivier



[dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh

2015-06-09 Thread Zoltan Kiss


On 09/06/15 12:18, Ananyev, Konstantin wrote:
>
>
>> -Original Message-
>> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
>> Sent: Wednesday, June 03, 2015 6:47 PM
>> To: Ananyev, Konstantin; dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
>>
>>
>>
>> On 02/06/15 18:35, Ananyev, Konstantin wrote:
>>>
>>>
>>>> -Original Message-
>>>> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
>>>> Sent: Tuesday, June 02, 2015 4:08 PM
>>>> To: Ananyev, Konstantin; dev at dpdk.org
>>>> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
>>>>
>>>>
>>>>
>>>> On 02/06/15 14:31, Ananyev, Konstantin wrote:
>>>>> Hi Zoltan,
>>>>>
>>>>>> -Original Message-
>>>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
>>>>>> Sent: Monday, June 01, 2015 5:16 PM
>>>>>> To: dev at dpdk.org
>>>>>> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Anyone would like to review this patch? Venky sent a NAK, but I've
>>>>>> explained to him why it is a bug.
>>>>>
>>>>>
>>>>> Well, I think Venky is right here.
>>>> I think the comments above rte_eth_tx_burst() definition are quite clear
>>>> about what tx_free_thresh means, e1000 and i40e use it that way, but not
>>>> ixgbe.
>>>>
>>>>> Indeed that fix, will cause more often unsuccessful checks for DD bits 
>>>>> and might cause a
>>>>> slowdown for TX fast-path.
>>>> Not if the applications set tx_free_thresh according to the definition
>>>> of this value. But we can change the default value from 32 to something
>>>> higher, e.g I'm using nb_desc/2, and it works out well.
>>>
>>> Sure we can, as I said below, we can unify it one way or another.
>>> One way would be to make fast-path TX to free TXDs when number of occupied 
>>> TXDs raises above tx_free_thresh
>>> (what rte_ethdev.h comments say and what full-featured TX is doing).
>>> Though in that case we have to change default value for tx_free_thresh, and 
>>> all existing apps that
>>> using tx_free_thresh==32 and fast-path TX will probably experience a 
>>> slowdown.
>>
>> They are in trouble already, because i40e and e1000 uses it as defined.
>
> In fact, i40e has exactly the same problem as ixgbe:
> fast-path and full-featured TX  code treat  tx_free_thresh in a different way.
> igb just ignores input tx_free_thresh, while em has only full featured path.
>
> What I am saying, existing app that uses TX fast-path and sets 
> tx_free_thresh=32
> (as we did in our examples in previous versions) will experience a slowdown,
> if we'll make all TX functions to behave like full-featured ones
> (txq->nb_tx_desc - txq->nb_tx_free > txq->tx_free_thresh).
>
> On the other hand, if an app uses full-featured TX and sets tx_free_thresh=32,
> then it already has a possible slowdown, because TXDs are checked too often.
> So, if we change tx_free_thresh semantics to what fast-path uses,
> it shouldn't see any slowdown; in fact it might see some improvement.
>
>> But I guess most apps are going with 0, which sets the drivers default.
>> Others have to change the value to nb_txd - curr_value to have the same
>> behaviour
>>
>>> Another way would be to make all TX functions to treat 
>>> tx_conf->tx_free_thresh as fast-path TX functions do
>>> (free TXDs when number of free TXDs drops below  tx_free_thresh) and update 
>>>  rte_ethdev.h comments.
>> And i40e and e1000e code as well. I don't see what difference it makes
>> which way of definition you use, what I care is that it should be used
>> consistently.
>
> Yes, both ways are possible, the concern is - how to minimise the impact for 
> existing apps.
> That's why I am leaning to the fast-path way.

Makes sense to favour the fast-path way. I'll look into that and try to
come up with a patch.
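
For reference, here is a minimal sketch of the two interpretations of tx_free_thresh discussed in this thread. The struct and function names are hypothetical and simplified, not the real ixgbe driver code; only the two conditions and the "nb_txd - curr_value" conversion come from the discussion above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a TX queue; not the real ixgbe_tx_queue. */
struct txq_sketch {
	uint16_t nb_tx_desc;     /* ring size */
	uint16_t nb_tx_free;     /* descriptors currently free */
	uint16_t tx_free_thresh; /* value passed by the application */
};

/*
 * Full-featured TX (and the rte_ethdev.h wording): free descriptors when
 * the number of occupied descriptors rises above the threshold.
 */
static bool
should_free_full_featured(const struct txq_sketch *q)
{
	return (uint16_t)(q->nb_tx_desc - q->nb_tx_free) > q->tx_free_thresh;
}

/*
 * Fast-path TX: free descriptors when the number of free descriptors
 * drops below the threshold.
 */
static bool
should_free_fast_path(const struct txq_sketch *q)
{
	return q->nb_tx_free < q->tx_free_thresh;
}

/* Convert a threshold from one convention to the other. */
static uint16_t
convert_thresh(uint16_t nb_tx_desc, uint16_t thresh)
{
	return (uint16_t)(nb_tx_desc - thresh);
}
```

With a 512-entry ring and tx_free_thresh = 32, the full-featured check fires as soon as more than 32 descriptors are in use, while the fast-path check waits until fewer than 32 are free. Converting 32 to 512 - 32 = 480 makes the fast-path check fire at the same point as the full-featured one.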

>
>>>
>>> Though, I am not sure that it really worth all these changes.
>>>   From one side, whatever tx_free_thresh would be,
>>> the app should still assume that the worst case might happen,
>>> and up to nb_tx_desc mbufs can be consumed by the queue.
>>>   From other side, I think the default value should work well for most 
>>> cases.
>>> So I am still for graceful deprecation of that config parameter, see below.
>>>

>>>>> Anyway, with current PMD implementation, you can't guarantee that at any
>>>>> moment TX queue wouldn't use more than tx_free_thresh mbufs.
>>>>
>>>>
>>>>> There could be situations (low speed, or link is down for some short
>>>>> period, etc), when much more than tx_free_thresh TXDs are in use and
>>>>> none of them could be freed by HW right now.
>>>>> So your app better be prepared, that up to (nb_tx_desc *
>>>>> num_of_TX_queues) could be in use by TX path at any given moment.
>>>>>
>>>>> Though yes,  there is an inconsistency how different ixgbe TX functions
>>>>> treat tx_conf->tx_free_thresh parameter.
>>>>> That probably creates wrong expectations and confusion.
>>>> Yes, ixgbe_xmit_pkts()

[dpdk-dev] [PATCH v2 0/7] Expose IXGBE extended stats to DPDK apps

2015-06-09 Thread Maryam Tahhan
This patch implements xstats_get() and xstats_reset() in dev_ops for
ixgbe to expose detailed error statistics to DPDK applications. The
dump_cfg application was extended to demonstrate the usage of
retrieving statistics for DPDK interfaces and renamed to proc_info
in order to reflect this new functionality. testpmd was also extended
to display additional statistics.

Maryam Tahhan (7):
  ethdev: add additional error stats
  ixgbe: move stats register reads to a new function
  ixgbe: Expose extended error statistics
  ethdev: expose extended error stats
  testpmd: extend testpmd to show all extended stats
  app: remove dump_cfg
  app: add a new app proc_info

 MAINTAINERS  |   4 +
 app/Makefile |   2 +-
 app/dump_cfg/Makefile|  45 
 app/dump_cfg/main.c  |  92 ---
 app/proc_info/Makefile   |  45 
 app/proc_info/main.c | 514 +++
 app/test-pmd/config.c|   5 +
 drivers/net/ixgbe/ixgbe_ethdev.c | 161 ++--
 lib/librte_ether/rte_ethdev.c|  11 +-
 lib/librte_ether/rte_ethdev.h|   4 +
 mk/rte.sdktest.mk|   4 +-
 11 files changed, 722 insertions(+), 165 deletions(-)
 delete mode 100644 app/dump_cfg/Makefile
 delete mode 100644 app/dump_cfg/main.c
 create mode 100644 app/proc_info/Makefile
 create mode 100644 app/proc_info/main.c

-- 
1.9.3



[dpdk-dev] [PATCH v2 1/7] ethdev: add additional error stats

2015-06-09 Thread Maryam Tahhan
Add MAC error and drop statistics to struct rte_eth_stats and the
extended stats.

Signed-off-by: Maryam Tahhan 
---
 lib/librte_ether/rte_ethdev.c | 4 
 lib/librte_ether/rte_ethdev.h | 4 
 2 files changed, 8 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 5a94654..a439b4a 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -136,6 +136,10 @@ static const struct rte_eth_xstats_name_off rte_stats_strings[] = {
{"rx_flow_control_xon", offsetof(struct rte_eth_stats, rx_pause_xon)},
{"tx_flow_control_xoff", offsetof(struct rte_eth_stats, tx_pause_xoff)},
{"rx_flow_control_xoff", offsetof(struct rte_eth_stats, rx_pause_xoff)},
+   {"rx_mac_err", offsetof(struct rte_eth_stats, imacerr)},
+   {"rx_phy_err", offsetof(struct rte_eth_stats, iphyerr)},
+   {"tx_drops", offsetof(struct rte_eth_stats, odrop)},
+   {"rx_drops", offsetof(struct rte_eth_stats, idrop)}
 };
 #define RTE_NB_STATS (sizeof(rte_stats_strings) / sizeof(rte_stats_strings[0]))

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 16dbe00..5bc3b81 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -224,6 +224,10 @@ struct rte_eth_stats {
/**< Total number of good bytes received from loopback,VF Only */
uint64_t olbbytes;
/**< Total number of good bytes transmitted to loopback,VF Only */
+   uint64_t imacerr;   /**< Total of RX packets with MAC Errors. */
+   uint64_t iphyerr;   /**< Total of RX packets with PHY Errors. */
+   uint64_t idrop;  /**< Total number of dropped received packets. */
+   uint64_t odrop;  /**< Total number of dropped transmitted packets. */
 };

 /**
-- 
1.9.3
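
For readers unfamiliar with the name/offset table used above, the following is a minimal, self-contained sketch of the technique: each entry pairs a printable name with the offsetof() of a counter, and a lookup reads the counter through that offset. The demo_* names and the cut-down struct are hypothetical and exist only for illustration; they are not part of rte_ethdev.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cut-down stats structure, for illustration only. */
struct demo_stats {
	uint64_t imacerr;
	uint64_t iphyerr;
	uint64_t idrop;
	uint64_t odrop;
};

/* Pairs a statistic name with its byte offset inside struct demo_stats. */
struct demo_name_off {
	const char *name;
	size_t offset;
};

static const struct demo_name_off demo_strings[] = {
	{"rx_mac_err", offsetof(struct demo_stats, imacerr)},
	{"rx_phy_err", offsetof(struct demo_stats, iphyerr)},
	{"tx_drops", offsetof(struct demo_stats, odrop)},
	{"rx_drops", offsetof(struct demo_stats, idrop)},
};

#define DEMO_NB_STATS (sizeof(demo_strings) / sizeof(demo_strings[0]))

/*
 * Look a counter up by name. Returns 0 and stores the value on success,
 * -1 if the name is not in the table.
 */
static int
demo_stat_by_name(const struct demo_stats *stats, const char *name,
		  uint64_t *value)
{
	size_t i;

	for (i = 0; i < DEMO_NB_STATS; i++) {
		if (strcmp(demo_strings[i].name, name) == 0) {
			/* memcpy sidesteps alignment and aliasing concerns. */
			memcpy(value,
			       (const char *)stats + demo_strings[i].offset,
			       sizeof(*value));
			return 0;
		}
	}
	return -1;
}
```

The benefit of this layout is that adding a statistic only requires a new table row; the code that walks the table does not change.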



[dpdk-dev] [PATCH v2 2/7] ixgbe: move stats register reads to a new function

2015-06-09 Thread Maryam Tahhan
Move stats register reads to ixgbe_read_stats_registers() as it will be
used by the functions to retrieve stats and extended stats.

Signed-off-by: Maryam Tahhan 
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 76 
 1 file changed, 54 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 0d9f9b2..543e8ab 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -1739,24 +1739,16 @@ ixgbe_dev_close(struct rte_eth_dev *dev)
ixgbe_set_rar(hw, 0, hw->mac.addr, 0, IXGBE_RAH_AV);
 }

-/*
- * This function is based on ixgbe_update_stats_counters() in base/ixgbe.c
- */
 static void
-ixgbe_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+ixgbe_read_stats_registers(struct ixgbe_hw *hw, struct ixgbe_hw_stats
+  *hw_stats, uint64_t *total_missed_rx,
+  uint64_t *total_qbrc, uint64_t *total_qprc,
+  uint64_t *rxnfgpc, uint64_t *txdgpc,
+  uint64_t *total_qprdc)
 {
-   struct ixgbe_hw *hw =
-   IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-   struct ixgbe_hw_stats *hw_stats =
-   IXGBE_DEV_PRIVATE_TO_STATS(dev->data->dev_private);
uint32_t bprc, lxon, lxoff, total;
-   uint64_t total_missed_rx, total_qbrc, total_qprc;
unsigned i;

-   total_missed_rx = 0;
-   total_qbrc = 0;
-   total_qprc = 0;
-
hw_stats->crcerrs += IXGBE_READ_REG(hw, IXGBE_CRCERRS);
hw_stats->illerrc += IXGBE_READ_REG(hw, IXGBE_ILLERRC);
hw_stats->errbc += IXGBE_READ_REG(hw, IXGBE_ERRBC);
@@ -1768,7 +1760,7 @@ ixgbe_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
/* global total per queue */
hw_stats->mpc[i] += mp;
/* Running comprehensive total for stats display */
-   total_missed_rx += hw_stats->mpc[i];
+   *total_missed_rx += hw_stats->mpc[i];
if (hw->mac.type == ixgbe_mac_82598EB)
hw_stats->rnbc[i] +=
IXGBE_READ_REG(hw, IXGBE_RNBC(i));
@@ -1792,10 +1784,11 @@ ixgbe_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
hw_stats->qbtc[i] += IXGBE_READ_REG(hw, IXGBE_QBTC_L(i));
hw_stats->qbtc[i] +=
((uint64_t)IXGBE_READ_REG(hw, IXGBE_QBTC_H(i)) << 32);
-   hw_stats->qprdc[i] += IXGBE_READ_REG(hw, IXGBE_QPRDC(i));
+   *total_qprdc += hw_stats->qprdc[i] +=
+   IXGBE_READ_REG(hw, IXGBE_QPRDC(i));

-   total_qprc += hw_stats->qprc[i];
-   total_qbrc += hw_stats->qbrc[i];
+   *total_qprc += hw_stats->qprc[i];
+   *total_qbrc += hw_stats->qbrc[i];
}
hw_stats->mlfc += IXGBE_READ_REG(hw, IXGBE_MLFC);
hw_stats->mrfc += IXGBE_READ_REG(hw, IXGBE_MRFC);
@@ -1803,6 +1796,8 @@ ixgbe_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)

/* Note that gprc counts missed packets */
hw_stats->gprc += IXGBE_READ_REG(hw, IXGBE_GPRC);
+   *rxnfgpc += IXGBE_READ_REG(hw, IXGBE_RXNFGPC);
+   *txdgpc += IXGBE_READ_REG(hw, IXGBE_TXDGPC);

if (hw->mac.type != ixgbe_mac_82598EB) {
hw_stats->gorc += IXGBE_READ_REG(hw, IXGBE_GORCL);
@@ -1879,6 +1874,31 @@ ixgbe_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
hw_stats->fcoedwrc += IXGBE_READ_REG(hw, IXGBE_FCOEDWRC);
hw_stats->fcoedwtc += IXGBE_READ_REG(hw, IXGBE_FCOEDWTC);
}
+}
+
+/*
+ * This function is based on ixgbe_update_stats_counters() in ixgbe/ixgbe.c
+ */
+static void
+ixgbe_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
+{
+   struct ixgbe_hw *hw =
+   IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct ixgbe_hw_stats *hw_stats =
+   IXGBE_DEV_PRIVATE_TO_STATS(dev->data->dev_private);
+   uint64_t total_missed_rx, total_qbrc, total_qprc, total_qprdc;
+   uint64_t rxnfgpc, txdgpc;
+   unsigned i;
+
+   total_missed_rx = 0;
+   total_qbrc = 0;
+   total_qprc = 0;
+   total_qprdc = 0;
+   rxnfgpc = 0;
+   txdgpc = 0;
+
+   ixgbe_read_stats_registers(hw, hw_stats, &total_missed_rx, &total_qbrc,
+   &total_qprc, &rxnfgpc, &txdgpc, &total_qprdc);

if (stats == NULL)
return;
@@ -1902,13 +1922,25 @@ ixgbe_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
stats->ibadcrc  = hw_stats->crcerrs;
stats->ibadlen  = hw_stats->rlec + hw_stats->ruc + hw_stats->roc;
stats->imissed  = total_missed_rx;
-   stats->ierrors

[dpdk-dev] [PATCH v2 3/7] ixgbe: Expose extended error statistics

2015-06-09 Thread Maryam Tahhan
Implement xstats_get() and xstats_reset() in dev_ops for ixgbe to
expose detailed error statistics to DPDK applications.

Signed-off-by: Maryam Tahhan 
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 85 
 1 file changed, 85 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 543e8ab..609bd30 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -131,7 +131,10 @@ static int ixgbe_dev_link_update(struct rte_eth_dev *dev,
int wait_to_complete);
 static void ixgbe_dev_stats_get(struct rte_eth_dev *dev,
struct rte_eth_stats *stats);
+static int ixgbe_dev_xstats_get(struct rte_eth_dev *dev,
+   struct rte_eth_xstats *xstats, unsigned n);
 static void ixgbe_dev_stats_reset(struct rte_eth_dev *dev);
+static void ixgbe_dev_xstats_reset(struct rte_eth_dev *dev);
 static int ixgbe_dev_queue_stats_mapping_set(struct rte_eth_dev *eth_dev,
 uint16_t queue_id,
 uint8_t stat_idx,
@@ -330,7 +333,9 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
.allmulticast_disable = ixgbe_dev_allmulticast_disable,
.link_update  = ixgbe_dev_link_update,
.stats_get= ixgbe_dev_stats_get,
+   .xstats_get   = ixgbe_dev_xstats_get,
.stats_reset  = ixgbe_dev_stats_reset,
+   .xstats_reset = ixgbe_dev_xstats_reset,
.queue_stats_mapping_set = ixgbe_dev_queue_stats_mapping_set,
.dev_infos_get= ixgbe_dev_info_get,
.mtu_set  = ixgbe_dev_mtu_set,
@@ -408,6 +413,33 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
.mac_addr_remove  = ixgbevf_remove_mac_addr,
 };

+/* store statistics names and its offset in stats structure  */
+struct rte_ixgbe_xstats_name_off {
+   char name[RTE_ETH_XSTATS_NAME_SIZE];
+   unsigned offset;
+};
+
+static const struct rte_ixgbe_xstats_name_off rte_ixgbe_stats_strings[] = {
+   {"rx_illegal_byte_err", offsetof(struct ixgbe_hw_stats, errbc)},
+   {"rx_len_err", offsetof(struct ixgbe_hw_stats, rlec)},
+   {"rx_undersize_count", offsetof(struct ixgbe_hw_stats, ruc)},
+   {"rx_oversize_count", offsetof(struct ixgbe_hw_stats, roc)},
+   {"rx_fragment_count", offsetof(struct ixgbe_hw_stats, rfc)},
+   {"rx_jabber_count", offsetof(struct ixgbe_hw_stats, rjc)},
+   {"l3_l4_xsum_error", offsetof(struct ixgbe_hw_stats, xec)},
+   {"mac_local_fault", offsetof(struct ixgbe_hw_stats, mlfc)},
+   {"mac_remote_fault", offsetof(struct ixgbe_hw_stats, mrfc)},
+   {"mac_short_pkt_discard", offsetof(struct ixgbe_hw_stats, mspdc)},
+   {"fccrc_error", offsetof(struct ixgbe_hw_stats, fccrc)},
+   {"fcoe_drop", offsetof(struct ixgbe_hw_stats, fcoerpdc)},
+   {"fc_last_error", offsetof(struct ixgbe_hw_stats, fclast)},
+   {"rx_broadcast_packets", offsetof(struct ixgbe_hw_stats, bprc)},
+   {"mgmt_pkts_dropped", offsetof(struct ixgbe_hw_stats, mngpdc)},
+};
+
+#define RTE_NB_XSTATS (sizeof(rte_ixgbe_stats_strings) /   \
+   sizeof(rte_ixgbe_stats_strings[0]))
+
 /**
  * Atomically reads the link status information from global
  * structure rte_eth_dev.
@@ -1968,6 +2000,59 @@ ixgbe_dev_stats_reset(struct rte_eth_dev *dev)
memset(stats, 0, sizeof(*stats));
 }

+static int
+ixgbe_dev_xstats_get(struct rte_eth_dev *dev, struct rte_eth_xstats *xstats,
+unsigned n)
+{
+   struct ixgbe_hw *hw =
+   IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct ixgbe_hw_stats *hw_stats =
+   IXGBE_DEV_PRIVATE_TO_STATS(dev->data->dev_private);
+   uint64_t total_missed_rx, total_qbrc, total_qprc, total_qprdc;
+   uint64_t rxnfgpc, txdgpc;
+   unsigned i, count = RTE_NB_XSTATS;
+
+   if (n < count)
+   return count;
+
+   total_missed_rx = 0;
+   total_qbrc = 0;
+   total_qprc = 0;
+   total_qprdc = 0;
+   rxnfgpc = 0;
+   txdgpc = 0;
+   count = 0;
+
+   ixgbe_read_stats_registers(hw, hw_stats, &total_missed_rx, &total_qbrc,
+  &total_qprc, &rxnfgpc, &txdgpc, &total_qprdc);
+
+   if (!xstats)
+   return 0;
+
+   /* Error stats */
+   for (i = 0; i < RTE_NB_XSTATS; i++) {
+   snprintf(xstats[count].name, sizeof(xstats[count].name),
+   "%s", rte_ixgbe_stats_strings[i].name);
+   xstats[count++].value = *(uint64_t *)(((char *)hw_stats) +
+   rte_ixgbe_stats_strings[i].offset);
+   }
+
+   return count;
+}
+
+static void
+ixgbe_dev_xstats_reset(struct rte_eth_dev *dev)
+{
+   struct i

[dpdk-dev] [PATCH v2 4/7] ethdev: expose extended error stats

2015-06-09 Thread Maryam Tahhan
Extend rte_eth_xstats_get to retrieve additional stats from the device
driver as well as the top-level extended stats.

Signed-off-by: Maryam Tahhan 
---
 lib/librte_ether/rte_ethdev.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index a439b4a..ce163a1 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1745,7 +1745,7 @@ rte_eth_xstats_get(uint8_t port_id, struct rte_eth_xstats *xstats,
 {
struct rte_eth_stats eth_stats;
struct rte_eth_dev *dev;
-   unsigned count, i, q;
+   unsigned count = 0, xcount = 0, i, q;
uint64_t val;
char *stats_ptr;

@@ -1758,18 +1758,19 @@ rte_eth_xstats_get(uint8_t port_id, struct rte_eth_xstats *xstats,

/* implemented by the driver */
if (dev->dev_ops->xstats_get != NULL)
-   return (*dev->dev_ops->xstats_get)(dev, xstats, n);
+   xcount = (*dev->dev_ops->xstats_get)(dev, xstats, n);

/* else, return generic statistics */
count = RTE_NB_STATS;
count += dev->data->nb_rx_queues * RTE_NB_RXQ_STATS;
count += dev->data->nb_tx_queues * RTE_NB_TXQ_STATS;
+   count += xcount;
if (n < count)
return count;

/* now fill the xstats structure */

-   count = 0;
+   count = xcount;
rte_eth_stats_get(port_id, &eth_stats);

/* global stats */
-- 
1.9.3
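
As an illustration of the "if (n < count) return count;" convention visible in the diff above, here is a small self-contained sketch of how a caller might size its buffer: call once with n = 0 to learn the required number of entries, allocate, then call again. Everything prefixed demo_ is a hypothetical stand-in and not the rte_ethdev API; the ordering (driver entries first, generic entries after) mirrors the "count = xcount" line in the patch, and the counts themselves are made up.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an extended-stat entry. */
struct demo_xstat {
	char name[64];
	uint64_t value;
};

#define DEMO_DRIVER_STATS  2 /* made-up number of driver-specific stats */
#define DEMO_GENERIC_STATS 3 /* made-up number of generic stats */

/*
 * Stand-in getter. If the caller's array is too small, nothing is written
 * and the required number of entries is returned; otherwise the array is
 * filled and the number of entries written is returned.
 */
static int
demo_xstats_get(struct demo_xstat *xstats, unsigned n)
{
	unsigned count = DEMO_DRIVER_STATS + DEMO_GENERIC_STATS;
	unsigned i;

	if (n < count)
		return (int)count;

	for (i = 0; i < count; i++) {
		snprintf(xstats[i].name, sizeof(xstats[i].name), "%s_%u",
			 i < DEMO_DRIVER_STATS ? "driver" : "generic", i);
		xstats[i].value = i;
	}
	return (int)count;
}

/* Caller side: query the size, allocate, then fetch. Caller frees. */
static struct demo_xstat *
demo_fetch_all(int *len)
{
	struct demo_xstat *xstats;
	int n = demo_xstats_get(NULL, 0);

	xstats = calloc((size_t)n, sizeof(*xstats));
	if (xstats == NULL)
		return NULL;
	*len = demo_xstats_get(xstats, (unsigned)n);
	return xstats;
}
```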



[dpdk-dev] [PATCH v2 5/7] testpmd: extend testpmd to show all extended stats

2015-06-09 Thread Maryam Tahhan
Extend testpmd to show additional aggregate extended stats.

Signed-off-by: Maryam Tahhan 
---
 app/test-pmd/config.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index f788ed5..b42d83f 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -153,6 +153,11 @@ nic_stats_display(portid_t port_id)
   stats.opackets, stats.oerrors, stats.obytes);
}

+   printf("  RX-MAC-errors: %-10"PRIu64" RX-PHY-errors: %-10"PRIu64"\n",
+  stats.imacerr, stats.iphyerr);
+   printf("  RX-nombuf:  %-10"PRIu64"  RX-dropped: %-10"PRIu64"\n",
+  stats.rx_nombuf, stats.idrop);
+
/* stats fdir */
if (fdir_conf.mode != RTE_FDIR_MODE_NONE)
printf("  Fdirmiss:   %-10"PRIu64" Fdirmatch: %-10"PRIu64"\n",
-- 
1.9.3



[dpdk-dev] [PATCH v2 6/7] app: remove dump_cfg

2015-06-09 Thread Maryam Tahhan
Remove the dump_cfg application. It will be replaced by a new app
called proc_info that implements the same functionality as dump_cfg
and extends it to retrieve statistics for DPDK ports.

Signed-off-by: Maryam Tahhan 
---
 app/Makefile  |  1 -
 app/dump_cfg/Makefile | 45 -
 app/dump_cfg/main.c   | 92 ---
 3 files changed, 138 deletions(-)
 delete mode 100644 app/dump_cfg/Makefile
 delete mode 100644 app/dump_cfg/main.c

diff --git a/app/Makefile b/app/Makefile
index 50c670b..81bd222 100644
--- a/app/Makefile
+++ b/app/Makefile
@@ -36,6 +36,5 @@ DIRS-$(CONFIG_RTE_LIBRTE_ACL) += test-acl
 DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += test-pipeline
 DIRS-$(CONFIG_RTE_TEST_PMD) += test-pmd
 DIRS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_test
-DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += dump_cfg

 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/app/dump_cfg/Makefile b/app/dump_cfg/Makefile
deleted file mode 100644
index 3257127..000
--- a/app/dump_cfg/Makefile
+++ /dev/null
@@ -1,45 +0,0 @@
-#   BSD LICENSE
-#
-#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
-#   All rights reserved.
-#
-#   Redistribution and use in source and binary forms, with or without
-#   modification, are permitted provided that the following conditions
-#   are met:
-#
-# * Redistributions of source code must retain the above copyright
-#   notice, this list of conditions and the following disclaimer.
-# * Redistributions in binary form must reproduce the above copyright
-#   notice, this list of conditions and the following disclaimer in
-#   the documentation and/or other materials provided with the
-#   distribution.
-# * Neither the name of Intel Corporation nor the names of its
-#   contributors may be used to endorse or promote products derived
-#   from this software without specific prior written permission.
-#
-#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-include $(RTE_SDK)/mk/rte.vars.mk
-
-APP = dump_cfg
-
-CFLAGS += $(WERROR_FLAGS)
-
-# all source are stored in SRCS-y
-
-SRCS-y := main.c
-
-# this application needs libraries first
-DEPDIRS-y += lib
-
-include $(RTE_SDK)/mk/rte.app.mk
diff --git a/app/dump_cfg/main.c b/app/dump_cfg/main.c
deleted file mode 100644
index 127dbb1..000
--- a/app/dump_cfg/main.c
+++ /dev/null
@@ -1,92 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- * * Redistributions of source code must retain the above copyright
- *   notice, this list of conditions and the following disclaimer.
- * * Redistributions in binary form must reproduce the above copyright
- *   notice, this list of conditions and the following disclaimer in
- *   the documentation and/or other materials provided with the
- *   distribution.
- * * Neither the name of Intel Corporation nor the names of its
- *   contributors may be used to endorse or promote products derived
- *   from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
-#inclu

[dpdk-dev] [PATCH v2 7/7] app: add a new app proc_info

2015-06-09 Thread Maryam Tahhan
proc_info displays statistics information, including extended stats, for
given DPDK ports and dumps the memory information for DPDK.

Signed-off-by: Maryam Tahhan 
---
 MAINTAINERS|   4 +
 app/Makefile   |   1 +
 app/proc_info/Makefile |  45 +
 app/proc_info/main.c   | 514 +
 mk/rte.sdktest.mk  |   4 +-
 5 files changed, 566 insertions(+), 2 deletions(-)
 create mode 100644 app/proc_info/Makefile
 create mode 100644 app/proc_info/main.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 9362c19..94e0300 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -485,3 +485,7 @@ F: doc/guides/sample_app_ug/skeleton.rst
 F: examples/vmdq/
 F: examples/vmdq_dcb/
 F: doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst
+
+M: Maryam Tahhan 
+F: app/proc_info/
diff --git a/app/Makefile b/app/Makefile
index 81bd222..88c0bad 100644
--- a/app/Makefile
+++ b/app/Makefile
@@ -36,5 +36,6 @@ DIRS-$(CONFIG_RTE_LIBRTE_ACL) += test-acl
 DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += test-pipeline
 DIRS-$(CONFIG_RTE_TEST_PMD) += test-pmd
 DIRS-$(CONFIG_RTE_LIBRTE_CMDLINE) += cmdline_test
+DIRS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += proc_info

 include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/app/proc_info/Makefile b/app/proc_info/Makefile
new file mode 100644
index 000..6759547
--- /dev/null
+++ b/app/proc_info/Makefile
@@ -0,0 +1,45 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+APP = proc_info
+
+CFLAGS += $(WERROR_FLAGS)
+
+# all source are stored in SRCS-y
+
+SRCS-y := main.c
+
+# this application needs libraries first
+DEPDIRS-y += lib
+
+include $(RTE_SDK)/mk/rte.app.mk
diff --git a/app/proc_info/main.c b/app/proc_info/main.c
new file mode 100644
index 000..e948d7f
--- /dev/null
+++ b/app/proc_info/main.c
@@ -0,0 +1,514 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER

[dpdk-dev] [PATCH 0/6] fm10k: A series of bug fixes

2015-06-09 Thread Qiu, Michael
On 2015/5/29 16:11, Chen, Jing D wrote:
> From: "Chen Jing D(Mark)" 
>
> This patch set include a few bug fixes and enhancements on fm10k driver.
>
> Chen Jing D(Mark) (6):
>   fm10k: Fix improper RX buffer size assignment
>   fm10k: Fix jumbo frame issue
>   fm10k: Fix data integrity issue with multi-segment frame
>   fm10k: Fix issue that MAC addr can't be set to silicon
>   fm10k: Do sanity check on mac address
>   fm10k: Add default mac/vlan filter to SM
>
>  drivers/net/fm10k/fm10k.h|5 +-
>  drivers/net/fm10k/fm10k_ethdev.c |  100 ++---
>  drivers/net/fm10k/fm10k_rxtx.c   |   15 +-
>  3 files changed, 86 insertions(+), 34 deletions(-)
>
Acked-by: Michael Qiu 


[dpdk-dev] rte_memcmp: comments from glibc side.

2015-06-09 Thread Ondřej Bílka
Hi,

As the glibc developer who wrote the current strcmp code, I have some comments.

First, the gcc builtins for *cmp are garbage: they produce rep cmpsb,
which is slower than a byte-by-byte loop. So compile your test again with
-fno-builtin-memcmp and your performance gain will probably disappear.

Then there is inlining. It's correct to do that for the first 32 bytes, and I
plan to add a header that does that check to improve performance. However,
not for bytes after the 32nd. That's very cold code: only 5.6% of calls reach
the 17th byte and 1.7% of calls read the 33rd byte, so just do a libcall to
save size.

That also makes avx2 pointless. For most string functions avx2 doesn't
give you gains, as xmm has better latency for the first 64 bytes, and while
the loop is faster, it's also relatively cold as it's almost never reached.

For memcmp I posted on the gcc list a sample implementation of how it should
do inlining. I found that gcc optimizes that better than expected and
produces a probably optimal header (see below and feel free to use it).

When you care about sign, it's better to load the first 8 bytes and convert
them to big endian, where you can compare directly. When you don't, gcc
managed to optimize away the bswap, so you check 8 bytes with the three
instructions below. Now I think that in the header we shouldn't use sse at
all.

 190:   48 8b 4e 08             mov    0x8(%rsi),%rcx
 194:   48 39 4f 08             cmp    %rcx,0x8(%rdi)
 198:   75 f3                   jne    18d
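
To make the 8-byte case concrete, here is a minimal sketch, assuming GCC or Clang (for __builtin_bswap64 and the predefined __BYTE_ORDER__ macros). The helper names are made up for illustration and this is not the glibc code. The ordered variant swaps to big endian so that integer order matches memcmp byte order; the equality-only variant needs no swap, which is why the compiler can reduce it to a load, a compare and a branch.

```c
#include <stdint.h>
#include <string.h>

/*
 * Load 8 bytes as a big-endian value so that unsigned integer order
 * matches memcmp() byte order. memcpy is used for the load to stay
 * clear of alignment and aliasing problems.
 */
static uint64_t
load_be64(const void *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	v = __builtin_bswap64(v);
#endif
	return v;
}

/* Ordered compare of exactly 8 bytes: negative, zero or positive. */
static int
cmp8_ordered(const void *a, const void *b)
{
	uint64_t x = load_be64(a);
	uint64_t y = load_be64(b);

	return (x > y) - (x < y);
}

/* Equality-only compare of exactly 8 bytes: no byte swap required. */
static int
cmp8_equal(const void *a, const void *b)
{
	uint64_t x, y;

	memcpy(&x, a, sizeof(x));
	memcpy(&y, b, sizeof(y));
	return x == y;
}
```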

As for the statistics I mentioned, on my computer memcmp has the following:

calls 1430827
average n: 7.4
  n <= 0: 0.1%  n <= 4: 36.3%  n <= 8: 78.4%  n <= 16: 94.4%
  n <= 24: 97.3%  n <= 32: 98.3%  n <= 48: 98.6%  n <= 64: 99.9%
s aligned to 4 bytes: 99.8%  8 bytes: 97.5%  16 bytes: 59.5%
average *s access cache latency: 3.6
  l <= 8: 92.0%  l <= 16: 96.1%  l <= 32: 98.9%  l <= 64: 99.4%  l <= 128: 99.5%
s2 aligned to 4 bytes: 24.1%  8 bytes: 13.1%  16 bytes: 8.2%
s-s2 aligned to 4 bytes: 24.1%  8 bytes: 15.4%  16 bytes: 10.3%
average *s2 access cache latency: 1.5
  l <= 8: 98.0%  l <= 16: 99.6%  l <= 32: 99.9%  l <= 64: 100.0%  l <= 128: 100.0%
average capacity: 8.5
  c <= 0: 0.0%  c <= 4: 36.0%  c <= 8: 78.3%  c <= 16: 91.8%
  c <= 24: 94.8%  c <= 32: 95.7%  c <= 48: 96.1%  c <= 64: 99.9%

#include <stdint.h>
#include <string.h>

#undef memcmp
#define memcmp(x, y, n) (__builtin_constant_p (n) && n < 64 \
 ? __memcmp_inline (x, y, n) \
 : memcmp (x, y, n))

#define LOAD8(x) (*((uint8_t *) (x)))
#define LOAD32(x) (*((uint32_t *) (x)))
#define LOAD64(x) (*((uint64_t *) (x)))

#define CHECK(tp, n)
#if __BYTE_ORDER == __LITTLE_ENDIAN
# define SWAP32(x) __builtin_bswap32 (LOAD32 (x))
# define SWAP64(x) __builtin_bswap64 (LOAD64 (x))
#else
# define SWAP32(x) LOAD32 (x)
# define SWAP64(x) LOAD64 (x)
#endif

#define __ARCH_64BIT 1

static __always_inline
int
check (uint64_t x, uint64_t y)
{
  if (x == y)
return 0;
  if (x > y)
return 1;

  return -1;
}

static __always_inline
int
check_nonzero (uint64_t x, uint64_t y)
{
  if (x > y)
return 1;

  return -1;
}


static __always_inline
int
__memcmp_inline (void *x, void *y, size_t n)
{
#define CHECK1 if (LOAD8 (x + i) - LOAD8 (y + i)) \
return check_nonzero (LOAD8 (x + i), LOAD8 (y + i)); i = i + 1;
#define CHECK4 if (i == 0 ? SWAP32 (x + i) - SWAP32 (y + i)\
  : LOAD32 (x + i) - LOAD32 (y + i)) \
return check_nonzero (SWAP32 (x + i), SWAP32 (y + i)); i = i + 4;
#define CHECK8 if (i == 0 ? SWAP64 (x + i) - SWAP64 (y + i)\
  : LOAD64 (x + i) - LOAD64 (y + i)) \
return check_nonzero (SWAP64 (x + i), SWAP64 (y + i)); i = i + 8;

#define CHECK1FINAL(o) return check (LOAD8 (x + i + o), LOAD8 (y + i + o));
#define CHECK4FINAL(o) return check (SWAP32 (x + i + o), SWAP32 (y + i + o));
#define CHECK8FINAL(o) return check (SWAP64 (x + i + o), SWAP64 (y + i + o));

#if __ARCH_64BIT == 0
# undef CHECK8
# undef CHECK8FINAL
# define CHECK8 CHECK4 CHECK4
# define CHECK8FINAL(o) CHECK4 CHECK4FINAL (o)
#endif

#define LOOP if (i + 8 < n) { CHECK8 } \
if (i + 8 < n) { CHECK8 } \
if (i + 8 < n) { CHECK8 } \
if (i + 8 < n) { CHECK8 } \
if (i + 8 < n) { CHECK8 } \
if (i + 8 < n) { CHECK8 } \
if (i + 8 < n) { CHECK8 } \
if (i + 8 < n) { CHECK8 } 


  long i = 0;

  switch (n % 8)
    {
    case 0:
      if (n == 0)
        return 0;

      LOOP; CHECK8FINAL (0);
    case 1:
      LOOP CHECK1FINAL (0);
    case 2:
      if (n == 2)
        {
          CHECK1 CHECK1FINAL (0);
        }
      LOOP CHECK4FINAL (-2);
    case 3:
      if (n == 3)
        {
          CHECK1 CHECK1 CHECK1FINAL (0);
        }
      LOOP CHECK4FINAL (-1);
    case 4:
      LOOP CHECK4FINAL (0);
    case 5:
      if (n == 5)
        {
          CHECK4 CHECK1FINAL (0);
        }
#if __ARCH_64BIT
      LOOP CHECK8FINAL (-3);
#else
      LOOP CHECK4 CHECK1FINAL (0);
#endif
    case 6:
      if (n == 6)
        {
          CHECK4 CHECK4FINAL (-2);
        }
      LOOP CHECK8FINAL (-2);
    case 7:
      if (n == 7)
   

[dpdk-dev] [PATCH] dpdk1.7.1 rte.app.mk add options not not build targerts

2015-06-09 Thread Maxim Uvarov
On 06/09/15 17:37, Olivier MATZ wrote:
> Hi Maxim,
>
> On 06/09/2015 02:59 PM, Maxim Uvarov wrote:
>> On 06/09/15 15:05, Olivier MATZ wrote:
>>> Hello Maxim,
>>>
>>> On 06/09/2015 12:15 PM, Maxim Uvarov wrote:
>>>> Inherit build variables only so that this file can be included
>>>> from other projects.
>>>>
>>>> Signed-off-by: Maxim Uvarov
>>>
>>> Can you detail a bit more what you want to do?
>>> Why do you need to include rte.app.mk? This file is
>>> internal to the dpdk framework.
>>>
>>> By the way, the title is not understandable:
>>> - why dpdk1.7.1 ?
>>> - targerts -> targets
>>> - not not ?
>>>
>>> Regards,
>>> Olivier
>>
>> Sorry, it was a quick patch and there are some typos in it. I intended to
>> discuss the idea of what I need and whether it might be useful for others.
>> I did an ODP implementation with dpdk as the back end, and stayed on v1.7.1.
>> But that patch should be good for the latest git;
>> if not, I can update it.
>>
>> So my environment is: I build a library which calls dpdk functions. That
>> library is used to build applications. I need to steal the CFLAGS, LDFLAGS,
>> and build script from dpdk for my library and example apps. So I just
>> point to where dpdk is, and my library build system should inherit the same
>> env that dpdk used. One reason is optimization, and the second reason is to
>> compile in the dpdk PMD drivers the same way as dpdk does.
>>
>> So in my Makefile I do: include $dpdk/mk/rte.app.mk
>>
>> Is that needed for somebody else?
>
> Maybe you can use rte.extapp.mk and rte.extlib.mk instead?
>
> There is no example for rte.extlib.mk, but it works the same
> as rte.extapp.mk. You can start from an example in dpdk/examples
> directory (for instance skeleton):
> - remove the main()
> - change "APP = basicfwd" to "LIB = basicfwd.a"
> - change "include $(RTE_SDK)/mk/rte.extapp.mk" to
>   "include $(RTE_SDK)/mk/rte.extlib.mk"
>
> Then:
>   cd examples/skeleton
>   make RTE_SDK=/path/to/dpdk \
> RTE_TARGET=x86_64-native-linuxapp-gcc \
> O=/path/to/dstdir
>
> This should generate a static lib that you can use in another
> application example.
>
> If you cannot use this model, another solution would be to generate
> a pkg-config file in dpdk framework that could be used by other
> build frameworks.
>
> Regards,
> Olivier
>

mk/rte.extlib.mk also references mk/rte.lib.mk, which has an all: target.
So as soon as I include that Makefile, it will run make all first. But
I only need the cflags and ldflags.

To link the PMDs we used this hack:
https://git.linaro.org/lng/odp-dpdk.git/commitdiff/9e41f167a8f44b74af6a1e1ffe00dc6d305ac8a4?hp=ac1789bfe9ceb6bbe04b6455f996680a20441813
which mostly solved the problem. But I would still like to add the SSE and
other cflags, especially for inline functions, to our link.

I will take a look at examples/skeleton. I had only looked at 1.7.1 before, and
there is no such example there. It looks like it appeared later.

Thank you,
Maxim.


[dpdk-dev] [PATCH 1/3] fm10k: Add promiscuous mode support

2015-06-09 Thread Qiu, Michael
On 2015/6/5 17:03, Chen, Jing D wrote:
> From: "Chen Jing D(Mark)" 
>
> Add functions to support promiscuous/allmulticast enable and
> disable.
>
> Signed-off-by: Chen Jing D(Mark) 
> ---
>  drivers/net/fm10k/fm10k_ethdev.c |  118 
> +-
>  1 files changed, 117 insertions(+), 1 deletions(-)
>

...

> +
> +static void
> +fm10k_dev_promiscuous_enable(struct rte_eth_dev *dev)
> +{
> + struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> + int status;
> +
> + PMD_INIT_FUNC_TRACE();
> +
> + /* Return if it didn't acquire valid glort range */
> + if (!fm10k_glort_valid(hw))
> + return;
> +
> + fm10k_mbx_lock(hw);
> + status = hw->mac.ops.update_xcast_mode(hw, hw->mac.dglort_map,
> + FM10K_XCAST_MODE_PROMISC);
> + fm10k_mbx_unlock(hw);
> +
> + if (status != FM10K_SUCCESS)
> + PMD_INIT_LOG(ERR, "Failed to enable promiscuous mode");
> +}
> +
> +static void
> +fm10k_dev_promiscuous_disable(struct rte_eth_dev *dev)
> +{
> + struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> + uint8_t mode;
> + int status;
> +
> + PMD_INIT_FUNC_TRACE();
> +
> + /* Return if it didn't acquire valid glort range */
> + if (!fm10k_glort_valid(hw))
> + return;
> +
> + if (dev->data->all_multicast == 1)
> + mode = FM10K_XCAST_MODE_ALLMULTI;
> + else
> + mode = FM10K_XCAST_MODE_NONE;
> +
> + fm10k_mbx_lock(hw);
> + status = hw->mac.ops.update_xcast_mode(hw, hw->mac.dglort_map,
> + mode);
> + fm10k_mbx_unlock(hw);
> +
> + if (status != FM10K_SUCCESS)
> + PMD_INIT_LOG(ERR, "Failed to disable promiscuous mode");
> +}
> +
> +static void
> +fm10k_dev_allmulticast_enable(struct rte_eth_dev *dev)
> +{
> + struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> + int status;
> +
> + PMD_INIT_FUNC_TRACE();
> +
> + /* Return if it didn't acquire valid glort range */
> + if (!fm10k_glort_valid(hw))
> + return;
> +
> + /* If promiscuous mode is enabled, it doesn't make sense to enable
> +  * allmulticast and disable promiscuous since fm10k only can select
> +  * one of the modes.
> +  */
> + if (dev->data->promiscuous)

Would it be better to add a log here to tell user?

> + return;
> +
> + fm10k_mbx_lock(hw);
> + status = hw->mac.ops.update_xcast_mode(hw, hw->mac.dglort_map,
> + FM10K_XCAST_MODE_ALLMULTI);
> + fm10k_mbx_unlock(hw);
> +
> + if (status != FM10K_SUCCESS)
> + PMD_INIT_LOG(ERR, "Failed to enable allmulticast mode");
> +}
> +
> +static void
> +fm10k_dev_allmulticast_disable(struct rte_eth_dev *dev)
> +{
> + struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> + int status;
> +
> + PMD_INIT_FUNC_TRACE();
> +
> + /* Return if it didn't acquire valid glort range */
> + if (!fm10k_glort_valid(hw))
> + return;
> +
> + if (dev->data->promiscuous)

Also here?

> + return;
> +
> + fm10k_mbx_lock(hw);
> + /* Change mode to unicast mode */
> + status = hw->mac.ops.update_xcast_mode(hw, hw->mac.dglort_map,
> + FM10K_XCAST_MODE_NONE);
> + fm10k_mbx_unlock(hw);
> +
> + if (status != FM10K_SUCCESS)
> + PMD_INIT_LOG(ERR, "Failed to disable allmulticast mode");
> +}
> +
>  /* fls = find last set bit = 32 minus the number of leading zeros */
>  #ifndef fls
>  #define fls(x) (((x) == 0) ? 0 : (32 - __builtin_clz((x))))
> @@ -1654,6 +1766,10 @@ static const struct eth_dev_ops fm10k_eth_dev_ops = {
>   .dev_start  = fm10k_dev_start,
>   .dev_stop   = fm10k_dev_stop,
>   .dev_close  = fm10k_dev_close,
> + .promiscuous_enable = fm10k_dev_promiscuous_enable,
> + .promiscuous_disable= fm10k_dev_promiscuous_disable,
> + .allmulticast_enable= fm10k_dev_allmulticast_enable,
> + .allmulticast_disable   = fm10k_dev_allmulticast_disable,
>   .stats_get  = fm10k_stats_get,
>   .stats_reset= fm10k_stats_reset,
>   .link_update= fm10k_link_update,
> @@ -1819,7 +1935,7 @@ eth_fm10k_dev_init(struct rte_eth_dev *dev)
>* API func.
>*/
>   hw->mac.ops.update_xcast_mode(hw, hw->mac.dglort_map,
> - FM10K_XCAST_MODE_MULTI);
> + FM10K_XCAST_MODE_NONE);
>  
>   fm10k_mbx_unlock(hw);
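
For readers following the review, the mode selection that the four callbacks
in this patch implement together can be summarised by the small sketch below.
It is a simplified model and not driver code: the enum and the function name
xcast_mode_for are hypothetical stand-ins for the FM10K_XCAST_MODE_* values
and for the checks spread across the enable/disable functions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the FM10K_XCAST_MODE_* constants. */
enum xcast_mode { XCAST_NONE, XCAST_ALLMULTI, XCAST_PROMISC };

/*
 * Mode the port should end up in for a given pair of ethdev flags.
 * Promiscuous wins, because the hardware can select only one mode
 * at a time; allmulticast changes are ignored while it is set.
 */
static enum xcast_mode
xcast_mode_for(bool promiscuous, bool all_multicast)
{
	if (promiscuous)
		return XCAST_PROMISC;
	if (all_multicast)
		return XCAST_ALLMULTI;
	return XCAST_NONE;
}
```

The early "return" paths that the review comments ask to log correspond to
the first branch: an allmulticast request arrives while promiscuous is set
and is silently dropped.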
>  



[dpdk-dev] [PATCH 3/3] fm10k: update VLAN offload features

2015-06-09 Thread Qiu, Michael
On 2015/6/9 11:27, Chen, Jing D wrote:
> Hi,
>
>
>> -Original Message-
>> From: He, Shaopeng
>> Sent: Tuesday, June 02, 2015 10:59 AM
>> To: dev at dpdk.org
>> Cc: Chen, Jing D; Qiu, Michael; He, Shaopeng
>> Subject: [PATCH 3/3] fm10k: update VLAN offload features
>>
>> Fm10k PF/VF does not support QinQ; VLAN strip and filter are always on
>> for PF/VF ports.
>>
>> Signed-off-by: Shaopeng He 
>> ---
>>  drivers/net/fm10k/fm10k_ethdev.c | 22 ++
>>  1 file changed, 22 insertions(+)
>>
>> diff --git a/drivers/net/fm10k/fm10k_ethdev.c
>> b/drivers/net/fm10k/fm10k_ethdev.c
>> index 4f23bf1..9b198a7 100644
>> --- a/drivers/net/fm10k/fm10k_ethdev.c
>> +++ b/drivers/net/fm10k/fm10k_ethdev.c
>> @@ -884,6 +884,27 @@ fm10k_vlan_filter_set(struct rte_eth_dev *dev,
>> uint16_t vlan_id, int on)
>>  return (-EIO);
>>  }
>>
>> +static void
>> +fm10k_vlan_offload_set(__rte_unused struct rte_eth_dev *dev, int mask)
>> +{
>> +if (mask & ETH_VLAN_STRIP_MASK) {
>> +if (!dev->data->dev_conf.rxmode.hw_vlan_strip)
>> +PMD_INIT_LOG(ERR, "VLAN stripping is "
>> +"always on in fm10k");
>> +}
>> +
>> +if (mask & ETH_VLAN_EXTEND_MASK) {
>> +if (dev->data->dev_conf.rxmode.hw_vlan_extend)
>> +PMD_INIT_LOG(ERR, "VLAN QinQ is not "
>> +"supported in fm10k");
>> +}
>> +
>> +if (mask & ETH_VLAN_FILTER_MASK) {
>> +if (!dev->data->dev_conf.rxmode.hw_vlan_filter)
>> +PMD_INIT_LOG(ERR, "VLAN filter is always on in fm10k");
>> +}
>> +}
>> +
> Update fm10k_dev_infos_get() to configure above options to expected values?

Would it be better to also set the CRC strip option to its expected value
there, for convenience?

Thanks,
Michael
>>  /* Add/Remove a MAC address, and update filters */
>>  static void
>>  fm10k_MAC_filter_set(struct rte_eth_dev *dev, const u8 *mac, bool add)
>> @@ -1801,6 +1822,7 @@ static const struct eth_dev_ops
>> fm10k_eth_dev_ops = {
>>  .link_update= fm10k_link_update,
>>  .dev_infos_get  = fm10k_dev_infos_get,
>>  .vlan_filter_set= fm10k_vlan_filter_set,
>> +.vlan_offload_set   = fm10k_vlan_offload_set,
>>  .mac_addr_add   = fm10k_macaddr_add,
>>  .mac_addr_remove= fm10k_macaddr_remove,
>>  .rx_queue_start = fm10k_dev_rx_queue_start,
>> --
>> 1.9.3
>



[dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh

2015-06-09 Thread Ananyev, Konstantin


> -Original Message-
> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
> Sent: Tuesday, June 09, 2015 4:08 PM
> To: Ananyev, Konstantin; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
> 
> 
> 
> On 09/06/15 12:18, Ananyev, Konstantin wrote:
> >
> >
> >> -Original Message-
> >> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
> >> Sent: Wednesday, June 03, 2015 6:47 PM
> >> To: Ananyev, Konstantin; dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
> >>
> >>
> >>
> >> On 02/06/15 18:35, Ananyev, Konstantin wrote:
> >>>
> >>>
>  -Original Message-
>  From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
>  Sent: Tuesday, June 02, 2015 4:08 PM
>  To: Ananyev, Konstantin; dev at dpdk.org
>  Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
> 
> 
> 
>  On 02/06/15 14:31, Ananyev, Konstantin wrote:
> > Hi Zoltan,
> >
> >> -Original Message-
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
> >> Sent: Monday, June 01, 2015 5:16 PM
> >> To: dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
> >>
> >> Hi,
> >>
> >> Anyone would like to review this patch? Venky sent a NAK, but I've
> >> explained to him why it is a bug.
> >
> >
> > Well, I think Venky is right here.
>  I think the comments above rte_eth_tx_burst() definition are quite clear
>  about what tx_free_thresh means, e1000 and i40e use it that way, but not
>  ixgbe.
> 
> > Indeed that fix, will cause more often unsuccessful checks for DD bits 
> > and might cause a
> > slowdown for TX fast-path.
>  Not if the applications set tx_free_thresh according to the definition
>  of this value. But we can change the default value from 32 to something
>  higher, e.g I'm using nb_desc/2, and it works out well.
> >>>
> >>> Sure we can, as I said below, we can unify it one way or another.
> >>> One way would be to make fast-path TX to free TXDs when number of 
> >>> occupied TXDs raises above tx_free_thresh
> >>> (what rte_ethdev.h comments say and what full-featured TX is doing).
> >>> Though in that case we have to change default value for tx_free_thresh, 
> >>> and all existing apps that
> >>> using tx_free_thresh==32 and fast-path TX will probably experience a 
> >>> slowdown.
> >>
> >> They are in trouble already, because i40e and e1000 uses it as defined.
> >
> > In fact, i40e has exactly the same problem as ixgbe:
> > fast-path and full-featured TX  code treat  tx_free_thresh in a different 
> > way.
> > igb just ignores input tx_free_thresh, while em has only full featured path.
> >
> > What I am saying, existing app that uses TX fast-path and sets 
> > tx_free_thresh=32
> > (as we did in our examples in previous versions) will experience a slowdown,
> > if we'll make all TX functions to behave like full-featured ones
> > (txq->nb_tx_desc - txq->nb_tx_free > txq->tx_free_thresh).
> >
> >  From other side, if app uses TX full-featured TX and sets 
> > tx_free_thresh=32,
> > then it  already has a possible slowdown, because of too often TXDs 
> > checking.
> > So, if we'll change tx_free_thresh semantics to what fast-path uses,
> > It shouldn't see any slowdown, in fact it might see some improvement.
> >
> >> But I guess most apps are going with 0, which sets the drivers default.
> >> Others have to change the value to nb_txd - curr_value to have the same
> >> behaviour
> >>
> >>> Another way would be to make all TX functions to treat 
> >>> tx_conf->tx_free_thresh as fast-path TX functions do
> >>> (free TXDs when number of free TXDs drops below  tx_free_thresh) and 
> >>> update  rte_ethdev.h comments.
> >> And i40e and e1000e code as well. I don't see what difference it makes
> >> which way of definition you use, what I care is that it should be used
> >> consistently.
> >
> > Yes, both ways are possible, the concern is - how to minimise the impact 
> > for existing apps.
> > That's why I am leaning to the fast-path way.
> 
> Make sense to favour the fast-path way, I'll look into that and try to
> come up with a patch
> 
> >
> >>>
> >>> Though, I am not sure that it really worth all these changes.
> >>>   From one side, whatever tx_free_thresh would be,
> >>> the app should still assume that the worst case might happen,
> >>> and up to nb_tx_desc mbufs can be consumed by the queue.
> >>>   From other side, I think the default value should work well for most 
> >>> cases.
> >>> So I am still for graceful deprecation of that config parameter, see 
> >>> below.
> >>>
> 
> > Anyway, with current PMD implementation, you can't guarantee that at 
> > any moment
> > TX queue wouldn't use more than tx_free_thresh mbufs.
> 
> 
> > There could be situations (low speed, or link is down for some short 
> >>>
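
The two readings of tx_free_thresh discussed in this thread can be put side
by side as follows. This is a sketch under stated assumptions, not code from
any PMD: struct txq_model and the three function names are invented for
illustration, and only the field names are taken from the condition quoted
above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a TX queue. */
struct txq_model {
	uint16_t nb_tx_desc;     /* ring size */
	uint16_t nb_tx_free;     /* descriptors currently free */
	uint16_t tx_free_thresh; /* configured threshold */
};

/* Full-featured reading: free once the number of used descriptors
 * rises above the threshold. */
static bool
should_free_full_featured(const struct txq_model *q)
{
	return (uint16_t)(q->nb_tx_desc - q->nb_tx_free) > q->tx_free_thresh;
}

/* Fast-path reading: free once the number of free descriptors
 * drops below the threshold. */
static bool
should_free_fast_path(const struct txq_model *q)
{
	return q->nb_tx_free < q->tx_free_thresh;
}

/* Translate a threshold from one reading to the other
 * ("nb_txd - curr_value" in the thread). */
static uint16_t
convert_thresh(uint16_t nb_tx_desc, uint16_t thresh)
{
	return (uint16_t)(nb_tx_desc - thresh);
}
```

With a 512-entry ring and a threshold of 32, the full-featured check fires
as soon as more than 32 descriptors are in use, while the fast-path check
waits until fewer than 32 are free; converting the threshold to 480 makes
the fast-path check fire at the same point as the full-featured one.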

[dpdk-dev] [PATCH 1/3] fm10k: update VLAN filter

2015-06-09 Thread Qiu, Michael
On 2015/6/2 10:59, He, Shaopeng wrote:
> VLAN filter was updated to add/delete one static entry in MAC table for each
> combination of VLAN and MAC address. More sanity checks were added.
>
> Signed-off-by: Shaopeng He 
> ---
>  drivers/net/fm10k/fm10k.h| 23 +
>  drivers/net/fm10k/fm10k_ethdev.c | 55 
> +---
>  2 files changed, 75 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/fm10k/fm10k.h b/drivers/net/fm10k/fm10k.h
> index ad7a7d1..3b95b72 100644
> --- a/drivers/net/fm10k/fm10k.h
> +++ b/drivers/net/fm10k/fm10k.h
> @@ -109,11 +109,31 @@
>  
>  #define FM10K_VLAN_TAG_SIZE 4
>  
> +/* Maximum number of MAC addresses per PF/VF */
> +#define FM10K_MAX_MACADDR_NUM   1
> +
> +#define FM10K_UINT32_BIT_SIZE  (CHAR_BIT * sizeof(uint32_t))
> +#define FM10K_VFTA_SIZE        (4096 / FM10K_UINT32_BIT_SIZE)
> +
> +/* vlan_id is a 12 bit number.
> + * The VFTA array is actually a 4096 bit array, 128 of 32bit elements.
> + * 2^5 = 32. The val of lower 5 bits specifies the bit in the 32bit element.
> + * The higher 7 bit val specifies VFTA array index.
> + */
> +#define FM10K_VFTA_BIT(vlan_id)    (1 << ((vlan_id) & 0x1F))
> +#define FM10K_VFTA_IDX(vlan_id)    ((vlan_id) >> 5)
> +
> +struct fm10k_macvlan_filter_info {
> + uint16_t vlan_num;   /* Total VLAN number */
> + uint32_t vfta[FM10K_VFTA_SIZE];/* VLAN bitmap */
> +};
> +
>  struct fm10k_dev_info {
>   volatile uint32_t enable;
>   volatile uint32_t glort;
>   /* Protect the mailbox to avoid race condition */
>   rte_spinlock_t mbx_lock;
> + struct fm10k_macvlan_filter_info macvlan;
>  };
>  
>  /*
> @@ -137,6 +157,9 @@ struct fm10k_adapter {
>  #define FM10K_DEV_PRIVATE_TO_MBXLOCK(adapter) \
>   (&(((struct fm10k_adapter *)adapter)->info.mbx_lock))
>  
> +#define FM10K_DEV_PRIVATE_TO_MACVLAN(adapter) \
> + (&(((struct fm10k_adapter *)adapter)->info.macvlan))
> +
>  struct fm10k_rx_queue {
>   struct rte_mempool *mp;
>   struct rte_mbuf **sw_ring;
> diff --git a/drivers/net/fm10k/fm10k_ethdev.c 
> b/drivers/net/fm10k/fm10k_ethdev.c
> index 3a26480..d2f3e44 100644
> --- a/drivers/net/fm10k/fm10k_ethdev.c
> +++ b/drivers/net/fm10k/fm10k_ethdev.c
> @@ -819,15 +819,61 @@ fm10k_dev_infos_get(struct rte_eth_dev *dev,
>  static int
>  fm10k_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
>  {
> - struct fm10k_hw *hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> + s32 result;
> + uint32_t vid_idx, vid_bit, mac_index;
> + struct fm10k_hw *hw;
> + struct fm10k_macvlan_filter_info *macvlan;
> + struct rte_eth_dev_data *data = dev->data;
>  
> - PMD_INIT_FUNC_TRACE();
> + hw = FM10K_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> + macvlan = FM10K_DEV_PRIVATE_TO_MACVLAN(dev->data->dev_private);
>  
>   /* @todo - add support for the VF */
>   if (hw->mac.type != fm10k_mac_pf)
>   return -ENOTSUP;
>  
> - return fm10k_update_vlan(hw, vlan_id, 0, on);
> + if (vlan_id > ETH_VLAN_ID_MAX) {
> + PMD_INIT_LOG(ERR, "Invalid vlan_id: must be < 4096");
> + return (-EINVAL);
> + }
> +
> + vid_idx = FM10K_VFTA_IDX(vlan_id);
> + vid_bit = FM10K_VFTA_BIT(vlan_id);
> + /* this VLAN ID is already in the VLAN filter table, return SUCCESS */
> + if (on && (macvlan->vfta[vid_idx] & vid_bit))
> + return 0;
> + /* this VLAN ID is NOT in the VLAN filter table, cannot remove */
> + if (!on && !(macvlan->vfta[vid_idx] & vid_bit)) {
> + PMD_INIT_LOG(ERR, "Invalid vlan_id: not existing "
> + "in the VLAN filter table");
> + return (-EINVAL);
> + }
> +
> + fm10k_mbx_lock(hw);
> + result = fm10k_update_vlan(hw, vlan_id, 0, on);

Would it make sense to release the lock here, so that we do not hold it
for a long time?


> + if (FM10K_SUCCESS == result) {
> + if (on) {
> + macvlan->vlan_num++;
> + macvlan->vfta[vid_idx] |= vid_bit;
> + } else {
> + macvlan->vlan_num--;
> + macvlan->vfta[vid_idx] &= ~vid_bit;
> + }
> +
> + for (mac_index = 0; mac_index < FM10K_MAX_MACADDR_NUM;
> + mac_index++) {
> + if (is_zero_ether_addr(&data->mac_addrs[mac_index]))
> + continue;
> + fm10k_update_uc_addr(hw, hw->mac.dglort_map,
> + data->mac_addrs[mac_index].addr_bytes,
> + vlan_id, on, 0);
> + }
> + }
> + fm10k_mbx_unlock(hw);
> + if (FM10K_SUCCESS == result)
> + return 0;
> + else
> + return (-EIO);
>  }
>  
>  static inline int
> @@ -1701,6 +1747,7 @@ eth_fm10k_dev_init(struct rte_eth_dev *dev)
>  {
>   struct
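
The VFTA bookkeeping in this patch (upper 7 bits of the VLAN ID select the
32-bit word, lower 5 bits select the bit) can be tried out in isolation with
the sketch below. It is an assumption-laden model and not the driver code:
the names vlan_table, vlan_is_set and vlan_update are made up, the hardware
call is left out, and the bit macro uses an unsigned constant so that bit 31
is well defined.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

#define VLAN_ID_MAX       4095
#define VFTA_WORD_BITS    (CHAR_BIT * sizeof(uint32_t))  /* 32 */
#define VFTA_SIZE         (4096 / VFTA_WORD_BITS)        /* 128 words */
#define VFTA_IDX(vlan_id) ((vlan_id) >> 5)
#define VFTA_BIT(vlan_id) (UINT32_C(1) << ((vlan_id) & 0x1F))

/* Hypothetical stand-in for the per-device VLAN bookkeeping. */
struct vlan_table {
	uint16_t vlan_num;         /* number of VLANs currently set */
	uint32_t vfta[VFTA_SIZE];  /* one bit per VLAN ID */
};

/* Caller must pass vlan_id <= VLAN_ID_MAX. */
static int
vlan_is_set(const struct vlan_table *t, uint16_t vlan_id)
{
	return (t->vfta[VFTA_IDX(vlan_id)] & VFTA_BIT(vlan_id)) != 0;
}

/* Same sanity checks as the patch, minus the hardware update. */
static int
vlan_update(struct vlan_table *t, uint16_t vlan_id, int on)
{
	if (vlan_id > VLAN_ID_MAX)
		return -EINVAL;
	if (on && vlan_is_set(t, vlan_id))
		return 0;           /* already present: nothing to do */
	if (!on && !vlan_is_set(t, vlan_id))
		return -EINVAL;     /* cannot remove what is not there */

	if (on) {
		t->vlan_num++;
		t->vfta[VFTA_IDX(vlan_id)] |= VFTA_BIT(vlan_id);
	} else {
		t->vlan_num--;
		t->vfta[VFTA_IDX(vlan_id)] &= ~VFTA_BIT(vlan_id);
	}
	return 0;
}
```

For example, VLAN 100 lands in word 3 (100 >> 5) at bit 4 (100 & 0x1F).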

[dpdk-dev] [PATCH 2/3] fm10k: add MAC filter

2015-06-09 Thread Qiu, Michael
On 2015/6/2 10:59, He, Shaopeng wrote:
> MAC filter function was newly added, each PF and VF can have up to 64 MAC
> addresses. VF filter needs support from PF host, which is not available now.
>
> Signed-off-by: Shaopeng He 
> ---
>  drivers/net/fm10k/fm10k.h|  3 +-
>  drivers/net/fm10k/fm10k_ethdev.c | 90 
> 
>  2 files changed, 85 insertions(+), 8 deletions(-)

...

> ;
> +
> + fm10k_mbx_lock(hw);
> + i = 0;
> + for (j = 0; j < FM10K_VFTA_SIZE; j++) {
> + if (macvlan->vfta[j]) {
> + for (k = 0; k < FM10K_UINT32_BIT_SIZE; k++) {
> + if (macvlan->vfta[j] & (1 << k)) {
> + if (i + 1 > macvlan->vlan_num) {
> + PMD_INIT_LOG(ERR, "vlan number "
> + "not match");
> + fm10k_mbx_unlock(hw);
> + return;
> + }
> + fm10k_update_uc_addr(hw,
> + hw->mac.dglort_map, mac,

Is there an indentation problem before 'mac' here? If not, please ignore
this; it may be an issue with my mail client.

Thanks,
Michael
> + j * FM10K_UINT32_BIT_SIZE + k,
> + add, 0);
> + i++;
> + }
> + }
> + }
> + }
> + fm10k_mbx_unlock(hw);
> +
> + if (add)
> + macvlan->mac_num++;
> + else
> + macvlan->mac_num--;
> +}
> +
> +/* Add a MAC address, and update filters */
> +static void
> +fm10k_macaddr_add(struct rte_eth_dev *dev,
> +  struct ether_addr *mac_addr,
> +  __rte_unused uint32_t index,
> +  __rte_unused uint32_t pool)
> +{
> + fm10k_MAC_filter_set(dev, mac_addr->addr_bytes, TRUE);
> +}
> +
> +/* Remove a MAC address, and update filters */
> +static void
> +fm10k_macaddr_remove(struct rte_eth_dev *dev, uint32_t index)
> +{
> + struct rte_eth_dev_data *data = dev->data;
> +
> + if (index < FM10K_MAX_MACADDR_NUM)
> + fm10k_MAC_filter_set(dev, data->mac_addrs[index].addr_bytes,
> + FALSE);
> +}
> +
>  static inline int
>  check_nb_desc(uint16_t min, uint16_t max, uint16_t mult, uint16_t request)
>  {
> @@ -1728,6 +1801,8 @@ static const struct eth_dev_ops fm10k_eth_dev_ops = {
>   .link_update= fm10k_link_update,
>   .dev_infos_get  = fm10k_dev_infos_get,
>   .vlan_filter_set= fm10k_vlan_filter_set,
> + .mac_addr_add   = fm10k_macaddr_add,
> + .mac_addr_remove= fm10k_macaddr_remove,
>   .rx_queue_start = fm10k_dev_rx_queue_start,
>   .rx_queue_stop  = fm10k_dev_rx_queue_stop,
>   .tx_queue_start = fm10k_dev_tx_queue_start,
> @@ -1809,7 +1884,8 @@ eth_fm10k_dev_init(struct rte_eth_dev *dev)
>   }
>  
>   /* Initialize MAC address(es) */
> - dev->data->mac_addrs = rte_zmalloc("fm10k", ETHER_ADDR_LEN, 0);
> + dev->data->mac_addrs = rte_zmalloc("fm10k",
> + ETHER_ADDR_LEN * FM10K_MAX_MACADDR_NUM, 0);
>   if (dev->data->mac_addrs == NULL) {
>   PMD_INIT_LOG(ERR, "Cannot allocate memory for MAC addresses");
>   return -ENOMEM;
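
The nested loop in fm10k_MAC_filter_set() walks the VLAN bitmap and
rebuilds each VLAN ID as j * 32 + k. A standalone sketch of that walk is
shown below, assuming a callback in place of the fm10k_update_uc_addr()
call; for_each_vlan, vlan_cb and sum_vlan are hypothetical names, and the
shift uses an unsigned constant so that k == 31 is well defined.

```c
#include <stdint.h>

#define WORD_BITS  32u
#define VFTA_WORDS (4096u / WORD_BITS)

/* Hypothetical callback type, standing in for the per-VLAN update. */
typedef void (*vlan_cb)(uint16_t vlan_id, void *arg);

/* Call cb once for every VLAN ID whose bit is set in the bitmap.
 * Returns the number of VLAN IDs visited. */
static unsigned
for_each_vlan(const uint32_t vfta[VFTA_WORDS], vlan_cb cb, void *arg)
{
	unsigned count = 0;

	for (unsigned j = 0; j < VFTA_WORDS; j++) {
		if (vfta[j] == 0)
			continue;   /* skip empty words quickly */
		for (unsigned k = 0; k < WORD_BITS; k++) {
			if (vfta[j] & (UINT32_C(1) << k)) {
				cb((uint16_t)(j * WORD_BITS + k), arg);
				count++;
			}
		}
	}
	return count;
}

/* Example callback: accumulate the visited VLAN IDs. */
static void
sum_vlan(uint16_t vlan_id, void *arg)
{
	*(unsigned *)arg += vlan_id;
}
```

Comparing the returned count with the stored VLAN count gives the same
consistency check that the patch performs with its "vlan number not match"
message.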



[dpdk-dev] [PATCH 4/6] fm10k: Fix issue that MAC addr can't be set to silicon

2015-06-09 Thread Qiu, Michael
On 2015/5/29 16:11, Chen, Jing D wrote:
> From: "Chen Jing D(Mark)" 
>
> In fm10k, PF driver needs to communicate with switch through
> mailbox if it needs to add/delete MAC address.
> This fix will validate if switch is ready before going forward.
> Then, it is necessary to acquire LPORT_MAP info after issuing
> MAC addr request to switch.
>
> Signed-off-by: Chen Jing D(Mark) 
> ---
>  drivers/net/fm10k/fm10k_ethdev.c |   34 +++---
>  1 files changed, 31 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/fm10k/fm10k_ethdev.c 
> b/drivers/net/fm10k/fm10k_ethdev.c
> index 19e718b..dedfbb4 100644
> --- a/drivers/net/fm10k/fm10k_ethdev.c
> +++ b/drivers/net/fm10k/fm10k_ethdev.c
> @@ -45,6 +45,10 @@
>  #define FM10K_MBXLOCK_DELAY_US 20
>  #define UINT64_LOWER_32BITS_MASK 0x00000000ffffffffULL
>  
> +/* Max try times to aquire switch status */
> +#define MAX_QUERY_SWITCH_STATE_TIMES 10
> +/* Wait interval to get switch status */
> +#define WAIT_SWITCH_MSG_US    10
>  /* Number of chars per uint32 type */
>  #define CHARS_PER_UINT32 (sizeof(uint32_t))
>  #define BIT_MASK_PER_UINT32 ((1 << CHARS_PER_UINT32) - 1)
> @@ -1802,6 +1806,32 @@ eth_fm10k_dev_init(struct rte_eth_dev *dev)
>   fm10k_dev_enable_intr_vf(dev);
>   }
>  
> + /* Enable uio intr after callback registered */
> + rte_intr_enable(&(dev->pci_dev->intr_handle));
> +
> + hw->mac.ops.update_int_moderator(hw);
> +
> + /* Make sure Switch Manager is ready before going forward. */
> + if (hw->mac.type == fm10k_mac_pf) {
> + int switch_ready = 0;
> + int i;
> +
> + for (i = 0; i < MAX_QUERY_SWITCH_STATE_TIMES; i++) {
> + fm10k_mbx_lock(hw);
> + hw->mac.ops.get_host_state(hw, &switch_ready);
> + fm10k_mbx_unlock(hw);
> + if (switch_ready)
> + break;
> + /* Delay some time to acquire async LPORT_MAP info. */
> + rte_delay_us(WAIT_SWITCH_MSG_US);
> + }
> +
> + if (switch_ready == 0) {
> + PMD_INIT_LOG(ERR, "switch is not ready");
> + return -1;

Would it be better to return -EIO or another error code here? "-1" does not
match the style of the other return paths in this function.

Thanks,
Michael


> + }
> + }
> +
>   /*
>* Below function will trigger operations on mailbox, acquire lock to
>* avoid race condition from interrupt handler. Operations on mailbox
> @@ -1811,7 +1841,7 @@ eth_fm10k_dev_init(struct rte_eth_dev *dev)
>*/
>   fm10k_mbx_lock(hw);
>   /* Enable port first */
> - hw->mac.ops.update_lport_state(hw, 0, 0, 1);
> + hw->mac.ops.update_lport_state(hw, hw->mac.dglort_map, 1, 1);
>  
>   /* Update default vlan */
>   hw->mac.ops.update_vlan(hw, hw->mac.default_vid, 0, true);
> @@ -1831,8 +1861,6 @@ eth_fm10k_dev_init(struct rte_eth_dev *dev)
>  
>   fm10k_mbx_unlock(hw);
>  
> - /* enable uio intr after callback registered */
> - rte_intr_enable(&(dev->pci_dev->intr_handle));
>  
>   return 0;
>  }
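
The bounded polling loop added by this patch, combined with the -EIO
return suggested in the review, could look roughly like the sketch below.
It is a generic model under stated assumptions: wait_until_ready, ready_fn
and delay_fn are invented names, the callbacks stand in for the locked
get_host_state() query and for rte_delay_us(), and -EIO is the reviewer's
suggestion rather than what the patch returns.

```c
#include <errno.h>

typedef int (*ready_fn)(void *ctx);     /* nonzero when ready */
typedef void (*delay_fn)(unsigned us);  /* sleep or busy-wait */

/* Poll is_ready() up to max_tries times, waiting wait_us between
 * attempts.  Returns 0 once ready, -EIO if it never becomes ready. */
static int
wait_until_ready(ready_fn is_ready, void *ctx, delay_fn delay,
		 unsigned max_tries, unsigned wait_us)
{
	for (unsigned i = 0; i < max_tries; i++) {
		if (is_ready(ctx))
			return 0;
		delay(wait_us);
	}
	return -EIO;
}

/* Example callbacks: become ready after *ctx failed polls, no delay. */
static int
ready_after_n(void *ctx)
{
	int *left = ctx;

	if (*left == 0)
		return 1;
	(*left)--;
	return 0;
}

static void
no_delay(unsigned us)
{
	(void)us;
}
```

Keeping the retry count and the delay as parameters makes the worst-case
wait (max_tries * wait_us) explicit at the call site.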



[dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh

2015-06-09 Thread Zoltan Kiss


On 09/06/15 16:44, Ananyev, Konstantin wrote:
>
>
>> -Original Message-
>> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
>> Sent: Tuesday, June 09, 2015 4:08 PM
>> To: Ananyev, Konstantin; dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
>>
>>
>>
>> On 09/06/15 12:18, Ananyev, Konstantin wrote:
>>>
>>>
 -Original Message-
 From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
 Sent: Wednesday, June 03, 2015 6:47 PM
 To: Ananyev, Konstantin; dev at dpdk.org
 Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh



 On 02/06/15 18:35, Ananyev, Konstantin wrote:
>
>
>> -Original Message-
>> From: Zoltan Kiss [mailto:zoltan.kiss at linaro.org]
>> Sent: Tuesday, June 02, 2015 4:08 PM
>> To: Ananyev, Konstantin; dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh
>>
>>
>>
>> On 02/06/15 14:31, Ananyev, Konstantin wrote:
>>> Hi Zoltan,
>>>
 -Original Message-
 From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
 Sent: Monday, June 01, 2015 5:16 PM
 To: dev at dpdk.org
 Subject: Re: [dpdk-dev] [PATCH] ixgbe: fix checking for tx_free_thresh

 Hi,

 Anyone would like to review this patch? Venky sent a NAK, but I've
 explained to him why it is a bug.
>>>
>>>
>>> Well, I think Venky is right here.
>> I think the comments above rte_eth_tx_burst() definition are quite clear
>> about what tx_free_thresh means, e1000 and i40e use it that way, but not
>> ixgbe.
>>
>>> Indeed that fix, will cause more often unsuccessful checks for DD bits 
>>> and might cause a
>>> slowdown for TX fast-path.
>> Not if the applications set tx_free_thresh according to the definition
>> of this value. But we can change the default value from 32 to something
>> higher, e.g I'm using nb_desc/2, and it works out well.
>
> Sure we can, as I said below, we can unify it one way or another.
> One way would be to make fast-path TX to free TXDs when number of 
> occupied TXDs raises above tx_free_thresh
> (what rte_ethdev.h comments say and what full-featured TX is doing).
> Though in that case we have to change default value for tx_free_thresh, 
> and all existing apps that
> using tx_free_thresh==32 and fast-path TX will probably experience a 
> slowdown.

 They are in trouble already, because i40e and e1000 uses it as defined.
>>>
>>> In fact, i40e has exactly the same problem as ixgbe:
>>> fast-path and full-featured TX  code treat  tx_free_thresh in a different 
>>> way.
>>> igb just ignores input tx_free_thresh, while em has only full featured path.
>>>
>>> What I am saying, existing app that uses TX fast-path and sets 
>>> tx_free_thresh=32
>>> (as we did in our examples in previous versions) will experience a slowdown,
>>> if we'll make all TX functions to behave like full-featured ones
>>> (txq->nb_tx_desc - txq->nb_tx_free > txq->tx_free_thresh).
>>>
>>>   From other side, if app uses TX full-featured TX and sets 
>>> tx_free_thresh=32,
>>> then it  already has a possible slowdown, because of too often TXDs 
>>> checking.
>>> So, if we'll change tx_free_thresh semantics to what fast-path uses,
>>> It shouldn't see any slowdown, in fact it might see some improvement.
>>>
 But I guess most apps are going with 0, which sets the drivers default.
 Others have to change the value to nb_txd - curr_value to have the same
 behaviour

> Another way would be to make all TX functions to treat 
> tx_conf->tx_free_thresh as fast-path TX functions do
> (free TXDs when number of free TXDs drops below  tx_free_thresh) and 
> update  rte_ethdev.h comments.
 And i40e and e1000e code as well. I don't see what difference it makes
 which way of definition you use, what I care is that it should be used
 consistently.
>>>
>>> Yes, both ways are possible, the concern is - how to minimise the impact 
>>> for existing apps.
>>> That's why I am leaning to the fast-path way.
>>
>> Make sense to favour the fast-path way, I'll look into that and try to
>> come up with a patch
>>
>>>
>
> Though, I am not sure that it really worth all these changes.
>From one side, whatever tx_free_thresh would be,
> the app should still assume that the worst case might happen,
> and up to nb_tx_desc mbufs can be consumed by the queue.
>From other side, I think the default value should work well for most 
> cases.
> So I am still for graceful deprecation of that config parameter, see 
> below.
>
>>
>>> Anyway, with current PMD implementation, you can't guarantee that at 
>>> any moment
>>> TX queue wouldn't use more than tx_free_thresh mbufs.
>>
>>
>>> There could be

[dpdk-dev] Headers files with BSD license in kernel

2015-06-09 Thread Miguel Bernal Marin
Hi,

I'm working on the Clear Linux project, and when I was integrating the DPDK
kernel modules into our kernel I found that two headers carry a
BSD license:

rte_pci_dev_feature_defs.h
rte_pci_dev_features.h

those are included in igb_uio module.

Are those licenses correct?

Thanks,
Miguel



[dpdk-dev] Headers files with BSD license in kernel

2015-06-09 Thread Miguel Bernal Marin
Including maintainers in CC

On Tue, Jun 09, 2015 at 12:40:57PM -0500, Miguel Bernal Marin wrote:
> Hi,
> 
> I'm working on Clear Linux project, and when I was integrating DPDK
> kernel modules to our kernel I found there are two headers with 
> BSD License
> 
> rte_pci_dev_feature_defs.h
> rte_pci_dev_features.h
> 
> those are included in igb_uio module.
> 
> Are those licenses correct?
> 
> Thanks,
> Miguel
> 


[dpdk-dev] Trouble with Tx on some i40 ports in KVM VM

2015-06-09 Thread Andrew Theurer
Hello.  I am having some trouble getting various dpdk applications to work 
when they are used in a KVM VM.  This happens with the 2nd port on a dual port 
adapter.  What I have configured:  

Haswell-ep host with KVM, 2 x i40e phys functions bound to vfio-pci, 2 
functions assigned to KVM guest.  In the guest, the i40e functions are bound to 
uio_pci_generic, running testpmd with these options: --nb-cores=2 --nb-ports=2 
--portmask=3 --interactive --auto-start.  

On another host directly connected to these adapters, I run pktgen on the 2 
i40e ports.  If I Tx on port1, I get packets back on port0.  If I Tx on port0, 
no packets come back on port0.  For some reason Tx simply does not happen on 
that second port.

I have tried this test where there is no KVM, just using host, and it works as 
expected.  I have also tried this test with ixgbe adapter in a KVM VM (I have 
both in same system), and it works as expected.  I have also tried just using a 
bridge in the VM with i40e ports, with i40e module in use, and that also works 
as expected.  So, I don't think this has anything to do with the adapters 
themselves, or cables, etc.  Something is not working when using the KVM VM, I 
just don't know what it is.

Here is xstats from testpmd:
## NIC extended statistics for port 1
rx_packets: 134615456
tx_packets: 0
rx_bytes: 8615389184
tx_bytes: 0
tx_errors: 0
rx_missed_errors: 0
rx_crc_errors: 0
rx_bad_length_errors: 0
rx_errors: 0
alloc_rx_buff_failed: 0
fdir_match: 0
fdir_miss: 0
tx_flow_control_xon: 0
rx_flow_control_xon: 0
tx_flow_control_xoff: 0
rx_flow_control_xoff: 0
rx_queue_0_rx_packets: 0
rx_queue_0_rx_bytes: 0
tx_queue_0_tx_packets: 0
tx_queue_0_tx_bytes: 0
tx_queue_0_tx_errors: 0

I can't find any evidence of errors anywhere.  It's just not doing Tx at all on 
that port.  I have also tried using pktgen in the VM to manually send packets 
on that port, and again, no packets were sent, but there were no errors either.

Has anyone else tried 2 i40e ports in a KVM VM?  Any ideas what could be going 
on here?

Thanks,

-Andrew


[dpdk-dev] Trouble with Tx on some i40 ports in KVM VM

2015-06-09 Thread Zhou, Danny
On your system, what are your iommu and intel_iommu kernel parameters?

Could you try to use uio_pci_generic for both PF on host and VF in guest after 
configuring "iommu=pt" on your system?

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Andrew Theurer
> Sent: Wednesday, June 10, 2015 5:48 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] Trouble with Tx on some i40 ports in KVM VM
> 
> Hello.  I am having some trouble with getting various dpdk applications to 
> work when using in a KVM VM.  This happens
> with the 2nd port on a dual port adapter.  What I have configured:
> 
> Haswell-ep host with KVM, 2 x i40e phys functions bound to vfio-pci, 2 
> functions assigned to KVM guest.  In the guest, the
> i40 functions bound to uio_pci_generic, running testpmd with these options: 
> --nb-cores=2 --nb-ports=2 --portmask=3
> --interactive --auto-start.
> 
> On another host directly connected to these adapters, I run pktgen on the 2 
> i40e ports.  If I Tx on port1, I get packets back
> on port0.  If I Tx on port0, no packets come back on port0.  For some reason 
> Tx simply does not happen on that second
> port.
> 
> I have tried this test where there is no KVM, just using host, and it works 
> as expected.  I have also tried this test with ixgbe
> adapter in a KVM VM (I have both in same system), and it works as expected.  
> I have also tried just using a bridge in the VM
> with i40e ports, with i40e module in use, and that also works as expected.  
> So, I don't think this has anything to do with the
> adapters themselves, or cables, etc.  Something is not working when using the 
> KVM VM, I just don't know what it is.
> 
> Here is xstats from testpmd:
> ## NIC extended statistics for port 1
> rx_packets: 134615456
> tx_packets: 0
> rx_bytes: 8615389184
> tx_bytes: 0
> tx_errors: 0
> rx_missed_errors: 0
> rx_crc_errors: 0
> rx_bad_length_errors: 0
> rx_errors: 0
> alloc_rx_buff_failed: 0
> fdir_match: 0
> fdir_miss: 0
> tx_flow_control_xon: 0
> rx_flow_control_xon: 0
> tx_flow_control_xoff: 0
> rx_flow_control_xoff: 0
> rx_queue_0_rx_packets: 0
> rx_queue_0_rx_bytes: 0
> tx_queue_0_tx_packets: 0
> tx_queue_0_tx_bytes: 0
> tx_queue_0_tx_errors: 0
> 
> I can't find any evidence of errors anywhere.  It's just not doing Tx at all 
> on that port.  I have also tried using pktgen in the
> VM, to manually send packets on that port, and again, no packets sent, but no 
> errors either.
> 
> Has anyone else tried 2 i40e ports in a KVM VM?  Any ideas what could be 
> going on here?
> 
> Thanks,
> 
> -Andrew


[dpdk-dev] [PATCH v12 00/14] Interrupt mode PMD

2015-06-09 Thread Stephen Hemminger
On Mon,  8 Jun 2015 13:28:57 +0800
Cunming Liang  wrote:

> v12 changes
>  - bsd cleanup for unused variable warning
>  - fix awkward line split in debug message
> 
> v11 changes
>  - typo cleanup and check kernel style
> 
> v10 changes
>  - code rework to return actual error code
>  - bug fix for lsc when using uio_pci_generic
> 
> v9 changes
>  - code rework to fix open comment
>  - bug fix for igb lsc when both lsc and rxq are enabled in vfio-msix
>  - new patch to turn off the feature by default to avoid breaking the v2.1 ABI
> 
> v8 changes
>  - remove condition check for only vfio-msix
>  - add multiplex intr support when only one intr vector allowed
>  - lsc and rxq interrupt runtime enable decision
>  - add safe event delete while the event wakeup execution happens
> 
> v7 changes
>  - decouple epoll event and intr operation
>  - add condition check in the case intr vector is disabled
>  - renaming some APIs
> 
> v6 changes
>  - split rte_intr_wait_rx_pkt into two APIs 'wait' and 'set'.
>  - rewrite rte_intr_rx_wait/rte_intr_rx_set.
>  - using vector number instead of queue_id as interrupt API params.
>  - patch reorder and split.
> 
> v5 changes
>  - Rebase the patchset onto the HEAD
>  - Isolate ethdev from EAL for new-added wait-for-rx interrupt function
>  - Export wait-for-rx interrupt function for shared libraries
>  - Split-off a new patch file for changed struct rte_intr_handle that
>    other patches depend on, to avoid breaking git bisect
>  - Change sample application to accommodate EAL function spec change
>    accordingly
> 
> v4 changes
>  - Export interrupt enable/disable functions for shared libraries
>  - Adjust position of new-added structure fields and functions to
>    avoid breaking ABI
>  
> v3 changes
>  - Add return value for interrupt enable/disable functions
>  - Move spinlock from PMD to L3fwd-power
>  - Remove unnecessary variables in e1000_mac_info
>  - Fix miscellaneous review comments
>  
> v2 changes
>  - Fix compilation issue in Makefile for missed header file.
>  - Consolidate internal and community review comments of v1 patch set.
>  
> This patch series introduces a low-latency one-shot rx interrupt into DPDK,
> with an example of switching between polling and interrupt mode.
>  
> DPDK userspace interrupt notification and handling mechanism is based on UIO
> with the limitations below:
> 1) It is designed to handle the LSC interrupt only, with an inefficient suspended
>    pthread wakeup procedure (e.g. UIO wakes up the LSC interrupt handling thread,
>    which then wakes up the DPDK polling thread). This introduces
>    non-deterministic wakeup latency for the DPDK polling thread, as well as packet
>    latency if it is used to handle Rx interrupts.
> 2) UIO only supports a single interrupt vector, which has to be shared by the
>    LSC interrupt and the interrupts assigned to dedicated rx queues.
>  
> This patchset includes the features below:
> 1) Enable one-shot rx queue interrupt in ixgbe PMD (PF & VF) and igb PMD (PF
>    only).
> 2) Build on top of the VFIO mechanism instead of UIO, so it can support
>    up to 64 interrupt vectors for rx queue interrupts.
> 3) Have one DPDK polling thread handle each Rx queue interrupt with a dedicated
>    VFIO eventfd, which eliminates non-deterministic pthread wakeup latency in
>    user space.
> 4) Demonstrate interrupt control APIs and userspace NAPI-like
>    polling/interrupt switch algorithms in the L3fwd-power example.
> 
> Known limitations:
> 1) It does not work for UIO, because a single interrupt eventfd shared by the LSC
>    and rx queue interrupt handlers causes a mess. [FIXED]
> 2) LSC interrupt is not supported by the VF driver, so it is disabled by default
>    in L3fwd-power now. Feel free to turn it on if you want to support both LSC
>    and rx queue interrupts on a PF.
> 
> Cunming Liang (14):
>   eal/linux: add interrupt vectors support in intr_handle
>   eal/linux: add rte_epoll_wait/ctl support
>   eal/linux: add API to set rx interrupt event monitor
>   eal/linux: fix comments typo on vfio msi
>   eal/linux: add interrupt vectors handling on VFIO
>   eal/linux: standalone intr event fd create support
>   eal/linux: fix lsc read error in uio_pci_generic
>   eal/bsd: dummy for new intr definition
>   eal/bsd: fix inappropriate linuxapp referred in bsd
>   ethdev: add rx intr enable, disable and ctl functions
>   ixgbe: enable rx queue interrupts for both PF and VF
>   igb: enable rx queue interrupts for PF
>   l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode
> switch
>   abi: fix v2.1 abi broken issue
> 
>  drivers/net/e1000/igb_ethdev.c | 311 ++--
>  drivers/net/ixgbe/ixgbe_ethdev.c   | 519 
> -
>  drivers/net/ixgbe/ixgbe_ethdev.h   |   4 +
>  examples/l3fwd-power/main.c| 206 ++--
>  lib/librte_eal/bsdapp/eal/eal_interrupts.c |  30 ++
>  .../bsdapp/eal/include/exec-env/rte_interrupts.h   |  91 +++-
>  lib/librte_eal/b
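
The wakeup path described in feature 3) of the cover letter, one thread
blocking on a dedicated eventfd through epoll, can be sketched with plain
Linux system calls. This is only an illustration under stated assumptions:
it does not use the rte_epoll_wait or rte_intr_* functions the series adds,
and every demo_* name is hypothetical.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Create an eventfd and register it with a new epoll instance.
 * Returns the epoll fd (>= 0) and stores the eventfd in *efd_out,
 * or returns -1 on failure.
 */
static int
demo_rx_event_setup(int *efd_out)
{
	struct epoll_event ev = { 0 };
	int epfd, efd;

	epfd = epoll_create1(0);
	if (epfd < 0)
		return -1;

	efd = eventfd(0, EFD_NONBLOCK);
	if (efd < 0) {
		close(epfd);
		return -1;
	}

	ev.events = EPOLLIN;
	ev.data.fd = efd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) < 0) {
		close(efd);
		close(epfd);
		return -1;
	}

	*efd_out = efd;
	return epfd;
}

/*
 * Sleep until the eventfd fires or timeout_ms expires.
 * Returns the eventfd counter (> 0), 0 on timeout, -1 on error.
 * Reading the counter resets it, so each wakeup is consumed once.
 */
static int64_t
demo_rx_event_wait(int epfd, int timeout_ms)
{
	struct epoll_event ev;
	uint64_t count;
	int n;

	n = epoll_wait(epfd, &ev, 1, timeout_ms);
	if (n <= 0)
		return n;
	if (read(ev.data.fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return -1;
	return (int64_t)count;
}
```

In the series the eventfd is the one VFIO signals for an rx queue vector;
in this sketch a write() of a 64-bit value to the eventfd from any thread
stands in for the interrupt.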

[dpdk-dev] [PATCH v2 1/7] ethdev: add additional error stats

2015-06-09 Thread Stephen Hemminger
On Tue,  9 Jun 2015 16:10:40 +0100
Maryam Tahhan  wrote:

> Add MAC error and drop statistics to struct rte_eth_stats and the
> extended stats.
> Signed-off-by: Maryam Tahhan 
> ---
>  lib/librte_ether/rte_ethdev.c | 4 
>  lib/librte_ether/rte_ethdev.h | 4 
>  2 files changed, 8 insertions(+)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 5a94654..a439b4a 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -136,6 +136,10 @@ static const struct rte_eth_xstats_name_off 
> rte_stats_strings[] = {
>   {"rx_flow_control_xon", offsetof(struct rte_eth_stats, rx_pause_xon)},
>   {"tx_flow_control_xoff", offsetof(struct rte_eth_stats, tx_pause_xoff)},
>   {"rx_flow_control_xoff", offsetof(struct rte_eth_stats, rx_pause_xoff)},
> + {"rx_mac_err", offsetof(struct rte_eth_stats, imacerr)},
> + {"rx_phy_err", offsetof(struct rte_eth_stats, iphyerr)},
> + {"tx_drops", offsetof(struct rte_eth_stats, odrop)},
> + {"rx_drops", offsetof(struct rte_eth_stats, idrop)}

Are these really generic enough to put them in ethdev?


[dpdk-dev] [PATCH] examples/distributor: fix missing "; " in debug macro

2015-06-09 Thread Stephen Hemminger
On Mon, 8 Jun 2015 11:58:10 +0100
Bruce Richardson  wrote:

> On Fri, Jun 05, 2015 at 10:45:04PM +0200, Thomas Monjalon wrote:
> > 2015-06-05 17:01, Bruce Richardson:
> > > The macro to turn on additional debug output when the app was compiled
> > > with "-DDEBUG" was missing a ";".
> > 
> > It shows that such dead code is almost never tested.
> > It would be saner if this command would return no result:
> > git grep 'ifdef.*DEBUG' examples
> > examples/distributor/main.c:#ifdef DEBUG
> > examples/l3fwd-acl/main.c:#ifdef L3FWDACL_DEBUG
> > examples/l3fwd-acl/main.c:#ifdef L3FWDACL_DEBUG
> > examples/l3fwd-acl/main.c:#ifdef L3FWDACL_DEBUG
> > examples/l3fwd-acl/main.c:#ifdef L3FWDACL_DEBUG
> > examples/packet_ordering/main.c:#ifdef DEBUG
> > examples/vhost/main.c:#ifdef DEBUG
> > examples/vhost/main.h:#ifdef DEBUG
> > examples/vhost_xen/main.c:#ifdef DEBUG
> > examples/vhost_xen/main.h:#ifdef DEBUG
> > 
> > There is no good reason to not use CONFIG_RTE_LOG_LEVEL to trigger debug 
> > build.
> > 
> I agree and disagree. 
> 
> I agree it would be good if we had a standard way of setting up
> a DEBUG build that would make it easier to test and pick up on this sort of 
> thing.
> 
> I disagree that the compile time log level is the way to do this. The log 
> level
> at compile time specifies the default log level only, the actual log level is
> controllable at runtime. Having the default log level also affect what kind of
> build is done, e.g. with -O0 rather than -O3, introduces an unnecessary 
> dependency.
> Setting the default log level to 5 and changing it to 9 at runtime should be
> the same as setting the default to 9.
> 
> /Bruce

One good way is to use something like:

#ifdef DEBUG
#define LOG_DEBUG(log_type, fmt, args...) do { \
	RTE_LOG(DEBUG, log_type, fmt, ##args);  \
} while (0)
#else
#define LOG_LEVEL RTE_LOG_INFO
/* Dead branch: the call is still compiled and type-checked. */
#define LOG_DEBUG(log_type, fmt, args...) if (0) {  \
	RTE_LOG(DEBUG, log_type, fmt, ##args);  \
} else
#endif
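
The idea behind the macro above is that the debug statement stays visible
to the compiler even in non-debug builds, so a missing ";" or a wrong
format argument breaks every build instead of only -DDEBUG builds. A
minimal sketch without DPDK follows. DEMO_LOG is a hypothetical stand-in
for RTE_LOG, and the dead branch is wrapped in do/while (0) rather than
the "if (0) ... else" form, which is a swapped-in variant that avoids
capturing a following statement as the else body.

```c
#include <stdio.h>

/* Hypothetical stand-in for RTE_LOG so the sketch builds without DPDK. */
#define DEMO_LOG(level, fmt, ...) \
	fprintf(stderr, level ": " fmt, ##__VA_ARGS__)

#ifdef DEBUG
#define LOG_DEBUG(fmt, ...) do {			\
	DEMO_LOG("DEBUG", fmt, ##__VA_ARGS__);		\
} while (0)
#else
/* Never executed, but still parsed and type-checked by the compiler. */
#define LOG_DEBUG(fmt, ...) do {			\
	if (0)						\
		DEMO_LOG("DEBUG", fmt, ##__VA_ARGS__);	\
} while (0)
#endif

/* Hypothetical caller: logs only in -DDEBUG builds, compiles in both. */
static int
demo_process_burst(int nb_pkts)
{
	LOG_DEBUG("processing %d packets\n", nb_pkts);
	return nb_pkts;
}
```

The ##__VA_ARGS__ form is a GNU extension accepted by gcc and clang, the
same family of extension as the "args..." syntax in the macro above.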


[dpdk-dev] Trouble with Tx on some i40 ports in KVM VM

2015-06-09 Thread Andrew Theurer


- Original Message -
> From: "Danny Zhou" 
> To: "Andrew Theurer" 
> Cc: dev at dpdk.org
> Sent: Tuesday, June 9, 2015 6:38:49 PM
> Subject: RE: Trouble with Tx on some i40 ports in KVM VM
> 
> On your system, what is your iommu and intel_iommu kernel parameters?
> 
> Could you try to use uio_pci_generic for both PF on host and VF in guest
> after configuring "iommu=pt" on your system?

I use iommu=pt intel_iommu=on

To clarify, I am not using a VF in the VM.  I am only using PF's, assigning 
them to the VM.  The host needs vfio module for kvm to assign these to the VM.
I don't see how uio_pci_generic can be used in the host if the PF is being 
passed to the VM.

Thanks,

-Andrew
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Andrew Theurer
> > Sent: Wednesday, June 10, 2015 5:48 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] Trouble with Tx on some i40 ports in KVM VM
> > 
> > Hello.  I am having some trouble with getting various dpdk applications to
> > work when using in a KVM VM.  This happens
> > with the 2nd port on a dual port adapter.  What I have configured:
> > 
> > Haswell-ep host with KVM, 2 x i40e phys functions bound to vfio-pci, 2
> > functions assigned to KVM guest.  In the guest, the
> > i40 functions bound to uio_pci_generic, running testpmd with these options:
> > --nb-cores=2 --nb-ports=2 --portmask=3
> > --interactive --auto-start.
> > 
> > On another host directly connected to these adapters, I run pktgen on the 2
> > i40e ports.  If I Tx on port1, I get packets back
> > on port0.  If I Tx on port0, no packets come back on port0.  For some
> > reason Tx simply does not happen on that second
> > port.
> > 
> > I have tried this test where there is no KVM, just using host, and it works
> > as expected.  I have also tried this test with ixgbe
> > adapter in a KVM VM (I have both in same system), and it works as expected.
> > I have also tried just using a bridge in the VM
> > with i40e ports, with i40e module in use, and that also works as expected.
> > So, I don't think this has anything to do with the
> > adapters themselves, or cables, etc.  Something is not working when using
> > the KVM VM, I just don't know what it is.
> > 
> > Here is xstats from testpmd:
> > ## NIC extended statistics for port 1
> > rx_packets: 134615456
> > tx_packets: 0
> > rx_bytes: 8615389184
> > tx_bytes: 0
> > tx_errors: 0
> > rx_missed_errors: 0
> > rx_crc_errors: 0
> > rx_bad_length_errors: 0
> > rx_errors: 0
> > alloc_rx_buff_failed: 0
> > fdir_match: 0
> > fdir_miss: 0
> > tx_flow_control_xon: 0
> > rx_flow_control_xon: 0
> > tx_flow_control_xoff: 0
> > rx_flow_control_xoff: 0
> > rx_queue_0_rx_packets: 0
> > rx_queue_0_rx_bytes: 0
> > tx_queue_0_tx_packets: 0
> > tx_queue_0_tx_bytes: 0
> > tx_queue_0_tx_errors: 0
> > 
> > I can't find any evidence of errors anywhere.  It's just not doing Tx at
> > all on that port.  I have also tried using pktgen in the
> > VM, to manually send packets on that port, and again, no packets sent, but
> > no errors either.
> > 
> > Has anyone else tried 2 i40e ports in a KVM VM?  Any ideas what could be
> > going on here?
> > 
> > Thanks,
> > 
> > -Andrew
> 


[dpdk-dev] [PATCH 5/5] virtio: clarify feature bit handling

2015-06-09 Thread Stephen Hemminger
On Wed, 15 Apr 2015 08:20:19 -0700
Stephen Hemminger  wrote:

> Change the features from bit mask to bit number. This allows the
> DPDK driver to use the definitions from Linux[1]. This makes doing
> enhancements to DPDK driver easier when new feature bits are added.
> 
> More importantly, get rid of the confusing feature bit initialization
> code. Remove the double negative code in the feature masking.
> Instead just have a new define with the list of feature bits implemented.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  lib/librte_pmd_virtio/virtio_ethdev.c | 17 +--
>  lib/librte_pmd_virtio/virtio_ethdev.h | 27 --
>  lib/librte_pmd_virtio/virtio_pci.h| 96 
> ++-
>  lib/librte_pmd_virtio/virtqueue.h |  8 +--
>  4 files changed, 61 insertions(+), 87 deletions(-)
> 
> diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c 
> b/lib/librte_pmd_virtio/virtio_ethdev.c
> index db0232e..349b73b 100644
> --- a/lib/librte_pmd_virtio/virtio_ethdev.c
> +++ b/lib/librte_pmd_virtio/virtio_ethdev.c
> @@ -807,23 +807,10 @@ virtio_vlan_filter_set(struct rte_eth_dev *dev, 
> uint16_t vlan_id, int on)
>  static void
>  virtio_negotiate_features(struct virtio_hw *hw)
>  {
> - uint32_t host_features, mask;
> -
> - /* checksum offload not implemented */
> - mask = VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM;
> -
> - /* TSO and LRO are only available when their corresponding
> -  * checksum offload feature is also negotiated.
> -  */
> - mask |= VIRTIO_NET_F_HOST_TSO4 | VIRTIO_NET_F_HOST_TSO6 | 
> VIRTIO_NET_F_HOST_ECN;
> - mask |= VIRTIO_NET_F_GUEST_TSO4 | VIRTIO_NET_F_GUEST_TSO6 | 
> VIRTIO_NET_F_GUEST_ECN;
> - mask |= VTNET_LRO_FEATURES;
> -
> - /* not negotiating INDIRECT descriptor table support */
> - mask |= VIRTIO_RING_F_INDIRECT_DESC;
> + uint32_t host_features;
>  
>   /* Prepare guest_features: feature that driver wants to support */
> - hw->guest_features = VTNET_FEATURES & ~mask;
> + hw->guest_features = VIRTIO_PMD_GUEST_FEATURES;
>   PMD_INIT_LOG(DEBUG, "guest_features before negotiate = %x",
>   hw->guest_features);
>  
> diff --git a/lib/librte_pmd_virtio/virtio_ethdev.h 
> b/lib/librte_pmd_virtio/virtio_ethdev.h
> index e6d4533..df2cb7d 100644
> --- a/lib/librte_pmd_virtio/virtio_ethdev.h
> +++ b/lib/librte_pmd_virtio/virtio_ethdev.h
> @@ -56,24 +56,15 @@
>  #define VIRTIO_MAX_RX_PKTLEN  9728
>  
>  /* Features desired/implemented by this driver. */
> -#define VTNET_FEATURES \
> - (VIRTIO_NET_F_MAC   | \
> - VIRTIO_NET_F_STATUS | \
> - VIRTIO_NET_F_MQ | \
> - VIRTIO_NET_F_CTRL_MAC_ADDR | \
> - VIRTIO_NET_F_CTRL_VQ| \
> - VIRTIO_NET_F_CTRL_RX| \
> - VIRTIO_NET_F_CTRL_VLAN  | \
> - VIRTIO_NET_F_CSUM   | \
> - VIRTIO_NET_F_HOST_TSO4  | \
> - VIRTIO_NET_F_HOST_TSO6  | \
> - VIRTIO_NET_F_HOST_ECN   | \
> - VIRTIO_NET_F_GUEST_CSUM | \
> - VIRTIO_NET_F_GUEST_TSO4 | \
> - VIRTIO_NET_F_GUEST_TSO6 | \
> - VIRTIO_NET_F_GUEST_ECN  | \
> - VIRTIO_NET_F_MRG_RXBUF  | \
> - VIRTIO_RING_F_INDIRECT_DESC)
> +#define VIRTIO_PMD_GUEST_FEATURES\
> + (1u << VIRTIO_NET_F_MAC   | \
> +  1u << VIRTIO_NET_F_STATUS| \
> +  1u << VIRTIO_NET_F_MQ| \
> +  1u << VIRTIO_NET_F_CTRL_MAC_ADDR | \
> +  1u << VIRTIO_NET_F_CTRL_VQ   | \
> +  1u << VIRTIO_NET_F_CTRL_RX   | \
> +  1u << VIRTIO_NET_F_CTRL_VLAN | \
> +  1u << VIRTIO_NET_F_MRG_RXBUF)
>  
>  /*
>   * CQ function prototype
> diff --git a/lib/librte_pmd_virtio/virtio_pci.h 
> b/lib/librte_pmd_virtio/virtio_pci.h
> index 64d9c34..47f722a 100644
> --- a/lib/librte_pmd_virtio/virtio_pci.h
> +++ b/lib/librte_pmd_virtio/virtio_pci.h
> @@ -96,26 +96,6 @@ struct virtqueue;
>  #define VIRTIO_CONFIG_STATUS_FAILED0x80
>  
>  /*
> - * Generate interrupt when the virtqueue ring is
> - * completely used, even if we've suppressed them.
> - */
> -#define VIRTIO_F_NOTIFY_ON_EMPTY (1 << 24)
> -
> -/*
> - * The guest should never negotiate this feature; it
> - * is used to detect faulty drivers.
> - */
> -#define VIRTIO_F_BAD_FEATURE (1 << 30)
> -
> -/*
> - * Some VirtIO feature bits (currently bits 28 through 31) are
> - * reserved for the transport being used (eg. virtio_ring), the
> - * rest are per-device feature bits.
> - */
> -#define VIRTIO_TRANSPORT_F_START 28
> -#define VIRTIO_TRANSPORT_F_END   32
> -
> -/*
>   * Each virtqueue indirect descriptor list must be physically contiguous.
>   * To allow us to malloc(9) each list individually, limit the number
>   * supported to what will fit in one page. With 4KB pages, this is a limit
> @@ -128,33 +108,55 @@ struct virtqueue;
>  #define VIRTIO_MAX_INDIRECT ((int) (PAGE_SIZE / 16))
>  
>  /* The feature bitmap for virtio net */
> -#define VIRTIO_NET_F_CSUM   0x1 /* Host handles pkts w/ partial csum 
> */
> -#define
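
The change described in the commit message, defining each feature as a bit
number and building masks with shifts, can be sketched as follows. The bit
numbers are the ones used by the Linux virtio_net.h header; the DEMO_ and
demo_ names are hypothetical and are not part of the patch.

```c
#include <stdint.h>

/* Feature bit numbers, matching the Linux virtio_net.h definitions. */
#define VIRTIO_NET_F_CSUM        0
#define VIRTIO_NET_F_MAC         5
#define VIRTIO_NET_F_MRG_RXBUF  15
#define VIRTIO_NET_F_STATUS     16

/* Hypothetical subset of features a driver implements, as one mask. */
#define DEMO_GUEST_FEATURES		\
	(1u << VIRTIO_NET_F_MAC |	\
	 1u << VIRTIO_NET_F_STATUS |	\
	 1u << VIRTIO_NET_F_MRG_RXBUF)

/* Test a single feature by bit number. */
static inline int
demo_has_feature(uint32_t features, unsigned int bit)
{
	return (features & (1u << bit)) != 0;
}

/* Keep only the features that both host and driver support. */
static inline uint32_t
demo_negotiate(uint32_t host_features)
{
	return host_features & DEMO_GUEST_FEATURES;
}
```

With a positive list like this there is no mask of unsupported bits to
subtract, which is the double negative the commit message removes.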

[dpdk-dev] Is vhost vring_avail size tunable?

2015-06-09 Thread Stephen Hemminger
On Sun, 7 Jun 2015 16:02:04 +0800 (CST)
"Tim Deng"  wrote:

> 
> Hi,
> 
> Under heavy work load, I found that some packets were lost, caused by
> "Failed to get enough desc from vring..."; is there any way to make the
> vring size larger?
> 
> Thanks,
> Tim

One thing that could help the DPDK virtio driver would be to support
INDIRECT descriptors. By using that, it is possible to get 256 packets
into the 256 slots in QEMU.  Currently half the slots are taken by
headers.

The implementation of vhost driver likewise needs to get support
for INDIRECT descriptors and some other features in order to use slots
more efficiently.
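
The arithmetic behind "half the slots are taken by headers" can be written
out as a small sketch. It assumes each packet otherwise occupies two ring
slots (one for the virtio-net header, one for the data), and that with
INDIRECT descriptors a packet occupies a single ring slot pointing at a
separate descriptor table. The function name is hypothetical.

```c
/* Packets a ring can hold when each packet uses descs_per_pkt ring slots. */
static unsigned int
demo_ring_capacity(unsigned int ring_size, unsigned int descs_per_pkt)
{
	if (descs_per_pkt == 0)
		return 0;
	return ring_size / descs_per_pkt;
}
```

With a 256-entry ring, demo_ring_capacity(256, 2) gives 128 packets for
the header-plus-data layout and demo_ring_capacity(256, 1) gives 256 for
the indirect layout.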



[dpdk-dev] [PATCH 1/4] ixgbe: expose extended error statistics

2015-06-09 Thread Stephen Hemminger
On Fri,  5 Jun 2015 18:35:02 +0100
Maryam Tahhan  wrote:

> Implement xstats_get() and xstats_reset() in dev_ops for ixgbe to expose
> detailed error statistics to DPDK applications.
> 
> Signed-off-by: Maryam Tahhan 

Also, the bug where CRC is included in Tx byte count but
not in the Rx byte count has not been addressed.

You seem to have ignored my earlier patch.


[dpdk-dev] Headers files with BSD license in kernel

2015-06-09 Thread Stephen Hemminger
On Wed, 10 Jun 2015 00:42:59 +
"Zhang, Helin"  wrote:

> Hi Miguel
> 
> My thought is there might be something wrong. Let's see what comments from 
> other experts!
> Thank you very much for the good catch!
> 
> Regards,
> Helin
> 
> > -Original Message-
> > From: Miguel Bernal Marin [mailto:miguel.bernal.marin at linux.intel.com]
> > Sent: Wednesday, June 10, 2015 4:10 AM
> > To: dev at dpdk.org
> > Cc: david.marchand at 6wind.com; Burakov, Anatoly; Zhang, Helin; Bernal 
> > Marin,
> > Miguel
> > Subject: Re: [dpdk-dev] Headers files with BSD license in kernel
> > 
> > Including maintainers in CC
> > 
> > On Tue, Jun 09, 2015 at 12:40:57PM -0500, Miguel Bernal Marin wrote:
> > > Hi,
> > >
> > > I'm working on Clear Linux project, and when I was integrating DPDK
> > > kernel modules to our kernel I found there are two headers with BSD
> > > License
> > >
> > > rte_pci_dev_feature_defs.h
> > > rte_pci_dev_features.h
> > >
> > > those are included in igb_uio module.
> > >
> > > Are those licenses correct?
> > >
> > > Thanks,
> > > Miguel
> > >

You can always escalate a BSD license to GPL, but the other way is not allowed.
Ideally, the language in the file should make it clear that it is dual licensed.
In an ideal world, igb_uio would go away; I am working on that.