[dpdk-dev] [PATCH v4 2/2] vhost: Add VHOST PMD

2015-11-16 Thread Wang, Zhihong
Took a quick look and the bug is gone now :)
Will do more testing later on.

> -----Original Message-----
> From: Tetsuya Mukawa [mailto:mukawa at igel.co.jp]
> Sent: Friday, November 13, 2015 1:21 PM
> To: dev at dpdk.org; Wang, Zhihong; Liu, Yuanhan
> Cc: Loftus, Ciara; pmatilai at redhat.com;
> ann.zhuangyanying at huawei.com; Richardson, Bruce; Xie, Huawei;
> thomas.monjalon at 6wind.com; stephen at networkplumber.org;
> rich.lane at bigswitch.com; Tetsuya Mukawa
> Subject: [PATCH v4 2/2] vhost: Add VHOST PMD
> 
> The patch introduces a new PMD. This PMD is implemented as a thin wrapper
> of librte_vhost, so librte_vhost is also needed to compile the PMD.
> The vhost messages will be handled only when a port is started, so start
> a port first, then invoke QEMU.
> 
> The PMD has 2 parameters.
>  - iface:  The parameter is used to specify a path to connect to a
>            virtio-net device.
>  - queues: The parameter is used to specify the number of queues the
>            virtio-net device has.
>            (Default: 1)
> 
> Here is an example.
> $ ./testpmd -c f -n 4 --vdev 'eth_vhost0,iface=/tmp/sock0,queues=1' -- -i
> 
> To connect to the testpmd above, here is an example QEMU command.
> 
> $ qemu-system-x86_64 \
> 
> -chardev socket,id=chr0,path=/tmp/sock0 \
> -netdev vhost-user,id=net0,chardev=chr0,vhostforce,queues=1 \
> -device virtio-net-pci,netdev=net0
> 
> Signed-off-by: Tetsuya Mukawa 
> ---



[dpdk-dev] [PATCH] i40e: fix the write flush in vf driver

2015-11-16 Thread Jingjing Wu
The i40e VF driver should use I40EVF_WRITE_FLUSH, not I40E_WRITE_FLUSH,
to flush configuration. This patch fixes the issue.

Fixes: be6c228d4da3 (i40evf: support Rx interrupt)

Reported-by: Qian Xu 
Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_ethdev_vf.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index 7ce8687..ea96f85 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1700,7 +1700,7 @@ i40evf_enable_queues_intr(struct rte_eth_dev *dev)
   I40E_VFINT_DYN_CTL01,
   I40E_VFINT_DYN_CTL01_INTENA_MASK |
   I40E_VFINT_DYN_CTL01_CLEARPBA_MASK);
-   I40E_WRITE_FLUSH(hw);
+   I40EVF_WRITE_FLUSH(hw);
return;
}

@@ -1716,7 +1716,7 @@ i40evf_enable_queues_intr(struct rte_eth_dev *dev)
I40E_VFINT_DYN_CTL01_INTENA_MASK |
I40E_VFINT_DYN_CTL01_CLEARPBA_MASK);

-   I40E_WRITE_FLUSH(hw);
+   I40EVF_WRITE_FLUSH(hw);
 }

 static inline void
@@ -1728,7 +1728,7 @@ i40evf_disable_queues_intr(struct rte_eth_dev *dev)

if (!rte_intr_allow_others(intr_handle)) {
I40E_WRITE_REG(hw, I40E_VFINT_DYN_CTL01, 0);
-   I40E_WRITE_FLUSH(hw);
+   I40EVF_WRITE_FLUSH(hw);
return;
}

@@ -1740,7 +1740,7 @@ i40evf_disable_queues_intr(struct rte_eth_dev *dev)
else
I40E_WRITE_REG(hw, I40E_VFINT_DYN_CTL01, 0);

-   I40E_WRITE_FLUSH(hw);
+   I40EVF_WRITE_FLUSH(hw);
 }

 static int
@@ -1770,7 +1770,7 @@ i40evf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t queue_id)
   (interval <<
I40E_VFINT_DYN_CTLN1_INTERVAL_SHIFT));

-   I40E_WRITE_FLUSH(hw);
+   I40EVF_WRITE_FLUSH(hw);

rte_intr_enable(&dev->pci_dev->intr_handle);

@@ -1793,7 +1793,7 @@ i40evf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id)
I40E_RX_VEC_START),
   0);

-   I40E_WRITE_FLUSH(hw);
+   I40EVF_WRITE_FLUSH(hw);

return 0;
 }
-- 
2.4.0



[dpdk-dev] [PATCH] i40e: fix the write flush in vf driver

2015-11-16 Thread Zhang, Helin


-----Original Message-----
From: Wu, Jingjing 
Sent: Monday, November 16, 2015 3:09 PM
To: dev at dpdk.org
Cc: Wu, Jingjing; Zhang, Helin; Liang, Cunming; Xu, Qian Q
Subject: [PATCH] i40e: fix the write flush in vf driver

The i40e VF driver should use I40EVF_WRITE_FLUSH, not I40E_WRITE_FLUSH,
to flush configuration. This patch fixes the issue.

Fixes: be6c228d4da3 (i40evf: support Rx interrupt)

Reported-by: Qian Xu 
Signed-off-by: Jingjing Wu 
Acked-by: Helin Zhang 


[dpdk-dev] [PATCH v5 1/4] vhost/lib: add vhost TX offload capabilities in vhost lib

2015-11-16 Thread Liu, Jijiang
Hi Yuanhan,

> -----Original Message-----
> From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> Sent: Friday, November 13, 2015 3:02 PM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v5 1/4] vhost/lib: add vhost TX offload
> capabilities in vhost lib
> 
> On Thu, Nov 12, 2015 at 08:07:03PM +0800, Jijiang Liu wrote:
> > Add vhost TX offload(CSUM and TSO) support capabilities in vhost lib.
> >
> > Refer to feature bits in Virtual I/O Device (VIRTIO) Version 1.0
> > below,
> >
> > VIRTIO_NET_F_CSUM (0) Device handles packets with partial checksum.
> > This "checksum offload" is a common feature on modern network cards.
> > VIRTIO_NET_F_HOST_TSO4 (11) Device can receive TSOv4.
> > VIRTIO_NET_F_HOST_TSO6 (12) Device can receive TSOv6.
> >
> > In order to support these features, the following changes are
> > added:
> >
> > 1. Extend 'VHOST_SUPPORTED_FEATURES' macro to add the offload
> > features negotiation.
> >
> > 2. Dequeue TX offload: convert the fields in virtio_net_hdr to the related
> > fields in mbuf.
> >
> >
> > Signed-off-by: Jijiang Liu 
> ...
> > +static void
> > +parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void **l4_hdr)
> > +{
> > +   struct ipv4_hdr *ipv4_hdr;
> > +   struct ipv6_hdr *ipv6_hdr;
> > +   void *l3_hdr = NULL;
> > +   struct ether_hdr *eth_hdr;
> > +   uint16_t ethertype;
> > +
> > +   eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
> > +
> > +   m->l2_len = sizeof(struct ether_hdr);
> > +   ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
> > +
> > +   if (ethertype == ETHER_TYPE_VLAN) {
> > +   struct vlan_hdr *vlan_hdr = (struct vlan_hdr *)(eth_hdr + 1);
> > +
> > +   m->l2_len += sizeof(struct vlan_hdr);
> > +   ethertype = rte_be_to_cpu_16(vlan_hdr->eth_proto);
> > +   }
> > +
> > +   l3_hdr = (char *)eth_hdr + m->l2_len;
> > +
> > +   switch (ethertype) {
> > +   case ETHER_TYPE_IPv4:
> > +   ipv4_hdr = (struct ipv4_hdr *)l3_hdr;
> > +   *l4_proto = ipv4_hdr->next_proto_id;
> > +   m->l3_len = (ipv4_hdr->version_ihl & 0x0f) * 4;
> > +   *l4_hdr = (char *)l3_hdr + m->l3_len;
> > +   m->ol_flags |= PKT_TX_IPV4;
> > +   break;
> > +   case ETHER_TYPE_IPv6:
> > +   ipv6_hdr = (struct ipv6_hdr *)l3_hdr;
> > +   *l4_proto = ipv6_hdr->proto;
> > +   m->l3_len = sizeof(struct ipv6_hdr);
> > +   *l4_hdr = (char *)l3_hdr + m->l3_len;
> > +   m->ol_flags |= PKT_TX_IPV6;
> > +   break;
> 
> Note that I'm still not that satisfied with putting all those kinds of
> calculations into the vhost library.
> 
> Every application requesting TSO and CSUM offload features needs to set
> them up, so I'm wondering _if_ we can put them into a library, say lib_ether,
> and let the application just set a few key fields and leave the others to that lib.
> 
> That could save us from touching that chaos, such as TCP and IP, here and
> there. And that, IMO, would be a more elegant way to leverage hardware
> TSO and CSUM offload features.
> 
> And I guess that might need some effort and more discussion, so I'm okay
> with leaving that to later versions. (Hence, I gave my ack.)
> 
> (I know little about lib_ether and DPDK hardware TSO settings, so I could be
> wrong, and sorry for that if so)

Your suggestion is good. I also think we should add some L2/L3 protocol
parsing to the DPDK libs.
As you said, this needs more discussion; maybe we can do it in the
future.

But for now, it is necessary to add the parse_ethernet() function here to get
the essential information.

> 
>   --yliu
> 
> > +   default:
> > +   m->l3_len = 0;
> > +   *l4_proto = 0;
> > +   break;
> > +   }
> > +}
> > +
> > +static inline void __attribute__((always_inline))
> > +vhost_dequeue_offload(struct virtio_net_hdr *hdr, struct rte_mbuf *m)
> > +{
> > +   uint16_t l4_proto = 0;
> > +   void *l4_hdr = NULL;
> > +   struct tcp_hdr *tcp_hdr = NULL;
> > +
> > +   parse_ethernet(m, &l4_proto, &l4_hdr);
> > +   if (hdr->flags == VIRTIO_NET_HDR_F_NEEDS_CSUM) {
> > +   if (hdr->csum_start == (m->l2_len + m->l3_len)) {
> > +   switch (hdr->csum_offset) {
> > +   case (offsetof(struct tcp_hdr, cksum)):
> > +   if (l4_proto == IPPROTO_TCP)
> > +   m->ol_flags |= PKT_TX_TCP_CKSUM;
> > +   break;
> > +   case (offsetof(struct udp_hdr, dgram_cksum)):
> > +   if (l4_proto == IPPROTO_UDP)
> > +   m->ol_flags |= PKT_TX_UDP_CKSUM;
> > +   break;
> > +   case (offsetof(struct sctp_hdr, cksum)):
> > +   if (l4_proto == IPPROTO_SCTP)
> > +   m->ol_flags |= PKT_TX_SCTP_CKSUM;
> > +   break;
> > +   default:
> > +   break;
> > + 
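
For readers following the parse_ethernet() discussion above, here is a minimal standalone sketch of the same header walk, working on a raw byte buffer instead of an rte_mbuf. Everything prefixed with demo_ or DEMO_ is hypothetical and not part of the patch or of DPDK. The length checks are an addition for safety that the quoted code does not perform. Like the quoted code, it handles only a single VLAN tag and ignores IPv6 extension headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for this sketch; values are the standard ones. */
#define DEMO_ETHER_HDR_LEN   14
#define DEMO_VLAN_HDR_LEN    4
#define DEMO_ETHER_TYPE_IPV4 0x0800
#define DEMO_ETHER_TYPE_IPV6 0x86DD
#define DEMO_ETHER_TYPE_VLAN 0x8100

struct demo_hdr_info {
	uint16_t l2_len;   /* Ethernet header, plus one VLAN tag if present */
	uint16_t l3_len;   /* IPv4/IPv6 header length, 0 if not recognised */
	uint8_t  l4_proto; /* IP protocol number, 0 if not recognised */
};

/* Read a big-endian 16-bit value from a possibly unaligned buffer. */
static uint16_t
demo_rd_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Fill *info from the frame at buf. Returns 0 on success, -1 if the
 * frame is too short for the headers it claims to carry.
 */
static int
demo_parse_ethernet(const uint8_t *buf, size_t len, struct demo_hdr_info *info)
{
	const uint8_t *l3;
	size_t l3_avail;
	uint16_t ethertype;
	uint16_t ihl;

	info->l2_len = DEMO_ETHER_HDR_LEN;
	info->l3_len = 0;
	info->l4_proto = 0;

	if (len < DEMO_ETHER_HDR_LEN)
		return -1;
	ethertype = demo_rd_be16(buf + 12);

	if (ethertype == DEMO_ETHER_TYPE_VLAN) {
		if (len < DEMO_ETHER_HDR_LEN + DEMO_VLAN_HDR_LEN)
			return -1;
		info->l2_len += DEMO_VLAN_HDR_LEN;
		/* The encapsulated EtherType follows the 2-byte VLAN TCI. */
		ethertype = demo_rd_be16(buf + 16);
	}

	l3 = buf + info->l2_len;
	l3_avail = len - info->l2_len;

	switch (ethertype) {
	case DEMO_ETHER_TYPE_IPV4:
		if (l3_avail < 20)
			return -1;
		/* IHL is the low nibble of the first byte, in 32-bit words. */
		ihl = (uint16_t)((l3[0] & 0x0f) * 4);
		if (ihl < 20 || l3_avail < ihl)
			return -1;
		info->l3_len = ihl;
		info->l4_proto = l3[9];
		break;
	case DEMO_ETHER_TYPE_IPV6:
		if (l3_avail < 40)
			return -1;
		info->l3_len = 40;      /* fixed IPv6 header size */
		info->l4_proto = l3[6]; /* Next Header field */
		break;
	default:
		/* Unknown L3 protocol: leave l3_len and l4_proto at 0. */
		break;
	}
	return 0;
}
```

The L4 header then starts at buf + l2_len + l3_len, which is the offset the quoted vhost_dequeue_offload() compares against hdr->csum_start.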

[dpdk-dev] [PATCH] i40e: fix the issue dcb cannot be configured when FW version is 5.x

2015-11-16 Thread Jingjing Wu
When the NVM version is updated to 5.x, DCB cannot be configured because
the FW version validation is incorrect.
This patch fixes the issue.

Fixes: c8b9a3e3fe1b (i40e: support DCB mode)

Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_ethdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 2c51a0b..9003488 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -8316,7 +8316,8 @@ i40e_dcb_hw_configure(struct i40e_pf *pf,
uint32_t val;

/* Use the FW API if FW > v4.4*/
-   if (!((hw->aq.fw_maj_ver == 4) && (hw->aq.fw_min_ver >= 4))) {
+   if (!(((hw->aq.fw_maj_ver == 4) && (hw->aq.fw_min_ver >= 4)) ||
+ (hw->aq.fw_maj_ver >= 5))) {
PMD_INIT_LOG(ERR, "FW < v4.4, can not use FW LLDP API"
  " to configure DCB");
return I40E_ERR_FIRMWARE_API_VERSION;
-- 
2.4.0
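
The corrected condition in the diff above amounts to "firmware version is at least 4.4". A minimal sketch of that comparison as a standalone helper follows; the name fw_supports_lldp_api is hypothetical and does not appear in the patch or the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): true when firmware version
 * maj.min is at least 4.4. This is equivalent to the patched test
 * ((maj == 4 && min >= 4) || maj >= 5).
 */
static bool
fw_supports_lldp_api(uint16_t maj, uint16_t min)
{
	return (maj > 4) || (maj == 4 && min >= 4);
}
```

The original check accepted only major version 4, so any 5.x firmware fell into the error path even though it is newer than 4.4.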



[dpdk-dev] [PATCH v2] doc: fix repeated typo in sample app docs

2015-11-16 Thread John McNamara
Fix repeated typo in the "Compiling the Application" section of
almost all of the sample app docs.

This generally gets copied into new sample app guides.

Signed-off-by: John McNamara 
---

V2:
* Added similar fix in load_balancer file.


 doc/guides/sample_app_ug/kernel_nic_interface.rst| 4 ++--
 doc/guides/sample_app_ug/l2_forward_job_stats.rst| 3 ++-
 doc/guides/sample_app_ug/l2_forward_real_virtual.rst | 3 ++-
 doc/guides/sample_app_ug/l3_forward.rst  | 3 ++-
 doc/guides/sample_app_ug/l3_forward_power_man.rst| 3 ++-
 doc/guides/sample_app_ug/load_balancer.rst   | 3 ++-
 doc/guides/sample_app_ug/qos_scheduler.rst   | 3 ++-
 doc/guides/sample_app_ug/timer.rst   | 3 ++-
 doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst | 3 ++-
 9 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/doc/guides/sample_app_ug/kernel_nic_interface.rst b/doc/guides/sample_app_ug/kernel_nic_interface.rst
index f1deca9..c8fc10a 100644
--- a/doc/guides/sample_app_ug/kernel_nic_interface.rst
+++ b/doc/guides/sample_app_ug/kernel_nic_interface.rst
@@ -87,8 +87,8 @@ Compile the application as follows:

 .. code-block:: console

-export RTE_SDK=/path/to/rte_sdk cd
-${RTE_SDK}/examples/kni
+export RTE_SDK=/path/to/rte_sdk
+cd ${RTE_SDK}/examples/kni

 #.  Set the target (a default target is used if not specified)

diff --git a/doc/guides/sample_app_ug/l2_forward_job_stats.rst b/doc/guides/sample_app_ug/l2_forward_job_stats.rst
index 6cbf627..acf6273 100644
--- a/doc/guides/sample_app_ug/l2_forward_job_stats.rst
+++ b/doc/guides/sample_app_ug/l2_forward_job_stats.rst
@@ -101,7 +101,8 @@ Compiling the Application

 .. code-block:: console

-export RTE_SDK=/path/to/rte_sdk cd ${RTE_SDK}/examples/l2fwd-jobstats
+export RTE_SDK=/path/to/rte_sdk
+cd ${RTE_SDK}/examples/l2fwd-jobstats

 #.  Set the target (a default target is used if not specified). For example:

diff --git a/doc/guides/sample_app_ug/l2_forward_real_virtual.rst b/doc/guides/sample_app_ug/l2_forward_real_virtual.rst
index 9334e75..65a3cec 100644
--- a/doc/guides/sample_app_ug/l2_forward_real_virtual.rst
+++ b/doc/guides/sample_app_ug/l2_forward_real_virtual.rst
@@ -101,7 +101,8 @@ Compiling the Application

 .. code-block:: console

-export RTE_SDK=/path/to/rte_sdk cd ${RTE_SDK}/examples/l2fwd
+export RTE_SDK=/path/to/rte_sdk
+cd ${RTE_SDK}/examples/l2fwd

 #.  Set the target (a default target is used if not specified). For example:

diff --git a/doc/guides/sample_app_ug/l3_forward.rst b/doc/guides/sample_app_ug/l3_forward.rst
index 6ca03f9..34a84f1 100644
--- a/doc/guides/sample_app_ug/l3_forward.rst
+++ b/doc/guides/sample_app_ug/l3_forward.rst
@@ -69,7 +69,8 @@ To compile the application:

 .. code-block:: console

-export RTE_SDK=/path/to/rte_sdk cd ${RTE_SDK}/examples/l3fwd
+export RTE_SDK=/path/to/rte_sdk
+cd ${RTE_SDK}/examples/l3fwd

 #.  Set the target (a default target is used if not specified). For example:

diff --git a/doc/guides/sample_app_ug/l3_forward_power_man.rst b/doc/guides/sample_app_ug/l3_forward_power_man.rst
index 39c2ea5..ac688f8 100644
--- a/doc/guides/sample_app_ug/l3_forward_power_man.rst
+++ b/doc/guides/sample_app_ug/l3_forward_power_man.rst
@@ -111,7 +111,8 @@ To compile the application:

 .. code-block:: console

-export RTE_SDK=/path/to/rte_sdk cd ${RTE_SDK}/examples/l3fwd-power
+export RTE_SDK=/path/to/rte_sdk
+cd ${RTE_SDK}/examples/l3fwd-power

 #.  Set the target (a default target is used if not specified). For example:

diff --git a/doc/guides/sample_app_ug/load_balancer.rst b/doc/guides/sample_app_ug/load_balancer.rst
index 615826a..fdd8cbd 100644
--- a/doc/guides/sample_app_ug/load_balancer.rst
+++ b/doc/guides/sample_app_ug/load_balancer.rst
@@ -99,7 +99,8 @@ The sequence of steps used to build the application is:

 .. code-block:: console

-cd ${RTE_SDK}/examples/load_balancer make
+cd ${RTE_SDK}/examples/load_balancer
+make

 For more details on how to build the DPDK libraries and sample applications,
 please refer to the *DPDK Getting Started Guide.*
diff --git a/doc/guides/sample_app_ug/qos_scheduler.rst b/doc/guides/sample_app_ug/qos_scheduler.rst
index ffa8ee2..7f9e925 100644
--- a/doc/guides/sample_app_ug/qos_scheduler.rst
+++ b/doc/guides/sample_app_ug/qos_scheduler.rst
@@ -64,7 +64,8 @@ To compile the application:

 .. code-block:: console

-export RTE_SDK=/path/to/rte_sdk cd ${RTE_SDK}/examples/qos_sched
+export RTE_SDK=/path/to/rte_sdk
+cd ${RTE_SDK}/examples/qos_sched

 #.  Set the target (a default target is used if not specified). For example:

diff --git a/doc/guides/sample_app_ug/timer.rst b/doc/guides/sample_app_ug/timer.rst
index ee0a732..e4de359 100644
--- a/doc/guides/sample_app_ug/timer.rst
+++ b/doc/guides/sample_app_ug/timer.r

[dpdk-dev] Making rte_eal_pci_probe() in rte_eal_init() optional?

2015-11-16 Thread David Marchand
Hello Roger,

On Sun, Nov 15, 2015 at 3:45 PM, Roger B. Melton  wrote:

> I like the "-b all" and "-w none" idea, but I think it might be
> complicated to implement it the way we would need it to work.  The existing
> -b and -w options persist for the duration of the application, and we
> would need the "-b all"/"-w none" to persist only through rte_eal_init()
> time.  Otherwise our attempt to attach a device at a later time would be
> blocked by the option.
>

I agree, the black/white lists should only apply to initial scan.
I forgot about this problem ...
I had started some cleanup in the pci scan / attach code but this is too
late for 2.2, I will post this in the next merge window.


> Wouldn't it be simpler to have an option to disable the rte_eal_init() time
> probe?  Would that address the issue with VFIO, prevent automatically
> attaching to devices while permitting on demand attach?
>

I suppose we can do this yes (I think Thomas once proposed off-list an
option like --no-pci-scan).
Do you think you can send a patch ?


-- 
David Marchand



[dpdk-dev] [PATCH 0/4] Remove CRC from byte counters

2015-11-16 Thread Harry van Haaren
This patchset removes CRC bytes from byte statistics in the following
PMDs: igb, ixgbe, i40e, fm10k.

This brings the reported byte statistics in-line with other NICs,
providing consistency.

Removing the CRC from byte counters in i40e resolves a bug, see
i40e commit message for details.


Harry van Haaren (4):
  e1000: remove crc size from all byte counters
  ixgbe: remove crc size from all byte counters
  i40e: fix rx/tx size mismatch, remove crc bytes
  fm10k: remove crc size from all byte counters

 drivers/net/e1000/igb_ethdev.c   | 19 ++--
 drivers/net/fm10k/fm10k_ethdev.c |  8 ---
 drivers/net/i40e/i40e_ethdev.c   | 33 ++--
 drivers/net/ixgbe/ixgbe_ethdev.c | 47 ++--
 4 files changed, 79 insertions(+), 28 deletions(-)

-- 
1.9.1



[dpdk-dev] [PATCH 1/4] e1000: remove crc size from all byte counters

2015-11-16 Thread Harry van Haaren
This patch removes the crc bytes from byte counter statistics.

Signed-off-by: Harry van Haaren 
---
 drivers/net/e1000/igb_ethdev.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 88995b0..0ad7341 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -1480,6 +1480,13 @@ igb_read_stats_registers(struct e1000_hw *hw, struct e1000_hw_stats *stats)
 {
int pause_frames;

+   uint64_t old_gprc  = stats->gprc;
+   uint64_t old_gptc  = stats->gptc;
+   uint64_t old_tpr   = stats->tpr;
+   uint64_t old_tpt   = stats->tpt;
+   uint64_t old_rpthc = stats->rpthc;
+   uint64_t old_hgptc = stats->hgptc;
+
if(hw->phy.media_type == e1000_media_type_copper ||
(E1000_READ_REG(hw, E1000_STATUS) & E1000_STATUS_LU)) {
stats->symerrs +=
@@ -1521,10 +1528,13 @@ igb_read_stats_registers(struct e1000_hw *hw, struct e1000_hw_stats *stats)
/* For the 64-bit byte counters the low dword must be read first. */
/* Both registers clear on the read of the high dword */

+   /* Workaround CRC bytes included in size, take away 4 bytes/packet */
stats->gorc += E1000_READ_REG(hw, E1000_GORCL);
stats->gorc += ((uint64_t)E1000_READ_REG(hw, E1000_GORCH) << 32);
+   stats->gorc -= (stats->gprc - old_gprc) * 4;
stats->gotc += E1000_READ_REG(hw, E1000_GOTCL);
stats->gotc += ((uint64_t)E1000_READ_REG(hw, E1000_GOTCH) << 32);
+   stats->gotc -= (stats->gptc - old_gptc) * 4;

stats->rnbc += E1000_READ_REG(hw, E1000_RNBC);
stats->ruc += E1000_READ_REG(hw, E1000_RUC);
@@ -1532,13 +1542,16 @@ igb_read_stats_registers(struct e1000_hw *hw, struct e1000_hw_stats *stats)
stats->roc += E1000_READ_REG(hw, E1000_ROC);
stats->rjc += E1000_READ_REG(hw, E1000_RJC);

+   stats->tpr += E1000_READ_REG(hw, E1000_TPR);
+   stats->tpt += E1000_READ_REG(hw, E1000_TPT);
+
stats->tor += E1000_READ_REG(hw, E1000_TORL);
stats->tor += ((uint64_t)E1000_READ_REG(hw, E1000_TORH) << 32);
+   stats->tor -= (stats->tpr - old_tpr) * 4;
stats->tot += E1000_READ_REG(hw, E1000_TOTL);
stats->tot += ((uint64_t)E1000_READ_REG(hw, E1000_TOTH) << 32);
+   stats->tot -= (stats->tpt - old_tpt) * 4;

-   stats->tpr += E1000_READ_REG(hw, E1000_TPR);
-   stats->tpt += E1000_READ_REG(hw, E1000_TPT);
stats->ptc64 += E1000_READ_REG(hw, E1000_PTC64);
stats->ptc127 += E1000_READ_REG(hw, E1000_PTC127);
stats->ptc255 += E1000_READ_REG(hw, E1000_PTC255);
@@ -1571,8 +1584,10 @@ igb_read_stats_registers(struct e1000_hw *hw, struct e1000_hw_stats *stats)
stats->htcbdpc += E1000_READ_REG(hw, E1000_HTCBDPC);
stats->hgorc += E1000_READ_REG(hw, E1000_HGORCL);
stats->hgorc += ((uint64_t)E1000_READ_REG(hw, E1000_HGORCH) << 32);
+   stats->hgorc -= (stats->rpthc - old_rpthc) * 4;
stats->hgotc += E1000_READ_REG(hw, E1000_HGOTCL);
stats->hgotc += ((uint64_t)E1000_READ_REG(hw, E1000_HGOTCH) << 32);
+   stats->hgotc -= (stats->hgptc - old_hgptc) * 4;
stats->lenerrs += E1000_READ_REG(hw, E1000_LENERRS);
stats->scvpc += E1000_READ_REG(hw, E1000_SCVPC);
stats->hrmpc += E1000_READ_REG(hw, E1000_HRMPC);
-- 
1.9.1
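
The correction in the diff above can be read as: remember the packet count before the read, accumulate the new hardware values, then subtract 4 bytes for each packet counted in this interval. A minimal sketch of that pattern follows, assuming the hardware counters clear on read (as the `+=` accumulation in the patch suggests). The struct and function names are hypothetical and not taken from the driver.

```c
#include <stdint.h>

#define DEMO_CRC_LEN 4 /* Ethernet FCS size in bytes */

/* Hypothetical software totals, standing in for e1000_hw_stats fields. */
struct demo_stats {
	uint64_t packets; /* accumulated good packets */
	uint64_t bytes;   /* accumulated good bytes, CRC excluded */
};

/*
 * Fold one read of clear-on-read hardware counters into the totals.
 * hw_packets and hw_bytes are the values read since the previous call;
 * hw_bytes is assumed to include 4 CRC bytes per packet.
 */
static void
demo_stats_update(struct demo_stats *s, uint64_t hw_packets, uint64_t hw_bytes)
{
	uint64_t old_packets = s->packets;

	s->packets += hw_packets;
	s->bytes += hw_bytes;
	/* Subtract the CRC only for packets counted in this interval. */
	s->bytes -= (s->packets - old_packets) * DEMO_CRC_LEN;
}
```

The packet counter has to be updated before the byte correction is applied, which is presumably why the patch moves the TPR/TPT reads ahead of the TOR/TOT reads.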



[dpdk-dev] [PATCH 2/4] ixgbe: remove crc size from all byte counters

2015-11-16 Thread Harry van Haaren
This patch removes the crc bytes from byte counter statistics.

Signed-off-by: Harry van Haaren 
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 47 ++--
 1 file changed, 35 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 80801f0..3ccfd89 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -2331,12 +2331,14 @@ ixgbe_dev_close(struct rte_eth_dev *dev)
 }

 static void
-ixgbe_read_stats_registers(struct ixgbe_hw *hw, struct ixgbe_hw_stats
-  *hw_stats, uint64_t *total_missed_rx,
-  uint64_t *total_qbrc, uint64_t *total_qprc,
-  uint64_t *total_qprdc)
+ixgbe_read_stats_registers(struct ixgbe_hw *hw,
+  struct ixgbe_hw_stats *hw_stats,
+  uint64_t *total_missed_rx, uint64_t *total_qbrc,
+  uint64_t *total_qprc, uint64_t *total_qprdc)
 {
uint32_t bprc, lxon, lxoff, total;
+   uint32_t delta_gprc = 0;
+   uint32_t delta_gptc = 0;
unsigned i;

hw_stats->crcerrs += IXGBE_READ_REG(hw, IXGBE_CRCERRS);
@@ -2372,26 +2374,41 @@ ixgbe_read_stats_registers(struct ixgbe_hw *hw, struct ixgbe_hw_stats
IXGBE_READ_REG(hw, IXGBE_PXOFFTXC(i));
}
for (i = 0; i < IXGBE_QUEUE_STAT_COUNTERS; i++) {
-   hw_stats->qprc[i] += IXGBE_READ_REG(hw, IXGBE_QPRC(i));
-   hw_stats->qptc[i] += IXGBE_READ_REG(hw, IXGBE_QPTC(i));
+   uint32_t delta_qprc = IXGBE_READ_REG(hw, IXGBE_QPRC(i));
+   uint32_t delta_qptc = IXGBE_READ_REG(hw, IXGBE_QPTC(i));
+   uint32_t delta_qprdc = IXGBE_READ_REG(hw, IXGBE_QPRDC(i));
+
+   delta_gprc += delta_qprc;
+   delta_gptc += delta_qptc;
+
+   hw_stats->qprc[i] += delta_qprc;
+   hw_stats->qptc[i] += delta_qptc;
+
hw_stats->qbrc[i] += IXGBE_READ_REG(hw, IXGBE_QBRC_L(i));
hw_stats->qbrc[i] +=
((uint64_t)IXGBE_READ_REG(hw, IXGBE_QBRC_H(i)) << 32);
+   hw_stats->qbrc[i] -= delta_qprc * 4;
+
hw_stats->qbtc[i] += IXGBE_READ_REG(hw, IXGBE_QBTC_L(i));
hw_stats->qbtc[i] +=
((uint64_t)IXGBE_READ_REG(hw, IXGBE_QBTC_H(i)) << 32);
-   *total_qprdc += hw_stats->qprdc[i] +=
-   IXGBE_READ_REG(hw, IXGBE_QPRDC(i));
+
+   hw_stats->qprdc[i] += delta_qprdc;
+   *total_qprdc += hw_stats->qprdc[i];

*total_qprc += hw_stats->qprc[i];
*total_qbrc += hw_stats->qbrc[i];
}
+
hw_stats->mlfc += IXGBE_READ_REG(hw, IXGBE_MLFC);
hw_stats->mrfc += IXGBE_READ_REG(hw, IXGBE_MRFC);
hw_stats->rlec += IXGBE_READ_REG(hw, IXGBE_RLEC);

-   /* Note that gprc counts missed packets */
-   hw_stats->gprc += IXGBE_READ_REG(hw, IXGBE_GPRC);
+   /*
+* An errata states that gprc actually counts good + missed packets:
+* Workaround to set gprc to summated queue packet recieves
+*/
+   hw_stats->gprc = *total_qprc;

if (hw->mac.type != ixgbe_mac_82598EB) {
hw_stats->gorc += IXGBE_READ_REG(hw, IXGBE_GORCL);
@@ -2410,6 +2427,14 @@ ixgbe_read_stats_registers(struct ixgbe_hw *hw, struct ixgbe_hw_stats
hw_stats->gotc += IXGBE_READ_REG(hw, IXGBE_GOTCH);
hw_stats->tor += IXGBE_READ_REG(hw, IXGBE_TORH);
}
+   uint64_t old_tpr = hw_stats->tpr;
+
+   hw_stats->tpr += IXGBE_READ_REG(hw, IXGBE_TPR);
+   hw_stats->tpt += IXGBE_READ_REG(hw, IXGBE_TPT);
+
+   hw_stats->gorc -= delta_gprc * 4;
+   hw_stats->gotc -= delta_gptc * 4;
+   hw_stats->tor -= (hw_stats->tpr - old_tpr) * 4;

/*
 * Workaround: mprc hardware is incorrectly counting
@@ -2449,8 +2474,6 @@ ixgbe_read_stats_registers(struct ixgbe_hw *hw, struct ixgbe_hw_stats
hw_stats->mngprc += IXGBE_READ_REG(hw, IXGBE_MNGPRC);
hw_stats->mngpdc += IXGBE_READ_REG(hw, IXGBE_MNGPDC);
hw_stats->mngptc += IXGBE_READ_REG(hw, IXGBE_MNGPTC);
-   hw_stats->tpr += IXGBE_READ_REG(hw, IXGBE_TPR);
-   hw_stats->tpt += IXGBE_READ_REG(hw, IXGBE_TPT);
hw_stats->ptc127 += IXGBE_READ_REG(hw, IXGBE_PTC127);
hw_stats->ptc255 += IXGBE_READ_REG(hw, IXGBE_PTC255);
hw_stats->ptc511 += IXGBE_READ_REG(hw, IXGBE_PTC511);
-- 
1.9.1



[dpdk-dev] [PATCH 3/4] i40e: fix rx/tx size mismatch, remove crc bytes

2015-11-16 Thread Harry van Haaren
This patch removes the crc bytes from byte counter statistics.

Doing so fixes a bug where CRC bytes were included on TX but not
on RX, causing a mismatch between bytes received and sent.

Fixes: 9aace75fc82e ("i40e: fix statistics")

Reported-by: Weichun Chen 
Signed-off-by: Harry van Haaren 
---
 drivers/net/i40e/i40e_ethdev.c | 33 ++---
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 2c51a0b..c0c268d 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -1870,11 +1870,11 @@ i40e_read_stats_registers(struct i40e_pf *pf, struct i40e_hw *hw)
unsigned int i;
struct i40e_hw_port_stats *ns = &pf->stats; /* new stats */
struct i40e_hw_port_stats *os = &pf->stats_offset; /* old stats */
-   /* Get statistics of struct i40e_eth_stats */
-   i40e_stat_update_48(hw, I40E_GLPRT_GORCH(hw->port),
-   I40E_GLPRT_GORCL(hw->port),
-   pf->offset_loaded, &os->eth.rx_bytes,
-   &ns->eth.rx_bytes);
+
+   /* Workaround: CRC size should not be included in byte statistics,
+* so we add 4 bytes to the offset, causing the resulting byte
+* counters to be 4 bytes smaller.
+*/
i40e_stat_update_48(hw, I40E_GLPRT_UPRCH(hw->port),
I40E_GLPRT_UPRCL(hw->port),
pf->offset_loaded, &os->eth.rx_unicast,
@@ -1887,6 +1887,15 @@ i40e_read_stats_registers(struct i40e_pf *pf, struct i40e_hw *hw)
I40E_GLPRT_BPRCL(hw->port),
pf->offset_loaded, &os->eth.rx_broadcast,
&ns->eth.rx_broadcast);
+
+   /* Get statistics of struct i40e_eth_stats */
+   i40e_stat_update_48(hw, I40E_GLPRT_GORCH(hw->port),
+   I40E_GLPRT_GORCL(hw->port),
+   pf->offset_loaded, &os->eth.rx_bytes,
+   &ns->eth.rx_bytes);
+   ns->eth.rx_bytes -= (ns->eth.rx_unicast + ns->eth.rx_multicast +
+   ns->eth.rx_broadcast) * 4;
+
i40e_stat_update_32(hw, I40E_GLPRT_RDPC(hw->port),
pf->offset_loaded, &os->eth.rx_discards,
&ns->eth.rx_discards);
@@ -1896,10 +1905,6 @@ i40e_read_stats_registers(struct i40e_pf *pf, struct i40e_hw *hw)
pf->offset_loaded,
&os->eth.rx_unknown_protocol,
&ns->eth.rx_unknown_protocol);
-   i40e_stat_update_48(hw, I40E_GLPRT_GOTCH(hw->port),
-   I40E_GLPRT_GOTCL(hw->port),
-   pf->offset_loaded, &os->eth.tx_bytes,
-   &ns->eth.tx_bytes);
i40e_stat_update_48(hw, I40E_GLPRT_UPTCH(hw->port),
I40E_GLPRT_UPTCL(hw->port),
pf->offset_loaded, &os->eth.tx_unicast,
@@ -1912,6 +1917,12 @@ i40e_read_stats_registers(struct i40e_pf *pf, struct i40e_hw *hw)
I40E_GLPRT_BPTCL(hw->port),
pf->offset_loaded, &os->eth.tx_broadcast,
&ns->eth.tx_broadcast);
+   i40e_stat_update_48(hw, I40E_GLPRT_GOTCH(hw->port),
+   I40E_GLPRT_GOTCL(hw->port),
+   pf->offset_loaded, &os->eth.tx_bytes,
+   &ns->eth.tx_bytes);
+   ns->eth.tx_bytes -= (ns->eth.tx_unicast + ns->eth.tx_multicast +
+   ns->eth.tx_broadcast) * 4;
/* GLPRT_TEPC not supported */

/* additional port specific stats */
@@ -2069,8 +2080,8 @@ i40e_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)
stats->opackets = pf->main_vsi->eth_stats.tx_unicast +
pf->main_vsi->eth_stats.tx_multicast +
pf->main_vsi->eth_stats.tx_broadcast;
-   stats->ibytes   = pf->main_vsi->eth_stats.rx_bytes;
-   stats->obytes   = pf->main_vsi->eth_stats.tx_bytes;
+   stats->ibytes   = ns->eth.rx_bytes;
+   stats->obytes   = ns->eth.tx_bytes;
stats->oerrors  = ns->eth.tx_errors +
pf->main_vsi->eth_stats.tx_errors;
stats->imcasts  = pf->main_vsi->eth_stats.rx_multicast;
-- 
1.9.1
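
In this patch the byte counter appears to be recomputed from the register offset on every read, so the CRC correction uses the packet totals (unicast + multicast + broadcast) rather than a per-interval delta. A minimal sketch of that calculation follows. The struct and function names are hypothetical, and the assumption that every counted packet carries exactly 4 CRC bytes is taken from the patch's workaround comment.

```c
#include <stdint.h>

#define DEMO_CRC_LEN 4 /* Ethernet FCS size in bytes */

/* Hypothetical stand-in for one direction of i40e_eth_stats. */
struct demo_eth_stats {
	uint64_t unicast;
	uint64_t multicast;
	uint64_t broadcast;
	uint64_t bytes; /* raw byte counter, CRC included */
};

/* Return the byte count with 4 CRC bytes removed per counted packet. */
static uint64_t
demo_bytes_without_crc(const struct demo_eth_stats *s)
{
	uint64_t packets = s->unicast + s->multicast + s->broadcast;

	return s->bytes - packets * DEMO_CRC_LEN;
}
```

Applying the same function to the RX and TX counters is what makes the two directions comparable, since both then exclude the CRC.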



[dpdk-dev] [PATCH 4/4] fm10k: remove crc size from all byte counters

2015-11-16 Thread Harry van Haaren
This patch removes the crc bytes from byte counter statistics.

Signed-off-by: Harry van Haaren 
---
 drivers/net/fm10k/fm10k_ethdev.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 441f713..fdb2e81 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -1183,11 +1183,13 @@ fm10k_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats)

ipackets = opackets = ibytes = obytes = 0;
for (i = 0; (i < RTE_ETHDEV_QUEUE_STAT_CNTRS) &&
-   (i < hw->mac.max_queues); ++i) {
+   (i < hw->mac.max_queues); ++i) {
stats->q_ipackets[i] = hw_stats->q[i].rx_packets.count;
stats->q_opackets[i] = hw_stats->q[i].tx_packets.count;
-   stats->q_ibytes[i]   = hw_stats->q[i].rx_bytes.count;
-   stats->q_obytes[i]   = hw_stats->q[i].tx_bytes.count;
+   stats->q_ibytes[i]   = hw_stats->q[i].rx_bytes.count -
+   (stats->q_ipackets[i] * 4);
+   stats->q_obytes[i]   = hw_stats->q[i].tx_bytes.count -
+   (stats->q_opackets[i] * 4);
ipackets += stats->q_ipackets[i];
opackets += stats->q_opackets[i];
ibytes   += stats->q_ibytes[i];
-- 
1.9.1



[dpdk-dev] [PATCH v2] doc: fix repeated typo in sample app docs

2015-11-16 Thread De Lara Guarch, Pablo


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of John McNamara
> Sent: Monday, November 16, 2015 9:27 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2] doc: fix repeated typo in sample app docs
> 
> Fix repeated typo in the "Compiling the Application" section of
> almost all of the sample app docs.
> 
> This generally gets copied into new sample app guides.
> 
> Signed-off-by: John McNamara 

Acked-by: Pablo de Lara 


[dpdk-dev] [PATCH v6 2/2] doc: add user-space ethtool sample app guide

2015-11-16 Thread Remy Horton
On 13/11/2015 16:52, Mcnamara, John wrote:

>> +Using the application
>> +-
..
>> +drvinfo
>> +Print driver info
>> +eeprom
>> +Dump EEPROM to file
>
> This definition list doesn't render very well. Maybe better as:

Ideally would present it as a table, but stylistically that felt too 
different from the rest of the documentation. Gone with bullet list.

Will roll the changes into the next patchset.

..Remy


[dpdk-dev] [PATCH] doc: fix examples in netmap compatibility docs

2015-11-16 Thread De Lara Guarch, Pablo


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of John McNamara
> Sent: Friday, November 13, 2015 11:45 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: fix examples in netmap compatibility docs
> 
> Fix the examples in the netmap compatibility sample application
> docs which referred to the packet_reordering application.
> 
> Also fix some minor rst formatting issues.
> 
> Reported-by: Qian Xu 
> Signed-off-by: John McNamara 

Acked-by: Pablo de Lara 


[dpdk-dev] [PATCH] kni: fix compile issue on different kernel versions

2015-11-16 Thread De Lara Guarch, Pablo


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Helin Zhang
> Sent: Monday, November 09, 2015 6:26 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] kni: fix compile issue on different kernel
> versions
> 
> It fixes the compile issue on kernel version 2.6.32 or older.
> 
> Error logs:
> lib/librte_eal/linuxapp/kni/kni_misc.c:121: error: unknown field id
> specified in initializer
> lib/librte_eal/linuxapp/kni/kni_misc.c:121: error: excess elements in struct
> initializer
> lib/librte_eal/linuxapp/kni/kni_misc.c:121: error: (near initialization for
> kni_net_ops)
> lib/librte_eal/linuxapp/kni/kni_misc.c:122: error: unknown field size
> specified in initializer
> lib/librte_eal/linuxapp/kni/kni_misc.c:122: error: excess elements in struct
> initializer
> lib/librte_eal/linuxapp/kni/kni_misc.c:122: error: (near initialization for
> kni_net_ops)
> 
> Fixes: 72a7a2b2469e ("kni: allow per-net instances")
> 
> Signed-off-by: Helin Zhang 

Acked-by: Pablo de Lara 


[dpdk-dev] Recent changes related to interrupt thread

2015-11-16 Thread Rahul Lakkireddy
Hi,

I notice that the following changeset:

Fixes: fd6949c55c9a ("eal: fix io permission for virtio interrupt
handler")

has moved the initialization of the interrupt thread to after the master
lcore has been initialized.  However, this causes the interrupt thread
to _inherit_ the affinity of the master lcore. Hence, this seems to
cause all interrupts to be handled by _only_ the master lcore. Because
of this change, it seems that alarm interrupts are now also handled
by the master lcore only, IIUC.

We are seeing a performance regression for the cxgbe PMD after this commit,
since the cxgbe PMD relies on alarms to periodically transmit pending
coalesced packets.

Also, this perf degradation is only seen if there's a queue allocated
on the master lcore, such as in l3fwd app.  If the master lcore has
been skipped, then no degradation in perf is seen since only the alarm
will run on the master lcore.

So, is the change done to make all interrupts, including alarm
interrupts, be handled by _only_ the master lcore intended?

BTW, I have tried setting the affinity to all cpus instead in
eal_intr_init() and this seems to restore the perf back. Perhaps it's
better to move the master lcore initialization to after the interrupt
thread has been initialized as well? Thoughts?

Thanks,
Rahul
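
The affinity inheritance described above can be reproduced outside DPDK.
The following is a minimal sketch, assuming Linux with glibc's non-portable
pthread affinity calls. child_main and child_cpu_count are hypothetical
names used only for illustration; they are not EAL functions, and this is
not the change that was tested in eal_intr_init().

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

struct child_args {
	const cpu_set_t *widen_to; /* NULL: keep the inherited mask */
	int allowed;               /* out: CPUs the child may run on, -1 on error */
};

/* Thread body: optionally widen the affinity, then count the allowed CPUs. */
static void *child_main(void *p)
{
	struct child_args *a = p;
	cpu_set_t set;

	a->allowed = -1;
	if (a->widen_to != NULL &&
	    pthread_setaffinity_np(pthread_self(), sizeof(*a->widen_to),
				   a->widen_to) != 0)
		return NULL;

	CPU_ZERO(&set);
	if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) == 0)
		a->allowed = CPU_COUNT(&set);
	return NULL;
}

/*
 * Pin the caller to one CPU (standing in for the master lcore), create a
 * thread afterwards, and return how many CPUs that thread may run on.
 * With widen != 0 the thread first resets its mask to the caller's
 * original one, which stands in for "setting the affinity to all cpus".
 */
static int child_cpu_count(int widen)
{
	cpu_set_t orig, one;
	struct child_args args;
	pthread_t tid;
	int cpu;

	if (pthread_getaffinity_np(pthread_self(), sizeof(orig), &orig) != 0)
		return -1;

	/* Pick the first CPU this process is actually allowed to use. */
	for (cpu = 0; cpu < CPU_SETSIZE && !CPU_ISSET(cpu, &orig); cpu++)
		;
	if (cpu == CPU_SETSIZE)
		return -1;

	CPU_ZERO(&one);
	CPU_SET(cpu, &one);
	if (pthread_setaffinity_np(pthread_self(), sizeof(one), &one) != 0)
		return -1;

	args.widen_to = widen ? &orig : NULL;
	args.allowed = -1;
	if (pthread_create(&tid, NULL, child_main, &args) == 0)
		pthread_join(tid, NULL);

	/* Restore the caller's original affinity before returning. */
	pthread_setaffinity_np(pthread_self(), sizeof(orig), &orig);
	return args.allowed;
}
```

Calling child_cpu_count(0) shows the inherited single-CPU mask, while
child_cpu_count(1) shows the thread free to run on every CPU the process
was originally allowed to use.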


[dpdk-dev] [PATCH v7 0/2] User-space ethtool sample application

2015-11-16 Thread Remy Horton
Further enhancements to the userspace ethtool implementation that was
submitted in 2.1 and packaged as a self-contained sample application.
Implements an rte_ethtool shim layer based on rte_ethdev API, along
with a command prompt driven demonstration application.

This patchset depends on:
* http://dpdk.org/dev/patchwork/patch/6563/
* http://dpdk.org/dev/patchwork/patch/8070/

v7:
* Ringparam printouts wrong way round
* Ringparam help message corrections
* Use __rte_unused instead of __attribute__((unused))
* Allow Jumbo-sized MTUs
* Documentation style and spelling changes

v6:
* Fixed hang when run with zero available ports
* Fixed incorrect sanity check preventing EEPROM dumps
* Documentation additions
* Fixed RxMode accepting untagged packets
* Fixed ringparam allocation being too small

v5:
* Documentation changes

v4:
* Fixed assumption that master core always has id zero
* Changed 1:1 core-to-port to 2 core (ethtool & ports) design 
* Included the correct documentation..

v3:
* Made use of enums for core state.
* Fixed Makefile issue.
* Fixed incorrect assumption with core ids.
* Changed handling of more ports than cores.

v2:
* Replaced l2fwd base with simpler application.
* Added ringparam functions.
* Added documentation.

Remy Horton (2):
  example: add user-space ethtool sample application
  doc: add user-space ethtool sample app guide & release notes

 MAINTAINERS   |   4 +
 doc/guides/rel_notes/release_2_2.rst  |   1 +
 doc/guides/sample_app_ug/ethtool.rst  | 160 +++
 doc/guides/sample_app_ug/index.rst|   1 +
 examples/ethtool/Makefile |  48 ++
 examples/ethtool/ethtool-app/Makefile |  54 +++
 examples/ethtool/ethtool-app/ethapp.c | 873 ++
 examples/ethtool/ethtool-app/ethapp.h |  41 ++
 examples/ethtool/ethtool-app/main.c   | 305 
 examples/ethtool/lib/Makefile |  57 +++
 examples/ethtool/lib/rte_ethtool.c| 421 
 examples/ethtool/lib/rte_ethtool.h| 410 
 12 files changed, 2375 insertions(+)
 create mode 100644 doc/guides/sample_app_ug/ethtool.rst
 create mode 100644 examples/ethtool/Makefile
 create mode 100644 examples/ethtool/ethtool-app/Makefile
 create mode 100644 examples/ethtool/ethtool-app/ethapp.c
 create mode 100644 examples/ethtool/ethtool-app/ethapp.h
 create mode 100644 examples/ethtool/ethtool-app/main.c
 create mode 100644 examples/ethtool/lib/Makefile
 create mode 100644 examples/ethtool/lib/rte_ethtool.c
 create mode 100644 examples/ethtool/lib/rte_ethtool.h

-- 
1.9.3



[dpdk-dev] [PATCH v7 2/2] doc: add user-space ethtool sample app guide

2015-11-16 Thread Remy Horton
Signed-off-by: Remy Horton 
---
 doc/guides/rel_notes/release_2_2.rst |   1 +
 doc/guides/sample_app_ug/ethtool.rst | 160 +++
 doc/guides/sample_app_ug/index.rst   |   1 +
 3 files changed, 162 insertions(+)
 create mode 100644 doc/guides/sample_app_ug/ethtool.rst

diff --git a/doc/guides/rel_notes/release_2_2.rst 
b/doc/guides/rel_notes/release_2_2.rst
index 59dda59..bafebb3 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -187,6 +187,7 @@ Libraries
 Examples
 

+* **ethtool: Added ethtool shim and sample application.**

 Other
 ~
diff --git a/doc/guides/sample_app_ug/ethtool.rst 
b/doc/guides/sample_app_ug/ethtool.rst
new file mode 100644
index 0000000..4d1697e
--- /dev/null
+++ b/doc/guides/sample_app_ug/ethtool.rst
@@ -0,0 +1,160 @@
+
+..  BSD LICENSE
+Copyright(c) 2015 Intel Corporation. All rights reserved.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of Intel Corporation nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Ethtool Sample Application
+==========================
+
+The Ethtool sample application shows an implementation of an
+ethtool-like API and provides a console environment that allows
+its use to query and change Ethernet card parameters. The sample
+is based upon a simple L2 frame reflector.
+
+Compiling the Application
+-------------------------
+
+To compile the application:
+
+#.  Go to the sample application directory:
+
+.. code-block:: console
+
+export RTE_SDK=/path/to/rte_sdk
+cd ${RTE_SDK}/examples/ethtool
+
+#.  Set the target (a default target is used if not specified). For example:
+
+.. code-block:: console
+
+export RTE_TARGET=x86_64-native-linuxapp-gcc
+
+See the *DPDK Getting Started Guide* for possible RTE_TARGET values.
+
+#.  Build the application:
+
+.. code-block:: console
+
+make
+
+Running the Application
+-----------------------
+
+The application requires an available core for each port, plus one.
+The only available options are the standard ones for the EAL:
+
+.. code-block:: console
+
+./ethtool-app/ethtool-app/${RTE_TARGET}/ethtool [EAL options]
+
+Refer to the *DPDK Getting Started Guide* for general information on
+running applications and the Environment Abstraction Layer (EAL)
+options.
+
+Using the application
+---------------------
+
+The application is console-driven using the cmdline DPDK interface:
+
+.. code-block:: console
+
+EthApp>
+
+From this interface the available commands and descriptions of what
+they do are as follows:
+
+* ``drvinfo``: Print driver info
+* ``eeprom``: Dump EEPROM to file
+* ``link``: Print port link states
+* ``macaddr``: Gets/sets MAC address
+* ``mtu``: Set NIC MTU
+* ``open``: Open port
+* ``pause``: Get/set port pause state
+* ``portstats``: Print port statistics
+* ``regs``: Dump port register(s) to file
+* ``ringparam``: Get/set ring parameters
+* ``rxmode``: Toggle port Rx mode
+* ``stop``: Stop port
+* ``validate``: Check that given MAC address is valid unicast address
+* ``vlan``: Add/remove VLAN id
+* ``quit``: Exit program
+
+
+Explanation
+-----------
+
+The sample program has two parts: A background `packet reflector`_
+that runs on a slave core, and a foreground `Ethtool Shell`_ that
+runs on the master core. These are described below.
+
+Packet Reflector
+
+
+The background packet reflector is intended to demonstrate basic
+packet processing on NIC ports controlled by the Ethtool shim.
+Each incoming MAC frame is 

[dpdk-dev] [PATCH v7 1/2] example: add user-space ethtool sample application

2015-11-16 Thread Remy Horton
Further enhancements to the userspace ethtool implementation that was
submitted in 2.1 and packaged as a self-contained sample application.
Implements an rte_ethtool shim layer based on rte_ethdev API, along
with a command prompt driven demonstration application.

Signed-off-by: Remy Horton 
---
 MAINTAINERS   |   4 +
 examples/ethtool/Makefile |  48 ++
 examples/ethtool/ethtool-app/Makefile |  54 +++
 examples/ethtool/ethtool-app/ethapp.c | 873 ++
 examples/ethtool/ethtool-app/ethapp.h |  41 ++
 examples/ethtool/ethtool-app/main.c   | 305 
 examples/ethtool/lib/Makefile |  57 +++
 examples/ethtool/lib/rte_ethtool.c| 421 
 examples/ethtool/lib/rte_ethtool.h| 410 
 9 files changed, 2213 insertions(+)
 create mode 100644 examples/ethtool/Makefile
 create mode 100644 examples/ethtool/ethtool-app/Makefile
 create mode 100644 examples/ethtool/ethtool-app/ethapp.c
 create mode 100644 examples/ethtool/ethtool-app/ethapp.h
 create mode 100644 examples/ethtool/ethtool-app/main.c
 create mode 100644 examples/ethtool/lib/Makefile
 create mode 100644 examples/ethtool/lib/rte_ethtool.c
 create mode 100644 examples/ethtool/lib/rte_ethtool.h

diff --git a/MAINTAINERS b/MAINTAINERS
index c8be5d2..ee58d7a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -520,3 +520,7 @@ F: examples/tep_termination/
 F: examples/vmdq/
 F: examples/vmdq_dcb/
 F: doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst
+
+M: Remy Horton 
+F: examples/ethtool/
+F: doc/guides/sample_app_ug/ethtool.rst
diff --git a/examples/ethtool/Makefile b/examples/ethtool/Makefile
new file mode 100644
index 0000000..94f8ee3
--- /dev/null
+++ b/examples/ethtool/Makefile
@@ -0,0 +1,48 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2015 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overwritten by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+ifneq ($(CONFIG_RTE_EXEC_ENV),"linuxapp")
+$(error This application can only operate in a linuxapp environment, \
+please change the definition of the RTE_TARGET environment variable)
+endif
+
+DIRS-y += lib ethtool-app
+
+include $(RTE_SDK)/mk/rte.extsubdir.mk
diff --git a/examples/ethtool/ethtool-app/Makefile 
b/examples/ethtool/ethtool-app/Makefile
new file mode 100644
index 0000000..09c66ad
--- /dev/null
+++ b/examples/ethtool/ethtool-app/Makefile
@@ -0,0 +1,54 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software w

[dpdk-dev] [PATCH v8 0/2] User-space ethtool sample application

2015-11-16 Thread Remy Horton
Further enhancements to the userspace ethtool implementation that was
submitted in 2.1 and packaged as a self-contained sample application.
Implements an rte_ethtool shim layer based on rte_ethdev API, along
with a command prompt driven demonstration application.

This patchset depends on:
* http://dpdk.org/dev/patchwork/patch/6563/
* http://dpdk.org/dev/patchwork/patch/8070/

v8:
* Rebased to latest origin/master

v7:
* Ringparam printouts wrong way round
* Ringparam help message corrections
* Use __rte_unused instead of __attribute__((unused))
* Allow Jumbo-sized MTUs
* Documentation style and spelling changes

v6:
* Fixed hang when run with zero available ports
* Fixed incorrect sanity check preventing EEPROM dumps
* Documentation additions
* Fixed RxMode accepting untagged packets
* Fixed ringparam allocation being too small

v5:
* Documentation changes

v4:
* Fixed assumption that master core always has id zero
* Changed 1:1 core-to-port to 2 core (ethtool & ports) design 
* Included the correct documentation..

v3:
* Made use of enums for core state.
* Fixed Makefile issue.
* Fixed incorrect assumption with core ids.
* Changed handling of more ports than cores.

v2:
* Replaced l2fwd base with simpler application.
* Added ringparam functions.
* Added documentation.

Remy Horton (2):
  example: add user-space ethtool sample application
  doc: add user-space ethtool sample app guide & release notes

 MAINTAINERS   |   4 +
 doc/guides/rel_notes/release_2_2.rst  |   1 +
 doc/guides/sample_app_ug/ethtool.rst  | 160 +++
 doc/guides/sample_app_ug/index.rst|   1 +
 examples/ethtool/Makefile |  48 ++
 examples/ethtool/ethtool-app/Makefile |  54 +++
 examples/ethtool/ethtool-app/ethapp.c | 873 ++
 examples/ethtool/ethtool-app/ethapp.h |  41 ++
 examples/ethtool/ethtool-app/main.c   | 305 
 examples/ethtool/lib/Makefile |  57 +++
 examples/ethtool/lib/rte_ethtool.c| 421 
 examples/ethtool/lib/rte_ethtool.h| 410 
 12 files changed, 2375 insertions(+)
 create mode 100644 doc/guides/sample_app_ug/ethtool.rst
 create mode 100644 examples/ethtool/Makefile
 create mode 100644 examples/ethtool/ethtool-app/Makefile
 create mode 100644 examples/ethtool/ethtool-app/ethapp.c
 create mode 100644 examples/ethtool/ethtool-app/ethapp.h
 create mode 100644 examples/ethtool/ethtool-app/main.c
 create mode 100644 examples/ethtool/lib/Makefile
 create mode 100644 examples/ethtool/lib/rte_ethtool.c
 create mode 100644 examples/ethtool/lib/rte_ethtool.h

-- 
1.9.3



[dpdk-dev] [PATCH v8 1/2] example: add user-space ethtool sample application

2015-11-16 Thread Remy Horton
Further enhancements to the userspace ethtool implementation that was
submitted in 2.1 and packaged as a self-contained sample application.
Implements an rte_ethtool shim layer based on rte_ethdev API, along
with a command prompt driven demonstration application.

Signed-off-by: Remy Horton 
---
 MAINTAINERS   |   4 +
 examples/ethtool/Makefile |  48 ++
 examples/ethtool/ethtool-app/Makefile |  54 +++
 examples/ethtool/ethtool-app/ethapp.c | 873 ++
 examples/ethtool/ethtool-app/ethapp.h |  41 ++
 examples/ethtool/ethtool-app/main.c   | 305 
 examples/ethtool/lib/Makefile |  57 +++
 examples/ethtool/lib/rte_ethtool.c| 421 
 examples/ethtool/lib/rte_ethtool.h| 410 
 9 files changed, 2213 insertions(+)
 create mode 100644 examples/ethtool/Makefile
 create mode 100644 examples/ethtool/ethtool-app/Makefile
 create mode 100644 examples/ethtool/ethtool-app/ethapp.c
 create mode 100644 examples/ethtool/ethtool-app/ethapp.h
 create mode 100644 examples/ethtool/ethtool-app/main.c
 create mode 100644 examples/ethtool/lib/Makefile
 create mode 100644 examples/ethtool/lib/rte_ethtool.c
 create mode 100644 examples/ethtool/lib/rte_ethtool.h

diff --git a/MAINTAINERS b/MAINTAINERS
index d6feada..08e9047 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -526,3 +526,7 @@ F: doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst
 M: Pablo de Lara 
 M: Daniel Mrzyglod 
 F: examples/ptpclient/
+
+M: Remy Horton 
+F: examples/ethtool/
+F: doc/guides/sample_app_ug/ethtool.rst
diff --git a/examples/ethtool/Makefile b/examples/ethtool/Makefile
new file mode 100644
index 0000000..94f8ee3
--- /dev/null
+++ b/examples/ethtool/Makefile
@@ -0,0 +1,48 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2015 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overwritten by command line or environment
+RTE_TARGET ?= x86_64-native-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+ifneq ($(CONFIG_RTE_EXEC_ENV),"linuxapp")
+$(error This application can only operate in a linuxapp environment, \
+please change the definition of the RTE_TARGET environment variable)
+endif
+
+DIRS-y += lib ethtool-app
+
+include $(RTE_SDK)/mk/rte.extsubdir.mk
diff --git a/examples/ethtool/ethtool-app/Makefile 
b/examples/ethtool/ethtool-app/Makefile
new file mode 100644
index 0000000..09c66ad
--- /dev/null
+++ b/examples/ethtool/ethtool-app/Makefile
@@ -0,0 +1,54 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without s

[dpdk-dev] [PATCH v8 2/2] doc: add user-space ethtool sample app guide

2015-11-16 Thread Remy Horton
Signed-off-by: Remy Horton 
---
 doc/guides/rel_notes/release_2_2.rst |   1 +
 doc/guides/sample_app_ug/ethtool.rst | 160 +++
 doc/guides/sample_app_ug/index.rst   |   1 +
 3 files changed, 162 insertions(+)
 create mode 100644 doc/guides/sample_app_ug/ethtool.rst

diff --git a/doc/guides/rel_notes/release_2_2.rst 
b/doc/guides/rel_notes/release_2_2.rst
index 0781ae6..43901dc 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -207,6 +207,7 @@ Libraries
 Examples
 

+* **ethtool: Added ethtool shim and sample application.**

 Other
 ~
diff --git a/doc/guides/sample_app_ug/ethtool.rst 
b/doc/guides/sample_app_ug/ethtool.rst
new file mode 100644
index 0000000..4d1697e
--- /dev/null
+++ b/doc/guides/sample_app_ug/ethtool.rst
@@ -0,0 +1,160 @@
+
+..  BSD LICENSE
+Copyright(c) 2015 Intel Corporation. All rights reserved.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of Intel Corporation nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Ethtool Sample Application
+==========================
+
+The Ethtool sample application shows an implementation of an
+ethtool-like API and provides a console environment that allows
+its use to query and change Ethernet card parameters. The sample
+is based upon a simple L2 frame reflector.
+
+Compiling the Application
+-------------------------
+
+To compile the application:
+
+#.  Go to the sample application directory:
+
+.. code-block:: console
+
+export RTE_SDK=/path/to/rte_sdk
+cd ${RTE_SDK}/examples/ethtool
+
+#.  Set the target (a default target is used if not specified). For example:
+
+.. code-block:: console
+
+export RTE_TARGET=x86_64-native-linuxapp-gcc
+
+See the *DPDK Getting Started Guide* for possible RTE_TARGET values.
+
+#.  Build the application:
+
+.. code-block:: console
+
+make
+
+Running the Application
+-----------------------
+
+The application requires an available core for each port, plus one.
+The only available options are the standard ones for the EAL:
+
+.. code-block:: console
+
+./ethtool-app/ethtool-app/${RTE_TARGET}/ethtool [EAL options]
+
+Refer to the *DPDK Getting Started Guide* for general information on
+running applications and the Environment Abstraction Layer (EAL)
+options.
+
+Using the application
+---------------------
+
+The application is console-driven using the cmdline DPDK interface:
+
+.. code-block:: console
+
+EthApp>
+
+From this interface the available commands and descriptions of what
+they do are as follows:
+
+* ``drvinfo``: Print driver info
+* ``eeprom``: Dump EEPROM to file
+* ``link``: Print port link states
+* ``macaddr``: Gets/sets MAC address
+* ``mtu``: Set NIC MTU
+* ``open``: Open port
+* ``pause``: Get/set port pause state
+* ``portstats``: Print port statistics
+* ``regs``: Dump port register(s) to file
+* ``ringparam``: Get/set ring parameters
+* ``rxmode``: Toggle port Rx mode
+* ``stop``: Stop port
+* ``validate``: Check that given MAC address is valid unicast address
+* ``vlan``: Add/remove VLAN id
+* ``quit``: Exit program
+
+
+Explanation
+-----------
+
+The sample program has two parts: A background `packet reflector`_
+that runs on a slave core, and a foreground `Ethtool Shell`_ that
+runs on the master core. These are described below.
+
+Packet Reflector
+
+
+The background packet reflector is intended to demonstrate basic
+packet processing on NIC ports controlled by the Ethtool shim.
+Each incoming MAC frame is 

[dpdk-dev] Removing the figure and table lists from the documentation

2015-11-16 Thread Mcnamara, John
Hi,

The figure and table lists in the documentation are out of date again because 
some images/tables have been added to the docs without updating the lists.

I could submit a patch to fix this but I would prefer to remove them since they 
are hard to maintain and not very useful.

If you are wondering about what I am referring to then that is probably a half
vote in favour of removing them. See the end of the index sections of:

http://dpdk.org/doc/guides/prog_guide/
http://dpdk.org/doc/guides/sample_app_ug/index.html
http://dpdk.org/doc/guides/nics/index.html

John


[dpdk-dev] Recent changes related to interrupt thread

2015-11-16 Thread Thomas Monjalon
Hi,

2015-11-16 18:02, Rahul Lakkireddy:
> Hi,
> 
> I notice that the following changeset:
> 
> Fixes: fd6949c55c9a ("eal: fix io permission for virtio interrupt
> handler")
> 
> has moved the initialization of the interrupt thread to after the master
> lcore has been initialized.  However, this causes the interrupt thread
> to _inherit_ the affinity of the master lcore. Hence, this seems to
> make all interrupts be handled by _only_ the master lcore. Because
> of this change, it seems that alarm interrupts are now also handled
> by the master lcore only, IIUC.
> 
> We are seeing a performance regression for cxgbe PMD after this commit,
> since cxgbe PMD relies on alarms to periodically transmit pending
> coalesced packets.
> 
> Also, this perf degradation is only seen if there's a queue allocated
> on the master lcore, such as in l3fwd app.  If the master lcore has
> been skipped, then no degradation in perf is seen since only the alarm
> will run on the master lcore.
> 
> So, is the change done to make all interrupts, including alarm
> interrupts, be handled by _only_ the master lcore intended?

No, it was not intended. The idea was to inherit settings (iopl) from
the device initialization into the interrupt thread.
Though a DPDK driver is not really supposed to rely on interrupt performance.
So having interrupts managed on any core was more or less a side effect.

> BTW, I have tried setting the affinity to all cpus instead in
> eal_intr_init() and this seems to restore the perf back. Perhaps it's
> better to move the master lcore initialization to after the interrupt
> thread has been initialized as well? Thoughts?

Yes, I think it's possible.
We can also imagine a command line option to set the interrupt affinity
with a default which mimics the old behaviour.

In order to make this conversation clearer, and for later references,
below is the DPDK init call tree:

start
    driver constructor (if .a)
        rte_eal_driver_register
    main
        rte_eal_init
            eal_parse_args
            rte_eal_pci_init
            rte_eal_memory_init
            eal_plugins_init
                dlopen
                    driver constructor (if .so)
                        rte_eal_driver_register
            eal_thread_init_master
                eal_thread_set_affinity
            rte_eal_dev_init
                driver->init
                    PMD init
                        rte_eth_driver_register
            rte_eal_intr_init
                pthread_create
                    eal_intr_thread_main
                        eal_intr_handle_interrupts
            pthread_create
            rte_eal_pci_probe
                driver->devinit
                    rte_eth_dev_init
                        rte_eth_dev_allocate
                        eth_drv->eth_dev_init
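
The command line option imagined above could take a hexadecimal core mask,
in the style of the existing -c option. The following is only a sketch
under that assumption: no such option exists in the EAL, and
parse_intr_coremask and INTR_MASK_ALL are hypothetical names. When the
option is absent the mask covers every CPU, which mimics the old behaviour
where the interrupt thread was not tied to the master lcore.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical default: the interrupt thread may run on any CPU. */
#define INTR_MASK_ALL UINT64_MAX

/*
 * Parse a hexadecimal core mask such as "0xf" or "f" into *mask.
 * A NULL argument means the option was not given, so the default is used.
 * Returns 0 on success and -1 on a malformed or empty mask.
 */
static int parse_intr_coremask(const char *arg, uint64_t *mask)
{
	char *end;
	unsigned long long v;

	if (arg == NULL) {
		*mask = INTR_MASK_ALL;
		return 0;
	}

	/* Reject signs, whitespace and empty strings up front. */
	if (!isxdigit((unsigned char)arg[0]))
		return -1;

	errno = 0;
	v = strtoull(arg, &end, 16);
	if (errno != 0 || end == arg || *end != '\0' || v == 0)
		return -1;

	*mask = (uint64_t)v;
	return 0;
}
```

The resulting mask would still have to be translated into a cpu_set_t and
applied to the interrupt thread; that step is left out here.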



[dpdk-dev] [PATCH v8 0/2] User-space ethtool sample application

2015-11-16 Thread Thomas Monjalon
2015-11-16 13:42, Remy Horton:
> This patchset depends on:
> * http://dpdk.org/dev/patchwork/patch/6563/

This one has changes requested.

> * http://dpdk.org/dev/patchwork/patch/8070/

This one is accepted.



[dpdk-dev] [PATCH] kni: fix compilation issue with KNI_VHOST enabled

2015-11-16 Thread Pablo de Lara
Fix for the following error, on kernels 4.2.0 or higher,
when KNI_VHOST is enabled:

  CC [M]  lib/librte_eal/linuxapp/kni/kni_vhost.o
  lib/librte_eal/linuxapp/kni/kni_vhost.c: In function 'kni_vhost_backend_init':
  lib/librte_eal/linuxapp/kni/kni_vhost.c:669:38: error: too few arguments to function 'sk_alloc'
  if (!(q = (struct kni_vhost_queue *)sk_alloc(
  ^
In file included from lib/librte_eal/linuxapp/kni/kni_vhost.c:27:0:
/usr/src/kernels/4.2.3-300.fc23.x86_64/include/net/sock.h:1515:14: note: declared here
 struct sock *sk_alloc(struct net *net, int family, gfp_t priority,
  ^
This change in the kernel was added in the following commit:

Linux: 11aa9c28 ("net: Pass kern from net_proto_family.create to sk_alloc")

Signed-off-by: Pablo de Lara 
---
 lib/librte_eal/linuxapp/kni/kni_vhost.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/linuxapp/kni/kni_vhost.c 
b/lib/librte_eal/linuxapp/kni/kni_vhost.c
index d0c12a6..2346ff3 100644
--- a/lib/librte_eal/linuxapp/kni/kni_vhost.c
+++ b/lib/librte_eal/linuxapp/kni/kni_vhost.c
@@ -666,9 +666,15 @@ kni_vhost_backend_init(struct kni_dev *kni)
if (kni->vhost_queue != NULL)
return -1;

+#if LINUX_VERSION_CODE >= KERNEL_VERSION(4,2,0)
+   if (!(q = (struct kni_vhost_queue *)sk_alloc(
+ net, AF_UNSPEC, GFP_KERNEL, &kni_raw_proto, 0)))
+   return -ENOMEM;
+#else
if (!(q = (struct kni_vhost_queue *)sk_alloc(
  net, AF_UNSPEC, GFP_KERNEL, &kni_raw_proto)))
return -ENOMEM;
+#endif /* LINUX_VERSION_CODE >= KERNEL_VERSION(4,2,0) */

err = sock_create_lite(AF_UNSPEC, SOCK_RAW, IPPROTO_RAW, &q->sock);
if (err)
-- 
2.5.0
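
For readers unfamiliar with the version guard used in the patch above,
here is a small userspace sketch of how KERNEL_VERSION packs a version
into one integer so that it can be compared with >=. The macro below
mirrors the classic definition in <linux/version.h>. sk_alloc_takes_kern
and the kni_sk_alloc wrapper shown in the comment are hypothetical names
for illustration and are not part of the patch.

```c
/*
 * Userspace stand-in so the sketch compiles on its own; inside a kernel
 * module, KERNEL_VERSION and LINUX_VERSION_CODE come from <linux/version.h>.
 */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#endif

/*
 * Kernel-only alternative, shown as a comment because it cannot be built
 * in userspace. It hides the signature difference behind one macro:
 *
 *   #if LINUX_VERSION_CODE >= KERNEL_VERSION(4, 2, 0)
 *   #define kni_sk_alloc(net, fam, gfp, proto) sk_alloc(net, fam, gfp, proto, 0)
 *   #else
 *   #define kni_sk_alloc(net, fam, gfp, proto) sk_alloc(net, fam, gfp, proto)
 *   #endif
 */

/*
 * Nonzero when the given version code is 4.2.0 or later, i.e. when the
 * patch passes the extra trailing argument to sk_alloc().
 */
static int sk_alloc_takes_kern(int version_code)
{
	return version_code >= KERNEL_VERSION(4, 2, 0);
}
```

Because the major, minor and patch numbers occupy separate bit ranges, a
plain integer comparison orders versions correctly for the releases
discussed in this thread (2.6.32, 4.1.x, 4.2.x).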



[dpdk-dev] [PATCH v8 2/2] doc: add user-space ethtool sample app guide

2015-11-16 Thread Mcnamara, John
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Remy Horton
> Sent: Monday, November 16, 2015 1:42 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v8 2/2] doc: add user-space ethtool sample app
> guide
> 
> Signed-off-by: Remy Horton 

Acked-by: John McNamara 




[dpdk-dev] Removing the figure and table lists from the documentation

2015-11-16 Thread Iremonger, Bernard
Hi John,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Mcnamara, John
> Sent: Monday, November 16, 2015 1:47 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] Removing the figure and table lists from the
> documentation
> 
> Hi,
> 
> The figure and table lists in the documentation are out of date again because
> some images/tables have been added to the docs without updating the lists.
> 
> I could submit a patch to fix this but I would prefer to remove them since
> they are hard to maintain and not very useful.
> 
> If you are wondering about what I am referring to then that is probably a half
> vote in favour of removing them. See the end of the index sections of:
> 
> http://dpdk.org/doc/guides/prog_guide/
> http://dpdk.org/doc/guides/sample_app_ug/index.html
> http://dpdk.org/doc/guides/nics/index.html
> 
> John

I think the figure and table lists should be kept as they enable cross
referencing of the figures and tables in the documents. When images/tables are
added the lists should be updated.

Regards,

Bernard.



[dpdk-dev] [PATCH v3 6/6] doc: update 2.2 release notes

2015-11-16 Thread Mcnamara, John


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Matej Vido
> Sent: Tuesday, November 10, 2015 2:18 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 6/6] doc: update 2.2 release notes
> 
> Add szedata2 PMD to 2.2 release notes.
> 
> Signed-off-by: Matej Vido 

Acked-by: John McNamara 



[dpdk-dev] [PATCH] kni: fix compilation issue with KNI_VHOST enabled

2015-11-16 Thread Zhang, Helin


-Original Message-
From: De Lara Guarch, Pablo 
Sent: Monday, November 16, 2015 10:08 PM
To: dev at dpdk.org
Cc: Zhang, Helin ; De Lara Guarch, Pablo 

Subject: [PATCH] kni: fix compilation issue with KNI_VHOST enabled

Fix for the following error, on kernels 4.2.0 or higher, when KNI_VHOST is 
enabled:

  CC [M]  lib/librte_eal/linuxapp/kni/kni_vhost.o
  lib/librte_eal/linuxapp/kni/kni_vhost.c: In function
'kni_vhost_backend_init':
  lib/librte_eal/linuxapp/kni/kni_vhost.c:669:38: error: too few
arguments to function 'sk_alloc'
  if (!(q = (struct kni_vhost_queue *)sk_alloc(
  ^
In file included from lib/librte_eal/linuxapp/kni/kni_vhost.c:27:0:
/usr/src/kernels/4.2.3-300.fc23.x86_64/include/net/sock.h:1515:14: note: 
declared here  struct sock *sk_alloc(struct net *net, int family, gfp_t 
priority,
  ^
This change in the kernel was added in the following commit:

Linux: 11aa9c28 ("net: Pass kern from net_proto_family.create to sk_alloc")

Signed-off-by: Pablo de Lara 
---
 lib/librte_eal/linuxapp/kni/kni_vhost.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/linuxapp/kni/kni_vhost.c 
b/lib/librte_eal/linuxapp/kni/kni_vhost.c
index d0c12a6..2346ff3 100644
--- a/lib/librte_eal/linuxapp/kni/kni_vhost.c
+++ b/lib/librte_eal/linuxapp/kni/kni_vhost.c
@@ -666,9 +666,15 @@ kni_vhost_backend_init(struct kni_dev *kni)
if (kni->vhost_queue != NULL)
return -1;

+#if LINUX_VERSION_CODE >= KERNEL_VERSION(4,2,0)
+   if (!(q = (struct kni_vhost_queue *)sk_alloc(
+ net, AF_UNSPEC, GFP_KERNEL, &kni_raw_proto, 0)))
Is this a kernel socket or something else?

/Helin

+   return -ENOMEM;
+#else
if (!(q = (struct kni_vhost_queue *)sk_alloc(
  net, AF_UNSPEC, GFP_KERNEL, &kni_raw_proto)))
return -ENOMEM;
+#endif /* LINUX_VERSION_CODE >= KERNEL_VERSION(4,2,0) */

err = sock_create_lite(AF_UNSPEC, SOCK_RAW, IPPROTO_RAW, &q->sock);
if (err)
--
2.5.0



[dpdk-dev] [PATCH] kni: fix compilation issue with KNI_VHOST enabled

2015-11-16 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Zhang, Helin
> Sent: Monday, November 16, 2015 3:19 PM
> To: De Lara Guarch, Pablo; dev at dpdk.org
> Subject: RE: [PATCH] kni: fix compilation issue with KNI_VHOST enabled
> 
> 
> 
> -Original Message-
> From: De Lara Guarch, Pablo
> Sent: Monday, November 16, 2015 10:08 PM
> To: dev at dpdk.org
> Cc: Zhang, Helin ; De Lara Guarch, Pablo
> 
> Subject: [PATCH] kni: fix compilation issue with KNI_VHOST enabled
> 
> Fix for the following error, on kernels 4.2.0 or higher, when KNI_VHOST is
> enabled:
> 
>   CC [M]  lib/librte_eal/linuxapp/kni/kni_vhost.o
>   lib/librte_eal/linuxapp/kni/kni_vhost.c: In function
> 'kni_vhost_backend_init':
>   lib/librte_eal/linuxapp/kni/kni_vhost.c:669:38: error: too few
> arguments to function 'sk_alloc'
>   if (!(q = (struct kni_vhost_queue *)sk_alloc(
>   ^
> In file included from lib/librte_eal/linuxapp/kni/kni_vhost.c:27:0:
> /usr/src/kernels/4.2.3-300.fc23.x86_64/include/net/sock.h:1515:14: note:
> declared here  struct sock *sk_alloc(struct net *net, int family, gfp_t
> priority,
>   ^
> This change in the kernel was added in the following commit:
> 
> Linux: 11aa9c28 ("net: Pass kern from net_proto_family.create to
> sk_alloc")
> 
> Signed-off-by: Pablo de Lara 
> ---
>  lib/librte_eal/linuxapp/kni/kni_vhost.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/lib/librte_eal/linuxapp/kni/kni_vhost.c
> b/lib/librte_eal/linuxapp/kni/kni_vhost.c
> index d0c12a6..2346ff3 100644
> --- a/lib/librte_eal/linuxapp/kni/kni_vhost.c
> +++ b/lib/librte_eal/linuxapp/kni/kni_vhost.c
> @@ -666,9 +666,15 @@ kni_vhost_backend_init(struct kni_dev *kni)
>   if (kni->vhost_queue != NULL)
>   return -1;
> 
> +#if LINUX_VERSION_CODE >= KERNEL_VERSION(4,2,0)
> + if (!(q = (struct kni_vhost_queue *)sk_alloc(
> +   net, AF_UNSPEC, GFP_KERNEL, &kni_raw_proto, 0)))
> Is this a kernel socket or else?

Yes, that's for a kernel socket. Should we put a 1? In most parts of the kernel,
"kern" is passed, but we don't have such a variable. In others, they pass 0.

Thanks,
Pablo
> 
> /Helin
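
For readers unfamiliar with the version guard used in the patch, the following is a minimal user-space sketch of how the comparison works. The macro mirrors the classic definition from linux/version.h; SKETCH_KERNEL_VERSION and sk_alloc_arg_count() are hypothetical names introduced here for illustration and are not part of the kernel or of the patch. The argument counts come from the compiler error and the Linux commit quoted above.

```c
/* User-space sketch only: mirrors the classic <linux/version.h> macro. */
#define SKETCH_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/*
 * Hypothetical helper: number of arguments sk_alloc() expects.
 * Kernels >= 4.2.0 take an extra "kern" flag (Linux commit 11aa9c28).
 */
static int sk_alloc_arg_count(int version_code)
{
	if (version_code >= SKETCH_KERNEL_VERSION(4, 2, 0))
		return 5; /* net, family, priority, prot, kern */
	return 4;         /* net, family, priority, prot */
}
```

Because the version code packs major, minor and patch level into one integer, a single >= comparison is enough to select the right call form at compile time.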



[dpdk-dev] [PATCH] doc: update release note for i40e base driver update

2015-11-16 Thread Mcnamara, John
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jingjing Wu
> Sent: Tuesday, October 27, 2015 8:35 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] doc: update release note for i40e base driver
> update
> 
> Signed-off-by: Jingjing Wu 

Acked-by: John McNamara 



[dpdk-dev] [PATCH 1/4] e1000: remove crc size from all byte counters

2015-11-16 Thread Stephen Hemminger
On Mon, 16 Nov 2015 10:35:14 +
Harry van Haaren  wrote:

> This patch removes the crc bytes from byte counter statistics.
> 
> Signed-off-by: Harry van Haaren 
> ---
>  drivers/net/e1000/igb_ethdev.c | 19 +--
>  1 file changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
> index 88995b0..0ad7341 100644
> --- a/drivers/net/e1000/igb_ethdev.c
> +++ b/drivers/net/e1000/igb_ethdev.c
> @@ -1480,6 +1480,13 @@ igb_read_stats_registers(struct e1000_hw *hw, struct 
> e1000_hw_stats *stats)
>  {
>   int pause_frames;
>  
> + uint64_t old_gprc  = stats->gprc;
> + uint64_t old_gptc  = stats->gptc;
> + uint64_t old_tpr   = stats->tpr;
> + uint64_t old_tpt   = stats->tpt;
> + uint64_t old_rpthc = stats->rpthc;
> + uint64_t old_hgptc = stats->hgptc;
> +
>   if(hw->phy.media_type == e1000_media_type_copper ||
>   (E1000_READ_REG(hw, E1000_STATUS) & E1000_STATUS_LU)) {
>   stats->symerrs +=
> @@ -1521,10 +1528,13 @@ igb_read_stats_registers(struct e1000_hw *hw, struct 
> e1000_hw_stats *stats)
>   /* For the 64-bit byte counters the low dword must be read first. */
>   /* Both registers clear on the read of the high dword */
>  
> + /* Workaround CRC bytes included in size, take away 4 bytes/packet */
>   stats->gorc += E1000_READ_REG(hw, E1000_GORCL);
>   stats->gorc += ((uint64_t)E1000_READ_REG(hw, E1000_GORCH) << 32);
> + stats->gorc -= (stats->gprc - old_gprc) * 4;
>   stats->gotc += E1000_READ_REG(hw, E1000_GOTCL);
>   stats->gotc += ((uint64_t)E1000_READ_REG(hw, E1000_GOTCH) << 32);
> + stats->gotc -= (stats->gptc - old_gptc) * 4;
>  
>   stats->rnbc += E1000_READ_REG(hw, E1000_RNBC);
>   stats->ruc += E1000_READ_REG(hw, E1000_RUC);
> @@ -1532,13 +1542,16 @@ igb_read_stats_registers(struct e1000_hw *hw, struct 
> e1000_hw_stats *stats)
>   stats->roc += E1000_READ_REG(hw, E1000_ROC);
>   stats->rjc += E1000_READ_REG(hw, E1000_RJC);
>  
> + stats->tpr += E1000_READ_REG(hw, E1000_TPR);
> + stats->tpt += E1000_READ_REG(hw, E1000_TPT);
> +
>   stats->tor += E1000_READ_REG(hw, E1000_TORL);
>   stats->tor += ((uint64_t)E1000_READ_REG(hw, E1000_TORH) << 32);
> + stats->tor -= (stats->tpr - old_tpr) * 4;

Why not use ETHER_CRC_LEN rather than magic # 4?



[dpdk-dev] [PATCH 1/4] e1000: remove crc size from all byte counters

2015-11-16 Thread Van Haaren, Harry
> From: Stephen Hemminger [mailto:shemming at brocade.com]
> Harry van Haaren  wrote:
> 
> > +   stats->tor -= (stats->tpr - old_tpr) * 4;
> 
> Why not use ETHER_CRC_LEN rather than magic # 4?

That would work too. Will respin tomorrow to fix and to give time
for other comments.

Thanks, -Harry
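
A rough sketch of the adjustment under discussion, using a named constant instead of the literal 4. This is not the respun patch: SKETCH_ETHER_CRC_LEN is a local stand-in assumed to equal DPDK's ETHER_CRC_LEN, and strip_crc_bytes() with its parameter names is hypothetical.

```c
#include <stdint.h>

/* Assumption: same value as DPDK's ETHER_CRC_LEN (4 bytes of Ethernet FCS). */
#define SKETCH_ETHER_CRC_LEN 4

/*
 * Subtract the CRC bytes the hardware counted for the packets seen
 * since the previous read. All names here are illustrative.
 */
static uint64_t
strip_crc_bytes(uint64_t byte_count, uint64_t pkts_now, uint64_t pkts_before)
{
	return byte_count - (pkts_now - pkts_before) * SKETCH_ETHER_CRC_LEN;
}
```

For example, 100 new packets remove 400 bytes from the accumulated byte counter.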


[dpdk-dev] Recent changes related to interrupt thread

2015-11-16 Thread Stephen Hemminger
On Mon, 16 Nov 2015 14:48:42 +0100
Thomas Monjalon  wrote:

> Hi,
> 
> 2015-11-16 18:02, Rahul Lakkireddy:
> > Hi,
> > 
> > I notice that the following changeset:
> > 
> > Fixes: fd6949c55c9a ("eal: fix io permission for virtio interrupt
> > handler")
> > 
> > has moved the initialization of the interrupt thread to after the master
> > lcore has been initialized.  However, this causes the interrupt thread
> > to _inherit_ the affinity of the master lcore. Hence, this seems to
> > make all interrupts to be handled by _only_ the master lcore. Because
> > of this change, it seems that now alarm interrupts would also be handled
> > by master lcore only, IIUC.
> > 
> > We are seeing a performance regression for cxgbe PMD after this commit
> > since, cxgbe PMD relies on alarm to periodically transmit pending
> > coalesced packets.
> > 
> > Also, this perf degradation is only seen if there's a queue allocated
> > on the master lcore, such as in l3fwd app.  If the master lcore has
> > been skipped, then no degradation in perf is seen since only the alarm
> > will run on the master lcore.
> > 
> > So, is the change done to make all interrupts, including alarm
> > interrupts, be handled by _only_ the master lcore intended?
> 
> No it was not intended. The idea was to inherit settings (iopl) from
> the device initialization into the interrupt thread.
> Though a DPDK driver is not really supposed to rely on interrupt performance.
> So having interrupts managed on any core was more or less a side effect.
> 
> > BTW, I have tried setting the affinity to all cpus instead in
> > eal_intr_init() and this seems to restore the perf back. Perhaps it's
> > better to move the master lcore initialization to after the interrupt
> > thread has been initialized as well? Thoughts?
> 
> Yes, i think it's possible.
> We can also imagine a command line option to set the interrupt affinity
> with a default which mimics the old behaviour.
> 
> In order to make this conversation clearer, and for later references,
> below is the DPDK init call tree:
> 

With the new interrupt mode, the interrupt thread needs some rework anyway.
Ideally, there would be multiple interrupt threads, one per core;
then use SMP affinity to align the MSI-x interrupt for the device queue
to run on the core that is processing that queue.

This would require new APIs to do SMP affinity, a wrapper around /proc/irq,
and an API to tell DPDK which lcore is going to process an RX (and TX)
queue.





[dpdk-dev] [PATCH] vfio: Include No-IOMMU mode

2015-11-16 Thread Avi Kivity
On 11/16/2015 07:06 PM, Alex Williamson wrote:
> On Wed, 2015-10-28 at 15:21 -0600, Alex Williamson wrote:
>> There is really no way to safely give a user full access to a DMA
>> capable device without an IOMMU to protect the host system.  There is
>> also no way to provide DMA translation, for use cases such as device
>> assignment to virtual machines.  However, there are still those users
>> that want userspace drivers even under those conditions.  The UIO
>> driver exists for this use case, but does not provide the degree of
>> device access and programming that VFIO has.  In an effort to avoid
>> code duplication, this introduces a No-IOMMU mode for VFIO.
>>
>> This mode requires building VFIO with CONFIG_VFIO_NOIOMMU and enabling
>> the "enable_unsafe_noiommu_mode" option on the vfio driver.  This
>> should make it very clear that this mode is not safe.  Additionally,
>> CAP_SYS_RAWIO privileges are necessary to work with groups and
>> containers using this mode.  Groups making use of this support are
>> named /dev/vfio/noiommu-$GROUP and can only make use of the special
>> VFIO_NOIOMMU_IOMMU for the container.  Use of this mode, specifically
>> binding a device without a native IOMMU group to a VFIO bus driver
>> will taint the kernel and should therefore not be considered
>> supported.  This patch includes no-iommu support for the vfio-pci bus
>> driver only.
>>
>> Signed-off-by: Alex Williamson 
>> ---
>>
>> This is pretty well the same as RFCv2, I've changed the pr_warn to a
>> dev_warn and added another, printing the pid and comm of the task when
>> it actually opens the device.  If Stephen can port the driver code
>> over and prove that this actually works sometime next week, and there
>> aren't any objections to this code, I'll include it in a pull request
>> for the next merge window.  MST, I dropped your ack due to the
>> changes, but I'll be happy to add it back if you like.  Thanks,
>>
>> Alex
>>
>>   drivers/vfio/Kconfig|   15 +++
>>   drivers/vfio/pci/vfio_pci.c |8 +-
>>   drivers/vfio/vfio.c |  186 
>> ++-
>>   include/linux/vfio.h|3 +
>>   include/uapi/linux/vfio.h   |7 ++
>>   5 files changed, 209 insertions(+), 10 deletions(-)
> FYI, this is now in v4.4-rc1 (the slightly modified v2 version).  I want
> to give fair warning though that while we seem to agree on this idea, it
> hasn't been proven with a userspace driver port.  I've opted to include
> it in this merge window rather than delaying it until v4.5, but I really
> need to see a user for this before the end of the v4.4 cycle or I think
> we'll need to revert and revisit for v4.5 anyway.  I don't really have
> any interest in adding and maintaining code that has no users.  Please
> keep me informed of progress with a dpdk port.  Thanks,
>
>

Thanks Alex.  Copying the dpdk mailing list, where the users live.

dpdk-ers: vfio-noiommu is a replacement for uio_pci_generic and
igb_uio.  It supports MSI-X and so can be used on SR-IOV VF devices.
The intent is that you can use dpdk without an external module, using
vfio, whether you are on bare metal with an iommu, bare metal without an
iommu, or virtualized.  However, dpdk needs modification to support this.



[dpdk-dev] Recent changes related to interrupt thread

2015-11-16 Thread Ananyev, Konstantin


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen Hemminger
> Sent: Monday, November 16, 2015 5:07 PM
> To: Thomas Monjalon
> Cc: dev at dpdk.org; Nirranjan Kirubaharan; Felix Marti; Kumar Sanghvi
> Subject: Re: [dpdk-dev] Recent changes related to interrupt thread
> 
> On Mon, 16 Nov 2015 14:48:42 +0100
> Thomas Monjalon  wrote:
> 
> > Hi,
> >
> > 2015-11-16 18:02, Rahul Lakkireddy:
> > > Hi,
> > >
> > > I notice that the following changeset:
> > >
> > > Fixes: fd6949c55c9a ("eal: fix io permission for virtio interrupt
> > > handler")
> > >
> > > has moved the initialization of the interrupt thread to after the master
> > > lcore has been initialized.  However, this causes the interrupt thread
> > > to _inherit_ the affinity of the master lcore. Hence, this seems to
> > > make all interrupts to be handled by _only_ the master lcore. Because
> > > of this change, it seems that now alarm interrupts would also be handled
> > > by master lcore only, IIUC.
> > >
> > > We are seeing a performance regression for cxgbe PMD after this commit
> > > since, cxgbe PMD relies on alarm to periodically transmit pending
> > > coalesced packets.
> > >
> > > Also, this perf degradation is only seen if there's a queue allocated
> > > on the master lcore, such as in l3fwd app.  If the master lcore has
> > > been skipped, then no degradation in perf is seen since only the alarm
> > > will run on the master lcore.
> > >
> > > So, is the change done to make all interrupts, including alarm
> > > interrupts, be handled by _only_ the master lcore intended?
> >
> > No it was not intended. The idea was to inherit settings (iopl) from
> > the device initialization into the interrupt thread.
> > Though a DPDK driver is not really supposed to rely on interrupt 
> > performance.
> > So having interrupts managed on any core was more or less a side effect.
> >
> > > BTW, I have tried setting the affinity to all cpus instead in
> > > eal_intr_init() and this seems to restore the perf back. Perhaps it's
> > > better to move the master lcore initialization to after the interrupt
> > > thread has been initialized as well? Thoughts?
> >
> > Yes, i think it's possible.
> > We can also imagine a command line option to set the interrupt affinity
> > with a default which mimics the old behaviour.
> >
> > In order to make this conversation clearer, and for later references,
> > below is the DPDK init call tree:
> >
> 
> With the new interrupt mode, the interrupt thread needs some rework anyway.
> Ideally, there would be multiple interrupt threads, one per core;
> then use SMP affinity to align the MSI-x interrupt for the device queue
> to run on the core that is processing that queue.
> 
> This would require new APIs to do SMP affinity, a wrapper around /proc/irq,
> and an API to tell DPDK which lcore is going to process an RX (and TX)
> queue.

There is no one to one mapping between lcore and device queue.
Any lcore can do RX/TX on the device queue.
Of course it is preferable to do it from the core on the same socket, but not 
required.
You can even have multiple threads  RX/TX from/to the same queue -
as long as you provide some sync mechanism between them.
Konstantin 

> 
> 



[dpdk-dev] Recent changes related to interrupt thread

2015-11-16 Thread Stephen Hemminger
I was thinking of something like:

rte_intr_affinity(portid, queueid, lcoreid)

And per-lcore interrupt threads.
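
To make the proposal concrete, here is a minimal sketch of the bookkeeping such a call might need. rte_intr_affinity() is only a suggestion in this thread, so nothing below is an existing DPDK API; every name carries a sketch_ prefix to mark it as hypothetical, and the table sizes are arbitrary.

```c
#define SKETCH_MAX_PORTS  4
#define SKETCH_MAX_QUEUES 8

/* Preferred lcore per (port, queue); 0 means "no hint", else lcore + 1. */
static unsigned sketch_affinity[SKETCH_MAX_PORTS][SKETCH_MAX_QUEUES];

/* Record which lcore should service interrupts for this queue. */
static int
sketch_intr_affinity_set(unsigned port, unsigned queue, unsigned lcore)
{
	if (port >= SKETCH_MAX_PORTS || queue >= SKETCH_MAX_QUEUES)
		return -1;
	sketch_affinity[port][queue] = lcore + 1;
	return 0;
}

/* Return the hinted lcore, or -1 if none was set or the index is invalid. */
static int
sketch_intr_affinity_get(unsigned port, unsigned queue)
{
	if (port >= SKETCH_MAX_PORTS || queue >= SKETCH_MAX_QUEUES)
		return -1;
	return (int)sketch_affinity[port][queue] - 1;
}
```

A per-lcore interrupt thread could consult such a table to decide which queues it owns. The hint stays optional, so a queue with no entry can still be polled from any lcore.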

On Mon, Nov 16, 2015 at 9:19 AM, Ananyev, Konstantin <
konstantin.ananyev at intel.com> wrote:

>
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen Hemminger
> > Sent: Monday, November 16, 2015 5:07 PM
> > To: Thomas Monjalon
> > Cc: dev at dpdk.org; Nirranjan Kirubaharan; Felix Marti; Kumar Sanghvi
> > Subject: Re: [dpdk-dev] Recent changes related to interrupt thread
> >
> > On Mon, 16 Nov 2015 14:48:42 +0100
> > Thomas Monjalon  wrote:
> >
> > > Hi,
> > >
> > > 2015-11-16 18:02, Rahul Lakkireddy:
> > > > Hi,
> > > >
> > > > I notice that the following changeset:
> > > >
> > > > Fixes: fd6949c55c9a ("eal: fix io permission for virtio interrupt
> > > > handler")
> > > >
> > > > has moved the initialization of the interrupt thread to after the
> master
> > > > lcore has been initialized.  However, this causes the interrupt
> thread
> > > > to _inherit_ the affinity of the master lcore. Hence, this seems to
> > > > make all interrupts to be handled by _only_ the master lcore. Because
> > > > of this change, it seems that now alarm interrupts would also be
> handled
> > > > by master lcore only, IIUC.
> > > >
> > > > We are seeing a performance regression for cxgbe PMD after this
> commit
> > > > since, cxgbe PMD relies on alarm to periodically transmit pending
> > > > coalesced packets.
> > > >
> > > > Also, this perf degradation is only seen if there's a queue allocated
> > > > on the master lcore, such as in l3fwd app.  If the master lcore has
> > > > been skipped, then no degradation in perf is seen since only the
> alarm
> > > > will run on the master lcore.
> > > >
> > > > So, is the change done to make all interrupts, including alarm
> > > > interrupts, be handled by _only_ the master lcore intended?
> > >
> > > No it was not intended. The idea was to inherit settings (iopl) from
> > > the device initialization into the interrupt thread.
> > > Though a DPDK driver is not really supposed to rely on interrupt
> performance.
> > > So having interrupts managed on any core was more or less a side
> effect.
> > >
> > > > BTW, I have tried setting the affinity to all cpus instead in
> > > > eal_intr_init() and this seems to restore the perf back. Perhaps it's
> > > > better to move the master lcore initialization to after the interrupt
> > > > thread has been initialized as well? Thoughts?
> > >
> > > Yes, i think it's possible.
> > > We can also imagine a command line option to set the interrupt affinity
> > > with a default which mimics the old behaviour.
> > >
> > > In order to make this conversation clearer, and for later references,
> > > below is the DPDK init call tree:
> > >
> >
> > With the new interrupt mode, the interrupt thread needs some rework
> anyway.
> > Ideally, there would be multiple interrupt threads, one per core;
> > then use SMP affinity to align the MSI-x interrupt for the device queue
> > to run on the core that is processing that queue.
> >
> > This would require new APIs to do SMP affinity, a wrapper around /proc/irq,
> > and an API to tell DPDK which lcore is going to process an RX (and TX)
> > queue.
>
> There is no one to one mapping between lcore and device queue.
> Any lcore can do RX/TX on the device queue.
> Of course it is preferable to do it from the core on the same socket, but
> not required.
> You can even have multiple threads  RX/TX from/to the same queue -
> as long as you provide some sync mechanism between them.
> Konstantin
>
> >
> >
>
>


[dpdk-dev] [PATCH] l3fwd: Fix l3fwd crash due to unaligned load/store intrinsics

2015-11-16 Thread Harish Patil
>

>
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of
>>harish.patil at qlogic.com
>> Sent: Sunday, November 08, 2015 7:40 PM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH] l3fwd: Fix l3fwd crash due to unaligned
>>load/store intrinsics
>>
>> From: Harish Patil 
>>
>> l3fwd app expects PMDs to return packets whose L2 header is
>> 16-byte aligned due to usage of _mm_load_si128()/_mm_store_si128()
>> intrinsics in the app. However, most protocol stacks expect
>> packets whose IP/L3 header is aligned on a 16-byte boundary.
>>
>> Based on the recommendations received on dpdk-dev, we are changing
>> the l3fwd app to use _mm_loadu_si128()/_mm_storeu_si128() so that the
>> address need not be 16-byte aligned, thereby preventing the crash.
>> We have tested that there is no performance impact due to this
>> change.
>>
>> Signed-off-by: Harish Patil 
>> ---
>
>Acked-by: Konstantin Ananyev 
>
>As a side note:
>In fact with a gcc build I do see a slight regression: ~1%
>for 4 ports over 1 core test-case.
>Though I think the problem is not in the patch itself.
>For some reason unknown to me, gcc treats aligned and unaligned load/store
>intrinsics in a different way (at least for that particular case).
>With aligned load/store in use it generates code that is pretty close to
>the source:
> 4 loads first, then 4 BLENDs, then 4  stores  (with some interfering
>scalar instructions of course).
>But with unaligned ones  gcc starts to mix loads and blends for the same
>register, so now it is:
>load x0; blend x0; load x1; blend x1; ..
>As if the source code was:
>
>te[0] = _mm_loadu_si128(p[0]);
>te[0] =  _mm_blend_epi16(te[0], ve[0], MASK_ETH);
>te[1] = _mm_loadu_si128(p[1]);
>te[1] =  _mm_blend_epi16(te[1], ve[1], MASK_ETH);
>...
>
>So load latency is not hidden any more.
>I tried it with different versions of gcc - same story for all of them.
>Clang doesn't have such an issue and generates similar code for both
>aligned and unaligned intrinsics.
>
>The only way to fix it that I can think of is to put rte_compiler_barrier()
>just before the first blend intrinsic.
>That helped, now there are no noticeable differences in generated code
>and results before and after the patch.
> So I suppose I'll have to submit a patch after yours to fix that
>problem.
>Konstantin
>
>

Sure, thanks for verifying and providing a fix.

Harish
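
As a small illustration of the difference the patch relies on, the sketch below copies 16 bytes through pointers with no alignment guarantee using the unaligned SSE2 intrinsics. It assumes an x86 target with SSE2; copy16_unaligned() and its self-test are hypothetical helpers and are not taken from l3fwd.

```c
#include <emmintrin.h> /* SSE2: _mm_loadu_si128, _mm_storeu_si128 */
#include <stdint.h>
#include <string.h>

/*
 * Copy 16 bytes between addresses with no alignment guarantee.
 * _mm_load_si128/_mm_store_si128 require 16-byte aligned addresses;
 * the "u" variants accept any address.
 */
static void copy16_unaligned(uint8_t *dst, const uint8_t *src)
{
	__m128i v = _mm_loadu_si128((const __m128i *)src);

	_mm_storeu_si128((__m128i *)dst, v);
}

/* Returns 1 if a copy at an arbitrary 2-byte offset round-trips. */
static int copy16_unaligned_selftest(void)
{
	uint8_t in[32], out[32];
	int i;

	for (i = 0; i < 32; i++) {
		in[i] = (uint8_t)i;
		out[i] = 0;
	}
	/* Offset chosen arbitrarily; no alignment is assumed here. */
	copy16_unaligned(out + 2, in + 2);
	return memcmp(out + 2, in + 2, 16) == 0;
}
```

The sketch only shows correctness of the unaligned forms; it says nothing about the instruction scheduling difference Konstantin describes.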






[dpdk-dev] [PATCH v6 00/15] Support ARMv7 architecture

2015-11-16 Thread David Marchand
Hello,

On Tue, Nov 3, 2015 at 12:47 AM, Jan Viktorin 
wrote:

> Hello DPDK community,
>
> ARMv7 again, changes:
>
> * removed unnecessary code in the #ifndef RTE_FORCE_INTRINSICS .. #endif
> (atomic, spinlock, byteorder)
> * more splitting of headers to have 32/64 bit variants (atomic, cpuflags)
> * fixed cpuflags AT_PLATFORM
>
> Other details in the individual commits as usual.
>
> ---
> [snip]
> ---
>
> Jan Viktorin (7):
>   eal/arm: implement rdtsc by PMU or clock_gettime
>   eal/arm: use vector memcpy only when NEON is enabled
>   eal/arm: detect arm architecture in cpu flags
>   eal/arm: rwlock support for ARM
>   eal/arm: add very incomplete rte_vect
>   gcc/arm: avoid alignment errors to break build
>   maintainers: claim responsibility for ARMv7
>
> Vlastimil Kosar (8):
>   eal/arm: atomic operations for ARM
>   eal/arm: byte order operations for ARM
>   eal/arm: cpu cycle operations for ARM
>   eal/arm: prefetch operations for ARM
>   eal/arm: spinlock operations for ARM (without HTM)
>   eal/arm: vector memcpy for ARM
>   eal/arm: cpu flag checks for ARM
>   mk: Introduce ARMv7 architecture
>
>
Looks good to me.
Acked-by: David Marchand 

Thanks Jan.


-- 
David Marchand


[dpdk-dev] [PATCH v7 4/8] vhost: rxtx: use queue id instead of constant ring index

2015-11-16 Thread Flavio Leitner
On Wed, Oct 28, 2015 at 11:12:25PM +0200, Michael S. Tsirkin wrote:
> On Wed, Oct 28, 2015 at 06:30:41PM -0200, Flavio Leitner wrote:
> > On Sat, Oct 24, 2015 at 08:47:10PM +0300, Michael S. Tsirkin wrote:
> > > On Sat, Oct 24, 2015 at 12:34:08AM -0200, Flavio Leitner wrote:
> > > > On Thu, Oct 22, 2015 at 02:32:31PM +0300, Michael S. Tsirkin wrote:
> > > > > On Thu, Oct 22, 2015 at 05:49:55PM +0800, Yuanhan Liu wrote:
> > > > > > On Wed, Oct 21, 2015 at 05:26:18PM +0300, Michael S. Tsirkin wrote:
> > > > > > > On Wed, Oct 21, 2015 at 08:48:15PM +0800, Yuanhan Liu wrote:
> > > > > > > > > Please note that for virtio devices, guest is supposed to
> > > > > > > > > control the placement of incoming packets in RX queues.
> > > > > > > > 
> > > > > > > > I may not follow you.
> > > > > > > > 
> > > > > > > > Enqueuing packets to a RX queue is done at vhost lib, outside the
> > > > > > > > guest, how could the guest take the control here?
> > > > > > > > 
> > > > > > > > --yliu
> > > > > > > 
> > > > > > > vhost should do what guest told it to.
> > > > > > > 
> > > > > > > See virtio spec:
> > > > > > >   5.1.6.5.5 Automatic receive steering in multiqueue mode
> > > > > > 
> > > > > > Spec says:
> > > > > > 
> > > > > > After the driver transmitted a packet of a flow on transmitqX,
> > > > > > the device SHOULD cause incoming packets for that flow to be
> > > > > > steered to receiveqX.
> > > > > > 
> > > > > > 
> > > > > > Michael, I still have no idea how vhost could know the flow even
> > > > > > after discussion with Huawei. Could you be more specific about
> > > > > > this? Say, how could guest know that? And how could guest tell
> > > > > > vhost which RX queue it is going to use?
> > > > > > 
> > > > > > Thanks.
> > > > > > 
> > > > > > --yliu
> > > > > 
> > > > > I don't really understand the question.
> > > > > 
> > > > > When a guest transmits a packet, it makes a decision
> > > > > about the flow to use, and maps that to a tx/rx pair of queues.
> > > > > 
> > > > > It sends packets out on the tx queue and expects device to
> > > > > return packets from the same flow on the rx queue.
> > > > 
> > > > Why? I can understand that there should be a mapping between
> > > > flows and queues in a way that there is no re-ordering, but
> > > > I can't see the relation of receiving a flow with a TX queue.
> > > > 
> > > > fbl
> > > 
> > > That's the way virtio chose to program the rx steering logic.
> > > 
> > > It's low overhead (no special commands), and
> > > works well for TCP when user is an endpoint since rx and tx
> > > for tcp are generally tied (because of ack handling).

It is low overhead for the control plane, but not for the data plane.

> > > We can discuss other ways, e.g. special commands for guests to
> > > program steering.
> > > We'd have to first see some data showing the current scheme
> > > is problematic somehow.

The issue is that the spec assumes the packets are coming in
a serialized way and the distribution will be made by vhost-user
but that isn't necessarily true.


> > The issue is that the restriction imposes operations to be done in the
> > data path.  For instance, Open vSwitch has N number of threads to manage
> > X RX queues. We distribute them in round-robin fashion.  So, the thread
> > polling one RX queue will do all the packet processing and push it to the
> > TX queue of the other device (vhost-user or not) using the same 'id'.
> > 
> > Doing so we can avoid locking between threads and TX queues and any other
> > extra computation while still keeping the packet ordering/distribution fine.
> > 
> > However, if vhost-user has to send packets according to the guest mapping,
> > it will require locking between queues and additional operations to select
> > the appropriate queue.  Those actions will cause performance issues.
> 
> You only need to send updates if guest moves a flow to another queue.
> This is very rare since guest must avoid reordering.

OK, maybe I missed something.  Could you point me to the spec talking
about the update?


> Oh and you don't have to have locking.  Just update the table and make
> the target pick up the new value at leisure, worst case a packet ends up
> in the wrong queue.

You do because packets are coming on different vswitch queues and they
could get mapped to the same virtio queue enforced by the guest, so some
sort of synchronization is needed.

That is one thing.  Another is that it will need some mapping between the
hash available in the vswitch (not necessarily L2-L4) and the hash/queue
mapping provided by the guest.  That doesn't require locking, but it's a
costly operation.  Alternatively, vswitch could calculate full L2-L4 hash
which is also a costly operation.

Packets ending up in the wrong queue aren't that bad, but then we need to
enforce processing order because re-ordering is really bad.
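
For reference, the steering rule quoted from the virtio spec, combined with the "just update the table" suggestion, can be sketched as a small flow table. This is only an illustration under stated assumptions: the table size, the use of a precomputed flow hash, and all sketch_ names are hypothetical, and the sketch deliberately ignores the synchronization and hash-cost concerns raised above.

```c
#include <stdint.h>

#define SKETCH_FLOW_SLOTS 256 /* illustrative table size */

/* Last queue pair seen per flow-hash bucket; defaults to queue pair 0. */
static uint16_t sketch_flow_to_queue[SKETCH_FLOW_SLOTS];

/* Called when the guest transmits a packet of this flow on transmitqX. */
static void sketch_note_guest_tx(uint32_t flow_hash, uint16_t txq)
{
	sketch_flow_to_queue[flow_hash % SKETCH_FLOW_SLOTS] = txq;
}

/* Called by the backend to pick receiveqX for an incoming packet. */
static uint16_t sketch_pick_rxq(uint32_t flow_hash)
{
	return sketch_flow_to_queue[flow_hash % SKETCH_FLOW_SLOTS];
}
```

Two flows whose hashes land in the same bucket share an entry, so the most recent transmit wins. That is one way a packet can end up in the "wrong" queue without any locking.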


> > I see no real benefit from enforcing the guest mapping outside to
> > justify all the computation cost, so my

[dpdk-dev] How to approach packet TX lockups

2015-11-16 Thread Matt Laswell
Hey Folks,

I sent this to the users email list, but I'm not sure how many people are
actively reading that list at this point.  I'm dealing with a situation in
which my application loses the ability to transmit packets out of a port
during times of moderate stress.  I'd love to hear suggestions for how to
approach this problem, as I'm a bit at a loss at the moment.

Specifically, I'm using DPDK 1.6r2 running on Ubuntu 14.04LTS on Haswell
processors.  I'm using the 82599 controller, configured to spread packets
across multiple queues.  Each queue is accessed by a different lcore in my
application; there is therefore concurrent access to the controller, but
not to any of the queues.  We're binding the ports to the igb_uio driver.
The symptoms I see are these:


   - All transmit out of a particular port stops
   - rte_eth_tx_burst() indicates that it is sending all of the packets
   that I give to it
   - rte_eth_stats_get() gives me stats indicating that no packets are
   being sent on the affected port.  Also, no tx errors, and no pause frames
   sent or received (opackets = 0, obytes = 0, oerrors = 0, etc.)
   - All other ports continue to work normally
   - The affected port continues to receive packets without problems; only
   TX is affected
   - Resetting the port via rte_eth_dev_stop() and rte_eth_dev_start()
   restores things and packets can flow again
   - The problem is replicable on multiple devices, and doesn't follow one
   particular port

I've tried calling rte_mbuf_sanity_check() on all packets before sending
them.  I've also instrumented my code to look for packets that have already
been sent or freed, as well as cycles in chained packets being sent.  I
also put a lock around all accesses to rte_eth* calls to synchronize access
to the NIC.  Given some recent discussion here, I also tried changing the
TX RS threshold from 0 to 32, 16, and 1.  None of these strategies proved
effective.

Like I said at the top, I'm a little at a loss at this point.  If you were
dealing with this set of symptoms, how would you proceed?

Thanks in advance.

--
Matt Laswell
infinite io, inc.
laswell at infiniteio.com
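
One way to at least detect the condition described above is to compare what rte_eth_tx_burst() claims to have accepted with the opackets counter from rte_eth_stats_get(). The sketch below shows only that comparison logic with plain integers; struct sketch_tx_watch and sketch_tx_stalled() are hypothetical, the caller is assumed to supply the two counters, and it does not explain the root cause of the lockup.

```c
#include <stdint.h>

struct sketch_tx_watch {
	uint64_t last_accepted; /* running sum of rte_eth_tx_burst() returns */
	uint64_t last_opackets; /* opackets from the previous stats read */
	unsigned stalled_polls; /* consecutive polls with no TX progress */
};

/*
 * Feed one sample; returns 1 once the port has accepted new packets
 * for `limit` consecutive polls while opackets did not move.
 */
static int
sketch_tx_stalled(struct sketch_tx_watch *w, uint64_t accepted,
		  uint64_t opackets, unsigned limit)
{
	int queued_more = accepted != w->last_accepted;
	int hw_progress = opackets != w->last_opackets;

	if (queued_more && !hw_progress)
		w->stalled_polls++;
	else
		w->stalled_polls = 0;

	w->last_accepted = accepted;
	w->last_opackets = opackets;
	return w->stalled_polls >= limit;
}
```

When the function returns 1, an application could log the queue state before falling back to the rte_eth_dev_stop()/rte_eth_dev_start() reset mentioned above.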


[dpdk-dev] How to approach packet TX lockups

2015-11-16 Thread Stephen Hemminger
On Mon, 16 Nov 2015 17:48:35 -0600
Matt Laswell  wrote:

> Hey Folks,
> 
> I sent this to the users email list, but I'm not sure how many people are
> actively reading that list at this point.  I'm dealing with a situation in
> which my application loses the ability to transmit packets out of a port
> during times of moderate stress.  I'd love to hear suggestions for how to
> approach this problem, as I'm a bit at a loss at the moment.
> 
> Specifically, I'm using DPDK 1.6r2 running on Ubuntu 14.04LTS on Haswell
> processors.  I'm using the 82599 controller, configured to spread packets
> across multiple queues.  Each queue is accessed by a different lcore in my
> application; there is therefore concurrent access to the controller, but
> not to any of the queues.  We're binding the ports to the igb_uio driver.
> The symptoms I see are these:
> 
> 
>    - All transmit out of a particular port stops
>    - rte_eth_tx_burst() indicates that it is sending all of the packets
>    that I give to it
>    - rte_eth_stats_get() gives me stats indicating that no packets are
>    being sent on the affected port.  Also, no tx errors, and no pause frames
>    sent or received (opackets = 0, obytes = 0, oerrors = 0, etc.)
>    - All other ports continue to work normally
>    - The affected port continues to receive packets without problems; only
>    TX is affected
>    - Resetting the port via rte_eth_dev_stop() and rte_eth_dev_start()
>    restores things and packets can flow again
>    - The problem is replicable on multiple devices, and doesn't follow one
>    particular port
> 
> I've tried calling rte_mbuf_sanity_check() on all packets before sending
> them.  I've also instrumented my code to look for packets that have already
> been sent or freed, as well as cycles in chained packets being sent.  I
> also put a lock around all accesses to rte_eth* calls to synchronize access
> to the NIC.  Given some recent discussion here, I also tried changing the
> TX RS threshold from 0 to 32, 16, and 1.  None of these strategies proved
> effective.
> 
> Like I said at the top, I'm a little at a loss at this point.  If you were
> dealing with this set of symptoms, how would you proceed?
> 

I remember some issues with old DPDK 1.6 with some of the prefetch
thresholds on 82599. You would be better off going to a later DPDK
version.


[dpdk-dev] How to approach packet TX lockups

2015-11-16 Thread Matt Laswell
Hey Stephen,

Thanks a lot; that's really useful information.  Unfortunately, I'm at a
stage in our release cycle where upgrading to a new version of DPDK isn't
feasible.  Any chance you (or others reading this) have a pointer to the
relevant changes?  While I can't afford to upgrade DPDK entirely,
backporting targeted fixes is more doable.

Again, thanks.

- Matt


On Mon, Nov 16, 2015 at 6:12 PM, Stephen Hemminger <
stephen at networkplumber.org> wrote:

> On Mon, 16 Nov 2015 17:48:35 -0600
> Matt Laswell  wrote:
>
> > Hey Folks,
> >
> > I sent this to the users email list, but I'm not sure how many people are
> > actively reading that list at this point.  I'm dealing with a situation in
> > which my application loses the ability to transmit packets out of a port
> > during times of moderate stress.  I'd love to hear suggestions for how to
> > approach this problem, as I'm a bit at a loss at the moment.
> >
> > Specifically, I'm using DPDK 1.6r2 running on Ubuntu 14.04LTS on Haswell
> > processors.  I'm using the 82599 controller, configured to spread packets
> > across multiple queues.  Each queue is accessed by a different lcore in my
> > application; there is therefore concurrent access to the controller, but
> > not to any of the queues.  We're binding the ports to the igb_uio driver.
> > The symptoms I see are these:
> >
> >
> >    - All transmit out of a particular port stops
> >    - rte_eth_tx_burst() indicates that it is sending all of the packets
> >    that I give to it
> >    - rte_eth_stats_get() gives me stats indicating that no packets are
> >    being sent on the affected port.  Also, no tx errors, and no pause frames
> >    sent or received (opackets = 0, obytes = 0, oerrors = 0, etc.)
> >    - All other ports continue to work normally
> >    - The affected port continues to receive packets without problems; only
> >    TX is affected
> >    - Resetting the port via rte_eth_dev_stop() and rte_eth_dev_start()
> >    restores things and packets can flow again
> >    - The problem is replicable on multiple devices, and doesn't follow one
> >    particular port
> >
> > I've tried calling rte_mbuf_sanity_check() on all packets before sending
> > them.  I've also instrumented my code to look for packets that have already
> > been sent or freed, as well as cycles in chained packets being sent.  I
> > also put a lock around all accesses to rte_eth* calls to synchronize access
> > to the NIC.  Given some recent discussion here, I also tried changing the
> > TX RS threshold from 0 to 32, 16, and 1.  None of these strategies proved
> > effective.
> >
> > Like I said at the top, I'm a little at a loss at this point.  If you were
> > dealing with this set of symptoms, how would you proceed?
> >
>
> I remember some issues with old DPDK 1.6 with some of the prefetch
> thresholds on 82599. You would be better off going to a later DPDK
> version.
>


[dpdk-dev] How to approach packet TX lockups

2015-11-16 Thread Stephen Hemminger
On Mon, 16 Nov 2015 18:49:15 -0600
Matt Laswell  wrote:

> Hey Stephen,
> 
> Thanks a lot; that's really useful information.  Unfortunately, I'm at a
> stage in our release cycle where upgrading to a new version of DPDK isn't
> feasible.  Any chance you (or others reading this) have a pointer to the
> relevant changes?  While I can't afford to upgrade DPDK entirely,
> backporting targeted fixes is more doable.
> 
> Again, thanks.
> 
> - Matt
> 
> 
> On Mon, Nov 16, 2015 at 6:12 PM, Stephen Hemminger <
> stephen at networkplumber.org> wrote:
> 
> > On Mon, 16 Nov 2015 17:48:35 -0600
> > Matt Laswell  wrote:
> >
> > > Hey Folks,
> > >
> > > I sent this to the users email list, but I'm not sure how many people are
> > > actively reading that list at this point.  I'm dealing with a situation in
> > > which my application loses the ability to transmit packets out of a port
> > > during times of moderate stress.  I'd love to hear suggestions for how to
> > > approach this problem, as I'm a bit at a loss at the moment.
> > >
> > > Specifically, I'm using DPDK 1.6r2 running on Ubuntu 14.04LTS on Haswell
> > > processors.  I'm using the 82599 controller, configured to spread packets
> > > across multiple queues.  Each queue is accessed by a different lcore in my
> > > application; there is therefore concurrent access to the controller, but
> > > not to any of the queues.  We're binding the ports to the igb_uio driver.
> > > The symptoms I see are these:
> > >
> > >
> > >    - All transmit out of a particular port stops
> > >    - rte_eth_tx_burst() indicates that it is sending all of the packets
> > >    that I give to it
> > >    - rte_eth_stats_get() gives me stats indicating that no packets are
> > >    being sent on the affected port.  Also, no tx errors, and no pause frames
> > >    sent or received (opackets = 0, obytes = 0, oerrors = 0, etc.)
> > >    - All other ports continue to work normally
> > >    - The affected port continues to receive packets without problems; only
> > >    TX is affected
> > >    - Resetting the port via rte_eth_dev_stop() and rte_eth_dev_start()
> > >    restores things and packets can flow again
> > >    - The problem is replicable on multiple devices, and doesn't follow one
> > >    particular port
> > >
> > > I've tried calling rte_mbuf_sanity_check() on all packets before sending
> > > them.  I've also instrumented my code to look for packets that have already
> > > been sent or freed, as well as cycles in chained packets being sent.  I
> > > also put a lock around all accesses to rte_eth* calls to synchronize access
> > > to the NIC.  Given some recent discussion here, I also tried changing the
> > > TX RS threshold from 0 to 32, 16, and 1.  None of these strategies proved
> > > effective.
> > >
> > > Like I said at the top, I'm a little at a loss at this point.  If you were
> > > dealing with this set of symptoms, how would you proceed?
> > >
> >
> > I remember some issues with old DPDK 1.6 with some of the prefetch
> > thresholds on 82599. You would be better off going to a later DPDK
> > version.
> >

I hope you are on 1.6.0r2 at least??

With older DPDK there was no way to get the driver to tell you what the
preferred settings were for pthresh/hthresh/wthresh. And the values
in the Intel sample applications were broken on some hardware.

I remember reverse engineering the safe values from reading the Linux driver.

The Linux driver is much better tested than the DPDK one...
In the Linux driver, the Transmit Descriptor Control register (txdctl)
is fixed (for transmit) at:
   wthresh = 1
   hthresh = 1
   pthresh = 32

The DPDK 2.2 driver uses:
   wthresh = 0
   hthresh = 0
   pthresh = 32
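
To make the comparison concrete, here is a rough sketch of how the two sets of thresholds would pack into a TXDCTL value. The field layout (PTHRESH in bits 6:0, HTHRESH in bits 14:8, WTHRESH in bits 22:16) is my reading of the 82599 datasheet and should be checked against it; the struct and function names are made up for illustration and are not DPDK or Linux driver APIs. The enable bit and any other TXDCTL bits are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 82599 TXDCTL threshold layout (verify against the datasheet). */
#define TXDCTL_THRESH_MASK    0x7Fu  /* each threshold field is 7 bits wide */
#define TXDCTL_HTHRESH_SHIFT  8      /* HTHRESH: bits 14:8  */
#define TXDCTL_WTHRESH_SHIFT  16     /* WTHRESH: bits 22:16 */

/* Hypothetical container for the three TX ring thresholds. */
struct tx_thresh {
    uint8_t pthresh;  /* prefetch threshold   */
    uint8_t hthresh;  /* host threshold       */
    uint8_t wthresh;  /* write-back threshold */
};

/* Values quoted in this thread. */
const struct tx_thresh linux_driver_thresh = { .pthresh = 32, .hthresh = 1, .wthresh = 1 };
const struct tx_thresh dpdk_2_2_thresh     = { .pthresh = 32, .hthresh = 0, .wthresh = 0 };

/* Pack only the threshold fields of TXDCTL. */
uint32_t txdctl_pack_thresh(struct tx_thresh t)
{
    assert(t.pthresh <= TXDCTL_THRESH_MASK);
    assert(t.hthresh <= TXDCTL_THRESH_MASK);
    assert(t.wthresh <= TXDCTL_THRESH_MASK);

    return (uint32_t)t.pthresh
         | ((uint32_t)t.hthresh << TXDCTL_HTHRESH_SHIFT)
         | ((uint32_t)t.wthresh << TXDCTL_WTHRESH_SHIFT);
}
```

In a DPDK application the same three numbers would normally be supplied through the tx_thresh member of struct rte_eth_txconf passed to rte_eth_tx_queue_setup(), rather than written to the register directly.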


[dpdk-dev] How to approach packet TX lockups

2015-11-16 Thread Matthew Hall
On Mon, Nov 16, 2015 at 05:31:29PM -0800, Stephen Hemminger wrote:
> The DPDK 2.2 driver uses:
> wthresh = 0
> hthresh = 0
> pthresh = 32

Stephen,

I thought zero values tell the driver to auto-configure these settings itself?

Matthew.

