Re: [dpdk-dev] [PATCH v2 00/15] next-eventdev: event/sw software eventdev

2017-02-06 Thread Jerin Jacob
On Tue, Jan 31, 2017 at 04:14:18PM +, Harry van Haaren wrote:
> The following patchset adds software eventdev implementation
> to the next-eventdev tree.
> 
> This implementation is based on the previous software eventdev
> v1 patchset, now with comments addressed:
> 1) xstats api return values changed to be consistent
> 2) xstats api [out] added to appropriate values
> 3) xstats api now uses xxx_get() for consistency
> 4) patch names for check-log.sh

Nice to have name the in bracket who suggested it.

> 5) checkpatch issues resolved (where it makes sense to, there are
>certain places where fixing checkpatch makes the code less readable.
>These checkpatch warnings will still show up - I see no alternative)

I agree. Except a few lines in the test code with
rte_event_dev_xstats_by_name_get

> 
> In addition, the following improvements have been made to the patchset:
> 1) Adds test to run automatically with make test
> 2) Rework the sw implementation event credit scheme
> 
> The first two patches make changes to the eventdev API,
> then the software implementation is added, and finally
> tests are added for the sw eventdev implementation.

Two issue found in testing. If it makes sense, fix it in test code or
implementation

1) Running eventdev_common_autotest with event_sw0

sudo ./build/app/test  --vdev='event_sw0'
RTE>>eventdev_common_autotest

TestCase test_eventdev_port_setup() line 437 failed: Expected -EINVAL, 0
 + TestCase [11] : test_eventdev_port_setup failed 
 + TestCase [12] : test_eventdev_dequeue_depth succeeded
 + TestCase [13] : test_eventdev_enqueue_depth succeeded
 + TestCase [14] : test_eventdev_port_count succeeded
TestCase test_eventdev_timeout_ticks() line 522 failed (err -95): Fail
to get timeout_ticks
 + TestCase [15] : test_eventdev_timeout_ticks failed 
sw_start 543: queue 0 not configured
TestCase test_eventdev_start_stop() line 547 failed (err -1): Failed to
start device0
 + TestCase [16] : test_eventdev_start_stop failed 
sw_start 543: queue 0 not configured
TestCase eventdev_setup_device() line 573 failed (err -1): Failed to
start device0
EVENTDEV: rte_event_dev_stop() line 1026: Device with dev_id=0already
stopped
 + TestCase [17] : test_eventdev_link failed 
sw_start 543: queue 0 not configured
TestCase eventdev_setup_device() line 573 failed (err -1): Failed to
start device0
EVENTDEV: rte_event_dev_stop() line 1026: Device with dev_id=0already
stopped
 + TestCase [18] : test_eventdev_unlink failed 
sw_start 543: queue 0 not configured
TestCase eventdev_setup_device() line 573 failed (err -1): Failed to
start device0
EVENTDEV: rte_event_dev_stop() line 1026: Device with dev_id=0already
stopped
 + TestCase [19] : test_eventdev_link_get failed 
sw_start 543: queue 0 not configured
TestCase eventdev_setup_device() line 573 failed (err -1): Failed to
start device0
 + TestCase [20] : test_eventdev_close failed 
PMD: Initializing event_skeleton1 on NUMA node 0


2) back to back eventdev_sw_autotest invocation

RTE>>eventdev_sw_autotest 
1926: Eventdev event_sw0 not found - creating.
PMD: Creating eventdev sw device event_sw0, numa_node=0,
sched_quanta=128, credit_quanta=32
*** Running Single Directed Packet test...
*** Running Single Load Balanced Packet test...
*** Running Unordered Basic test...
*** Running Ordered Basic test...
*** Running Burst Packets test...
*** Running Load Balancing test...
*** Running Prioritized Directed test...
*** Running Prioritized Atomic test...
*** Running Prioritized Ordered test...
*** Running Prioritized Unordered test...
*** Running Invalid QID test...
*** Running Load Balancing History test...
*** Running Inflight Count test...
*** Running Abuse Inflights test...
*** Running QID Priority test...
*** Running Head-of-line-blocking test...
*** Running Worker loopback test...
1791:   Producer function started
1731:   Worker function started
1889:   Sched Rx = 10713728, Tx = 10713248
1889:   Sched Rx = 21422880, Tx = 21422400
1889:   Sched Rx = 32066688, Tx = 32066208
Test OK
RTE>>eventdev_
 eventdev_common_autotest [Mul-choice STRING]: launch autotest
 eventdev_sw_autotest [Mul-choice STRING]: launch autotest
RTE>>eventdev_sw_autotest 
*** Running Single Directed Packet test...
*** Running Single Load Balanced Packet test...
*** Running Unordered Basic test...
*** Running Ordered Basic test...
*** Running Burst Packets test...
*** Running Load Balancing test...
*** Running Prioritized Directed test...
*** Running Prioritized Atomic test...
*** Running Prioritized Ordered test...
*** Running Prioritized Unordered test...
*** Running Invalid QID test...
*** Running Load Balancing History test...
*** Running Inflight Count test...
*** Running Abuse Inflights test...
*** Running QID Priority test...
*** Running Head-of-line-blocking test...
*** Running Worker loopback test...
1791:   Producer function started
1731:   Worker function started
1889:   Sched Rx = 502401, Tx = 501860
1889:   Sched Rx = 502401, Tx = 501860
1889:   Sched Rx = 50240

Re: [dpdk-dev] [PATCH v2 01/15] eventdev: remove unneeded dependencies

2017-02-06 Thread Jerin Jacob
On Tue, Jan 31, 2017 at 04:14:19PM +, Harry van Haaren wrote:
> From: Bruce Richardson 
> 
> Since eventdev uses event structures rather than working directly on
> mbufs, there is no actual dependencies on the mbuf library. The
> inclusion of an mbuf pointer element inside the event itself does not
> require the inclusion of the mbuf header file. Similarly the pci
> header is not needed, but following their removal, rte_memory.h is
> needed for the definition of the __rte_cache_aligned macro.
> 
> Signed-off-by: Bruce Richardson 
> Signed-off-by: Harry van Haaren 

Acked-by: Jerin Jacob 

> ---


Re: [dpdk-dev] [PATCH v2 02/15] eventdev: add APIs for extended stats

2017-02-06 Thread Jerin Jacob
On Tue, Jan 31, 2017 at 04:14:20PM +, Harry van Haaren wrote:
> From: Bruce Richardson 
> 
> Add in APIs for extended stats so that eventdev implementations can report
> out information on their internal state. The APIs are based on, but not
> identical to, the equivalent ethdev functions.

The APIs Looks good. One minor comment though,

Can you add statistics specific to per event queue and event
port?, To improve the cases like below in the application code(taken from
app/test/test_sw_eventdev.c).

IMO, it is useful because,
- ethdev has similar semantics
- majority of the implementations will have port and queue specific statistics 
counters

+   for (i = 0; i < MAX_PORTS; i++) {
+   char name[32];
+   snprintf(name, sizeof(name), "port_%u_rx", i);
+   stats->port_rx_pkts[i] = rte_event_dev_xstats_by_name_get(
+   dev_id, name, &port_rx_pkts_ids[i]);

+   for (i = 0; i < MAX_QIDS; i++) {
+   char name[32];
+   snprintf(name, sizeof(name), "qid_%u_rx", i);
+   stats->qid_rx_pkts[i] = rte_event_dev_xstats_by_name_get(
+   dev_id, name, &qid_rx_pkts_ids[i]);




Re: [dpdk-dev] [PATCH v2 03/15] event/sw: add new software-only eventdev driver

2017-02-06 Thread Jerin Jacob
On Tue, Jan 31, 2017 at 04:14:21PM +, Harry van Haaren wrote:
> From: Bruce Richardson 
> 
> This adds the minimal changes to allow a SW eventdev implementation to
> be compiled, linked and created at run time. The eventdev does nothing,
> but can be created via vdev on commandline, e.g.
> 
>   sudo ./x86_64-native-linuxapp-gcc/app/test --vdev=event_sw0
>   ...
>   PMD: Creating eventdev sw device event_sw0, numa_node=0, sched_quanta=128
>   RTE>>

Like other PMDs, I think, we need to add PMD specific documentation at
doc/guides/eventdevs/sw.rst?

reference:
http://dpdk.org/browse/next/dpdk-next-crypto/tree/doc/guides/cryptodevs/zuc.rst

> 
> Signed-off-by: Bruce Richardson 
> Signed-off-by: Harry van Haaren 
> +# library dependencies
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_SW_EVENTDEV) += lib/librte_eal
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_SW_EVENTDEV) += lib/librte_eventdev
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_SW_EVENTDEV) += lib/librte_kvargs
> +DEPDIRS-$(CONFIG_RTE_LIBRTE_PMD_SW_EVENTDEV) += lib/librte_ring
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/drivers/event/sw/rte_pmd_evdev_sw_version.map 
> b/drivers/event/sw/rte_pmd_evdev_sw_version.map
> new file mode 100644
> index 000..1f84b68
> --- /dev/null
> +++ b/drivers/event/sw/rte_pmd_evdev_sw_version.map
> @@ -0,0 +1,3 @@
> +DPDK_17.02 {

DPDK_17.05

> + local: *;
> +};
> diff --git a/drivers/event/sw/sw_evdev.c b/drivers/event/sw/sw_evdev.c
> new file mode 100644
> index 000..d60f00f
> --- /dev/null
> +++ b/drivers/event/sw/sw_evdev.c
> @@ -0,0 +1,178 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2016-2017 Intel Corporation. All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *   notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + *   notice, this list of conditions and the following disclaimer in
> + *   the documentation and/or other materials provided with the
> + *   distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + *   contributors may be used to endorse or promote products derived
> + *   from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + }
> +
> + RTE_LOG(INFO, PMD,
> + "Creating eventdev sw device %s, numa_node=%d, 
> sched_quanta=%d, credit_quanta=%d\n",
> + name, socket_id, sched_quanta, credit_quanta);
> +
> + dev = rte_event_pmd_vdev_init(name,
> + sizeof(struct sw_evdev), socket_id);
> + if (dev == NULL) {
> + printf("eventdev vdev init() failed");

IMO, We need to avoid using printf for error reporting.I guess there are
multiple instance of the same in the patch series.

> + return -EFAULT;
> + }
> + dev->dev_ops = &evdev_sw_ops;
> +
> + sw = dev->data->dev_private;
> + sw->data = dev->data;
> +
> + /* copy values passed from vdev command line to instance */
> + sw->credit_update_quanta = credit_quanta;
> + sw->sched_quanta = sched_quanta;
> +
> + return 0;
> +}
> +
> +static int
> +sw_remove(const char *name)
> +{
> + if (name == NULL)
> + return -EINVAL;
> +
> + RTE_LOG(INFO, PMD, "Closing eventdev sw device %s\n", name);
> + /* TODO unregister eventdev and release memzone */

I have sent a patch to address this.

> +
> + return 0;
> +}
> +


Re: [dpdk-dev] [PATCH v2 04/15] event/sw: add device capabilities function

2017-02-06 Thread Jerin Jacob
On Tue, Jan 31, 2017 at 04:14:22PM +, Harry van Haaren wrote:
> From: Bruce Richardson 
> 
> Add in the info_get function to return details on the queues, flow,
> prioritization capabilities, etc. that this device has.
> 
> Signed-off-by: Bruce Richardson 
> Signed-off-by: Harry van Haaren 
> ---
>  drivers/event/sw/sw_evdev.c | 23 +++
>  drivers/event/sw/sw_evdev.h | 10 ++
>  2 files changed, 33 insertions(+)
> 
> diff --git a/drivers/event/sw/sw_evdev.c b/drivers/event/sw/sw_evdev.c
> index d60f00f..4dca4cf 100644
> --- a/drivers/event/sw/sw_evdev.c
> +++ b/drivers/event/sw/sw_evdev.c
> @@ -44,6 +44,28 @@
>  #define SCHED_QUANTA_ARG "sched_quanta"
>  #define CREDIT_QUANTA_ARG "credit_quanta"
>  
> +static void
> +sw_info_get(struct rte_eventdev *dev, struct rte_event_dev_info *info)
> +{
> + RTE_SET_USED(dev);
> +
> + static const struct rte_event_dev_info evdev_sw_info = {
> + .driver_name = PMD_NAME,
> + .max_event_queues = RTE_EVENT_MAX_QUEUES_PER_DEV,
> + .max_event_queue_flows = SW_QID_NUM_FIDS,
> + .max_event_queue_priority_levels = SW_Q_PRIORITY_MAX,
> + .max_event_priority_levels = SW_IQS_MAX,
> + .max_event_ports = SW_PORTS_MAX,
> + .max_event_port_dequeue_depth = MAX_SW_CONS_Q_DEPTH,
> + .max_event_port_enqueue_depth = MAX_SW_PROD_Q_DEPTH,
> + .max_num_events = SW_INFLIGHT_EVENTS_TOTAL,
> + .event_dev_cap = (RTE_EVENT_DEV_CAP_QUEUE_QOS |
> + RTE_EVENT_DEV_CAP_EVENT_QOS),
> + };
> +
> + *info = evdev_sw_info;
> +}
> +
>  static int
>  assign_numa_node(const char *key __rte_unused, const char *value, void 
> *opaque)
>  {
> @@ -78,6 +100,7 @@ static int
>  sw_probe(const char *name, const char *params)
>  {
>   static const struct rte_eventdev_ops evdev_sw_ops = {
> + .dev_infos_get = sw_info_get,
>   };
>  
>   static const char *const args[] = {
> diff --git a/drivers/event/sw/sw_evdev.h b/drivers/event/sw/sw_evdev.h
> index 9494659..6e3cb36 100644
> --- a/drivers/event/sw/sw_evdev.h
> +++ b/drivers/event/sw/sw_evdev.h
> @@ -36,8 +36,18 @@
>  #include 
>  #include 
>  
> +#define PMD_NAME "event_sw"

Better to add SW_ name space.

> +
>  #define SW_DEFAULT_CREDIT_QUANTA 32
>  #define SW_DEFAULT_SCHED_QUANTA 128
> +#define SW_QID_NUM_FIDS 16384
> +#define SW_IQS_MAX 4
> +#define SW_Q_PRIORITY_MAX 255
> +#define SW_PORTS_MAX 64
> +#define MAX_SW_CONS_Q_DEPTH 128
> +#define SW_INFLIGHT_EVENTS_TOTAL 4096
> +/* allow for lots of over-provisioning */
> +#define MAX_SW_PROD_Q_DEPTH 4096
>  
>  struct sw_evdev {
>   struct rte_eventdev_data *data;
> -- 
> 2.7.4
> 


[dpdk-dev] [PATCH] net/i40e: fix tunnel filter issue

2017-02-06 Thread Beilei Xing
Creating IPv4 flow and IPv6 flow will cause confilct error.
Root cause is there's no IP info included in tunnel filter
input.

Fixes: 425c3325f0b0 ("net/i40e: store tunnel filter")
Fixes: d416530e6358 ("net/i40e: parse tunnel filter")

Signed-off-by: Beilei Xing 
---
 drivers/net/i40e/i40e_ethdev.c |  6 ++
 drivers/net/i40e/i40e_ethdev.h |  6 ++
 drivers/net/i40e/i40e_flow.c   | 28 ++--
 3 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 4492bcc..b2dd6d6 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -6705,6 +6705,12 @@ i40e_tunnel_filter_convert(struct 
i40e_aqc_add_remove_cloud_filters_element_data
ether_addr_copy((struct ether_addr *)&cld_filter->inner_mac,
(struct ether_addr *)&tunnel_filter->input.inner_mac);
tunnel_filter->input.inner_vlan = cld_filter->inner_vlan;
+   if ((rte_le_to_cpu_16(cld_filter->flags) &
+I40E_AQC_ADD_CLOUD_FLAGS_IPV6) ==
+   I40E_AQC_ADD_CLOUD_FLAGS_IPV6)
+   tunnel_filter->input.ip_type = I40E_TUNNEL_IPTYPE_IPV6;
+   else
+   tunnel_filter->input.ip_type = I40E_TUNNEL_IPTYPE_IPV4;
tunnel_filter->input.flags = cld_filter->flags;
tunnel_filter->input.tenant_id = cld_filter->tenant_id;
tunnel_filter->queue = cld_filter->queue_number;
diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
index 9e2f7a2..7c344a0 100644
--- a/drivers/net/i40e/i40e_ethdev.h
+++ b/drivers/net/i40e/i40e_ethdev.h
@@ -499,11 +499,17 @@ struct i40e_ethertype_rule {
 /* Tunnel filter number HW supports */
 #define I40E_MAX_TUNNEL_FILTER_NUM 400
 
+enum i40e_tunnel_iptype {
+   I40E_TUNNEL_IPTYPE_IPV4 = 0, /* IPv4. */
+   I40E_TUNNEL_IPTYPE_IPV6, /* IPv6. */
+};
+
 /* Tunnel filter struct */
 struct i40e_tunnel_filter_input {
uint8_t outer_mac[6];/* Outer mac address to match */
uint8_t inner_mac[6];/* Inner mac address to match */
uint16_t inner_vlan; /* Inner vlan address to match */
+   enum i40e_tunnel_iptype ip_type;
uint16_t flags;  /* Filter type flag */
uint32_t tenant_id;  /* Tenant id to match */
 };
diff --git a/drivers/net/i40e/i40e_flow.c b/drivers/net/i40e/i40e_flow.c
index c14eb22..c6e4d87 100644
--- a/drivers/net/i40e/i40e_flow.c
+++ b/drivers/net/i40e/i40e_flow.c
@@ -1307,16 +1307,40 @@ i40e_flow_parse_vxlan_pattern(const struct 
rte_flow_item *pattern,
}
break;
case RTE_FLOW_ITEM_TYPE_IPV4:
+   filter->ip_type = RTE_TUNNEL_IPTYPE_IPV4;
+   /* IPv4 is used to describe protocol,
+* spec amd mask should be NULL.
+*/
+   if (item->spec || item->mask) {
+   rte_flow_error_set(error, EINVAL,
+  RTE_FLOW_ERROR_TYPE_ITEM,
+  item,
+  "Invalid IPv4 item");
+   return -rte_errno;
+   }
+   break;
case RTE_FLOW_ITEM_TYPE_IPV6:
+   filter->ip_type = RTE_TUNNEL_IPTYPE_IPV6;
+   /* IPv6 is used to describe protocol,
+* spec amd mask should be NULL.
+*/
+   if (item->spec || item->mask) {
+   rte_flow_error_set(error, EINVAL,
+  RTE_FLOW_ERROR_TYPE_ITEM,
+  item,
+  "Invalid IPv6 item");
+   return -rte_errno;
+   }
+   break;
case RTE_FLOW_ITEM_TYPE_UDP:
-   /* IPv4/IPv6/UDP are used to describe protocol,
+   /* UDP is used to describe protocol,
 * spec amd mask should be NULL.
 */
if (item->spec || item->mask) {
rte_flow_error_set(error, EINVAL,
   RTE_FLOW_ERROR_TYPE_ITEM,
   item,
-  "Invalid IPv4 item");
+  "Invalid UDP item");
return -rte_errno;
}
break;
-- 
2.5.5



Re: [dpdk-dev] [PATCH v2 3/5] net/ixgbe: fix bitmask of supported Tx flags

2017-02-06 Thread Wu, Jingjing


> -Original Message-
> From: Ananyev, Konstantin
> Sent: Sunday, February 5, 2017 8:11 PM
> To: Ananyev, Konstantin ; Wu, Jingjing
> ; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2 3/5] net/ixgbe: fix bitmask of supported
> Tx flags
> 
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ananyev,
> > Konstantin
> > Sent: Sunday, February 5, 2017 11:59 AM
> > To: Wu, Jingjing ; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2 3/5] net/ixgbe: fix bitmask of
> > supported Tx flags
> >
> > Hi Jingjing,
> >
> >
> > > -Original Message-
> > > From: Wu, Jingjing
> > > Sent: Saturday, February 4, 2017 3:36 AM
> > > To: dev@dpdk.org
> > > Cc: Wu, Jingjing ; Ananyev, Konstantin
> > > 
> > > Subject: [PATCH v2 3/5] net/ixgbe: fix bitmask of supported Tx flags
> > >
> > > Add missed flags to bitmask of all supported packet Tx flags.
> > >
> > > CC: konstantin.anan...@intel.com
> > > Fixes: 7829b8d52be0 ("net/ixgbe: add Tx preparation")
> > > Signed-off-by: Jingjing Wu 
> > > ---
> > >  drivers/net/ixgbe/ixgbe_rxtx.c | 17 -
> > >  1 file changed, 16 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
> > > b/drivers/net/ixgbe/ixgbe_rxtx.c index 36f1c02..8454581 100644
> > > --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> > > +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> > > @@ -81,13 +81,28 @@
> > >  #include "ixgbe_rxtx.h"
> > >
> > >  /* Bit Mask to indicate what bits required for building TX context
> > > */
> > > +#ifdef RTE_LIBRTE_IEEE1588
> > >  #define IXGBE_TX_OFFLOAD_MASK (   \
> > >   PKT_TX_VLAN_PKT |\
> > >   PKT_TX_IP_CKSUM |\
> > > + PKT_TX_IPV4 |\
> > >   PKT_TX_L4_MASK | \
> > > + PKT_TX_IEEE1588_TMST |   \
> > >   PKT_TX_TCP_SEG | \
> > >   PKT_TX_MACSEC |  \
> > > - PKT_TX_OUTER_IP_CKSUM)
> > > + PKT_TX_OUTER_IP_CKSUM |  \
> > > + PKT_TX_OUTER_IPV4)
> > > +#else
> > > +#define IXGBE_TX_OFFLOAD_MASK (   \
> > > + PKT_TX_VLAN_PKT |\
> > > + PKT_TX_IP_CKSUM |\
> > > + PKT_TX_IPV4 |\
> >
> > Wonder why ixgbe doesn't have PKT_TX_IPV6?
> 
> Same question for e1000 and fm10k.
> Also if you decided to go that way, you'll probably need to update
> TX_OFFLOAD_MASK for enic and vmxnet3.
> That's why I still think it would be much less hassle not to include these 
> flags
> (PKT_TX_IPV4 and PKT_TX_IPV6)  into TX_OFFLOAD_MASK at all.
> Konstantin
> 
>
Thanks for pointing that. PKT_TX_IPV6 is missed.
About whether include these flags (PKT_TX_IPV4 and PKT_TX_IPV6)  into 
TX_OFFLOAD_MASK, I think they should be
Included. Think about one NIC may support IPV4 L4 checksum offload, but not 
support IPV6? Even I don't know who it is.

Thanks
Jingjing



Re: [dpdk-dev] [RFC 1/2] doc: introduction to prgdev

2017-02-06 Thread Chen, Jing D

Hi, Jan,


On 2/1/2017 7:41 PM, Jan Blunck wrote:

On Fri, Jan 20, 2017 at 4:21 AM, Chen Jing D(Mark)
 wrote:

This is the documentation to describe what prgdev is, how to use
prgdev API and accomplish an image download.

Signed-off-by: Chen Jing D(Mark) 
---
  doc/guides/prog_guide/prgdev_lib.rst |  457 ++
  1 files changed, 457 insertions(+), 0 deletions(-)
  create mode 100644 doc/guides/prog_guide/prgdev_lib.rst

diff --git a/doc/guides/prog_guide/prgdev_lib.rst 
b/doc/guides/prog_guide/prgdev_lib.rst
new file mode 100644
index 000..3917c18
--- /dev/null
+++ b/doc/guides/prog_guide/prgdev_lib.rst
@@ -0,0 +1,457 @@
+Overview
+
+
+More and more devices start to support on-die programming, which include
+NIC, GPU, FPGA, etc. This capability has a requirement to drivers that
+introduce a API for application to download/upload image to/from the HW.
+So, it's necessary to provide a generic API to perform image download, upload,
+validation, etc.
+
+This RFC patch intends to introduce a DPDK generic programming device layer,
+called prgdev below, to provide an abstract, generic APIs for applications to
+program hardware without knowing the details of programmable devices. From
+driver's perspective, they'll try to adapt their functions to the abstract
+APIs defined in prgdev.
+
+The major purpose of prgdev is to help DPDK users to load/upgrade RTL images
+for FPGA devices, or upgrade firmware for programmble NICs.
+
+But those devices that have the capability to do on-die programming may
+expose versatile talents to apps. Add/delete/update entries in flow table,
+update/overwrite the firmware/microcode, add a new algorithm, update profiles
+, etc, which is hard to be abstracted as generic behaviors. So, prgdev won't
+include the APIs to perform device-unique actions in current stage until it
+became popular. On the contratry, it only provides the simple capability to
+download a segment of buffer(image) from host to on-die device, or vice versa.
+The image could be an algorithm, a flow table in the on-die SRAM, a firmware
+image running in the device, a function within a pipeline, etc. prgdev won't
+try to inteprete the content and it's transparent to prgdev. Apps and devices
+can communicate with a language they can understand each other, which is not
+provided by prgdev to decide what needs to be programmed.
+
+When the set of APIs is introduced, a general question is why we need it in
+DPDK community? Why we can't use offline tool to perform same thing? The answer
+is the prgdev provide a generic, online API to applications, in the meanwile,
+offers a capability to switch driver dynamically when downloaded image changed
+personality and a new driver is required. Comparing offline tool, it have 
online
+programmability (see below examples); Comparing online tool, it provides a
+generic API for many HWs; Comparing generic online tool/API for many products,
+it provides a capability to switch driver dynamically.
+
+There are various means to reach same goal, we'll try to find the best/better
+way to approach. All above advantages will help prgdev to be a 'better choice'.
+

 From my point of view this doesn't belong into the DPDK. On Linux this
is traditionally handled by udev and you already have the freedom to
use userspace applications to program a device requiring firmware in
that case. I don't believe that modeling this in the DPDK explicitly
is the right thing to do.


Can you kindly educate me what udev can do? It's for NIC firmware upgrade?
This prgdev is not only for NIC, the major use case is FPGA, but I found 
programmable

NIC is also applicable to the API model and can be added into the scope.



Especially if the device supports changing personality it is required
to unplug the existing personality before reprogramming. You can do
this already today. Also writing OOB firmware data that changes
configuration should be possible today by handling interrupts.


What if application is using DPDK, can we do in the same manner without 
leaving

DPDK instance?


Maybe we can come up with an example application that demonstrates how
the different infrastructure components could get orchestrated. Do you
have a device in mind that supports this?


As Cunming suggested, the incoming FPGA is the desired device to use 
this API. OVS
that is using DPDK can benefit from the API and download personalities 
in running

time for blank FPGA, or upgrade for prior-programmed case.



Regards,
Jan




Re: [dpdk-dev] [PATCH v2 07/15] event/sw: add support for event queues

2017-02-06 Thread Jerin Jacob
On Tue, Jan 31, 2017 at 04:14:25PM +, Harry van Haaren wrote:
> From: Bruce Richardson 
> 
> Add in the data structures for the event queues, and the eventdev
> functions to create and destroy those queues.
> 
> Signed-off-by: Bruce Richardson 
> Signed-off-by: Harry van Haaren 
> ---
>  drivers/event/sw/iq_ring.h  | 176 
> 
>  drivers/event/sw/sw_evdev.c | 158 +++
>  drivers/event/sw/sw_evdev.h |  75 +++
>  3 files changed, 409 insertions(+)
>  create mode 100644 drivers/event/sw/iq_ring.h
> 
> + */
> +
> +/*
> + * Ring structure definitions used for the internal ring buffers of the
> + * SW eventdev implementation. These are designed for single-core use only.
> + */

If I understand it correctly, IQ and QE rings are single producer and
single consumer rings. By the specification, multiple producers through
multiple ports can enqueue to the event queues at a time.Does SW implementation
support that? or am I missing something here?

> +#ifndef _IQ_RING_
> +#define _IQ_RING_
> +
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define IQ_RING_NAMESIZE 12
> +#define QID_IQ_DEPTH 512
> +#define QID_IQ_MASK (uint16_t)(QID_IQ_DEPTH - 1)
> +
> +struct iq_ring {
> + char name[IQ_RING_NAMESIZE] __rte_cache_aligned;
> + uint16_t write_idx;
> + uint16_t read_idx;
> +
> + struct rte_event ring[QID_IQ_DEPTH];
> +};
> +
> +#ifndef force_inline
> +#define force_inline inline __attribute__((always_inline))
> +#endif
> +
> +static inline struct iq_ring *
> +iq_ring_create(const char *name, unsigned int socket_id)
> +{
> + struct iq_ring *retval;
> +
> + retval = rte_malloc_socket(NULL, sizeof(*retval), 0, socket_id);
> + if (retval == NULL)
> + goto end;
> +
> + snprintf(retval->name, sizeof(retval->name), "%s", name);
> + retval->write_idx = retval->read_idx = 0;
> +end:
> + return retval;
> +}
> +
> +static inline void
> +iq_ring_destroy(struct iq_ring *r)
> +{
> + rte_free(r);
> +}
> +
> +static force_inline uint16_t
> +iq_ring_count(const struct iq_ring *r)
> +{
> + return r->write_idx - r->read_idx;
> +}
> +
> +static force_inline uint16_t
> +iq_ring_free_count(const struct iq_ring *r)
> +{
> + return QID_IQ_MASK - iq_ring_count(r);
> +}
> +
> +static force_inline uint16_t
> +iq_ring_enqueue_burst(struct iq_ring *r, struct rte_event *qes, uint16_t 
> nb_qes)
> +{
> + const uint16_t read = r->read_idx;
> + uint16_t write = r->write_idx;
> + const uint16_t space = read + QID_IQ_MASK - write;
> + uint16_t i;
> +
> + if (space < nb_qes)
> + nb_qes = space;
> +
> + for (i = 0; i < nb_qes; i++, write++)
> + r->ring[write & QID_IQ_MASK] = qes[i];
> +
> + r->write_idx = write;
> +
> + return nb_qes;
> +}
> +
> diff --git a/drivers/event/sw/sw_evdev.h b/drivers/event/sw/sw_evdev.h
> index 65f00e4..aaa8056 100644
> --- a/drivers/event/sw/sw_evdev.h
> +++ b/drivers/event/sw/sw_evdev.h
> @@ -49,6 +49,78 @@
>  #define SW_INFLIGHT_EVENTS_TOTAL 4096
>  /* allow for lots of over-provisioning */
>  #define MAX_SW_PROD_Q_DEPTH 4096
> +#define SW_FRAGMENTS_MAX 16
> +
> +/* have a new scheduling type for 1:1 queue to port links */
> +#define RTE_SCHED_TYPE_DIRECT (RTE_SCHED_TYPE_PARALLEL + 1)

IMO, better to use SW_ for internal sched types

> +
> +#ifdef RTE_LIBRTE_PMD_EVDEV_SW_DEBUG
> +#define SW_LOG_INFO(fmt, args...) \
> + RTE_LOG(INFO, PMD, "[%s] %s() line %u: " fmt "\n", \
> + PMD_NAME, \
> + __func__, __LINE__, ## args)
> +
> +#define SW_LOG_DBG(fmt, args...) \
> + RTE_LOG(DEBUG, PMD, "[%s] %s() line %u: " fmt "\n", \


Re: [dpdk-dev] buf->hash.rss always empty with i40e

2017-02-06 Thread tom . barbette
That also leave the question of how to HASH only on the IP tuple for TCP and 
UDP packets? The use case is that I want all packets from the same IP src/dst 
pair to be hashed to the same queue. This cannot be enforced with a complete 
hash on TCP/UDP fields.

Tom


Re: [dpdk-dev] [PATCH v2 09/15] event/sw: add support for linking queues to ports

2017-02-06 Thread Jerin Jacob
On Tue, Jan 31, 2017 at 04:14:27PM +, Harry van Haaren wrote:
> From: Bruce Richardson 
> 
> Signed-off-by: Bruce Richardson 
> Signed-off-by: Harry van Haaren 
> ---
>  drivers/event/sw/sw_evdev.c | 68 
> +
>  1 file changed, 68 insertions(+)
> 
> diff --git a/drivers/event/sw/sw_evdev.c b/drivers/event/sw/sw_evdev.c
> index 0b26fcb..693a833 100644
> --- a/drivers/event/sw/sw_evdev.c
> +++ b/drivers/event/sw/sw_evdev.c
> @@ -50,6 +50,72 @@ static void
>  sw_info_get(struct rte_eventdev *dev, struct rte_event_dev_info *info);
>  
>  static int
> +sw_port_link(void *port, const uint8_t queues[], const uint8_t priorities[],
> + uint16_t num)
> +{
> + struct sw_port *p = (void *)port;
> + struct sw_evdev *sw = p->sw;
> + int i;
> +
> + RTE_SET_USED(priorities);
> + for (i = 0; i < num; i++) {
> + struct sw_qid *q = &sw->qids[queues[i]];
> +
> + /* check for qid map overflow */
> + if (q->cq_num_mapped_cqs >= RTE_DIM(q->cq_map))
> + break;
> +
> + if (p->is_directed && p->num_qids_mapped > 0)
> + break;
> +
> + if (q->type == RTE_SCHED_TYPE_DIRECT) {
> + /* check directed qids only map to one port */
> + if (p->num_qids_mapped > 0)
> + break;
> + /* check port only takes a directed flow */
> + if (num > 1)
> + break;
> +
> + p->is_directed = 1;
> + p->num_qids_mapped = 1;

What is the expected behavior if queues configured with
RTE_EVENT_QUEUE_CFG_SINGLE_LINK attempted to link by multiple ports?

Based on the head-file, the expected behavior is following,

(-EDQUOT) Quota exceeded(Application tried to link the queue
configured with RTE_EVENT_QUEUE_CFG_SINGLE_LINK to more than one event ports)

> + } else if (q->type == RTE_SCHED_TYPE_ORDERED) {
> + p->num_ordered_qids++;
> + p->num_qids_mapped++;
> + } else if (q->type == RTE_SCHED_TYPE_ATOMIC) {
> + p->num_qids_mapped++;
> + }
> +
> + q->cq_map[q->cq_num_mapped_cqs] = p->id;
> + rte_smp_wmb();
> + q->cq_num_mapped_cqs++;
> + }
> + return i;
> +}
> +
> +static int
> +sw_port_unlink(void *port, uint8_t queues[], uint16_t nb_unlinks)
> +{
> + struct sw_port *p = (void *)port;
> + struct sw_evdev *sw = p->sw;
> + unsigned int i, j;
> +
> + int unlinked = 0;
> + for (i = 0; i < nb_unlinks; i++) {
> + struct sw_qid *q = &sw->qids[queues[i]];
> + for (j = 0; j < q->cq_num_mapped_cqs; j++)
> + if (q->cq_map[j] == p->id) {
> + q->cq_map[j] =
> + q->cq_map[q->cq_num_mapped_cqs - 1];
> + rte_smp_wmb();
> + q->cq_num_mapped_cqs--;
> + unlinked++;
> + continue;
> + }
> + }
> + return unlinked;
> +}
> +
> +static int
>  sw_port_setup(struct rte_eventdev *dev, uint8_t port_id,
>   const struct rte_event_port_conf *conf)
>  {
> @@ -381,6 +447,8 @@ sw_probe(const char *name, const char *params)
>   .port_def_conf = sw_port_def_conf,
>   .port_setup = sw_port_setup,
>   .port_release = sw_port_release,
> + .port_link = sw_port_link,
> + .port_unlink = sw_port_unlink,
>   };
>  
>   static const char *const args[] = {
> -- 
> 2.7.4
> 


Re: [dpdk-dev] [PATCH v2 00/15] next-eventdev: event/sw software eventdev

2017-02-06 Thread Van Haaren, Harry
> -Original Message-
> From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com]
> Sent: Monday, February 6, 2017 8:07 AM
> To: Van Haaren, Harry 
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v2 00/15] next-eventdev: event/sw software eventdev
> 
> On Tue, Jan 31, 2017 at 04:14:18PM +, Harry van Haaren wrote:
> > The following patchset adds software eventdev implementation
> > to the next-eventdev tree.
> >
> > This implementation is based on the previous software eventdev
> > v1 patchset, now with comments addressed:
> > 1) xstats api return values changed to be consistent
> > 2) xstats api [out] added to appropriate values
> > 3) xstats api now uses xxx_get() for consistency
> > 4) patch names for check-log.sh
> 
> Nice to have name the in bracket who suggested it.

Agreed - an oversight on my part. For future readers,
the above suggestions on the v1 patchset came from Jerin - thanks for the 
review of v1 (and now v2 :)

 actual review comments 

Comments will be addressed in v3, hoping to send it later this week. -Harry


Re: [dpdk-dev] vhost user MTU and promisc mode

2017-02-06 Thread Maxime Coquelin

Hi Jianfeng,

On 02/06/2017 07:39 AM, Tan, Jianfeng wrote:

Hi Maxime,



Have seen that you submit a feature in QEMU, commit c5f048d8fb69
(“vhost-user: Add MTU protocol feature and op”).



Appreciate your insights on:

(1) Do you have plan to enable it in DPDK vhost user?

Yes, I plan to propose this for v17.05 release.
For vhost-user backend, the op is to notify the backend with the MTU
value set in QEMU, so that it can ensure packets sent won't be bigger.



(2) How about another similar feature, promisc mode enable/disable?

This is ont something we have considered yet.
Are you willing to work on it?

Thanks,
Maxime




Thanks,

Jianfeng



Re: [dpdk-dev] [PATCH v2 07/15] event/sw: add support for event queues

2017-02-06 Thread Van Haaren, Harry
> -Original Message-
> From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com]
> Sent: Monday, February 6, 2017 9:25 AM
> To: Van Haaren, Harry 
> Cc: dev@dpdk.org; Richardson, Bruce 
> Subject: Re: [PATCH v2 07/15] event/sw: add support for event queues
> 
> On Tue, Jan 31, 2017 at 04:14:25PM +, Harry van Haaren wrote:
> > From: Bruce Richardson 
> >
> > Add in the data structures for the event queues, and the eventdev
> > functions to create and destroy those queues.
> >
> > Signed-off-by: Bruce Richardson 
> > Signed-off-by: Harry van Haaren 
> > ---
> >  drivers/event/sw/iq_ring.h  | 176 
> > 
> >  drivers/event/sw/sw_evdev.c | 158 +++
> >  drivers/event/sw/sw_evdev.h |  75 +++
> >  3 files changed, 409 insertions(+)
> >  create mode 100644 drivers/event/sw/iq_ring.h
> >
> > + */
> > +
> > +/*
> > + * Ring structure definitions used for the internal ring buffers of the
> > + * SW eventdev implementation. These are designed for single-core use only.
> > + */
> 
> If I understand it correctly, IQ and QE rings are single producer and
> single consumer rings. By the specification, multiple producers through
> multiple ports can enqueue to the event queues at a time.Does SW 
> implementation
> support that? or am I missing something here?

You're right that the IQ and QE rings are Single Producer, Single Consumer 
rings. More specifically, the QE is a ring for sending rte_event structs 
between cores, while the IQ ring is optimized for internal use in the scheduler 
core - and should not be used to send events between cores. Note that the 
design of the SW scheduler includes a central core for performing scheduling.

In other works, the QE rings transfer events from the worker core to the 
scheduler - and the scheduler pushes the events into what you call the "event 
queues" (aka, the atomic/ordered queue itself). These "event queues" are IQ 
instances. On egress from the scheduler, the event passes through a QE ring to 
the worker.

The result is that despite that only SP/SC rings are used, multiple workers can 
enqueue to any event queue.


Re: [dpdk-dev] [PATCH v2 2/5] net/i40e: fix bitmask of supported Tx flags

2017-02-06 Thread Olivier Matz
Hi Jingjing,

On Sat,  4 Feb 2017 11:36:12 +0800, Jingjing Wu 
wrote:
> Some Tx offload flags are missed in bitmask of all supported packet
> Tx flags by i40e.
> This patch fixes it.

Could you detail which flag was missing?
Is it PKT_TX_TUNNEL_MASK?
If yes, shouldn't we have a "Fixes:" line?

I think most of the patchset should be merged in one patch, because
changing only the mbuf part (PKT_TX_OFFLOAD_MASK) would break the
drivers that checks the offload bits at init, and this is not suitable,
especially if we want to be able to do git bisect.


My suggestion is to have:

1- fix i40 (add missing tunnel mask?)
2- fix missing MACSET in TX_OFFLOAD_MASK
3- change TX_OFFLOAD_MASK to include all flags (this impacts all
   drivers using TX_OFFLOAD_MASK)


Regards,
Olivier


Re: [dpdk-dev] [PATCH v2 2/5] net/i40e: fix bitmask of supported Tx flags

2017-02-06 Thread Olivier Matz
On Mon, 6 Feb 2017 03:02:12 +, "Wu, Jingjing"
 wrote:
> > 
> > Functionally will be same, but what do you think about following,
> > to make easy to see what define adds:
> > 
> > +#define I40E_TX_OFFLOAD_MASK (  \
> > +   PKT_TX_IP_CKSUM |\
> > +   PKT_TX_IPV4 |\
> > +   PKT_TX_IPV6 |\
> > +   PKT_TX_L4_MASK | \
> > +   PKT_TX_OUTER_IP_CKSUM |  \
> > +   PKT_TX_OUTER_IPV4 |  \
> > +   PKT_TX_OUTER_IPV6 |  \
> > 
> > +#ifdef RTE_LIBRTE_IEEE1588
> > +   PKT_TX_IEEE1588_TMST |   \
> > +#endif
> > 
> > +   PKT_TX_TCP_SEG | \
> > +   PKT_TX_QINQ_PKT |\
> > +   PKT_TX_VLAN_PKT |\
> > +   PKT_TX_TUNNEL_MASK)
> >   
> 
> Hi, Ferruh
> 
> As I know, the above change is incorrect in C code. We cannot use
> #ifdef  #endif inside #define
> 
> Thanks
> Jingjing


You can do:

#ifdef RTE_LIBRTE_IEEE1588
#define I40_TX_IEEE1588_TMST PKT_TX_IEEE1588_TMST
#else
#define I40_TX_IEEE1588_TMST 0
#endif

#define I40E_TX_OFFLOAD_MASK (   \
I40_TX_IEEE1588_TMST |   \
PKT_TX_IP_CKSUM |\
...


Regards,
Olivier


Re: [dpdk-dev] [PATCH v2 02/15] eventdev: add APIs for extended stats

2017-02-06 Thread Van Haaren, Harry
> -Original Message-
> From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com]
> Sent: Monday, February 6, 2017 8:23 AM
> To: Van Haaren, Harry 
> Cc: dev@dpdk.org; Richardson, Bruce 
> Subject: Re: [PATCH v2 02/15] eventdev: add APIs for extended stats
> 
> On Tue, Jan 31, 2017 at 04:14:20PM +, Harry van Haaren wrote:
> > From: Bruce Richardson 
> >
> > Add in APIs for extended stats so that eventdev implementations can report
> > out information on their internal state. The APIs are based on, but not
> > identical to, the equivalent ethdev functions.
> 
> The APIs Looks good. One minor comment though,
> 
> Can you add statistics specific to per event queue and event
> port?, To improve the cases like below in the application code(taken from
> app/test/test_sw_eventdev.c).
> 
> IMO, it is useful because,
> - ethdev has similar semantics
> - majority of the implementations will have port and queue specific 
> statistics counters

I'm not totally sure what you're asking but if I understand correctly, you're 
suggesting a struct based stats API like this?

struct rte_event_dev_port_stats {
uint64_t rx;
uint64_t tx;
...
};

struct rte_event_dev_queue_stats {
uint64_t rx;
uint64_t tx;
...
};

/** a get function to get a specific port's statistics. The *stats* pointer is 
filled in */ 
int rte_event_dev_port_stats_get(dev, uint8_t port_id, struct 
rte_event_dev_port_stats *stats);

/** a get function to get a specific queue's statistics. The *stats* pointer is 
filled in */ 
int rte_event_dev_queue_stats_get(dev, uint8_t queue_id, struct 
rte_event_dev_queue_stats *stats);


Is this what you meant, or did I misunderstand?


> 
> +   for (i = 0; i < MAX_PORTS; i++) {
> +   char name[32];
> +   snprintf(name, sizeof(name), "port_%u_rx", i);
> +   stats->port_rx_pkts[i] = rte_event_dev_xstats_by_name_get(
> +   dev_id, name, &port_rx_pkts_ids[i]);
> 
> +   for (i = 0; i < MAX_QIDS; i++) {
> +   char name[32];
> +   snprintf(name, sizeof(name), "qid_%u_rx", i);
> +   stats->qid_rx_pkts[i] = rte_event_dev_xstats_by_name_get(
> +   dev_id, name, &qid_rx_pkts_ids[i]);
> 



[dpdk-dev] 答复: RE: [PATCH v2] net/ixgbe/base: clear redundant macro define

2017-02-06 Thread yao . chenghu
> As 1.x is for linux/freebsd, but 2.0 is for solaris, 2.0 doesn't 
> leverage the messages from 1.x. For example, 0x0a has different 
> meaning in 1.x and 2.0. So better letting it be. 
> 

Ok, thanks for your reply.


[dpdk-dev] [PATCH] net/ena: solve problem with host attributes

2017-02-06 Thread Jakub Palider
The hardware may reject adding host_info in case support for
host_info is missing in the list of supported features. On the
other hand the list of supported featues may contain support
for the host_info - typical bootstrap problem.
This patch solves it by removing check against support for
host_info attribute and improves error handling by reacting
only to host attribute write failure to the hardware.
---
 drivers/net/ena/base/ena_com.c | 16 
 drivers/net/ena/ena_ethdev.c   | 21 -
 2 files changed, 12 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ena/base/ena_com.c b/drivers/net/ena/base/ena_com.c
index 88053e3..bd6f3c6 100644
--- a/drivers/net/ena/base/ena_com.c
+++ b/drivers/net/ena/base/ena_com.c
@@ -2590,19 +2590,11 @@ int ena_com_set_host_attributes(struct ena_com_dev 
*ena_dev)
struct ena_com_admin_queue *admin_queue;
struct ena_admin_set_feat_cmd cmd;
struct ena_admin_set_feat_resp resp;
+   int ret;
 
-   int ret = 0;
-
-   if (unlikely(!ena_dev)) {
-   ena_trc_err("%s : ena_dev is NULL\n", __func__);
-   return ENA_COM_NO_DEVICE;
-   }
-
-   if (!ena_com_check_supported_feature_id(ena_dev,
-   ENA_ADMIN_HOST_ATTR_CONFIG)) {
-   ena_trc_warn("Set host attribute isn't supported\n");
-   return ENA_COM_PERMISSION;
-   }
+   /* Host attribute config is called before ena_com_get_dev_attr_feat
+* so ena_com can't check if the feature is supported.
+*/
 
memset(&cmd, 0x0, sizeof(cmd));
admin_queue = &ena_dev->admin_queue;
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index e99bf29..f3666e5 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -368,12 +368,9 @@ static void ena_config_host_info(struct ena_com_dev 
*ena_dev)
 
rc = ena_com_set_host_attributes(ena_dev);
if (rc) {
-   if (rc == -EPERM)
-   RTE_LOG(ERR, PMD, "Cannot set host attributes\n");
-   else
-   RTE_LOG(ERR, PMD, "Cannot set host attributes\n");
-
-   goto err;
+   RTE_LOG(ERR, PMD, "Cannot set host attributes\n");
+   if (rc != -EPERM)
+   goto err;
}
 
return;
@@ -424,11 +421,9 @@ static void ena_config_debug_area(struct ena_adapter 
*adapter)
 
rc = ena_com_set_host_attributes(&adapter->ena_dev);
if (rc) {
-   if (rc == -EPERM)
-   RTE_LOG(WARNING, PMD, "Cannot set host attributes\n");
-   else
-   RTE_LOG(ERR, PMD, "Cannot set host attributes\n");
-   goto err;
+   RTE_LOG(WARNING, PMD, "Cannot set host attributes\n");
+   if (rc != -EPERM)
+   goto err;
}
 
return;
@@ -1239,14 +1234,14 @@ static int ena_device_init(struct ena_com_dev *ena_dev,
goto err_mmio_read_less;
}
 
-   ena_config_host_info(ena_dev);
-
/* To enable the msix interrupts the driver needs to know the number
 * of queues. So the driver uses polling mode to retrieve this
 * information.
 */
ena_com_set_admin_polling_mode(ena_dev, true);
 
+   ena_config_host_info(ena_dev);
+
/* Get Device Attributes and features */
rc = ena_com_get_dev_attr_feat(ena_dev, get_feat_ctx);
if (rc) {
-- 
2.5.0



[dpdk-dev] [PATCH v2] net/ena: solve problem with host attributes

2017-02-06 Thread Jakub Palider
The hardware may reject adding host_info in case support for
host_info is missing in the list of supported features. On the
other hand the list of supported featues may contain support
for the host_info - typical bootstrap problem.
This patch solves it by removing check against support for
host_info attribute and improves error handling by reacting
only to host attribute write failure to the hardware.

v2 changes:
 - added Signed-off

Signed-off-by: Jakub Palider 
---
 drivers/net/ena/base/ena_com.c | 16 
 drivers/net/ena/ena_ethdev.c   | 21 -
 2 files changed, 12 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ena/base/ena_com.c b/drivers/net/ena/base/ena_com.c
index 88053e3..bd6f3c6 100644
--- a/drivers/net/ena/base/ena_com.c
+++ b/drivers/net/ena/base/ena_com.c
@@ -2590,19 +2590,11 @@ int ena_com_set_host_attributes(struct ena_com_dev 
*ena_dev)
struct ena_com_admin_queue *admin_queue;
struct ena_admin_set_feat_cmd cmd;
struct ena_admin_set_feat_resp resp;
+   int ret;
 
-   int ret = 0;
-
-   if (unlikely(!ena_dev)) {
-   ena_trc_err("%s : ena_dev is NULL\n", __func__);
-   return ENA_COM_NO_DEVICE;
-   }
-
-   if (!ena_com_check_supported_feature_id(ena_dev,
-   ENA_ADMIN_HOST_ATTR_CONFIG)) {
-   ena_trc_warn("Set host attribute isn't supported\n");
-   return ENA_COM_PERMISSION;
-   }
+   /* Host attribute config is called before ena_com_get_dev_attr_feat
+* so ena_com can't check if the feature is supported.
+*/
 
memset(&cmd, 0x0, sizeof(cmd));
admin_queue = &ena_dev->admin_queue;
diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index e99bf29..f3666e5 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -368,12 +368,9 @@ static void ena_config_host_info(struct ena_com_dev 
*ena_dev)
 
rc = ena_com_set_host_attributes(ena_dev);
if (rc) {
-   if (rc == -EPERM)
-   RTE_LOG(ERR, PMD, "Cannot set host attributes\n");
-   else
-   RTE_LOG(ERR, PMD, "Cannot set host attributes\n");
-
-   goto err;
+   RTE_LOG(ERR, PMD, "Cannot set host attributes\n");
+   if (rc != -EPERM)
+   goto err;
}
 
return;
@@ -424,11 +421,9 @@ static void ena_config_debug_area(struct ena_adapter 
*adapter)
 
rc = ena_com_set_host_attributes(&adapter->ena_dev);
if (rc) {
-   if (rc == -EPERM)
-   RTE_LOG(WARNING, PMD, "Cannot set host attributes\n");
-   else
-   RTE_LOG(ERR, PMD, "Cannot set host attributes\n");
-   goto err;
+   RTE_LOG(WARNING, PMD, "Cannot set host attributes\n");
+   if (rc != -EPERM)
+   goto err;
}
 
return;
@@ -1239,14 +1234,14 @@ static int ena_device_init(struct ena_com_dev *ena_dev,
goto err_mmio_read_less;
}
 
-   ena_config_host_info(ena_dev);
-
/* To enable the msix interrupts the driver needs to know the number
 * of queues. So the driver uses polling mode to retrieve this
 * information.
 */
ena_com_set_admin_polling_mode(ena_dev, true);
 
+   ena_config_host_info(ena_dev);
+
/* Get Device Attributes and features */
rc = ena_com_get_dev_attr_feat(ena_dev, get_feat_ctx);
if (rc) {
-- 
2.5.0



Re: [dpdk-dev] [PATCH v2 3/5] net/ixgbe: fix bitmask of supported Tx flags

2017-02-06 Thread Ananyev, Konstantin


> -Original Message-
> From: Wu, Jingjing
> Sent: Monday, February 6, 2017 8:54 AM
> To: Ananyev, Konstantin ; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2 3/5] net/ixgbe: fix bitmask of supported Tx 
> flags
> 
> 
> 
> > -Original Message-
> > From: Ananyev, Konstantin
> > Sent: Sunday, February 5, 2017 8:11 PM
> > To: Ananyev, Konstantin ; Wu, Jingjing
> > ; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v2 3/5] net/ixgbe: fix bitmask of supported
> > Tx flags
> >
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ananyev,
> > > Konstantin
> > > Sent: Sunday, February 5, 2017 11:59 AM
> > > To: Wu, Jingjing ; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2 3/5] net/ixgbe: fix bitmask of
> > > supported Tx flags
> > >
> > > Hi Jingjing,
> > >
> > >
> > > > -Original Message-
> > > > From: Wu, Jingjing
> > > > Sent: Saturday, February 4, 2017 3:36 AM
> > > > To: dev@dpdk.org
> > > > Cc: Wu, Jingjing ; Ananyev, Konstantin
> > > > 
> > > > Subject: [PATCH v2 3/5] net/ixgbe: fix bitmask of supported Tx flags
> > > >
> > > > Add missed flags to bitmask of all supported packet Tx flags.
> > > >
> > > > CC: konstantin.anan...@intel.com
> > > > Fixes: 7829b8d52be0 ("net/ixgbe: add Tx preparation")
> > > > Signed-off-by: Jingjing Wu 
> > > > ---
> > > >  drivers/net/ixgbe/ixgbe_rxtx.c | 17 -
> > > >  1 file changed, 16 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
> > > > b/drivers/net/ixgbe/ixgbe_rxtx.c index 36f1c02..8454581 100644
> > > > --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> > > > +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> > > > @@ -81,13 +81,28 @@
> > > >  #include "ixgbe_rxtx.h"
> > > >
> > > >  /* Bit Mask to indicate what bits required for building TX context
> > > > */
> > > > +#ifdef RTE_LIBRTE_IEEE1588
> > > >  #define IXGBE_TX_OFFLOAD_MASK ( \
> > > > PKT_TX_VLAN_PKT |\
> > > > PKT_TX_IP_CKSUM |\
> > > > +   PKT_TX_IPV4 |\
> > > > PKT_TX_L4_MASK | \
> > > > +   PKT_TX_IEEE1588_TMST |   \
> > > > PKT_TX_TCP_SEG | \
> > > > PKT_TX_MACSEC |  \
> > > > -   PKT_TX_OUTER_IP_CKSUM)
> > > > +   PKT_TX_OUTER_IP_CKSUM |  \
> > > > +   PKT_TX_OUTER_IPV4)
> > > > +#else
> > > > +#define IXGBE_TX_OFFLOAD_MASK ( \
> > > > +   PKT_TX_VLAN_PKT |\
> > > > +   PKT_TX_IP_CKSUM |\
> > > > +   PKT_TX_IPV4 |\
> > >
> > > Wonder why ixgbe doesn't have PKT_TX_IPV6?
> >
> > Same question for e1000 and fm10k.
> > Also if you decided to go that way, you'll probably need to update
> > TX_OFFLOAD_MASK for enic and vmxnet3.
> > That's why I still think it would be much less hassle not to include these 
> > flags
> > (PKT_TX_IPV4 and PKT_TX_IPV6)  into TX_OFFLOAD_MASK at all.
> > Konstantin
> >
> >
> Thanks for pointing that. PKT_TX_IPV6 is missed.
> About whether include these flags (PKT_TX_IPV4 and PKT_TX_IPV6)  into 
> TX_OFFLOAD_MASK, I think they should be
> Included. Think about one NIC may support IPV4 L4 checksum offload, but not 
> support IPV6? Even I don't know who it is.
> 

I don't think such combination is possible now anyway.
But ok, if your preference is it to do more work and add (PKT_TX_IPV4 | 
PKT_TX_IPV6)
into all required places, I wouldn't argue.

BTW, as a side notice, what will be really good is to have a function that 
would take
tx_offload_capabilities as an input and return tx_offload_mask.
That would remove necessity to update/support TX_OFFLOAD_MASK for each PMD,
and hopefully will allow to avoid confusion for PMD writers. 
Though that's probably subject of another patch.

Konstantin   



Re: [dpdk-dev] buf->hash.rss always empty with i40e

2017-02-06 Thread Ananyev, Konstantin
Hi Tom,

> 
> That also leave the question of how to HASH only on the IP tuple for TCP and 
> UDP packets? The use case is that I want all packets from the
> same IP src/dst pair to be hashed to the same queue. This cannot be enforced 
> with a complete hash on TCP/UDP fields.
> 

That's for IPv4 only?
It was a while ago, when I looked at it, but as I remember you can achieve that 
by filling first 64 bits of RSS hash with some meaningful values,
and keeping remaining RSS bits as zeroes.
You probably can give it a quick try with testpmd.

There is a nice article about similar subject:
http://www.ndsl.kaist.edu/~kyoungsoo/papers/TR-symRSS.pdf

Konstantin



[dpdk-dev] [PATCH] app/crypto-perf: fix dereference null return value

2017-02-06 Thread Slawomir Mrozowicz
Dereferencing a pointer that might be null key_token when calling strstr.
Check if the pointer is null before.

Coverity issue: 141071
Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test application")

Signed-off-by: Slawomir Mrozowicz 
---
 app/test-crypto-perf/cperf_test_vector_parsing.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/test-crypto-perf/cperf_test_vector_parsing.c 
b/app/test-crypto-perf/cperf_test_vector_parsing.c
index e0bcb20..a7d7b51 100644
--- a/app/test-crypto-perf/cperf_test_vector_parsing.c
+++ b/app/test-crypto-perf/cperf_test_vector_parsing.c
@@ -240,7 +240,7 @@ parse_entry(char *entry, struct cperf_test_vector *vector,
 
/* get values for key */
token = strtok(NULL, CPERF_ENTRY_DELIMITER);
-   if (token == NULL) {
+   if (key_token == NULL || token == NULL) {
printf("Expected 'key = values' but was '%.40s'..\n",
key_token);
return -1;
-- 
2.5.0



[dpdk-dev] [PATCH] eventdev: update event port link and unlink callbacks

2017-02-06 Thread Nipun Gupta
Added a pointer to the rte_eventdev type in the event port
link and unlink callbacks. This device shall be used by some
of the event drivers to fetch queue related information.

Also, update the skeleton eventdev driver with corresponding changes.

Signed-off-by: Nipun Gupta 
---
 drivers/event/skeleton/skeleton_eventdev.c | 8 +---
 lib/librte_eventdev/rte_eventdev.c | 8 
 lib/librte_eventdev/rte_eventdev_pmd.h | 8 ++--
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/drivers/event/skeleton/skeleton_eventdev.c 
b/drivers/event/skeleton/skeleton_eventdev.c
index 085cb86..9330d74 100644
--- a/drivers/event/skeleton/skeleton_eventdev.c
+++ b/drivers/event/skeleton/skeleton_eventdev.c
@@ -287,13 +287,14 @@
 }
 
 static int
-skeleton_eventdev_port_link(void *port,
+skeleton_eventdev_port_link(struct rte_eventdev *dev, void *port,
const uint8_t queues[], const uint8_t priorities[],
uint16_t nb_links)
 {
struct skeleton_port *sp = port;
PMD_DRV_FUNC_TRACE();
 
+   RTE_SET_USED(dev);
RTE_SET_USED(sp);
RTE_SET_USED(queues);
RTE_SET_USED(priorities);
@@ -303,12 +304,13 @@
 }
 
 static int
-skeleton_eventdev_port_unlink(void *port, uint8_t queues[],
-uint16_t nb_unlinks)
+skeleton_eventdev_port_unlink(struct rte_eventdev *dev, void *port,
+uint8_t queues[], uint16_t nb_unlinks)
 {
struct skeleton_port *sp = port;
PMD_DRV_FUNC_TRACE();
 
+   RTE_SET_USED(dev);
RTE_SET_USED(sp);
RTE_SET_USED(queues);
 
diff --git a/lib/librte_eventdev/rte_eventdev.c 
b/lib/librte_eventdev/rte_eventdev.c
index c8f3e94..c7452f0 100644
--- a/lib/librte_eventdev/rte_eventdev.c
+++ b/lib/librte_eventdev/rte_eventdev.c
@@ -801,8 +801,8 @@
if (queues[i] >= RTE_EVENT_MAX_QUEUES_PER_DEV)
return -EINVAL;
 
-   diag = (*dev->dev_ops->port_link)(dev->data->ports[port_id], queues,
-   priorities, nb_links);
+   diag = (*dev->dev_ops->port_link)(dev, dev->data->ports[port_id],
+   queues, priorities, nb_links);
if (diag < 0)
return diag;
 
@@ -846,8 +846,8 @@
if (queues[i] >= RTE_EVENT_MAX_QUEUES_PER_DEV)
return -EINVAL;
 
-   diag = (*dev->dev_ops->port_unlink)(dev->data->ports[port_id], queues,
-   nb_unlinks);
+   diag = (*dev->dev_ops->port_unlink)(dev, dev->data->ports[port_id],
+   queues, nb_unlinks);
 
if (diag < 0)
return diag;
diff --git a/lib/librte_eventdev/rte_eventdev_pmd.h 
b/lib/librte_eventdev/rte_eventdev_pmd.h
index c84c9a2..2e3e5d3 100644
--- a/lib/librte_eventdev/rte_eventdev_pmd.h
+++ b/lib/librte_eventdev/rte_eventdev_pmd.h
@@ -357,6 +357,8 @@ typedef int (*eventdev_port_setup_t)(struct rte_eventdev 
*dev,
 /**
  * Link multiple source event queues to destination event port.
  *
+ * @param dev
+ *   Event device pointer
  * @param port
  *   Event port pointer
  * @param link
@@ -372,13 +374,15 @@ typedef int (*eventdev_port_setup_t)(struct rte_eventdev 
*dev,
  *   Returns 0 on success.
  *
  */
-typedef int (*eventdev_port_link_t)(void *port,
+typedef int (*eventdev_port_link_t)(struct rte_eventdev *dev, void *port,
const uint8_t queues[], const uint8_t priorities[],
uint16_t nb_links);
 
 /**
  * Unlink multiple source event queues from destination event port.
  *
+ * @param dev
+ *   Event device pointer
  * @param port
  *   Event port pointer
  * @param queues
@@ -390,7 +394,7 @@ typedef int (*eventdev_port_link_t)(void *port,
  *   Returns 0 on success.
  *
  */
-typedef int (*eventdev_port_unlink_t)(void *port,
+typedef int (*eventdev_port_unlink_t)(struct rte_eventdev *dev, void *port,
uint8_t queues[], uint16_t nb_unlinks);
 
 /**
-- 
1.9.1



[dpdk-dev] 17.02-rc2: i40e and LSC

2017-02-06 Thread Ivan Nardi
Hi guys
we are upgrading our application from dpdk 16.11 to 17.02-rc2 and we are
facing a strange behavior with i40e driver and LSC
If we call rte_eth_dev_configure() with port_conf.intr_conf.lsc = 1, the
call always failed with error

Configuring Port 0 (socket 1)
rte_eth_dev_configure: driver net_i40e does not support lsc
Fail to configure port 0

Everything is fine with 16.11 or with port_conf.intr_conf.lsc = 0 (default
value)

Is this the intended (new) behavior or am I missing something obvious here?
Thanks in advance

Ivan


[dpdk-dev] [PATCH] doc: add known issue for virtio TSO with clones

2017-02-06 Thread Olivier Matz
Document the issue with Tso on shared packets.

Signed-off-by: Olivier Matz 
---
 doc/guides/rel_notes/known_issues.rst | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/doc/guides/rel_notes/known_issues.rst 
b/doc/guides/rel_notes/known_issues.rst
index 018e999..f149ac7 100644
--- a/doc/guides/rel_notes/known_issues.rst
+++ b/doc/guides/rel_notes/known_issues.rst
@@ -666,3 +666,34 @@ uio pci generic module bind failed in X710/XL710/XXV710
 
 **Driver/Module**:
Poll Mode Driver (PMD).
+
+
+virtio tx_burst() function cannot do TSO on shared packets
+--
+
+**Description**:
+   The standard TX function of virtio driver does not manage shared
+   packets properly when doing TSO. These packets should be read-only
+   but the driver modifies them.
+
+   When doing TSO, the virtio standard expects that the L4 checksum is
+   set to the pseudo header checksum in the packet data, which is
+   different than the DPDK API. The driver patches the L4 checksum to
+   conform to the virtio standard, but this solution is invalid when
+   dealing with shared packets (clones), because the packet data should
+   not be modified.
+
+**Implication**:
+   In this situation, the shared data will be modified by the driver,
+   potentially causing race conditions with the other users of the mbuf
+   data.
+
+**Resolution/Workaround**:
+   The workaround in the application is to ensure that the network
+   headers in the packet data are not shared.
+
+**Affected Environment/Platform**:
+   Virtual machines running a virtio driver.
+
+**Driver/Module**:
+   Poll Mode Driver (PMD).
-- 
2.8.1



Re: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix gcc compilation under FreeBSD

2017-02-06 Thread De Lara Guarch, Pablo
Hi Daniel,

> -Original Message-
> From: Mrzyglod, DanielX T
> Sent: Friday, February 03, 2017 12:10 PM
> To: De Lara Guarch, Pablo; Thomas Monjalon; Mrozowicz, SlawomirX
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix gcc compilation
> under FreeBSD
> 
> 
> 
> >-Original Message-
> >From: De Lara Guarch, Pablo
> >Sent: Friday, February 03, 2017 12:31 PM
> >To: Mrzyglod, DanielX T ; Thomas
> Monjalon
> >; Mrozowicz, SlawomirX
> >
> >Cc: dev@dpdk.org
> >Subject: RE: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix gcc
> compilation
> >under FreeBSD
> >
> >Hi Thomas,
> >
> >> -Original Message-
> >> From: Mrzyglod, DanielX T
> >> Sent: Thursday, February 02, 2017 3:18 PM
> >> To: Thomas Monjalon; Mrozowicz, SlawomirX; De Lara Guarch, Pablo
> >> Cc: dev@dpdk.org
> >> Subject: RE: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix gcc
> compilation
> >> under FreeBSD
> >>
> >>
> >>
> >> >-Original Message-
> >> >From: Thomas Monjalon [mailto:thomas.monja...@6wind.com]
> >> >Sent: Thursday, February 02, 2017 2:57 PM
> >> >To: Mrzyglod, DanielX T ; Mrozowicz,
> >> SlawomirX
> >> >; De Lara Guarch, Pablo
> >> >
> >> >Cc: dev@dpdk.org
> >> >Subject: Re: [dpdk-dev] [PATCH v2] app/test-crypto-perf: fix gcc
> >> compilation
> >> >under FreeBSD
> >> >
> >> >Hi,
> >> >
> >> >The error is not specific to GCC.
> >> >I can reproduce it with x86_64-native-bsdapp-clang.
> >> >
> >> I can reproduce it only at FreeBSD 10.3 gcc48
> >>
> >> I have mindblow about it.
> >> When you see patchwork report:
> >> http://dpdk.org/dev/patchwork/patch/19998/   -  Everything is ok.
> >> But it isn't due to daily raport: http://dpdk.org/ml/archives/test-
> >> report/2017-January/010203.html
> >>
> >> For clang it was working so I prepared patch for gcc.
> >
> >Could you say which clang version you have?
> >In patch report, clang version on FreeBSD is 3.4.1, and compilation is OK.
> >We might need to include a newer one.
> >
> 
> Today I reproduced it  also at clang 3.8 & 3.4.1 on fresh git clone so Thomas
> was right.
> Today I add clang-3.8 5 to vm and 3.4.1 compilation stopped working.
> The patch name could be changed to "fix compilation under FreeBSD"

Could you send a newer version then?

Thanks,
Pablo



[dpdk-dev] [RFC 0/8] eal: dynamic logs

2017-02-06 Thread Olivier Matz
The objective of this RFC patchset is to introduce a framework to
support dynamic log types in EAL. It also provides one example of use
(in i40e).

Features:
- log types are identified by a string
- at registration, a uniq identifier is associated to a log type
- each log type can have its level changed dynamically
- extend command line parameters to set the log level of a specific
  type, or logs matching a regular expression
- keep compat with other legacy types (eal, malloc, ring, user*,
  etc... keep their hardcoded log type value)

At the end, when, we can expect that all non-dataplane logs are moved to
be dynamic, so we can enable/disable them at runtime, without
recompiling. Many debug options can probably be removed from
configuration:
  $ git grep DEBUG config/common_base | wc -l
  89

Comments are welcome!


Olivier Matz (8):
  eal: support dynamic log types
  eal: dump registered log types
  eal: change several log levels matching a regexp
  eal: change specific log levels at startup
  eal: deprecate log functions
  app/test: new command to dump log types
  app/testpmd: new command to dump log types
  net/i40e: use dynamic log type for control logs

 app/test-pmd/cmdline.c  |   5 +-
 app/test/commands.c |   4 +-
 app/test/test_logs.c|  12 +-
 doc/guides/contributing/coding_style.rst|  30 ++--
 drivers/net/i40e/i40e_ethdev.c  |  18 +-
 drivers/net/i40e/i40e_fdir.c|   4 -
 drivers/net/i40e/i40e_logs.h|  17 +-
 examples/quota_watermark/qw/main.c  |   2 +-
 examples/quota_watermark/qwctl/qwctl.c  |   2 +-
 lib/librte_eal/bsdapp/eal/eal.c |   4 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |   6 +
 lib/librte_eal/common/eal_common_log.c  | 219 +++-
 lib/librte_eal/common/eal_common_options.c  |  48 --
 lib/librte_eal/common/include/rte_log.h |  76 +++-
 lib/librte_eal/linuxapp/eal/eal.c   |   4 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |   6 +
 16 files changed, 394 insertions(+), 63 deletions(-)

-- 
2.8.1



[dpdk-dev] [RFC 1/8] eal: support dynamic log types

2017-02-06 Thread Olivier Matz
Introduce 2 new functions to support dynamic log types:

- rte_log_register(): register a log name, and return a log type id
- rte_log_set_level(): set the log level of a given log type

Signed-off-by: Olivier Matz 
---
 app/test/test_logs.c|   6 +-
 doc/guides/contributing/coding_style.rst|  30 +++--
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |   2 +
 lib/librte_eal/common/eal_common_log.c  | 140 +++-
 lib/librte_eal/common/include/rte_log.h |  35 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |   2 +
 6 files changed, 197 insertions(+), 18 deletions(-)

diff --git a/app/test/test_logs.c b/app/test/test_logs.c
index 6985ddd..805c568 100644
--- a/app/test/test_logs.c
+++ b/app/test/test_logs.c
@@ -62,8 +62,8 @@ static int
 test_logs(void)
 {
/* enable these logs type */
-   rte_set_log_type(RTE_LOGTYPE_TESTAPP1, 1);
-   rte_set_log_type(RTE_LOGTYPE_TESTAPP2, 1);
+   rte_log_set_level(RTE_LOGTYPE_TESTAPP1, RTE_LOG_EMERG);
+   rte_log_set_level(RTE_LOGTYPE_TESTAPP2, RTE_LOG_EMERG);
 
/* log in error level */
rte_set_log_level(RTE_LOG_ERR);
@@ -76,7 +76,7 @@ test_logs(void)
RTE_LOG(CRIT, TESTAPP2, "critical message\n");
 
/* disable one log type */
-   rte_set_log_type(RTE_LOGTYPE_TESTAPP2, 0);
+   rte_log_set_level(RTE_LOGTYPE_TESTAPP2, RTE_LOG_DEBUG);
 
/* log in error level */
rte_set_log_level(RTE_LOG_ERR);
diff --git a/doc/guides/contributing/coding_style.rst 
b/doc/guides/contributing/coding_style.rst
index 4163960..d8e4a0f 100644
--- a/doc/guides/contributing/coding_style.rst
+++ b/doc/guides/contributing/coding_style.rst
@@ -603,22 +603,30 @@ In the DPDK environment, use the logging interface 
provided:
 
 .. code-block:: c
 
- #define RTE_LOGTYPE_TESTAPP1 RTE_LOGTYPE_USER1
- #define RTE_LOGTYPE_TESTAPP2 RTE_LOGTYPE_USER2
+ /* register log types for this application */
+ int my_logtype1 = rte_log_register("myapp.log1");
+ int my_logtype2 = rte_log_register("myapp.log2");
 
- /* enable these logs type */
- rte_set_log_type(RTE_LOGTYPE_TESTAPP1, 1);
- rte_set_log_type(RTE_LOGTYPE_TESTAPP2, 1);
+ /* set global log level to INFO */
+ rte_log_set_global_level(RTE_LOG_INFO);
+
+ /* only display messages higher than NOTICE for log2 (default
+  * is DEBUG) */
+ rte_log_set_level(my_logtype2, RTE_LOG_NOTICE);
+
+ /* enable all PMD logs (whose identifier string starts with "pmd") */
+ rte_log_set_level_regexp("pmd.*", RTE_LOG_DEBUG);
 
  /* log in debug level */
- rte_set_log_level(RTE_LOG_DEBUG);
- RTE_LOG(DEBUG, TESTAPP1, "this is is a debug level message\n");
- RTE_LOG(INFO, TESTAPP1, "this is is a info level message\n");
- RTE_LOG(WARNING, TESTAPP1, "this is is a warning level message\n");
+ rte_log_set_global_level(RTE_LOG_DEBUG);
+ RTE_LOG(DEBUG, my_logtype1, "this is is a debug level message\n");
+ RTE_LOG(INFO, my_logtype1, "this is is a info level message\n");
+ RTE_LOG(WARNING, my_logtype1, "this is is a warning level message\n");
+ RTE_LOG(WARNING, my_logtype2, "this is is a debug level message (not 
displayed)\n");
 
  /* log in info level */
- rte_set_log_level(RTE_LOG_INFO);
- RTE_LOG(DEBUG, TESTAPP2, "debug level message (not displayed)\n");
+ rte_log_set_global_level(RTE_LOG_INFO);
+ RTE_LOG(DEBUG, my_logtype1, "debug level message (not displayed)\n");
 
 Branch Prediction
 ~
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map 
b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 2cf1ac8..85d051c 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -183,5 +183,7 @@ DPDK_17.02 {
rte_bus_register;
rte_bus_scan;
rte_bus_unregister;
+   rte_log_register;
+   rte_log_set_level;
 
 } DPDK_16.11;
diff --git a/lib/librte_eal/common/eal_common_log.c 
b/lib/librte_eal/common/eal_common_log.c
index 2197558..cc8da26 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -35,7 +35,10 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
+#include 
 #include 
 #include 
 
@@ -46,6 +49,8 @@ struct rte_logs rte_logs = {
.type = ~0,
.level = RTE_LOG_DEBUG,
.file = NULL,
+   .dynamic_types_len = 0,
+   .dynamic_types = NULL,
 };
 
 /* Stream to use for logging if rte_logs.file is NULL */
@@ -60,6 +65,11 @@ struct log_cur_msg {
uint32_t logtype;  /**< log type  - see rte_log.h */
 };
 
+struct rte_log_dynamic_type {
+   const char *name;
+   uint32_t loglevel;
+};
+
  /* per core log */
 static RTE_DEFINE_PER_LCORE(struct log_cur_msg, log_cur_msg);
 
@@ -91,10 +101,17 @@ rte_get_log_level(void)
 void
 rte_set_log_type(uint32_t type, int enable)
 {
+   if (type < RTE_LOGTYPE_FIRST_EXT_ID) {
+   if (enable)
+   rte_logs.type |= type;
+   else
+   rte_logs.type &= (~type);
+

[dpdk-dev] [RFC 2/8] eal: dump registered log types

2017-02-06 Thread Olivier Matz
Introduce a function to dump the global level and the registered log
types.

Signed-off-by: Olivier Matz 
---
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  1 +
 lib/librte_eal/common/eal_common_log.c  | 34 +
 lib/librte_eal/common/include/rte_log.h | 10 
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |  1 +
 4 files changed, 46 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map 
b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 85d051c..f0a35b1 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -183,6 +183,7 @@ DPDK_17.02 {
rte_bus_register;
rte_bus_scan;
rte_bus_unregister;
+   rte_log_dump;
rte_log_register;
rte_log_set_level;
 
diff --git a/lib/librte_eal/common/eal_common_log.c 
b/lib/librte_eal/common/eal_common_log.c
index cc8da26..a311279 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -246,6 +246,40 @@ rte_log_init(void)
rte_logs.dynamic_types_len = RTE_LOGTYPE_FIRST_EXT_ID;
 }
 
+static const char *
+loglevel_to_string(uint32_t level)
+{
+   switch (level) {
+   case RTE_LOG_EMERG: return "emerg";
+   case RTE_LOG_ALERT: return "alert";
+   case RTE_LOG_CRIT: return "critical";
+   case RTE_LOG_ERR: return "error";
+   case RTE_LOG_WARNING: return "warning";
+   case RTE_LOG_NOTICE: return "notice";
+   case RTE_LOG_INFO: return "info";
+   case RTE_LOG_DEBUG: return "debug";
+   default: return "unknown";
+   }
+}
+
+/* dump global level and registered log types */
+void
+rte_log_dump(FILE *f)
+{
+   size_t i;
+
+   fprintf(f, "global log level is %s\n",
+   loglevel_to_string(rte_log_get_global_level()));
+
+   for (i = 0; i < rte_logs.dynamic_types_len; i++) {
+   if (rte_logs.dynamic_types[i].name == NULL)
+   continue;
+   fprintf(f, "id %zu: %s, level is %s\n",
+   i, rte_logs.dynamic_types[i].name,
+   loglevel_to_string(rte_logs.dynamic_types[i].loglevel));
+   }
+}
+
 /*
  * Generates a log message The message will be sent in the stream
  * defined by the previous call to rte_openlog_stream().
diff --git a/lib/librte_eal/common/include/rte_log.h 
b/lib/librte_eal/common/include/rte_log.h
index e71887f..97e0c5e 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -209,6 +209,16 @@ int rte_log_cur_msg_logtype(void);
 int rte_log_register(const char *name);
 
 /**
+ * Dump log information.
+ *
+ * Dump the global level and the registered log types.
+ *
+ * @param f
+ *   The output stream where the dump should be sent.
+ */
+void rte_log_dump(FILE *f);
+
+/**
  * Generates a log message.
  *
  * The message will be sent in the stream defined by the previous call
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map 
b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 1471c41..9991bd1 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -187,6 +187,7 @@ DPDK_17.02 {
rte_bus_register;
rte_bus_scan;
rte_bus_unregister;
+   rte_log_dump;
rte_log_register;
rte_log_set_level;
 
-- 
2.8.1



[dpdk-dev] [RFC 3/8] eal: change several log levels matching a regexp

2017-02-06 Thread Olivier Matz
Introduce a function to set the log level of several log types that
match a regular expression.

Signed-off-by: Olivier Matz 
---
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  1 +
 lib/librte_eal/common/eal_common_log.c  | 21 +
 lib/librte_eal/common/include/rte_log.h | 12 
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |  1 +
 4 files changed, 35 insertions(+)

diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map 
b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index f0a35b1..5668d48 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -186,5 +186,6 @@ DPDK_17.02 {
rte_log_dump;
rte_log_register;
rte_log_set_level;
+   rte_log_set_level_regexp;
 
 } DPDK_16.11;
diff --git a/lib/librte_eal/common/eal_common_log.c 
b/lib/librte_eal/common/eal_common_log.c
index a311279..b35c8b2 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -134,6 +135,26 @@ rte_log_set_level(uint32_t type, uint32_t level)
return 0;
 }
 
+/* set level */
+int
+rte_log_set_level_regexp(const char *pattern, uint32_t level)
+{
+   regex_t r;
+   size_t i;
+
+   if (level > RTE_LOG_DEBUG)
+   return -1;
+
+   for (i = 0; i < rte_logs.dynamic_types_len; i++) {
+   if (rte_logs.dynamic_types[i].name == NULL)
+   continue;
+   if (regexec(&r, pattern, 0, NULL, 0) == 0)
+   rte_logs.dynamic_types[i].loglevel = level;
+   }
+
+   return 0;
+}
+
 /* get the current loglevel for the message beeing processed */
 int rte_log_cur_msg_loglevel(void)
 {
diff --git a/lib/librte_eal/common/include/rte_log.h 
b/lib/librte_eal/common/include/rte_log.h
index 97e0c5e..ce48b07 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -157,6 +157,18 @@ uint32_t rte_get_log_type(void);
 /**
  * Set the log level for a given type.
  *
+ * @param pattern
+ *   The regexp identifying the log type.
+ * @param level
+ *   The level to be set.
+ * @return
+ *   0 on success, a negative value if level is invalid.
+ */
+int rte_log_set_level_regexp(const char *pattern, uint32_t level);
+
+/**
+ * Set the log level for a given type.
+ *
  * @param logtype
  *   The log type identifier.
  * @param level
diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map 
b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
index 9991bd1..f133c4f 100644
--- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map
@@ -190,5 +190,6 @@ DPDK_17.02 {
rte_log_dump;
rte_log_register;
rte_log_set_level;
+   rte_log_set_level_regexp;
 
 } DPDK_16.11;
-- 
2.8.1



[dpdk-dev] [RFC 5/8] eal: deprecate log functions

2017-02-06 Thread Olivier Matz
Deprecate the following functions:
- rte_set_log_level(), replaced by rte_log_set_global_level()
- rte_get_log_level(), replaced by rte_log_get_global_level()
- rte_set_log_type(), rte_log_set_level()
- rte_get_log_type(), rte_log_get_level()

The new functions provide a better control of the per-type log level,
and have a better name prefix (rte_log_).

Signed-off-by: Olivier Matz 
---
 app/test/test_logs.c|  6 +++---
 examples/quota_watermark/qw/main.c  |  2 +-
 examples/quota_watermark/qwctl/qwctl.c  |  2 +-
 lib/librte_eal/bsdapp/eal/rte_eal_version.map   |  2 ++
 lib/librte_eal/common/eal_common_log.c  | 24 
 lib/librte_eal/common/include/rte_log.h | 19 +++
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |  2 ++
 7 files changed, 48 insertions(+), 9 deletions(-)

diff --git a/app/test/test_logs.c b/app/test/test_logs.c
index 805c568..730a86b 100644
--- a/app/test/test_logs.c
+++ b/app/test/test_logs.c
@@ -66,12 +66,12 @@ test_logs(void)
rte_log_set_level(RTE_LOGTYPE_TESTAPP2, RTE_LOG_EMERG);
 
/* log in error level */
-   rte_set_log_level(RTE_LOG_ERR);
+   rte_log_set_global_level(RTE_LOG_ERR);
RTE_LOG(ERR, TESTAPP1, "error message\n");
RTE_LOG(CRIT, TESTAPP1, "critical message\n");
 
/* log in critical level */
-   rte_set_log_level(RTE_LOG_CRIT);
+   rte_log_set_global_level(RTE_LOG_CRIT);
RTE_LOG(ERR, TESTAPP2, "error message (not displayed)\n");
RTE_LOG(CRIT, TESTAPP2, "critical message\n");
 
@@ -79,7 +79,7 @@ test_logs(void)
rte_log_set_level(RTE_LOGTYPE_TESTAPP2, RTE_LOG_DEBUG);
 
/* log in error level */
-   rte_set_log_level(RTE_LOG_ERR);
+   rte_log_set_global_level(RTE_LOG_ERR);
RTE_LOG(ERR, TESTAPP1, "error message\n");
RTE_LOG(ERR, TESTAPP2, "error message (not displayed)\n");
 
diff --git a/examples/quota_watermark/qw/main.c 
b/examples/quota_watermark/qw/main.c
index 9162e28..b13df5b 100644
--- a/examples/quota_watermark/qw/main.c
+++ b/examples/quota_watermark/qw/main.c
@@ -311,7 +311,7 @@ main(int argc, char **argv)
 
 uint8_t port_id;
 
-rte_set_log_level(RTE_LOG_INFO);
+rte_log_set_global_level(RTE_LOG_INFO);
 
 ret = rte_eal_init(argc, argv);
 if (ret < 0)
diff --git a/examples/quota_watermark/qwctl/qwctl.c 
b/examples/quota_watermark/qwctl/qwctl.c
index 29c501c..d2c48b8 100644
--- a/examples/quota_watermark/qwctl/qwctl.c
+++ b/examples/quota_watermark/qwctl/qwctl.c
@@ -75,7 +75,7 @@ int main(int argc, char **argv)
 int ret;
 struct cmdline *cl;
 
-rte_set_log_level(RTE_LOG_INFO);
+rte_log_set_global_level(RTE_LOG_INFO);
 
 ret = rte_eal_init(argc, argv);
 if (ret < 0)
diff --git a/lib/librte_eal/bsdapp/eal/rte_eal_version.map 
b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
index 5668d48..7e190cf 100644
--- a/lib/librte_eal/bsdapp/eal/rte_eal_version.map
+++ b/lib/librte_eal/bsdapp/eal/rte_eal_version.map
@@ -185,6 +185,8 @@ DPDK_17.02 {
rte_bus_unregister;
rte_log_dump;
rte_log_register;
+   rte_log_get_global_level;
+   rte_log_set_global_level;
rte_log_set_level;
rte_log_set_level_regexp;
 
diff --git a/lib/librte_eal/common/eal_common_log.c 
b/lib/librte_eal/common/eal_common_log.c
index b35c8b2..3888044 100644
--- a/lib/librte_eal/common/eal_common_log.c
+++ b/lib/librte_eal/common/eal_common_log.c
@@ -86,20 +86,36 @@ rte_openlog_stream(FILE *f)
 
 /* Set global log level */
 void
-rte_set_log_level(uint32_t level)
+rte_log_set_global_level(uint32_t level)
 {
rte_logs.level = (uint32_t)level;
 }
 
+/* Set global log level */
+/* replaced by rte_log_set_global_level */
+__rte_deprecated void
+rte_set_log_level(uint32_t level)
+{
+   rte_log_set_global_level(level);
+}
+
 /* Get global log level */
 uint32_t
-rte_get_log_level(void)
+rte_log_get_global_level(void)
 {
return rte_logs.level;
 }
 
+/* Get global log level */
+/* replaced by rte_log_get_global_level */
+uint32_t
+rte_get_log_level(void)
+{
+   return rte_log_get_global_level();
+}
+
 /* Set global log type */
-void
+__rte_deprecated void
 rte_set_log_type(uint32_t type, int enable)
 {
if (type < RTE_LOGTYPE_FIRST_EXT_ID) {
@@ -116,7 +132,7 @@ rte_set_log_type(uint32_t type, int enable)
 }
 
 /* Get global log type */
-uint32_t
+__rte_deprecated uint32_t
 rte_get_log_type(void)
 {
return rte_logs.type;
diff --git a/lib/librte_eal/common/include/rte_log.h 
b/lib/librte_eal/common/include/rte_log.h
index ce48b07..27c40b4 100644
--- a/lib/librte_eal/common/include/rte_log.h
+++ b/lib/librte_eal/common/include/rte_log.h
@@ -50,6 +50,8 @@ extern "C" {
 #include 
 #include 
 
+#include 
+
 struct rte_log_dynamic_type;
 
 /** The rte_log structure. */
@@ -132,11 +134,26 @@ int rte_openlog_stream(FILE *f);
  * @param level
  *   Log level. A value between RTE_LOG_EMERG (1) and RTE

[dpdk-dev] [RFC 7/8] app/testpmd: new command to dump log types

2017-02-06 Thread Olivier Matz
Signed-off-by: Olivier Matz 
---
 app/test-pmd/cmdline.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 43fc636..849d807 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -7673,6 +7673,8 @@ static void cmd_dump_parsed(void *parsed_result,
rte_mempool_list_dump(stdout);
else if (!strcmp(res->dump, "dump_devargs"))
rte_eal_devargs_dump(stdout);
+   else if (!strcmp(res->dump, "dump_log_types"))
+   rte_log_dump(stdout);
 }
 
 cmdline_parse_token_string_t cmd_dump_dump =
@@ -7682,7 +7684,8 @@ cmdline_parse_token_string_t cmd_dump_dump =
"dump_struct_sizes#"
"dump_ring#"
"dump_mempool#"
-   "dump_devargs");
+   "dump_devargs#"
+   "dump_log_types");
 
 cmdline_parse_inst_t cmd_dump = {
.f = cmd_dump_parsed,  /* function to call */
-- 
2.8.1



[dpdk-dev] [RFC 6/8] app/test: new command to dump log types

2017-02-06 Thread Olivier Matz
Signed-off-by: Olivier Matz 
---
 app/test/commands.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/app/test/commands.c b/app/test/commands.c
index 2df46b0..488dc41 100644
--- a/app/test/commands.c
+++ b/app/test/commands.c
@@ -158,13 +158,15 @@ static void cmd_dump_parsed(void *parsed_result,
rte_mempool_list_dump(stdout);
else if (!strcmp(res->dump, "dump_devargs"))
rte_eal_devargs_dump(stdout);
+   else if (!strcmp(res->dump, "dump_log_types"))
+   rte_log_dump(stdout);
 }
 
 cmdline_parse_token_string_t cmd_dump_dump =
TOKEN_STRING_INITIALIZER(struct cmd_dump_result, dump,
 "dump_physmem#dump_memzone#"
 "dump_struct_sizes#dump_ring#dump_mempool#"
-"dump_devargs");
+"dump_devargs#dump_log_types");
 
 cmdline_parse_inst_t cmd_dump = {
.f = cmd_dump_parsed,  /* function to call */
-- 
2.8.1



[dpdk-dev] [RFC 4/8] eal: change specific log levels at startup

2017-02-06 Thread Olivier Matz
Example of use:
  ./app/test-pmd --log-level='pmd\.i40e.*,8'

  This enables debug logs for all dynamic logs whose type starts with
  'pmd.i40e'.

Signed-off-by: Olivier Matz 
---
 lib/librte_eal/bsdapp/eal/eal.c|  4 +--
 lib/librte_eal/common/eal_common_options.c | 48 +++---
 lib/librte_eal/linuxapp/eal/eal.c  |  4 +--
 3 files changed, 39 insertions(+), 17 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index ee7c9de..e945468 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -505,10 +505,8 @@ rte_eal_init(int argc, char **argv)
 
thread_id = pthread_self();
 
-   eal_log_level_parse(argc, argv);
-
/* set log level as early as possible */
-   rte_set_log_level(internal_config.log_level);
+   eal_log_level_parse(argc, argv);
 
if (rte_eal_cpu_init() < 0)
rte_panic("Cannot detect lcores\n");
diff --git a/lib/librte_eal/common/eal_common_options.c 
b/lib/librte_eal/common/eal_common_options.c
index f36bc55..2682881 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -739,25 +739,53 @@ eal_parse_syslog(const char *facility, struct 
internal_config *conf)
 }
 
 static int
-eal_parse_log_level(const char *level, uint32_t *log_level)
+eal_parse_log_level(const char *arg, struct internal_config *conf)
 {
-   char *end;
+   char *end, *str, *type, *level;
unsigned long tmp;
 
+   str = strdup(arg);
+   if (str == NULL)
+   return -1;
+
+   if (strchr(str, ',') == NULL) {
+   type = NULL;
+   level = str;
+   } else {
+   type = strsep(&str, ",");
+   level = strsep(&str, ",");
+   }
+
errno = 0;
tmp = strtoul(level, &end, 0);
 
/* check for errors */
if ((errno != 0) || (level[0] == '\0') ||
-   end == NULL || (*end != '\0'))
-   return -1;
+   end == NULL || (*end != '\0'))
+   goto fail;
 
/* log_level is a uint32_t */
if (tmp >= UINT32_MAX)
-   return -1;
+   goto fail;
+
+   printf("set log level %s,%lu\n",
+   type, tmp);
+
+   if (type == NULL) {
+   conf->log_level = tmp;
+   rte_log_set_global_level(tmp);
+   } else if (rte_log_set_level_regexp(type, tmp) < 0) {
+   printf("cannot set log level %s,%lu\n",
+   type, tmp);
+   goto fail;
+   }
 
-   *log_level = tmp;
+   free(str);
return 0;
+
+fail:
+   free(str);
+   return -1;
 }
 
 static enum rte_proc_type_t
@@ -898,15 +926,12 @@ eal_parse_common_option(int opt, const char *optarg,
break;
 
case OPT_LOG_LEVEL_NUM: {
-   uint32_t log;
-
-   if (eal_parse_log_level(optarg, &log) < 0) {
+   if (eal_parse_log_level(optarg, conf) < 0) {
RTE_LOG(ERR, EAL,
"invalid parameters for --"
OPT_LOG_LEVEL "\n");
return -1;
}
-   conf->log_level = log;
break;
}
case OPT_LCORES_NUM:
@@ -1057,7 +1082,8 @@ eal_common_usage(void)
   "  --"OPT_VMWARE_TSC_MAP"Use VMware TSC map instead of 
native RDTSC\n"
   "  --"OPT_PROC_TYPE" Type of this process 
(primary|secondary|auto)\n"
   "  --"OPT_SYSLOG"Set syslog facility\n"
-  "  --"OPT_LOG_LEVEL" Set default log level\n"
+  "  --"OPT_LOG_LEVEL"=  Set global log level\n"
+  "  --"OPT_LOG_LEVEL"=,   Set specific log level\n"
   "  -v  Display version information on startup\n"
   "  -h, --help  This help\n"
   "\nEAL options for DEBUG use only:\n"
diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
b/lib/librte_eal/linuxapp/eal/eal.c
index bf6b818..0f61447 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -762,10 +762,8 @@ rte_eal_init(int argc, char **argv)
 
thread_id = pthread_self();
 
-   eal_log_level_parse(argc, argv);
-
/* set log level as early as possible */
-   rte_set_log_level(internal_config.log_level);
+   eal_log_level_parse(argc, argv);
 
if (rte_eal_cpu_init() < 0)
rte_panic("Cannot detect lcores\n");
-- 
2.8.1



[dpdk-dev] [RFC 8/8] net/i40e: use dynamic log type for control logs

2017-02-06 Thread Olivier Matz
This is an example of how a dynamic log type can be used in a
PMD.

Signed-off-by: Olivier Matz 
---
 drivers/net/i40e/i40e_ethdev.c | 18 --
 drivers/net/i40e/i40e_fdir.c   |  4 
 drivers/net/i40e/i40e_logs.h   | 17 ++---
 3 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 4492bcc..0beb37f 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -40,6 +40,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -419,6 +420,9 @@ static void i40e_ethertype_filter_restore(struct i40e_pf 
*pf);
 static void i40e_tunnel_filter_restore(struct i40e_pf *pf);
 static void i40e_filter_restore(struct i40e_pf *pf);
 
+int i40e_logtype_init;
+int i40e_logtype_driver;
+
 static const struct rte_pci_id pci_id_i40e_map[] = {
{ RTE_PCI_DEVICE(I40E_INTEL_VENDOR_ID, I40E_DEV_ID_SFP_XL710) },
{ RTE_PCI_DEVICE(I40E_INTEL_VENDOR_ID, I40E_DEV_ID_QEMU) },
@@ -5788,7 +5792,6 @@ i40e_dev_interrupt_handler(struct rte_intr_handle 
*intr_handle,
PMD_DRV_LOG(INFO, "No interrupt event");
goto done;
}
-#ifdef RTE_LIBRTE_I40E_DEBUG_DRIVER
if (icr0 & I40E_PFINT_ICR0_ECC_ERR_MASK)
PMD_DRV_LOG(ERR, "ICR0: unrecoverable ECC error");
if (icr0 & I40E_PFINT_ICR0_MAL_DETECT_MASK)
@@ -5803,7 +5806,6 @@ i40e_dev_interrupt_handler(struct rte_intr_handle 
*intr_handle,
PMD_DRV_LOG(ERR, "ICR0: HMC error");
if (icr0 & I40E_PFINT_ICR0_PE_CRITERR_MASK)
PMD_DRV_LOG(ERR, "ICR0: protocol engine critical error");
-#endif /* RTE_LIBRTE_I40E_DEBUG_DRIVER */
 
if (icr0 & I40E_PFINT_ICR0_VFLR_MASK) {
PMD_DRV_LOG(INFO, "ICR0: VF reset detected");
@@ -11187,3 +11189,15 @@ rte_pmd_i40e_reset_vf_stats(uint8_t port,
 
return 0;
 }
+
+RTE_INIT(i40e_init_log);
+static void
+i40e_init_log(void)
+{
+   i40e_logtype_init = rte_log_register("pmd.i40e.init");
+   if (i40e_logtype_init >= 0)
+   rte_log_set_level(i40e_logtype_init, RTE_LOG_NOTICE);
+   i40e_logtype_driver = rte_log_register("pmd.i40e.driver");
+   if (i40e_logtype_driver >= 0)
+   rte_log_set_level(i40e_logtype_driver, RTE_LOG_NOTICE);
+}
diff --git a/drivers/net/i40e/i40e_fdir.c b/drivers/net/i40e/i40e_fdir.c
index 0700253..32d3b19 100644
--- a/drivers/net/i40e/i40e_fdir.c
+++ b/drivers/net/i40e/i40e_fdir.c
@@ -1592,17 +1592,14 @@ i40e_fdir_filter_restore(struct i40e_pf *pf)
struct rte_eth_dev *dev = I40E_VSI_TO_ETH_DEV(pf->main_vsi);
struct i40e_fdir_filter_list *fdir_list = &pf->fdir.fdir_list;
struct i40e_fdir_filter *f;
-#ifdef RTE_LIBRTE_I40E_DEBUG_DRIVER
struct i40e_hw *hw = I40E_PF_TO_HW(pf);
uint32_t fdstat;
uint32_t guarant_cnt;  /**< Number of filters in guaranteed spaces. */
uint32_t best_cnt; /**< Number of filters in best effort spaces. */
-#endif /* RTE_LIBRTE_I40E_DEBUG_DRIVER */
 
TAILQ_FOREACH(f, fdir_list, rules)
i40e_add_del_fdir_filter(dev, &f->fdir, TRUE);
 
-#ifdef RTE_LIBRTE_I40E_DEBUG_DRIVER
fdstat = I40E_READ_REG(hw, I40E_PFQF_FDSTAT);
guarant_cnt =
(uint32_t)((fdstat & I40E_PFQF_FDSTAT_GUARANT_CNT_MASK) >>
@@ -1610,7 +1607,6 @@ i40e_fdir_filter_restore(struct i40e_pf *pf)
best_cnt =
(uint32_t)((fdstat & I40E_PFQF_FDSTAT_BEST_CNT_MASK) >>
   I40E_PFQF_FDSTAT_BEST_CNT_SHIFT);
-#endif /* RTE_LIBRTE_I40E_DEBUG_DRIVER */
 
PMD_DRV_LOG(INFO, "FDIR: Guarant count: %d,  Best count: %d",
guarant_cnt, best_cnt);
diff --git a/drivers/net/i40e/i40e_logs.h b/drivers/net/i40e/i40e_logs.h
index e042e24..8e99cd5 100644
--- a/drivers/net/i40e/i40e_logs.h
+++ b/drivers/net/i40e/i40e_logs.h
@@ -34,14 +34,11 @@
 #ifndef _I40E_LOGS_H_
 #define _I40E_LOGS_H_
 
+extern int i40e_logtype_init;
 #define PMD_INIT_LOG(level, fmt, args...) \
-   RTE_LOG(level, PMD, "%s(): " fmt "\n", __func__, ##args)
-
-#ifdef RTE_LIBRTE_I40E_DEBUG_INIT
+   rte_log(RTE_LOG_ ## level, i40e_logtype_init, "%s(): " fmt "\n", \
+   __func__, ##args)
 #define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
-#else
-#define PMD_INIT_FUNC_TRACE() do { } while(0)
-#endif
 
 #ifdef RTE_LIBRTE_I40E_DEBUG_RX
 #define PMD_RX_LOG(level, fmt, args...) \
@@ -64,12 +61,10 @@
 #define PMD_TX_FREE_LOG(level, fmt, args...) do { } while(0)
 #endif
 
-#ifdef RTE_LIBRTE_I40E_DEBUG_DRIVER
+extern int i40e_logtype_driver;
 #define PMD_DRV_LOG_RAW(level, fmt, args...) \
-   RTE_LOG(level, PMD, "%s(): " fmt, __func__, ## args)
-#else
-#define PMD_DRV_LOG_RAW(level, fmt, args...) do { } while (0)
-#endif
+   rte_log(RTE_LOG_ ## level, i40e_logtype_driver, "%s(): " fmt, \
+   __func__, ## args)
 
 #define PMD_DRV_LOG(level, fmt, args...) \
PMD_DRV_LOG_RAW(level, f

[dpdk-dev] cryptodev - Session and queue pair relationship

2017-02-06 Thread Akhil Goyal

Hi,

I have some issues w.r.t the mapping sessions and queue pairs.

As per my understanding:
- Number of sessions may be large – they are independent of number of 
queue pairs

- Queue pairs are L-core specific
- Depending on the implementation, one queue pair can be mapped to many 
sessions. Or, Only one queue pair for every session- especially in the 
systems having large number of queues (hw).
- Sessions can be created on the fly – typical rekeying use-cases. 
Generally done by the control threads.


There seems to be no straight way for the underlying driver 
implementation to know, what all sessions are mapped to a particular 
queue pair. The session and queue pair information is first time exposed 
in the enqueue command.


One of the NXP Crypto Hardware drivers uses per session data structures 
(descriptors) which need to be configured for hardware queues.  Though 
this information can be extracted from the first enqueue command for a 
particular session, it will add checks in the data path. Also, it will 
bring down the connection setup rate.


In the API rte_cryptodev_sym_session_create(), we create session on a 
particular device, but there is no information of queue pair being shared.


1. We want to propose to change the session create/config API to also 
take queue pair id as argument.

struct rte_cryptodev_sym_session *
rte_cryptodev_sym_session_create(uint8_t dev_id,
  struct rte_crypto_sym_xform *xform) to 
also take “uint16_t qp;”


This will also return “in-use” error, if the underlying hardware only 
support 1 session/descriptor per qp.


2. Currently the application configures the *nb_descriptors* in the 
*rte_cryptodev_queue_pair_setup*. Should we add the queue pair 
capability API?



Please share your feedback, I will submit the patch accordingly.

Regards,
Akhil





Re: [dpdk-dev] [RFC 0/8] eal: dynamic logs

2017-02-06 Thread Bruce Richardson
On Mon, Feb 06, 2017 at 02:29:08PM +0100, Olivier Matz wrote:
> The objective of this RFC patchset is to introduce a framework to
> support dynamic log types in EAL. It also provides one example of use
> (in i40e).
> 
> Features:
> - log types are identified by a string
> - at registration, a uniq identifier is associated to a log type
> - each log type can have its level changed dynamically
> - extend command line parameters to set the log level of a specific
>   type, or logs matching a regular expression
> - keep compat with other legacy types (eal, malloc, ring, user*,
>   etc... keep their hardcoded log type value)
> 
> At the end, when, we can expect that all non-dataplane logs are moved to
> be dynamic, so we can enable/disable them at runtime, without
> recompiling. Many debug options can probably be removed from
> configuration:
>   $ git grep DEBUG config/common_base | wc -l
>   89
> 
> Comments are welcome!
> 
Initial scan through the patches this looks pretty good. However, rather
than continuing with our own logging system, have you investigated what
other tracing and logging systems might be out there that we could
possibly re-use? Could LTTng be a good fit for DPDK, perhaps?

/Bruce


Re: [dpdk-dev] [PATCH] net/ixgbevf: reset hardware when stopping port

2017-02-06 Thread Olivier Matz
Hi,

On Wed, 11 Jan 2017 17:52:04 +0100, Olivier Matz
 wrote:
> From: Guo Fengtian 
> 
> When PF triggers a reset, VF port must acknowledge it by calling
> ixgbe_reset_hw().
> 
> Before this patch, the port link status, speed and duplex are invalid
> (all set to 0).
> 
> The patch move the call to ixgbe_reset_hw() from ixgbevf_dev_close()
> to ixgbevf_dev_stop(), so that after a port restart on the VM, the
> link status is properly marked down, with correct speed and duplex.
> 
> Fixes: f0160874c041 ("ixgbe: various updates")
> 
> CC: sta...@dpdk.org
> Signed-off-by: Guo Fengtian 
> Signed-off-by: David Marchand 
> Signed-off-by: Olivier Matz 
> ---
> 
> Hi,
> 
> The use case for this problem was described some time ago, see:
> http://dpdk.org/ml/archives/dev/2015-December/030067.html
> 
> It is also part of 2.2 release note:
> http://dpdk.org/browse/dpdk/tree/doc/guides/rel_notes/release_2_2.rst#n471
> 
> Regards,
> Olivier

Ping


Thanks,
Olivier


Re: [dpdk-dev] [PATCH] net/ixgbevf: fix stats update after a PF reset

2017-02-06 Thread Olivier Matz
Hi,

On Wed, 11 Jan 2017 18:04:14 +0100, Olivier Matz
 wrote:
> From: Guo Fengtian 
> 
> When PF is set down, in VF, the value of stats register is zero.
> So only increase stats when it's non zero.
> 
> Fixes: af75078fece3 ("first public release")
> 
> CC: sta...@dpdk.org
> Signed-off-by: Guo Fengtian 
> Signed-off-by: Olivier Matz 

Ping


Thanks,
Olivier


Re: [dpdk-dev] [RFC 0/8] eal: dynamic logs

2017-02-06 Thread Olivier Matz
Hi Bruce,

On Mon, 6 Feb 2017 13:49:03 +, Bruce Richardson
 wrote:
> On Mon, Feb 06, 2017 at 02:29:08PM +0100, Olivier Matz wrote:
> > The objective of this RFC patchset is to introduce a framework to
> > support dynamic log types in EAL. It also provides one example of
> > use (in i40e).
> > 
> > Features:
> > - log types are identified by a string
> > - at registration, a uniq identifier is associated to a log type
> > - each log type can have its level changed dynamically
> > - extend command line parameters to set the log level of a specific
> >   type, or logs matching a regular expression
> > - keep compat with other legacy types (eal, malloc, ring, user*,
> >   etc... keep their hardcoded log type value)
> > 
> > At the end, when, we can expect that all non-dataplane logs are
> > moved to be dynamic, so we can enable/disable them at runtime,
> > without recompiling. Many debug options can probably be removed from
> > configuration:
> >   $ git grep DEBUG config/common_base | wc -l
> >   89
> > 
> > Comments are welcome!
> >   
> Initial scan through the patches this looks pretty good. However,
> rather than continuing with our own logging system, have you
> investigated what other tracing and logging systems might be out
> there that we could possibly re-use? Could LTTng be a good fit for
> DPDK, perhaps?

I did not investigate much about existing logging system. I agree
that could be a good alternative too.

On the other hand, I'm not really confident in adding a dependency to
an external lib for a core component like eal. What if we deicide
tomorrow to port dpdk to windows or any baremetal env?

Also, as far as I remember, the components that depend on external
libraries are disabled today because we don't know how to properly
manage the dependency (ex: pmd-pcap, vhost-numa, pmd-mlx, ...).


Olivier




Re: [dpdk-dev] [RFC 0/8] eal: dynamic logs

2017-02-06 Thread Wiles, Keith

> On Feb 6, 2017, at 8:10 AM, Olivier Matz  wrote:
> 
> Hi Bruce,
> 
> On Mon, 6 Feb 2017 13:49:03 +, Bruce Richardson
>  wrote:
>> On Mon, Feb 06, 2017 at 02:29:08PM +0100, Olivier Matz wrote:
>>> The objective of this RFC patchset is to introduce a framework to
>>> support dynamic log types in EAL. It also provides one example of
>>> use (in i40e).
>>> 
>>> Features:
>>> - log types are identified by a string
>>> - at registration, a uniq identifier is associated to a log type
>>> - each log type can have its level changed dynamically
>>> - extend command line parameters to set the log level of a specific
>>>  type, or logs matching a regular expression
>>> - keep compat with other legacy types (eal, malloc, ring, user*,
>>>  etc... keep their hardcoded log type value)
>>> 
>>> At the end, when, we can expect that all non-dataplane logs are
>>> moved to be dynamic, so we can enable/disable them at runtime,
>>> without recompiling. Many debug options can probably be removed from
>>> configuration:
>>>  $ git grep DEBUG config/common_base | wc -l
>>>  89
>>> 
>>> Comments are welcome!
>>> 
>> Initial scan through the patches this looks pretty good. However,
>> rather than continuing with our own logging system, have you
>> investigated what other tracing and logging systems might be out
>> there that we could possibly re-use? Could LTTng be a good fit for
>> DPDK, perhaps?
> 
> I did not investigate much about existing logging system. I agree
> that could be a good alternative too.
> 
> On the other hand, I'm not really confident in adding a dependency to
> an external lib for a core component like eal. What if we deicide
> tomorrow to port dpdk to windows or any baremetal env?
> 
> Also, as far as I remember, the components that depend on external
> libraries are disabled today because we don't know how to properly
> manage the dependency (ex: pmd-pcap, vhost-numa, pmd-mlx, …).

In a previous project I worked in we did not use log levels per say for 
debugging code. We did have a couple general logging for misc type logging.

When debugging you normally only need a couple files or functions that need to 
produce logging output. In that case we defined logging using the file and 
function as the key to determine if the dynamic log messages are output. We use 
a string in the format of "filename:function” we allowed for a wildcard to 
enable all functions in a file or you can select a single function.

Using this type of logging for debugging a system is the most useful if you 
just want general logging then we define something similar to what we have 
today.

> 
> 
> Olivier

Regards,
Keith



[dpdk-dev] [PATCH v1] doc: add distributor library API change notice

2017-02-06 Thread David Hunt
Given that the packet distributor library improvements (1) will
not be in 17.02, I plan on doing some consolidation of the
API for burst operation for 17.05, merging the two api's into
one, with options for single or burst operation.

(1) http://dpdk.org/dev/patchwork/patch/19911/

Signed-off-by: David Hunt 
---
 doc/guides/rel_notes/deprecation.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 755dc65..925e156 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -55,6 +55,12 @@ Deprecation Notices
   and will be removed in 17.02.
   It is replaced by ``rte_mempool_generic_get/put`` functions.
 
+* lib: distributor library API will be changed to incorporate a burst-
+  oriented API. This will include a change to ``rte_distributor_create``
+  to specify which type of instance to create (single or burst), and
+  additional calls for ``rte_poll_pkt_burst`` and ``rte_return_pkt_burst``,
+  among others.
+
 * ethdev: the legacy filter API, including
   ``rte_eth_dev_filter_supported()``, ``rte_eth_dev_filter_ctrl()`` as well
   as filter types MACVLAN, ETHERTYPE, FLEXIBLE, SYN, NTUPLE, TUNNEL, FDIR,
-- 
2.7.4



Re: [dpdk-dev] buf->hash.rss always empty with i40e

2017-02-06 Thread tom . barbette
Hi Konstantin,

It seems a little overkill to play with the key... The XL710 seems to be able 
to hash on IP fields only. It seems only a configuration issue, I'm adding i40e 
maintainers in CC so they can confirm? I think the i40e driver should configure 
XL710 to hash on IP fields for TCP and UDP when only ETH_RSS_IP is given 
instead of hashing to 0. That would mimic ixgbe behaviour btw.

Tom


- Mail original -
De: "Konstantin Ananyev" 
À: "tom barbette" , "Bruce Richardson" 

Cc: dev@dpdk.org
Envoyé: Lundi 6 Février 2017 13:25:59
Objet: RE: buf->hash.rss always empty with i40e

Hi Tom,

> 
> That also leave the question of how to HASH only on the IP tuple for TCP and 
> UDP packets? The use case is that I want all packets from the
> same IP src/dst pair to be hashed to the same queue. This cannot be enforced 
> with a complete hash on TCP/UDP fields.
> 

That's for IPv4 only?
It was a while ago, when I looked at it, but as I remember you can achieve that 
by filling first 64 bits of RSS hash with some meaningful values,
and keeping remaining RSS bits as zeroes.
You probably can give it a quick try with testpmd.

There is a nice article about similar subject:
http://www.ndsl.kaist.edu/~kyoungsoo/papers/TR-symRSS.pdf

Konstantin



Re: [dpdk-dev] [RFC 0/8] eal: dynamic logs

2017-02-06 Thread Olivier Matz
On Mon, 6 Feb 2017 15:01:30 +, "Wiles, Keith"
 wrote:
> > On Feb 6, 2017, at 8:10 AM, Olivier Matz 
> > wrote:
> > 
> > Hi Bruce,
> > 
> > On Mon, 6 Feb 2017 13:49:03 +, Bruce Richardson
> >  wrote:  
> >> On Mon, Feb 06, 2017 at 02:29:08PM +0100, Olivier Matz wrote:  
> >>> The objective of this RFC patchset is to introduce a framework to
> >>> support dynamic log types in EAL. It also provides one example of
> >>> use (in i40e).
> >>> 
> >>> Features:
> >>> - log types are identified by a string
> >>> - at registration, a uniq identifier is associated to a log type
> >>> - each log type can have its level changed dynamically
> >>> - extend command line parameters to set the log level of a
> >>> specific type, or logs matching a regular expression
> >>> - keep compat with other legacy types (eal, malloc, ring, user*,
> >>>  etc... keep their hardcoded log type value)
> >>> 
> >>> At the end, when, we can expect that all non-dataplane logs are
> >>> moved to be dynamic, so we can enable/disable them at runtime,
> >>> without recompiling. Many debug options can probably be removed
> >>> from configuration:
> >>>  $ git grep DEBUG config/common_base | wc -l
> >>>  89
> >>> 
> >>> Comments are welcome!
> >>>   
> >> Initial scan through the patches this looks pretty good. However,
> >> rather than continuing with our own logging system, have you
> >> investigated what other tracing and logging systems might be out
> >> there that we could possibly re-use? Could LTTng be a good fit for
> >> DPDK, perhaps?  
> > 
> > I did not investigate much about existing logging system. I agree
> > that could be a good alternative too.
> > 
> > On the other hand, I'm not really confident in adding a dependency
> > to an external lib for a core component like eal. What if we deicide
> > tomorrow to port dpdk to windows or any baremetal env?
> > 
> > Also, as far as I remember, the components that depend on external
> > libraries are disabled today because we don't know how to properly
> > manage the dependency (ex: pmd-pcap, vhost-numa, pmd-mlx, …).  
> 
> In a previous project I worked in we did not use log levels per say
> for debugging code. We did have a couple general logging for misc
> type logging.
> 
> When debugging you normally only need a couple files or functions
> that need to produce logging output. In that case we defined logging
> using the file and function as the key to determine if the dynamic
> log messages are output. We use a string in the format of
> "filename:function” we allowed for a wildcard to enable all functions
> in a file or you can select a single function.
> 
> Using this type of logging for debugging a system is the most useful
> if you just want general logging then we define something similar to
> what we have today.

I think the "filename:function" is not adequate if you are not the
developer of that code. Example: you have a problem with a PMD, you
just enable all logs for this PMD (or even all PMDs), you can check it
and if you don't find the problem, you can send them on the ML for help.
I think you don't care where the code is located.

Regards,
Olivier


Re: [dpdk-dev] [PATCH v3 0/3] xen: netfront poll mode driver

2017-02-06 Thread Konrad Rzeszutek Wilk
On Sun, Feb 05, 2017 at 03:44:52PM +0100, Thomas Monjalon wrote:
> Hi Jan,
> 
> 2016-03-22 10:55, Jan Blunck:
> > v3 changes:
> > - removed fake PCI interface
> > - removed struct virt_eth_driver
> > - check for UIO name and version
> > - added basic documentation
> > 
> > Jan Blunck (3):
> >   xen: Add UIO kernel driver
> >   xen: Add netfront poll mode driver
> >   xen: Add documentation
> 
> Any news about this series?

Perhaps reposted with xen-devel being CC-ed?
> It is a long time since last discussion.
> There is a new maintainer Xen-related things: Jianfeng.


Re: [dpdk-dev] [PATCH v2 4/6] net/tap: fix multi-queue support

2017-02-06 Thread Pascal Mazon
On Sun,  5 Feb 2017 10:05:07 -0600
Keith Wiles  wrote:

> At the same time remove the code which created the first device queue
> at probe time. Now all queues are created during queue setup calls.
> 
> Signed-off-by: Keith Wiles 
> ---
>  drivers/net/tap/rte_eth_tap.c | 104
> ++ 1 file changed, 34 insertions(+),
> 70 deletions(-)
> 
> diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
> index 61659bc..7c923a2 100644
> --- a/drivers/net/tap/rte_eth_tap.c
> +++ b/drivers/net/tap/rte_eth_tap.c
> @@ -57,7 +57,11 @@
>  #define ETH_TAP_IFACE_ARG   "iface"
>  #define ETH_TAP_SPEED_ARG   "speed"
>  
> +#ifdef IFF_MULTI_QUEUE
>  #define RTE_PMD_TAP_MAX_QUEUES   16
> +#else
> +#define RTE_PMD_TAP_MAX_QUEUES   1
> +#endif
>  
>  static struct rte_vdev_driver pmd_tap_drv;
>  
> @@ -114,17 +118,20 @@ struct pmd_internals {
>   * supplied name.
>   */
>  static int
> -tun_alloc(char *name)
> +tun_alloc(struct pmd_internals *pmd, uint16_t qid)
>  {
>   struct ifreq ifr;
> +#ifdef IFF_MULTI_QUEUE
>   unsigned int features;
> +#endif
>   int fd;
>  
>   memset(&ifr, 0, sizeof(struct ifreq));
>  
>   ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
> - if (name && name[0])
> - strncpy(ifr.ifr_name, name, IFNAMSIZ);
> + strncpy(ifr.ifr_name, pmd->name, IFNAMSIZ);
> +
> + RTE_LOG(DEBUG, PMD, "ifr_name '%s'\n", ifr.ifr_name);
>  
>   fd = open(TUN_TAP_DEV_PATH, O_RDWR);
>   if (fd < 0) {
> @@ -132,37 +139,26 @@ tun_alloc(char *name)
>   goto error;
>   }
>  
> - /* Grab the TUN features to verify we can work */
> +#ifdef IFF_MULTI_QUEUE
> + /* Grab the TUN features to verify we can work multi-queue */
>   if (ioctl(fd, TUNGETFEATURES, &features) < 0) {
> - RTE_LOG(ERR, PMD, "Unable to get TUN/TAP features\n");
> + RTE_LOG(ERR, PMD, "TAP unable to get TUN/TAP features\n");
>   goto error;
>   }
> - RTE_LOG(DEBUG, PMD, "TUN/TAP Features %08x\n", features);
> + RTE_LOG(DEBUG, PMD, "  TAP Features %08x\n", features);
>  
> -#ifdef IFF_MULTI_QUEUE
> - if (!(features & IFF_MULTI_QUEUE) && (RTE_PMD_TAP_MAX_QUEUES > 1)) {
> - RTE_LOG(DEBUG, PMD, "TUN/TAP device only one queue\n");
> - goto error;
> - } else if ((features & IFF_ONE_QUEUE) &&
> - (RTE_PMD_TAP_MAX_QUEUES == 1)) {
> - ifr.ifr_flags |= IFF_ONE_QUEUE;
> - RTE_LOG(DEBUG, PMD, "Single queue only support\n");
> - } else {
> - ifr.ifr_flags |= IFF_MULTI_QUEUE;
> - RTE_LOG(DEBUG, PMD, "Multi-queue support for %d queues\n",
> + if (features & IFF_MULTI_QUEUE) {
> + RTE_LOG(DEBUG, PMD, "  Multi-queue support for %d queues\n",
>   RTE_PMD_TAP_MAX_QUEUES);
> - }
> -#else
> - if (RTE_PMD_TAP_MAX_QUEUES > 1) {
> - RTE_LOG(DEBUG, PMD, "TUN/TAP device only one queue\n");
> - goto error;
> - } else {
> + ifr.ifr_flags |= IFF_MULTI_QUEUE;
> + } else
> +#endif
> + {
>   ifr.ifr_flags |= IFF_ONE_QUEUE;
>   RTE_LOG(DEBUG, PMD, "Single queue only support\n");
>   }
> -#endif
>  
> - /* Set the TUN/TAP configuration and get the name if needed */
> + /* Set the TUN/TAP configuration and set the name if needed */
>   if (ioctl(fd, TUNSETIFF, (void *)&ifr) < 0) {
>   RTE_LOG(ERR, PMD, "Unable to set TUNSETIFF for %s\n",
>   ifr.ifr_name);
> @@ -177,9 +173,15 @@ tun_alloc(char *name)
>   goto error;
>   }
>  
> - /* If the name is different that new name as default */
> - if (name && strcmp(name, ifr.ifr_name))
> - snprintf(name, RTE_ETH_NAME_MAX_LEN - 1, "%s", ifr.ifr_name);
> + if (qid == 0) {
> + if (ioctl(fd, SIOCGIFHWADDR, &ifr) == -1) {
> + RTE_LOG(ERR, PMD, "ioctl failed (SIOCGIFHWADDR)
> (%s)\n",
> + ifr.ifr_name);
> + goto error;
> + }
> +
> + rte_memcpy(&pmd->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);
> + }
>  
>   return fd;
>  
> @@ -507,7 +509,7 @@ tap_setup_queue(struct rte_eth_dev *dev,
>   if (fd < 0) {
>   RTE_LOG(INFO, PMD, "Add queue to TAP %s for qid
> %d\n", pmd->name, qid);
> - fd = tun_alloc(pmd->name);
> + fd = tun_alloc(pmd, qid);
>   if (fd < 0) {
>   RTE_LOG(ERR, PMD, "tun_alloc(%s) failed\n",
>   pmd->name);
> @@ -629,34 +631,13 @@ static const struct eth_dev_ops ops = {
>  };
>  
>  static int
> -pmd_mac_address(int fd, struct ether_addr *addr)
> -{
> - struct ifreq ifr;
> -
> - if ((fd <= 0) || !addr)
> - return -1;
> -
> - memset(&ifr, 0, sizeof(ifr));
> -
> - if (ioctl(fd, SIOCG

Re: [dpdk-dev] [RFC 0/8] eal: dynamic logs

2017-02-06 Thread Wiles, Keith

> On Feb 6, 2017, at 9:27 AM, Olivier Matz  wrote:
> 
> On Mon, 6 Feb 2017 15:01:30 +, "Wiles, Keith"
>  wrote:
>>> On Feb 6, 2017, at 8:10 AM, Olivier Matz 
>>> wrote:
>>> 
>>> Hi Bruce,
>>> 
>>> On Mon, 6 Feb 2017 13:49:03 +, Bruce Richardson
>>>  wrote:  
 On Mon, Feb 06, 2017 at 02:29:08PM +0100, Olivier Matz wrote:  
> The objective of this RFC patchset is to introduce a framework to
> support dynamic log types in EAL. It also provides one example of
> use (in i40e).
> 
> Features:
> - log types are identified by a string
> - at registration, a uniq identifier is associated to a log type
> - each log type can have its level changed dynamically
> - extend command line parameters to set the log level of a
> specific type, or logs matching a regular expression
> - keep compat with other legacy types (eal, malloc, ring, user*,
> etc... keep their hardcoded log type value)
> 
> At the end, when, we can expect that all non-dataplane logs are
> moved to be dynamic, so we can enable/disable them at runtime,
> without recompiling. Many debug options can probably be removed
> from configuration:
> $ git grep DEBUG config/common_base | wc -l
> 89
> 
> Comments are welcome!
> 
 Initial scan through the patches this looks pretty good. However,
 rather than continuing with our own logging system, have you
 investigated what other tracing and logging systems might be out
 there that we could possibly re-use? Could LTTng be a good fit for
 DPDK, perhaps?  
>>> 
>>> I did not investigate much about existing logging system. I agree
>>> that could be a good alternative too.
>>> 
>>> On the other hand, I'm not really confident in adding a dependency
>>> to an external lib for a core component like eal. What if we deicide
>>> tomorrow to port dpdk to windows or any baremetal env?
>>> 
>>> Also, as far as I remember, the components that depend on external
>>> libraries are disabled today because we don't know how to properly
>>> manage the dependency (ex: pmd-pcap, vhost-numa, pmd-mlx, …).  
>> 
>> In a previous project I worked in we did not use log levels per say
>> for debugging code. We did have a couple general logging for misc
>> type logging.
>> 
>> When debugging you normally only need a couple files or functions
>> that need to produce logging output. In that case we defined logging
>> using the file and function as the key to determine if the dynamic
>> log messages are output. We use a string in the format of
>> "filename:function” we allowed for a wildcard to enable all functions
>> in a file or you can select a single function.
>> 
>> Using this type of logging for debugging a system is the most useful
>> if you just want general logging then we define something similar to
>> what we have today.
> 
> I think the "filename:function" is not adequate if you are not the
> developer of that code. Example: you have a problem with a PMD, you
> just enable all logs for this PMD (or even all PMDs), you can check it
> and if you don't find the problem, you can send them on the ML for help.
> I think you don't care where the code is located.

I do not understand your concern the design allows you to enable a single file, 
which means all functions within a file “filename:*". In the case of the all 
PMDs it not the best way to debug as you get a lot of output that may not be 
even related to the problem you are trying to solve. The design does allow you 
to enable one or more PMDs if say you are debugging say two PMDs. The output 
would be more readable and less cluttered with output that is not germane to 
the problem.

If I was debugging the TAP driver I would like to just enable 
“rte_eth_tap_pmd.c:*” or maybe we can define a something registered other then 
file name e.g. rte_log_register(“tap_pmd”); “tap_pmd:*” or 
“tap_pmd:pmd_rx_burst” or “tap_pmd:rte_tap_pmd_probe”. We could for the PMDs 
just use the PMD name we define at registration.

Maybe the register option brings us closer to the same goal, but add the 
function or selecting a specific set functions. The design does require a more 
active lookup at run time for dynamic debugging and we would have to make sure 
if enabled it does not effect performance. We used a hash table to locate the 
enabled debug log output.

The design allowed us to use the command line or CLI to enable/disable logging 
output.

> 
> Regards,
> Olivier

Regards,
Keith



Re: [dpdk-dev] [PATCH v2 6/6] net/tap: link set down must be before close

2017-02-06 Thread Pascal Mazon
On Sun,  5 Feb 2017 10:05:09 -0600
Keith Wiles  wrote:

> Signed-off-by: Keith Wiles 
> ---
>  drivers/net/tap/rte_eth_tap.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
> index 65e4bab..966e91a 100644
> --- a/drivers/net/tap/rte_eth_tap.c
> +++ b/drivers/net/tap/rte_eth_tap.c
> @@ -353,10 +353,11 @@ tap_dev_stop(struct rte_eth_dev *dev)
>   int i;
>   struct pmd_internals *internals = dev->data->dev_private;
>  
> + tap_link_set_down(dev);
> +
>   for (i = 0; i < internals->nb_queues; i++)
>   if (internals->rxq[i].fd != -1)
>   close(internals->rxq[i].fd);
> - tap_link_set_down(dev);
>  }
>  
>  static int

There's a word missing in your commit title.
Otherwise the patch is absolutely necessary.

Could you add this line to your patch, please, for traceability?

Fixes: ee418a25b0d3 ("net/tap: implement link up and down callbacks")

Thanks for fixing my bug.

Regards,
Pascal


Re: [dpdk-dev] [PATCH v2 4/6] net/tap: fix multi-queue support

2017-02-06 Thread Wiles, Keith

> On Feb 6, 2017, at 9:45 AM, Pascal Mazon  wrote:
> 
> On Sun,  5 Feb 2017 10:05:07 -0600
> Keith Wiles  wrote:
> 
>> At the same time remove the code which created the first device queue
>> at probe time. Now all queues are created during queue setup calls.
>> 
>> Signed-off-by: Keith Wiles 
>> ---
>> drivers/net/tap/rte_eth_tap.c | 104
>> ++ 1 file changed, 34 insertions(+),
>> 70 deletions(-)
>> 
>> diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
>> index 61659bc..7c923a2 100644
>> --- a/drivers/net/tap/rte_eth_tap.c
>> +++ b/drivers/net/tap/rte_eth_tap.c
>> @@ -57,7 +57,11 @@
>> #define ETH_TAP_IFACE_ARG   "iface"
>> #define ETH_TAP_SPEED_ARG   "speed"
>> 
>> +#ifdef IFF_MULTI_QUEUE
>> #define RTE_PMD_TAP_MAX_QUEUES   16
>> +#else
>> +#define RTE_PMD_TAP_MAX_QUEUES  1
>> +#endif
>> 
>> static struct rte_vdev_driver pmd_tap_drv;
>> 
>> @@ -114,17 +118,20 @@ struct pmd_internals {
>>  * supplied name.
>>  */
>> static int
>> -tun_alloc(char *name)
>> +tun_alloc(struct pmd_internals *pmd, uint16_t qid)
>> {
>>  struct ifreq ifr;
>> +#ifdef IFF_MULTI_QUEUE
>>  unsigned int features;
>> +#endif
>>  int fd;
>> 
>>  memset(&ifr, 0, sizeof(struct ifreq));
>> 
>>  ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
>> -if (name && name[0])
>> -strncpy(ifr.ifr_name, name, IFNAMSIZ);
>> +strncpy(ifr.ifr_name, pmd->name, IFNAMSIZ);
>> +
>> +RTE_LOG(DEBUG, PMD, "ifr_name '%s'\n", ifr.ifr_name);
>> 
>>  fd = open(TUN_TAP_DEV_PATH, O_RDWR);
>>  if (fd < 0) {
>> @@ -132,37 +139,26 @@ tun_alloc(char *name)
>>  goto error;
>>  }
>> 
>> -/* Grab the TUN features to verify we can work */
>> +#ifdef IFF_MULTI_QUEUE
>> +/* Grab the TUN features to verify we can work multi-queue */
>>  if (ioctl(fd, TUNGETFEATURES, &features) < 0) {
>> -RTE_LOG(ERR, PMD, "Unable to get TUN/TAP features\n");
>> +RTE_LOG(ERR, PMD, "TAP unable to get TUN/TAP features\n");
>>  goto error;
>>  }
>> -RTE_LOG(DEBUG, PMD, "TUN/TAP Features %08x\n", features);
>> +RTE_LOG(DEBUG, PMD, "  TAP Features %08x\n", features);
>> 
>> -#ifdef IFF_MULTI_QUEUE
>> -if (!(features & IFF_MULTI_QUEUE) && (RTE_PMD_TAP_MAX_QUEUES > 1)) {
>> -RTE_LOG(DEBUG, PMD, "TUN/TAP device only one queue\n");
>> -goto error;
>> -} else if ((features & IFF_ONE_QUEUE) &&
>> -(RTE_PMD_TAP_MAX_QUEUES == 1)) {
>> -ifr.ifr_flags |= IFF_ONE_QUEUE;
>> -RTE_LOG(DEBUG, PMD, "Single queue only support\n");
>> -} else {
>> -ifr.ifr_flags |= IFF_MULTI_QUEUE;
>> -RTE_LOG(DEBUG, PMD, "Multi-queue support for %d queues\n",
>> +if (features & IFF_MULTI_QUEUE) {
>> +RTE_LOG(DEBUG, PMD, "  Multi-queue support for %d queues\n",
>>  RTE_PMD_TAP_MAX_QUEUES);
>> -}
>> -#else
>> -if (RTE_PMD_TAP_MAX_QUEUES > 1) {
>> -RTE_LOG(DEBUG, PMD, "TUN/TAP device only one queue\n");
>> -goto error;
>> -} else {
>> +ifr.ifr_flags |= IFF_MULTI_QUEUE;
>> +} else
>> +#endif
>> +{
>>  ifr.ifr_flags |= IFF_ONE_QUEUE;
>>  RTE_LOG(DEBUG, PMD, "Single queue only support\n");
>>  }
>> -#endif
>> 
>> -/* Set the TUN/TAP configuration and get the name if needed */
>> +/* Set the TUN/TAP configuration and set the name if needed */
>>  if (ioctl(fd, TUNSETIFF, (void *)&ifr) < 0) {
>>  RTE_LOG(ERR, PMD, "Unable to set TUNSETIFF for %s\n",
>>  ifr.ifr_name);
>> @@ -177,9 +173,15 @@ tun_alloc(char *name)
>>  goto error;
>>  }
>> 
>> -/* If the name is different that new name as default */
>> -if (name && strcmp(name, ifr.ifr_name))
>> -snprintf(name, RTE_ETH_NAME_MAX_LEN - 1, "%s", ifr.ifr_name);
>> +if (qid == 0) {
>> +if (ioctl(fd, SIOCGIFHWADDR, &ifr) == -1) {
>> +RTE_LOG(ERR, PMD, "ioctl failed (SIOCGIFHWADDR)
>> (%s)\n",
>> +ifr.ifr_name);
>> +goto error;
>> +}
>> +
>> +rte_memcpy(&pmd->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);
>> +}
>> 
>>  return fd;
>> 
>> @@ -507,7 +509,7 @@ tap_setup_queue(struct rte_eth_dev *dev,
>>  if (fd < 0) {
>>  RTE_LOG(INFO, PMD, "Add queue to TAP %s for qid
>> %d\n", pmd->name, qid);
>> -fd = tun_alloc(pmd->name);
>> +fd = tun_alloc(pmd, qid);
>>  if (fd < 0) {
>>  RTE_LOG(ERR, PMD, "tun_alloc(%s) failed\n",
>>  pmd->name);
>> @@ -629,34 +631,13 @@ static const struct eth_dev_ops ops = {
>> };
>> 
>> static int
>> -pmd_mac_address(int fd, struct ether_addr *addr)
>> -{
>> -struct ifreq ifr;
>> -
>> -if ((fd <= 0) || !addr)
>>

Re: [dpdk-dev] [PATCH v2 6/6] net/tap: link set down must be before close

2017-02-06 Thread Wiles, Keith

> On Feb 6, 2017, at 9:57 AM, Pascal Mazon  wrote:
> 
> Fixes: ee418a25b0d3 ("net/tap: implement link up and down callbacks”)

As I need to add these changes I will produce a v3 patch

Regards,
Keith



Re: [dpdk-dev] [RFC 0/8] eal: dynamic logs

2017-02-06 Thread Olivier Matz
On Mon, 6 Feb 2017 15:55:20 +, "Wiles, Keith"
 wrote:
> > On Feb 6, 2017, at 9:27 AM, Olivier Matz 
> > wrote:
> > 
> > On Mon, 6 Feb 2017 15:01:30 +, "Wiles, Keith"
> >  wrote:  
> >>> On Feb 6, 2017, at 8:10 AM, Olivier Matz 
> >>> wrote:
> >>> 
> >>> Hi Bruce,
> >>> 
> >>> On Mon, 6 Feb 2017 13:49:03 +, Bruce Richardson
> >>>  wrote:
>  On Mon, Feb 06, 2017 at 02:29:08PM +0100, Olivier Matz wrote:
> > The objective of this RFC patchset is to introduce a framework
> > to support dynamic log types in EAL. It also provides one
> > example of use (in i40e).
> > 
> > Features:
> > - log types are identified by a string
> > - at registration, a uniq identifier is associated to a log type
> > - each log type can have its level changed dynamically
> > - extend command line parameters to set the log level of a
> > specific type, or logs matching a regular expression
> > - keep compat with other legacy types (eal, malloc, ring, user*,
> > etc... keep their hardcoded log type value)
> > 
> > At the end, when, we can expect that all non-dataplane logs are
> > moved to be dynamic, so we can enable/disable them at runtime,
> > without recompiling. Many debug options can probably be removed
> > from configuration:
> > $ git grep DEBUG config/common_base | wc -l
> > 89
> > 
> > Comments are welcome!
> >   
>  Initial scan through the patches this looks pretty good. However,
>  rather than continuing with our own logging system, have you
>  investigated what other tracing and logging systems might be out
>  there that we could possibly re-use? Could LTTng be a good fit
>  for DPDK, perhaps?
> >>> 
> >>> I did not investigate much about existing logging system. I agree
> >>> that could be a good alternative too.
> >>> 
> >>> On the other hand, I'm not really confident in adding a dependency
> >>> to an external lib for a core component like eal. What if we
> >>> deicide tomorrow to port dpdk to windows or any baremetal env?
> >>> 
> >>> Also, as far as I remember, the components that depend on external
> >>> libraries are disabled today because we don't know how to properly
> >>> manage the dependency (ex: pmd-pcap, vhost-numa, pmd-mlx, …).
> >> 
> >> In a previous project I worked in we did not use log levels per say
> >> for debugging code. We did have a couple general logging for misc
> >> type logging.
> >> 
> >> When debugging you normally only need a couple files or functions
> >> that need to produce logging output. In that case we defined
> >> logging using the file and function as the key to determine if the
> >> dynamic log messages are output. We use a string in the format of
> >> "filename:function” we allowed for a wildcard to enable all
> >> functions in a file or you can select a single function.
> >> 
> >> Using this type of logging for debugging a system is the most
> >> useful if you just want general logging then we define something
> >> similar to what we have today.  
> > 
> > I think the "filename:function" is not adequate if you are not the
> > developer of that code. Example: you have a problem with a PMD, you
> > just enable all logs for this PMD (or even all PMDs), you can check
> > it and if you don't find the problem, you can send them on the ML
> > for help. I think you don't care where the code is located.  
> 
> I do not understand your concern the design allows you to enable a
> single file, which means all functions within a file “filename:*".

I think the user doesn't want to care about the file name of the C
source.


> In
> the case of the all PMDs it not the best way to debug as you get a
> lot of output that may not be even related to the problem you are
> trying to solve. The design does allow you to enable one or more PMDs
> if say you are debugging say two PMDs. The output would be more
> readable and less cluttered with output that is not germane to the
> problem.
> 
> If I was debugging the TAP driver I would like to just enable
> “rte_eth_tap_pmd.c:*” or maybe we can define a something registered
> other then file name e.g. rte_log_register(“tap_pmd”); “tap_pmd:*” or
> “tap_pmd:pmd_rx_burst” or “tap_pmd:rte_tap_pmd_probe”. We could for
> the PMDs just use the PMD name we define at registration.

That's very similar to what is proposed in the patch...

See:

+   i40e_logtype_init = rte_log_register("pmd.i40e.init");

+   i40e_logtype_driver = rte_log_register("pmd.i40e.driver");


> 
> Maybe the register option brings us closer to the same goal, but add
> the function or selecting a specific set functions. The design does
> require a more active lookup at run time for dynamic debugging and we
> would have to make sure if enabled it does not effect performance. We
> used a hash table to locate the enabled debug log output.

Again, I don't really see the added value to be able enable logs on a
per function basis. For users, it brings m

Re: [dpdk-dev] [PATCH v2] mempool: Introduce _populate_mz_range api

2017-02-06 Thread Olivier Matz
On Tue, 31 Jan 2017 20:02:26 +0530, Santosh Shukla
 wrote:
> Hi Olivier,
> 
> Reply inline.
> 
> On Tue, Jan 31, 2017 at 11:31:51AM +0100, Olivier Matz wrote:
> > Hi Santosh,
> > 
> > I guess this patch is targeted for 17.05, right?
> >  
> 
> Yes.
> 
> > Please see some other comments below.
> > 
> > On Fri, 20 Jan 2017 20:43:41 +0530,
> >  wrote:  
> > > From: Santosh Shukla 
> > > 
> > > HW pool manager e.g. Cavium SoC need s/w to program start and
> > > end address of pool. Currently there is no such api in
> > > ext-mempool.  
> > 
> > Today, the mempool objects are not necessarily contiguous in
> > virtual or physical memory. The only assumption that can be made is
> > that each object is contiguous (virtually and physically). If the
> > flag MEMPOOL_F_NO_PHYS_CONTIG is passed, each object is assured to
> > be contiguous virtually.
> >  
> 
> Some clarification: IIUC then pa/va addr for each hugepage (aka mz)
> is contiguous but no gaurantee of contiguity across the hugepages (aka
> muli-mzs), Right?
> 
> If above said is correct then case like pool start addr in one mz and
> end addr is on another mz is a problem scenario. Correct? That said
> then ext-mempool drivers will get affected in such cases.

Yes that's globally correct, except it is possible that several
hugepages are physically contiguous (it will try to, but it's not
guaranteed).


> 
> > > So introducing _populate_mz_range API which will let HW(pool
> > > manager) know about hugepage mapped virtual start and end
> > > address.  
> > 
> > rte_mempool_ops_populate_mz_range() looks a bit long. What about
> > rte_mempool_ops_populate() instead?  
> 
> Ok.
> 
> > > diff --git a/lib/librte_mempool/rte_mempool.c
> > > b/lib/librte_mempool/rte_mempool.c index 1c2aed8..9a39f5c 100644
> > > --- a/lib/librte_mempool/rte_mempool.c
> > > +++ b/lib/librte_mempool/rte_mempool.c
> > > @@ -568,6 +568,10 @@ static unsigned optimize_object_size(unsigned
> > > obj_size) else
> > >   paddr = mz->phys_addr;
> > >  
> > > + /* Populate mz range */
> > > + if ((mp->flags & MEMPOOL_F_POOL_CREATED) == 0)
> > > + rte_mempool_ops_populate_mz_range(mp,
> > > mz); +
> > >   if (rte_eal_has_hugepages()  
> > 
> > Given what I've said above, I think the populate() callback should
> > be in rte_mempool_populate_phys() instead of
> > rte_mempool_populate_default(). It would be called for each
> > contiguous zone.
> >  
> 
> Ok.
> 
> > > --- a/lib/librte_mempool/rte_mempool.h
> > > +++ b/lib/librte_mempool/rte_mempool.h
> > > @@ -387,6 +387,12 @@ typedef int (*rte_mempool_dequeue_t)(struct
> > > rte_mempool *mp, */
> > >  typedef unsigned (*rte_mempool_get_count)(const struct
> > > rte_mempool *mp); +/**
> > > + * Set the memzone va/pa addr range and len in the external pool.
> > > + */
> > > +typedef void (*rte_mempool_populate_mz_range_t)(struct
> > > rte_mempool *mp,
> > > + const struct rte_memzone *mz);
> > > +  
> > 
> > And this API would be:
> > typedef void (*rte_mempool_populate_t)(struct rte_mempool *mp,
> > char *vaddr, phys_addr_t paddr, size_t len)
> >  
> > 
> > If your hw absolutly needs a contiguous memory, a solution could be:
> > 
> > - add a new flag MEMPOOL_F_CONTIG (maybe a better nale could be
> >   found) saying that all the mempool objects must be contiguous.
> > - add the ops_populate() callback in rte_mempool_populate_phys(), as
> >   suggested above
> > 
> > Then:
> > 
> >   /* create an empty mempool */
> >   rte_mempool_create_empty(...);
> > 
> >   /* set the handler:
> >*   in the ext handler, the mempool flags are updated with
> >*   MEMPOOL_F_CONTIG
> >*/
> >   rte_mempool_set_ops_byname(..., "my_hardware");
> >   
> 
> You mean, ext handler will set mempool flag using 'pool_config'
> param; somethng like rte_mempool_ops_by_name(..., "my_hardware" ,
> MEMPOOL_F_CONTIG); ?

I don't mean changing the API of rte_mempool_set_ops_byname().
I was suggesting something like this:

static int your_handler_alloc(struct rte_mempool *mp)
{
/* handler init */
...

mp->flags |= MEMPOOL_F_CONTIG;
return 0;
}


And update rte_mempool_populate_*() functions to manage this flag:
instead of segment, just fail if it cannot fit in one segment. It won't
really solve the issue, but at least it will be properly detected, and
the user could take dispositions to have more contiguous memory (ex:
use larger hugepages, allocate hugepages at boot time).

> 
> >   /* if MEMPOOL_F_CONTIG is set, all populate() functions should
> > ensure
> >* that there is only one contiguous zone
> >*/
> >   rte_mempool_populate_default(...);
> >   
> 
> I am not too sure about scope of MEMPOOL_F_CONTIG. How
> MEMPOOL_F_CONTIG flag {setted by application/ driver by calling
> rte_mempool_create(..., flag)} can make sure only one contiguous
> zone? Like to understand from you.

As described above, there would be no change from application point of
view. The han

[dpdk-dev] [PATCH] sched: fix segmentation fault when freeing port

2017-02-06 Thread Jan Blunck
From: Alan Dewar 

Prevent a segmentation fault in rte_sched_port_free by only accessing
the port structure after the NULL pointer check has been made.

Fixes: 7b3c4f35 ("sched: fix releasing enqueued packets")

Signed-off-by: Alan Dewar 
Signed-off-by: Jan Blunck 
---
 lib/librte_sched/rte_sched.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index e6dace2..614705d 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -735,12 +735,14 @@ void
 rte_sched_port_free(struct rte_sched_port *port)
 {
uint32_t qindex;
-   uint32_t n_queues_per_port = rte_sched_port_queues_per_port(port);
+   uint32_t n_queues_per_port;
 
/* Check user parameters */
if (port == NULL)
return;
 
+   n_queues_per_port = rte_sched_port_queues_per_port(port);
+
/* Free enqueued mbufs */
for (qindex = 0; qindex < n_queues_per_port; qindex++) {
struct rte_mbuf **mbufs = rte_sched_port_qbase(port, qindex);
-- 
2.7.4



Re: [dpdk-dev] [RFC 0/8] eal: dynamic logs

2017-02-06 Thread Wiles, Keith

> On Feb 6, 2017, at 10:18 AM, Olivier Matz  wrote:
> 
> On Mon, 6 Feb 2017 15:55:20 +, "Wiles, Keith"
>  wrote:
>>> On Feb 6, 2017, at 9:27 AM, Olivier Matz 
>>> wrote:
>>> 
>>> On Mon, 6 Feb 2017 15:01:30 +, "Wiles, Keith"
>>>  wrote:  
> On Feb 6, 2017, at 8:10 AM, Olivier Matz 
> wrote:
> 
> Hi Bruce,
> 
> On Mon, 6 Feb 2017 13:49:03 +, Bruce Richardson
>  wrote:
>> On Mon, Feb 06, 2017 at 02:29:08PM +0100, Olivier Matz wrote:
>>> The objective of this RFC patchset is to introduce a framework
>>> to support dynamic log types in EAL. It also provides one
>>> example of use (in i40e).
>>> 
>>> Features:
>>> - log types are identified by a string
>>> - at registration, a uniq identifier is associated to a log type
>>> - each log type can have its level changed dynamically
>>> - extend command line parameters to set the log level of a
>>> specific type, or logs matching a regular expression
>>> - keep compat with other legacy types (eal, malloc, ring, user*,
>>> etc... keep their hardcoded log type value)
>>> 
>>> At the end, when, we can expect that all non-dataplane logs are
>>> moved to be dynamic, so we can enable/disable them at runtime,
>>> without recompiling. Many debug options can probably be removed
>>> from configuration:
>>> $ git grep DEBUG config/common_base | wc -l
>>> 89
>>> 
>>> Comments are welcome!
>>> 
>> Initial scan through the patches this looks pretty good. However,
>> rather than continuing with our own logging system, have you
>> investigated what other tracing and logging systems might be out
>> there that we could possibly re-use? Could LTTng be a good fit
>> for DPDK, perhaps?
> 
> I did not investigate much about existing logging system. I agree
> that could be a good alternative too.
> 
> On the other hand, I'm not really confident in adding a dependency
> to an external lib for a core component like eal. What if we
> deicide tomorrow to port dpdk to windows or any baremetal env?
> 
> Also, as far as I remember, the components that depend on external
> libraries are disabled today because we don't know how to properly
> manage the dependency (ex: pmd-pcap, vhost-numa, pmd-mlx, …).
 
 In a previous project I worked in we did not use log levels per say
 for debugging code. We did have a couple general logging for misc
 type logging.
 
 When debugging you normally only need a couple files or functions
 that need to produce logging output. In that case we defined
 logging using the file and function as the key to determine if the
 dynamic log messages are output. We use a string in the format of
 "filename:function” we allowed for a wildcard to enable all
 functions in a file or you can select a single function.
 
 Using this type of logging for debugging a system is the most
 useful if you just want general logging then we define something
 similar to what we have today.  
>>> 
>>> I think the "filename:function" is not adequate if you are not the
>>> developer of that code. Example: you have a problem with a PMD, you
>>> just enable all logs for this PMD (or even all PMDs), you can check
>>> it and if you don't find the problem, you can send them on the ML
>>> for help. I think you don't care where the code is located.  
>> 
>> I do not understand your concern the design allows you to enable a
>> single file, which means all functions within a file “filename:*".
> 
> I think the user doesn't want to care about the file name of the C
> source.

The filename is just one way to isolate the logging output for debugging it 
needs to be easily defined/found. IMO 80% of debugging is locating and 
isolating the problem quickly and in most cases the output from a single or 
very small set of files even isolated to a small set of functions in those 
files is the best.

> 
> 
>> In
>> the case of the all PMDs it not the best way to debug as you get a
>> lot of output that may not be even related to the problem you are
>> trying to solve. The design does allow you to enable one or more PMDs
>> if say you are debugging say two PMDs. The output would be more
>> readable and less cluttered with output that is not germane to the
>> problem.
>> 
>> If I was debugging the TAP driver I would like to just enable
>> “rte_eth_tap_pmd.c:*” or maybe we can define a something registered
>> other then file name e.g. rte_log_register(“tap_pmd”); “tap_pmd:*” or
>> “tap_pmd:pmd_rx_burst” or “tap_pmd:rte_tap_pmd_probe”. We could for
>> the PMDs just use the PMD name we define at registration.
> 
> That's very similar to what is proposed in the patch...
> 
> See:
> 
> + i40e_logtype_init = rte_log_register("pmd.i40e.init");
> 
> + i40e_logtype_driver = rte_log_register("pmd.i40e.driver”);

These strings are just arbitrary strings un

Re: [dpdk-dev] [RFC 0/8] mbuf: structure reorganization

2017-02-06 Thread Ananyev, Konstantin
Hi Olivier,
Looks good in general, some comments from me below.
Thanks
Konstantin

> 
> The main changes are:
> - reorder structure to increase vector performance on some non-ia
>   platforms.
> - add a 64bits timestamp field in the 1st cache line

Wonder why it deserves to be in first cache line?
How it differs from seqn below (pure SW stuff right now).

> - m->next, m->nb_segs, and m->refcnt are always initialized for mbufs
>   in the pool, avoiding the need of setting m->next (located in the
>   2nd cache line) in the Rx path for mono-segment packets.
> - change port and nb_segs to 16 bits

Not that I am completely against it,
but changing nb_segs to 16 bits seems like an overkill to me.
I think we can keep and extra 8bits for something more useful in future.

> - move seqn in the 2nd cache line
> 
> Things discussed but not done in the patchset:
> - move refcnt and nb_segs to the 2nd cache line: many drivers sets
>   them in the Rx path, so it could introduce a performance regression, or

I wonder can refcnt only be moved into the 2-nd cacheline?
As I understand thanks to other change (from above) m->refcnt 
will already be initialized, so RX code don't need to touch it.
Though yes, it still would require changes in all PMDs.

>   it would require to change all the drivers, which is not an easy task.
> - remove the m->port field: too much impact on many examples and libraries,
>   and some people highlighted they are using it.

Ok, but can it be moved into the second cache-line?

> - moving m->next in the 1st cache line: there is not enough room, and having
>   it set to NULL for unused mbuf should remove the need for it.
> - merge seqn and timestamp together in a union: we could imagine use cases
>   were both are activated. There is no flag indicating the presence of seqn,
>   so it looks preferable to keep them separated for now.
> 
> I made some basic performance tests (ixgbe) and see no regression, but
> the patchset requires more testing.
> 
> [1] http://dpdk.org/ml/archives/dev/2016-October/049338.html
> 


Re: [dpdk-dev] [PATCH] sched: fix segmentation fault when freeing port

2017-02-06 Thread Dumitrescu, Cristian


> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jan Blunck
> Sent: Monday, February 6, 2017 5:33 PM
> To: dev@dpdk.org
> Cc: h.mikit...@gmail.com; Alan Dewar 
> Subject: [dpdk-dev] [PATCH] sched: fix segmentation fault when freeing port
> 
> From: Alan Dewar 
> 
> Prevent a segmentation fault in rte_sched_port_free by only accessing
> the port structure after the NULL pointer check has been made.
> 
> Fixes: 7b3c4f35 ("sched: fix releasing enqueued packets")
> 
> Signed-off-by: Alan Dewar 
> Signed-off-by: Jan Blunck 
> ---
>  lib/librte_sched/rte_sched.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
> index e6dace2..614705d 100644
> --- a/lib/librte_sched/rte_sched.c
> +++ b/lib/librte_sched/rte_sched.c
> @@ -735,12 +735,14 @@ void
>  rte_sched_port_free(struct rte_sched_port *port)
>  {
>   uint32_t qindex;
> - uint32_t n_queues_per_port =
> rte_sched_port_queues_per_port(port);
> + uint32_t n_queues_per_port;
> 
>   /* Check user parameters */
>   if (port == NULL)
>   return;
> 
> + n_queues_per_port = rte_sched_port_queues_per_port(port);
> +
>   /* Free enqueued mbufs */
>   for (qindex = 0; qindex < n_queues_per_port; qindex++) {
>   struct rte_mbuf **mbufs = rte_sched_port_qbase(port,
> qindex);
> --
> 2.7.4

Acked-by: Cristian Dumitrescu 



[dpdk-dev] [PATCH v3 1/7] net/tap: fix tap docs for device name

2017-02-06 Thread Keith Wiles
Signed-off-by: Keith Wiles 
---
 doc/guides/nics/tap.rst | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/doc/guides/nics/tap.rst b/doc/guides/nics/tap.rst
index 622b9e7..c4f207b 100644
--- a/doc/guides/nics/tap.rst
+++ b/doc/guides/nics/tap.rst
@@ -45,18 +45,18 @@ device.
 These TAP interfaces can be used with Wireshark or tcpdump or Pktgen-DPDK
 along with being able to be used as a network connection to the DPDK
 application. The method enable one or more interfaces is to use the
-``--vdev=net_tap`` option on the DPDK application command line. Each
-``--vdev=net_tap`` option give will create an interface named dtap0, dtap1,
+``--vdev=net_tap0`` option on the DPDK application command line. Each
+``--vdev=net_tap1`` option give will create an interface named dtap0, dtap1,
 and so on.
 
-The interfaced name can be changed by adding the ``iface=foo0``, for example::
+The interface name can be changed by adding the ``iface=foo0``, for example::
 
-   --vdev=net_tap,iface=foo0 --vdev=net_tap,iface=foo1, ...
+   --vdev=net_tap0,iface=foo0 --vdev=net_tap1,iface=foo1, ...
 
 Also the speed of the interface can be changed from 10G to whatever number
 needed, but the interface does not enforce that speed, for example::
 
-   --vdev=net_tap,iface=foo0,speed=25000
+   --vdev=net_tap0,iface=foo0,speed=25000
 
 After the DPDK application is started you can send and receive packets on the
 interface using the standard rx_burst/tx_burst APIs in DPDK. From the host
@@ -97,7 +97,7 @@ following::
 
 sudo ./app/app/x86_64-native-linuxapp-gcc/app/pktgen -l 1-5 -n 4\
  --proc-type auto --log-level 8 --socket-mem 512,512 --file-prefix pg   \
- --vdev=net_tap --vdev=net_tap -b 05:00.0 -b 05:00.1\
+ --vdev=net_tap0 --vdev=net_tap1 -b 05:00.0 -b 05:00.1  \
  -b 04:00.0 -b 04:00.1 -b 04:00.2 -b 04:00.3\
  -b 81:00.0 -b 81:00.1 -b 81:00.2 -b 81:00.3\
  -b 82:00.0 -b 83:00.0 -- -T -P -m [2:3].0 -m [4:5].1   \
@@ -131,6 +131,6 @@ time with ``start all``. The command ``str`` is an alias 
for ``start all`` and
 
 While running you should see the 64 byte counters increasing to verify the
 traffic is being looped back. You can use ``set all size XXX`` to change the
-size of the packets after you stop the traffic. Use the pktgen ``help``
+size of the packets after you stop the traffic. Use pktgen ``help``
 command to see a list of all commands. You can also use the ``-f`` option to
-load commands at startup.
+load commands at startup in command line or Lua script in pktgen.
-- 
2.8.0.GIT



[dpdk-dev] [PATCH v3 2/7] net/tap: remove redundant fds array

2017-02-06 Thread Keith Wiles
Signed-off-by: Keith Wiles 
---
 drivers/net/tap/rte_eth_tap.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index c7b04bb..6d93eb7 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -103,7 +103,6 @@ struct pmd_internals {
struct ether_addr eth_addr; /* Mac address of the device port */
 
int if_index;   /* IF_INDEX for the port */
-   int fds[RTE_PMD_TAP_MAX_QUEUES]; /* List of all file descriptors */
 
struct rx_queue rxq[RTE_PMD_TAP_MAX_QUEUES];/* List of RX queues */
struct tx_queue txq[RTE_PMD_TAP_MAX_QUEUES];/* List of TX queues */
@@ -349,8 +348,8 @@ tap_dev_stop(struct rte_eth_dev *dev)
struct pmd_internals *internals = dev->data->dev_private;
 
for (i = 0; i < internals->nb_queues; i++)
-   if (internals->fds[i] != -1)
-   close(internals->fds[i]);
+   if (internals->rxq[i].fd != -1)
+   close(internals->rxq[i].fd);
tap_link_set_down(dev);
 }
 
@@ -583,7 +582,6 @@ tap_rx_queue_setup(struct rte_eth_dev *dev,
if (fd == -1)
return -1;
 
-   internals->fds[rx_queue_id] = fd;
RTE_LOG(INFO, PMD, "RX TAP device name %s, qid %d on fd %d\n",
internals->name, rx_queue_id, internals->rxq[rx_queue_id].fd);
 
@@ -720,7 +718,6 @@ eth_dev_tap_create(const char *name, char *tap_name)
 
/* Presetup the fds to -1 as being not working */
for (i = 1; i < RTE_PMD_TAP_MAX_QUEUES; i++) {
-   pmd->fds[i] = -1;
pmd->rxq[i].fd = -1;
pmd->txq[i].fd = -1;
}
@@ -728,7 +725,6 @@ eth_dev_tap_create(const char *name, char *tap_name)
/* Take the TUN/TAP fd and place in the first location */
pmd->rxq[0].fd = fd;
pmd->txq[0].fd = fd;
-   pmd->fds[0] = fd;
 
if (pmd_mac_address(fd, dev, &pmd->eth_addr) < 0) {
RTE_LOG(ERR, PMD, "Unable to get MAC address\n");
@@ -849,8 +845,8 @@ rte_pmd_tap_remove(const char *name)
 
internals = eth_dev->data->dev_private;
for (i = 0; i < internals->nb_queues; i++)
-   if (internals->fds[i] != -1)
-   close(internals->fds[i]);
+   if (internals->rxq[i].fd != -1)
+   close(internals->rxq[i].fd);
 
rte_free(eth_dev->data->dev_private);
rte_free(eth_dev->data);
-- 
2.8.0.GIT



[dpdk-dev] [PATCH v3 3/7] net/tap: remove unused variable and minor cleanup

2017-02-06 Thread Keith Wiles
Signed-off-by: Keith Wiles 
---
 drivers/net/tap/rte_eth_tap.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index 6d93eb7..61659bc 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -389,9 +389,7 @@ tap_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats 
*tap_stats)
tap_stats->q_ibytes[i] = pmd->rxq[i].stats.ibytes;
rx_total += tap_stats->q_ipackets[i];
rx_bytes_total += tap_stats->q_ibytes[i];
-   }
 
-   for (i = 0; i < imax; i++) {
tap_stats->q_opackets[i] = pmd->txq[i].stats.opackets;
tap_stats->q_errors[i] = pmd->txq[i].stats.errs;
tap_stats->q_obytes[i] = pmd->txq[i].stats.obytes;
@@ -416,9 +414,7 @@ tap_stats_reset(struct rte_eth_dev *dev)
for (i = 0; i < pmd->nb_queues; i++) {
pmd->rxq[i].stats.ipackets = 0;
pmd->rxq[i].stats.ibytes = 0;
-   }
 
-   for (i = 0; i < pmd->nb_queues; i++) {
pmd->txq[i].stats.opackets = 0;
pmd->txq[i].stats.errs = 0;
pmd->txq[i].stats.obytes = 0;
@@ -633,11 +629,11 @@ static const struct eth_dev_ops ops = {
 };
 
 static int
-pmd_mac_address(int fd, struct rte_eth_dev *dev, struct ether_addr *addr)
+pmd_mac_address(int fd, struct ether_addr *addr)
 {
struct ifreq ifr;
 
-   if ((fd <= 0) || !dev || !addr)
+   if ((fd <= 0) || !addr)
return -1;
 
memset(&ifr, 0, sizeof(ifr));
@@ -726,7 +722,7 @@ eth_dev_tap_create(const char *name, char *tap_name)
pmd->rxq[0].fd = fd;
pmd->txq[0].fd = fd;
 
-   if (pmd_mac_address(fd, dev, &pmd->eth_addr) < 0) {
+   if (pmd_mac_address(fd, &pmd->eth_addr) < 0) {
RTE_LOG(ERR, PMD, "Unable to get MAC address\n");
goto error_exit;
}
-- 
2.8.0.GIT



[dpdk-dev] [PATCH v3 4/7] net/tap: fix multi-queue support

2017-02-06 Thread Keith Wiles
At the same time remove the code which created the first device queue
at probe time. Now all queues are created during queue setup calls.

Signed-off-by: Keith Wiles 
---
 drivers/net/tap/rte_eth_tap.c | 104 ++
 1 file changed, 34 insertions(+), 70 deletions(-)

diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index 61659bc..7c923a2 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -57,7 +57,11 @@
 #define ETH_TAP_IFACE_ARG   "iface"
 #define ETH_TAP_SPEED_ARG   "speed"
 
+#ifdef IFF_MULTI_QUEUE
 #define RTE_PMD_TAP_MAX_QUEUES 16
+#else
+#define RTE_PMD_TAP_MAX_QUEUES 1
+#endif
 
 static struct rte_vdev_driver pmd_tap_drv;
 
@@ -114,17 +118,20 @@ struct pmd_internals {
  * supplied name.
  */
 static int
-tun_alloc(char *name)
+tun_alloc(struct pmd_internals *pmd, uint16_t qid)
 {
struct ifreq ifr;
+#ifdef IFF_MULTI_QUEUE
unsigned int features;
+#endif
int fd;
 
memset(&ifr, 0, sizeof(struct ifreq));
 
ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
-   if (name && name[0])
-   strncpy(ifr.ifr_name, name, IFNAMSIZ);
+   strncpy(ifr.ifr_name, pmd->name, IFNAMSIZ);
+
+   RTE_LOG(DEBUG, PMD, "ifr_name '%s'\n", ifr.ifr_name);
 
fd = open(TUN_TAP_DEV_PATH, O_RDWR);
if (fd < 0) {
@@ -132,37 +139,26 @@ tun_alloc(char *name)
goto error;
}
 
-   /* Grab the TUN features to verify we can work */
+#ifdef IFF_MULTI_QUEUE
+   /* Grab the TUN features to verify we can work multi-queue */
if (ioctl(fd, TUNGETFEATURES, &features) < 0) {
-   RTE_LOG(ERR, PMD, "Unable to get TUN/TAP features\n");
+   RTE_LOG(ERR, PMD, "TAP unable to get TUN/TAP features\n");
goto error;
}
-   RTE_LOG(DEBUG, PMD, "TUN/TAP Features %08x\n", features);
+   RTE_LOG(DEBUG, PMD, "  TAP Features %08x\n", features);
 
-#ifdef IFF_MULTI_QUEUE
-   if (!(features & IFF_MULTI_QUEUE) && (RTE_PMD_TAP_MAX_QUEUES > 1)) {
-   RTE_LOG(DEBUG, PMD, "TUN/TAP device only one queue\n");
-   goto error;
-   } else if ((features & IFF_ONE_QUEUE) &&
-   (RTE_PMD_TAP_MAX_QUEUES == 1)) {
-   ifr.ifr_flags |= IFF_ONE_QUEUE;
-   RTE_LOG(DEBUG, PMD, "Single queue only support\n");
-   } else {
-   ifr.ifr_flags |= IFF_MULTI_QUEUE;
-   RTE_LOG(DEBUG, PMD, "Multi-queue support for %d queues\n",
+   if (features & IFF_MULTI_QUEUE) {
+   RTE_LOG(DEBUG, PMD, "  Multi-queue support for %d queues\n",
RTE_PMD_TAP_MAX_QUEUES);
-   }
-#else
-   if (RTE_PMD_TAP_MAX_QUEUES > 1) {
-   RTE_LOG(DEBUG, PMD, "TUN/TAP device only one queue\n");
-   goto error;
-   } else {
+   ifr.ifr_flags |= IFF_MULTI_QUEUE;
+   } else
+#endif
+   {
ifr.ifr_flags |= IFF_ONE_QUEUE;
RTE_LOG(DEBUG, PMD, "Single queue only support\n");
}
-#endif
 
-   /* Set the TUN/TAP configuration and get the name if needed */
+   /* Set the TUN/TAP configuration and set the name if needed */
if (ioctl(fd, TUNSETIFF, (void *)&ifr) < 0) {
RTE_LOG(ERR, PMD, "Unable to set TUNSETIFF for %s\n",
ifr.ifr_name);
@@ -177,9 +173,15 @@ tun_alloc(char *name)
goto error;
}
 
-   /* If the name is different that new name as default */
-   if (name && strcmp(name, ifr.ifr_name))
-   snprintf(name, RTE_ETH_NAME_MAX_LEN - 1, "%s", ifr.ifr_name);
+   if (qid == 0) {
+   if (ioctl(fd, SIOCGIFHWADDR, &ifr) == -1) {
+   RTE_LOG(ERR, PMD, "ioctl failed (SIOCGIFHWADDR) (%s)\n",
+   ifr.ifr_name);
+   goto error;
+   }
+
+   rte_memcpy(&pmd->eth_addr, ifr.ifr_hwaddr.sa_data, ETH_ALEN);
+   }
 
return fd;
 
@@ -507,7 +509,7 @@ tap_setup_queue(struct rte_eth_dev *dev,
if (fd < 0) {
RTE_LOG(INFO, PMD, "Add queue to TAP %s for qid %d\n",
pmd->name, qid);
-   fd = tun_alloc(pmd->name);
+   fd = tun_alloc(pmd, qid);
if (fd < 0) {
RTE_LOG(ERR, PMD, "tun_alloc(%s) failed\n",
pmd->name);
@@ -629,34 +631,13 @@ static const struct eth_dev_ops ops = {
 };
 
 static int
-pmd_mac_address(int fd, struct ether_addr *addr)
-{
-   struct ifreq ifr;
-
-   if ((fd <= 0) || !addr)
-   return -1;
-
-   memset(&ifr, 0, sizeof(ifr));
-
-   if (ioctl(fd, SIOCGIFHWADDR, &ifr) == -1) {
-   RTE_LOG(ERR, PMD, "ioctl failed (SIOCGIFHWADDR) (%s)\n",
-   ifr.ifr_name);
-   return -1;

[dpdk-dev] [PATCH v3 5/7] net/tap: cleanup log messages

2017-02-06 Thread Keith Wiles
Signed-off-by: Keith Wiles 
---
 drivers/net/tap/rte_eth_tap.c | 54 ++-
 1 file changed, 28 insertions(+), 26 deletions(-)

diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index 7c923a2..65e4bab 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -155,12 +155,13 @@ tun_alloc(struct pmd_internals *pmd, uint16_t qid)
 #endif
{
ifr.ifr_flags |= IFF_ONE_QUEUE;
-   RTE_LOG(DEBUG, PMD, "Single queue only support\n");
+   RTE_LOG(DEBUG, PMD, "  Single queue only support\n");
}
 
/* Set the TUN/TAP configuration and set the name if needed */
if (ioctl(fd, TUNSETIFF, (void *)&ifr) < 0) {
-   RTE_LOG(ERR, PMD, "Unable to set TUNSETIFF for %s\n",
+   RTE_LOG(WARNING, PMD,
+   "Unable to set TUNSETIFF for %s\n",
ifr.ifr_name);
perror("TUNSETIFF");
goto error;
@@ -168,7 +169,9 @@ tun_alloc(struct pmd_internals *pmd, uint16_t qid)
 
/* Always set the file descriptor to non-blocking */
if (fcntl(fd, F_SETFL, O_NONBLOCK) < 0) {
-   RTE_LOG(ERR, PMD, "Unable to set to nonblocking\n");
+   RTE_LOG(WARNING, PMD,
+   "Unable to set %s to nonblocking\n",
+   ifr.ifr_name);
perror("F_SETFL, NONBLOCK");
goto error;
}
@@ -207,7 +210,7 @@ pmd_rx_burst(void *queue, struct rte_mbuf **bufs, uint16_t 
nb_pkts)
/* allocate the next mbuf */
mbuf = rte_pktmbuf_alloc(rxq->mp);
if (unlikely(!mbuf)) {
-   RTE_LOG(WARNING, PMD, "Unable to allocate mbuf\n");
+   RTE_LOG(WARNING, PMD, "TAP unable to allocate mbuf\n");
break;
}
 
@@ -276,7 +279,8 @@ pmd_tx_burst(void *queue, struct rte_mbuf **bufs, uint16_t 
nb_pkts)
return num_tx;
 }
 
-static int tap_link_set_flags(struct pmd_internals *pmd, short flags, int add)
+static int
+tap_link_set_flags(struct pmd_internals *pmd, short flags, int add)
 {
struct ifreq ifr;
int err, s;
@@ -296,8 +300,8 @@ static int tap_link_set_flags(struct pmd_internals *pmd, 
short flags, int add)
strncpy(ifr.ifr_name, pmd->name, IFNAMSIZ);
err = ioctl(s, SIOCGIFFLAGS, &ifr);
if (err < 0) {
-   RTE_LOG(ERR, PMD, "Unable to get tap netdevice flags: %s\n",
-   strerror(errno));
+   RTE_LOG(WARNING, PMD, "Unable to get %s device flags: %s\n",
+   pmd->name, strerror(errno));
close(s);
return -1;
}
@@ -307,7 +311,7 @@ static int tap_link_set_flags(struct pmd_internals *pmd, 
short flags, int add)
ifr.ifr_flags &= ~flags;
err = ioctl(s, SIOCSIFFLAGS, &ifr);
if (err < 0) {
-   RTE_LOG(ERR, PMD, "Unable to %s flags 0x%x: %s\n",
+   RTE_LOG(WARNING, PMD, "Unable to %s flags 0x%x: %s\n",
add ? "set" : "unset", flags, strerror(errno));
close(s);
return -1;
@@ -511,8 +515,8 @@ tap_setup_queue(struct rte_eth_dev *dev,
pmd->name, qid);
fd = tun_alloc(pmd, qid);
if (fd < 0) {
-   RTE_LOG(ERR, PMD, "tun_alloc(%s) failed\n",
-   pmd->name);
+   RTE_LOG(ERR, PMD, "tun_alloc(%s, %d) failed\n",
+   pmd->name, qid);
return -1;
}
}
@@ -557,8 +561,8 @@ tap_rx_queue_setup(struct rte_eth_dev *dev,
int fd;
 
if ((rx_queue_id >= internals->nb_queues) || !mp) {
-   RTE_LOG(ERR, PMD, "nb_queues %d mp %p\n",
-   internals->nb_queues, mp);
+   RTE_LOG(WARNING, PMD, "nb_queues %d too small or mempool 
NULL\n",
+   internals->nb_queues);
return -1;
}
 
@@ -570,7 +574,7 @@ tap_rx_queue_setup(struct rte_eth_dev *dev,
RTE_PKTMBUF_HEADROOM);
 
if (buf_size < ETH_FRAME_LEN) {
-   RTE_LOG(ERR, PMD,
+   RTE_LOG(WARNING, PMD,
"%s: %d bytes will not fit in mbuf (%d bytes)\n",
dev->data->name, ETH_FRAME_LEN, buf_size);
return -ENOMEM;
@@ -580,7 +584,7 @@ tap_rx_queue_setup(struct rte_eth_dev *dev,
if (fd == -1)
return -1;
 
-   RTE_LOG(INFO, PMD, "RX TAP device name %s, qid %d on fd %d\n",
+   RTE_LOG(DEBUG, PMD, "  RX TAP device name %s, qid %d on fd %d\n",
internals->name, rx_queue_id, internals->rxq[rx_queue_id].fd);
 
return 0;
@@ 

[dpdk-dev] [PATCH v3 6/7] net/tap: link set down must be done before close

2017-02-06 Thread Keith Wiles
Fixes: ee418a25b0d3 ("net/tap: implement link up and down callbacks")

Signed-off-by: Keith Wiles 
---
 drivers/net/tap/rte_eth_tap.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index 65e4bab..966e91a 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -353,10 +353,11 @@ tap_dev_stop(struct rte_eth_dev *dev)
int i;
struct pmd_internals *internals = dev->data->dev_private;
 
+   tap_link_set_down(dev);
+
for (i = 0; i < internals->nb_queues; i++)
if (internals->rxq[i].fd != -1)
close(internals->rxq[i].fd);
-   tap_link_set_down(dev);
 }
 
 static int
-- 
2.8.0.GIT



[dpdk-dev] [PATCH v3 7/7] net/tap: move closing fds to pmd close from pmd stop

2017-02-06 Thread Keith Wiles
At the same time remove closing fds code from pmd stop routine.

Signed-off-by: Keith Wiles 
---
 drivers/net/tap/rte_eth_tap.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index 966e91a..0a7f4af 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -350,14 +350,7 @@ tap_dev_start(struct rte_eth_dev *dev)
 static void
 tap_dev_stop(struct rte_eth_dev *dev)
 {
-   int i;
-   struct pmd_internals *internals = dev->data->dev_private;
-
tap_link_set_down(dev);
-
-   for (i = 0; i < internals->nb_queues; i++)
-   if (internals->rxq[i].fd != -1)
-   close(internals->rxq[i].fd);
 }
 
 static int
@@ -431,6 +424,17 @@ tap_stats_reset(struct rte_eth_dev *dev)
 static void
 tap_dev_close(struct rte_eth_dev *dev __rte_unused)
 {
+   int i;
+   struct pmd_internals *internals = dev->data->dev_private;
+
+   tap_link_set_down(dev);
+
+   for (i = 0; i < internals->nb_queues; i++) {
+   if (internals->rxq[i].fd != -1)
+   close(internals->rxq[i].fd);
+   internals->rxq[i].fd = -1;
+   internals->txq[i].fd = -1;
+   }
 }
 
 static void
-- 
2.8.0.GIT



[dpdk-dev] i40e and memory allocations restricted on node 1

2017-02-06 Thread Ivan Nardi
Hi
With dpdk 17.02-rc2 i40e driver doesn't load at all when memory
allocation is restricted on numa node 1 (on system with more than 1
node, obviously)


[root@micro ~]# /tmp/testpmd -d /home/micro/lib/librte_pmd_i40e.so -c
0xFFFC000 -w :81:00.0  -w :81:00.1 -n 4 --socket-mem=0,8192 --
-i --socket-num=1
EAL: Detected 56 lcore(s)
EAL: Probing VFIO support...
EAL: PCI device :81:00.0 on NUMA socket 1
EAL:   probe driver: 8086:1572 net_i40e
PMD: eth_i40e_dev_init(): FW 4.40 API 1.4 NVM 04.05.03 eetrack 80001cd8
RING: Cannot reserve memory
HASH: memory allocation failed
PMD: i40e_init_ethtype_filter_list(): Failed to create ethertype hash table!
EAL: Error - exiting with code: 1
  Cause: Requested device :81:00.0 cannot be used


Everything is fine with 16.11, or allowing allocations from both nodes

I was able to locate the code to fix (at least, I hope so), but I
don't really know if it is better to allocate from SOCKET_ID_ANY of
from the exact node associated with the network device, something like
dev->data->numa_node ("pseudo" patch below)


--- dpdk-17.02-rc2.orig/drivers/net/i40e/i40e_ethdev.c2017-01-30
23:47:11.0 +0100
+++ dpdk-17.02-rc2/drivers/net/i40e/i40e_ethdev.c2017-02-06
21:53:37.812313120 +0100
@@ -899,6 +899,7 @@
 .entries = I40E_MAX_ETHERTYPE_FILTER_NUM,
 .key_len = sizeof(struct i40e_ethertype_filter_input),
 .hash_func = rte_hash_crc,
+.socket_id = SOCKET_ID_ANY,
 };

 /* Initialize ethertype filter rule list and hash */
@@ -942,6 +943,7 @@
 .entries = I40E_MAX_TUNNEL_FILTER_NUM,
 .key_len = sizeof(struct i40e_tunnel_filter_input),
 .hash_func = rte_hash_crc,
+.socket_id = SOCKET_ID_ANY,
 };

 /* Initialize tunnel filter rule list and hash */
@@ -985,6 +987,7 @@
 .entries = I40E_MAX_FDIR_FILTER_NUM,
 .key_len = sizeof(struct rte_eth_fdir_input),
 .hash_func = rte_hash_crc,
+.socket_id = SOCKET_ID_ANY,
 };

 /* Initialize flow director filter rule list and hash */



Any thoughts?
Thanks in advance
Ivan


Re: [dpdk-dev] i40e_aq_get_phy_capabilities() fails when using SFP+ with no link

2017-02-06 Thread Ivan Nardi
Thanks for the suggestions!
I'll try them and I will report back the results in the next days.
Regards
Ivan

On 6 February 2017 at 02:04, Zhang, Qi Z  wrote:
> Hi Ivan:
>
> I'm looking at this issue, but I can't repeat it on my environment 
> both with X710x4 and XL710x1
> Not sure if you could try below things to help narrow down this issue.
>
> 1) move i40e_dev_sync_phy_type call after i40e_set_fc call, to see if 
> the problem still exist, since without i40e_dev_sync_phy_type, i40e_set_fc is 
> the first place i40e_aq_get_phy_capabilities get called and we didn't see 
> this issue before 16.11.
>
> 2) if above change works, at least we have a work around, if above 
> still fail, please modify the parameter of i40e_aq_get_phy_capabilities in 
> i40e_dev_sync_phy_type as below and check result.
> - status = i40e_aq_get_phy_capabilities(hw, false, true, &phy_ab,
> - NULL);
> + status = i40e_aq_get_phy_capabilities(hw, false, false, &phy_ab,
> + NULL);
>
> Thank you!
>
> Regards
> Qi
>
>> -Original Message-
>> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ivan Nardi
>> Sent: Monday, February 6, 2017 4:19 AM
>> To: dev@dpdk.org
>> Cc: Olivier MATZ ; Christos Ricudis
>> ; Rowden, Aaron F
>> ; Zhang, Helin ; Wu,
>> Jingjing 
>> Subject: Re: [dpdk-dev] i40e_aq_get_phy_capabilities() fails when using SFP+
>> with no link
>>
>> HI
>> same issue with 17.02-rc2
>> It seems to me the problem I am facing is similar to the ones reported in
>> these mails; if not, I apologize to have used this thread
>>
>> Ivan
>>
>> On 5 February 2017 at 16:30, Ivan Nardi  wrote:
>>
>> > Hi guys
>> > any updates on this issue?
>
>> > We are facing a very similar problem.
>> > We have a server with 4 nics X710 4*10Gbit and the dpdk randomly
>> > failed to start with the error:
>> >
>> > PMD: eth_i40e_dev_init(): FW 4.40 API 1.4 NVM 04.05.03 eetrack
>> > 80001cd8
>> > PMD: eth_i40e_dev_init(): Failed to sync phy type: -95
>> >
>> > It happens randomly (sometimes it works properly, sometimes not), the
>> > "failed" port index is random too and it happens whether the fibers
>> > have been connected or not.
>> >
>> > We are using dpdk 16.11.
>> >
>> > Any help would be appreciated
>> > Thanks in advance
>> >
>> > Ivan
>> >
>> > On 18 January 2017 at 11:15, Christos Ricudis
>> > 
>> > wrote:
>> >
>> >> Hi,
>> >>
>> >> > On 12 Jan 2017, at 21:55, Olivier MATZ 
>> wrote:
>> >> >
>> >> > Hi,
>> >> >
>> >> > On Wed, 11 Jan 2017 20:51:58 +, "Rowden, Aaron F"
>> >> >  wrote:
>> >> >> Hi Helin,
>> >> >>
>> >> >> I'm checking on this to see why it could be failing but I don’t
>> >> >> think this is one part of formal validation. Intel modules are
>> >> >> always what is recommended.
>> >> >>
>> >> >> Aaron
>> >> >>
>> >> >>> Hi Helin,
>> >> >>>
>> >>  On 11 Jan 2017, at 09:08, Zhang, Helin 
>> >>  wrote:
>> >> 
>> >>  Hi Aaron
>> >> 
>> >>  Is the SFP+ (Finisar FTLX8571D3BCL) supported and validated by
>> >>  Intel? It seems there is some PHY issue in this case.
>> >> >>>
>> >> >>> As the original reporter of this issue, I will test with
>> >> >>> validated
>> >> >>> SFP+s and will report on my testing.
>> >> >>>
>> >> >>> Shouldn’t unsupported SFP+s be blacklisted in the I40E driver?
>> >> >>>
>> >> >
>> >> > Just to let you know that in my case the SFP are Intel ones.
>> >> > Maybe it's a different issue.
>> >> >
>> >> > I see there are some i40e fixes in the net-next repo, I'll give a
>> >> > try with this version.
>> >> >
>> >> > Regards,
>> >> > Olivier
>> >>
>> >> After further testing, I can confirm that this issue persists with
>> >> supported Intel SFPs (Intel FTLX8571D3BCV-IT).
>> >>
>> >> As for the changeset introducing this issue - we had failure reports
>> >> with previous DPDK versions, probably related to LSE handling, but
>> >> these weren’t properly investigated. The change in 16.11 which calls
>> >> get_phy_capability too early in initialization stage might have
>> >> alleviated the issue making it easier for us to detect and confirm.
>> >>
>> >> Best regards,
>> >> Christos Ricudis.
>> >>
>> >>
>> >


Re: [dpdk-dev] [PATCH] doc: add known issue for virtio TSO with clones

2017-02-06 Thread Stephen Hemminger
On Mon,  6 Feb 2017 14:16:53 +0100
Olivier Matz  wrote:

> +**Implication**:
> +   In this situation, the shared data will be modified by the driver,
> +   potentially causing race conditions with the other users of the mbuf
> +   data.

This is a driver bug! The driver must check for refcount and not modify
mbuf's whose refcnt > 1. For those mbuf's it should copy the mbuf and modify
the copy. 

It is not good to force buggy refcnt handling back onto the application.
It is acceptable to document the performance impact.


Re: [dpdk-dev] bnx2x and VFIO

2017-02-06 Thread Harish Patil
>
>This PMD driver doesn't seem to work with VFIO.  I haven't had time
>to look into it very closely, so I was going to ask here first just
>in case someone had any ideas why this might be the case.  Thanks
>for any pointers here!
>
Hi Chas,
Yes its a known issue - we haven’t looked at it yet, but its in our to-do
list.
Thanks.



Re: [dpdk-dev] 17.02-rc2: i40e and LSC

2017-02-06 Thread Zhang, Qi Z
Hi Ivan:
This is a bug and is fixed with below patch
http://dpdk.org/dev/patchwork/patch/20163/
Regards
Qi

> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ivan Nardi
> Sent: Monday, February 6, 2017 9:13 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] 17.02-rc2: i40e and LSC
> 
> Hi guys
> we are upgrading our application from dpdk 16.11 to 17.02-rc2 and we are
> facing a strange behavior with i40e driver and LSC If we call
> rte_eth_dev_configure() with port_conf.intr_conf.lsc = 1, the call always
> failed with error
> 
> Configuring Port 0 (socket 1)
> rte_eth_dev_configure: driver net_i40e does not support lsc Fail to configure
> port 0
> 
> Everything is fine with 16.11 or with port_conf.intr_conf.lsc = 0 (default
> value)
> 
> Is this the intended (new) behavior or am I missing something obvious here?
> Thanks in advance
> 
> Ivan

Regards
Qi


Re: [dpdk-dev] i40e and memory allocations restricted on node 1

2017-02-06 Thread Zhang, Helin


> -Original Message-
> From: Ivan Nardi [mailto:nardi.i...@gmail.com]
> Sent: Tuesday, February 7, 2017 5:04 AM
> To: dev@dpdk.org
> Cc: Zhang, Helin; Wu, Jingjing
> Subject: i40e and memory allocations restricted on node 1
> 
> Hi
> With dpdk 17.02-rc2 i40e driver doesn't load at all when memory allocation is
> restricted on numa node 1 (on system with more than 1 node, obviously)
> 
> 
> [root@micro ~]# /tmp/testpmd -d /home/micro/lib/librte_pmd_i40e.so -c
> 0xFFFC000 -w :81:00.0  -w :81:00.1 -n 4 --socket-mem=0,8192 -- -i --
> socket-num=1
> EAL: Detected 56 lcore(s)
> EAL: Probing VFIO support...
> EAL: PCI device :81:00.0 on NUMA socket 1
> EAL:   probe driver: 8086:1572 net_i40e
> PMD: eth_i40e_dev_init(): FW 4.40 API 1.4 NVM 04.05.03 eetrack 80001cd8
This is a very old verion of firmware, it may have compatibility issue.
So first, firmware upgrade is needed.

> RING: Cannot reserve memory
It seems not reserve memory for ring, and not an issue specifically to i40e.
Let's double check that if it is common to all.

> HASH: memory allocation failed
> PMD: i40e_init_ethtype_filter_list(): Failed to create ethertype hash table!
> EAL: Error - exiting with code: 1
>   Cause: Requested device :81:00.0 cannot be used
> 
> 
> Everything is fine with 16.11, or allowing allocations from both nodes
> 
> I was able to locate the code to fix (at least, I hope so), but I don't really
> know if it is better to allocate from SOCKET_ID_ANY of from the exact node
> associated with the network device, something like
> dev->data->numa_node ("pseudo" patch below)
> 
> 
> --- dpdk-17.02-rc2.orig/drivers/net/i40e/i40e_ethdev.c2017-01-30
> 23:47:11.0 +0100
> +++ dpdk-17.02-rc2/drivers/net/i40e/i40e_ethdev.c2017-02-06
> 21:53:37.812313120 +0100
> @@ -899,6 +899,7 @@
>  .entries = I40E_MAX_ETHERTYPE_FILTER_NUM,
>  .key_len = sizeof(struct i40e_ethertype_filter_input),
>  .hash_func = rte_hash_crc,
> +.socket_id = SOCKET_ID_ANY,
>  };
> 
>  /* Initialize ethertype filter rule list and hash */ @@ -942,6 +943,7 @@
>  .entries = I40E_MAX_TUNNEL_FILTER_NUM,
>  .key_len = sizeof(struct i40e_tunnel_filter_input),
>  .hash_func = rte_hash_crc,
> +.socket_id = SOCKET_ID_ANY,
>  };
> 
>  /* Initialize tunnel filter rule list and hash */ @@ -985,6 +987,7 @@
>  .entries = I40E_MAX_FDIR_FILTER_NUM,
>  .key_len = sizeof(struct rte_eth_fdir_input),
>  .hash_func = rte_hash_crc,
> +.socket_id = SOCKET_ID_ANY,
>  };
> 
>  /* Initialize flow director filter rule list and hash */
> 
> 
> 
> Any thoughts?
> Thanks in advance
> Ivan


Re: [dpdk-dev] buf->hash.rss always empty with i40e

2017-02-06 Thread Zhang, Helin
Hi Tom

Sorry, by default, i40e HW is a bit different from ixgbe HW. It will not treat 
UDP as an IP packet, when setting hash enable flags.
But, a feature of configuring 'input set' can help to change the default HW 
behavior. Please refer to testpmd input set part to understand how to use that. 
Thanks!

Regards,
Helin

-Original Message-
From: tom.barbe...@ulg.ac.be [mailto:tom.barbe...@ulg.ac.be] 
Sent: Monday, February 6, 2017 11:08 PM
To: Ananyev, Konstantin 
Cc: Richardson, Bruce ; dev@dpdk.org; Zhang, Helin 
; Wu, Jingjing 
Subject: Re: buf->hash.rss always empty with i40e

Hi Konstantin,

It seems a little overkill to play with the key... The XL710 seems to be able 
to hash on IP fields only. It seems only a configuration issue, I'm adding i40e 
maintainers in CC so they can confirm? I think the i40e driver should configure 
XL710 to hash on IP fields for TCP and UDP when only ETH_RSS_IP is given 
instead of hashing to 0. That would mimic ixgbe behaviour btw.

Tom


- Mail original -
De: "Konstantin Ananyev" 
À: "tom barbette" , "Bruce Richardson" 

Cc: dev@dpdk.org
Envoyé: Lundi 6 Février 2017 13:25:59
Objet: RE: buf->hash.rss always empty with i40e

Hi Tom,

> 
> That also leave the question of how to HASH only on the IP tuple for 
> TCP and UDP packets? The use case is that I want all packets from the same IP 
> src/dst pair to be hashed to the same queue. This cannot be enforced with a 
> complete hash on TCP/UDP fields.
> 

That's for IPv4 only?
It was a while ago, when I looked at it, but as I remember you can achieve that 
by filling first 64 bits of RSS hash with some meaningful values, and keeping 
remaining RSS bits as zeroes.
You probably can give it a quick try with testpmd.

There is a nice article about similar subject:
http://www.ndsl.kaist.edu/~kyoungsoo/papers/TR-symRSS.pdf

Konstantin



Re: [dpdk-dev] [PATCH v2 3/5] net/ixgbe: fix bitmask of supported Tx flags

2017-02-06 Thread Wu, Jingjing


> -Original Message-
> From: Ananyev, Konstantin
> Sent: Monday, February 6, 2017 8:11 PM
> To: Wu, Jingjing ; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2 3/5] net/ixgbe: fix bitmask of supported
> Tx flags
> 
> 
> 
> > -Original Message-
> > From: Wu, Jingjing
> > Sent: Monday, February 6, 2017 8:54 AM
> > To: Ananyev, Konstantin ; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v2 3/5] net/ixgbe: fix bitmask of
> > supported Tx flags
> >
> >
> >
> > > -Original Message-
> > > From: Ananyev, Konstantin
> > > Sent: Sunday, February 5, 2017 8:11 PM
> > > To: Ananyev, Konstantin ; Wu, Jingjing
> > > ; dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH v2 3/5] net/ixgbe: fix bitmask of
> > > supported Tx flags
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ananyev,
> > > > Konstantin
> > > > Sent: Sunday, February 5, 2017 11:59 AM
> > > > To: Wu, Jingjing ; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v2 3/5] net/ixgbe: fix bitmask of
> > > > supported Tx flags
> > > >
> > > > Hi Jingjing,
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: Wu, Jingjing
> > > > > Sent: Saturday, February 4, 2017 3:36 AM
> > > > > To: dev@dpdk.org
> > > > > Cc: Wu, Jingjing ; Ananyev, Konstantin
> > > > > 
> > > > > Subject: [PATCH v2 3/5] net/ixgbe: fix bitmask of supported Tx
> > > > > flags
> > > > >
> > > > > Add missed flags to bitmask of all supported packet Tx flags.
> > > > >
> > > > > CC: konstantin.anan...@intel.com
> > > > > Fixes: 7829b8d52be0 ("net/ixgbe: add Tx preparation")
> > > > > Signed-off-by: Jingjing Wu 
> > > > > ---
> > > > >  drivers/net/ixgbe/ixgbe_rxtx.c | 17 -
> > > > >  1 file changed, 16 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
> > > > > b/drivers/net/ixgbe/ixgbe_rxtx.c index 36f1c02..8454581 100644
> > > > > --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> > > > > +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> > > > > @@ -81,13 +81,28 @@
> > > > >  #include "ixgbe_rxtx.h"
> > > > >
> > > > >  /* Bit Mask to indicate what bits required for building TX
> > > > > context */
> > > > > +#ifdef RTE_LIBRTE_IEEE1588
> > > > >  #define IXGBE_TX_OFFLOAD_MASK (   \
> > > > >   PKT_TX_VLAN_PKT |\
> > > > >   PKT_TX_IP_CKSUM |\
> > > > > + PKT_TX_IPV4 |\
> > > > >   PKT_TX_L4_MASK | \
> > > > > + PKT_TX_IEEE1588_TMST |   \
> > > > >   PKT_TX_TCP_SEG | \
> > > > >   PKT_TX_MACSEC |  \
> > > > > - PKT_TX_OUTER_IP_CKSUM)
> > > > > + PKT_TX_OUTER_IP_CKSUM |  \
> > > > > + PKT_TX_OUTER_IPV4)
> > > > > +#else
> > > > > +#define IXGBE_TX_OFFLOAD_MASK (   \
> > > > > + PKT_TX_VLAN_PKT |\
> > > > > + PKT_TX_IP_CKSUM |\
> > > > > + PKT_TX_IPV4 |\
> > > >
> > > > Wonder why ixgbe doesn't have PKT_TX_IPV6?
> > >
> > > Same question for e1000 and fm10k.
> > > Also if you decided to go that way, you'll probably need to update
> > > TX_OFFLOAD_MASK for enic and vmxnet3.
> > > That's why I still think it would be much less hassle not to include
> > > these flags
> > > (PKT_TX_IPV4 and PKT_TX_IPV6)  into TX_OFFLOAD_MASK at all.
> > > Konstantin
> > >
> > >
> > Thanks for pointing that. PKT_TX_IPV6 is missed.
> > About whether include these flags (PKT_TX_IPV4 and PKT_TX_IPV6)  into
> > TX_OFFLOAD_MASK, I think they should be Included. Think about one NIC
> may support IPV4 L4 checksum offload, but not support IPV6? Even I don't
> know who it is.
> >
> 
> I don't think such combination is possible now anyway.
> But ok, if your preference is it to do more work and add (PKT_TX_IPV4 |
> PKT_TX_IPV6) into all required places, I wouldn't argue.
> 
> BTW, as a side notice, what will be really good is to have a function that
> would take tx_offload_capabilities as an input and return tx_offload_mask.
> That would remove necessity to update/support TX_OFFLOAD_MASK for
> each PMD, and hopefully will allow to avoid confusion for PMD writers.
> Though that's probably subject of another patch.
> 

OK. I think what I did is more than necessary. Let me simplify the change. 
Thanks!

And about the querying tx offload capabilities, I think it is already been done
In rte_eth_dev_info_get. But it used another set of flags which is not TX flags
Defined in mbuf. 

Jingjing


Re: [dpdk-dev] [PATCH v2 2/5] net/i40e: fix bitmask of supported Tx flags

2017-02-06 Thread Wu, Jingjing


> -Original Message-
> From: Olivier Matz [mailto:olivier.m...@6wind.com]
> Sent: Monday, February 6, 2017 6:29 PM
> To: Wu, Jingjing 
> Cc: Yigit, Ferruh ; dev@dpdk.org; Zhang, Helin
> 
> Subject: Re: [dpdk-dev] [PATCH v2 2/5] net/i40e: fix bitmask of supported Tx
> flags
> 
> On Mon, 6 Feb 2017 03:02:12 +, "Wu, Jingjing"
>  wrote:
> > >
> > > Functionally will be same, but what do you think about following, to
> > > make easy to see what define adds:
> > >
> > > +#define I40E_TX_OFFLOAD_MASK (\
> > > + PKT_TX_IP_CKSUM |\
> > > + PKT_TX_IPV4 |\
> > > + PKT_TX_IPV6 |\
> > > + PKT_TX_L4_MASK | \
> > > + PKT_TX_OUTER_IP_CKSUM |  \
> > > + PKT_TX_OUTER_IPV4 |  \
> > > + PKT_TX_OUTER_IPV6 |  \
> > >
> > > +#ifdef RTE_LIBRTE_IEEE1588
> > > + PKT_TX_IEEE1588_TMST |   \
> > > +#endif
> > >
> > > + PKT_TX_TCP_SEG | \
> > > + PKT_TX_QINQ_PKT |\
> > > + PKT_TX_VLAN_PKT |\
> > > + PKT_TX_TUNNEL_MASK)
> > >
> >
> > Hi, Ferruh
> >
> > As I know, the above change is incorrect in C code. We cannot use
> > #ifdef  #endif inside #define
> >
> > Thanks
> > Jingjing
> 
> 
> You can do:
> 
> #ifdef RTE_LIBRTE_IEEE1588
> #define I40_TX_IEEE1588_TMST PKT_TX_IEEE1588_TMST #else #define
> I40_TX_IEEE1588_TMST 0 #endif
> 
> #define I40E_TX_OFFLOAD_MASK (   \
>   I40_TX_IEEE1588_TMST |   \
>   PKT_TX_IP_CKSUM |\
>   ...
> 

Thanks for the suggestion.
> 
> Regards,
> Olivier


Re: [dpdk-dev] [PATCH v2 2/5] net/i40e: fix bitmask of supported Tx flags

2017-02-06 Thread Wu, Jingjing


> -Original Message-
> From: Olivier Matz [mailto:olivier.m...@6wind.com]
> Sent: Monday, February 6, 2017 6:27 PM
> To: Wu, Jingjing 
> Cc: dev@dpdk.org; Zhang, Helin ; Ananyev,
> Konstantin ; Yigit, Ferruh
> 
> Subject: Re: [dpdk-dev] [PATCH v2 2/5] net/i40e: fix bitmask of supported Tx
> flags
> 
> Hi Jingjing,
> 
> On Sat,  4 Feb 2017 11:36:12 +0800, Jingjing Wu 
> wrote:
> > Some Tx offload flags are missed in bitmask of all supported packet Tx
> > flags by i40e.
> > This patch fixes it.
> 
> Could you detail which flag was missing?
> Is it PKT_TX_TUNNEL_MASK?
> If yes, shouldn't we have a "Fixes:" line?
> 
> I think most of the patchset should be merged in one patch, because
> changing only the mbuf part (PKT_TX_OFFLOAD_MASK) would break the
> drivers that checks the offload bits at init, and this is not suitable, 
> especially if
> we want to be able to do git bisect.
> 
> 
> My suggestion is to have:
> 
> 1- fix i40 (add missing tunnel mask?)
> 2- fix missing MACSET in TX_OFFLOAD_MASK
> 3- change TX_OFFLOAD_MASK to include all flags (this impacts all
>drivers using TX_OFFLOAD_MASK)
> 
>
OK. Will change according to your suggestion. It's clearer. Thanks

Jingjing


[dpdk-dev] [PATCH] cfgfile: fix uninitialized variable on load error

2017-02-06 Thread Dmitriy Yakovlev
Uninitialized scalar variable. Using uninitialized value 
cfg->sections[curr_section]->num_entries when calling rte_cfgfile_close.
And memory in variables cfg->sections[curr_section], sect->entries[curr_entry] 
maybe not equal NULL. We must decrement counters curr_section, curr_entry when 
failed to realloc.

Fixes: eaafbad419bf ("cfgfile: library to interpret config files")

Signed-off-by: Dmitriy Yakovlev 
---
 lib/librte_cfgfile/rte_cfgfile.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/lib/librte_cfgfile/rte_cfgfile.c b/lib/librte_cfgfile/rte_cfgfile.c
index d72052a..829109a 100644
--- a/lib/librte_cfgfile/rte_cfgfile.c
+++ b/lib/librte_cfgfile/rte_cfgfile.c
@@ -151,6 +151,7 @@ struct rte_cfgfile *
sizeof(*cfg) + sizeof(cfg->sections[0])
* allocated_sections);
if (n_cfg == NULL) {
+   curr_section--;
printf("Error - no more memory\n");
goto error1;
}
@@ -198,6 +199,7 @@ struct rte_cfgfile *
sizeof(sect->entries[0]) *
allocated_entries);
if (n_sect == NULL) {
+   curr_entry--;
printf("Error - no more memory\n");
goto error1;
}
@@ -233,6 +235,8 @@ struct rte_cfgfile *
 
 error1:
cfg->num_sections = curr_section + 1;
+   if (curr_section >= 0)
+   cfg->sections[curr_section]->num_entries = curr_entry + 1;
rte_cfgfile_close(cfg);
 error2:
fclose(f);
-- 
1.9.1



[dpdk-dev] [PATCH v3 0/4] fix bitmask of supported Tx flags

2017-02-06 Thread Jingjing Wu
Some Tx offload flags are missed in bitmask of all supported packet
Tx flags, it will cause rte_eth_tx_prepare fails when Tx burst packets.

v3 changes:
 - add PKT_TX_MACSEC to PKT_TX_OFFLOAD_MASK
 - refine code of drivers' TX_OFFLOAD_MASK definition

v2 changes:
 - redefine the PKT_TX_OFFLOAD_MASK
 - fix more drivers

Jingjing Wu (4):
  net/i40e: fix bitmask of supported Tx flags
  net/ixgbe: fix bitmask of supported Tx flags
  net/e1000: fix bitmask of supported Tx flags
  mbuf: fix bitmask of Tx offload flags

 drivers/net/e1000/igb_rxtx.c   |  8 +++-
 drivers/net/i40e/i40e_rxtx.c   | 10 +-
 drivers/net/ixgbe/ixgbe_rxtx.c |  8 +++-
 lib/librte_mbuf/rte_mbuf.h |  3 ++-
 4 files changed, 25 insertions(+), 4 deletions(-)

-- 
2.4.11



[dpdk-dev] [PATCH v3 2/4] net/ixgbe: fix bitmask of supported Tx flags

2017-02-06 Thread Jingjing Wu
Add missed PKT_TX_IEEE1588_TMST to bitmask of all supported
packet Tx flags.

Fixes: 7829b8d52be0 ("net/ixgbe: add Tx preparation")
Signed-off-by: Jingjing Wu 
---
 drivers/net/ixgbe/ixgbe_rxtx.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 36f1c02..4d71992 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -80,6 +80,11 @@
 #include "base/ixgbe_common.h"
 #include "ixgbe_rxtx.h"
 
+#ifdef RTE_LIBRTE_IEEE1588
+#define IXGBE_TX_IEEE1588_TMST PKT_TX_IEEE1588_TMST
+#else
+#define IXGBE_TX_IEEE1588_TMST 0
+#endif
 /* Bit Mask to indicate what bits required for building TX context */
 #define IXGBE_TX_OFFLOAD_MASK ( \
PKT_TX_VLAN_PKT |\
@@ -87,7 +92,8 @@
PKT_TX_L4_MASK | \
PKT_TX_TCP_SEG | \
PKT_TX_MACSEC |  \
-   PKT_TX_OUTER_IP_CKSUM)
+   PKT_TX_OUTER_IP_CKSUM |  \
+   IXGBE_TX_IEEE1588_TMST)
 
 #define IXGBE_TX_OFFLOAD_NOTSUP_MASK \
(PKT_TX_OFFLOAD_MASK ^ IXGBE_TX_OFFLOAD_MASK)
-- 
2.4.11



[dpdk-dev] [PATCH v3 1/4] net/i40e: fix bitmask of supported Tx flags

2017-02-06 Thread Jingjing Wu
PKT_TX_TUNNEL_MASK and PKT_TX_IEEE1588_TMST are missed in bitmask
of all supported packet Tx flags by i40e. It will cause packet preparing
fail when sending tunnel packets with Tx offload.
This patch fixes it.

Fixes: 3f33e643e5c6 ("net/i40e: add Tx preparation")
Signed-off-by: Jingjing Wu 
---
 drivers/net/i40e/i40e_rxtx.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 608685f..48429cc 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -75,6 +75,12 @@
 
 #define I40E_TXD_CMD (I40E_TX_DESC_CMD_EOP | I40E_TX_DESC_CMD_RS)
 
+#ifdef RTE_LIBRTE_IEEE1588
+#define I40E_TX_IEEE1588_TMST PKT_TX_IEEE1588_TMST
+#else
+#define I40E_TX_IEEE1588_TMST 0
+#endif
+
 #define I40E_TX_CKSUM_OFFLOAD_MASK (\
PKT_TX_IP_CKSUM |\
PKT_TX_L4_MASK | \
@@ -87,7 +93,9 @@
PKT_TX_OUTER_IP_CKSUM | \
PKT_TX_TCP_SEG |\
PKT_TX_QINQ_PKT |   \
-   PKT_TX_VLAN_PKT)
+   PKT_TX_VLAN_PKT |   \
+   PKT_TX_TUNNEL_MASK |\
+   I40E_TX_IEEE1588_TMST)
 
 #define I40E_TX_OFFLOAD_NOTSUP_MASK \
(PKT_TX_OFFLOAD_MASK ^ I40E_TX_OFFLOAD_MASK)
-- 
2.4.11



[dpdk-dev] [PATCH v3 3/4] net/e1000: fix bitmask of supported Tx flags

2017-02-06 Thread Jingjing Wu
Add missed PKT_TX_IEEE1588_TMST to bitmask of all supported
packet Tx flags.

Fixes: 2b76648872c9 ("net/e1000: add Tx preparation")
Signed-off-by: Jingjing Wu 
---
 drivers/net/e1000/igb_rxtx.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index 45f3f24..c9cf392 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -72,12 +72,18 @@
 #include "base/e1000_api.h"
 #include "e1000_ethdev.h"
 
+#ifdef RTE_LIBRTE_IEEE1588
+#define IGB_TX_IEEE1588_TMST PKT_TX_IEEE1588_TMST
+#else
+#define IGB_TX_IEEE1588_TMST 0
+#endif
 /* Bit Mask to indicate what bits required for building TX context */
 #define IGB_TX_OFFLOAD_MASK (   \
PKT_TX_VLAN_PKT |\
PKT_TX_IP_CKSUM |\
PKT_TX_L4_MASK | \
-   PKT_TX_TCP_SEG)
+   PKT_TX_TCP_SEG | \
+   IGB_TX_IEEE1588_TMST)
 
 #define IGB_TX_OFFLOAD_NOTSUP_MASK \
(PKT_TX_OFFLOAD_MASK ^ IGB_TX_OFFLOAD_MASK)
-- 
2.4.11



[dpdk-dev] [PATCH v3 4/4] mbuf: fix bitmask of Tx offload flags

2017-02-06 Thread Jingjing Wu
Add missed PKT_TX_MACSEC flag to bitmask of all supported packet Tx
offload features flags.

Fixes: 4fb7e803eb1a ("ethdev: add Tx preparation")
Signed-off-by: Jingjing Wu 
---
 lib/librte_mbuf/rte_mbuf.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 0d01167..2392995 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -300,7 +300,8 @@ extern "C" {
PKT_TX_TCP_SEG | \
PKT_TX_QINQ_PKT |\
PKT_TX_VLAN_PKT |\
-   PKT_TX_TUNNEL_MASK)
+   PKT_TX_TUNNEL_MASK | \
+   PKT_TX_MACSEC)
 
 #define __RESERVED   (1ULL << 61) /**< reserved for future mbuf use */
 
-- 
2.4.11



[dpdk-dev] [PATCH] net/i40e: fix ethertype filter func fail on X722

2017-02-06 Thread Jeff Guo
The GL_SWR_PRI_JOIN_MAP registers are affecting filters hit, modify
the register default value will result the ethertype filter function
fail. The GL_SWR_PRI_JOIN_MAP value is difference between different
NICs, should keep up the register value with default NVM value in X722.

Fixes: 973273c7a4b7 ("i40e: workaround for X710 performance")

Signed-off-by: Jeff Guo 
---
 drivers/net/i40e/i40e_ethdev.c | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 4492bcc..49ab0de 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -8732,6 +8732,10 @@ i40e_pctype_to_flowtype(enum i40e_filter_pctype pctype)
 #define I40E_GL_SWR_PRI_JOIN_MAP_2_VALUE 0x011f0200
 #define I40E_GL_SWR_PRI_JOIN_MAP_2   0x26CE08
 
+/* For X722 */
+#define I40E_X722_GL_SWR_PRI_JOIN_MAP_0_VALUE 0x2200
+#define I40E_X722_GL_SWR_PRI_JOIN_MAP_2_VALUE 0x013F0200
+
 /* For X710 */
 #define I40E_GL_SWR_PM_UP_THR_EF_VALUE   0x03030303
 /* For XL710 */
@@ -8754,7 +8758,6 @@ i40e_dev_sync_phy_type(struct i40e_hw *hw)
return 0;
 }
 
-
 static void
 i40e_configure_registers(struct i40e_hw *hw)
 {
@@ -8762,8 +8765,8 @@ i40e_configure_registers(struct i40e_hw *hw)
uint32_t addr;
uint64_t val;
} reg_table[] = {
-   {I40E_GL_SWR_PRI_JOIN_MAP_0, I40E_GL_SWR_PRI_JOIN_MAP_0_VALUE},
-   {I40E_GL_SWR_PRI_JOIN_MAP_2, I40E_GL_SWR_PRI_JOIN_MAP_2_VALUE},
+   {I40E_GL_SWR_PRI_JOIN_MAP_0, 0},
+   {I40E_GL_SWR_PRI_JOIN_MAP_2, 0},
{I40E_GL_SWR_PM_UP_THR, 0}, /* Compute value dynamically */
};
uint64_t reg;
@@ -8771,6 +8774,24 @@ i40e_configure_registers(struct i40e_hw *hw)
int ret;
 
for (i = 0; i < RTE_DIM(reg_table); i++) {
+   if (reg_table[i].addr == I40E_GL_SWR_PRI_JOIN_MAP_0) {
+   if (hw->mac.type == I40E_MAC_X722) /* For X722 */
+   reg_table[i].val =
+   I40E_X722_GL_SWR_PRI_JOIN_MAP_0_VALUE;
+   else /* For X710/XL710/XXV710 */
+   reg_table[i].val =
+   I40E_GL_SWR_PRI_JOIN_MAP_0_VALUE;
+   }
+
+   if (reg_table[i].addr == I40E_GL_SWR_PRI_JOIN_MAP_2) {
+   if (hw->mac.type == I40E_MAC_X722) /* For X722 */
+   reg_table[i].val =
+   I40E_X722_GL_SWR_PRI_JOIN_MAP_2_VALUE;
+   else /* For X710/XL710/XXV710 */
+   reg_table[i].val =
+   I40E_GL_SWR_PRI_JOIN_MAP_2_VALUE;
+   }
+
if (reg_table[i].addr == I40E_GL_SWR_PM_UP_THR) {
if (I40E_PHY_TYPE_SUPPORT_40G(hw->phy.phy_types) || /* 
For XL710 */
I40E_PHY_TYPE_SUPPORT_25G(hw->phy.phy_types)) /* 
For XXV710 */
-- 
2.7.4



Re: [dpdk-dev] [PATCH v2] mempool: Introduce _populate_mz_range api

2017-02-06 Thread Santosh Shukla
Hi Olivier,

On Mon, Feb 06, 2017 at 06:01:15PM +0100, Olivier Matz wrote:
> On Tue, 31 Jan 2017 20:02:26 +0530, Santosh Shukla
>  wrote:
> > Hi Olivier,
> > 
> > Reply inline.
> > 
> > On Tue, Jan 31, 2017 at 11:31:51AM +0100, Olivier Matz wrote:
> > > Hi Santosh,
> > > 
> > > I guess this patch is targeted for 17.05, right?
> > >  
> > 
> > Yes.
> > 
> > > Please see some other comments below.
> > > 
> > > On Fri, 20 Jan 2017 20:43:41 +0530,
> > >  wrote:  
> > > > From: Santosh Shukla 

[snip..]

> > > > --- a/lib/librte_mempool/rte_mempool.h
> > > > +++ b/lib/librte_mempool/rte_mempool.h
> > > > @@ -387,6 +387,12 @@ typedef int (*rte_mempool_dequeue_t)(struct
> > > > rte_mempool *mp, */
> > > >  typedef unsigned (*rte_mempool_get_count)(const struct
> > > > rte_mempool *mp); +/**
> > > > + * Set the memzone va/pa addr range and len in the external pool.
> > > > + */
> > > > +typedef void (*rte_mempool_populate_mz_range_t)(struct
> > > > rte_mempool *mp,
> > > > +   const struct rte_memzone *mz);
> > > > +  
> > > 
> > > And this API would be:
> > > typedef void (*rte_mempool_populate_t)(struct rte_mempool *mp,
> > >   char *vaddr, phys_addr_t paddr, size_t len)
> > >  
> > > 
> > > If your hw absolutly needs a contiguous memory, a solution could be:
> > > 
> > > - add a new flag MEMPOOL_F_CONTIG (maybe a better nale could be
> > >   found) saying that all the mempool objects must be contiguous.
> > > - add the ops_populate() callback in rte_mempool_populate_phys(), as
> > >   suggested above
> > > 
> > > Then:
> > > 
> > >   /* create an empty mempool */
> > >   rte_mempool_create_empty(...);
> > > 
> > >   /* set the handler:
> > >*   in the ext handler, the mempool flags are updated with
> > >*   MEMPOOL_F_CONTIG
> > >*/
> > >   rte_mempool_set_ops_byname(..., "my_hardware");
> > >   
> > 
> > You mean, ext handler will set mempool flag using 'pool_config'
> > param; somethng like rte_mempool_ops_by_name(..., "my_hardware" ,
> > MEMPOOL_F_CONTIG); ?
> 
> I don't mean changing the API of rte_mempool_set_ops_byname().
> I was suggesting something like this:
> 
> static int your_handler_alloc(struct rte_mempool *mp)
> {
>   /* handler init */
>   ...
> 
>   mp->flags |= MEMPOOL_F_CONTIG;
>   return 0;
> }
> 

Ok,. Will do in v3.

> And update rte_mempool_populate_*() functions to manage this flag:
> instead of segment, just fail if it cannot fit in one segment. It won't
> really solve the issue, but at least it will be properly detected, and
> the user could take dispositions to have more contiguous memory (ex:
> use larger hugepages, allocate hugepages at boot time).
>

Agree.
Will take care in v3.

> > 
> > >   /* if MEMPOOL_F_CONTIG is set, all populate() functions should
> > > ensure
> > >* that there is only one contiguous zone
> > >*/
> > >   rte_mempool_populate_default(...);
> > >   
> > 
> > I am not too sure about scope of MEMPOOL_F_CONTIG. How
> > MEMPOOL_F_CONTIG flag {setted by application/ driver by calling
> > rte_mempool_create(..., flag)} can make sure only one contiguous
> > zone? Like to understand from you.
> 
> As described above, there would be no change from application point of
> view. The handler would set the mempool flag by itself to change the
> behavior of the populate functions.
> 
> 

Right.

> > 
> > In my understanding:
> > rte_mempool_populate_default() will request for memzone based on
> > mp->size value; And if mp->size more than one hugepage_sz{i.e. one mz
> > request not enough} then it will request more hugepages{ i.e. more mz
> > request},. So IIIU then contiguity not gauranteed.
> 
> Yes, that's how it works today. As EAL cannot guarantees that the
> hugepages are physically contiguous, it tries to segment the mempool
> objects in several memory zones.
> 
> 
> Regards,
> Olivier
> 
> 
> 

Regards,
Santosh



Re: [dpdk-dev] [PATCH v4 1/6] eventdev: introduce event driven programming model

2017-02-06 Thread Jerin Jacob
On Fri, Feb 03, 2017 at 04:28:15PM +0530, Hemant Agrawal wrote:
> On 2/3/2017 12:08 PM, Nipun Gupta wrote:
> > > > > -Original Message-
> > > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jerin Jacob
> > > > > Sent: Wednesday, December 21, 2016 14:55
> > > > > To: dev@dpdk.org
> > > > > Cc: thomas.monja...@6wind.com; bruce.richard...@intel.com; Hemant
> > > > > Agrawal ; gage.e...@intel.com;
> > > > > harry.van.haa...@intel.com; Jerin Jacob
> > > 
> > > > > Subject: [dpdk-dev] [PATCH v4 1/6] eventdev: introduce event driven
> > > > > programming model
> > > > > 
> > > > > In a polling model, lcores poll ethdev ports and associated
> > > > > rx queues directly to look for packet. In an event driven model,
> > > > > by contrast, lcores call the scheduler that selects packets for
> > > > > them based on programmer-specified criteria. Eventdev library
> > > > > adds support for event driven programming model, which offer
> > > > > applications automatic multicore scaling, dynamic load balancing,
> > > > > pipelining, packet ingress order maintenance and
> > > > > synchronization services to simplify application packet processing.
> > > > > 
> > > > > By introducing event driven programming model, DPDK can support
> > > > > both polling and event driven programming models for packet 
> > > > > processing,
> > > > > and applications are free to choose whatever model
> > > > > (or combination of the two) that best suits their needs.
> > > > > 
> > > > > This patch adds the eventdev specification header file.
> > > > > 
> > > > > Signed-off-by: Jerin Jacob 
> > > > > Acked-by: Bruce Richardson 
> > > > > ---
> > > > >  MAINTAINERS|3 +
> > > > >  doc/api/doxy-api-index.md  |1 +
> > > > >  doc/api/doxy-api.conf  |1 +
> > > > >  lib/librte_eventdev/rte_eventdev.h | 1275
> > > > > 
> > > > >  4 files changed, 1280 insertions(+)
> > > > >  create mode 100644 lib/librte_eventdev/rte_eventdev.h
> > > > 
> > > > 
> > > > 
> > > > > +
> > > > > +/**
> > > > > + * Event device information
> > > > > + */
> > > > > +struct rte_event_dev_info {
> > > > > + const char *driver_name;/**< Event driver name */
> > > > > + struct rte_pci_device *pci_dev; /**< PCI information */
> > > > 
> > > > With 'rte_device' in place (rte_dev.h), should we not have 'rte_device' 
> > > > instead
> > > of 'rte_pci_device' here?
> > > 
> > > Yes. Please post a patch to fix this. As the time of merging to
> > > next-eventdev tree it was not the case.
> > 
> > Sure. I'll send a patch regarding this.
> > 
> > > 
> > > > 
> > > > > + * The number of events dequeued is the number of scheduler contexts 
> > > > > held
> > > by
> > > > > + * this port. These contexts are automatically released in the next
> > > > > + * rte_event_dequeue_burst() invocation, or invoking
> > > > > rte_event_enqueue_burst()
> > > > > + * with RTE_EVENT_OP_RELEASE operation can be used to release the
> > > > > + * contexts early.
> > > > > + *
> > > > > + * @param dev_id
> > > > > + *   The identifier of the device.
> > > > > + * @param port_id
> > > > > + *   The identifier of the event port.
> > > > > + * @param[out] ev
> > > > > + *   Points to an array of *nb_events* objects of type *rte_event* 
> > > > > structure
> > > > > + *   for output to be populated with the dequeued event objects.
> > > > > + * @param nb_events
> > > > > + *   The maximum number of event objects to dequeue, typically 
> > > > > number of
> > > > > + *   rte_event_port_dequeue_depth() available for this port.
> > > > > + *
> > > > > + * @param timeout_ticks
> > > > > + *   - 0 no-wait, returns immediately if there is no event.
> > > > > + *   - >0 wait for the event, if the device is configured with
> > > > > + *   RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT then this function will
> > > > > wait until
> > > > > + *   the event available or *timeout_ticks* time.
> > > > 
> > > > Just for understanding - Is expectation that rte_event_dequeue_burst() 
> > > > will
> > > wait till timeout
> > > > unless requested number of events (nb_events) are not received on the 
> > > > event
> > > port?
> > > 
> > > Yes. If you need any change then a send RFC patch for the header file
> > > change.
> 
> "at least one event available"

Looks good to me. If there no objections then you can send a patch to
update the header file.

> 
> The API should not wait, if at least one event is available to discard the
> timeout value.
> 
> the *timeout* is valid only until the first event is received (even when
> multiple events are requested) and driver will only checking for further
> events availability and return as many events as it is able to get in its
> processing loop.
> 
> 
> > > 
> > > > 
> > > > > + *   if the device is not configured with
> > > > > RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT
> > > > > + *   then this function will wait until the event available or
> > > > > + *   *dequeue_timeout_ns

Re: [dpdk-dev] [RFC] Add GRO support in DPDK

2017-02-06 Thread Jiayu Hu
On Wed, Jan 25, 2017 at 05:51:50PM +0800, Ananyev, Konstantin wrote:
> 
> 
> > -Original Message-
> > From: Wiles, Keith
> > Sent: Wednesday, January 25, 2017 3:40 AM
> > To: Stephen Hemminger 
> > Cc: Ananyev, Konstantin ; Hu, Jiayu 
> > ; dev@dpdk.org; Kinsella, Ray
> > ; Gilmore, Walter E ; 
> > Venkatesan, Venky ;
> > yuanhan@linux.intel.com
> > Subject: Re: [dpdk-dev] [RFC] Add GRO support in DPDK
> > 
> > 
> > > On Jan 24, 2017, at 2:04 PM, Stephen Hemminger 
> > >  wrote:
> > >
> > > On Tue, 24 Jan 2017 20:09:07 +
> > > "Wiles, Keith"  wrote:
> > >
> > >>> On Jan 24, 2017, at 12:45 PM, Ananyev, Konstantin 
> > >>>  wrote:
> > >>>
> > >>>
> > >>>
> >  -Original Message-
> >  From: Wiles, Keith
> >  Sent: Tuesday, January 24, 2017 2:49 PM
> >  To: Ananyev, Konstantin 
> >  Cc: Stephen Hemminger ; Hu, Jiayu 
> >  ; dev@dpdk.org; Kinsella, Ray
> >  ; Gilmore, Walter E 
> >  ; Venkatesan, Venky 
> >  ;
> >  yuanhan@linux.intel.com
> >  Subject: Re: [dpdk-dev] [RFC] Add GRO support in DPDK
> > 
> > 
> > > On Jan 24, 2017, at 3:33 AM, Ananyev, Konstantin 
> > >  wrote:
> > >
> > >
> > >
> > >> -Original Message-
> > >> From: Wiles, Keith
> > >> Sent: Tuesday, January 24, 2017 5:26 AM
> > >> To: Ananyev, Konstantin 
> > >> Cc: Stephen Hemminger ; Hu, Jiayu 
> > >> ; dev@dpdk.org; Kinsella, Ray
> > >> ; Gilmore, Walter E 
> > >> ; Venkatesan, Venky 
> > >> ;
> > >> yuanhan@linux.intel.com
> > >> Subject: Re: [dpdk-dev] [RFC] Add GRO support in DPDK
> > >>
> > >>
> > >>> On Jan 23, 2017, at 6:43 PM, Ananyev, Konstantin 
> > >>>  wrote:
> > >>>
> > >>>
> > >>>
> >  -Original Message-
> >  From: Wiles, Keith
> >  Sent: Monday, January 23, 2017 9:53 PM
> >  To: Stephen Hemminger 
> >  Cc: Hu, Jiayu ; dev@dpdk.org; Kinsella, Ray 
> >  ; Ananyev, Konstantin
> >  ; Gilmore, Walter E 
> >  ; Venkatesan, Venky
> > >> ;
> >  yuanhan@linux.intel.com
> >  Subject: Re: [dpdk-dev] [RFC] Add GRO support in DPDK
> > 
> > 
> > > On Jan 23, 2017, at 10:15 AM, Stephen Hemminger 
> > >  wrote:
> > >
> > > On Mon, 23 Jan 2017 21:03:12 +0800
> > > Jiayu Hu  wrote:
> > >
> > >> With the support of hardware segmentation techniques in DPDK, the
> > >> networking stack overheads of send-side of applications, which 
> > >> directly
> > >> leverage DPDK, have been greatly reduced. But for receive-side, 
> > >> numbers of
> > >> segmented packets seriously burden the networking stack of 
> > >> applications.
> > >> Generic Receive Offload (GRO) is a widely used method to solve 
> > >> the
> > >> receive-side issue, which gains performance by reducing the 
> > >> amount of
> > >> packets processed by the networking stack. But currently, DPDK 
> > >> doesn't
> > >> support GRO. Therefore, we propose to add GRO support in DPDK, 
> > >> and this
> > >> RFC is used to explain the basic DPDK GRO design.
> > >>
> > >> DPDK GRO is a SW-based packets assembly library, which provides 
> > >> GRO
> > >> abilities for numbers of protocols. In DPDK GRO, packets are 
> > >> merged
> > >> before returning to applications and after receiving from 
> > >> drivers.
> > >>
> > >> In DPDK, GRO is a capability of NIC drivers. That support GRO or 
> > >> not and
> > >> what GRO types are supported are up to NIC drivers. Different 
> > >> drivers may
> > >> support different GRO types. By default, drivers enable all 
> > >> supported GRO
> > >> types. For applications, they can inquire the supported GRO 
> > >> types by
> > >> each driver, and can control what GRO types are applied. For 
> > >> example,
> > >> ixgbe supports TCP and UDP GRO, but the application just needs 
> > >> TCP GRO.
> > >> The application can disable ixgbe UDP GRO.
> > >>
> > >> To support GRO, a driver should provide a way to tell 
> > >> applications what
> > >> GRO types are supported, and provides a GRO function, which is 
> > >> in charge
> > >> of assembling packets. Since different drivers may support 
> > >> different GRO
> > >> types, their GRO functions may be different. For applications, 
> > >> they don't
> > >> need extra operations to enable GRO. But if there are some GRO 
> > >> types that
> > >> are not needed, applications can use an API, like
> > >> rte_eth_gro_disable_protocols, to disable them. Besides, they can
> > >> re-enable 

[dpdk-dev] [PATCH] doc: announce ixgbe MTU setting limitation

2017-02-06 Thread Wenzhuo Lu
Signed-off-by: Wenzhuo Lu 
---
 doc/guides/nics/ixgbe.rst | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/ixgbe.rst b/doc/guides/nics/ixgbe.rst
index 3b6851b..1f7d6b3 100644
--- a/doc/guides/nics/ixgbe.rst
+++ b/doc/guides/nics/ixgbe.rst
@@ -185,8 +185,11 @@ As in the case of l3fwd, set configure 
port_conf.rxmode.hw_ip_checksum=0 to enab
 In addition, for improved performance, use -bsz "(32,32),(64,64),(32,32)" in 
load_balancer to avoid using the default burst size of 144.
 
 
+Limitations or Known issues
+---
+
 Malicious Driver Detection not Supported
-
+
 
 The Intel x550 series NICs support a feature called MDD (Malicious
 Driver Detection) which checks the behavior of the VF driver.
@@ -207,7 +210,7 @@ it by default.)
 
 
 Statistics
---
+~~
 
 The statistics of ixgbe hardware must be polled regularly in order for it to
 remain consistent. Running a DPDK application without polling the statistics 
will
@@ -230,6 +233,15 @@ be calculated as follows:
 
 In order to ensure valid results, it is recommended to poll every 4 minutes.
 
+MTU setting
+~~~
+
+Although user can set MTU separately on PF and VF ports, ixgbe only supports
+one global MTU per physical port.
+So when user sets different MTUs on PF and VF ports in one physical port, the
+real MTU for all these PF and VF ports is the biggest one.
+This behavior is leveraged from kernel driver.
+
 
 Supported Chipsets and NICs
 ---
-- 
1.9.3



Re: [dpdk-dev] [PATCH v2 02/15] eventdev: add APIs for extended stats

2017-02-06 Thread Jerin Jacob
On Mon, Feb 06, 2017 at 10:37:31AM +, Van Haaren, Harry wrote:
> > -Original Message-
> > From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com]
> > Sent: Monday, February 6, 2017 8:23 AM
> > To: Van Haaren, Harry 
> > Cc: dev@dpdk.org; Richardson, Bruce 
> > Subject: Re: [PATCH v2 02/15] eventdev: add APIs for extended stats
> > 
> > On Tue, Jan 31, 2017 at 04:14:20PM +, Harry van Haaren wrote:
> > > From: Bruce Richardson 
> > >
> > > Add in APIs for extended stats so that eventdev implementations can report
> > > out information on their internal state. The APIs are based on, but not
> > > identical to, the equivalent ethdev functions.
> > 
> > The APIs Looks good. One minor comment though,

One more suggestion,

How about adding the reset function for resetting the selective xstat counters 
similar to
ethdev? IMO, It will be useful.

> > 
> > Can you add statistics specific to per event queue and event
> > port?, To improve the cases like below in the application code(taken from
> > app/test/test_sw_eventdev.c).
> > 
> > IMO, it is useful because,
> > - ethdev has similar semantics
> > - majority of the implementations will have port and queue specific 
> > statistics counters
> 
> I'm not totally sure what you're asking but if I understand correctly, you're 
> suggesting a struct based stats API like this?
> 
> struct rte_event_dev_port_stats {
> uint64_t rx;
> uint64_t tx;
> ...
> };
> 
> struct rte_event_dev_queue_stats {
> uint64_t rx;
> uint64_t tx;
> ...
> };
> 
> /** a get function to get a specific port's statistics. The *stats* pointer 
> is filled in */ 
> int rte_event_dev_port_stats_get(dev, uint8_t port_id, struct 
> rte_event_dev_port_stats *stats);
> 
> /** a get function to get a specific queue's statistics. The *stats* pointer 
> is filled in */ 
> int rte_event_dev_queue_stats_get(dev, uint8_t queue_id, struct 
> rte_event_dev_queue_stats *stats);
> 
> 
> Is this what you meant, or did I misunderstand?

I meant, queue and port specific "xstat" as each implementation may
have different statistics counters for queue/port.

Just to share my view, I have modified the exiting proposal.
Thoughts?

+enum rte_event_dev_xstats_mode {
+   RTE_EVENT_DEV_XSTAT_DEVICE; /* Event device specific global xstats */
+   RTE_EVENT_DEV_XSTAT_QUEUE; /* Event queue specific xstats */
+   RTE_EVENT_DEV_XSTAT_PORT; /* Event port specific xstats */
+};
+
 /**
  * Retrieve names of extended statistics of an event device.
  *
@@ -1436,9 +1442,11 @@ struct rte_event_dev_xstat_name {
  */
 int
 rte_event_dev_xstats_names_get(uint8_t dev_id,
+  enum rte_event_dev_xstats_mode mode;
   struct rte_event_dev_xstat_name *xstat_names,
   unsigned int size);

 /**
  * Retrieve extended statistics of an event device.
  *
@@ -1458,7 +1466,10 @@ rte_event_dev_xstats_names_get(uint8_t dev_id,
  * device doesn't support this function.
  */
 int
-rte_event_dev_xstats_get(uint8_t dev_id, const unsigned int ids[],
+rte_event_dev_xstats_get(uint8_t dev_id,
+enum rte_event_dev_xstats_mode mode,
+uint8_t queue_port_id; /* valid when 
RTE_EVENT_DEV_XSTAT_QUEUE or RTE_EVENT_DEV_XSTAT_PORT */
+const unsigned int ids[],
 uint64_t values[], unsigned int n);
 
 /**
@@ -1478,7 +1489,9 @@ rte_event_dev_xstats_get(uint8_t dev_id, const
unsigned int ids[],
  *   - negative value: -EINVAL if stat not found, -ENOTSUP if not
  *   supported.
  */
 uint64_t
-rte_event_dev_xstats_by_name_get(uint8_t dev_id, const char *name,
+rte_event_dev_xstats_by_name_get(uint8_t dev_id,
+enum rte_event_dev_xstats_mode mode,
+const char *name,
 unsigned int *id);
> 
> 
> > 
> > +   for (i = 0; i < MAX_PORTS; i++) {
> > +   char name[32];
> > +   snprintf(name, sizeof(name), "port_%u_rx", i);
> > +   stats->port_rx_pkts[i] = rte_event_dev_xstats_by_name_get(
> > +   dev_id, name, &port_rx_pkts_ids[i]);
> > 
> > +   for (i = 0; i < MAX_QIDS; i++) {
> > +   char name[32];
> > +   snprintf(name, sizeof(name), "qid_%u_rx", i);
> > +   stats->qid_rx_pkts[i] = rte_event_dev_xstats_by_name_get(
> > +   dev_id, name, &qid_rx_pkts_ids[i]);
> > 
> 


[dpdk-dev] [PATCH v3] net/ixgbe: clean up rte_eth_dev_info_get

2017-02-06 Thread Wenzhuo Lu
It's not appropriate to call rte_eth_dev_info_get in PMD,
as rte_eth_dev_info_get need to get info from PMD.
Remove rte_eth_dev_info_get from PMD code and get the
info directly.

Signed-off-by: Wenzhuo Lu 
---
v2:
- change is_ixgbe_pmd to is_device_supported to make it more generic.
v3:
- minor change.

 drivers/net/ixgbe/ixgbe_ethdev.c | 161 ++-
 1 file changed, 76 insertions(+), 85 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 5b625a3..e565ae3 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -253,7 +253,8 @@ static void ixgbe_add_rar(struct rte_eth_dev *dev, struct 
ether_addr *mac_addr,
 static void ixgbe_set_default_mac_addr(struct rte_eth_dev *dev,
   struct ether_addr *mac_addr);
 static void ixgbe_dcb_init(struct ixgbe_hw *hw, struct ixgbe_dcb_config 
*dcb_config);
-static int is_ixgbe_pmd(const char *driver_name);
+static bool is_device_supported(struct rte_eth_dev *dev,
+   struct eth_driver *drv);
 
 /* For Virtual Function support */
 static int eth_ixgbevf_dev_init(struct rte_eth_dev *eth_dev);
@@ -4380,16 +4381,14 @@ static int ixgbevf_dev_xstats_get_names(__rte_unused 
struct rte_eth_dev *dev,
ixgbe_add_rar(dev, addr, 0, 0);
 }
 
-static int
-is_ixgbe_pmd(const char *driver_name)
+static bool
+is_device_supported(struct rte_eth_dev *dev, struct eth_driver *drv)
 {
-   if (!strstr(driver_name, "ixgbe"))
-   return -ENOTSUP;
+   if (strcmp(dev->driver->pci_drv.driver.name,
+  drv->pci_drv.driver.name))
+   return false;
 
-   if (strstr(driver_name, "ixgbe_vf"))
-   return -ENOTSUP;
-
-   return 0;
+   return true;
 }
 
 int
@@ -4401,17 +4400,17 @@ static int ixgbevf_dev_xstats_get_names(__rte_unused 
struct rte_eth_dev *dev,
int rar_entry;
uint8_t *new_mac = (uint8_t *)(mac_addr);
struct rte_eth_dev *dev;
-   struct rte_eth_dev_info dev_info;
+   struct rte_pci_device *pci_dev;
 
RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
 
dev = &rte_eth_devices[port];
-   rte_eth_dev_info_get(port, &dev_info);
+   pci_dev = IXGBE_DEV_TO_PCI(dev);
 
-   if (is_ixgbe_pmd(dev_info.driver_name) != 0)
+   if (!is_device_supported(dev, &rte_ixgbe_pmd))
return -ENOTSUP;
 
-   if (vf >= dev_info.max_vfs)
+   if (vf >= pci_dev->max_vfs)
return -EINVAL;
 
hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
@@ -4902,17 +4901,17 @@ static void ixgbevf_set_vfta_all(struct rte_eth_dev 
*dev, bool on)
struct ixgbe_hw *hw;
struct ixgbe_mac_info *mac;
struct rte_eth_dev *dev;
-   struct rte_eth_dev_info dev_info;
+   struct rte_pci_device *pci_dev;
 
RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
 
dev = &rte_eth_devices[port];
-   rte_eth_dev_info_get(port, &dev_info);
+   pci_dev = IXGBE_DEV_TO_PCI(dev);
 
-   if (is_ixgbe_pmd(dev_info.driver_name) != 0)
+   if (!is_device_supported(dev, &rte_ixgbe_pmd))
return -ENOTSUP;
 
-   if (vf >= dev_info.max_vfs)
+   if (vf >= pci_dev->max_vfs)
return -EINVAL;
 
if (on > 1)
@@ -4932,17 +4931,17 @@ static void ixgbevf_set_vfta_all(struct rte_eth_dev 
*dev, bool on)
struct ixgbe_hw *hw;
struct ixgbe_mac_info *mac;
struct rte_eth_dev *dev;
-   struct rte_eth_dev_info dev_info;
+   struct rte_pci_device *pci_dev;
 
RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
 
dev = &rte_eth_devices[port];
-   rte_eth_dev_info_get(port, &dev_info);
+   pci_dev = IXGBE_DEV_TO_PCI(dev);
 
-   if (is_ixgbe_pmd(dev_info.driver_name) != 0)
+   if (!is_device_supported(dev, &rte_ixgbe_pmd))
return -ENOTSUP;
 
-   if (vf >= dev_info.max_vfs)
+   if (vf >= pci_dev->max_vfs)
return -EINVAL;
 
if (on > 1)
@@ -4961,17 +4960,17 @@ static void ixgbevf_set_vfta_all(struct rte_eth_dev 
*dev, bool on)
struct ixgbe_hw *hw;
uint32_t ctrl;
struct rte_eth_dev *dev;
-   struct rte_eth_dev_info dev_info;
+   struct rte_pci_device *pci_dev;
 
RTE_ETH_VALID_PORTID_OR_ERR_RET(port, -ENODEV);
 
dev = &rte_eth_devices[port];
-   rte_eth_dev_info_get(port, &dev_info);
+   pci_dev = IXGBE_DEV_TO_PCI(dev);
 
-   if (is_ixgbe_pmd(dev_info.driver_name) != 0)
+   if (!is_device_supported(dev, &rte_ixgbe_pmd))
return -ENOTSUP;
 
-   if (vf >= dev_info.max_vfs)
+   if (vf >= pci_dev->max_vfs)
return -EINVAL;
 
if (vlan_id > ETHER_MAX_VLAN_ID)
@@ -4997,14 +4996,12 @@ static void ixgbevf_set_vfta_all(struct rte_eth_dev 
*dev, bool on)
struct ixgbe_hw *hw;
uint32_t ctrl;
struct rte_eth_dev *d

Re: [dpdk-dev] [PATCH v2 07/15] event/sw: add support for event queues

2017-02-06 Thread Jerin Jacob
On Mon, Feb 06, 2017 at 10:25:18AM +, Van Haaren, Harry wrote:
> > -Original Message-
> > From: Jerin Jacob [mailto:jerin.ja...@caviumnetworks.com]
> > Sent: Monday, February 6, 2017 9:25 AM
> > To: Van Haaren, Harry 
> > Cc: dev@dpdk.org; Richardson, Bruce 
> > Subject: Re: [PATCH v2 07/15] event/sw: add support for event queues
> > 
> > On Tue, Jan 31, 2017 at 04:14:25PM +, Harry van Haaren wrote:
> > > From: Bruce Richardson 
> > >
> > > Add in the data structures for the event queues, and the eventdev
> > > functions to create and destroy those queues.
> > >
> > > Signed-off-by: Bruce Richardson 
> > > Signed-off-by: Harry van Haaren 
> > > ---
> > >  drivers/event/sw/iq_ring.h  | 176 
> > > 
> > >  drivers/event/sw/sw_evdev.c | 158 +++
> > >  drivers/event/sw/sw_evdev.h |  75 +++
> > >  3 files changed, 409 insertions(+)
> > >  create mode 100644 drivers/event/sw/iq_ring.h
> > >
> > > + */
> > > +
> > > +/*
> > > + * Ring structure definitions used for the internal ring buffers of the
> > > + * SW eventdev implementation. These are designed for single-core use 
> > > only.
> > > + */
> > 
> > If I understand it correctly, IQ and QE rings are single producer and
> > single consumer rings. By the specification, multiple producers through
> > multiple ports can enqueue to the event queues at a time.Does SW 
> > implementation
> > support that? or am I missing something here?
> 
> You're right that the IQ and QE rings are Single Producer, Single Consumer 
> rings. More specifically, the QE is a ring for sending rte_event structs 
> between cores, while the IQ ring is optimized for internal use in the 
> scheduler core - and should not be used to send events between cores. Note 
> that the design of the SW scheduler includes a central core for performing 
> scheduling.

Thanks Harry. One question though,
In RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED mode, multiple SW schedulers can
be active. Right? If so, We need multi consumer. Right?

> 
> In other works, the QE rings transfer events from the worker core to the 
> scheduler - and the scheduler pushes the events into what you call the "event 
> queues" (aka, the atomic/ordered queue itself). These "event queues" are IQ 
> instances. On egress from the scheduler, the event passes through a QE ring 
> to the worker.
> 
> The result is that despite that only SP/SC rings are used, multiple workers 
> can enqueue to any event queue.

Got it.




Re: [dpdk-dev] [RFC] lib/librte_ether: consistent PMD batching behavior

2017-02-06 Thread Yang, Zhiyong
Hi, Adrien:

Sorry for the late reply  due to Chinese new year.

> -Original Message-
> From: Adrien Mazarguil [mailto:adrien.mazarg...@6wind.com]
> Sent: Tuesday, January 24, 2017 12:36 AM
> To: Richardson, Bruce 
> Cc: Ananyev, Konstantin ; Andrew
> Rybchenko ; Yang, Zhiyong
> ; dev@dpdk.org; thomas.monja...@6wind.com
> Subject: Re: [dpdk-dev] [RFC] lib/librte_ether: consistent PMD batching
> behavior
> 
> On Fri, Jan 20, 2017 at 11:48:22AM +, Bruce Richardson wrote:
> > On Fri, Jan 20, 2017 at 11:24:40AM +, Ananyev, Konstantin wrote:
> > > >
> > > > From: Andrew Rybchenko [mailto:arybche...@solarflare.com]
> > > > Sent: Friday, January 20, 2017 10:26 AM
> > > > To: Yang, Zhiyong ; dev@dpdk.org
> > > > Cc: thomas.monja...@6wind.com; Richardson, Bruce
> > > > ; Ananyev, Konstantin
> > > > 
> > > > Subject: Re: [dpdk-dev] [RFC] lib/librte_ether: consistent PMD
> > > > batching behavior
> > > >
> > > > On 01/20/2017 12:51 PM, Zhiyong Yang wrote:
> > > > The rte_eth_tx_burst() function in the file Rte_ethdev.h is
> > > > invoked to transmit output packets on the output queue for DPDK
> > > > applications as follows.
> > > >
> > > > static inline uint16_t
> > > > rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
> > > >  struct rte_mbuf **tx_pkts, uint16_t nb_pkts);
> > > >
> > > > Note: The fourth parameter nb_pkts: The number of packets to
> transmit.
> > > > The rte_eth_tx_burst() function returns the number of packets it
> > > > actually sent. The return value equal to *nb_pkts* means that all
> > > > packets have been sent, and this is likely to signify that other
> > > > output packets could be immediately transmitted again.
> > > > Applications that implement a "send as many packets to transmit as
> > > > possible" policy can check this specific case and keep invoking
> > > > the rte_eth_tx_burst() function until a value less than
> > > > *nb_pkts* is returned.
> > > >
> > > > When you call TX only once in rte_eth_tx_burst, you may get
> > > > different behaviors from different PMDs. One problem that every
> > > > DPDK user has to face is that they need to take the policy into
> > > > consideration at the app- lication level when using any specific
> > > > PMD to send the packets whether or not it is necessary, which
> > > > brings usage complexities and makes DPDK users easily confused
> > > > since they have to learn the details on TX function limit of
> > > > specific PMDs and have to handle the different return value: the
> > > > number of packets transmitted successfully for various PMDs. Some
> > > > PMDs Tx func- tions have a limit of sending at most 32 packets for
> > > > every invoking, some PMDs have another limit of at most 64 packets
> > > > once, another ones have imp- lemented to send as many packets to
> transmit as possible, etc. This will easily cause wrong usage for DPDK users.
> > > >
> > > > This patch proposes to implement the above policy in DPDK lib in
> > > > order to simplify the application implementation and avoid the
> > > > incorrect invoking as well. So, DPDK Users don't need to consider
> > > > the implementation policy and to write duplicated code at the
> > > > application level again when sending packets. In addition to it,
> > > > the users don't need to know the difference of specific PMD TX and
> > > > can transmit the arbitrary number of packets as they expect when
> > > > invoking TX API rte_eth_tx_burst, then check the return value to get
> the number of packets actually sent.
> > > >
> > > > How to implement the policy in DPDK lib? Two solutions are proposed
> below.
> > > >
> > > > Solution 1:
> > > > Implement the wrapper functions to remove some limits for each
> > > > specific PMDs as i40e_xmit_pkts_simple and ixgbe_xmit_pkts_simple
> do like that.
> > > >
> > > > > IMHO, the solution is a bit better since it:
> > > > > 1. Does not affect other PMDs at all
> > > > > 2. Could be a bit faster for the PMDs which require it since has
> > > > >no indirect
> > > > >    function call on each iteration
> > > > > 3. No ABI change
> > >
> > > I also would prefer solution number 1 for the reasons outlined by Andrew
> above.
> > > Also, IMO current limitation for number of packets to TX in some
> > > Intel PMD TX routines are sort of artificial:
> > > - they are not caused by any real HW limitations
> > > - avoiding them at PMD level shouldn't cause any performance or
> functional degradation.
> > > So I don't see any good reason why instead of fixing these
> > > limitations in our own PMDs we are trying to push them to the upper
> (rte_ethdev) layer.
> 
> For what it's worth, I agree with Konstantin. Wrappers should be as thin as
> possible on top of PMD functions, they are not helpers. We could define a
> set of higher level functions for this purpose though.
> 
> In the meantime, exposing and documenting PMD limitations seems safe
> enough.
> 
> We could assert that RX/TX burst requests larger than the size of the target
> q