[dpdk-dev] [PATCH 07/12] pmd/ixgbe: add dev_ptype_info_get implementation

2016-01-05 Thread Tan, Jianfeng


On 1/5/2016 2:12 AM, Ananyev, Konstantin wrote:
>
>> -Original Message-
>> From: Tan, Jianfeng
>> Sent: Thursday, December 31, 2015 6:53 AM
>> To: dev at dpdk.org
>> Cc: Zhang, Helin; Ananyev, Konstantin; Tan, Jianfeng
>> Subject: [PATCH 07/12] pmd/ixgbe: add dev_ptype_info_get implementation
>>
>> Signed-off-by: Jianfeng Tan 
>> ---
>>   drivers/net/ixgbe/ixgbe_ethdev.c | 50 
>> 
>>   drivers/net/ixgbe/ixgbe_ethdev.h |  2 ++
>>   drivers/net/ixgbe/ixgbe_rxtx.c   |  5 +++-
>>   3 files changed, 56 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c 
>> b/drivers/net/ixgbe/ixgbe_ethdev.c
>> index 4c4c6df..de5c3a9 100644
>> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
>> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
>> @@ -166,6 +166,8 @@ static int ixgbe_dev_queue_stats_mapping_set(struct 
>> rte_eth_dev *eth_dev,
>>   uint8_t is_rx);
>>   static void ixgbe_dev_info_get(struct rte_eth_dev *dev,
>> struct rte_eth_dev_info *dev_info);
>> +static int ixgbe_dev_ptype_info_get(struct rte_eth_dev *dev,
>> +uint32_t ptype_mask, uint32_t ptypes[]);
>>   static void ixgbevf_dev_info_get(struct rte_eth_dev *dev,
>>   struct rte_eth_dev_info *dev_info);
>>   static int ixgbe_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
>> @@ -428,6 +430,7 @@ static const struct eth_dev_ops ixgbe_eth_dev_ops = {
>>  .xstats_reset = ixgbe_dev_xstats_reset,
>>  .queue_stats_mapping_set = ixgbe_dev_queue_stats_mapping_set,
>>  .dev_infos_get= ixgbe_dev_info_get,
>> +.dev_ptype_info_get   = ixgbe_dev_ptype_info_get,
>>  .mtu_set  = ixgbe_dev_mtu_set,
>>  .vlan_filter_set  = ixgbe_vlan_filter_set,
>>  .vlan_tpid_set= ixgbe_vlan_tpid_set,
>> @@ -512,6 +515,7 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
>>  .xstats_reset = ixgbevf_dev_stats_reset,
>>  .dev_close= ixgbevf_dev_close,
>>  .dev_infos_get= ixgbevf_dev_info_get,
>> +.dev_ptype_info_get   = ixgbe_dev_ptype_info_get,
>>  .mtu_set  = ixgbevf_dev_set_mtu,
>>  .vlan_filter_set  = ixgbevf_vlan_filter_set,
>>  .vlan_strip_queue_set = ixgbevf_vlan_strip_queue_set,
>> @@ -2829,6 +2833,52 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev, struct 
>> rte_eth_dev_info *dev_info)
>>  dev_info->flow_type_rss_offloads = IXGBE_RSS_OFFLOAD_ALL;
>>   }
>>
>> +static int
>> +ixgbe_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptype_mask,
>> +uint32_t ptypes[])
>> +{
>> +int num = 0;
>> +
>> +if ((dev->rx_pkt_burst == ixgbe_recv_pkts)
>> +|| (dev->rx_pkt_burst == 
>> ixgbe_recv_pkts_lro_single_alloc)
>> +|| (dev->rx_pkt_burst == ixgbe_recv_pkts_lro_bulk_alloc)
>> +|| (dev->rx_pkt_burst == ixgbe_recv_pkts_bulk_alloc)
>> +   ) {
>
> As I remember vector RX for ixgbe sets up packet_type properly too.

Hi Konstantin,

Yes, Helin also reminded me about that. I will add it in the next version.

Thanks,
Jianfeng

>
>> +/* refers to ixgbe_rxd_pkt_info_to_pkt_type() */
>> +if ((ptype_mask & RTE_PTYPE_L2_MASK) == RTE_PTYPE_L2_MASK)
>> +ptypes[num++] = RTE_PTYPE_L2_ETHER;
>> +
>> +if ((ptype_mask & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_MASK) {
>> +ptypes[num++] = RTE_PTYPE_L3_IPV4;
>> +ptypes[num++] = RTE_PTYPE_L3_IPV4_EXT;
>> +ptypes[num++] = RTE_PTYPE_L3_IPV6;
>> +ptypes[num++] = RTE_PTYPE_L3_IPV6_EXT;
>> +}
>> +
>> +if ((ptype_mask & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_MASK) {
>> +ptypes[num++] = RTE_PTYPE_L4_SCTP;
>> +ptypes[num++] = RTE_PTYPE_L4_TCP;
>> +ptypes[num++] = RTE_PTYPE_L4_UDP;
>> +}
>> +
>> +if ((ptype_mask & RTE_PTYPE_TUNNEL_MASK) == 
>> RTE_PTYPE_TUNNEL_MASK)
>> +ptypes[num++] = RTE_PTYPE_TUNNEL_IP;
>> +
>> +if ((ptype_mask & RTE_PTYPE_INNER_L3_MASK) == 
>> RTE_PTYPE_INNER_L3_MASK) {
>> +ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6;
>> +ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6_EXT;
>> +}
>> +
>> +if ((ptype_mask & RTE_PTYPE_INNER_L4_MASK) == 
>> RTE_PTYPE_INNER_L4_MASK) {
>> +ptypes[num++] = RTE_PTYPE_INNER_L4_TCP;
>> +ptypes[num++] = RTE_PTYPE_INNER_L4_UDP;
>> +}
>> +} else
>> +num = -ENOTSUP;
>> +
>> +return num;
>> +}
>> +
>>   static void
>>   ixgbevf_dev_info_get(struct rte_eth_dev *dev,
>>   struct rte_eth_dev_info *dev_info)
>> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h 
>> b/drivers/net/ixgbe/ixgbe_ethdev.h
>> index d26771a..2479830 100644
>> --- a/drivers/ne

[dpdk-dev] [PATCH v2 4/4] virtio: check if any kernel driver is manipulating the virtio device

2016-01-05 Thread Yuanhan Liu
On Mon, Jan 04, 2016 at 05:56:49PM +, Xie, Huawei wrote:
> On 1/5/2016 1:24 AM, Stephen Hemminger wrote:
> > On Mon,  4 Jan 2016 01:56:13 +0800
> > Huawei Xie  wrote:
> >
> >> +  if (pci_dev->kdrv != RTE_KDRV_NONE) {
> >> +  PMD_INIT_LOG(INFO,
> >> +  "kernel driver is manipulating this device." \
> >> +  " Please unbind the kernel driver.");
> > Splitting strings in general is a bad idea since it makes it harder to find 
> > log messages.
> > Also the first clause is lower case and the second is capitalized.
> Got it. This was to avoid the 80-character warning. I will put it on one
> line to make it search-friendly.

I agree with Stephen that _in general_ it's a bad idea. But for this
case, I think it's okay, as it'd be enough to locate the code by
searching "manipulating this device", or "unbind the kernel driver",
or other combinations. I mean, nobody would try searching with:

  "kernel driver is manipulating this device. Please unbind the kernel driver."

Right?

--yliu

> The first clause is lower case because it actually follows "%s():".
> >
> > Lastly, the backslash continuation is unnecessary here and will cause 
> > checkpatch warning.
> >
> 
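
For reference, a minimal C sketch of the point about the backslash: adjacent string literals are concatenated by the compiler, so a message can be split across source lines without a continuation character. The helper name toy_kdrv_bound_msg() is made up for illustration and is not part of the virtio PMD.

```c
#include <string.h>

/*
 * Hypothetical helper: returns the log text discussed above.
 * The two literals are joined at compile time; no '\' is needed.
 */
static const char *
toy_kdrv_bound_msg(void)
{
	return "kernel driver is manipulating this device."
	       " Please unbind the kernel driver.";
}
```

Either half of the message can still be found with a plain text search, which is the trade-off discussed in this thread.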


[dpdk-dev] [PATCH] fix checkpatch errors

2016-01-05 Thread Tan, Jianfeng


> -Original Message-
> From: Xie, Huawei
> Sent: Monday, January 4, 2016 9:52 AM
> To: dev at dpdk.org
> Cc: Mcnamara, John; Tan, Jianfeng; Xie, Huawei
> Subject: [PATCH] fix checkpatch errors
> 
> Signed-off-by: Huawei Xie 
...
>   mbuf_poolname_build(sock_id, pool_name, sizeof(pool_name));
> - return (rte_mempool_lookup((const char *)pool_name));
> + return rte_mempool_lookup((const char *)pool_name);

Hi Huawei,

I assume this patch is meant to fix the error below (reported by checkpatch):
ERROR: return is not a function, parentheses are not required

So maybe the above fix is not necessary? Let's involve more people in the discussion.

And please include the error message in the commit message.

Thanks,
Jianfeng



[dpdk-dev] [PATCH] fix checkpatch errors

2016-01-05 Thread Yuanhan Liu
On Tue, Jan 05, 2016 at 02:21:12AM +, Tan, Jianfeng wrote:
> 
> 
> > -Original Message-
> > From: Xie, Huawei
> > Sent: Monday, January 4, 2016 9:52 AM
> > To: dev at dpdk.org
> > Cc: Mcnamara, John; Tan, Jianfeng; Xie, Huawei
> > Subject: [PATCH] fix checkpatch errors
> > 
> > Signed-off-by: Huawei Xie 
> ...
> > mbuf_poolname_build(sock_id, pool_name, sizeof(pool_name));
> > -   return (rte_mempool_lookup((const char *)pool_name));
> > +   return rte_mempool_lookup((const char *)pool_name);
> 
> Hi Huawei,
> 
> I assume this patch is meant to fix the error below (reported by checkpatch):
> ERROR: return is not a function, parentheses are not required
> 
> So maybe the above fix is not necessary? Let's involve more people in the discussion.

This fix looks good to me.

> And please include the error message in the commit message.

+1

--yliu


[dpdk-dev] [PATCH] fix checkpatch errors

2016-01-05 Thread Xie, Huawei
On 1/5/2016 10:21 AM, Tan, Jianfeng wrote:
>
>> -Original Message-
>> From: Xie, Huawei
>> Sent: Monday, January 4, 2016 9:52 AM
>> To: dev at dpdk.org
>> Cc: Mcnamara, John; Tan, Jianfeng; Xie, Huawei
>> Subject: [PATCH] fix checkpatch errors
>>
>> Signed-off-by: Huawei Xie 
> ...
>>  mbuf_poolname_build(sock_id, pool_name, sizeof(pool_name));
>> -return (rte_mempool_lookup((const char *)pool_name));
>> +return rte_mempool_lookup((const char *)pool_name);
> Hi Huawei,
>
> I assume this patch is meant to fix the error below (reported by checkpatch):
> ERROR: return is not a function, parentheses are not required
>
> So maybe the above fix is not necessary? Let's involve more people in the discussion.
Yes, almost all of the 800 errors are checkpatch errors. The
parentheses around some return expressions, such as a comparison
(return val == 0) or a function call, are also removed. At least in this
patch, they are not needed.
>
> And please include the error message in the commit message.
>
> Thanks,
> Jianfeng
>
>



[dpdk-dev] [PATCH 12/12] examples/l3fwd: add option to parse ptype

2016-01-05 Thread Tan, Jianfeng


> -Original Message-
> From: Ananyev, Konstantin
> Sent: Tuesday, January 5, 2016 2:32 AM
> To: Tan, Jianfeng; dev at dpdk.org
> Cc: Zhang, Helin
> Subject: RE: [PATCH 12/12] examples/l3fwd: add option to parse ptype
> 
> 
> Hi Jianfeng,
> > -Original Message-
> > From: Tan, Jianfeng
> > Sent: Thursday, December 31, 2015 6:53 AM
> > To: dev at dpdk.org
> > Cc: Zhang, Helin; Ananyev, Konstantin; Tan, Jianfeng
> > Subject: [PATCH 12/12] examples/l3fwd: add option to parse ptype
> >
> > First, use the rte_eth_dev_get_ptype_info() API to check whether the device
> > parses the needed packet types. If not, specify the newly added option,
> > --parse-ptype, to parse them in software in the RX callback.
> >
> > Signed-off-by: Jianfeng Tan 
> > ---
> >  examples/l3fwd/main.c | 86
> +++
> >  1 file changed, 86 insertions(+)
> >
> > diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
> > index 5b0c2dd..ccbdce3 100644
> > --- a/examples/l3fwd/main.c
> > +++ b/examples/l3fwd/main.c
> > @@ -174,6 +174,7 @@ static __m128i val_eth[RTE_MAX_ETHPORTS];
> >  static uint32_t enabled_port_mask = 0;
> >  static int promiscuous_on = 0; /**< Ports set in promiscuous mode off by
> default. */
> >  static int numa_on = 1; /**< NUMA is enabled by default. */
> > +static int parse_ptype = 0; /**< parse packet type using rx callback */
> >
> >  #if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH)
> >  static int ipv6 = 0; /**< ipv6 is false by default. */
> > @@ -2022,6 +2023,7 @@ parse_eth_dest(const char *optarg)
> >  #define CMD_LINE_OPT_IPV6 "ipv6"
> >  #define CMD_LINE_OPT_ENABLE_JUMBO "enable-jumbo"
> >  #define CMD_LINE_OPT_HASH_ENTRY_NUM "hash-entry-num"
> > +#define CMD_LINE_OPT_PARSE_PTYPE "parse-ptype"
> >
> >  /* Parse the argument given in the command line of the application */
> >  static int
> > @@ -2038,6 +2040,7 @@ parse_args(int argc, char **argv)
> > {CMD_LINE_OPT_IPV6, 0, 0, 0},
> > {CMD_LINE_OPT_ENABLE_JUMBO, 0, 0, 0},
> > {CMD_LINE_OPT_HASH_ENTRY_NUM, 1, 0, 0},
> > +   {CMD_LINE_OPT_PARSE_PTYPE, 0, 0, 0},
> > {NULL, 0, 0, 0}
> > };
> >
> > @@ -2125,6 +2128,12 @@ parse_args(int argc, char **argv)
> > }
> > }
> >  #endif
> > +   if (!strncmp(lgopts[option_index].name,
> CMD_LINE_OPT_PARSE_PTYPE,
> > +   sizeof(CMD_LINE_OPT_PARSE_PTYPE))) {
> > +   printf("soft parse-ptype is enabled \n");
> > +   parse_ptype = 1;
> > +   }
> > +
> > break;
> >
> > default:
> > @@ -2559,6 +2568,75 @@ check_all_ports_link_status(uint8_t port_num,
> uint32_t port_mask)
> > }
> >  }
> >
> > +static int
> > +check_packet_type_ok(int portid)
> > +{
> > +   int i;
> > +   int ret;
> > +   uint32_t ptypes[RTE_PTYPE_L3_MAX_NUM];
> > +   int ptype_l3_ipv4 = 0, ptype_l3_ipv6 = 0;
> > +
> > +   ret = rte_eth_dev_get_ptype_info(portid, RTE_PTYPE_L3_MASK,
> ptypes);
> > +   for (i = 0; i < ret; ++i) {
> > +   if (ptypes[i] & RTE_PTYPE_L3_IPV4)
> > +   ptype_l3_ipv4 = 1;
> > +   if (ptypes[i] & RTE_PTYPE_L3_IPV6)
> > +   ptype_l3_ipv6 = 1;
> > +   }
> > +
> > +   if (ptype_l3_ipv4 == 0)
> > +   printf("port %d cannot parse RTE_PTYPE_L3_IPV4\n", portid);
> > +
> > +   if (ptype_l3_ipv6 == 0)
> > +   printf("port %d cannot parse RTE_PTYPE_L3_IPV6\n", portid);
> > +
> > +   if (ptype_l3_ipv4 || ptype_l3_ipv6)
> > +   return 1;
> > +
> > +   return 0;
> > +}
> > +static inline void
> > +parse_packet_type(struct rte_mbuf *m)
> > +{
> > +   struct ether_hdr *eth_hdr;
> > +   struct vlan_hdr *vlan_hdr;
> > +   uint32_t packet_type = 0;
> > +   uint16_t ethertype;
> > +
> > +   eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
> > +   ethertype = rte_be_to_cpu_16(eth_hdr->ether_type);
> > +   if (ethertype == ETHER_TYPE_VLAN) {
> 
> I don't think either LPM or EM support packets with VLAN right now.
> So, probably there is no need to support it here.

Good to know. Will remove it.

> 
> > +   vlan_hdr = (struct vlan_hdr *)(eth_hdr + 1);
> > +   ethertype = rte_be_to_cpu_16(vlan_hdr->eth_proto);
> > +   }
> > +   switch (ethertype) {
> > +   case ETHER_TYPE_IPv4:
> > +   packet_type |= RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
> > +   break;
> > +   case ETHER_TYPE_IPv6:
> > +   packet_type |= RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
> > +   break;
> > +   default:
> > +   break;
> > +   }
> > +
> > +   m->packet_type = packet_type;
> 
> Probably:
> m->packet_type |= packet_type;
> in case HW supports some other packet types.

I agree. Will fix it.

> 
> > +}
> > +
> > +static uint16_t
> > +cb_parse_packet_type(uint8_t port __rte_unused,
> > +   uint16_t queue __rte_unused,
> > +   struct rte_mbuf *pkts[],
> > +   uint16_t nb_pkts,
> > 
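
To illustrate the "|=" suggestion above, here is a small standalone sketch. The TOY_* constants are hypothetical stand-ins with made-up values; they are not the RTE_PTYPE_* or ETHER_TYPE_* definitions from DPDK, and the helper names do not exist in l3fwd.

```c
#include <stdint.h>

/* Hypothetical packet-type bits; values are illustrative only. */
#define TOY_PTYPE_L2_ETHER  0x00000001u
#define TOY_PTYPE_L3_IPV4   0x00000010u
#define TOY_PTYPE_L3_IPV6   0x00000040u

/* Host-order EtherType values for IPv4 and IPv6. */
#define TOY_ETHER_TYPE_IPV4 0x0800u
#define TOY_ETHER_TYPE_IPV6 0x86DDu

/* Map a host-order EtherType to an L3 packet-type bit (0 if unknown). */
static uint32_t
toy_l3_ptype(uint16_t ethertype)
{
	switch (ethertype) {
	case TOY_ETHER_TYPE_IPV4:
		return TOY_PTYPE_L3_IPV4;
	case TOY_ETHER_TYPE_IPV6:
		return TOY_PTYPE_L3_IPV6;
	default:
		return 0;
	}
}

/*
 * Merge the software result into whatever the hardware already reported,
 * instead of overwriting it (the "|=" form rather than plain "=").
 */
static uint32_t
toy_merge_ptype(uint32_t hw_ptype, uint16_t ethertype)
{
	return hw_ptype | toy_l3_ptype(ethertype);
}
```

With plain assignment, an L2 type already reported by the hardware would be lost; with the merge it is kept next to the software-derived L3 type.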

[dpdk-dev] [PATCH 08/12] pmd/mlx4: add dev_ptype_info_get implementation

2016-01-05 Thread Tan, Jianfeng


On 1/4/2016 7:11 PM, Adrien Mazarguil wrote:
> Hi Jianfeng,
>
> I'm only commenting on the mlx4/mlx5 bits in this message, see below.
>
> On Thu, Dec 31, 2015 at 02:53:15PM +0800, Jianfeng Tan wrote:
>> Signed-off-by: Jianfeng Tan 
>> ---
>>   drivers/net/mlx4/mlx4.c | 27 +++
>>   1 file changed, 27 insertions(+)
>>
>> diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
>> index 207bfe2..85afa32 100644
>> --- a/drivers/net/mlx4/mlx4.c
>> +++ b/drivers/net/mlx4/mlx4.c
>> @@ -2836,6 +2836,8 @@ rxq_cleanup(struct rxq *rxq)
>>* @param flags
>>*   RX completion flags returned by poll_length_flags().
>>*
>> + * @note: fix mlx4_dev_ptype_info_get() if any change here.
>> + *
>>* @return
>>*   Packet type for struct rte_mbuf.
>>*/
>> @@ -4268,6 +4270,30 @@ mlx4_dev_infos_get(struct rte_eth_dev *dev, struct 
>> rte_eth_dev_info *info)
>>  priv_unlock(priv);
>>   }
>>   
>> +static int
>> +mlx4_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptype_mask,
>> +uint32_t ptypes[])
>> +{
>> +int num = 0;
>> +
>> +if ((dev->rx_pkt_burst == mlx4_rx_burst)
>> +|| (dev->rx_pkt_burst == mlx4_rx_burst_sp)) {
>> +/* refers to rxq_cq_to_pkt_type() */
>> +if ((ptype_mask & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_MASK) {
>> +ptypes[num++] = RTE_PTYPE_L3_IPV4;
>> +ptypes[num++] = RTE_PTYPE_L3_IPV6;
>> +}
>> +
>> +if ((ptype_mask & RTE_PTYPE_INNER_L3_MASK) == 
>> RTE_PTYPE_INNER_L3_MASK) {
>> +ptypes[num++] = RTE_PTYPE_INNER_L3_IPV4;
>> +ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6;
>> +}
>> +} else
>> +num = -ENOTSUP;
>> +
>> +return num;
>> +}
> I think checking for mlx4_rx_burst and mlx4_rx_burst_sp is unnecessary at
> the moment: all RX burst functions update the packet_type field, so there
> is no need for extra complexity.
>
> Same comment for mlx5.

Hi Mazarguil,

My original thought was that rx_pkt_burst could also be set to 
removed_rx_burst, but that indeed does not matter here
because it's only possible when the device is closed.

Another consideration is to keep the same style as the other devices. Each 
kind of device could have several rx burst functions,
so the current implementation leaves room to add new rx burst 
functions. What do you think?

Thanks,
Jianfeng

>
>> +
>>   /**
>>* DPDK callback to get device statistics.
>>*
>> @@ -4989,6 +5015,7 @@ static const struct eth_dev_ops mlx4_dev_ops = {
>>  .stats_reset = mlx4_stats_reset,
>>  .queue_stats_mapping_set = NULL,
>>  .dev_infos_get = mlx4_dev_infos_get,
>> +.dev_ptypes_info_get = mlx4_dev_ptype_info_get,
>>  .vlan_filter_set = mlx4_vlan_filter_set,
>>  .vlan_tpid_set = NULL,
>>  .vlan_strip_queue_set = NULL,
>> -- 
>> 2.1.4
>>
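
The mask-driven fill pattern used in the quoted function can be sketched on its own as follows. The TOY_* masks and values are assumptions made for this example and are not the DPDK definitions; the point is only that a layer is reported when the caller sets the whole mask for that layer, and that the caller must supply an array large enough for every entry.

```c
#include <stdint.h>

/* Hypothetical layer masks and packet types; values are illustrative. */
#define TOY_L3_MASK        0x000000f0u
#define TOY_INNER_L3_MASK  0x00f00000u
#define TOY_L3_IPV4        0x00000010u
#define TOY_L3_IPV6        0x00000040u
#define TOY_INNER_L3_IPV4  0x00100000u
#define TOY_INNER_L3_IPV6  0x00400000u

/*
 * Fill 'ptypes' with the types of every layer whose full mask is set in
 * 'ptype_mask' and return how many entries were written.
 * The caller must provide room for at least four entries.
 */
static int
toy_ptype_info_get(uint32_t ptype_mask, uint32_t ptypes[])
{
	int num = 0;

	if ((ptype_mask & TOY_L3_MASK) == TOY_L3_MASK) {
		ptypes[num++] = TOY_L3_IPV4;
		ptypes[num++] = TOY_L3_IPV6;
	}

	if ((ptype_mask & TOY_INNER_L3_MASK) == TOY_INNER_L3_MASK) {
		ptypes[num++] = TOY_INNER_L3_IPV4;
		ptypes[num++] = TOY_INNER_L3_IPV6;
	}

	return num;
}
```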



[dpdk-dev] Traffic scheduling in DPDK

2016-01-05 Thread ravulakollu.ku...@wipro.com
Thanks Jasvinder, I am running the command below:

./build/qos_sched -c 0xe -n 1  -- --pfc "0,1,3,2" --cfg ./profile.cfg

I bound two 1G physical ports to DPDK and ran the above command 
with the default profile from profile.cfg.
I am using lcores 3 and 2 for RX and TX. It was not successful; I am getting the 
error below.

APP: Initializing port 0... PMD: eth_igb_rx_queue_setup(): 
sw_ring=0x7f5b20ba2240 hw_ring=0x7f5b20ba2680 dma_addr=0xbf87a2680
PMD: eth_igb_tx_queue_setup(): To improve 1G driver performance, consider 
setting the TX WTHRESH value to 4, 8, or 16.
PMD: eth_igb_tx_queue_setup(): sw_ring=0x7f5b20b910c0 hw_ring=0x7f5b20b92100 
dma_addr=0xbf8792100
PMD: eth_igb_start(): <<
done:  Link Up - speed 1000 Mbps - full-duplex
APP: Initializing port 1... PMD: eth_igb_rx_queue_setup(): 
sw_ring=0x7f5b20b80a40 hw_ring=0x7f5b20b80e80 dma_addr=0xbf8780e80
PMD: eth_igb_tx_queue_setup(): To improve 1G driver performance, consider 
setting the TX WTHRESH value to 4, 8, or 16.
PMD: eth_igb_tx_queue_setup(): sw_ring=0x7f5b20b6f8c0 hw_ring=0x7f5b20b70900 
dma_addr=0xbf8770900
PMD: eth_igb_start(): <<
done:  Link Up - speed 1000 Mbps - full-duplex
SCHED: Low level config for pipe profile 0:
Token bucket: period = 3277, credits per period = 8, size = 100
Traffic classes: period = 500, credits per period = [12207, 12207, 
12207, 12207]
Traffic class 3 oversubscription: weight = 0
WRR cost: [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]
EAL: Error - exiting with code: 1
  Cause: Unable to config sched subport 0, err=-2

Please, tell me whether I am missing any other configuration.

Thanks,
Uday


-Original Message-
From: Singh, Jasvinder [mailto:jasvinder.si...@intel.com]
Sent: Monday, January 04, 2016 9:26 PM
To: Ravulakollu Udaya Kumar (WT01 - Product Engineering Service); dev at 
dpdk.org
Subject: RE: [dpdk-dev] Traffic scheduling in DPDK

Hi Uday,


> I have an issue running the qos_sched application in DPDK. Could
> someone tell me how to run the command and what each parameter does
> in the text below?
>
> Application mandatory parameters:
> --pfc "RX PORT, TX PORT, RX LCORE, WT LCORE" : Packet flow configuration
>multiple pfc can be configured in command line


RX PORT - Specifies the packets receive port.
TX PORT - Specifies the packets transmit port.
RXCORE  - Specifies the core used for the packet reception and
          classification stage of the QoS application.
WTCORE  - Specifies the core used for the packet enqueue/dequeue operation
          (QoS scheduling) and for subsequently transmitting the packets out.

Multiple pfc options can be specified, depending on the number of qos sched 
instances required in the application. For example, to run two instances, the 
following can be used:

./build/qos_sched -c 0x7e -n 4 -- --pfc "0,1,2,3,4" --pfc "2,3,5,6" --cfg 
"profile.cfg"

The first instance of qos sched receives packets from port 0 and transmits its 
packets through port 1, while the second instance receives packets from port 
2 and transmits through port 3. For a single qos sched instance, the following 
can be used:

./build/qos_sched -c 0x1e -n 4 -- --pfc "0,1,2,3,4" --cfg "profile.cfg"


Thanks,
Jasvinder
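
As a small aid for reading the command lines above, the sketch below splits a --pfc string into the four fields listed in the usage text and checks whether an lcore is covered by the -c coremask. It is a standalone illustration, not code from qos_sched; struct toy_pfc and both helper names are made up, and it makes no claim about the cause of the err=-2 failure reported above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* The four fields of "RX PORT, TX PORT, RX LCORE, WT LCORE". */
struct toy_pfc {
	unsigned rx_port;
	unsigned tx_port;
	unsigned rx_lcore;
	unsigned wt_lcore;
};

/* Parse the first four comma-separated numbers; false if fewer are present. */
static bool
toy_parse_pfc(const char *s, struct toy_pfc *p)
{
	return sscanf(s, "%u,%u,%u,%u", &p->rx_port, &p->tx_port,
		      &p->rx_lcore, &p->wt_lcore) == 4;
}

/* True if bit 'lcore' is set in the hexadecimal coremask given with -c. */
static bool
toy_lcore_in_mask(uint64_t coremask, unsigned lcore)
{
	return lcore < 64 && ((coremask >> lcore) & 1u) != 0;
}
```

For -c 0xe the mask covers lcores 1, 2 and 3, so both lcores named in --pfc "0,1,3,2" fall inside it.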


[dpdk-dev] [PATCH] virtio: fix rx ring descriptor starvation

2016-01-05 Thread Xie, Huawei
On 12/17/2015 7:18 PM, Tom Kiely wrote:
>
>
> On 11/25/2015 05:32 PM, Xie, Huawei wrote:
>> On 11/13/2015 5:33 PM, Tom Kiely wrote:
>>> If all rx descriptors are processed while transient
>>> mbuf exhaustion is present, the rx ring ends up with
>>> no available descriptors. Thus no packets are received
>>> on that ring. Since descriptor refill is performed post
>>> rx descriptor processing, in this case no refill is
>>> ever subsequently performed resulting in permanent rx
>>> traffic drop.
>>>
>>> Signed-off-by: Tom Kiely 
>>> ---
>>>   drivers/net/virtio/virtio_rxtx.c |6 --
>>>   1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/net/virtio/virtio_rxtx.c
>>> b/drivers/net/virtio/virtio_rxtx.c
>>> index 5770fa2..a95e234 100644
>>> --- a/drivers/net/virtio/virtio_rxtx.c
>>> +++ b/drivers/net/virtio/virtio_rxtx.c
>>> @@ -586,7 +586,8 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf
>>> **rx_pkts, uint16_t nb_pkts)
>>>   if (likely(num > DESC_PER_CACHELINE))
>>>   num = num - ((rxvq->vq_used_cons_idx + num) %
>>> DESC_PER_CACHELINE);
>>>   -if (num == 0)
>>> +/* Refill free descriptors even if no pkts recvd */
>>> +if (num == 0 && virtqueue_full(rxvq))
>> Should the return condition be that no used buffers and we have avail
>> descs in avail ring, i.e,
>>  num == 0 && rxvq->vq_free_cnt != rxvq->vq_nentries
>>
>> rather than
>>  num == 0 && rxvq->vq_free_cnt == 0
> Yes we could do that but I don't see a good reason to wait until the
> vq_free_cnt == vq_nentries
> before attempting the refill. The existing code will attempt refill
> even if only 1 packet was received
> and the free count is small. To me it seems safer to extend that to
> try refill even if no packet was received
> but the free count is non-zero.
The existing code attempts a refill only if at least 1 packet was received.

If we want to refill even when no packet was received, then the strict
condition should be
num == 0 && rxvq->vq_free_cnt != rxvq->vq_nentries

The safer condition, which is the one you want to use, should be
num == 0 && !virtqueue_full(...)
rather than
num == 0 && virtqueue_full(...)

We could simplify things a bit and just remove this check, if the
receiving code that follows already takes care of the "num == 0" condition.

I find the name virtqueue_full confusing; maybe we could change it to
something more meaningful.

>
>Tom
>
>>>   return 0;
>>> num = virtqueue_dequeue_burst_rx(rxvq, rcv_pkts, len, num);
>>> @@ -683,7 +684,8 @@ virtio_recv_mergeable_pkts(void *rx_queue,
>>> virtio_rmb();
>>>   -if (nb_used == 0)
>>> +/* Refill free descriptors even if no pkts recvd */
>>> +if (nb_used == 0 && virtqueue_full(rxvq))
>>>   return 0;
>>> PMD_RX_LOG(DEBUG, "used:%d\n", nb_used);
>
>
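
The three early-return conditions mentioned in this thread can be compared side by side with a toy model. This is only a sketch: struct toy_vq is a stand-in that carries the two field names used above, the function names are invented, and virtqueue_full() is assumed to mean vq_free_cnt == 0, as the quoted alternative spells it out. The sketch does not settle which condition is the right one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in holding only the two counters named in the thread. */
struct toy_vq {
	uint16_t vq_free_cnt;
	uint16_t vq_nentries;
};

/* Condition in the posted patch: num == 0 && virtqueue_full(rxvq). */
static bool
toy_skip_patch(const struct toy_vq *vq, uint16_t num)
{
	return num == 0 && vq->vq_free_cnt == 0;
}

/* "Strict" condition from the review. */
static bool
toy_skip_strict(const struct toy_vq *vq, uint16_t num)
{
	return num == 0 && vq->vq_free_cnt != vq->vq_nentries;
}

/* Negated form from the review: num == 0 && !virtqueue_full(...). */
static bool
toy_skip_not_full(const struct toy_vq *vq, uint16_t num)
{
	return num == 0 && vq->vq_free_cnt != 0;
}
```

The predicates only differ when num is 0; for any non-zero num all three let the receive path continue.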



[dpdk-dev] [PATCH] vhost: remove lockless enqueue to the virtio ring

2016-01-05 Thread Xie, Huawei
On 1/5/2016 2:42 PM, Xie, Huawei wrote:
> This patch removes the internal lockless enqueue implementation.
> DPDK doesn't support receiving/transmitting packets from/to the same
> queue. Vhost PMD wraps a vhost device as a normal DPDK port. DPDK
> applications normally have their own lock implementation when enqueuing
> packets to the same queue of a port.
>
> The atomic cmpset is a costly operation. This patch should help
> performance a bit.
>
> Signed-off-by: Huawei Xie 
This patch modifies the API's behavior, which is also a trivial ABI
change. In my opinion, applications shouldn't rely on the previous behavior.
Anyway, I am checking how to declare the ABI change.
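
A rough sketch of the trade-off, not the vhost code itself: reserving ring slots for several concurrent producers needs a compare-and-swap retry loop, while a single producer (or callers already serialized by the application's own lock) can use a plain add. The function names and the 16-bit index are assumptions made for this example; it uses C11 atomics rather than the DPDK cmpset helpers.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Multi-producer reservation: retry a compare-and-swap until this caller
 * owns the slots [start, start + count). The index wraps at 16 bits.
 */
static uint16_t
toy_reserve_mp(_Atomic uint16_t *res_idx, uint16_t count)
{
	uint16_t start = atomic_load(res_idx);

	while (!atomic_compare_exchange_weak(res_idx, &start,
					     (uint16_t)(start + count)))
		; /* 'start' now holds the current value; try again */

	return start;
}

/*
 * Single-producer reservation: the caller guarantees exclusive access,
 * so no atomic operation is needed.
 */
static uint16_t
toy_reserve_sp(uint16_t *res_idx, uint16_t count)
{
	uint16_t start = *res_idx;

	*res_idx = (uint16_t)(start + count);
	return start;
}
```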


[dpdk-dev] [PATCH] pmd/virtio: fix cannot start virtio dev after stop

2016-01-05 Thread Jianfeng Tan
Fix the issue that a virtio device cannot be started again after being stopped.

The field, hw->started, should be changed by virtio_dev_start/stop instead
of virtio_dev_close.

Signed-off-by: Jianfeng Tan 
---
 drivers/net/virtio/virtio_ethdev.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index d928339..07fe271 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -490,11 +490,13 @@ virtio_dev_close(struct rte_eth_dev *dev)

PMD_INIT_LOG(DEBUG, "virtio_dev_close");

+   if (hw->started == 1)
+   virtio_dev_stop(dev);
+
/* reset the NIC */
if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR);
vtpci_reset(hw);
-   hw->started = 0;
virtio_dev_free_mbufs(dev);
virtio_free_queues(dev);
 }
@@ -1408,10 +1410,9 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev)
if (rte_eal_process_type() == RTE_PROC_SECONDARY)
return -EPERM;

-   if (hw->started == 1) {
-   virtio_dev_stop(eth_dev);
-   virtio_dev_close(eth_dev);
-   }
+   /* Close it anyway since there's no way to know if closed */
+   virtio_dev_close(eth_dev);
+
pci_dev = eth_dev->pci_dev;

eth_dev->dev_ops = NULL;
@@ -1615,6 +1616,8 @@ virtio_dev_stop(struct rte_eth_dev *dev)

PMD_INIT_LOG(DEBUG, "stop");

+   hw->started = 0;
+
if (dev->data->dev_conf.intr_conf.lsc)
rte_intr_disable(&dev->pci_dev->intr_handle);

-- 
2.1.4
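
The ownership rule described in the commit message can be modelled in a few lines. This is a toy state machine, assuming only what the message states: the started flag is changed by start/stop, and close stops the device first if it is still running. struct toy_hw and the toy_* functions are invented for the example and carry none of the real driver logic.

```c
#include <stdbool.h>

/* Toy device state; 'stop_calls' only exists to make the flow observable. */
struct toy_hw {
	bool started;
	bool closed;
	int stop_calls;
};

/* Only start and stop touch the 'started' flag. */
static void
toy_dev_start(struct toy_hw *hw)
{
	hw->started = true;
}

static void
toy_dev_stop(struct toy_hw *hw)
{
	hw->started = false;
	hw->stop_calls++;
}

/* Close stops a running device first and never writes 'started' itself. */
static void
toy_dev_close(struct toy_hw *hw)
{
	if (hw->started)
		toy_dev_stop(hw);
	hw->closed = true;
}
```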



[dpdk-dev] [PATCH v2 0/5] virtio: Tx performance improvements

2016-01-05 Thread Xie, Huawei
On 10/26/2015 10:06 PM, Xie, Huawei wrote:
> On 10/19/2015 1:16 PM, Stephen Hemminger wrote:
>> This is a tested version of the virtio Tx performance improvements
>> that I posted earlier on the list, and described at the DPDK Userspace
>> meeting in Dublin. Together they get a 25% performance improvement for
>> both small packet and large multi-segment packet case when testing
>> from DPDK guest application to Linux KVM host.
>>
>> Stephen Hemminger (5):
>>   virtio: clean up space checks on xmit
>>   virtio: don't use unlikely for normal tx stuff
>>   virtio: use indirect ring elements
>>   virtio: use any layout on transmit
>>   virtio: optimize transmit enqueue
> There is one open question: why is the mergeable header used in the tx
> path? Since the old implementation also uses the mergeable header in the
> tx path when this feature is negotiated, I choose to ack the patch and
> address this later, if not now.
>
> Acked-by: Huawei Xie 

Thomas:
This patch isn't in the patchwork. Does Stephen need to send a new one?

>
>
>
>



[dpdk-dev] [PATCH v2 0/1] change hugepage sorting to avoid overlapping memcpy

2016-01-05 Thread Ralf Hoffmann
Hi,

I want to follow up on the patch about the overlapping memory
areas/hugepage sorting. I have incorporated the qsort patch from Jay
and made the suggested changes. So this fixes both the valgrind
warning about the overlapping memcpy and possible performance problems
due to the bubble sort.

Best Regards,

Ralf

---
Ralf Hoffmann (1):
  change hugepage sorting to avoid overlapping memcpy

 lib/librte_eal/linuxapp/eal/eal_memory.c | 60 
 1 file changed, 14 insertions(+), 46 deletions(-)

-- 
2.5.0



[dpdk-dev] [PATCH v2 1/1] change hugepage sorting to avoid overlapping memcpy

2016-01-05 Thread Ralf Hoffmann
With only one hugepage, or with hugepage addresses that are already sorted,
the sort function called memcpy with the same src and dst pointers. Debugging
with valgrind then issues a warning about the overlapping areas. This patch
changes the sort method to qsort to avoid this behavior, following the
original patch from Jay Rolette. The separate
sort function is no longer necessary.

Signed-off-by: Ralf Hoffmann 
---
v2:

* incorporate patch from http://dpdk.org/dev/patchwork/patch/2061/
  to use qsort instead of bubble sort,
  original patch by Jay Rolette 

 lib/librte_eal/linuxapp/eal/eal_memory.c | 60 
 1 file changed, 14 insertions(+), 46 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 846fd31..a96d10a 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -701,54 +701,23 @@ error:
return -1;
 }

-/*
- * Sort the hugepg_tbl by physical address (lower addresses first on x86,
- * higher address first on powerpc). We use a slow algorithm, but we won't
- * have millions of pages, and this is only done at init time.
- */
 static int
-sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
+cmp_physaddr(const void *a, const void *b)
 {
-   unsigned i, j;
-   int compare_idx;
-   uint64_t compare_addr;
-   struct hugepage_file tmp;
-
-   for (i = 0; i < hpi->num_pages[0]; i++) {
-   compare_addr = 0;
-   compare_idx = -1;
-
-   /*
-* browse all entries starting at 'i', and find the
-* entry with the smallest addr
-*/
-   for (j=i; j< hpi->num_pages[0]; j++) {
-
-   if (compare_addr == 0 ||
-#ifdef RTE_ARCH_PPC_64
-   hugepg_tbl[j].physaddr > compare_addr) {
+#ifndef RTE_ARCH_PPC_64
+   const struct hugepage_file *p1 = (const struct hugepage_file *)a;
+   const struct hugepage_file *p2 = (const struct hugepage_file *)b;
 #else
-   hugepg_tbl[j].physaddr < compare_addr) {
+   /* PowerPC needs memory sorted in reverse order from x86 */
+   const struct hugepage_file *p1 = (const struct hugepage_file *)b;
+   const struct hugepage_file *p2 = (const struct hugepage_file *)a;
 #endif
-   compare_addr = hugepg_tbl[j].physaddr;
-   compare_idx = j;
-   }
-   }
-
-   /* should not happen */
-   if (compare_idx == -1) {
-   RTE_LOG(ERR, EAL, "%s(): error in physaddr sorting\n", 
__func__);
-   return -1;
-   }
-
-   /* swap the 2 entries in the table */
-   memcpy(&tmp, &hugepg_tbl[compare_idx],
-   sizeof(struct hugepage_file));
-   memcpy(&hugepg_tbl[compare_idx], &hugepg_tbl[i],
-   sizeof(struct hugepage_file));
-   memcpy(&hugepg_tbl[i], &tmp, sizeof(struct hugepage_file));
-   }
-   return 0;
+   if (p1->physaddr < p2->physaddr)
+   return -1;
+   else if (p1->physaddr > p2->physaddr)
+   return 1;
+   else
+   return 0;
 }

 /*
@@ -1195,8 +1164,7 @@ rte_eal_hugepage_init(void)
goto fail;
}

-   if (sort_by_physaddr(&tmp_hp[hp_offset], hpi) < 0)
-   goto fail;
+   qsort(&tmp_hp[hp_offset], hpi->num_pages[0], sizeof(struct 
hugepage_file), cmp_physaddr);

 #ifdef RTE_EAL_SINGLE_FILE_SEGMENTS
/* remap all hugepages into single file segments */
-- 
2.5.0
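
For readers who want to try the comparator outside of EAL, here is a standalone sketch of the same idea. struct toy_hugepage is a stand-in with only the physaddr field, the toy_* names are invented, and the PowerPC reverse ordering is left out. Note that the comparator uses explicit comparisons; returning the difference of two 64-bit addresses converted to int could lose the high bits and report a wrong order.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for struct hugepage_file with only the sort key. */
struct toy_hugepage {
	uint64_t physaddr;
};

/* qsort() comparator: ascending physical address. */
static int
toy_cmp_physaddr(const void *a, const void *b)
{
	const struct toy_hugepage *p1 = a;
	const struct toy_hugepage *p2 = b;

	if (p1->physaddr < p2->physaddr)
		return -1;
	if (p1->physaddr > p2->physaddr)
		return 1;
	return 0;
}

/* Sort a table of 'n' entries in place. */
static void
toy_sort_by_physaddr(struct toy_hugepage *tbl, size_t n)
{
	qsort(tbl, n, sizeof(tbl[0]), toy_cmp_physaddr);
}
```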



[dpdk-dev] [PATCH v3 0/1] eal/linux: change hugepage sorting to avoid overlapping memcpy

2016-01-05 Thread Ralf Hoffmann
Hi again,

I forgot to correctly set the commit title, so this is v3.

Best Regards,

Ralf

---
Ralf Hoffmann (1):
  change hugepage sorting to avoid overlapping memcpy

 lib/librte_eal/linuxapp/eal/eal_memory.c | 60 
 1 file changed, 14 insertions(+), 46 deletions(-)

-- 
2.5.0



[dpdk-dev] [PATCH v3 1/1] eal/linux: change hugepage sorting to avoid overlapping memcpy

2016-01-05 Thread Ralf Hoffmann
With only one hugepage, or with hugepage addresses that are already sorted,
the sort function called memcpy with the same src and dst pointers. Debugging
with valgrind then issues a warning about the overlapping areas. This patch
changes the sort method to qsort to avoid this behavior, following the
original patch from Jay Rolette. The separate
sort function is no longer necessary.

Signed-off-by: Ralf Hoffmann 
---
v3:
* set commit title to eal/linux

v2:

* incorporate patch from http://dpdk.org/dev/patchwork/patch/2061/
  to use qsort instead of bubble sort,
  original patch by Jay Rolette 

 lib/librte_eal/linuxapp/eal/eal_memory.c | 60 
 1 file changed, 14 insertions(+), 46 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 846fd31..a96d10a 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -701,54 +701,23 @@ error:
return -1;
 }

-/*
- * Sort the hugepg_tbl by physical address (lower addresses first on x86,
- * higher address first on powerpc). We use a slow algorithm, but we won't
- * have millions of pages, and this is only done at init time.
- */
 static int
-sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
+cmp_physaddr(const void *a, const void *b)
 {
-   unsigned i, j;
-   int compare_idx;
-   uint64_t compare_addr;
-   struct hugepage_file tmp;
-
-   for (i = 0; i < hpi->num_pages[0]; i++) {
-   compare_addr = 0;
-   compare_idx = -1;
-
-   /*
-* browse all entries starting at 'i', and find the
-* entry with the smallest addr
-*/
-   for (j=i; j< hpi->num_pages[0]; j++) {
-
-   if (compare_addr == 0 ||
-#ifdef RTE_ARCH_PPC_64
-   hugepg_tbl[j].physaddr > compare_addr) {
+#ifndef RTE_ARCH_PPC_64
+   const struct hugepage_file *p1 = (const struct hugepage_file *)a;
+   const struct hugepage_file *p2 = (const struct hugepage_file *)b;
 #else
-   hugepg_tbl[j].physaddr < compare_addr) {
+   /* PowerPC needs memory sorted in reverse order from x86 */
+   const struct hugepage_file *p1 = (const struct hugepage_file *)b;
+   const struct hugepage_file *p2 = (const struct hugepage_file *)a;
 #endif
-   compare_addr = hugepg_tbl[j].physaddr;
-   compare_idx = j;
-   }
-   }
-
-   /* should not happen */
-   if (compare_idx == -1) {
-   RTE_LOG(ERR, EAL, "%s(): error in physaddr sorting\n", __func__);
-   return -1;
-   }
-
-   /* swap the 2 entries in the table */
-   memcpy(&tmp, &hugepg_tbl[compare_idx],
-   sizeof(struct hugepage_file));
-   memcpy(&hugepg_tbl[compare_idx], &hugepg_tbl[i],
-   sizeof(struct hugepage_file));
-   memcpy(&hugepg_tbl[i], &tmp, sizeof(struct hugepage_file));
-   }
-   return 0;
+   if (p1->physaddr < p2->physaddr)
+   return -1;
+   else if (p1->physaddr > p2->physaddr)
+   return 1;
+   else
+   return 0;
 }

 /*
@@ -1195,8 +1164,7 @@ rte_eal_hugepage_init(void)
goto fail;
}

-   if (sort_by_physaddr(&tmp_hp[hp_offset], hpi) < 0)
-   goto fail;
+   qsort(&tmp_hp[hp_offset], hpi->num_pages[0], sizeof(struct hugepage_file), cmp_physaddr);

 #ifdef RTE_EAL_SINGLE_FILE_SEGMENTS
/* remap all hugepages into single file segments */
-- 
2.5.0
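
For readers who want to try the comparator approach from the patch above outside of DPDK, here is a minimal sketch. It assumes a hypothetical `struct page_entry` standing in for `struct hugepage_file`, and the helper names (`cmp_physaddr_asc`, `cmp_physaddr_desc`, `sort_pages`) are illustrative only, not taken from the EAL. The reverse order that the patch selects with `RTE_ARCH_PPC_64` is modelled here by swapping the comparator arguments.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct hugepage_file; only physaddr matters. */
struct page_entry {
	uint64_t physaddr;
	int file_id;
};

/*
 * Ascending comparison on physaddr. Explicit comparisons are used instead
 * of a subtraction so a 64-bit difference is never truncated to int.
 */
static int
cmp_physaddr_asc(const void *a, const void *b)
{
	const struct page_entry *p1 = a;
	const struct page_entry *p2 = b;

	if (p1->physaddr < p2->physaddr)
		return -1;
	if (p1->physaddr > p2->physaddr)
		return 1;
	return 0;
}

/* Descending order is obtained by swapping the two arguments. */
static int
cmp_physaddr_desc(const void *a, const void *b)
{
	return cmp_physaddr_asc(b, a);
}

/* Sort the table in place; qsort never copies an element onto itself via memcpy in caller code. */
static void
sort_pages(struct page_entry *tbl, size_t n, int descending)
{
	qsort(tbl, n, sizeof(tbl[0]),
	      descending ? cmp_physaddr_desc : cmp_physaddr_asc);
}
```

Calling `sort_pages(tbl, n, 0)` on a one-element or already sorted table is harmless, which is the case that triggered the valgrind warning with the hand-written swap.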



[dpdk-dev] Traffic scheduling in DPDK

2016-01-05 Thread Singh, Jasvinder
Hi Uday,

> 
> Thanks Jasvinder , I am running the below command
> 
> ./build/qos_sched -c 0xe -n 1  -- --pfc "0,1,3,2" --cfg ./profile.cfg
> 
> Bound two 1G physical ports to DPDK , and started running the above
> command with the default profile mentioned in profile.cfg .
> I am using lcore 3 and 2 for RX and TX. It was not successful, getting the
> below error.
> 
> APP: Initializing port 0... PMD: eth_igb_rx_queue_setup():
> sw_ring=0x7f5b20ba2240 hw_ring=0x7f5b20ba2680 dma_addr=0xbf87a2680
> PMD: eth_igb_tx_queue_setup(): To improve 1G driver performance,
> consider setting the TX WTHRESH value to 4, 8, or 16.
> PMD: eth_igb_tx_queue_setup(): sw_ring=0x7f5b20b910c0
> hw_ring=0x7f5b20b92100 dma_addr=0xbf8792100
> PMD: eth_igb_start(): <<
> done:  Link Up - speed 1000 Mbps - full-duplex
> APP: Initializing port 1... PMD: eth_igb_rx_queue_setup():
> sw_ring=0x7f5b20b80a40 hw_ring=0x7f5b20b80e80 dma_addr=0xbf8780e80
> PMD: eth_igb_tx_queue_setup(): To improve 1G driver performance,
> consider setting the TX WTHRESH value to 4, 8, or 16.
> PMD: eth_igb_tx_queue_setup(): sw_ring=0x7f5b20b6f8c0
> hw_ring=0x7f5b20b70900 dma_addr=0xbf8770900
> PMD: eth_igb_start(): <<
> done:  Link Up - speed 1000 Mbps - full-duplex
> SCHED: Low level config for pipe profile 0:
> Token bucket: period = 3277, credits per period = 8, size = 100
> Traffic classes: period = 500, credits per period = [12207, 12207, 12207, 12207]
> Traffic class 3 oversubscription: weight = 0
> WRR cost: [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]
> EAL: Error - exiting with code: 1
>   Cause: Unable to config sched subport 0, err=-2


The default profile.cfg assumes that all NIC ports run at 10 Gbps.
The above error occurs when the subport's tb_rate (10 Gbps) exceeds the NIC
port's capacity (1 Gbps). Therefore, you need either to use 10 Gbps ports in
your application or to amend profile.cfg to work with 1 Gbps ports.
Please refer to the DPDK QoS framework document for more details on the various
parameters - http://dpdk.org/doc/guides/prog_guide/qos_framework.html
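
As a rough illustration of the arithmetic involved, the sketch below converts a link speed to bytes per second and compares it with a subport rate. It assumes that the rates in profile.cfg are expressed in bytes per second; the helper names are hypothetical, and the -2 return value only mirrors the err=-2 in the log above rather than reproducing the library's actual check.

```c
#include <stdint.h>

/* Convert a link speed in Mbps to bytes per second. */
static uint64_t
link_mbps_to_bytes_per_sec(uint32_t mbps)
{
	return (uint64_t)mbps * 1000 * 1000 / 8;
}

/*
 * Hypothetical model of the sanity check: a subport token bucket rate
 * must be non-zero and must not exceed what the port can carry.
 * Returns 0 when the rate is acceptable, -2 otherwise.
 */
static int
check_subport_rate(uint64_t tb_rate, uint32_t link_mbps)
{
	if (tb_rate == 0 || tb_rate > link_mbps_to_bytes_per_sec(link_mbps))
		return -2;
	return 0;
}
```

Under these assumptions a 1 Gbps port carries at most 125000000 bytes per second, so a subport rate of 1250000000 (the 10 Gbps figure) is rejected, while scaling the rates in profile.cfg down by a factor of ten would pass.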


> -Original Message-
> From: Singh, Jasvinder [mailto:jasvinder.singh at intel.com]
> Sent: Monday, January 04, 2016 9:26 PM
> To: Ravulakollu Udaya Kumar (WT01 - Product Engineering Service);
> dev at dpdk.org
> Subject: RE: [dpdk-dev] Traffic scheduling in DPDK
> 
> Hi Uday,
> 
> 
> > I have an issue in running qos_sched application in DPDK .Could
> > someone tell me how to run the command  and what each parameter does
> > In the below mentioned text.
> >
> > Application mandatory parameters:
> > --pfc "RX PORT, TX PORT, RX LCORE, WT LCORE" : Packet flow
> configuration
> >multiple pfc can be configured in command line
> 
> 
> RX PORT - Specifies the packet receive port.
> TX PORT - Specifies the packet transmit port.
> RX LCORE - Specifies the core used for the packet reception and
> classification stage of the QoS application.
> WT LCORE - Specifies the core used for the packet enqueue/dequeue operation
> (QoS scheduling) and for subsequently transmitting the packets out.
> 
> Multiple pfc can be specified, depending on the number of qos sched
> instances required by the application. For example, to run two instances,
> the following can be used-
> 
> ./build/qos_sched -c 0x7e -n 4 -- --pfc "0,1,2,3,4" --pfc "2,3,5,6" --cfg
> "profile.cfg"
> 
> The first instance of qos sched receives packets from port 0 and transmits
> them through port 1, while the second instance receives packets from
> port 2 and transmits them through port 3. For a single qos sched instance,
> the following can be used-
> 
> ./build/qos_sched -c 0x1e -n 4 -- --pfc "0,1,2,3,4" --cfg "profile.cfg"
> 
> 
> Thanks,
> Jasvinder


[dpdk-dev] [PATCH] mk: fix examples build failure

2016-01-05 Thread steeven lee
Hi Michael:

The examples makefile seems to be broken. It is easy to reproduce on the
master branch; below is the output on Ubuntu 14.04 amd64:

~/work/dpdk$ export RTE_SDK=/home/steeven/work/dpdk
~/work/dpdk$ cd /home/steeven/work/dpdk/examples/helloworld/
~/work/dpdk/examples/helloworld$ export RTE_TARGET=x86_64-native-linuxapp-gcc
~/work/dpdk/examples/helloworld$ make
/home/steeven/work/dpdk/mk/internal/rte.extvars.mk:57: *** Cannot find
.config in /home/xueming/work/dpdk.  Stop.
~/work/dpdk/examples/helloworld$ cd ../cmdline/
~/work/dpdk/examples/cmdline$ make
/home/steeven/work/dpdk/mk/internal/rte.extvars.mk:57: *** Cannot find
.config in /home/xueming/work/dpdk.  Stop.


Thanks,
Steeven

On Mon, Dec 28, 2015 at 12:20 PM, Qiu, Michael  wrote:
> On 12/24/2015 8:38 PM, steeven lee wrote:
>> 1. Fix examples build failure
>> 2. make build as default output folder name
>>
>> Signed-off-by: steeven 
>> ---
>>  mk/internal/rte.extvars.mk | 4 ++--
>>  mk/rte.extsubdir.mk| 2 +-
>>  2 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/mk/internal/rte.extvars.mk b/mk/internal/rte.extvars.mk
>> index 040d39f..cabef0a 100644
>> --- a/mk/internal/rte.extvars.mk
>> +++ b/mk/internal/rte.extvars.mk
>> @@ -52,9 +52,9 @@ RTE_EXTMK ?= $(RTE_SRCDIR)/Makefile
>>  export RTE_EXTMK
>>
>>  # RTE_SDK_BIN must point to .config, include/ and lib/.
>> -RTE_SDK_BIN := $(RTE_SDK)/$(RTE_TARGET)
>> +RTE_SDK_BIN := $(RTE_SDK)/build
>>  ifeq ($(wildcard $(RTE_SDK_BIN)/.config),)
>> -$(error Cannot find .config in $(RTE_SDK))
>> +$(error Cannot find .config in $(RTE_SDK_BIN))
>>  endif
>>
>>  #
>> diff --git a/mk/rte.extsubdir.mk b/mk/rte.extsubdir.mk
>> index f50f006..819020a 100644
>> --- a/mk/rte.extsubdir.mk
>> +++ b/mk/rte.extsubdir.mk
>> @@ -46,7 +46,7 @@ $(DIRS-y):
>> @echo "== $@"
>> $(Q)$(MAKE) -C $(@) \
>> M=$(CURDIR)/$(@)/Makefile \
>> -   O=$(BASE_OUTPUT)/$(CUR_SUBDIR)/$(@)/$(RTE_TARGET) \
>> +   O=$(BASE_OUTPUT)/$(CUR_SUBDIR)/build \
>> BASE_OUTPUT=$(BASE_OUTPUT) \
>> CUR_SUBDIR=$(CUR_SUBDIR)/$(@) \
>> S=$(CURDIR)/$(@) \
>
> Could you show your compile error log? And how to reproduce it?
>
> Thanks,
> Michael


[dpdk-dev] [PATCH] fix checkpatch errors

2016-01-05 Thread Mcnamara, John
> -Original Message-
> From: Tan, Jianfeng
> Sent: Tuesday, January 5, 2016 2:21 AM
> To: Xie, Huawei; dev at dpdk.org
> Cc: Mcnamara, John; Stephen Hemminger; Yuanhan Liu
> Subject: RE: [PATCH] fix checkpatch errors
> 
> 
> 
> > -Original Message-
> > From: Xie, Huawei
> > Sent: Monday, January 4, 2016 9:52 AM
> > To: dev at dpdk.org
> > Cc: Mcnamara, John; Tan, Jianfeng; Xie, Huawei
> > Subject: [PATCH] fix checkpatch errors
> >
> > Signed-off-by: Huawei Xie 
> ...
> > mbuf_poolname_build(sock_id, pool_name, sizeof(pool_name));
> > -   return (rte_mempool_lookup((const char *)pool_name));
> > +   return rte_mempool_lookup((const char *)pool_name);
> 
> Hi Huawei,
> 
> Assume this patch is to solve below error (reported by checkpatch):
> ERROR: return is not a function, parentheses are not required
> 
> So maybe above fix is not necessary? Involve more people to discuss.
> 
> And please include the error message in the commit message.

Hi Huawei,

The fix looks good and there was a similar patch applied previously for lib 
(from Ferruh):

6307b909b8e0 ("lib: remove extra parenthesis after return")

However, the commit message could be better. Maybe something like the above:
"remove extra parentheses".

John
-- 




[dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

2016-01-05 Thread Olivier MATZ
Hi Hanoch,

On 01/04/2016 03:43 PM, Hanoch Haim (hhaim) wrote:
> Hi Oliver,
>
> Let's take your drawing as a reference and add my question
> The use case is sending a duplicate multicast packet by many threads.
> I can split it to x threads to do the job and with atomic-ref (my multicast 
> not mbuf) count it until it reaches zero.
>
> In my following example the two cores (0 and 1) sending the indirect m1/m2 do 
> alloc/attach/send
>
>  core0                             |  core1
> -----------------------------------|-----------------------------------
> m_const=rte_pktmbuf_alloc(mp)      |
>                                    |
> while true:                        |  while True:
>    m1 =rte_pktmbuf_alloc(mp_64)    |    m2 =rte_pktmbuf_alloc(mp_64)
>    rte_pktmbuf_attach(m1, m_const) |    rte_pktmbuf_attach(m2, m_const)
>    tx_burst(m1)                    |    tx_burst(m2)
>
> Is this example not valid?

For me, m_const is not expected to be used concurrently on
several cores. By "used", I mean calling a function that modifies
the mbuf, which is the case for rte_pktmbuf_attach().

> BTW this is our workaround
>
>
>  core0                             |  core1
> -----------------------------------|-----------------------------------
> m_const=rte_pktmbuf_alloc(mp)      |
> rte_mbuf_refcnt_update(m_const,1)  | <<-- workaround
>                                    |
> while true:                        |  while True:
>    m1 =rte_pktmbuf_alloc(mp_64)    |    m2 =rte_pktmbuf_alloc(mp_64)
>    rte_pktmbuf_attach(m1, m_const) |    rte_pktmbuf_attach(m2, m_const)
>    tx_burst(m1)                    |    tx_burst(m2)

This workaround indeed solves the issue. Another solution would be to
protect the call to attach() with a lock, or call all the
rte_pktmbuf_attach() on the same core.

I'm open to discuss this behavior for rte_pktmbuf_attach() function
(should concurrent calls be allowed or not). In any case, we may
want to better document it in the doxygen API comments.
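
To make the race easier to reason about, here is a small standalone model of the behavior discussed above. It is a sketch using C11 atomics, not the DPDK implementation: `struct model_mbuf` and `model_refcnt_update()` are hypothetical names, and the fast path is my reading of the optimization (when the counter reads 1, the caller is assumed to be the only owner and a plain store replaces the atomic add).

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for an mbuf; only the reference counter is modelled. */
struct model_mbuf {
	_Atomic uint16_t refcnt;
};

/*
 * Model of the optimized update. If the counter reads 1, the caller is
 * assumed to be the sole owner, so a plain store is used. Two cores that
 * both observe 1 and both store 2 lose one reference; that is the hazard
 * when several cores attach to the same direct mbuf at once.
 * Otherwise an atomic read-modify-write is performed.
 */
static uint16_t
model_refcnt_update(struct model_mbuf *m, int16_t delta)
{
	if (atomic_load_explicit(&m->refcnt, memory_order_relaxed) == 1) {
		uint16_t v = (uint16_t)(1 + delta);

		atomic_store_explicit(&m->refcnt, v, memory_order_relaxed);
		return v;
	}
	return (uint16_t)(atomic_fetch_add(&m->refcnt, (uint16_t)delta) + delta);
}
```

In this model the workaround amounts to calling `model_refcnt_update(m, 1)` once from a single core before the worker cores start. The counter then stays at 2 or above while they run, so every later update takes the atomic branch.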


Regards,
Olivier


[dpdk-dev] [PATCH v2 1/3] librte_ether: remove RTE_PROC_PRIMARY_OR_ERR_RET and RTE_PROC_PRIMARY_OR_RET

2016-01-05 Thread Reshma Pattan
Macros RTE_PROC_PRIMARY_OR_ERR_RET and RTE_PROC_PRIMARY_OR_RET
are blocking the secondary process from using the APIs.
API access should be given to both secondary and primary.

Reported-by: Sean Harte 
Signed-off-by: Reshma Pattan 
---
v2:
* Removed checkpatch fixes of lib/librte_ether/rte_ethdev.h from this patch.

 lib/librte_ether/rte_ethdev.c |   50 +
 1 files changed, 1 insertions(+), 49 deletions(-)


diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ed971b4..5849102 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -711,10 +711,6 @@ rte_eth_dev_rx_queue_start(uint8_t port_id, uint16_t rx_queue_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -741,10 +737,6 @@ rte_eth_dev_rx_queue_stop(uint8_t port_id, uint16_t rx_queue_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -771,10 +763,6 @@ rte_eth_dev_tx_queue_start(uint8_t port_id, uint16_t tx_queue_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -801,10 +789,6 @@ rte_eth_dev_tx_queue_stop(uint8_t port_id, uint16_t tx_queue_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -874,10 +858,6 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
struct rte_eth_dev_info dev_info;
int diag;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

if (nb_rx_q > RTE_MAX_QUEUES_PER_PORT) {
@@ -1059,10 +1039,6 @@ rte_eth_dev_start(uint8_t port_id)
struct rte_eth_dev *dev;
int diag;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -1096,10 +1072,6 @@ rte_eth_dev_stop(uint8_t port_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_RET();
-
RTE_ETH_VALID_PORTID_OR_RET(port_id);
dev = &rte_eth_devices[port_id];

@@ -1121,10 +1093,6 @@ rte_eth_dev_set_link_up(uint8_t port_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -1138,10 +1106,6 @@ rte_eth_dev_set_link_down(uint8_t port_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -1155,10 +1119,6 @@ rte_eth_dev_close(uint8_t port_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_RET();
-
RTE_ETH_VALID_PORTID_OR_RET(port_id);
dev = &rte_eth_devices[port_id];

@@ -1183,10 +1143,6 @@ rte_eth_rx_queue_setup(uint8_t port_id, uint16_t rx_queue_id,
struct rte_eth_dev *dev;
struct rte_eth_dev_info dev_info;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -1266,10 +1222,6 @@ rte_eth_tx_queue_setup(uint8_t port_id, uint16_t tx_queue_id,
struct rte_eth_dev *dev;
   

[dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

2016-01-05 Thread Hanoch Haim (hhaim)
Hi Oliver, 
Thank you for the fast response; it would be great to open a discussion on
that.
In general, our project can leverage your optimization and I think it is great
(we should have thought of it). We can use it with the workaround I
described.
However, it seems odd to me that rte_pktmbuf_attach(), which does not
*change* anything in m_const except the *atomic* ref counter, does not work
in parallel.
The example I gave is a classic use case of rte_pktmbuf_attach() (multicast),
and I don't see why it shouldn't work after your optimization.

Do you have a pointer to the documentation that states that you can't use
the atomic ref counter from more than one thread?

Thanks,
Hanoh

-Original Message-
From: Olivier MATZ [mailto:olivier.m...@6wind.com] 
Sent: Tuesday, January 05, 2016 12:58 PM
To: Hanoch Haim (hhaim); bruce.richardson at intel.com
Cc: dev at dpdk.org; Ido Barnea (ibarnea); Itay Marom (imarom)
Subject: Re: [dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

Hi Hanoch,

On 01/04/2016 03:43 PM, Hanoch Haim (hhaim) wrote:
> Hi Oliver,
>
> Let's take your drawing as a reference and add my question The use 
> case is sending a duplicate multicast packet by many threads.
> I can split it to x threads to do the job and with atomic-ref (my multicast 
> not mbuf) count it until it reaches zero.
>
> In my following example the two cores (0 and 1) sending the indirect 
> m1/m2 do alloc/attach/send
>
>  core0                             |  core1
> -----------------------------------|-----------------------------------
> m_const=rte_pktmbuf_alloc(mp)      |
>                                    |
> while true:                        |  while True:
>    m1 =rte_pktmbuf_alloc(mp_64)    |    m2 =rte_pktmbuf_alloc(mp_64)
>    rte_pktmbuf_attach(m1, m_const) |    rte_pktmbuf_attach(m2, m_const)
>    tx_burst(m1)                    |    tx_burst(m2)
>
> Is this example not valid?

For me, m_const is not expected to be used concurrently on several cores. By 
"used", I mean calling a function that modifies the mbuf, which is the case for 
rte_pktmbuf_attach().

> BTW this is our workaround
>
>
>  core0                             |  core1
> -----------------------------------|-----------------------------------
> m_const=rte_pktmbuf_alloc(mp)      |
> rte_mbuf_refcnt_update(m_const,1)  | <<-- workaround
>                                    |
> while true:                        |  while True:
>    m1 =rte_pktmbuf_alloc(mp_64)    |    m2 =rte_pktmbuf_alloc(mp_64)
>    rte_pktmbuf_attach(m1, m_const) |    rte_pktmbuf_attach(m2, m_const)
>    tx_burst(m1)                    |    tx_burst(m2)

This workaround indeed solves the issue. Another solution would be to protect 
the call to attach() with a lock, or call all the
rte_pktmbuf_attach() on the same core.

I'm open to discuss this behavior for rte_pktmbuf_attach() function (should 
concurrent calls be allowed or not). In any case, we may want to better 
document it in the doxygen API comments.


Regards,
Olivier


[dpdk-dev] [PATCH v2] mbuf: optimize rte_mbuf_refcnt_update

2016-01-05 Thread Olivier MATZ
Hi Hanoch,

On 01/05/2016 12:11 PM, Hanoch Haim (hhaim) wrote:
> Hi Oliver,
> Thank you for the fast response and it would be great to open a discussion on 
> that.
> In general our project can leverage your optimization and I think it is great 
> (we should have thought about it) . We can use it using the workaround I 
> described.
> However, for me it  seems odd that  rte_pktmbuf_attach () that does not 
> *change* anything in m_const, except of the *atomic* ref counter does not 
> work in parallel.
> The example I gave is a classic use case of rte_pktmbuf_attach  (multicast ) 
> and I don't see why it wouldn't work after your optimization.
>
> Do you have a pointer to the documentation that state that that you can't 
> call the atomic ref counter from more than one thread?

Unfortunately it's not documented yet, but it's something we should
better describe.

Regards,
Olivier


[dpdk-dev] [RFC PATCH 1/3] fm10k: enable FTAG based forwarding

2016-01-05 Thread Wang Xiao W
This patch enables reading sglort info into mbuf for RX and inserting
an FTAG at the beginning of the packet for TX. The vlan_tci_outer field
selected from rte_mbuf structure for sglort is not used in fm10k now.
In FTAG based forwarding mode, the switch will forward packets according
to glort info in FTAG rather than mac and vlan table.

To activate this feature, set CONFIG_RTE_LIBRTE_FM10K_FTAG_FWD to y in
common_linuxapp or common_bsdapp. Currently this feature is supported
only on the PF.
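
As a rough picture of what "inserting an FTAG at the beginning of the packet" means, the sketch below places an 8-byte tag in front of a frame in a plain byte buffer. The field layout follows the `struct fm10k_ftag` used in the unit test of this series. The helper name `prepend_ftag()`, the use of a flat buffer instead of an mbuf, and the big-endian encoding of the destination glort are all assumptions made for illustration.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same four 16-bit fields as struct fm10k_ftag in the unit test. */
struct ftag_model {
	uint16_t swpri_type_user;
	uint16_t vlan;
	uint16_t sglort;
	uint16_t dglort;
};

/*
 * Write the tag followed by the frame into out.
 * Returns the total length written, or 0 if out is too small.
 * Assumption: the destination glort is carried in network byte order.
 */
static size_t
prepend_ftag(uint8_t *out, size_t out_len,
	     const uint8_t *frame, size_t frame_len, uint16_t dglort)
{
	struct ftag_model tag;

	if (out_len < sizeof(tag) + frame_len)
		return 0;

	memset(&tag, 0, sizeof(tag));
	tag.dglort = htons(dglort);

	memcpy(out, &tag, sizeof(tag));
	memcpy(out + sizeof(tag), frame, frame_len);
	return sizeof(tag) + frame_len;
}
```

With this layout the switch would see the destination glort in the last two bytes of the tag, ahead of the original destination MAC address, which is left untouched.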

Signed-off-by: Wang Xiao W 
---
 config/common_bsdapp   |  1 +
 config/common_linuxapp |  1 +
 drivers/net/fm10k/fm10k_ethdev.c   |  5 +
 drivers/net/fm10k/fm10k_rxtx.c | 17 +
 drivers/net/fm10k/fm10k_rxtx_vec.c |  9 +
 5 files changed, 33 insertions(+)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index ed7c31c..451f81a 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -208,6 +208,7 @@ CONFIG_RTE_LIBRTE_FM10K_DEBUG_TX=n
 CONFIG_RTE_LIBRTE_FM10K_DEBUG_TX_FREE=n
 CONFIG_RTE_LIBRTE_FM10K_DEBUG_DRIVER=n
 CONFIG_RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE=y
+CONFIG_RTE_LIBRTE_FM10K_FTAG_FWD=n

 #
 # Compile burst-oriented Mellanox ConnectX-3 (MLX4) PMD
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 74bc515..c928bce 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -207,6 +207,7 @@ CONFIG_RTE_LIBRTE_FM10K_DEBUG_TX_FREE=n
 CONFIG_RTE_LIBRTE_FM10K_DEBUG_DRIVER=n
 CONFIG_RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE=y
 CONFIG_RTE_LIBRTE_FM10K_INC_VECTOR=y
+CONFIG_RTE_LIBRTE_FM10K_FTAG_FWD=n

 #
 # Compile burst-oriented Mellanox ConnectX-3 (MLX4) PMD
diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index e4aed94..d5c376a 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -668,6 +668,11 @@ fm10k_dev_tx_init(struct rte_eth_dev *dev)
PMD_INIT_LOG(ERR, "failed to disable queue %d", i);
return -1;
}
+#ifdef RTE_LIBRTE_FM10K_FTAG_FWD
+   /* enable use of FTAG bit in Tx descriptor, register is RO for VF */
+   if (hw->mac.type == fm10k_mac_pf)
+   FM10K_WRITE_REG(hw, FM10K_PFVTCTL(i), FM10K_PFVTCTL_FTAG_DESC_ENABLE);
+#endif

/* set location and size for descriptor ring */
FM10K_WRITE_REG(hw, FM10K_TDBAL(i),
diff --git a/drivers/net/fm10k/fm10k_rxtx.c b/drivers/net/fm10k/fm10k_rxtx.c
index e958865..f87987d 100644
--- a/drivers/net/fm10k/fm10k_rxtx.c
+++ b/drivers/net/fm10k/fm10k_rxtx.c
@@ -152,6 +152,13 @@ fm10k_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 */
mbuf->ol_flags |= PKT_RX_VLAN_PKT;
mbuf->vlan_tci = desc.w.vlan;
+#ifdef RTE_LIBRTE_FM10K_FTAG_FWD
+   /**
+* mbuf->vlan_tci_outer is an idle field in fm10k driver,
+* so it can be selected to store sglort value.
+*/
+   mbuf->vlan_tci_outer = rte_le_to_cpu_16(desc.w.sglort);
+#endif

rx_pkts[count] = mbuf;
if (++next_dd == q->nb_desc) {
@@ -307,6 +314,13 @@ fm10k_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 */
mbuf->ol_flags |= PKT_RX_VLAN_PKT;
first_seg->vlan_tci = desc.w.vlan;
+#ifdef RTE_LIBRTE_FM10K_FTAG_FWD
+   /**
+* mbuf->vlan_tci_outer is an idle field in fm10k driver,
+* so it can be selected to store sglort value.
+*/
+   first_seg->vlan_tci_outer = rte_le_to_cpu_16(desc.w.sglort);
+#endif

/* Prefetch data of first segment, if configured to do so. */
rte_packet_prefetch((char *)first_seg->buf_addr +
@@ -432,6 +446,9 @@ static inline void tx_xmit_pkt(struct fm10k_tx_queue *q, struct rte_mbuf *mb)
q->nb_free -= mb->nb_segs;

q->hw_ring[q->next_free].flags = 0;
+#ifdef RTE_LIBRTE_FM10K_FTAG_FWD
+   q->hw_ring[q->next_free].flags |= FM10K_TXD_FLAG_FTAG;
+#endif
/* set checksum flags on first descriptor of packet. SCTP checksum
 * offload is not supported, but we do not explicitly check for this
 * case in favor of greatly simplified processing. */
diff --git a/drivers/net/fm10k/fm10k_rxtx_vec.c b/drivers/net/fm10k/fm10k_rxtx_vec.c
index 2a57eef..0b0f2e3 100644
--- a/drivers/net/fm10k/fm10k_rxtx_vec.c
+++ b/drivers/net/fm10k/fm10k_rxtx_vec.c
@@ -198,7 +198,12 @@ fm10k_rx_vec_condition_check(struct rte_eth_dev *dev)
rxmode->header_split == 1)
return -1;

+#ifdef RTE_LIBRTE_FM10K_FTAG_FWD
+   return -1;
+#else
return 0;
+#endif
+
 #else
RTE_SET_USED(dev);
return -1;
@@ -648,7 +653,11 @@ fm10k_tx_vec_condition_check(struct fm10k_tx_queue *txq)
if ((txq->txq_flags & FM10K_SIMPLE_TX_FLAG) != FM10K_SIMPLE_TX_FLAG)

[dpdk-dev] [RFC PATCH 2/3] fm10k: add a unit test for FTAG based forwarding

2016-01-05 Thread Wang Xiao W
This patch adds a unit test case for functional testing of FTAG. Before
running the test, set the PORT0_GLORT and PORT1_GLORT environment variables
and ensure that two fm10k ports are used by DPDK. The glort info for each
port can be shown in TestPoint. In the unit test, a packet is forwarded to
the target port by the switch without changing the destination MAC address.

Signed-off-by: Wang Xiao W 
---
 app/test/Makefile  |   1 +
 app/test/test_fm10k_ftag.c | 253 +
 2 files changed, 254 insertions(+)
 create mode 100644 app/test/test_fm10k_ftag.c

diff --git a/app/test/Makefile b/app/test/Makefile
index ec33e1a..d72be8d 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -57,6 +57,7 @@ SRCS-y += test_memzone.c
 SRCS-y += test_ring.c
 SRCS-y += test_ring_perf.c
 SRCS-y += test_pmd_perf.c
+SRCS-$(CONFIG_RTE_LIBRTE_FM10K_FTAG_FWD) += test_fm10k_ftag.c

 ifeq ($(CONFIG_RTE_LIBRTE_TABLE),y)
 SRCS-y += test_table.c
diff --git a/app/test/test_fm10k_ftag.c b/app/test/test_fm10k_ftag.c
new file mode 100644
index 000..325a652
--- /dev/null
+++ b/app/test/test_fm10k_ftag.c
@@ -0,0 +1,253 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "test.h"
+
+#define RX_RING_SIZE 128
+#define TX_RING_SIZE 512
+
+#define NUM_MBUFS 8191
+#define MBUF_CACHE_SIZE 250
+#define BURST_SIZE 32
+
+struct fm10k_ftag {
+   uint16_t swpri_type_user;
+   uint16_t vlan;
+   uint16_t sglort;
+   uint16_t dglort;
+};
+
+static const struct rte_eth_conf port_conf_default = {
+   .rxmode = { .max_rx_pkt_len = ETHER_MAX_LEN }
+};
+
+/*
+ * Initializes a given port using global settings and with the RX buffers
+ * coming from the mbuf_pool passed as a parameter.
+ */
+static inline int
+port_init(uint8_t port, struct rte_mempool *mbuf_pool)
+{
+   struct rte_eth_conf port_conf = port_conf_default;
+   const uint16_t rx_rings = 1, tx_rings = 1;
+   int retval;
+   uint16_t q;
+
+   if (port >= rte_eth_dev_count())
+   return -1;
+
+   /* Configure the Ethernet device. */
+   retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
+   if (retval != 0)
+   return retval;
+
+   /* Allocate and set up 1 RX queue per Ethernet port. */
+   for (q = 0; q < rx_rings; q++) {
+   retval = rte_eth_rx_queue_setup(port, q, RX_RING_SIZE,
+   rte_eth_dev_socket_id(port), NULL, mbuf_pool);
+   if (retval < 0)
+   return retval;
+   }
+
+   /* Allocate and set up 1 TX queue per Ethernet port. */
+   for (q = 0; q < tx_rings; q++) {
+   retval = rte_eth_tx_queue_setup(port, q, TX_RING_SIZE,
+   rte_eth_dev_socket_id(port), NULL);
+   if (retval < 0)
+   return retval;
+   }
+
+   /* Start the Ethernet port. */
+   retval = rte_eth_dev_start(port);
+   if (retval < 0)
+   return retval;
+
+   /* Display the port MAC address. */
+   struct ether_addr addr;
+   rte_eth_macaddr_get(port, &addr);
+   printf("Port %u MAC: %02" PRIx8 " %02" PRIx8 " %02" PRIx8
+  " %02" PRIx8 

[dpdk-dev] [RFC PATCH 0/3] fm10k: enable FTAG based forwarding

2016-01-05 Thread Wang Xiao W
This is an RFC patch set for the FTAG based forwarding feature of RRC.

Wang Xiao W (3):
  fm10k: enable FTAG based forwarding
  fm10k: add a unit test for FTAG based forwarding
  doc: add introduction for fm10k FTAG based forwarding

 app/test/Makefile  |   1 +
 app/test/test_fm10k_ftag.c | 253 +
 config/common_bsdapp   |   1 +
 config/common_linuxapp |   1 +
 doc/guides/nics/fm10k.rst  |  13 ++
 drivers/net/fm10k/fm10k_ethdev.c   |   5 +
 drivers/net/fm10k/fm10k_rxtx.c |  17 +++
 drivers/net/fm10k/fm10k_rxtx_vec.c |   9 ++
 8 files changed, 300 insertions(+)
 create mode 100644 app/test/test_fm10k_ftag.c

-- 
1.9.3



[dpdk-dev] [RFC PATCH 3/3] doc: add introduction for fm10k FTAG based forwarding

2016-01-05 Thread Wang Xiao W
Add a brief introduction to FTAG that describes what FTAG is and how it works
in forwarding. Instructions on how to run fm10k with FTAG are also included.

Signed-off-by: Wang Xiao W 
---
 doc/guides/nics/fm10k.rst | 13 +
 1 file changed, 13 insertions(+)

diff --git a/doc/guides/nics/fm10k.rst b/doc/guides/nics/fm10k.rst
index 4206b7f..d82bf41 100644
--- a/doc/guides/nics/fm10k.rst
+++ b/doc/guides/nics/fm10k.rst
@@ -34,6 +34,19 @@ FM10K Poll Mode Driver
 The FM10K poll mode driver library provides support for the Intel FM1
 (FM10K) family of 40GbE/100GbE adapters.

+FTAG Based Forwarding of FM10K
+--
+FTAG Based Forwarding is a unique feature of FM10K. The FM10K family of NICs
+support the addition of a Fabric Tag (FTAG) to carry special information.
+The FTAG is placed at the beginning of the frame, it contains information such
+as where the packet comes from and goes, the vlan tag. In FTAG based forwarding
+mode, the switch logic forwards packets according to glort (global resource tag)
+information, other than the mac and vlan table. Now this feature works only on
+PF.
+
+To enable this feature, turn CONFIG_RTE_LIBRTE_FM10K_FTAG_FWD to y in the
+configuration file. A unit test case fm10k_ftag_autotest is for reference, it shows
+how to read sglort info on RX and how to make an FTAG on TX.

 Limitations
 ---
-- 
1.9.3



[dpdk-dev] [PATCH 1/8] bond: use existing enslaved device queues

2016-01-05 Thread Declan Doherty
On 04/12/15 17:14, Stephen Hemminger wrote:
> From: Eric Kinzie 
>
> This solves issues when an active device is added to a bond.
>
> If a device to be enslaved already has transmit and/or receive queues
> allocated, use those and then create any additional queues that are
> necessary.
>
> Signed-off-by: Eric Kinzie 
> Signed-off-by: Stephen Hemminger 
> ---
...
>

Acked-by: Declan Doherty 


[dpdk-dev] [PATCH 2/8] bond mode 4: copy entire config structure

2016-01-05 Thread Declan Doherty
On 04/12/15 17:14, Stephen Hemminger wrote:
> From: Eric Kinzie 
>
> Copy all needed fields from the mode8023ad_private structure in
> bond_mode_8023ad_conf_get().  This help ensure that a subsequent call
> to rte_eth_bond_8023ad_setup() is not passed uninitialized data that
> would result in either incorrect behavior or a failed sanity check.
>
> Fixes: 46fb43683679 ("bond: add mode 4")
>
> Signed-off-by: Eric Kinzie 
> Signed-off-by: Stephen Hemminger 
> ---
...
>

Acked-by: Declan Doherty 


[dpdk-dev] [PATCH 3/8] bond mode 4: do not ignore multicast

2016-01-05 Thread Declan Doherty
On 04/12/15 17:14, Stephen Hemminger wrote:
> From: Eric Kinzie 
>
> The bonding PMD in mode 4 puts all enslaved interfaces into promiscuous
> mode in order to receive LACPDUs and must filter unwanted packets
> after the traffic has been "collected".  Allow broadcast and multicast
> through so that ARP and IPv6 neighbor discovery continue to work.
>
> Fixes: 46fb43683679 ("bond: add mode 4")
>
> Signed-off-by: Eric Kinzie 
> Signed-off-by: Stephen Hemminger 
> ---
...
>


Acked-by: Declan Doherty 


[dpdk-dev] [PATCH 4/8] bond mode 4: allow external state machine

2016-01-05 Thread Declan Doherty
On 04/12/15 17:14, Stephen Hemminger wrote:
> From: Eric Kinzie 
>
> Provide functions to allow an external 802.3ad state machine to transmit
> and receive LACPDUs and to set the collection/distribution flags on
> slave interfaces.
>
> Signed-off-by: Eric Kinzie 
> Signed-off-by: Stephen Hemminger 
> ---
...
>

Acked-by: Declan Doherty 



[dpdk-dev] [PATCH 5/8] bond: active slaves with no primary

2016-01-05 Thread Declan Doherty
On 04/12/15 17:14, Stephen Hemminger wrote:
> From: Eric Kinzie 
>
> If the link state of a slave is "up" when added, it is added to the list
> of active slaves but, even if it is the only slave, is not selected as
> the primary interface.  Generally, handling of link state interrupts
> selects an interface to be primary, but only if the active count is zero.
> This change avoids the situation where there are active slaves but
> no primary.
>
> Signed-off-by: Eric Kinzie 
> Signed-off-by: Stephen Hemminger 
> ---
...
>

Acked-by: Declan Doherty 


[dpdk-dev] [PATCH 6/8] bond: handle slaves with fewer queues than bonding device

2016-01-05 Thread Declan Doherty
On 04/12/15 19:18, Eric Kinzie wrote:
> On Fri Dec 04 19:36:09 +0100 2015, Andriy Berestovskyy wrote:
>> Hi guys,
>> I'm not quite sure if we can support less TX queues on a slave that easy:
>>
>>> queue_id = bond_slave_txqid(internals, i, bd_tx_q->queue_id);
>>> num_tx_slave = rte_eth_tx_burst(slaves[i], queue_id,
>>>   slave_bufs[i], slave_nb_pkts[i]);
>>
>> It seems that two different lcores might end up writing to the same
>> slave queue at the same time, isn't it?
>>
>> Regards,
>> Andriy
>
> Andriy, I think you're probably right about this.  Perhaps it should
> instead refuse to add or refuse to activate a slave with too few
> tx queues.  Could probably fix this with another layer of buffering
> so that an lcore with a valid tx queue could pick up the mbufs later,
> but this doesn't seem very appealing.
>
> Eric
>
>
>> On Fri, Dec 4, 2015 at 6:14 PM, Stephen Hemminger
>>  wrote:
>>> From: Eric Kinzie 
>>>
>>> In the event that the bonding device has a greater number of tx and/or rx
>>> queues than the slave being added, track the queue limits of the slave.
>>> On receive, ignore queue identifiers beyond what the slave interface
>>> can support.  During transmit, pick a different queue id to use if the
>>> intended queue is not available on the slave.
>>>
>>> Signed-off-by: Eric Kinzie 
>>> Signed-off-by: Stephen Hemminger 
>>> ---
...


I don't think there is any straightforward way of supporting slaves with
different numbers of queues. The initial library was written with the
assumption that the number of tx/rx queues would always be the same on
each slave. This is why, when a slave is added to a bonded device, we
reconfigure the queues. For features like RSS we have to have the same
number of rx queues, otherwise the flow distribution to an application
could change in the case of a failover event. Also, by supporting
different numbers of queues between slaves we would no longer be
supporting the standard behavior of ethdevs in DPDK, where we expect that
by using different queues we don't require locking to be thread safe.
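
The locking concern raised in this thread can be illustrated with a small sketch. It assumes a hypothetical fallback mapping that wraps the bonded queue id onto the slave's smaller queue range; the names example_slave_txqid() and example_txq_collides() are made up for illustration and are not the mapping used by the patch under review.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fallback mapping: wrap the bonded queue id onto the
 * slave's smaller queue range. Not the mapping used by the patch. */
static uint16_t
example_slave_txqid(uint16_t bond_qid, uint16_t slave_nb_txq)
{
	assert(slave_nb_txq > 0);
	return bond_qid % slave_nb_txq;
}

/* Return 1 if two distinct bonded queues (each assumed to be owned by
 * its own lcore) resolve to the same slave queue, 0 otherwise. */
static int
example_txq_collides(uint16_t bond_nb_txq, uint16_t slave_nb_txq)
{
	uint16_t a, b;

	for (a = 0; a < bond_nb_txq; a++)
		for (b = a + 1; b < bond_nb_txq; b++)
			if (example_slave_txqid(a, slave_nb_txq) ==
			    example_slave_txqid(b, slave_nb_txq))
				return 1;
	return 0;
}
```

With four bonded queues and a two-queue slave, any such mapping must send at least two bonded queues to the same slave queue, so two lcores could call rte_eth_tx_burst() on it concurrently without a lock.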




[dpdk-dev] [PATCH 8/8] bond: do not activate slave twice

2016-01-05 Thread Declan Doherty
On 04/12/15 17:14, Stephen Hemminger wrote:
> From: Eric Kinzie 
>
> The current code for detecting link during slave addition can cause a
> slave interface to be activated twice -- once during slave_configure()
> and again at the end of __eth_bond_slave_add_lock_free().  This will
> either cause the active slave count to be incorrect or will cause the
> 802.3ad activation function to panic.  Ensure that the interface is not
> activated more than once.
>
> Signed-off-by: Eric Kinzie 
> Signed-off-by: Stephen Hemminger 
> ---
...
>

Acked-by: Declan Doherty 


[dpdk-dev] [PATCH] af_packet: make the device detachable

2016-01-05 Thread Wojciech Zmuda
Fix memory leak when detaching virtual device. Set dev_flags to
RTE_ETH_DEV_DETACHABLE and implement pmd_af_packet_drv.uninit method.
Copy device name to ethdev->data to make it compatible with
rte_eth_dev_allocated().

Signed-off-by: Wojciech Zmuda 
---
 drivers/net/af_packet/rte_eth_af_packet.c | 29 -
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/net/af_packet/rte_eth_af_packet.c 
b/drivers/net/af_packet/rte_eth_af_packet.c
index 767f36b..7ef65ff 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -667,11 +667,13 @@ rte_pmd_init_internals(const char *name,
data->nb_tx_queues = (uint16_t)nb_queues;
data->dev_link = pmd_link;
data->mac_addrs = &(*internals)->eth_addr;
+   strncpy(data->name,
+   (*eth_dev)->data->name, strlen((*eth_dev)->data->name));

(*eth_dev)->data = data;
(*eth_dev)->dev_ops = &ops;
(*eth_dev)->driver = NULL;
-   (*eth_dev)->data->dev_flags = 0;
+   (*eth_dev)->data->dev_flags = RTE_ETH_DEV_DETACHABLE;
(*eth_dev)->data->drv_name = drivername;
(*eth_dev)->data->kdrv = RTE_KDRV_NONE;
(*eth_dev)->data->numa_node = numa_node;
@@ -836,10 +838,35 @@ exit:
return ret;
 }

+static int
+rte_pmd_af_packet_devuninit(const char *name)
+{
+   struct rte_eth_dev *eth_dev = NULL;
+
+   RTE_LOG(INFO, PMD, "Closing AF_PACKET ethdev on numa socket %u\n",
+   rte_socket_id());
+
+   if (name == NULL)
+   return -1;
+
+   /* look up the existing ethdev entry */
+   eth_dev = rte_eth_dev_allocated(name);
+   if (eth_dev == NULL)
+   return -1;
+
+   rte_free(eth_dev->data->dev_private);
+   rte_free(eth_dev->data);
+
+   rte_eth_dev_release_port(eth_dev);
+
+   return 0;
+}
+
 static struct rte_driver pmd_af_packet_drv = {
.name = "eth_af_packet",
.type = PMD_VDEV,
.init = rte_pmd_af_packet_devinit,
+   .uninit = rte_pmd_af_packet_devuninit,
 };

 PMD_REGISTER_DRIVER(pmd_af_packet_drv);
-- 
1.9.1
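
A note on the name copy in the hunk above: strncpy(dst, src, strlen(src)) copies exactly strlen(src) bytes and never writes the terminating NUL, so the destination is only a valid string if it was already zero-filled (whether it is cannot be seen from this hunk). A minimal sketch of a bounded, always-terminated copy follows; example_copy_name() is a hypothetical helper, not part of the patch or of DPDK.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy src into dst (capacity dst_len bytes), always NUL-terminating.
 * Returns 0 on success, -1 if the name had to be truncated. */
static int
example_copy_name(char *dst, size_t dst_len, const char *src)
{
	int n;

	assert(dst != NULL && src != NULL && dst_len > 0);
	/* snprintf() writes at most dst_len - 1 characters plus the NUL. */
	n = snprintf(dst, dst_len, "%s", src);
	return (n < 0 || (size_t)n >= dst_len) ? -1 : 0;
}
```

In the patch this would correspond to passing sizeof(data->name) as the capacity, assuming name is a fixed-size array in that structure.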



[dpdk-dev] [PATCH 4/4] virtio: check if any kernel driver is manipulating the device

2016-01-05 Thread Panu Matilainen
On 01/04/2016 11:02 AM, Xie, Huawei wrote:
> On 12/25/2015 6:33 PM, Xie, Huawei wrote:
>> virtio PMD could use IO port to configure the virtio device without
>> using uio driver.
>>
>> There are two issues with previous implementation:
>> 1) virtio PMD will take over each virtio device blindly even if some
>> are not intended for DPDK.
>> 2) driver conflict between virtio PMD and virtio-net kernel driver.
>>
>> This patch checks if there is any kernel driver manipulating the virtio
>> device before virtio PMD uses IO port to configure the device.
>>
>> Fixes: da978dfdc43b ("virtio: use port IO to get PCI resource")
>>
>> Signed-off-by: Huawei Xie 
>> ---
>>   drivers/net/virtio/virtio_ethdev.c | 7 +++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/net/virtio/virtio_ethdev.c 
>> b/drivers/net/virtio/virtio_ethdev.c
>> index 00015ef..504346a 100644
>> --- a/drivers/net/virtio/virtio_ethdev.c
>> +++ b/drivers/net/virtio/virtio_ethdev.c
>> @@ -1138,6 +1138,13 @@ static int virtio_resource_init_by_ioports(struct 
>> rte_pci_device *pci_dev)
>>  int found = 0;
>>  size_t linesz;
>>
>> +if (pci_dev->kdrv != RTE_KDRV_NONE) {
>> +PMD_INIT_LOG(ERR,
> Better change ERR to INFO and revise the message followed, since user
> might not want to use this device for DPDK.

Indeed. The whole point of this exercise is to have a clear way of 
telling DPDK which virtio devices it should (and should not) use, so it 
should just act accordingly and shut up.

>> +"%s(): kernel driver is manipulating this device." \
>> +" Please unbind the kernel driver.", __func__);

I'd suggest just dropping the whole message, DPDK doesn't log such 
messages for any other devices either. That, or make it a generic 
debug-level log in pci_scan_one().

- Panu -


[dpdk-dev] [PATCH 6/8] bond: handle slaves with fewer queues than bonding device

2016-01-05 Thread Stephen Hemminger
A common usage scenario is to bond a vnic like virtio, which typically has
only a single rx queue, with a VF device that has multiple receive queues.
This is done to support live migration.

On Jan 5, 2016 05:47, "Declan Doherty"  wrote:

> On 04/12/15 19:18, Eric Kinzie wrote:
>
>> On Fri Dec 04 19:36:09 +0100 2015, Andriy Berestovskyy wrote:
>>
>>> Hi guys,
>>> I'm not quite sure if we can support less TX queues on a slave that easy:
>>>
>>>> queue_id = bond_slave_txqid(internals, i, bd_tx_q->queue_id);
>>>> num_tx_slave = rte_eth_tx_burst(slaves[i], queue_id,
>>>>   slave_bufs[i], slave_nb_pkts[i]);
>>>
>>> It seems that two different lcores might end up writing to the same
>>> slave queue at the same time, isn't it?
>>>
>>> Regards,
>>> Andriy
>>>
>>
>> Andriy, I think you're probably right about this.  Perhaps it should
>> instead refuse to add or refuse to activate a slave with too few
>> tx queues.  Could probably fix this with another layer of buffering
>> so that an lcore with a valid tx queue could pick up the mbufs later,
>> but this doesn't seem very appealing.
>>
>> Eric
>>
>>
>> On Fri, Dec 4, 2015 at 6:14 PM, Stephen Hemminger
>>>  wrote:
>>>
>>>> From: Eric Kinzie 
>>>>
>>>> In the event that the bonding device has a greater number of tx and/or rx
>>>> queues than the slave being added, track the queue limits of the slave.
>>>> On receive, ignore queue identifiers beyond what the slave interface
>>>> can support.  During transmit, pick a different queue id to use if the
>>>> intended queue is not available on the slave.
>>>>
>>>> Signed-off-by: Eric Kinzie 
>>>> Signed-off-by: Stephen Hemminger 
>>>> ---

>>> ...
>
>
> I don't think there is any straightforward way of supporting slaves with
> different numbers of queues. The initial library was written with the
> assumption that the number of tx/rx queues would always be the same on each
> slave. This is why, when a slave is added to a bonded device, we reconfigure
> the queues. For features like RSS we have to have the same number of rx
> queues, otherwise the flow distribution to an application could change in
> the case of a failover event. Also, by supporting different numbers of
> queues between slaves we would no longer be supporting the standard
> behavior of ethdevs in DPDK, where we expect that by using different queues
> we don't require locking to be thread safe.
>
>
>


[dpdk-dev] [PATCH 01/12] ethdev: add API to query what/if packet type is set

2016-01-05 Thread Nélio Laranjeiro
On Mon, Jan 04, 2016 at 02:36:14PM +, Ananyev, Konstantin wrote:
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Adrien Mazarguil
> > Sent: Monday, January 04, 2016 11:38 AM
> > To: Tan, Jianfeng
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 01/12] ethdev: add API to query what/if 
> > packet type is set
> > 
> > I'm not sure about the usefulness of this new callback, but one issue I see
> > with rte_eth_dev_get_ptype_info() is that determining the proper size for
> > ptypes[] according to a mask is awkward. For instance suppose
> > RTE_PTYPE_L4_MASK is redefined to a different size at some point, the caller
> > must dynamically adjust its ptypes[] array size to avoid a possible
> > overflow, just in case.
> > 
> > I suggest one of these solutions:
> > 
> > - A callback to query for a single type at once instead (easiest method in
> >   my opinion).
> > 
> > - An additional argument with the number of entries in ptypes[], in which
> >   case rte_eth_dev_get_ptype_info() should return the number of entries that
> >   would have been filled regardless, a bit like snprintf().
> 
> +1 for the second option.
> Also not sure you really need: RTE_PTYPE_*_MAX_NUM macros.
> Konstantin

+1 for the second option.  But see below.

> > 
> > On Thu, Dec 31, 2015 at 02:53:08PM +0800, Jianfeng Tan wrote:
> > > Add a new API rte_eth_dev_get_ptype_info to query what/if packet type will
> > > be set by current rx burst function.
> > >
> > > Signed-off-by: Jianfeng Tan 
> > > ---
> > >  lib/librte_ether/rte_ethdev.c | 12 
> > >  lib/librte_ether/rte_ethdev.h | 22 ++
> > >  lib/librte_mbuf/rte_mbuf.h| 13 +
> > >  3 files changed, 47 insertions(+)
> > >
> > > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> > > index ed971b4..1885374 100644
> > > --- a/lib/librte_ether/rte_ethdev.c
> > > +++ b/lib/librte_ether/rte_ethdev.c
> > > @@ -1614,6 +1614,18 @@ rte_eth_dev_info_get(uint8_t port_id, struct 
> > > rte_eth_dev_info *dev_info)
> > >   dev_info->driver_name = dev->data->drv_name;
> > >  }
> > >
> > > +int
> > > +rte_eth_dev_get_ptype_info(uint8_t port_id, uint32_t ptype_mask,
> > > + uint32_t ptypes[])
> > > +{
> > > + struct rte_eth_dev *dev;
> > > +
> > > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > + dev = &rte_eth_devices[port_id];
> > > + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_ptype_info_get, -ENOTSUP);
> > > + return (*dev->dev_ops->dev_ptype_info_get)(dev, ptype_mask, ptypes);
> > > +}
> > > +
> > >  void
> > >  rte_eth_macaddr_get(uint8_t port_id, struct ether_addr *mac_addr)
> > >  {
> > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > index bada8ad..e97b632 100644
> > > --- a/lib/librte_ether/rte_ethdev.h
> > > +++ b/lib/librte_ether/rte_ethdev.h
> > > @@ -1021,6 +1021,10 @@ typedef void (*eth_dev_infos_get_t)(struct 
> > > rte_eth_dev *dev,
> > >   struct rte_eth_dev_info *dev_info);
> > >  /**< @internal Get specific informations of an Ethernet device. */
> > >
> > > +typedef int (*eth_dev_ptype_info_get_t)(struct rte_eth_dev *dev,
> > > + uint32_t ptype_mask, uint32_t ptypes[]);
> > > +/**< @internal Get ptype info of eth_rx_burst_t. */
> > > +
> > >  typedef int (*eth_queue_start_t)(struct rte_eth_dev *dev,
> > >   uint16_t queue_id);
> > >  /**< @internal Start rx and tx of a queue of an Ethernet device. */
> > > @@ -1347,6 +1351,7 @@ struct eth_dev_ops {
> > >   eth_queue_stats_mapping_set_t queue_stats_mapping_set;
> > >   /**< Configure per queue stat counter mapping. */
> > >   eth_dev_infos_get_tdev_infos_get; /**< Get device info. */
> > > + eth_dev_ptype_info_get_t   dev_ptype_info_get; /** Get ptype info */
> > >   mtu_set_t  mtu_set; /**< Set MTU. */
> > >   vlan_filter_set_t  vlan_filter_set;  /**< Filter VLAN Setup. */
> > >   vlan_tpid_set_tvlan_tpid_set;  /**< Outer VLAN TPID 
> > > Setup. */
> > > @@ -2273,6 +2278,23 @@ extern void rte_eth_dev_info_get(uint8_t port_id,
> > >struct rte_eth_dev_info *dev_info);
> > >
> > >  /**
> > > + * Retrieve the contextual information of an Ethernet device.
> > > + *
> > > + * @param port_id
> > > + *   The port identifier of the Ethernet device.
> > > + * @param ptype_mask
> > > + *   A hint of what kind of packet type which the caller is interested in
> > > + * @param ptypes
> > > + *   An array of packet types to be filled with
> > > + * @return
> > > + *   - (>=0) if successful. Indicate number of valid values in ptypes 
> > > array.
> > > + *   - (-ENOTSUP) if hardware-assisted VLAN stripping not configured.
> > > + *   - (-ENODEV) if *port_id* invalid.
> > > + */
> > > +extern int rte_eth_dev_get_ptype_info(uint8_t port_id,
> > > +  uint32_t ptype_mask, uint32_t ptypes[]);
> > > +
> > > +/**
> > 

[dpdk-dev] [PATCH 08/12] pmd/mlx4: add dev_ptype_info_get implementation

2016-01-05 Thread Adrien Mazarguil
On Tue, Jan 05, 2016 at 11:08:04AM +0800, Tan, Jianfeng wrote:
> 
> 
> On 1/4/2016 7:11 PM, Adrien Mazarguil wrote:
> >Hi Jianfeng,
> >
> >I'm only commenting on the mlx4/mlx5 bits in this message, see below.
> >
> >On Thu, Dec 31, 2015 at 02:53:15PM +0800, Jianfeng Tan wrote:
> >>Signed-off-by: Jianfeng Tan 
> >>---
> >>  drivers/net/mlx4/mlx4.c | 27 +++
> >>  1 file changed, 27 insertions(+)
> >>
> >>diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
> >>index 207bfe2..85afa32 100644
> >>--- a/drivers/net/mlx4/mlx4.c
> >>+++ b/drivers/net/mlx4/mlx4.c
> >>@@ -2836,6 +2836,8 @@ rxq_cleanup(struct rxq *rxq)
> >>   * @param flags
> >>   *   RX completion flags returned by poll_length_flags().
> >>   *
> >>+ * @note: fix mlx4_dev_ptype_info_get() if any change here.
> >>+ *
> >>   * @return
> >>   *   Packet type for struct rte_mbuf.
> >>   */
> >>@@ -4268,6 +4270,30 @@ mlx4_dev_infos_get(struct rte_eth_dev *dev, struct 
> >>rte_eth_dev_info *info)
> >>priv_unlock(priv);
> >>  }
> >>+static int
> >>+mlx4_dev_ptype_info_get(struct rte_eth_dev *dev, uint32_t ptype_mask,
> >>+   uint32_t ptypes[])

Note this line is not properly indented (uint32_t should be aligned like the
rest of the file).

> >>+{
> >>+   int num = 0;
> >>+
> >>+   if ((dev->rx_pkt_burst == mlx4_rx_burst)
> >>+   || (dev->rx_pkt_burst == mlx4_rx_burst_sp)) {

I prefer operators/separators at the end of the previous line, indentation
should be fixed as well.

> >>+   /* refers to rxq_cq_to_pkt_type() */
> >>+   if ((ptype_mask & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_MASK) {
> >>+   ptypes[num++] = RTE_PTYPE_L3_IPV4;
> >>+   ptypes[num++] = RTE_PTYPE_L3_IPV6;
> >>+   }
> >>+
> >>+   if ((ptype_mask & RTE_PTYPE_INNER_L3_MASK) == 
> >>RTE_PTYPE_INNER_L3_MASK) {
> >>+   ptypes[num++] = RTE_PTYPE_INNER_L3_IPV4;
> >>+   ptypes[num++] = RTE_PTYPE_INNER_L3_IPV6;
> >>+   }
> >>+   } else
> >>+   num = -ENOTSUP;
> >>+
> >>+   return num;
> >>+}
> >I think checking for mlx4_rx_burst and mlx4_rx_burst_sp is unnecessary at
> >the moment, all RX burst functions do update the packet_type field, no need
> >for extra complexity.
> >
> >Same comment for mlx5.
> 
> Hi Mazarguil,
> 
> My original thought is that rx_pkt_burst could be also set as
> removed_rx_burst, which does not make sense indeed
> because it's only possible when the device is closed.

Yes, indeed.

> Another consideration is to keep same style with other devices. Each
> kind of device could have several rx burst functions.
> So current implementation can keep extensibility to add new rx burst
> functions. What do you think of it?

OK, that makes sense. Please check my above comments about coding
style/indents (I know I'm annoying).

> >>+
> >>  /**
> >>   * DPDK callback to get device statistics.
> >>   *
> >>@@ -4989,6 +5015,7 @@ static const struct eth_dev_ops mlx4_dev_ops = {
> >>.stats_reset = mlx4_stats_reset,
> >>.queue_stats_mapping_set = NULL,
> >>.dev_infos_get = mlx4_dev_infos_get,
> >>+   .dev_ptype_info_get = mlx4_dev_ptype_info_get,
> >>.vlan_filter_set = mlx4_vlan_filter_set,
> >>.vlan_tpid_set = NULL,
> >>.vlan_strip_queue_set = NULL,
> >>-- 
> >>2.1.4
> >>
> 

-- 
Adrien Mazarguil
6WIND


[dpdk-dev] [PATCH] fix checkpatch errors

2016-01-05 Thread Xie, Huawei
On 1/5/2016 6:22 PM, Mcnamara, John wrote:
>> -Original Message-
>> From: Tan, Jianfeng
>> Sent: Tuesday, January 5, 2016 2:21 AM
>> To: Xie, Huawei; dev at dpdk.org
>> Cc: Mcnamara, John; Stephen Hemminger; Yuanhan Liu
>> Subject: RE: [PATCH] fix checkpatch errors
>>
>>
>>
>>> -Original Message-
>>> From: Xie, Huawei
>>> Sent: Monday, January 4, 2016 9:52 AM
>>> To: dev at dpdk.org
>>> Cc: Mcnamara, John; Tan, Jianfeng; Xie, Huawei
>>> Subject: [PATCH] fix checkpatch errors
>>>
>>> Signed-off-by: Huawei Xie 
>> ...
>>> mbuf_poolname_build(sock_id, pool_name, sizeof(pool_name));
>>> -   return (rte_mempool_lookup((const char *)pool_name));
>>> +   return rte_mempool_lookup((const char *)pool_name);
>> Hi Huawei,
>>
>> Assume this patch is to solve below error (reported by checkpatch):
>> ERROR: return is not a function, parentheses are not required
>>
>> So maybe above fix is not necessary? Involve more people to discuss.
>>
>> And please include the error message in the commit message.
> Hi Huawei,
>
> The fix looks good and there was a similar patch applied previously for lib 
> (from Ferruh):
>
> 6307b909b8e0 ("lib: remove extra parenthesis after return")
Oh yes, but no idea why Ferruh Yigit missed so many. I have grepped for the
pattern, so this patch should fix almost all of them.
>
> However, the commit message could be better. Maybe something like the above:
> "remove extra parentheses".
OK. Weird that my commit message gets lost again. Will send a new one.
>
> John



[dpdk-dev] [PATCH] bnx2x: remove unused mbuf_alloc_size

2016-01-05 Thread Stephen Hemminger
The mbuf_alloc_size is leftover from BSD or some other code base.
It is set but never used in DPDK driver.  After that the related defines
can also be eliminated.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/bnx2x/bnx2x.c |  9 -
 drivers/net/bnx2x/bnx2x.h | 18 --
 2 files changed, 27 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x.c b/drivers/net/bnx2x/bnx2x.c
index 67af5da..6ba6f44 100644
--- a/drivers/net/bnx2x/bnx2x.c
+++ b/drivers/net/bnx2x/bnx2x.c
@@ -2331,15 +2331,6 @@ static void bnx2x_set_fp_rx_buf_size(struct bnx2x_softc 
*sc)
/* get the Rx buffer size for RX frames */
sc->fp[i].rx_buf_size =
(IP_HEADER_ALIGNMENT_PADDING + ETH_OVERHEAD + sc->mtu);
-
-   /* get the mbuf allocation size for RX frames */
-   if (sc->fp[i].rx_buf_size <= MCLBYTES) {
-   sc->fp[i].mbuf_alloc_size = MCLBYTES;
-   } else if (sc->fp[i].rx_buf_size <= BNX2X_PAGE_SIZE) {
-   sc->fp[i].mbuf_alloc_size = PAGE_SIZE;
-   } else {
-   sc->fp[i].mbuf_alloc_size = MJUM9BYTES;
-   }
}
 }

diff --git a/drivers/net/bnx2x/bnx2x.h b/drivers/net/bnx2x/bnx2x.h
index 2abab0c..9682b8d 100644
--- a/drivers/net/bnx2x/bnx2x.h
+++ b/drivers/net/bnx2x/bnx2x.h
@@ -151,23 +151,6 @@ struct bnx2x_device_type {
 #define FW_PREFETCH_CNT  16U
 #define DROPLESS_FC_HEADROOM 100

-#ifndef MCLSHIFT
-#define MCLSHIFT  11
-#endif
-#define MCLBYTES  (1 << MCLSHIFT)
-
-#if !defined(MJUMPAGESIZE)
-#if BNX2X_PAGE_SIZE < 2048
-#define MJUMPAGESIZEMCLBYTES
-#elif BNX2X_PAGE_SIZE <= 8192
-#define MJUMPAGESIZEBNX2X_PAGE_SIZE
-#else
-#define MJUMPAGESIZE(8 * 1024)
-#endif
-#endif
-#define MJUM9BYTES  (9 * 1024)
-#define MJUM16BYTES (16 * 1024)
-
 /*
  * Transmit Buffer Descriptor (tx_bd) definitions*
  */
@@ -402,7 +385,6 @@ struct bnx2x_fastpath {
uint8_t fw_sb_id;  /* status block number in FW */

uint32_t rx_buf_size;
-   int mbuf_alloc_size;

int state;
 #define BNX2X_FP_STATE_CLOSED  0x01
-- 
2.1.4



[dpdk-dev] [PATCH v3 1/3] librte_ether: remove RTE_PROC_PRIMARY_OR_ERR_RET and RTE_PROC_PRIMARY_OR_RET

2016-01-05 Thread Reshma Pattan
Macros RTE_PROC_PRIMARY_OR_ERR_RET and RTE_PROC_PRIMARY_OR_RET
are blocking the secondary process from using the APIs.
API access should be given to both secondary and primary.

Reported-by: Sean Harte 
Signed-off-by: Reshma Pattan 
---
v3:
* Removed checkpatch fixes of lib/librte_ether/rte_ethdev.h from this patch

 lib/librte_ether/rte_ethdev.c |   50 +
 1 files changed, 1 insertions(+), 49 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ed971b4..5849102 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -711,10 +711,6 @@ rte_eth_dev_rx_queue_start(uint8_t port_id, uint16_t 
rx_queue_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -741,10 +737,6 @@ rte_eth_dev_rx_queue_stop(uint8_t port_id, uint16_t 
rx_queue_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -771,10 +763,6 @@ rte_eth_dev_tx_queue_start(uint8_t port_id, uint16_t 
tx_queue_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -801,10 +789,6 @@ rte_eth_dev_tx_queue_stop(uint8_t port_id, uint16_t 
tx_queue_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -874,10 +858,6 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, 
uint16_t nb_tx_q,
struct rte_eth_dev_info dev_info;
int diag;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

if (nb_rx_q > RTE_MAX_QUEUES_PER_PORT) {
@@ -1059,10 +1039,6 @@ rte_eth_dev_start(uint8_t port_id)
struct rte_eth_dev *dev;
int diag;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -1096,10 +1072,6 @@ rte_eth_dev_stop(uint8_t port_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_RET();
-
RTE_ETH_VALID_PORTID_OR_RET(port_id);
dev = &rte_eth_devices[port_id];

@@ -1121,10 +1093,6 @@ rte_eth_dev_set_link_up(uint8_t port_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -1138,10 +1106,6 @@ rte_eth_dev_set_link_down(uint8_t port_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -1155,10 +1119,6 @@ rte_eth_dev_close(uint8_t port_id)
 {
struct rte_eth_dev *dev;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_RET();
-
RTE_ETH_VALID_PORTID_OR_RET(port_id);
dev = &rte_eth_devices[port_id];

@@ -1183,10 +1143,6 @@ rte_eth_rx_queue_setup(uint8_t port_id, uint16_t 
rx_queue_id,
struct rte_eth_dev *dev;
struct rte_eth_dev_info dev_info;

-   /* This function is only safe when called from the primary process
-* in a multi-process setup*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);

dev = &rte_eth_devices[port_id];
@@ -1266,10 +1222,6 @@ rte_eth_tx_queue_setup(uint8_t port_id, uint16_t 
tx_queue_id,
struct rte_eth_dev *dev;
s

[dpdk-dev] [PATCH v3 3/3] librte_ether: fix rte_eth_dev_configure

2016-01-05 Thread Reshma Pattan
User should be able to configure ethdev with zero rx/tx queues, but both
should not be zero.
After the above change, rte_eth_dev_tx_queue_config and
rte_eth_dev_rx_queue_config should allocate memory for rx/tx queues only
when the number of rx/tx queues is nonzero.

Signed-off-by: Reshma Pattan 
---
 lib/librte_ether/rte_ethdev.c |   36 
 1 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 5849102..a7647b6 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -673,7 +673,7 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)
void **rxq;
unsigned i;

-   if (dev->data->rx_queues == NULL) { /* first time configuration */
+   if (dev->data->rx_queues == NULL && nb_queues != 0) { /* first time 
configuration */
dev->data->rx_queues = rte_zmalloc("ethdev->rx_queues",
sizeof(dev->data->rx_queues[0]) * nb_queues,
RTE_CACHE_LINE_SIZE);
@@ -681,7 +681,7 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)
dev->data->nb_rx_queues = 0;
return -(ENOMEM);
}
-   } else { /* re-configure */
+   } else if (dev->data->rx_queues != NULL && nb_queues != 0) { /* 
re-configure */
RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release, 
-ENOTSUP);

rxq = dev->data->rx_queues;
@@ -701,6 +701,13 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)

dev->data->rx_queues = rxq;

+   } else if (dev->data->rx_queues != NULL && nb_queues == 0) {
+   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release, 
-ENOTSUP);
+
+   rxq = dev->data->rx_queues;
+
+   for (i = nb_queues; i < old_nb_queues; i++)
+   (*dev->dev_ops->rx_queue_release)(rxq[i]);
}
dev->data->nb_rx_queues = nb_queues;
return 0;
@@ -817,7 +824,7 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)
void **txq;
unsigned i;

-   if (dev->data->tx_queues == NULL) { /* first time configuration */
+   if (dev->data->tx_queues == NULL && nb_queues != 0) { /* first time 
configuration */
dev->data->tx_queues = rte_zmalloc("ethdev->tx_queues",
   
sizeof(dev->data->tx_queues[0]) * nb_queues,
   RTE_CACHE_LINE_SIZE);
@@ -825,7 +832,7 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)
dev->data->nb_tx_queues = 0;
return -(ENOMEM);
}
-   } else { /* re-configure */
+   } else if (dev->data->tx_queues != NULL && nb_queues != 0) { /* 
re-configure */
RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_release, 
-ENOTSUP);

txq = dev->data->tx_queues;
@@ -845,6 +852,13 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)

dev->data->tx_queues = txq;

+   } else if (dev->data->tx_queues != NULL && nb_queues == 0) {
+   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_release, 
-ENOTSUP);
+
+   txq = dev->data->tx_queues;
+
+   for (i = nb_queues; i < old_nb_queues; i++)
+   (*dev->dev_ops->tx_queue_release)(txq[i]);
}
dev->data->nb_tx_queues = nb_queues;
return 0;
@@ -891,25 +905,23 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, 
uint16_t nb_tx_q,
 * configured device.
 */
(*dev->dev_ops->dev_infos_get)(dev, &dev_info);
+
+   if (nb_rx_q == 0 && nb_tx_q == 0) {
+   RTE_PMD_DEBUG_TRACE("ethdev port_id=%d both rx and tx queue 
cannot be 0\n", port_id);
+   return -EINVAL;
+   }
+
if (nb_rx_q > dev_info.max_rx_queues) {
RTE_PMD_DEBUG_TRACE("ethdev port_id=%d nb_rx_queues=%d > %d\n",
port_id, nb_rx_q, dev_info.max_rx_queues);
return -EINVAL;
}
-   if (nb_rx_q == 0) {
-   RTE_PMD_DEBUG_TRACE("ethdev port_id=%d nb_rx_q == 0\n", 
port_id);
-   return -EINVAL;
-   }

if (nb_tx_q > dev_info.max_tx_queues) {
RTE_PMD_DEBUG_TRACE("ethdev port_id=%d nb_tx_queues=%d > %d\n",
port_id, nb_tx_q, dev_info.max_tx_queues);
return -EINVAL;
}
-   if (nb_tx_q == 0) {
-   RTE_PMD_DEBUG_TRACE("ethdev port_id=%d nb_tx_q == 0\n", 
port_id);
-   return -EINVAL;
-   }

/* Copy the dev_conf parameter into the dev structure */
memcpy(&dev->data->dev_conf, dev_conf, sizeof(dev->data->dev_conf));
-- 
1.7.4
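
The behaviour described by this patch can be summarised in a small standalone sketch. It only models the decisions visible in the diff (the validation in rte_eth_dev_configure() and the four cases of the queue config helpers); the example_* names and the enum are hypothetical and nothing here calls into DPDK.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Validation after the patch: either count may be zero, but not both,
 * and neither may exceed the device maximum. */
static int
example_check_queue_counts(uint16_t nb_rx_q, uint16_t nb_tx_q,
			   uint16_t max_rx_q, uint16_t max_tx_q)
{
	if (nb_rx_q == 0 && nb_tx_q == 0)
		return -EINVAL;
	if (nb_rx_q > max_rx_q)
		return -EINVAL;
	if (nb_tx_q > max_tx_q)
		return -EINVAL;
	return 0;
}

enum example_action {
	EX_NOOP,        /* no array yet and zero queues requested */
	EX_ALLOC,       /* first time configuration */
	EX_RESIZE,      /* re-configure an existing array */
	EX_RELEASE_ALL  /* existing array, zero queues requested */
};

/* Which branch of the patched rx/tx queue config helper is taken. */
static enum example_action
example_queue_config_action(int have_array, uint16_t nb_queues)
{
	if (!have_array)
		return nb_queues != 0 ? EX_ALLOC : EX_NOOP;
	return nb_queues != 0 ? EX_RESIZE : EX_RELEASE_ALL;
}
```

The EX_NOOP case is the one the original "first time configuration" branch did not distinguish: before the patch it would have attempted a zero-sized allocation.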

[dpdk-dev] [PATCH v3 0/3] fix RTE_PROC_PRIMARY_OR_ERR_RET RTE_PROC_PRIMARY_OR_RET

2016-01-05 Thread Reshma Pattan
From: reshmapa 

Patches 1 and 2 remove RTE_PROC_PRIMARY_OR_ERR_RET and
RTE_PROC_PRIMARY_OR_RET macro usage from the rte_ether and rte_cryptodev
libraries to allow API access to secondary processes.

Patch 3 allows users to configure ethdev with zero rx/tx queues, but both
should not be zero.
Fix rte_eth_dev_tx_queue_config and rte_eth_dev_rx_queue_config to allocate
memory for rx/tx queues only when the number of rx/tx queues is nonzero.

v3:
* Removed checkpatch fixes of lib/librte_ether/rte_ethdev.h from patch number 1.

Reshma Pattan (3):
  librte_ether: remove RTE_PROC_PRIMARY_OR_ERR_RET and
RTE_PROC_PRIMARY_OR_RET
  librte_cryptodev: remove RTE_PROC_PRIMARY_OR_RET
  librte_ether: fix rte_eth_dev_configure

 lib/librte_cryptodev/rte_cryptodev.c |   42 
 lib/librte_ether/rte_ethdev.c|   86 ++
 2 files changed, 25 insertions(+), 103 deletions(-)

-- 
1.7.4.1
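
The queue-count rule described for patch 3 can be sketched in isolation as
follows. This is only an illustration of the check order, assuming a
hypothetical helper name (demo_check_queue_counts) and caller-supplied maxima;
it is not the rte_eth_dev_configure() code itself.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: accept zero rx queues or zero tx queues,
 * but reject a configuration where both counts are zero.
 */
static int
demo_check_queue_counts(uint16_t nb_rx_q, uint16_t nb_tx_q,
			uint16_t max_rx_q, uint16_t max_tx_q)
{
	/* a port with no queues at all cannot do any I/O */
	if (nb_rx_q == 0 && nb_tx_q == 0)
		return -EINVAL;

	/* the per-direction upper bounds still apply */
	if (nb_rx_q > max_rx_q)
		return -EINVAL;
	if (nb_tx_q > max_tx_q)
		return -EINVAL;

	return 0;
}
```

With these assumptions a tx-only port (0 rx queues, 4 tx queues) passes, while
(0, 0) is rejected with -EINVAL.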



[dpdk-dev] [PATCH v3 2/3] librte_cryptodev: remove RTE_PROC_PRIMARY_OR_RET

2016-01-05 Thread Reshma Pattan
The RTE_PROC_PRIMARY_OR_ERR_RET and RTE_PROC_PRIMARY_OR_RET macros block
secondary processes from using the API.
API access should be given to both primary and secondary processes.

Signed-off-by: Reshma Pattan 
---
 lib/librte_cryptodev/rte_cryptodev.c |   42 --
 1 files changed, 0 insertions(+), 42 deletions(-)

diff --git a/lib/librte_cryptodev/rte_cryptodev.c 
b/lib/librte_cryptodev/rte_cryptodev.c
index f09f67e..207e92c 100644
--- a/lib/librte_cryptodev/rte_cryptodev.c
+++ b/lib/librte_cryptodev/rte_cryptodev.c
@@ -532,12 +532,6 @@ rte_cryptodev_queue_pair_start(uint8_t dev_id, uint16_t 
queue_pair_id)
 {
struct rte_cryptodev *dev;

-   /*
-* This function is only safe when called from the primary process
-* in a multi-process setup
-*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
if (!rte_cryptodev_pmd_is_valid_dev(dev_id)) {
CDEV_LOG_ERR("Invalid dev_id=%" PRIu8, dev_id);
return -EINVAL;
@@ -560,12 +554,6 @@ rte_cryptodev_queue_pair_stop(uint8_t dev_id, uint16_t 
queue_pair_id)
 {
struct rte_cryptodev *dev;

-   /*
-* This function is only safe when called from the primary process
-* in a multi-process setup
-*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
if (!rte_cryptodev_pmd_is_valid_dev(dev_id)) {
CDEV_LOG_ERR("Invalid dev_id=%" PRIu8, dev_id);
return -EINVAL;
@@ -593,12 +581,6 @@ rte_cryptodev_configure(uint8_t dev_id, struct 
rte_cryptodev_config *config)
struct rte_cryptodev *dev;
int diag;

-   /*
-* This function is only safe when called from the primary process
-* in a multi-process setup
-*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
if (!rte_cryptodev_pmd_is_valid_dev(dev_id)) {
CDEV_LOG_ERR("Invalid dev_id=%" PRIu8, dev_id);
return (-EINVAL);
@@ -635,12 +617,6 @@ rte_cryptodev_start(uint8_t dev_id)

CDEV_LOG_DEBUG("Start dev_id=%" PRIu8, dev_id);

-   /*
-* This function is only safe when called from the primary process
-* in a multi-process setup
-*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
if (!rte_cryptodev_pmd_is_valid_dev(dev_id)) {
CDEV_LOG_ERR("Invalid dev_id=%" PRIu8, dev_id);
return (-EINVAL);
@@ -670,12 +646,6 @@ rte_cryptodev_stop(uint8_t dev_id)
 {
struct rte_cryptodev *dev;

-   /*
-* This function is only safe when called from the primary process
-* in a multi-process setup
-*/
-   RTE_PROC_PRIMARY_OR_RET();
-
if (!rte_cryptodev_pmd_is_valid_dev(dev_id)) {
CDEV_LOG_ERR("Invalid dev_id=%" PRIu8, dev_id);
return;
@@ -701,12 +671,6 @@ rte_cryptodev_close(uint8_t dev_id)
struct rte_cryptodev *dev;
int retval;

-   /*
-* This function is only safe when called from the primary process
-* in a multi-process setup
-*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-EINVAL);
-
if (!rte_cryptodev_pmd_is_valid_dev(dev_id)) {
CDEV_LOG_ERR("Invalid dev_id=%" PRIu8, dev_id);
return -1;
@@ -747,12 +711,6 @@ rte_cryptodev_queue_pair_setup(uint8_t dev_id, uint16_t 
queue_pair_id,
 {
struct rte_cryptodev *dev;

-   /*
-* This function is only safe when called from the primary process
-* in a multi-process setup
-*/
-   RTE_PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY);
-
if (!rte_cryptodev_pmd_is_valid_dev(dev_id)) {
CDEV_LOG_ERR("Invalid dev_id=%" PRIu8, dev_id);
return (-EINVAL);
-- 
1.7.4.1



[dpdk-dev] [PATCH 12/12] examples/l3fwd: add option to parse ptype

2016-01-05 Thread Ananyev, Konstantin
Hi Jianfeng,

> > >
> > > +static int
> > > +check_packet_type_ok(int portid)
> > > +{
> > > + int i;
> > > + int ret;
> > > + uint32_t ptypes[RTE_PTYPE_L3_MAX_NUM];
> > > + int ptype_l3_ipv4 = 0, ptype_l3_ipv6 = 0;
> > > +
> > > + ret = rte_eth_dev_get_ptype_info(portid, RTE_PTYPE_L3_MASK,
> > ptypes);
> > > + for (i = 0; i < ret; ++i) {
> > > + if (ptypes[i] & RTE_PTYPE_L3_IPV4)
> > > + ptype_l3_ipv4 = 1;
> > > + if (ptypes[i] & RTE_PTYPE_L3_IPV6)
> > > + ptype_l3_ipv6 = 1;
> > > + }
> > > +
> > > + if (ptype_l3_ipv4 == 0)
> > > + printf("port %d cannot parse RTE_PTYPE_L3_IPV4\n", portid);
> > > +
> > > + if (ptype_l3_ipv6 == 0)
> > > + printf("port %d cannot parse RTE_PTYPE_L3_IPV6\n", portid);
> > > +
> > > + if (ptype_l3_ipv4 || ptype_l3_ipv6)
> > > + return 1;


Forgot one thing: I think it should be:

if (ptype_l3_ipv4 && ptype_l3_ipv6)
  return 1;
return 0;

or just:

return ptype_l3_ipv4 && ptype_l3_ipv6;

Konstantin
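
For illustration, the suggested logic (succeed only when both IPv4 and IPv6
are reported) can be written as a standalone function. The DEMO_* constants
and the function name are placeholders for this sketch and are not the real
RTE_PTYPE_* definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for this sketch; not the real RTE_PTYPE_L3_* macros. */
#define DEMO_PTYPE_L3_IPV4 0x10u
#define DEMO_PTYPE_L3_IPV6 0x40u

/* Return 1 only if the reported list contains both IPv4 and IPv6. */
static int
demo_check_ptypes(const uint32_t *ptypes, size_t n)
{
	int has_ipv4 = 0, has_ipv6 = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (ptypes[i] == DEMO_PTYPE_L3_IPV4)
			has_ipv4 = 1;
		if (ptypes[i] == DEMO_PTYPE_L3_IPV6)
			has_ipv6 = 1;
	}

	/* '&&' rather than '||': one missing type is already a failure */
	return has_ipv4 && has_ipv6;
}
```

With '||' a port that reports only IPv4 would be treated as fully capable;
with '&&' the same port is reported as not OK.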



[dpdk-dev] [PATCH 01/12] ethdev: add API to query what/if packet type is set

2016-01-05 Thread Ananyev, Konstantin
Hi Nélio,

> -Original Message-
> From: Nélio Laranjeiro [mailto:nelio.laranjeiro at 6wind.com]
> Sent: Tuesday, January 05, 2016 4:14 PM
> To: Tan, Jianfeng
> Cc: Adrien Mazarguil; Ananyev, Konstantin; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 01/12] ethdev: add API to query what/if packet 
> type is set
> 
> On Mon, Jan 04, 2016 at 02:36:14PM +, Ananyev, Konstantin wrote:
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Adrien Mazarguil
> > > Sent: Monday, January 04, 2016 11:38 AM
> > > To: Tan, Jianfeng
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 01/12] ethdev: add API to query what/if 
> > > packet type is set
> > >
> > > I'm not sure about the usefulness of this new callback, but one issue I 
> > > see
> > > with rte_eth_dev_get_ptype_info() is that determining the proper size for
> > > ptypes[] according to a mask is awkward. For instance suppose
> > > RTE_PTYPE_L4_MASK is redefined to a different size at some point, the 
> > > caller
> > > must dynamically adjust its ptypes[] array size to avoid a possible
> > > overflow, just in case.
> > >
> > > I suggest one of these solutions:
> > >
> > > - A callback to query for a single type at once instead (easiest method in
> > >   my opinion).
> > >
> > > - An additional argument with the number of entries in ptypes[], in which
> > >   case rte_eth_dev_get_ptype_info() should return the number of entries 
> > > that
> > >   would have been filled regardless, a bit like snprintf().
> >
> > +1 for the second option.
> > Also not sure you really need: RTE_PTYPE_*_MAX_NUM macros.
> > Konstantin
> 
> +1 for the second option.  But see below.
> 
> > >
> > > On Thu, Dec 31, 2015 at 02:53:08PM +0800, Jianfeng Tan wrote:
> > > > Add a new API rte_eth_dev_get_ptype_info to query what/if packet type 
> > > > will
> > > > be set by current rx burst function.
> > > >
> > > > Signed-off-by: Jianfeng Tan 
> > > > ---
> > > >  lib/librte_ether/rte_ethdev.c | 12 
> > > >  lib/librte_ether/rte_ethdev.h | 22 ++
> > > >  lib/librte_mbuf/rte_mbuf.h| 13 +
> > > >  3 files changed, 47 insertions(+)
> > > >
> > > > diff --git a/lib/librte_ether/rte_ethdev.c 
> > > > b/lib/librte_ether/rte_ethdev.c
> > > > index ed971b4..1885374 100644
> > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > @@ -1614,6 +1614,18 @@ rte_eth_dev_info_get(uint8_t port_id, struct 
> > > > rte_eth_dev_info *dev_info)
> > > > dev_info->driver_name = dev->data->drv_name;
> > > >  }
> > > >
> > > > +int
> > > > +rte_eth_dev_get_ptype_info(uint8_t port_id, uint32_t ptype_mask,
> > > > +   uint32_t ptypes[])
> > > > +{
> > > > +   struct rte_eth_dev *dev;
> > > > +
> > > > +   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > +   dev = &rte_eth_devices[port_id];
> > > > +   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_ptype_info_get, 
> > > > -ENOTSUP);
> > > > +   return (*dev->dev_ops->dev_ptype_info_get)(dev, ptype_mask, 
> > > > ptypes);
> > > > +}
> > > > +
> > > >  void
> > > >  rte_eth_macaddr_get(uint8_t port_id, struct ether_addr *mac_addr)
> > > >  {
> > > > diff --git a/lib/librte_ether/rte_ethdev.h 
> > > > b/lib/librte_ether/rte_ethdev.h
> > > > index bada8ad..e97b632 100644
> > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > @@ -1021,6 +1021,10 @@ typedef void (*eth_dev_infos_get_t)(struct 
> > > > rte_eth_dev *dev,
> > > > struct rte_eth_dev_info *dev_info);
> > > >  /**< @internal Get specific informations of an Ethernet device. */
> > > >
> > > > +typedef int (*eth_dev_ptype_info_get_t)(struct rte_eth_dev *dev,
> > > > +   uint32_t ptype_mask, uint32_t ptypes[]);
> > > > +/**< @internal Get ptype info of eth_rx_burst_t. */
> > > > +
> > > >  typedef int (*eth_queue_start_t)(struct rte_eth_dev *dev,
> > > > uint16_t queue_id);
> > > >  /**< @internal Start rx and tx of a queue of an Ethernet device. */
> > > > @@ -1347,6 +1351,7 @@ struct eth_dev_ops {
> > > > eth_queue_stats_mapping_set_t queue_stats_mapping_set;
> > > > /**< Configure per queue stat counter mapping. */
> > > > eth_dev_infos_get_tdev_infos_get; /**< Get device info. 
> > > > */
> > > > +   eth_dev_ptype_info_get_t   dev_ptype_info_get; /** Get ptype 
> > > > info */
> > > > mtu_set_t  mtu_set; /**< Set MTU. */
> > > > vlan_filter_set_t  vlan_filter_set;  /**< Filter VLAN 
> > > > Setup. */
> > > > vlan_tpid_set_tvlan_tpid_set;  /**< Outer VLAN 
> > > > TPID Setup. */
> > > > @@ -2273,6 +2278,23 @@ extern void rte_eth_dev_info_get(uint8_t port_id,
> > > >  struct rte_eth_dev_info *dev_info);
> > > >
> > > >  /**
> > > > + * Retrieve the contextual 
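
The second option discussed in this thread (pass the number of entries in
ptypes[] and return the number of entries that would have been filled, a bit
like snprintf()) might look roughly like the sketch below. The function name,
the table of supported types and its values are assumptions made for this
example; this is not the API that was eventually merged.

```c
#include <stddef.h>
#include <stdint.h>

/* Made-up list of packet types a hypothetical driver can report. */
static const uint32_t demo_supported[] = { 0x10u, 0x40u, 0x100u, 0x200u };

/*
 * Store at most 'num' matching entries in ptypes[] and return the total
 * number of supported types that match 'mask', even if that is larger
 * than 'num'. Passing ptypes == NULL only queries the required size.
 */
static int
demo_get_ptype_info(uint32_t mask, uint32_t *ptypes, int num)
{
	size_t i;
	int matched = 0;

	for (i = 0; i < sizeof(demo_supported) / sizeof(demo_supported[0]); i++) {
		if ((demo_supported[i] & mask) == 0)
			continue;
		if (ptypes != NULL && matched < num)
			ptypes[matched] = demo_supported[i];
		matched++;
	}
	return matched;
}
```

A caller can first call with (NULL, 0) to learn the required size, then
allocate and call again, so the array size no longer depends on any
*_MAX_NUM macro.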

[dpdk-dev] [PATCH] bnx2x: remove unused mbuf_alloc_size

2016-01-05 Thread Harish Patil
>
>The mbuf_alloc_size is leftover from BSD or some other code base.
>It is set but never used in DPDK driver.  After that the related defines
>can also be eliminated.
>
>Signed-off-by: Stephen Hemminger 
>---
> drivers/net/bnx2x/bnx2x.c |  9 -
> drivers/net/bnx2x/bnx2x.h | 18 --
> 2 files changed, 27 deletions(-)
>
>diff --git a/drivers/net/bnx2x/bnx2x.c b/drivers/net/bnx2x/bnx2x.c
>index 67af5da..6ba6f44 100644
>--- a/drivers/net/bnx2x/bnx2x.c
>+++ b/drivers/net/bnx2x/bnx2x.c
>@@ -2331,15 +2331,6 @@ static void bnx2x_set_fp_rx_buf_size(struct
>bnx2x_softc *sc)
>   /* get the Rx buffer size for RX frames */
>   sc->fp[i].rx_buf_size =
>   (IP_HEADER_ALIGNMENT_PADDING + ETH_OVERHEAD + sc->mtu);
>-
>-  /* get the mbuf allocation size for RX frames */
>-  if (sc->fp[i].rx_buf_size <= MCLBYTES) {
>-  sc->fp[i].mbuf_alloc_size = MCLBYTES;
>-  } else if (sc->fp[i].rx_buf_size <= BNX2X_PAGE_SIZE) {
>-  sc->fp[i].mbuf_alloc_size = PAGE_SIZE;
>-  } else {
>-  sc->fp[i].mbuf_alloc_size = MJUM9BYTES;
>-  }
>   }
> }
>
>diff --git a/drivers/net/bnx2x/bnx2x.h b/drivers/net/bnx2x/bnx2x.h
>index 2abab0c..9682b8d 100644
>--- a/drivers/net/bnx2x/bnx2x.h
>+++ b/drivers/net/bnx2x/bnx2x.h
>@@ -151,23 +151,6 @@ struct bnx2x_device_type {
> #define FW_PREFETCH_CNT  16U
> #define DROPLESS_FC_HEADROOM 100
>
>-#ifndef MCLSHIFT
>-#define MCLSHIFT  11
>-#endif
>-#define MCLBYTES  (1 << MCLSHIFT)
>-
>-#if !defined(MJUMPAGESIZE)
>-#if BNX2X_PAGE_SIZE < 2048
>-#define MJUMPAGESIZEMCLBYTES
>-#elif BNX2X_PAGE_SIZE <= 8192
>-#define MJUMPAGESIZEBNX2X_PAGE_SIZE
>-#else
>-#define MJUMPAGESIZE(8 * 1024)
>-#endif
>-#endif
>-#define MJUM9BYTES  (9 * 1024)
>-#define MJUM16BYTES (16 * 1024)
>-
> /*
>  * Transmit Buffer Descriptor (tx_bd) definitions*
>  */
>@@ -402,7 +385,6 @@ struct bnx2x_fastpath {
>   uint8_t fw_sb_id;  /* status block number in FW */
>
>   uint32_t rx_buf_size;
>-  int mbuf_alloc_size;
>
>   int state;
> #define BNX2X_FP_STATE_CLOSED  0x01
>--
>2.1.4
>
>

Acked-by: Harish Patil 


Thanks,
Harish






[dpdk-dev] [PATCH 1/2] mlx4: add callback to set primary mac address

2016-01-05 Thread David Marchand
Signed-off-by: David Marchand 
---
 drivers/net/mlx4/mlx4.c |   17 +
 1 file changed, 17 insertions(+)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 207bfe2..acc76d7 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -4432,6 +4432,22 @@ end:
 }

 /**
+ * DPDK callback to set the primary MAC address.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param mac_addr
+ *   MAC address to register.
+ */
+static void
+mlx4_mac_addr_set(struct rte_eth_dev *dev, struct ether_addr *mac_addr)
+{
+   DEBUG("%p: setting primary MAC address", (void *)dev);
+   mlx4_mac_addr_remove(dev, 0);
+   mlx4_mac_addr_add(dev, mac_addr, 0, 0);
+}
+
+/**
  * DPDK callback to enable promiscuous mode.
  *
  * @param dev
@@ -5004,6 +5020,7 @@ static const struct eth_dev_ops mlx4_dev_ops = {
.priority_flow_ctrl_set = NULL,
.mac_addr_remove = mlx4_mac_addr_remove,
.mac_addr_add = mlx4_mac_addr_add,
+   .mac_addr_set = mlx4_mac_addr_set,
.mtu_set = mlx4_dev_set_mtu,
.udp_tunnel_add = NULL,
.udp_tunnel_del = NULL,
-- 
1.7.10.4



[dpdk-dev] [PATCH 2/2] mlx5: add callback to set primary mac address

2016-01-05 Thread David Marchand
Signed-off-by: David Marchand 
---
 drivers/net/mlx5/mlx5.c |1 +
 drivers/net/mlx5/mlx5.h |1 +
 drivers/net/mlx5/mlx5_mac.c |   16 
 3 files changed, 18 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 821ee0f..30d88b5 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -162,6 +162,7 @@ static const struct eth_dev_ops mlx5_dev_ops = {
.flow_ctrl_set = mlx5_dev_set_flow_ctrl,
.mac_addr_remove = mlx5_mac_addr_remove,
.mac_addr_add = mlx5_mac_addr_add,
+   .mac_addr_set = mlx5_mac_addr_set,
.mtu_set = mlx5_dev_set_mtu,
.reta_update = mlx5_dev_rss_reta_update,
.reta_query = mlx5_dev_rss_reta_query,
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index b84d31d..2f9a594 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -179,6 +179,7 @@ int priv_mac_addr_add(struct priv *, unsigned int,
 int priv_mac_addrs_enable(struct priv *);
 void mlx5_mac_addr_add(struct rte_eth_dev *, struct ether_addr *, uint32_t,
   uint32_t);
+void mlx5_mac_addr_set(struct rte_eth_dev *, struct ether_addr *);

 /* mlx5_rss.c */

diff --git a/drivers/net/mlx5/mlx5_mac.c b/drivers/net/mlx5/mlx5_mac.c
index e37ce06..b1f34d9 100644
--- a/drivers/net/mlx5/mlx5_mac.c
+++ b/drivers/net/mlx5/mlx5_mac.c
@@ -488,3 +488,19 @@ mlx5_mac_addr_add(struct rte_eth_dev *dev, struct 
ether_addr *mac_addr,
 end:
priv_unlock(priv);
 }
+
+/**
+ * DPDK callback to set primary MAC address.
+ *
+ * @param dev
+ *   Pointer to Ethernet device structure.
+ * @param mac_addr
+ *   MAC address to register.
+ */
+void
+mlx5_mac_addr_set(struct rte_eth_dev *dev, struct ether_addr *mac_addr)
+{
+   DEBUG("%p: setting primary MAC address", (void *)dev);
+   mlx5_mac_addr_remove(dev, 0);
+   mlx5_mac_addr_add(dev, mac_addr, 0, 0);
+}
-- 
1.7.10.4



[dpdk-dev] time to kill rte_pci_dev_ids.h

2016-01-05 Thread Stephen Hemminger
Has anyone looked at getting rid of rte_pci_dev_ids.h?
The current method with #ifdef's and putting all devices in one file
really doesn't scale well. Something more like other OS's where
the data is only in each device driver would be better.
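
As a rough sketch of the per-driver approach, each driver could carry its own
ID table and match against it locally. The struct, the table contents and the
function name are hypothetical and the vendor/device values are placeholders;
this is not the existing DPDK PCI API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ID entry, defined here only for the sketch. */
struct demo_pci_id {
	uint16_t vendor_id;
	uint16_t device_id;
};

/* Table private to one driver; placeholder IDs, zero vendor terminates. */
static const struct demo_pci_id demo_drv_ids[] = {
	{ 0xabcd, 0x0001 },
	{ 0xabcd, 0x0002 },
	{ 0, 0 }
};

/* Return 1 if this driver claims the given device, 0 otherwise. */
static int
demo_drv_matches(uint16_t vendor, uint16_t device)
{
	const struct demo_pci_id *id;

	for (id = demo_drv_ids; id->vendor_id != 0; id++) {
		if (id->vendor_id == vendor && id->device_id == device)
			return 1;
	}
	return 0;
}
```

Adding a device then only touches the driver that supports it, with no shared
header or #ifdef block to update.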


[dpdk-dev] [PATCH] vhost: fix leak of fds and mmaps

2016-01-05 Thread Rich Lane
The common vhost code only supported a single mmap per device. vhost-user
worked around this by saving the address/length/fd of each mmap after the end
of the rte_virtio_memory struct. This only works if the vhost-user code frees
dev->mem, since the common code is unaware of the extra info. The
VHOST_USER_RESET_OWNER message is one situation where the common code frees
dev->mem and leaks the fds and mappings. This happens every time I shut down a
VM.

The new code does not keep the fds around since they aren't required for
munmap. It saves the address/length in a new structure which is read by the
common code.

The vhost-cuse changes are only compile tested.

Signed-off-by: Rich Lane 
---
 lib/librte_vhost/rte_virtio_net.h | 14 +++--
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 24 ---
 lib/librte_vhost/vhost_user/virtio-net-user.c | 90 ---
 lib/librte_vhost/virtio-net.c | 24 ++-
 lib/librte_vhost/virtio-net.h |  3 +
 5 files changed, 75 insertions(+), 80 deletions(-)

diff --git a/lib/librte_vhost/rte_virtio_net.h 
b/lib/librte_vhost/rte_virtio_net.h
index 10dcb90..5233879 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -144,16 +144,22 @@ struct virtio_memory_regions {
uint64_taddress_offset; /**< Offset of region for 
address translation. */
 };

+/**
+ * Record a memory mapping so that it can be munmap'd later.
+ */
+struct virtio_memory_mapping {
+   void *addr;
+   size_t length;
+};

 /**
  * Memory structure includes region and mapping information.
  */
 struct virtio_memory {
-   uint64_tbase_address;   /**< Base QEMU userspace address of the 
memory file. */
-   uint64_tmapped_address; /**< Mapped address of memory file base 
in our applications memory space. */
-   uint64_tmapped_size;/**< Total size of memory file. */
uint32_tnregions;   /**< Number of memory regions. */
-   struct virtio_memory_regions  regions[0]; /**< Memory region 
information. */
+   uint32_tnmappings;  /**< Number of memory mappings */
+   struct virtio_memory_regionsregions[VHOST_MEMORY_MAX_NREGIONS]; 
/**< Memory region information. */
+   struct virtio_memory_mappingmappings[VHOST_MEMORY_MAX_NREGIONS]; 
/**< Memory mappings */
 };

 /**
diff --git a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c 
b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
index ae2c3fa..1cd0c52 100644
--- a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
+++ b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
@@ -278,15 +278,20 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
if (dev == NULL)
return -1;

-   if (dev->mem && dev->mem->mapped_address) {
-   munmap((void *)(uintptr_t)dev->mem->mapped_address,
-   (size_t)dev->mem->mapped_size);
-   free(dev->mem);
+   if (nregions > VHOST_MEMORY_MAX_NREGIONS) {
+   RTE_LOG(ERR, VHOST_CONFIG,
+   "(%"PRIu64") Too many memory regions (%u, max %u)\n",
+   dev->device_fh, nregions,
+   VHOST_MEMORY_MAX_NREGIONS);
+   return -1;
+   }
+
+   if (dev->mem) {
+   rte_vhost_free_mem(dev->mem);
dev->mem = NULL;
}

-   dev->mem = calloc(1, sizeof(struct virtio_memory) +
-   sizeof(struct virtio_memory_regions) * nregions);
+   dev->mem = calloc(1, sizeof(*dev->mem));
if (dev->mem == NULL) {
RTE_LOG(ERR, VHOST_CONFIG,
"(%"PRIu64") Failed to allocate memory for dev->mem\n",
@@ -325,9 +330,10 @@ cuse_set_mem_table(struct vhost_device_ctx ctx,
dev->mem = NULL;
return -1;
}
-   dev->mem->mapped_address = mapped_address;
-   dev->mem->base_address = base_address;
-   dev->mem->mapped_size = mapped_size;
+
+   rte_vhost_add_mapping(dev->mem,
+   (void *)(uintptr_t)mapped_address,
+   mapped_size);
}
}

diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c 
b/lib/librte_vhost/vhost_user/virtio-net-user.c
index 2934d1c..492927a 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -48,18 +48,6 @@
 #include "vhost-net-user.h"
 #include "vhost-net.h"

-struct orig_region_map {
-   int fd;
-   uint64_t mapped_address;
-   uint64_t mapped_size;
-   uint64_t blksz;
-};
-
-#define orig_region(ptr, nregions) \
-   ((struct orig_region_map *)RTE_PTR_ADD((ptr), \
-   sizeof(struct virtio_memory) + \
-   sizeof(struct virtio_memory_regions) * (nregions)))
-
 static uint6

[dpdk-dev] [PATCH v2 1/4] vmxnet3: restore tx data ring support

2016-01-05 Thread Yong Wang
On 1/4/16, 9:16 PM, "Stephen Hemminger"  wrote:
>On Mon,  4 Jan 2016 18:28:16 -0800
>Yong Wang  wrote:
>
>> Tx data ring support was removed in a previous change
>> to add multi-seg transmit.  This change adds it back.
>> 
>> Fixes: 7ba5de417e3c ("vmxnet3: support multi-segment transmit")
>> 
>> Signed-off-by: Yong Wang 
>
>Do you have any numbers to confirm this?


[dpdk-dev] [PATCH v2 3/4] vmxnet3: add TSO support

2016-01-05 Thread Yong Wang
On 1/4/16, 9:14 PM, "Stephen Hemminger"  wrote:



>On Mon,  4 Jan 2016 18:28:18 -0800
>Yong Wang  wrote:
>
>> +mbuf = txq->cmd_ring.buf_info[eop_idx].m;
>> +if (unlikely(mbuf == NULL))
>> +rte_panic("EOP desc does not point to a valid mbuf");
>> +else
>
>The unlikely is really not needed with rte_panic since it is declared
>with cold attribute which has same effect.
>
>Else is unnecessary because rte_panic never returns.

Done.
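
A small standalone sketch of the point made in the review: when the callee is
declared cold and noreturn (GCC/Clang attributes), the caller needs neither
unlikely() nor an else branch. demo_panic() is a stand-in written for this
example, not rte_panic() itself.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for a panic routine: rarely called and never returns. */
static void demo_panic(const char *msg) __attribute__((cold, noreturn));

static void
demo_panic(const char *msg)
{
	fprintf(stderr, "PANIC: %s\n", msg);
	abort();
}

struct demo_buf {
	int len;
};

static int
demo_buf_len(const struct demo_buf *b)
{
	/* no unlikely(): the cold callee already marks this path as rare */
	if (b == NULL)
		demo_panic("buffer pointer is NULL");

	/* no else: demo_panic() does not return */
	return b->len;
}
```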


[dpdk-dev] [PATCH v2 3/4] vmxnet3: add TSO support

2016-01-05 Thread Yong Wang
On 1/4/16, 9:15 PM, "Stephen Hemminger"  wrote:



>On Mon,  4 Jan 2016 18:28:18 -0800
>Yong Wang  wrote:
>
>> +/* The number of descriptors that are needed for a packet. */
>> +static unsigned
>> +txd_estimate(const struct rte_mbuf *m)
>> +{
>> +return m->nb_segs;
>> +}
>> +
>
>A wrapper function only really clarifies if it is hiding some information.
>Why not just code this in place?

Sure and removed.


[dpdk-dev] [PATCH v3 4/4] vmxnet3: announce device offload capability

2016-01-05 Thread Yong Wang
Signed-off-by: Yong Wang 
---
 drivers/net/vmxnet3/vmxnet3_ethdev.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c 
b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index c363bf6..8a40127 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -693,7 +693,8 @@ vmxnet3_dev_stats_get(struct rte_eth_dev *dev, struct 
rte_eth_stats *stats)
 }

 static void
-vmxnet3_dev_info_get(__attribute__((unused))struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
+vmxnet3_dev_info_get(__attribute__((unused))struct rte_eth_dev *dev,
+struct rte_eth_dev_info *dev_info)
 {
dev_info->max_rx_queues = VMXNET3_MAX_RX_QUEUES;
dev_info->max_tx_queues = VMXNET3_MAX_TX_QUEUES;
@@ -716,6 +717,17 @@ vmxnet3_dev_info_get(__attribute__((unused))struct 
rte_eth_dev *dev, struct rte_
.nb_min = VMXNET3_DEF_TX_RING_SIZE,
.nb_align = 1,
};
+
+   dev_info->rx_offload_capa =
+   DEV_RX_OFFLOAD_VLAN_STRIP |
+   DEV_RX_OFFLOAD_UDP_CKSUM |
+   DEV_RX_OFFLOAD_TCP_CKSUM;
+
+   dev_info->tx_offload_capa =
+   DEV_TX_OFFLOAD_VLAN_INSERT |
+   DEV_TX_OFFLOAD_TCP_CKSUM |
+   DEV_TX_OFFLOAD_UDP_CKSUM |
+   DEV_TX_OFFLOAD_TCP_TSO;
 }

 /* return 0 means link status changed, -1 means not changed */
@@ -819,7 +831,7 @@ vmxnet3_dev_vlan_filter_set(struct rte_eth_dev *dev, 
uint16_t vid, int on)
else
VMXNET3_CLEAR_VFTABLE_ENTRY(hw->shadow_vfta, vid);

-   /* don't change active filter if in promiscious mode */
+   /* don't change active filter if in promiscuous mode */
if (rxConf->rxMode & VMXNET3_RXM_PROMISC)
return 0;

-- 
1.9.1



[dpdk-dev] [PATCH v3 2/4] vmxnet3: add tx l4 cksum offload

2016-01-05 Thread Yong Wang
Support TCP/UDP checksum offload.

Signed-off-by: Yong Wang 
---
 doc/guides/rel_notes/release_2_3.rst |  3 +++
 drivers/net/vmxnet3/vmxnet3_rxtx.c   | 39 +++-
 2 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst 
b/doc/guides/rel_notes/release_2_3.rst
index a23c8ac..58205fe 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -20,6 +20,9 @@ Drivers
   Tx data ring has been shown to improve small pkt forwarding performance
   on vSphere environment.

+* **vmxnet3: add tx l4 cksum offload.**
+
+  Support TCP/UDP checksum offload.

 Libraries
 ~
diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c 
b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index 2202d31..08e6115 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -332,6 +332,8 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_tx;
vmxnet3_tx_queue_t *txq = tx_queue;
struct vmxnet3_hw *hw = txq->hw;
+   Vmxnet3_TxQueueCtrl *txq_ctrl = &txq->shared->ctrl;
+   uint32_t deferred = rte_le_to_cpu_32(txq_ctrl->txNumDeferred);

if (unlikely(txq->stopped)) {
PMD_TX_LOG(DEBUG, "Tx queue is stopped.");
@@ -413,21 +415,40 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf 
**tx_pkts,
gdesc->txd.tci = txm->vlan_tci;
}

-   /* TODO: Add transmit checksum offload here */
+   if (txm->ol_flags & PKT_TX_L4_MASK) {
+   gdesc->txd.om = VMXNET3_OM_CSUM;
+   gdesc->txd.hlen = txm->l2_len + txm->l3_len;
+
+   switch (txm->ol_flags & PKT_TX_L4_MASK) {
+   case PKT_TX_TCP_CKSUM:
+   gdesc->txd.msscof = gdesc->txd.hlen + 
offsetof(struct tcp_hdr, cksum);
+   break;
+   case PKT_TX_UDP_CKSUM:
+   gdesc->txd.msscof = gdesc->txd.hlen + 
offsetof(struct udp_hdr, dgram_cksum);
+   break;
+   default:
+   PMD_TX_LOG(WARNING, "requested cksum offload 
not supported %#llx",
+  txm->ol_flags & PKT_TX_L4_MASK);
+   abort();
+   }
+   } else {
+   gdesc->txd.hlen = 0;
+   gdesc->txd.om = VMXNET3_OM_NONE;
+   gdesc->txd.msscof = 0;
+   }
+
+   txq_ctrl->txNumDeferred = rte_cpu_to_le_32(++deferred);

/* flip the GEN bit on the SOP */
rte_compiler_barrier();
gdesc->dword[2] ^= VMXNET3_TXD_GEN;
-
-   txq->shared->ctrl.txNumDeferred++;
nb_tx++;
}

-   PMD_TX_LOG(DEBUG, "vmxnet3 txThreshold: %u", 
txq->shared->ctrl.txThreshold);
-
-   if (txq->shared->ctrl.txNumDeferred >= txq->shared->ctrl.txThreshold) {
+   PMD_TX_LOG(DEBUG, "vmxnet3 txThreshold: %u", 
rte_le_to_cpu_32(txq_ctrl->txThreshold));

-   txq->shared->ctrl.txNumDeferred = 0;
+   if (deferred >= rte_le_to_cpu_32(txq_ctrl->txThreshold)) {
+   txq_ctrl->txNumDeferred = 0;
/* Notify vSwitch that packets are available. */
VMXNET3_WRITE_BAR0_REG(hw, (VMXNET3_REG_TXPROD + txq->queue_id 
* VMXNET3_REG_ALIGN),
   txq->cmd_ring.next2fill);
@@ -728,8 +749,8 @@ vmxnet3_dev_tx_queue_setup(struct rte_eth_dev *dev,
PMD_INIT_FUNC_TRACE();

if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOXSUMS) !=
-   ETH_TXQ_FLAGS_NOXSUMS) {
-   PMD_INIT_LOG(ERR, "TX no support for checksum offload yet");
+   ETH_TXQ_FLAGS_NOXSUMSCTP) {
+   PMD_INIT_LOG(ERR, "SCTP checksum offload not supported");
return -EINVAL;
}

-- 
1.9.1
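
The checksum-offset arithmetic in this patch (L2 + L3 header length plus the
offset of the checksum field inside the L4 header) can be checked with a
standalone sketch. The demo_* structs are local copies of the usual TCP and
UDP header layouts written for this example; they are not the DPDK
definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of a TCP header layout, for illustration only. */
struct demo_tcp_hdr {
	uint16_t src_port;
	uint16_t dst_port;
	uint32_t sent_seq;
	uint32_t recv_ack;
	uint8_t  data_off;
	uint8_t  tcp_flags;
	uint16_t rx_win;
	uint16_t cksum;
	uint16_t tcp_urp;
};

/* Local mirror of a UDP header layout, for illustration only. */
struct demo_udp_hdr {
	uint16_t src_port;
	uint16_t dst_port;
	uint16_t dgram_len;
	uint16_t dgram_cksum;
};

enum demo_l4 { DEMO_L4_TCP, DEMO_L4_UDP };

/* Offset of the L4 checksum field measured from the start of the frame. */
static uint16_t
demo_cksum_offset(uint16_t l2_len, uint16_t l3_len, enum demo_l4 l4)
{
	uint16_t hlen = (uint16_t)(l2_len + l3_len);

	if (l4 == DEMO_L4_TCP)
		return (uint16_t)(hlen + offsetof(struct demo_tcp_hdr, cksum));
	return (uint16_t)(hlen + offsetof(struct demo_udp_hdr, dgram_cksum));
}
```

For an untagged Ethernet frame (14 bytes) carrying IPv4 without options
(20 bytes), this gives 14 + 20 + 16 = 50 for TCP and 14 + 20 + 6 = 40 for UDP.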



[dpdk-dev] [PATCH v3 0/4] vmxnet3 TSO and tx cksum offload

2016-01-05 Thread Yong Wang
v3:
* fixed comments from Stephen
* added performance number for tx data ring

v2:
* fixed some logging issues when debug option turned on
* updated the txq_flags check in vmxnet3_dev_tx_queue_setup()

This patchset adds TCP/UDP checksum offload and TSO to vmxnet3 PMD.
One of the use cases for these features is to support STT.  It also
restores the tx data ring feature that was removed by a previous
patch.

Yong Wang (4):
  vmxnet3: restore tx data ring support
  vmxnet3: add tx l4 cksum offload
  vmxnet3: add TSO support
  vmxnet3: announce device offload capability

 doc/guides/rel_notes/release_2_3.rst |  11 +++
 drivers/net/vmxnet3/vmxnet3_ethdev.c |  16 +++-
 drivers/net/vmxnet3/vmxnet3_ring.h   |  13 ---
 drivers/net/vmxnet3/vmxnet3_rxtx.c   | 162 +++
 4 files changed, 151 insertions(+), 51 deletions(-)

-- 
1.9.1



[dpdk-dev] [PATCH v3 3/4] vmxnet3: add TSO support

2016-01-05 Thread Yong Wang
This commit adds vmxnet3 TSO support.

Verified with test-pmd (set fwd csum) that both TSO and
non-TSO pkts can be successfully transmitted and all
segments for a TSO pkt are correct on the receiver side.

Signed-off-by: Yong Wang 
---
 doc/guides/rel_notes/release_2_3.rst |   3 +
 drivers/net/vmxnet3/vmxnet3_ring.h   |  13 -
 drivers/net/vmxnet3/vmxnet3_rxtx.c   | 110 ++-
 3 files changed, 85 insertions(+), 41 deletions(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst 
b/doc/guides/rel_notes/release_2_3.rst
index 58205fe..ae487bb 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -24,6 +24,9 @@ Drivers

   Support TCP/UDP checksum offload.

+* **vmxnet3: add TSO support.**
+
+
 Libraries
 ~

diff --git a/drivers/net/vmxnet3/vmxnet3_ring.h 
b/drivers/net/vmxnet3/vmxnet3_ring.h
index 612487e..15b19e1 100644
--- a/drivers/net/vmxnet3/vmxnet3_ring.h
+++ b/drivers/net/vmxnet3/vmxnet3_ring.h
@@ -130,18 +130,6 @@ struct vmxnet3_txq_stats {
uint64_ttx_ring_full;
 };

-typedef struct vmxnet3_tx_ctx {
-   int  ip_type;
-   bool is_vlan;
-   bool is_cso;
-
-   uint16_t evl_tag;   /* only valid when is_vlan == TRUE */
-   uint32_t eth_hdr_size;  /* only valid for pkts requesting tso or csum
-* offloading */
-   uint32_t ip_hdr_size;
-   uint32_t l4_hdr_size;
-} vmxnet3_tx_ctx_t;
-
 typedef struct vmxnet3_tx_queue {
struct vmxnet3_hw*hw;
struct vmxnet3_cmd_ring  cmd_ring;
@@ -155,7 +143,6 @@ typedef struct vmxnet3_tx_queue {
uint8_t  port_id;   /**< Device port 
identifier. */
 } vmxnet3_tx_queue_t;

-
 struct vmxnet3_rxq_stats {
uint64_t drop_total;
uint64_t drop_err;
diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c 
b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index 08e6115..fc879ee 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -295,27 +295,45 @@ vmxnet3_dev_clear_queues(struct rte_eth_dev *dev)
}
 }

+static int
+vmxnet3_unmap_pkt(uint16_t eop_idx, vmxnet3_tx_queue_t *txq)
+{
+   int completed = 0;
+   struct rte_mbuf *mbuf;
+
+   /* Release cmd_ring descriptor and free mbuf */
+   VMXNET3_ASSERT(txq->cmd_ring.base[eop_idx].txd.eop == 1);
+
+   mbuf = txq->cmd_ring.buf_info[eop_idx].m;
+   if (mbuf == NULL)
+   rte_panic("EOP desc does not point to a valid mbuf");
+   rte_pktmbuf_free(mbuf);
+
+   txq->cmd_ring.buf_info[eop_idx].m = NULL;
+
+   while (txq->cmd_ring.next2comp != eop_idx) {
+   /* no out-of-order completion */
+   
VMXNET3_ASSERT(txq->cmd_ring.base[txq->cmd_ring.next2comp].txd.cq == 0);
+   vmxnet3_cmd_ring_adv_next2comp(&txq->cmd_ring);
+   completed++;
+   }
+
+   /* Mark the txd for which tcd was generated as completed */
+   vmxnet3_cmd_ring_adv_next2comp(&txq->cmd_ring);
+
+   return completed + 1;
+}
+
 static void
 vmxnet3_tq_tx_complete(vmxnet3_tx_queue_t *txq)
 {
int completed = 0;
-   struct rte_mbuf *mbuf;
vmxnet3_comp_ring_t *comp_ring = &txq->comp_ring;
struct Vmxnet3_TxCompDesc *tcd = (struct Vmxnet3_TxCompDesc *)
(comp_ring->base + comp_ring->next2proc);

while (tcd->gen == comp_ring->gen) {
-   /* Release cmd_ring descriptor and free mbuf */
-   VMXNET3_ASSERT(txq->cmd_ring.base[tcd->txdIdx].txd.eop == 1);
-   while (txq->cmd_ring.next2comp != tcd->txdIdx) {
-   mbuf = 
txq->cmd_ring.buf_info[txq->cmd_ring.next2comp].m;
-   txq->cmd_ring.buf_info[txq->cmd_ring.next2comp].m = 
NULL;
-   rte_pktmbuf_free_seg(mbuf);
-
-   /* Mark the txd for which tcd was generated as 
completed */
-   vmxnet3_cmd_ring_adv_next2comp(&txq->cmd_ring);
-   completed++;
-   }
+   completed += vmxnet3_unmap_pkt(tcd->txdIdx, txq);

vmxnet3_comp_ring_adv_next2proc(comp_ring);
tcd = (struct Vmxnet3_TxCompDesc *)(comp_ring->base +
@@ -351,21 +369,43 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf 
**tx_pkts,
struct rte_mbuf *txm = tx_pkts[nb_tx];
struct rte_mbuf *m_seg = txm;
int copy_size = 0;
+   bool tso = (txm->ol_flags & PKT_TX_TCP_SEG) != 0;
+   /* # of descriptors needed for a packet. */
+   unsigned count = txm->nb_segs;

-   /* Is this packet execessively fragmented, then drop */
-   if (unlikely(txm->nb_segs > VMXNET3_MAX_TXD_PER_PKT)) {
-   ++txq->stats.drop_too_many_segs;
-   ++txq->stats.drop_total;
+

[dpdk-dev] [PATCH v3 1/4] vmxnet3: restore tx data ring support

2016-01-05 Thread Yong Wang
Tx data ring support was removed in a previous change
to add multi-seg transmit.  This change adds it back.

According to the original commit (2e849373), the 64B pkt
rate with l2fwd improved by ~20% on an Ivy Bridge
server, at which point we start to hit a bottleneck
on the rx side.

I also re-did the same test on a different setup (Haswell
processor, ~2.3GHz clock rate) on top of the master
and still observed ~17% performance gains.

Fixes: 7ba5de417e3c ("vmxnet3: support multi-segment transmit")

Signed-off-by: Yong Wang 
---
 doc/guides/rel_notes/release_2_3.rst |  5 +
 drivers/net/vmxnet3/vmxnet3_rxtx.c   | 17 -
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_2_3.rst 
b/doc/guides/rel_notes/release_2_3.rst
index 99de186..a23c8ac 100644
--- a/doc/guides/rel_notes/release_2_3.rst
+++ b/doc/guides/rel_notes/release_2_3.rst
@@ -15,6 +15,11 @@ EAL
 Drivers
 ~~~

+* **vmxnet3: restore tx data ring.**
+
+  Tx data ring has been shown to improve small pkt forwarding performance
+  on vSphere environment.
+

 Libraries
 ~
diff --git a/drivers/net/vmxnet3/vmxnet3_rxtx.c b/drivers/net/vmxnet3/vmxnet3_rxtx.c
index 4de5d89..2202d31 100644
--- a/drivers/net/vmxnet3/vmxnet3_rxtx.c
+++ b/drivers/net/vmxnet3/vmxnet3_rxtx.c
@@ -348,6 +348,7 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint32_t first2fill, avail, dw2;
struct rte_mbuf *txm = tx_pkts[nb_tx];
struct rte_mbuf *m_seg = txm;
+   int copy_size = 0;

/* Is this packet execessively fragmented, then drop */
if (unlikely(txm->nb_segs > VMXNET3_MAX_TXD_PER_PKT)) {
@@ -365,6 +366,14 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
break;
}

+   if (rte_pktmbuf_pkt_len(txm) <= VMXNET3_HDR_COPY_SIZE) {
+   struct Vmxnet3_TxDataDesc *tdd;
+
+   tdd = txq->data_ring.base + txq->cmd_ring.next2fill;
+   copy_size = rte_pktmbuf_pkt_len(txm);
+   rte_memcpy(tdd->data, rte_pktmbuf_mtod(txm, char *), copy_size);
+   }
+
/* use the previous gen bit for the SOP desc */
dw2 = (txq->cmd_ring.gen ^ 0x1) << VMXNET3_TXD_GEN_SHIFT;
first2fill = txq->cmd_ring.next2fill;
@@ -377,7 +386,13 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
   transmit buffer size (16K) is greater than
   maximum sizeof mbuf segment size. */
gdesc = txq->cmd_ring.base + txq->cmd_ring.next2fill;
-   gdesc->txd.addr = RTE_MBUF_DATA_DMA_ADDR(m_seg);
+   if (copy_size)
+   gdesc->txd.addr = rte_cpu_to_le_64(txq->data_ring.basePA +
+       txq->cmd_ring.next2fill *
+       sizeof(struct Vmxnet3_TxDataDesc));
+   else
+   gdesc->txd.addr = RTE_MBUF_DATA_DMA_ADDR(m_seg);
+
gdesc->dword[2] = dw2 | m_seg->data_len;
gdesc->dword[3] = 0;

-- 
1.9.1



[dpdk-dev] [PATCH v3 1/4] vmxnet3: restore tx data ring support

2016-01-05 Thread Stephen Hemminger
On Tue,  5 Jan 2016 16:12:55 -0800
Yong Wang  wrote:

> @@ -365,6 +366,14 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
>   break;
>   }
>  
> + if (rte_pktmbuf_pkt_len(txm) <= VMXNET3_HDR_COPY_SIZE) {
> + struct Vmxnet3_TxDataDesc *tdd;
> +
> + tdd = txq->data_ring.base + txq->cmd_ring.next2fill;
> + copy_size = rte_pktmbuf_pkt_len(txm);
> + rte_memcpy(tdd->data, rte_pktmbuf_mtod(txm, char *), copy_size);
> + }

Good idea to use a local region, which optimizes the copy in the host,
but this implementation needs to be more general.

As written it is broken for multi-segment packets. A multi-segment
packet will have a pktlen >= datalen as in:
  m -> nb_segs=3, pktlen=1200, datalen=200
    -> datalen=900
    -> datalen=100

There are two ways to fix this. You could test for nb_segs == 1
or, better yet, optimize each segment: it might be that the first
segment (or tail segment) would fit in the available data area.
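
The two options can be sketched in isolation, without the driver. This is
only an illustration under stated assumptions: struct mock_mbuf is a
hypothetical stand-in for the rte_mbuf fields named above (pkt_len,
data_len, nb_segs), MOCK_COPY_LIMIT stands in for VMXNET3_HDR_COPY_SIZE
with an assumed value, and the helper names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the rte_mbuf fields discussed in this thread. */
struct mock_mbuf {
	uint32_t pkt_len;        /* total bytes across all segments */
	uint16_t data_len;       /* bytes held in this segment only */
	uint16_t nb_segs;        /* segment count, valid in the first segment */
	struct mock_mbuf *next;  /* next segment, or NULL */
};

/* Assumed limit for illustration; stands in for VMXNET3_HDR_COPY_SIZE. */
#define MOCK_COPY_LIMIT 128

/* The check as posted: looks at the total packet length only. */
static bool posted_check(const struct mock_mbuf *m)
{
	assert(m != NULL);
	return m->pkt_len <= MOCK_COPY_LIMIT;
}

/* Option 1: copy the whole packet only when it is a single segment. */
static bool pkt_fits_data_ring(const struct mock_mbuf *m)
{
	assert(m != NULL);
	return m->nb_segs == 1 && m->pkt_len <= MOCK_COPY_LIMIT;
}

/* Option 2: decide per segment, based on that segment's own length. */
static bool seg_fits_data_ring(const struct mock_mbuf *seg)
{
	assert(seg != NULL);
	return seg->data_len <= MOCK_COPY_LIMIT;
}
```

With a three-segment packet of 40 + 40 + 40 bytes, posted_check() accepts
the packet and the copy would read pkt_len (120) bytes from a first segment
that holds only 40. pkt_fits_data_ring() rejects it, while
seg_fits_data_ring() accepts each segment individually. For the
200/900/100 example above, only the 100-byte tail segment passes the
per-segment test.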


[dpdk-dev] [PATCH v3 2/4] vmxnet3: add tx l4 cksum offload

2016-01-05 Thread Stephen Hemminger
On Tue,  5 Jan 2016 16:12:56 -0800
Yong Wang  wrote:

> - if (txq->shared->ctrl.txNumDeferred >= txq->shared->ctrl.txThreshold) {
> + PMD_TX_LOG(DEBUG, "vmxnet3 txThreshold: %u", rte_le_to_cpu_32(txq_ctrl->txThreshold));

For bisection, it would be good to split the byte-order fixes from the
offload changes; in other words make them different commits.


[dpdk-dev] [PATCH v3 4/4] vmxnet3: announce device offload capability

2016-01-05 Thread Stephen Hemminger
On Tue,  5 Jan 2016 16:12:58 -0800
Yong Wang  wrote:

>  
>  /* return 0 means link status changed, -1 means not changed */
> @@ -819,7 +831,7 @@ vmxnet3_dev_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vid, int on)
>   else
>   VMXNET3_CLEAR_VFTABLE_ENTRY(hw->shadow_vfta, vid);
>  
> - /* don't change active filter if in promiscious mode */
> + /* don't change active filter if in promiscuous mode */

Maybe send these message and comment cleanups as the first patch in the series?

That makes the review easier and aids bisection.