[dpdk-dev] [PATCH v4 3/6] ixgbe: Get VF queue number

2015-01-07 Thread Ouyang, Changchun


> -----Original Message-----
> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> Sent: Tuesday, January 6, 2015 7:27 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 3/6] ixgbe: Get VF queue number
> 
> 
> On 01/06/15 03:54, Ouyang, Changchun wrote:
> >
> >> -----Original Message-----
> >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> >> Sent: Monday, January 5, 2015 6:07 PM
> >> To: Ouyang, Changchun; dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH v4 3/6] ixgbe: Get VF queue number
> >>
> >>
> >> On 01/05/15 04:59, Ouyang, Changchun wrote:
>  -----Original Message-----
>  From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
>  Sent: Sunday, January 4, 2015 4:39 PM
>  To: Ouyang, Changchun; dev at dpdk.org
>  Subject: Re: [dpdk-dev] [PATCH v4 3/6] ixgbe: Get VF queue number
> 
> 
>  On 01/04/15 09:18, Ouyang Changchun wrote:
> > Get the available Rx and Tx queue number when receiving
>  IXGBE_VF_GET_QUEUES message from VF.
> > Signed-off-by: Changchun Ouyang 
> > ---
> > lib/librte_pmd_ixgbe/ixgbe_pf.c | 35
>  ++-
> > 1 file changed, 34 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c
> > b/lib/librte_pmd_ixgbe/ixgbe_pf.c index 495aff5..cbb0145 100644
> > --- a/lib/librte_pmd_ixgbe/ixgbe_pf.c
> > +++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c
> > @@ -53,6 +53,8 @@
> > #include "ixgbe_ethdev.h"
> >
> > #define IXGBE_MAX_VFTA (128)
> > +#define IXGBE_VF_MSG_SIZE_DEFAULT 1 #define
> > +IXGBE_VF_GET_QUEUE_MSG_SIZE 5
> >
> > static inline uint16_t
> > dev_num_vf(struct rte_eth_dev *eth_dev) @@ -491,9 +493,36
> @@
> > ixgbe_negotiate_vf_api(struct rte_eth_dev *dev, uint32_t vf,
> > uint32_t
>  *msgbuf)
> > }
> >
> > static int
> > +ixgbe_get_vf_queues(struct rte_eth_dev *dev, uint32_t vf,
> > +uint32_t
> > +*msgbuf) {
> > +   struct ixgbe_vf_info *vfinfo =
> > +   *IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data-
> > dev_private);
> > +   uint32_t default_q = vf *
> RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool;
> > +
> > +   /* Verify if the PF supports the mbox APIs version or not */
> > +   switch (vfinfo[vf].api_version) {
> > +   case ixgbe_mbox_api_20:
> > +   case ixgbe_mbox_api_11:
> > +   break;
> > +   default:
> > +   return -1;
> > +   }
> > +
> > +   /* Notify VF of Rx and Tx queue number */
> > +   msgbuf[IXGBE_VF_RX_QUEUES] =
>  RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool;
> > +   msgbuf[IXGBE_VF_TX_QUEUES] =
>  RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool;
> > +
> > +   /* Notify VF of default queue */
> > +   msgbuf[IXGBE_VF_DEF_QUEUE] = default_q;
>  What about IXGBE_VF_TRANS_VLAN field?
> >>> This field is used for the VLAN strip or DCB case; VF RSS doesn't
> >>> need it.
> >> But VFs do support VLAN stripping and u don't add it to just RSS. If
> >> VFs do not support VLAN stripping in the DPDK yet they should and
> >> then we will need this field.
> > If I don't miss your point, you also agree it is not related to vf rss 
> > itself, right?
> 
> Right.
> 
> > As for VLAN stripping, it needs another patch to support it.
> 
> Well, at least put some fat comment in bold there that some of the fields in the
> command are not filled and why. ;)

OK, I will put more comments to explain it in v5.

> >
> > +
> > +   return 0;
> > +}
> > +
> > +static int
> > ixgbe_rcv_msg_from_vf(struct rte_eth_dev *dev, uint16_t vf)
> > {
> > uint16_t mbx_size = IXGBE_VFMAILBOX_SIZE;
> > +   uint16_t msg_size = IXGBE_VF_MSG_SIZE_DEFAULT;
> > uint32_t msgbuf[IXGBE_VFMAILBOX_SIZE];
> > int32_t retval;
> > struct ixgbe_hw *hw =
> > IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> > @@ -537,6 +566,10 @@ ixgbe_rcv_msg_from_vf(struct rte_eth_dev
> >> *dev,
>  uint16_t vf)
> > case IXGBE_VF_API_NEGOTIATE:
> > retval = ixgbe_negotiate_vf_api(dev, vf, msgbuf);
> > break;
> > +   case IXGBE_VF_GET_QUEUES:
> > +   retval = ixgbe_get_vf_queues(dev, vf, msgbuf);
> > +   msg_size = IXGBE_VF_GET_QUEUE_MSG_SIZE;
>  Although the msg_size semantics and motivation is clear, if u want to do
>  then do it all the way - add it to all other cases too not just to
>  IXGBE_VF_GET_QUEUES.
>  For instance, why do u write all 16 DWORDS for API negotiation (only 2 are
>  required) and only here u decided to get "greedy"? ;)
> 
>  My point is: either drop it completely or fix all other places as well.
> >>> This is because the actual message siz

[dpdk-dev] [PATCH v4 6/6] testpmd: Set Rx VMDq RSS mode

2015-01-07 Thread Ouyang, Changchun


> -----Original Message-----
> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> Sent: Tuesday, January 6, 2015 8:53 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 6/6] testpmd: Set Rx VMDq RSS mode
> 
> 
> On 01/06/15 04:01, Ouyang, Changchun wrote:
> >
> >> -----Original Message-----
> >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> >> Sent: Monday, January 5, 2015 6:12 PM
> >> To: Ouyang, Changchun; dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH v4 6/6] testpmd: Set Rx VMDq RSS mode
> >>
> >>
> >> On 01/05/15 04:38, Ouyang, Changchun wrote:
>  -----Original Message-----
>  From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
>  Sent: Sunday, January 4, 2015 5:47 PM
>  To: Ouyang, Changchun; dev at dpdk.org
>  Subject: Re: [dpdk-dev] [PATCH v4 6/6] testpmd: Set Rx VMDq RSS
>  mode
> 
> 
>  On 01/04/15 11:01, Ouyang, Changchun wrote:
> >> -----Original Message-----
> >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> >> Sent: Sunday, January 4, 2015 4:50 PM
> >> To: Ouyang, Changchun; dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH v4 6/6] testpmd: Set Rx VMDq RSS
> >> mode
> >>
> >>
> >> On 01/04/15 09:18, Ouyang Changchun wrote:
> >>> Set VMDq RSS mode if it has VF(VF number is more than 1) and has
> >>> RSS
> >> information.
> >>> Signed-off-by: Changchun Ouyang 
> >>> ---
> >>>  app/test-pmd/testpmd.c | 10 ++
> >>>  1 file changed, 10 insertions(+)
> >>>
> >>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> >>> index 8c69756..6230f8b 100644
> >>> --- a/app/test-pmd/testpmd.c
> >>> +++ b/app/test-pmd/testpmd.c
> >>> @@ -1708,6 +1708,16 @@ init_port_config(void)
> >>>   port->dev_conf.rxmode.mq_mode =
> >> ETH_MQ_RX_NONE;
> >>>   }
> >>>
> >>> + if (port->dev_info.max_vfs != 0) {
> >>> + if (port-
> >>> dev_conf.rx_adv_conf.rss_conf.rss_hf != 0)
> >>> + port->dev_conf.rxmode.mq_mode =
> >>> + ETH_MQ_RX_VMDQ_RSS;
> >>> + else {
> >>> + port->dev_conf.rxmode.mq_mode =
> >> ETH_MQ_RX_NONE;
> >>> + port->dev_conf.txmode.mq_mode =
> >> ETH_MQ_TX_NONE;
> >>
> >> And what about the txmode.mq_mode when RSS is available
> >> (the "if" clause)?
> > I think we can keep the original value of txmode.mq_mode, so don't
> > change it. What do you think?
> 
>  I agree that not changing a Tx mq_mode in both cases would be better.
> >>> In the else clause, set txmode.mq_mode to ETH_MQ_TX_NONE explicitly to
> >>> make sure it is neither ETH_MQ_TX_DCB, ETH_MQ_TX_VMDQ_DCB, nor
> >>> ETH_MQ_TX_VMDQ_ONLY.
> >>
> >> It's not obvious to me why u should do that since AFAIK any of these
> >> modes requires RX_RSS. Do I miss anything?
> > No, I don't think so. The else clause doesn't need rx_rss, and there is no
> > way to do it, because in that case there is no RSS configuration
> > information (note: in the else clause,
> > dev_conf.rx_adv_conf.rss_conf.rss_hf == 0).
> >
> > So ETH_MQ_RX_NONE for rx_mode, and ETH_MQ_TX_NONE for tx_mode.
> 
> Of course, however, in general, one may ask, why u configure TX MQ mode
> in "else" clause and don't do it in the "if" one. Possibly the "if" case in
> TX MQ context has been handled elsewhere but this is what makes this code
> confusing: to make it the most readable u'd rather configure the same
> feature set in both "if" and "else".
> For instance:
> 
> if (bla-bla) {
>     tx_mode = X1;
>     rx_mode = X2;
> } else {
>     tx_mode = Y1;
>     rx_mode = Y2;
> }
> 
> Look at the non-SR-IOV clause right above the "if-else" block u've added.
> Why don't they configure tx_mode there? Is it a bug in their code?

That also makes sense. I will add tx_mode = ETH_MQ_TX_NONE, since RSS applies
only to the Rx mode, not the Tx mode.

> By the way, u forgot to fix the remark below
> 
> /* In SR-IOV mode, RSS mode is not available */
> 
> which is located a few lines above the code u've added. ;)

Sorry, I missed those few lines before; I will remove them in v5.

Thanks
Changchun
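
As a rough illustration of the symmetry asked for in the review above, here is a minimal, self-contained sketch in which both branches set both the Rx and the Tx mode. The `ex_` types and constants are hypothetical stand-ins, not the real testpmd or ethdev definitions, and this is not the actual v5 change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ETH_MQ_RX_* / ETH_MQ_TX_* values. */
enum ex_rx_mq_mode { EX_MQ_RX_NONE, EX_MQ_RX_VMDQ_RSS };
enum ex_tx_mq_mode { EX_MQ_TX_NONE, EX_MQ_TX_VMDQ_ONLY };

/* Hypothetical, trimmed-down view of a port configuration. */
struct ex_port_conf {
	uint64_t rss_hf;               /* 0 means no RSS hash functions requested */
	enum ex_rx_mq_mode rx_mq_mode;
	enum ex_tx_mq_mode tx_mq_mode;
};

/* Configure the same feature set (Rx and Tx mode) in both branches. */
static void
ex_set_sriov_mq_mode(struct ex_port_conf *conf)
{
	assert(conf != NULL);

	if (conf->rss_hf != 0) {
		conf->rx_mq_mode = EX_MQ_RX_VMDQ_RSS;
		conf->tx_mq_mode = EX_MQ_TX_NONE; /* RSS applies to Rx only */
	} else {
		conf->rx_mq_mode = EX_MQ_RX_NONE;
		conf->tx_mq_mode = EX_MQ_TX_NONE;
	}
}
```

Setting the Tx mode in both branches means a reader does not have to search elsewhere to find out what the Tx mode is when RSS is configured.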



[dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum forwarding engine

2015-01-07 Thread Liu, Jijiang
Hi Olivier,

> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Saturday, December 13, 2014 12:33 AM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and
> csum forwarding engine
>
> Hello,
>
> On 12/12/2014 04:48 AM, Liu, Jijiang wrote:
> > The 'hw/sw' option  is used to set/clear the flag of enabling TX tunneling 
> > packet
> checksum hardware offload in testpmd application.
>
> This is not clear at all.
> In your command, there is (hw|sw|none).
> Are you talking about inner or outer?
> Is this command useful for any kind of packet?
> How does it combine with "tx_checksum set outer-ip (hw|sw)"?
>

I have rethought the TX checksum commands in this patch set and agree with you
that we should change them so that their meaning is clear.

There are 3 commands in the patch set, as follows:
1. tx_checksum set tunnel (hw|sw|none) (port-id)

I now also think command 1 may confuse users: they probably won't understand
why the 'hw' or 'sw' option is needed or when to use each of them,
so I will replace it with the command 'tx_checksum set hw-tunnel-mode (on|off)
(port-id)'.

2. tx_checksum set outer-ip (hw|sw) (port-id)
3. tx_checksum set (ip|udp|tcp|sctp) (hw|sw) (port-id)

Command 2 will be merged into command 3; the new command is 'tx_checksum
set (outer-ip|ip|udp|tcp|sctp) (hw|sw) (port-id)'.

Most of the cases in
http://dpdk.org/ml/archives/dev/2014-December/009213.html will be covered by
these two commands.

The command 'tx_checksum set hw-tunnel-mode (on|off) (port-id)' is used to
set/clear the TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM flag.
When this testpmd flag is set, the PKT_TX_UDP_TUNNEL_PKT offload flag is set as
well, which tells the driver/HW to treat the transmitted packet as a tunneling
packet.

When 'on' is set, this covers Method B.1 and Method C.

When 'off' is set, the TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM flag is not set, so
the PKT_TX_UDP_TUNNEL_PKT offload flag is not set either, and the HW treats
the transmitted packet as a non-tunneling packet. This covers Method B.2.

As for case A, I think it is not mandatory to cover it in the csum fwd engine
for tunneling packets.

Is the above description clear to you?

> Regards,
> Olivier



[dpdk-dev] [PATCH v4 4/6] ether: Check VMDq RSS mode

2015-01-07 Thread Ouyang, Changchun


> -----Original Message-----
> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> Sent: Wednesday, January 7, 2015 3:56 AM
> To: Ouyang, Changchun; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 4/6] ether: Check VMDq RSS mode
> 
> 
> On 01/06/15 03:56, Ouyang, Changchun wrote:
> >> -----Original Message-----
> >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> >> Sent: Monday, January 5, 2015 6:10 PM
> >> To: Ouyang, Changchun;dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH v4 4/6] ether: Check VMDq RSS mode
> >>
> >>
> >> On 01/05/15 03:00, Ouyang, Changchun wrote:
>  -----Original Message-----
>  From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
>  Sent: Sunday, January 4, 2015 5:46 PM
>  To: Ouyang, Changchun;dev at dpdk.org
>  Subject: Re: [dpdk-dev] [PATCH v4 4/6] ether: Check VMDq RSS mode
> 
> 
>  On 01/04/15 10:58, Ouyang, Changchun wrote:
> >> -----Original Message-----
> >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> >> Sent: Sunday, January 4, 2015 4:45 PM
> >> To: Ouyang, Changchun;dev at dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH v4 4/6] ether: Check VMDq RSS
> mode
> >>
> >>
> >> On 01/04/15 09:18, Ouyang Changchun wrote:
> >>> Check mq mode for VMDq RSS, handle it correctly instead of
> >>> returning an error; Also remove the limitation of per pool queue
> >>> number has max value of 1, because the per pool queue number
> >> could
> >>> be 2 or 4 if it is VMDq RSS mode;
> >>>
> >>> The number of rxq specified in config will determine the mq mode
> >>> for
> >> VMDq RSS.
> >>> Signed-off-by: Changchun Ouyang
> >>> ---
> >>>  lib/librte_ether/rte_ethdev.c | 39
> >> ++-
> >>>  1 file changed, 34 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/lib/librte_ether/rte_ethdev.c
> >>> b/lib/librte_ether/rte_ethdev.c index 95f2ceb..59ff325 100644
> >>> --- a/lib/librte_ether/rte_ethdev.c
> >>> +++ b/lib/librte_ether/rte_ethdev.c
> >>> @@ -510,8 +510,7 @@ rte_eth_dev_check_mq_mode(uint8_t
> >> port_id,
> >>> uint16_t nb_rx_q, uint16_t nb_tx_q,
> >>>
> >>>   if (RTE_ETH_DEV_SRIOV(dev).active != 0) {
> >>>   /* check multi-queue mode */
> >>> - if ((dev_conf->rxmode.mq_mode ==
> >> ETH_MQ_RX_RSS) ||
> >>> - (dev_conf->rxmode.mq_mode ==
> >> ETH_MQ_RX_DCB) ||
> >>> + if ((dev_conf->rxmode.mq_mode ==
> >> ETH_MQ_RX_DCB) ||
> >>>   (dev_conf->rxmode.mq_mode ==
> >> ETH_MQ_RX_DCB_RSS)
> >> ||
> >>>   (dev_conf->txmode.mq_mode ==
> >> ETH_MQ_TX_DCB)) {
> >>>   /* SRIOV only works in VMDq enable mode
> >> */ @@ -
> >> 525,7 +524,6 @@
> >>> rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q,
> >> uint16_t nb_tx_q,
> >>>   }
> >>>
> >>>   switch (dev_conf->rxmode.mq_mode) {
> >>> - case ETH_MQ_RX_VMDQ_RSS:
> >>>   case ETH_MQ_RX_VMDQ_DCB:
> >>>   case ETH_MQ_RX_VMDQ_DCB_RSS:
> >>>   /* DCB/RSS VMDQ in SRIOV mode, not
> >> implement
> >> yet */ @@ -534,6
> >>> +532,39 @@ rte_eth_dev_check_mq_mode(uint8_t port_id,
> uint16_t
> >> nb_rx_q, uint16_t nb_tx_q,
> >>>   "unsupported VMDQ
> >> mq_mode
> >> rx %u\n",
> >>>   port_id, dev_conf-
> >>> rxmode.mq_mode);
> >>>   return (-EINVAL);
> >>> + case ETH_MQ_RX_RSS:
> >>> + PMD_DEBUG_TRACE("ethdev port_id=%"
> >> PRIu8
> >>> + " SRIOV active, "
> >>> + "Rx mq mode is changed
> >> from:"
> >>> + "mq_mode %u into VMDQ
> >> mq_mode %u\n",
> >>> + port_id,
> >>> + dev_conf-
> >>> rxmode.mq_mode,
> >>> + dev->data-
> >>> dev_conf.rxmode.mq_mode);
> >>> + case ETH_MQ_RX_VMDQ_RSS:
> >>> + dev->data->dev_conf.rxmode.mq_mode =
> >> ETH_MQ_RX_VMDQ_RSS;
> >>> + if (nb_rx_q <
> >> RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) {
>  Missed that before: shouldn't it be "<=" here?
> >>> Agree with you, need <= here, I will fix it in v5
> >>>
> >>> + switch (nb_rx_q) {
> >>> + case 1:
> >>> + case 2:
> >>> +
> >>RTE_ETH_DEV_SRIOV(dev).active =
> >>> + ETH_64_POOLS;
> >>> + break;
> >>> +

[dpdk-dev] [PATCH v2] bond: vlan flags misinterpreted in xmit_slave_hash function

2015-01-07 Thread Jiajia, SunX
Tested-by: Jiajia, SunX 
- Tested Commit: 6fb3161060fc894295a27f9304c56ef34492799d
- OS: Fedora20 3.11.10-301.fc20.x86_64
- GCC: gcc version 4.8.3
- CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
- NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection 
[8086:10fb]
- Target x86_64-native-linuxapp-gcc and i686-native-linuxapp-gcc
- Total 44 cases, 44 passed, 0 failed

TOPO:
* Port connections between tester/IXIA and DUT
  - TESTER(Or IXIA)---DUT
  - portA--port0
  - portB--port1
  - portC--port2
  - portD--port3


Test Setup#1 for Functional test


The tester has 4 ports (portA--portD) and the DUT has 4 ports (port0--port3).
Connect portA to port0, portB to port1, portC to port2, and portD to port3.



- Case: Basic bonding--Create bonded devices and slaves
  Description: 
Use Setup#1.
Create a bonded device and add some ports as slaves of the bonded device,
then remove slaves, add slaves, change the bonding primary slave,
change the bonding mode, and so on.
  Expected test result:
Verify the basic functions are normal.

- Case: Basic bonding--MAC Address Test
  Description: 
Use Setup#1.
Create a bonded device and add some ports as slaves of the bonded device.
Check the changes to the MAC addresses of the bonded device and slaves.
  Expected test result:
Verify the behavior of bonded device and slave according to the 
mode.

- Case: Basic bonding--Device Promiscuous Mode Test
  Description: 
Use Setup#1.
Create a bonded device and add some ports as slaves of the bonded device.
Set promiscuous mode on or off, then send packets to the bonded device
or slaves.
  Expected test result:
Verify the RX/TX status of bonded device and slaves according to 
the mode.

- Case: Mode 0(Round Robin) TX/RX test
  Description: 
Use Setup#1.
Create a bonded device with mode 0 and add 3 ports as slaves of the bonded device.
Forward packets between the bonded device and an unbonded device, start forwarding,
and send packets to the unbonded device or slaves.
  Expected test result:
Verify the RX/TX status of bonded device and slaves in mode 0.

- Case: Mode 0(Round Robin) Bring one slave link down
  Description: 
Use Setup#1.
Create a bonded device with mode 0 and add 3 ports as slaves of the bonded device.
Forward packets between the bonded device and an unbonded device, start forwarding,
then bring the link on either port 0, 1 or 2 down and send packets to the
unbonded device or slaves.
  Expected test result:
Verify the RX/TX status of bonded device and slaves in mode 0.

- Case: Mode 0(Round Robin) Bring all slave links down
  Description: 
Use Setup#1.
Create a bonded device with mode 0 and add 3 ports as slaves of the bonded device.
Forward packets between the bonded device and an unbonded device, start forwarding,
then bring the links down on all bonded ports and send packets to the
unbonded device or slaves.
  Expected test result:
Verify the RX/TX status of bonded device and slaves in mode 0.

- Case: Mode 1(Active Backup) TX/RX Test
  Description: 
Use Setup#1.
Create a bonded device with mode 1 and add 3 ports as slaves of the bonded device.
Forward packets between the bonded device and an unbonded device, start forwarding,
and send packets to the unbonded device or slaves.
  Expected test result:
Verify the RX/TX status of bonded device and slaves in mode 1.

- Case: Mode 1(Active Backup) Change active slave, RX/TX test
  Description: 
Use Setup#1.
Continuing from the previous test case, change the active slave port from port0
to port1. Verify that the bonded device's MAC has changed to slave1's MAC.

testpmd> set bonding primary 1 4

Repeat the transmission and reception (TX/RX) test to verify that data is now
transmitted and received through the new active slave and no longer through
port0.
  Expected test result:
Verify the RX/TX status of bonded device and slaves in mode 1.

- Case: Mode 1(Active Backup) Link up/down active eth dev
  Description: 
Use Setup#1.

Bring the link between portA and port0 down. If the tester is IXIA, you can use
IxExplorer to set "Simulate Cable Disconnect" in the port properties.
Verify that the active slave has changed from port0. Repeat the
transmission and reception test to verify that data is now transmitted and
received through the new active slave and no longer through port0.

   Bring port0 to link down a

[dpdk-dev] [PATCH v5 0/6] Enable VF RSS for Niantic

2015-01-07 Thread Ouyang Changchun
This patch set enables VF RSS for Niantic, which allows each VF to have at most
4 queues.
The actual number of queues per VF depends on the total number of pools, which
is determined by the max number of VFs at the PF initialization stage and the
number of queues specified in config:
1) If the max number of VFs is in the range from 1 to 32, and the number of rxq
is 4 ('--rxq 4' in testpmd), then there are 32 pools in total (ETH_32_POOLS),
and each VF has 4 queues;

2) If the max number of VFs is in the range from 33 to 64, and the number of rxq
is 2 ('--rxq 2' in testpmd), then there are 64 pools in total (ETH_64_POOLS),
and each VF has 2 queues;
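
The mapping from the configured rxq count to the pool count in cases 1) and 2) can be sketched as below. This is a minimal, self-contained illustration that mirrors the switch in rte_eth_dev_check_vf_rss_rxq_num() from patch 4/6 of this series, where an rxq count of 1 is also accepted and mapped to 64 pools. The `ex_` names are hypothetical, and the enum values assume ETH_32_POOLS and ETH_64_POOLS equal their pool counts.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for ETH_32_POOLS / ETH_64_POOLS (assumed values). */
enum ex_nb_pools { EX_32_POOLS = 32, EX_64_POOLS = 64 };

/*
 * Pick the pool layout for VF RSS from the number of Rx queues per VF.
 * Returns 0 and fills *pools on success, -EINVAL for an unsupported count.
 */
static int
ex_vf_rss_pools(uint16_t nb_rx_q, enum ex_nb_pools *pools)
{
	assert(pools != NULL);

	switch (nb_rx_q) {
	case 1:
	case 2:
		*pools = EX_64_POOLS; /* up to 64 VFs, at most 2 queues each */
		return 0;
	case 4:
		*pools = EX_32_POOLS; /* up to 32 VFs, 4 queues each */
		return 0;
	default:
		return -EINVAL;
	}
}
```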

On the host, to enable VF RSS functionality, the rx mq mode should be set to
ETH_MQ_RX_VMDQ_RSS or ETH_MQ_RX_RSS, and SRIOV mode should be activated
(max_vfs >= 1).
The VF RSS information, such as the hash function, RSS key, and RSS key length,
also needs to be configured.

The limitation for Niantic VF RSS is:
the hash and key are shared among the PF and all VFs, and the RETA table with
128 entries is also shared among the PF and all VFs. So it is not possible to
provide a method to query the hash and RETA content per VF on the guest; if
needed, please query them on the host (PF) for the shared RETA information.

changes in v5:
  - Fix minor issue and some comments;

changes in v4:
  - Extract a function to remove the embedded switch-case statement;
  - Check whether RX queue number is a valid one, otherwise return error;
  - Update the description a bit;

changes in v3:
  - More cleanup;

changes in v2:
  - Update the description;
  - Use the receive queue number ('--rxq ') specified in config to determine the
number of pools and the number of queues per VF;

changes in v1:
  - Config VF RSS;

Changchun Ouyang (6):
  ixgbe: Code cleanup
  ixgbe: Negotiate VF API version
  ixgbe: Get VF queue number
  ether: Check VMDq RSS mode
  ixgbe: Config VF RSS
  testpmd: Set Rx VMDq RSS mode

 app/test-pmd/testpmd.c  |  15 +++-
 lib/librte_ether/rte_ethdev.c   |  50 +++--
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h |   1 +
 lib/librte_pmd_ixgbe/ixgbe_pf.c |  80 -
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c   | 138 
 5 files changed, 248 insertions(+), 36 deletions(-)

-- 
1.8.4.2



[dpdk-dev] [PATCH v5 1/6] ixgbe: Code cleanup

2015-01-07 Thread Ouyang Changchun
Move the global register configuration out of the per-queue loop; also fix a typo and indentation.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 35 ++-
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c 
b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index 5c36bff..f69abda 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -3548,9 +3548,9 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev)
IXGBE_WRITE_REG(hw, 
IXGBE_PSRTYPE(rxq->reg_idx), psrtype);
}
srrctl = ((dev->data->dev_conf.rxmode.split_hdr_size <<
-  IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT) &
- IXGBE_SRRCTL_BSIZEHDR_MASK);
-   srrctl |= E1000_SRRCTL_DESCTYPE_HDR_SPLIT_ALWAYS;
+   IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT) &
+   IXGBE_SRRCTL_BSIZEHDR_MASK);
+   srrctl |= IXGBE_SRRCTL_DESCTYPE_HDR_SPLIT_ALWAYS;
} else
 #endif
srrctl = IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF;
@@ -3985,7 +3985,7 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
struct igb_rx_queue *rxq;
struct rte_pktmbuf_pool_private *mbp_priv;
uint64_t bus_addr;
-   uint32_t srrctl;
+   uint32_t srrctl, psrtype = 0;
uint16_t buf_size;
uint16_t i;
int ret;
@@ -4039,20 +4039,10 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
 * Configure Header Split
 */
if (dev->data->dev_conf.rxmode.header_split) {
-
-   /* Must setup the PSRTYPE register */
-   uint32_t psrtype;
-   psrtype = IXGBE_PSRTYPE_TCPHDR |
-   IXGBE_PSRTYPE_UDPHDR   |
-   IXGBE_PSRTYPE_IPV4HDR  |
-   IXGBE_PSRTYPE_IPV6HDR;
-
-   IXGBE_WRITE_REG(hw, IXGBE_VFPSRTYPE(i), psrtype);
-
srrctl = ((dev->data->dev_conf.rxmode.split_hdr_size <<
-  IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT) &
- IXGBE_SRRCTL_BSIZEHDR_MASK);
-   srrctl |= E1000_SRRCTL_DESCTYPE_HDR_SPLIT_ALWAYS;
+   IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT) &
+   IXGBE_SRRCTL_BSIZEHDR_MASK);
+   srrctl |= IXGBE_SRRCTL_DESCTYPE_HDR_SPLIT_ALWAYS;
} else
 #endif
srrctl = IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF;
@@ -4095,6 +4085,17 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
}
}

+#ifdef RTE_HEADER_SPLIT_ENABLE
+   if (dev->data->dev_conf.rxmode.header_split)
+   /* Must setup the PSRTYPE register */
+   psrtype = IXGBE_PSRTYPE_TCPHDR |
+   IXGBE_PSRTYPE_UDPHDR   |
+   IXGBE_PSRTYPE_IPV4HDR  |
+   IXGBE_PSRTYPE_IPV6HDR;
+#endif
+
+   IXGBE_WRITE_REG(hw, IXGBE_VFPSRTYPE, psrtype);
+
if (dev->data->dev_conf.rxmode.enable_scatter) {
if (!dev->data->scattered_rx)
PMD_INIT_LOG(DEBUG, "forcing scatter mode");
-- 
1.8.4.2



[dpdk-dev] [PATCH v5 2/6] ixgbe: Negotiate VF API version

2015-01-07 Thread Ouyang Changchun
Negotiate API version with VF when receiving the IXGBE_VF_API_NEGOTIATE message.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h |  1 +
 lib/librte_pmd_ixgbe/ixgbe_pf.c | 25 +
 2 files changed, 26 insertions(+)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.h 
b/lib/librte_pmd_ixgbe/ixgbe_ethdev.h
index ca99170..730098d 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.h
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.h
@@ -159,6 +159,7 @@ struct ixgbe_vf_info {
uint16_t tx_rate[IXGBE_MAX_QUEUE_NUM_PER_VF];
uint16_t vlan_count;
uint8_t spoofchk_enabled;
+   uint8_t api_version;
 };

 /*
diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c b/lib/librte_pmd_ixgbe/ixgbe_pf.c
index 51da1fd..495aff5 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_pf.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c
@@ -469,6 +469,28 @@ ixgbe_set_vf_lpe(struct rte_eth_dev *dev, __rte_unused 
uint32_t vf, uint32_t *ms
 }

 static int
+ixgbe_negotiate_vf_api(struct rte_eth_dev *dev, uint32_t vf, uint32_t *msgbuf)
+{
+   uint32_t api_version = msgbuf[1];
+   struct ixgbe_vf_info *vfinfo =
+   *IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data->dev_private);
+
+   switch (api_version) {
+   case ixgbe_mbox_api_10:
+   case ixgbe_mbox_api_11:
+   vfinfo[vf].api_version = (uint8_t)api_version;
+   return 0;
+   default:
+   break;
+   }
+
+   RTE_LOG(ERR, PMD, "Negotiate invalid api version %u from VF %d\n",
+   api_version, vf);
+
+   return -1;
+}
+
+static int
 ixgbe_rcv_msg_from_vf(struct rte_eth_dev *dev, uint16_t vf)
 {
uint16_t mbx_size = IXGBE_VFMAILBOX_SIZE;
@@ -512,6 +534,9 @@ ixgbe_rcv_msg_from_vf(struct rte_eth_dev *dev, uint16_t vf)
case IXGBE_VF_SET_VLAN:
retval = ixgbe_vf_set_vlan(dev, vf, msgbuf);
break;
+   case IXGBE_VF_API_NEGOTIATE:
+   retval = ixgbe_negotiate_vf_api(dev, vf, msgbuf);
+   break;
default:
PMD_DRV_LOG(DEBUG, "Unhandled Msg %8.8x", (unsigned)msgbuf[0]);
retval = IXGBE_ERR_MBX;
-- 
1.8.4.2



[dpdk-dev] [PATCH v5 3/6] ixgbe: Get VF queue number

2015-01-07 Thread Ouyang Changchun
Get the number of available Rx and Tx queues when receiving the
IXGBE_VF_GET_QUEUES message from a VF.

Signed-off-by: Changchun Ouyang 

changes in v5
  - Add some 'FIX ME' comments for IXGBE_VF_TRANS_VLAN.

---
 lib/librte_pmd_ixgbe/ixgbe_pf.c | 40 +++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c b/lib/librte_pmd_ixgbe/ixgbe_pf.c
index 495aff5..dbda9b5 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_pf.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c
@@ -53,6 +53,8 @@
 #include "ixgbe_ethdev.h"

 #define IXGBE_MAX_VFTA (128)
+#define IXGBE_VF_MSG_SIZE_DEFAULT 1
+#define IXGBE_VF_GET_QUEUE_MSG_SIZE 5

 static inline uint16_t
 dev_num_vf(struct rte_eth_dev *eth_dev)
@@ -491,9 +493,41 @@ ixgbe_negotiate_vf_api(struct rte_eth_dev *dev, uint32_t 
vf, uint32_t *msgbuf)
 }

 static int
+ixgbe_get_vf_queues(struct rte_eth_dev *dev, uint32_t vf, uint32_t *msgbuf)
+{
+   struct ixgbe_vf_info *vfinfo =
+   *IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data->dev_private);
+   uint32_t default_q = vf * RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool;
+
+   /* Verify if the PF supports the mbox APIs version or not */
+   switch (vfinfo[vf].api_version) {
+   case ixgbe_mbox_api_20:
+   case ixgbe_mbox_api_11:
+   break;
+   default:
+   return -1;
+   }
+
+   /* Notify VF of Rx and Tx queue number */
+   msgbuf[IXGBE_VF_RX_QUEUES] = RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool;
+   msgbuf[IXGBE_VF_TX_QUEUES] = RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool;
+
+   /* Notify VF of default queue */
+   msgbuf[IXGBE_VF_DEF_QUEUE] = default_q;
+
+   /*
+* FIX ME if it needs fill msgbuf[IXGBE_VF_TRANS_VLAN]
+* for VLAN strip or VMDQ_DCB or VMDQ_DCB_RSS
+*/
+
+   return 0;
+}
+
+static int
 ixgbe_rcv_msg_from_vf(struct rte_eth_dev *dev, uint16_t vf)
 {
uint16_t mbx_size = IXGBE_VFMAILBOX_SIZE;
+   uint16_t msg_size = IXGBE_VF_MSG_SIZE_DEFAULT;
uint32_t msgbuf[IXGBE_VFMAILBOX_SIZE];
int32_t retval;
struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
@@ -537,6 +571,10 @@ ixgbe_rcv_msg_from_vf(struct rte_eth_dev *dev, uint16_t vf)
case IXGBE_VF_API_NEGOTIATE:
retval = ixgbe_negotiate_vf_api(dev, vf, msgbuf);
break;
+   case IXGBE_VF_GET_QUEUES:
+   retval = ixgbe_get_vf_queues(dev, vf, msgbuf);
+   msg_size = IXGBE_VF_GET_QUEUE_MSG_SIZE;
+   break;
default:
PMD_DRV_LOG(DEBUG, "Unhandled Msg %8.8x", (unsigned)msgbuf[0]);
retval = IXGBE_ERR_MBX;
@@ -551,7 +589,7 @@ ixgbe_rcv_msg_from_vf(struct rte_eth_dev *dev, uint16_t vf)

msgbuf[0] |= IXGBE_VT_MSGTYPE_CTS;

-   ixgbe_write_mbx(hw, msgbuf, 1, vf);
+   ixgbe_write_mbx(hw, msgbuf, msg_size, vf);

return retval;
 }
-- 
1.8.4.2



[dpdk-dev] [PATCH v5 4/6] ether: Check VMDq RSS mode

2015-01-07 Thread Ouyang Changchun
Check the mq mode for VMDq RSS and handle it correctly instead of returning an error.
Also remove the limitation that the per-pool queue number has a max value of 1,
because the per-pool queue number can be 2 or 4 in VMDq RSS mode.

The number of rxq specified in config determines the mq mode for VMDq RSS.

Signed-off-by: Changchun Ouyang 

changes in v5:
  - Fix '<' issue, it should be '<=' to test rxq number;
  - Extract a function to remove the embedded switch-case statement.

---
 lib/librte_ether/rte_ethdev.c | 50 ++-
 1 file changed, 45 insertions(+), 5 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 95f2ceb..8363e26 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -503,6 +503,31 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)
 }

 static int
+rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, uint16_t nb_rx_q)
+{
+   struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+   switch (nb_rx_q) {
+   case 1:
+   case 2:
+   RTE_ETH_DEV_SRIOV(dev).active =
+   ETH_64_POOLS;
+   break;
+   case 4:
+   RTE_ETH_DEV_SRIOV(dev).active =
+   ETH_32_POOLS;
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool = nb_rx_q;
+   RTE_ETH_DEV_SRIOV(dev).def_pool_q_idx =
+   dev->pci_dev->max_vfs * nb_rx_q;
+
+   return 0;
+}
+
+static int
 rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
  const struct rte_eth_conf *dev_conf)
 {
@@ -510,8 +535,7 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t 
nb_rx_q, uint16_t nb_tx_q,

if (RTE_ETH_DEV_SRIOV(dev).active != 0) {
/* check multi-queue mode */
-   if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_RSS) ||
-   (dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB) ||
+   if ((dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB) ||
(dev_conf->rxmode.mq_mode == ETH_MQ_RX_DCB_RSS) ||
(dev_conf->txmode.mq_mode == ETH_MQ_TX_DCB)) {
/* SRIOV only works in VMDq enable mode */
@@ -525,7 +549,6 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t 
nb_rx_q, uint16_t nb_tx_q,
}

switch (dev_conf->rxmode.mq_mode) {
-   case ETH_MQ_RX_VMDQ_RSS:
case ETH_MQ_RX_VMDQ_DCB:
case ETH_MQ_RX_VMDQ_DCB_RSS:
/* DCB/RSS VMDQ in SRIOV mode, not implement yet */
@@ -534,6 +557,25 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t 
nb_rx_q, uint16_t nb_tx_q,
"unsupported VMDQ mq_mode rx %u\n",
port_id, dev_conf->rxmode.mq_mode);
return (-EINVAL);
+   case ETH_MQ_RX_RSS:
+   PMD_DEBUG_TRACE("ethdev port_id=%" PRIu8
+   " SRIOV active, "
+   "Rx mq mode is changed from:"
+   "mq_mode %u into VMDQ mq_mode %u\n",
+   port_id,
+   dev_conf->rxmode.mq_mode,
+   dev->data->dev_conf.rxmode.mq_mode);
+   case ETH_MQ_RX_VMDQ_RSS:
+   dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
+   if (nb_rx_q <= RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool)
+   if (rte_eth_dev_check_vf_rss_rxq_num(port_id, 
nb_rx_q) != 0) {
+   PMD_DEBUG_TRACE("ethdev port_id=%d"
+   " SRIOV active, invalid queue"
+   " number for VMDQ RSS\n",
+   port_id);
+   return -EINVAL;
+   }
+   break;
default: /* ETH_MQ_RX_VMDQ_ONLY or ETH_MQ_RX_NONE */
/* if nothing mq mode configure, use default scheme */
dev->data->dev_conf.rxmode.mq_mode = 
ETH_MQ_RX_VMDQ_ONLY;
@@ -553,8 +595,6 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
default: /* ETH_MQ_TX_VMDQ_ONLY or ETH_MQ_TX_NONE */
/* if nothing mq mode configure, use default scheme */
dev->data->dev_conf.txmode.mq_mode = ETH_MQ_TX_VMDQ_ONLY;
-   if (RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool > 1)
-   RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool = 1;
break;
}

-- 
1.8.4.2



[dpdk-dev] [PATCH v5 5/6] ixgbe: Config VF RSS

2015-01-07 Thread Ouyang Changchun
Enabling VF RSS requires configuring RSS, IXGBE_MRQC and IXGBE_VFPSRTYPE.

The psrtype determines how many queues the received packets are distributed to.
Its value should depend on two factors: the max VF rxq number that has been
negotiated with the PF, and the number of rxq specified in the config on the guest.

Signed-off-by: Changchun Ouyang 

Changes in v4:
 - the number of rxq from config should be a power of 2 and should not be bigger
than the max VF rxq number (negotiated between guest and host).
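
For readers following along, here is a minimal sketch of the v4 constraint above. The helper name vf_rss_rxq_num_is_valid and its parameters are hypothetical and are not taken from the patch; the sketch only assumes the two rules stated in the changelog (power of 2, not bigger than the negotiated max VF rxq number).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: checks the two rules from
 * the v4 changelog for the rxq number requested in the guest config.
 */
static bool
vf_rss_rxq_num_is_valid(uint16_t nb_rx_q, uint16_t max_vf_rxq)
{
	/* assumption: the negotiated maximum is expected to be non-zero */
	assert(max_vf_rxq != 0);

	/* zero queues can never receive RSS traffic */
	if (nb_rx_q == 0)
		return false;

	/* a power of two has exactly one bit set */
	if ((nb_rx_q & (nb_rx_q - 1)) != 0)
		return false;

	/* must not exceed what was negotiated between guest and host */
	return nb_rx_q <= max_vf_rxq;
}
```

With a negotiated maximum of 4, this accepts 1, 2 and 4 queues and rejects 3 or 8.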

---
 lib/librte_pmd_ixgbe/ixgbe_pf.c   |  15 ++
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 103 +-
 2 files changed, 106 insertions(+), 12 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c b/lib/librte_pmd_ixgbe/ixgbe_pf.c
index dbda9b5..93f6e43 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_pf.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c
@@ -187,6 +187,21 @@ int ixgbe_pf_host_configure(struct rte_eth_dev *eth_dev)
IXGBE_WRITE_REG(hw, IXGBE_MPSAR_LO(hw->mac.num_rar_entries), 0);
IXGBE_WRITE_REG(hw, IXGBE_MPSAR_HI(hw->mac.num_rar_entries), 0);

+   /*
+* VF RSS supports at most 4 queues per VF. Even if 8 queues
+* are available for each VF, the number must be reduced to 4
+* here because of this limitation; otherwise no queue will
+* receive any packet even when RSS is enabled.
+*/
+   if (eth_dev->data->dev_conf.rxmode.mq_mode == ETH_MQ_RX_VMDQ_RSS) {
+   if (RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool == 8) {
+   RTE_ETH_DEV_SRIOV(eth_dev).active = ETH_32_POOLS;
+   RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool = 4;
+   RTE_ETH_DEV_SRIOV(eth_dev).def_pool_q_idx =
+   dev_num_vf(eth_dev) * 4;
+   }
+   }
+
/* set VMDq map to default PF pool */
hw->mac.ops.set_vmdq(hw, 0, RTE_ETH_DEV_SRIOV(eth_dev).def_vmdq_idx);

diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index f69abda..e83a9ab 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -3327,6 +3327,68 @@ ixgbe_alloc_rx_queue_mbufs(struct igb_rx_queue *rxq)
 }

 static int
+ixgbe_config_vf_rss(struct rte_eth_dev *dev)
+{
+   struct ixgbe_hw *hw;
+   uint32_t mrqc;
+
+   ixgbe_rss_configure(dev);
+
+   hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   /* MRQC: enable VF RSS */
+   mrqc = IXGBE_READ_REG(hw, IXGBE_MRQC);
+   mrqc &= ~IXGBE_MRQC_MRQE_MASK;
+   switch (RTE_ETH_DEV_SRIOV(dev).active) {
+   case ETH_64_POOLS:
+   mrqc |= IXGBE_MRQC_VMDQRSS64EN;
+   break;
+
+   case ETH_32_POOLS:
+   case ETH_16_POOLS:
+   mrqc |= IXGBE_MRQC_VMDQRSS32EN;
+   break;
+
+   default:
+   PMD_INIT_LOG(ERR, "Invalid pool number in IOV mode");
+   return -EINVAL;
+   }
+
+   IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
+
+   return 0;
+}
+
+static int
+ixgbe_config_vf_default(struct rte_eth_dev *dev)
+{
+   struct ixgbe_hw *hw =
+   IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   switch (RTE_ETH_DEV_SRIOV(dev).active) {
+   case ETH_64_POOLS:
+   IXGBE_WRITE_REG(hw, IXGBE_MRQC,
+   IXGBE_MRQC_VMDQEN);
+   break;
+
+   case ETH_32_POOLS:
+   IXGBE_WRITE_REG(hw, IXGBE_MRQC,
+   IXGBE_MRQC_VMDQRT4TCEN);
+   break;
+
+   case ETH_16_POOLS:
+   IXGBE_WRITE_REG(hw, IXGBE_MRQC,
+   IXGBE_MRQC_VMDQRT8TCEN);
+   break;
+   default:
+   PMD_INIT_LOG(ERR,
+   "invalid pool number in IOV mode");
+   break;
+   }
+   return 0;
+}
+
+static int
 ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
 {
struct ixgbe_hw *hw =
@@ -3358,24 +3420,25 @@ ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
default: ixgbe_rss_disable(dev);
}
} else {
-   switch (RTE_ETH_DEV_SRIOV(dev).active) {
/*
 * SRIOV active scheme
-* FIXME if support DCB/RSS together with VMDq & SRIOV
+* Support RSS together with VMDq & SRIOV
 */
-   case ETH_64_POOLS:
-   IXGBE_WRITE_REG(hw, IXGBE_MRQC, IXGBE_MRQC_VMDQEN);
-   break;
-
-   case ETH_32_POOLS:
-   IXGBE_WRITE_REG(hw, IXGBE_MRQC, IXGBE_MRQC_VMDQRT4TCEN);
+   switch (dev->data->dev_conf.rxmode.mq_mode) {
+   case ETH_MQ_RX_RSS:
+   case ETH_MQ_RX_VMDQ_RSS:
+   ixgbe_config_vf_rss(dev);
break;

-   case ETH_16_POOLS:
-   IXGBE_WRITE_REG(hw, IXGBE_MRQC, IXGBE_MRQC_VMDQRT8TCEN);
- 

[dpdk-dev] [PATCH v5 6/6] testpmd: Set Rx VMDq RSS mode

2015-01-07 Thread Ouyang Changchun
Set VMDq RSS mode if the port has VFs (VF number is more than 1) and has RSS
information.

Signed-off-by: Changchun Ouyang 

changes in v5
  - Assign txmode.mq_mode with ETH_MQ_TX_NONE explicitly;
  - Remove one line of wrong comment.

---
 app/test-pmd/testpmd.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 8c69756..64fd4ee 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1700,7 +1700,6 @@ init_port_config(void)
port->dev_conf.rx_adv_conf.rss_conf.rss_hf = 0;
}

-   /* In SR-IOV mode, RSS mode is not available */
if (port->dcb_flag == 0 && port->dev_info.max_vfs == 0) {
if( port->dev_conf.rx_adv_conf.rss_conf.rss_hf != 0)
port->dev_conf.rxmode.mq_mode = ETH_MQ_RX_RSS;
@@ -1708,6 +1707,20 @@ init_port_config(void)
port->dev_conf.rxmode.mq_mode = ETH_MQ_RX_NONE;
}

+   if (port->dev_info.max_vfs != 0) {
+   if (port->dev_conf.rx_adv_conf.rss_conf.rss_hf != 0) {
+   port->dev_conf.rxmode.mq_mode =
+   ETH_MQ_RX_VMDQ_RSS;
+   port->dev_conf.txmode.mq_mode =
+   ETH_MQ_TX_NONE;
+   } else {
+   port->dev_conf.rxmode.mq_mode =
+   ETH_MQ_RX_NONE;
+   port->dev_conf.txmode.mq_mode =
+   ETH_MQ_TX_NONE;
+   }
+   }
+
port->rx_conf.rx_thresh = rx_thresh;
port->rx_conf.rx_free_thresh = rx_free_thresh;
port->rx_conf.rx_drop_en = rx_drop_en;
-- 
1.8.4.2



[dpdk-dev] [PATCH RFC v2 03/12] lib/librte_vhost: move event_copy logic from virtio-net.c to vhost-net-cdev.c

2015-01-07 Thread Xie, Huawei
> + file = *(const struct vhost_vring_file *)in_buf;
> + LOG_DEBUG(VHOST_CONFIG,
> + "idx:%d fd:%d\n", file.index, file.fd);
> + fd = eventfd_copy(file.fd, ctx.pid);
> + if (fd < 0) {
> + fuse_reply_ioctl(req, -1, NULL, 0);
> + result = -1;
> + break;
> + }
> + file.fd = fd;
> + if (cmd == VHOST_SET_VRING_KICK)
> + VHOST_IOCTL_R(struct vhost_vring_file, file,
> ops->set_vring_kick);
> + else
> + VHOST_IOCTL_R(struct vhost_vring_file, file,
> ops->set_vring_call);
file does not keep the new fd: VHOST_IOCTL_R assigns it again from the value in
in_buf.
This bug will be fixed in the next version of the patch.
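
To make the problem concrete, here is a small stand-alone sketch. The struct and function names are hypothetical stand-ins (they are not the vhost types or the real macro); the second memcpy in the first function plays the role of VHOST_IOCTL_R reloading its argument from in_buf.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct vhost_vring_file. */
struct vring_file_sketch {
	unsigned int index;
	int fd;
};

/* Buggy shape: the translated fd is overwritten by a second load. */
static int
kick_fd_with_reload(const void *in_buf, int translated_fd)
{
	struct vring_file_sketch file;

	assert(in_buf != NULL);
	memcpy(&file, in_buf, sizeof(file));
	file.fd = translated_fd;             /* local fix-up ... */
	memcpy(&file, in_buf, sizeof(file)); /* ... lost by the reload */
	return file.fd;
}

/* Intended shape: the handler receives the struct that was fixed up. */
static int
kick_fd_without_reload(const void *in_buf, int translated_fd)
{
	struct vring_file_sketch file;

	assert(in_buf != NULL);
	memcpy(&file, in_buf, sizeof(file));
	file.fd = translated_fd;
	return file.fd;
}
```

If the request carries fd 7 and eventfd_copy returns 42, the first variant still hands 7 to the handler while the second hands over 42.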

> + }
>   break;
> 



[dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum forwarding engine

2015-01-07 Thread Ananyev, Konstantin
Hi Frank,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Liu, Jijiang
> Sent: Wednesday, January 07, 2015 2:04 AM
> To: 'Olivier MATZ'
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum 
> forwarding engine
> 
> Hi Olivier,
> 
> > -Original Message-
> > From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> > Sent: Saturday, December 13, 2014 12:33 AM
> > To: Liu, Jijiang
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and
> > csum forwarding engine
> >
> > Hello,
> >
> > On 12/12/2014 04:48 AM, Liu, Jijiang wrote:
> > > The 'hw/sw' option  is used to set/clear the flag of enabling TX 
> > > tunneling packet
> > checksum hardware offload in testpmd application.
> >
> > This is not clear at all.
> > In your command, there is (hw|sw|none).
> > Are you talking about inner or outer?
> > Is this command useful for any kind of packet?
> > How does it combine with "tx_checksum set outer-ip (hw|sw)"?
> >
> 
> I rethink these TX checksum commands in this patch set and agree with you 
> that we should make some changes for having clear
> meaning for them.
> 
> There are  3 commands in patch set as follows,
> 1. tx_checksum set tunnel (hw|sw|none) (port-id)
> 
> Now I also think the command 1 may confuse user, they probably don't 
> understand  why we need 'hw' or 'sw' option and when  to
> use the two option,
> so I will replace the command with 'tx_checksum set hw-tunnel-mode (on|off) 
> (port-id)' command.

I am a bit confused here. Could you explain what the behaviour would be for 'on'
and 'off'?
Konstantin

> 
> 2. tx_checksum set outer-ip (hw|sw) (port-id)
> 3. tx_checksum set  (ip|udp|tcp|sctp) (hw|sw) (port-id)
> 
> The command 2 will be merged into command 3, the new command is ' tx_checksum 
> set  (outer-ip|ip|udp|tcp|sctp) (hw|sw) (port-
> id)'.
> 
> These most of the cases in 
> http://dpdk.org/ml/archives/dev/2014-December/009213.html will be covered by 
> using the two
> commands
> 
> The command 'tx_checksum set hw-tunnel-mode (on|off)  (port-id)' is used to 
> set/clear  TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM
> flag.
> Actually, the PKT_TX_UDP_TUNNEL_PKT offload flag will be set if the testpmd 
> flag is set, which tell driver/HW treat  that transmit
> packet as a tunneling packet.
> 
> When 'on' is set, which is able to meet Method B.1 and method C.
> 
> When 'off' is set, the TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM is not needed to set, 
> so the PKT_TX_UDP_TUNNEL_PKT offload flag is
> not needed to set,  then HW treat  that transmit packet as a non-tunneling 
> packet. It is able to meet Method B.2.
> 
> As to case A, I think it is not mandatory to cover it in csum fwd engine for 
> tunneling packet.
> 
> Is the above description clear for you?
> 
> > Regards,
> > Olivier



[dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum forwarding engine

2015-01-07 Thread Liu, Jijiang
Hi Konstantin,

> -Original Message-
> From: Ananyev, Konstantin
> Sent: Wednesday, January 7, 2015 5:59 PM
> To: Liu, Jijiang; 'Olivier MATZ'
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and
> csum forwarding engine
> 
> Hi Frank,
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Liu, Jijiang
> > Sent: Wednesday, January 07, 2015 2:04 AM
> > To: 'Olivier MATZ'
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and
> > csum forwarding engine
> >
> > Hi Olivier,
> >
> > > -Original Message-
> > > From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> > > Sent: Saturday, December 13, 2014 12:33 AM
> > > To: Liu, Jijiang
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command
> > > and csum forwarding engine
> > >
> > > Hello,
> > >
> > > On 12/12/2014 04:48 AM, Liu, Jijiang wrote:
> > > > The 'hw/sw' option  is used to set/clear the flag of enabling TX
> > > > tunneling packet
> > > checksum hardware offload in testpmd application.
> > >
> > > This is not clear at all.
> > > In your command, there is (hw|sw|none).
> > > Are you talking about inner or outer?
> > > Is this command useful for any kind of packet?
> > > How does it combine with "tx_checksum set outer-ip (hw|sw)"?
> > >
> >
> > I rethink these TX checksum commands in this patch set and agree with
> > you that we should make some changes for having clear meaning for them.
> >
> > There are  3 commands in patch set as follows, 1. tx_checksum set
> > tunnel (hw|sw|none) (port-id)
> >
> > Now I also think the command 1 may confuse user, they probably don't
> > understand  why we need 'hw' or 'sw' option and when  to use the two
> > option, so I will replace the command with 'tx_checksum set hw-tunnel-mode
> (on|off) (port-id)' command.
> 
> I am a bit confused here, could you explain what would be a behaviour for 
> 'on' and
> 'off'?
> Konstantin

I have explained the behaviour for 'on' and 'off' below.

The command 'tx_checksum set hw-tunnel-mode (on|off) (port-id)' is
used to set/clear the TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM flag.

The PKT_TX_UDP_TUNNEL_PKT offload flag is set if the testpmd flag is set, which
tells the HW to treat the transmitted packet as a tunneling packet when doing
checksum offload.
When 'on' is set, this covers Method B.1 and Method C.

When 'off' is set, TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM is not set, so the
PKT_TX_UDP_TUNNEL_PKT offload flag is not set either, and the HW treats the
transmitted packet as a non-tunneling packet. This covers
Method B.2.
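
The on/off behaviour can be sketched roughly as below. This is only an illustration: the SKETCH_ bit values are placeholders (the real definitions live in testpmd.h and rte_mbuf.h), and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Placeholder bit values, NOT the real testpmd/mbuf definitions. */
#define SKETCH_TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM 0x0010u
#define SKETCH_PKT_TX_UDP_TUNNEL_PKT (1ULL << 60)

/* 'on' sets the per-port testpmd flag, 'off' clears it. */
static int
set_hw_tunnel_mode(uint16_t *tx_ol_flags, const char *mode)
{
	assert(tx_ol_flags != NULL && mode != NULL);

	if (strcmp(mode, "on") == 0)
		*tx_ol_flags = (uint16_t)(*tx_ol_flags |
			SKETCH_TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM);
	else if (strcmp(mode, "off") == 0)
		*tx_ol_flags = (uint16_t)(*tx_ol_flags &
			~SKETCH_TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM);
	else
		return -1; /* unknown keyword */
	return 0;
}

/* The mbuf tunnel flag follows the testpmd flag. */
static uint64_t
mbuf_tunnel_ol_flags(uint16_t tx_ol_flags)
{
	if (tx_ol_flags & SKETCH_TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM)
		return SKETCH_PKT_TX_UDP_TUNNEL_PKT;
	return 0; /* HW sees an ordinary, non-tunneling packet */
}
```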

Is the explanation not clear?

>
> >
> > 2. tx_checksum set outer-ip (hw|sw) (port-id) 3. tx_checksum set
> > (ip|udp|tcp|sctp) (hw|sw) (port-id)
> >
> > The command 2 will be merged into command 3, the new command is '
> > tx_checksum set  (outer-ip|ip|udp|tcp|sctp) (hw|sw) (port- id)'.
> >
> > These most of the cases in
> > http://dpdk.org/ml/archives/dev/2014-December/009213.html will be
> > covered by using the two commands
> >
> > The command 'tx_checksum set hw-tunnel-mode (on|off)  (port-id)' is
> > used to set/clear  TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM flag.
> > Actually, the PKT_TX_UDP_TUNNEL_PKT offload flag will be set if the
> > testpmd flag is set, which tell driver/HW treat  that transmit packet as a
> tunneling packet.
> >
> > When 'on' is set, which is able to meet Method B.1 and method C.
> >
> > When 'off' is set, the TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM is not needed
> > to set, so the PKT_TX_UDP_TUNNEL_PKT offload flag is not needed to set,  
> > then
> HW treat  that transmit packet as a non-tunneling packet. It is able to meet
> Method B.2.
> >
> > As to case A, I think it is not mandatory to cover it in csum fwd engine for
> tunneling packet.
> >
> > Is the above description clear for you?
> >
> > > Regards,
> > > Olivier



[dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum forwarding engine

2015-01-07 Thread Ananyev, Konstantin


> -Original Message-
> From: Liu, Jijiang
> Sent: Wednesday, January 07, 2015 11:39 AM
> To: Ananyev, Konstantin; 'Olivier MATZ'
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum 
> forwarding engine
> 
> Hi Konstantin,
> 
> > -Original Message-
> > From: Ananyev, Konstantin
> > Sent: Wednesday, January 7, 2015 5:59 PM
> > To: Liu, Jijiang; 'Olivier MATZ'
> > Cc: dev at dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and
> > csum forwarding engine
> >
> > Hi Frank,
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Liu, Jijiang
> > > Sent: Wednesday, January 07, 2015 2:04 AM
> > > To: 'Olivier MATZ'
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and
> > > csum forwarding engine
> > >
> > > Hi Olivier,
> > >
> > > > -Original Message-
> > > > From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> > > > Sent: Saturday, December 13, 2014 12:33 AM
> > > > To: Liu, Jijiang
> > > > Cc: dev at dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH v3 0/3] enhance TX checksum command
> > > > and csum forwarding engine
> > > >
> > > > Hello,
> > > >
> > > > On 12/12/2014 04:48 AM, Liu, Jijiang wrote:
> > > > > The 'hw/sw' option  is used to set/clear the flag of enabling TX
> > > > > tunneling packet
> > > > checksum hardware offload in testpmd application.
> > > >
> > > > This is not clear at all.
> > > > In your command, there is (hw|sw|none).
> > > > Are you talking about inner or outer?
> > > > Is this command useful for any kind of packet?
> > > > How does it combine with "tx_checksum set outer-ip (hw|sw)"?
> > > >
> > >
> > > I rethink these TX checksum commands in this patch set and agree with
> > > you that we should make some changes for having clear meaning for them.
> > >
> > > There are  3 commands in patch set as follows, 1. tx_checksum set
> > > tunnel (hw|sw|none) (port-id)
> > >
> > > Now I also think the command 1 may confuse user, they probably don't
> > > understand  why we need 'hw' or 'sw' option and when  to use the two
> > > option, so I will replace the command with 'tx_checksum set hw-tunnel-mode
> > (on|off) (port-id)' command.
> >
> > I am a bit confused here, could you explain what would be a behaviour for 
> > 'on' and
> > 'off'?
> > Konstantin
> 
> I have explained the behaviour for 'on' and'off' below,
> 
> The command 'tx_checksum set hw-tunnel-mode (on|off)  (port-id)' is
> used to set/clear  TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM flag.
> 
> Actually, the PKT_TX_UDP_TUNNEL_PKT offload flag will be set if the
> testpmd flag is set, which means to tell HW treat  that transmit packet as a 
> tunneling packet to do checksum offload
>  When 'on' is set, which is able to meet Method B.1 and method C.
> 
> When 'off' is set, the TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM is not needed
> to set, so the PKT_TX_UDP_TUNNEL_PKT offload flag is not needed to set,  then 
> HW treat  that transmit packet as a non-tunneling
> packet. It is able to meet Method B.2.
> 
> Is the explanation not clear?

OK, and how can I set method A (testpmd treats all packets as non-tunnelling and
never looks beyond the outer L4 header) then?
Konstantin

> 
> >
> > >
> > > 2. tx_checksum set outer-ip (hw|sw) (port-id) 3. tx_checksum set
> > > (ip|udp|tcp|sctp) (hw|sw) (port-id)
> > >
> > > The command 2 will be merged into command 3, the new command is '
> > > tx_checksum set  (outer-ip|ip|udp|tcp|sctp) (hw|sw) (port- id)'.
> > >
> > > These most of the cases in
> > > http://dpdk.org/ml/archives/dev/2014-December/009213.html will be
> > > covered by using the two commands
> > >
> > > The command 'tx_checksum set hw-tunnel-mode (on|off)  (port-id)' is
> > > used to set/clear  TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM flag.
> > > Actually, the PKT_TX_UDP_TUNNEL_PKT offload flag will be set if the
> > > testpmd flag is set, which tell driver/HW treat  that transmit packet as a
> > tunneling packet.
> > >
> > > When 'on' is set, which is able to meet Method B.1 and method C.
> > >
> > > When 'off' is set, the TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM is not needed
> > > to set, so the PKT_TX_UDP_TUNNEL_PKT offload flag is not needed to set,  
> > > then
> > HW treat  that transmit packet as a non-tunneling packet. It is able to meet
> > Method B.2.
> > >
> > > As to case A, I think it is not mandatory to cover it in csum fwd engine 
> > > for
> > tunneling packet.
> > >
> > > Is the above description clear for you?
> > >
> > > > Regards,
> > > > Olivier



[dpdk-dev] [PATCH RFC v2 00/12] lib/librte_vhost: vhost-user support

2015-01-07 Thread Qiu, Michael
On 12/18/2014 1:43 AM, Xie, Huawei wrote:
>
>> -Original Message-
>> From: Tetsuya Mukawa [mailto:mukawa at igel.co.jp]
>> Sent: Sunday, December 14, 2014 10:26 PM
>> To: Xie, Huawei; dev at dpdk.org
>> Cc: haifeng.lin at intel.com
>> Subject: Re: [PATCH RFC v2 00/12] lib/librte_vhost: vhost-user support
>>
>> Hi Xie,
>>
>> I've got warnings from checkpatch.pl.
>> Mostly 'over 80 characters' warnings.
>> (But I know these are come from original vhost-example code sometimes.)
>>
>> So far, your patches are RFC, so I haven't check these strictly.
> Thanks.
> I try to, but you know sometimes 'over 80 characters' is unavoidable.

Why unavoidable? I'm very curious :)

>> Thanks,
>> Tetsuya
>>
>> (2014/12/11 6:37), Huawei Xie wrote:
>>> This patchset refines vhost library to support both vhost-cuse and 
>>> vhost-user.
>>>
>>>
>>> Huawei Xie (12):
>>>   create vhost_cuse directory and move vhost-net-cdev.c to vhost_cuse
>> directory
>>>   rename vhost-net-cdev.h as vhost-net.h
>>>   move eventfd_copy logic out from virtio-net.c to vhost-net-cdev.c
>>>   exact copy of host_memory_map from virtio-net.c to new file
>>>   virtio-net-cdev.c
>>>   host_memory_map refine: map partial memory of target process into current
>> process
>>>   cuse_set_memory_table is the VHOST_SET_MEMORY_TABLE message
>> handler for cuse
>>>   fd management for vhost user
>>>   vhost-user support
>>>   minor fix
>>>   vhost-user memory region map/unmap
>>>   kick/callfd fix
>>>   cleanup when vhost user connection is closed
>>>
>>>  lib/librte_vhost/Makefile |   5 +-
>>>  lib/librte_vhost/rte_virtio_net.h |   2 +
>>>  lib/librte_vhost/vhost-net-cdev.c | 389 --
>>>  lib/librte_vhost/vhost-net-cdev.h | 113 ---
>>>  lib/librte_vhost/vhost-net.h  | 117 +++
>>>  lib/librte_vhost/vhost_cuse/vhost-net-cdev.c  | 452
>> ++
>>>  lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 349 
>>>  lib/librte_vhost/vhost_cuse/virtio-net-cdev.h |  45 +++
>>>  lib/librte_vhost/vhost_rxtx.c |   2 +-
>>>  lib/librte_vhost/vhost_user/fd_man.c  | 205 
>>>  lib/librte_vhost/vhost_user/fd_man.h  |  64 
>>>  lib/librte_vhost/vhost_user/vhost-net-user.c  | 423
>> 
>>>  lib/librte_vhost/vhost_user/vhost-net-user.h  | 107 ++
>>>  lib/librte_vhost/vhost_user/virtio-net-user.c | 313 ++
>>>  lib/librte_vhost/vhost_user/virtio-net-user.h |  49 +++
>>>  lib/librte_vhost/virtio-net.c | 394 ++
>>>  lib/librte_vhost/virtio-net.h |  43 +++
>>>  17 files changed, 2199 insertions(+), 873 deletions(-)
>>>  delete mode 100644 lib/librte_vhost/vhost-net-cdev.c
>>>  delete mode 100644 lib/librte_vhost/vhost-net-cdev.h
>>>  create mode 100644 lib/librte_vhost/vhost-net.h
>>>  create mode 100644 lib/librte_vhost/vhost_cuse/vhost-net-cdev.c
>>>  create mode 100644 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
>>>  create mode 100644 lib/librte_vhost/vhost_cuse/virtio-net-cdev.h
>>>  create mode 100644 lib/librte_vhost/vhost_user/fd_man.c
>>>  create mode 100644 lib/librte_vhost/vhost_user/fd_man.h
>>>  create mode 100644 lib/librte_vhost/vhost_user/vhost-net-user.c
>>>  create mode 100644 lib/librte_vhost/vhost_user/vhost-net-user.h
>>>  create mode 100644 lib/librte_vhost/vhost_user/virtio-net-user.c
>>>  create mode 100644 lib/librte_vhost/vhost_user/virtio-net-user.h
>>>  create mode 100644 lib/librte_vhost/virtio-net.h
>>>
>



[dpdk-dev] [PATCH 0/2] remove limit on devargs parameters length

2015-01-07 Thread David Marchand
Here is a little patchset that removes the limit on the devargs parameters 
length.
Previously, arguments specified by user would be stored in a static buffer,
while there is no particular reason why we should have such a constraint, afaik.


-- 
David Marchand

David Marchand (2):
  devargs: indent and cleanup
  devargs: remove limit on parameters length

 lib/librte_eal/common/eal_common_devargs.c  |   51 ---
 lib/librte_eal/common/include/rte_devargs.h |4 +--
 2 files changed, 32 insertions(+), 23 deletions(-)

-- 
1.7.10.4



[dpdk-dev] [PATCH 1/2] devargs: indent and cleanup

2015-01-07 Thread David Marchand
Prepare for next commit.
Fix some indent issues, refactor error code.

Signed-off-by: David Marchand 
---
 lib/librte_eal/common/eal_common_devargs.c |   27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index 4c7d11a..8c9b31a 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -48,7 +48,7 @@ struct rte_devargs_list devargs_list =
 int
 rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str)
 {
-   struct rte_devargs *devargs;
+   struct rte_devargs *devargs = NULL;
char buf[RTE_DEVARGS_LEN];
char *sep;
int ret;
@@ -57,14 +57,14 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str)
if (ret < 0 || ret >= (int)sizeof(buf)) {
RTE_LOG(ERR, EAL, "user device args too large: <%s>\n",
devargs_str);
-   return -1;
+   goto fail;
}

/* use malloc instead of rte_malloc as it's called early at init */
devargs = malloc(sizeof(*devargs));
if (devargs == NULL) {
RTE_LOG(ERR, EAL, "cannot allocate devargs\n");
-   return -1;
+   goto fail;
}
memset(devargs, 0, sizeof(*devargs));
devargs->type = devtype;
@@ -81,28 +81,29 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str)
case RTE_DEVTYPE_BLACKLISTED_PCI:
/* try to parse pci identifier */
if (eal_parse_pci_BDF(buf, &devargs->pci.addr) != 0 &&
-   eal_parse_pci_DomBDF(buf, &devargs->pci.addr) != 0) {
-   RTE_LOG(ERR, EAL,
-   "invalid PCI identifier <%s>\n", buf);
-   free(devargs);
-   return -1;
+   eal_parse_pci_DomBDF(buf, &devargs->pci.addr) != 0) {
+   RTE_LOG(ERR, EAL, "invalid PCI identifier <%s>\n", buf);
+   goto fail;
}
break;
case RTE_DEVTYPE_VIRTUAL:
/* save driver name */
ret = snprintf(devargs->virtual.drv_name,
-   sizeof(devargs->virtual.drv_name), "%s", buf);
+  sizeof(devargs->virtual.drv_name), "%s", buf);
if (ret < 0 || ret >= (int)sizeof(devargs->virtual.drv_name)) {
-   RTE_LOG(ERR, EAL,
-   "driver name too large: <%s>\n", buf);
-   free(devargs);
-   return -1;
+   RTE_LOG(ERR, EAL, "driver name too large: <%s>\n", buf);
+   goto fail;
}
break;
}

TAILQ_INSERT_TAIL(&devargs_list, devargs, next);
return 0;
+
+fail:
+   if (devargs)
+   free(devargs);
+   return -1;
 }

 /* count the number of devices of a specified type */
-- 
1.7.10.4



[dpdk-dev] [PATCH 2/2] devargs: remove limit on parameters length

2015-01-07 Thread David Marchand
As far as I know, there is no reason why we should have a limit on the length of
parameters that can be given for a device.
Remove this limit by using dynamic allocations.

Signed-off-by: David Marchand 
---
 lib/librte_eal/common/eal_common_devargs.c  |   26 +-
 lib/librte_eal/common/include/rte_devargs.h |4 ++--
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_devargs.c b/lib/librte_eal/common/eal_common_devargs.c
index 8c9b31a..3aace08 100644
--- a/lib/librte_eal/common/eal_common_devargs.c
+++ b/lib/librte_eal/common/eal_common_devargs.c
@@ -49,17 +49,10 @@ int
 rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str)
 {
struct rte_devargs *devargs = NULL;
-   char buf[RTE_DEVARGS_LEN];
+   char *buf = NULL;
char *sep;
int ret;

-   ret = snprintf(buf, sizeof(buf), "%s", devargs_str);
-   if (ret < 0 || ret >= (int)sizeof(buf)) {
-   RTE_LOG(ERR, EAL, "user device args too large: <%s>\n",
-   devargs_str);
-   goto fail;
-   }
-
/* use malloc instead of rte_malloc as it's called early at init */
devargs = malloc(sizeof(*devargs));
if (devargs == NULL) {
@@ -69,11 +62,21 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str)
memset(devargs, 0, sizeof(*devargs));
devargs->type = devtype;

+   buf = strdup(devargs_str);
+   if (buf == NULL) {
+   RTE_LOG(ERR, EAL, "cannot allocate temp memory for devargs\n");
+   goto fail;
+   }
+
/* set the first ',' to '\0' to split name and arguments */
sep = strchr(buf, ',');
if (sep != NULL) {
sep[0] = '\0';
-   snprintf(devargs->args, sizeof(devargs->args), "%s", sep + 1);
+   devargs->args = strdup(sep + 1);
+   if (devargs->args == NULL) {
+   RTE_LOG(ERR, EAL, "cannot allocate for devargs args\n");
+   goto fail;
+   }
}

switch (devargs->type) {
@@ -97,10 +100,15 @@ rte_eal_devargs_add(enum rte_devtype devtype, const char *devargs_str)
break;
}

+   free(buf);
TAILQ_INSERT_TAIL(&devargs_list, devargs, next);
return 0;

 fail:
+   if (devargs && devargs->args)
+   free(devargs->args);
+   if (buf)
+   free(buf);
if (devargs)
free(devargs);
return -1;
diff --git a/lib/librte_eal/common/include/rte_devargs.h b/lib/librte_eal/common/include/rte_devargs.h
index 9f9c98f..996e180 100644
--- a/lib/librte_eal/common/include/rte_devargs.h
+++ b/lib/librte_eal/common/include/rte_devargs.h
@@ -88,8 +88,8 @@ struct rte_devargs {
char drv_name[32];
} virtual;
};
-#define RTE_DEVARGS_LEN 256
-   char args[RTE_DEVARGS_LEN]; /**< Arguments string as given by user. */
+   /** Arguments string as given by user. */
+   char *args;
 };

 /** user device double-linked queue type definition */
-- 
1.7.10.4



[dpdk-dev] [PATCH v3 0/3] enhance TX checksum command and csum forwarding engine

2015-01-07 Thread Qiu, Michael
On 12/10/2014 9:04 AM, Jijiang Liu wrote:
> In the current codes, the "tx_checksum set (ip|udp|tcp|sctp|vxlan) (hw|sw) 
> (port-id)" command is not easy to understand and extend, so the patch set 
> enhances the tx_checksum command and reworks csum forwarding engine due to 
> the change of tx_checksum command. 
> The main changes of the tx_checksum command are listed below,
>
> 1> add "tx_checksum set tunnel (hw|sw|none) (port-id)" command
>
> The command is used to set/clear the tunnel flag, which tells the NIC
> that the packet is a tunneling packet and that the application wants hardware TX
> checksum offload for the outer layer, the inner layer, or both.
>
> The 'none' option means that user treat tunneling packet as ordinary packet 
> when using the csum forward engine.
> For example, let's say we have a tunnel packet:
> eth_hdr_out/ipv4_hdr_out/udp_hdr_out/vxlan_hdr/eth_hdr_in/ipv4_hdr_in/tcp_hdr_in.
>  one of several scenarios:
>
> 1) User requests HW offload for the ipv4_hdr_out checksum, and doesn't care
> whether it is a tunnelled packet or not. So the user sets:
>
> tx_checksum set tunnel none 0
>
> tx_checksum set ip hw 0

Hi Jijiang,

I have one question: many commands need a port-id field like here, so why
don't we put port-id right after the command? Like below:

tx_checksum (port-id) set tunnel (hw|sw|none)

Then users who do not care whether it is a tunneling packet can just
ignore the fields after port-id.

tx_checksum (port-id)

For the code it may be simpler to parse the command, and it is better for the user.

What I mean is that we can put the required parameters right after the command
and put the optional parameters (or those that can be optional) at the end of the
command line.

(Command)  (required parameter) (optional parameters)

Thus, it would be a better user experience.

But just personal idea.

Thanks,

Michael

>
> So for such case we should set tx_tunnel to 'none'.
>
> 2> add "tx_checksum set outer-ip (hw|sw) (port-id)" command
>
> The command is used to set/clear outer IP flag that is used to tell the NIC 
> that application want hardware offload for outer layer.
>
> 3> remove the 'vxlan' option from the  "tx_checksum set 
> (ip|udp|tcp|sctp|vxlan) (hw|sw) (port-id)" command
>
> The command is used to set IP, UDP, TCP and SCTP TX checksum flag. In the 
> case of tunneling packet, the IP, UDP, TCP and SCTP flags always concern 
> inner layer.
>  
> Moreover, replace the TESTPMD_TX_OFFLOAD_VXLAN_CKSUM flag with 
> TESTPMD_TX_OFFLOAD_TUNNEL_CKSUM flag and add the 
> TESTPMD_TX_OFFLOAD_OUTER_IP_CKSUM and TESTPMD_TX_OFFLOAD_NON_TUNNEL_CKSUM 
> flag in test-pmd application.
>
> v2 change:
>  redefine the 'none' behaviour for "tx_checksum set tunnel (hw|sw|none) 
> (port-id)" command.
> v3 change:
>  typo correction in cmdline help 
>
> Jijiang Liu (3):
>   add outer IP offload capability in librte_ether.
>   add outer IP checksum capability in i40e PMD
>   testpmd command lines of the tx_checksum and csum forwarding rework
>
>  app/test-pmd/cmdline.c|  201 
> +++--
>  app/test-pmd/csumonly.c   |   35 ---
>  app/test-pmd/testpmd.h|6 +-
>  lib/librte_ether/rte_ethdev.h |1 +
>  lib/librte_pmd_i40e/i40e_ethdev.c |3 +-
>  5 files changed, 218 insertions(+), 28 deletions(-)
>



[dpdk-dev] [PATCH] librte_reorder: New reorder library with unit tests and app

2015-01-07 Thread Reshma Pattan
From: Reshma Pattan 

1) New library to provide reordering of out-of-order
mbufs based on the sequence number of each mbuf. The library uses a reorder
buffer structure, which in turn uses two circular buffers called the ready
and order buffers.
* rte_reorder_create API creates an instance of a reorder buffer.
* rte_reorder_init API initializes a given reorder buffer instance.
* rte_reorder_reset API resets a given reorder buffer instance.
* rte_reorder_insert API inserts an mbuf into the order circular buffer.
* rte_reorder_fill_overflow moves mbufs from the order buffer to the ready
buffer to accommodate early packets in the order buffer.
* rte_reorder_drain API provides a draining facility to fetch
reordered mbufs from the order and ready buffers.

2) New unit test cases added.

3) New application added to verify the performance of the library.
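
As a rough illustration of the idea in 1), here is a simplified stand-alone sketch of reordering by sequence number. It is not the rte_reorder implementation: all names are hypothetical, it collapses the ready and order buffers into a single fixed window, and it rejects packets outside the window instead of moving older ones to a ready buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REORDER_WINDOW 8u /* must be a power of two */

struct reorder_sketch {
	uint32_t min_seqn;          /* next sequence number to release */
	void *slot[REORDER_WINDOW]; /* NULL means "not received yet" */
};

static void
reorder_sketch_init(struct reorder_sketch *rb, uint32_t first_seqn)
{
	size_t i;

	assert(rb != NULL);
	rb->min_seqn = first_seqn;
	for (i = 0; i < REORDER_WINDOW; i++)
		rb->slot[i] = NULL;
}

/* Returns 0 on success, -1 for late, too-early or duplicate packets. */
static int
reorder_sketch_insert(struct reorder_sketch *rb, uint32_t seqn, void *pkt)
{
	/* unsigned subtraction wraps, so late packets give a huge offset */
	uint32_t offset = seqn - rb->min_seqn;
	uint32_t idx = seqn & (REORDER_WINDOW - 1u);

	if (pkt == NULL || offset >= REORDER_WINDOW || rb->slot[idx] != NULL)
		return -1;
	rb->slot[idx] = pkt;
	return 0;
}

/* Releases packets in sequence order until the first gap. */
static unsigned int
reorder_sketch_drain(struct reorder_sketch *rb, void **out, unsigned int max)
{
	unsigned int n = 0;

	while (n < max) {
		uint32_t idx = rb->min_seqn & (REORDER_WINDOW - 1u);

		if (rb->slot[idx] == NULL)
			break; /* gap: wait for the missing packet */
		out[n++] = rb->slot[idx];
		rb->slot[idx] = NULL;
		rb->min_seqn++;
	}
	return n;
}
```

Inserting packets 102, 100, 101 into a buffer that starts at 100 and then draining returns them as 100, 101, 102.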

Signed-off-by: Reshma Pattan 
Signed-off-by: Richardson Bruce 
---
 app/test/Makefile  |   2 +
 app/test/test_reorder.c| 452 ++
 config/common_bsdapp   |   5 +
 config/common_linuxapp |   5 +
 examples/packet_ordering/Makefile  |  50 ++
 examples/packet_ordering/main.c| 637 +
 lib/Makefile   |   1 +
 lib/librte_eal/common/include/rte_tailq_elem.h |   2 +
 lib/librte_mbuf/rte_mbuf.h |   3 +
 lib/librte_reorder/Makefile|  50 ++
 lib/librte_reorder/rte_reorder.c   | 464 ++
 lib/librte_reorder/rte_reorder.h   | 184 +++
 mk/rte.app.mk  |   4 +
 13 files changed, 1859 insertions(+)
 create mode 100644 app/test/test_reorder.c
 create mode 100644 examples/packet_ordering/Makefile
 create mode 100644 examples/packet_ordering/main.c
 create mode 100644 lib/librte_reorder/Makefile
 create mode 100644 lib/librte_reorder/rte_reorder.c
 create mode 100644 lib/librte_reorder/rte_reorder.h

diff --git a/app/test/Makefile b/app/test/Makefile
index 4311f96..24b27d7 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -124,6 +124,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += test_ivshmem.c
 SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += test_distributor.c
 SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += test_distributor_perf.c

+SRCS-$(CONFIG_RTE_LIBRTE_REORDER) += test_reorder.c
+
 SRCS-y += test_devargs.c
 SRCS-y += virtual_pmd.c
 SRCS-y += packet_burst_generator.c
diff --git a/app/test/test_reorder.c b/app/test/test_reorder.c
new file mode 100644
index 000..6a673e2
--- /dev/null
+++ b/app/test/test_reorder.c
@@ -0,0 +1,452 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "test.h"
+#include "stdio.h"
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "test.h"
+
+#define BURST 32
+#define REORDER_BUFFER_SIZE 16384
+#define NUM_MBUFS (2*REORDER_BUFFER_SIZE)
+#define REORDER_BUFFER_SIZE_INVALID 2049
+#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+
+struct reorder_unittest_params {
+   struct rte_mempool *p;
+   struct rte_reorder_buffer *b;
+};
+
+static struct reorder_unittest_par

[dpdk-dev] [PATCH] librte_reorder: New reorder library with unit tests and app

2015-01-07 Thread Pattan, Reshma
Self Nacked.
Sending multiple sub patches instead of this big patch.

> -Original Message-
> From: Pattan, Reshma
> Sent: Wednesday, January 7, 2015 3:28 PM
> To: dev at dpdk.org
> Cc: Pattan, Reshma
> Subject: [PATCH] librte_reorder: New reorder library with unit tests and app
> 
> From: Reshma Pattan 
> 
> 1) New library to reorder out-of-order mbufs based on the mbuf
> sequence number. The library uses a reorder buffer structure, which
> in turn uses two circular buffers called the ready and order buffers.
> * rte_reorder_create API creates an instance of the reorder buffer.
> * rte_reorder_init API initializes the given reorder buffer instance.
> * rte_reorder_reset API resets the given reorder buffer instance.
> * rte_reorder_insert API inserts the mbuf into the order circular buffer.
> * rte_reorder_fill_overflow moves mbufs from the order buffer to the
>   ready buffer to accommodate early packets in the order buffer.
> * rte_reorder_drain API drains reordered mbufs from the order and
>   ready buffers.
> 
> 2) New unit test cases added.
> 
> 3) New application added to verify the performance of the library.
> 
> Signed-off-by: Reshma Pattan 
> Signed-off-by: Richardson Bruce 
> ---
>  app/test/Makefile  |   2 +
>  app/test/test_reorder.c| 452 ++
>  config/common_bsdapp   |   5 +
>  config/common_linuxapp |   5 +
>  examples/packet_ordering/Makefile  |  50 ++
>  examples/packet_ordering/main.c| 637 +
>  lib/Makefile   |   1 +
>  lib/librte_eal/common/include/rte_tailq_elem.h |   2 +
>  lib/librte_mbuf/rte_mbuf.h |   3 +
>  lib/librte_reorder/Makefile|  50 ++
>  lib/librte_reorder/rte_reorder.c   | 464 ++
>  lib/librte_reorder/rte_reorder.h   | 184 +++
>  mk/rte.app.mk  |   4 +
>  13 files changed, 1859 insertions(+)
>  create mode 100644 app/test/test_reorder.c
>  create mode 100644 examples/packet_ordering/Makefile
>  create mode 100644 examples/packet_ordering/main.c
>  create mode 100644 lib/librte_reorder/Makefile
>  create mode 100644 lib/librte_reorder/rte_reorder.c
>  create mode 100644 lib/librte_reorder/rte_reorder.h
> 
> diff --git a/app/test/Makefile b/app/test/Makefile
> index 4311f96..24b27d7 100644
> --- a/app/test/Makefile
> +++ b/app/test/Makefile
> @@ -124,6 +124,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += test_ivshmem.c
>  SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += test_distributor.c
>  SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += test_distributor_perf.c
> 
> +SRCS-$(CONFIG_RTE_LIBRTE_REORDER) += test_reorder.c
> +
>  SRCS-y += test_devargs.c
>  SRCS-y += virtual_pmd.c
>  SRCS-y += packet_burst_generator.c
> diff --git a/app/test/test_reorder.c b/app/test/test_reorder.c
> new file mode 100644
> index 000..6a673e2
> --- /dev/null
> +++ b/app/test/test_reorder.c
> @@ -0,0 +1,452 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + * * Redistributions of source code must retain the above copyright
> + *   notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + *   notice, this list of conditions and the following disclaimer in
> + *   the documentation and/or other materials provided with the
> + *   distribution.
> + * * Neither the name of Intel Corporation nor the names of its
> + *   contributors may be used to endorse or promote products derived
> + *   from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +

[dpdk-dev] [PATCH 1/3] librte_reorder: New reorder library

2015-01-07 Thread Reshma Pattan
From: Reshma Pattan 

1) New library to reorder out-of-order mbufs based on the mbuf
sequence number. The library uses a reorder buffer structure, which
in turn uses two circular buffers called the ready and order buffers.
* rte_reorder_create API creates an instance of the reorder buffer.
* rte_reorder_init API initializes the given reorder buffer instance.
* rte_reorder_reset API resets the given reorder buffer instance.
* rte_reorder_insert API inserts the mbuf into the order circular buffer.
* rte_reorder_fill_overflow moves mbufs from the order buffer to the
  ready buffer to accommodate early packets in the order buffer.
* rte_reorder_drain API drains reordered mbufs from the order and
  ready buffers.

Signed-off-by: Reshma Pattan 
Signed-off-by: Richardson Bruce 
---
 config/common_bsdapp   |   5 +
 config/common_linuxapp |   5 +
 lib/Makefile   |   1 +
 lib/librte_eal/common/include/rte_tailq_elem.h |   2 +
 lib/librte_mbuf/rte_mbuf.h |   3 +
 lib/librte_reorder/Makefile|  50 +++
 lib/librte_reorder/rte_reorder.c   | 464 +
 lib/librte_reorder/rte_reorder.h   | 184 ++
 8 files changed, 714 insertions(+)
 create mode 100644 lib/librte_reorder/Makefile
 create mode 100644 lib/librte_reorder/rte_reorder.c
 create mode 100644 lib/librte_reorder/rte_reorder.h

diff --git a/config/common_bsdapp b/config/common_bsdapp
index 9177db1..e3e0e94 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -334,6 +334,11 @@ CONFIG_RTE_SCHED_PORT_N_GRINDERS=8
 CONFIG_RTE_LIBRTE_DISTRIBUTOR=y

 #
+# Compile the reorder library
+#
+CONFIG_RTE_LIBRTE_REORDER=y
+
+#
 # Compile librte_port
 #
 CONFIG_RTE_LIBRTE_PORT=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 2f9643b..b5ec730 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -342,6 +342,11 @@ CONFIG_RTE_SCHED_PORT_N_GRINDERS=8
 CONFIG_RTE_LIBRTE_DISTRIBUTOR=y

 #
+# Compile the reorder library
+#
+CONFIG_RTE_LIBRTE_REORDER=y
+
+#
 # Compile librte_port
 #
 CONFIG_RTE_LIBRTE_PORT=y
diff --git a/lib/Makefile b/lib/Makefile
index 0ffc982..5919d32 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -65,6 +65,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += librte_distributor
 DIRS-$(CONFIG_RTE_LIBRTE_PORT) += librte_port
 DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
 DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
+DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder

 ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
 DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni
diff --git a/lib/librte_eal/common/include/rte_tailq_elem.h b/lib/librte_eal/common/include/rte_tailq_elem.h
index f74fc7c..3013869 100644
--- a/lib/librte_eal/common/include/rte_tailq_elem.h
+++ b/lib/librte_eal/common/include/rte_tailq_elem.h
@@ -84,6 +84,8 @@ rte_tailq_elem(RTE_TAILQ_ACL, "RTE_ACL")

 rte_tailq_elem(RTE_TAILQ_DISTRIBUTOR, "RTE_DISTRIBUTOR")

+rte_tailq_elem(RTE_TAILQ_REORDER, "RTE_REORDER")
+
 rte_tailq_end(RTE_TAILQ_NUM)

 #undef rte_tailq_elem
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 16059c6..ed27eb8 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -262,6 +262,9 @@ struct rte_mbuf {
uint32_t usr; /**< User defined tags. See @rte_distributor_process */
} hash;   /**< hash information */

+   /* sequence number - field used in distributor and reorder library */
+   uint32_t seqn;
+
/* second cache line - fields only used in slow path or on TX */
MARKER cacheline1 __rte_cache_aligned;

diff --git a/lib/librte_reorder/Makefile b/lib/librte_reorder/Makefile
new file mode 100644
index 000..12b916f
--- /dev/null
+++ b/lib/librte_reorder/Makefile
@@ -0,0 +1,50 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRE

[dpdk-dev] [PATCH 2/3] librte_reorder: New unit test cases added

2015-01-07 Thread Reshma Pattan
From: Reshma Pattan 

Signed-off-by: Reshma Pattan 
---
 app/test/Makefile   |   2 +
 app/test/test_reorder.c | 452 
 mk/rte.app.mk   |   4 +
 3 files changed, 458 insertions(+)
 create mode 100644 app/test/test_reorder.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 4311f96..24b27d7 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -124,6 +124,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += test_ivshmem.c
 SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += test_distributor.c
 SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += test_distributor_perf.c

+SRCS-$(CONFIG_RTE_LIBRTE_REORDER) += test_reorder.c
+
 SRCS-y += test_devargs.c
 SRCS-y += virtual_pmd.c
 SRCS-y += packet_burst_generator.c
diff --git a/app/test/test_reorder.c b/app/test/test_reorder.c
new file mode 100644
index 000..6a673e2
--- /dev/null
+++ b/app/test/test_reorder.c
@@ -0,0 +1,452 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "test.h"
+#include "stdio.h"
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "test.h"
+
+#define BURST 32
+#define REORDER_BUFFER_SIZE 16384
+#define NUM_MBUFS (2*REORDER_BUFFER_SIZE)
+#define REORDER_BUFFER_SIZE_INVALID 2049
+#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
+
+struct reorder_unittest_params {
+   struct rte_mempool *p;
+   struct rte_reorder_buffer *b;
+};
+
+static struct reorder_unittest_params default_params  = {
+   .p = NULL,
+   .b = NULL
+};
+
+static struct reorder_unittest_params *test_params = &default_params;
+
+static int
+test_reorder_create_inval_name(void)
+{
+   struct rte_reorder_buffer *b = NULL;
+   char *name = NULL;
+
+   b = rte_reorder_create(name, rte_socket_id(), REORDER_BUFFER_SIZE);
+   TEST_ASSERT_EQUAL(b, NULL, "No error on create() with invalid name param.");
+   TEST_ASSERT_EQUAL(rte_errno, EINVAL,
+   "No error on create() with invalid name param.");
+   return 0;
+}
+
+static int
+test_reorder_create_inval_size(void)
+{
+   struct rte_reorder_buffer *b = NULL;
+
+   b = rte_reorder_create("PKT", rte_socket_id(), REORDER_BUFFER_SIZE_INVALID);
+   TEST_ASSERT_EQUAL(b, NULL,
+   "No error on create() with invalid buffer size param.");
+   TEST_ASSERT_EQUAL(rte_errno, EINVAL,
+   "No error on create() with invalid buffer size param.");
+   return 0;
+}
+
+static int
+test_reorder_init_null_buffer(void)
+{
+   struct rte_reorder_buffer *b = NULL;
+   /*
+* The minimum memory area size that should be passed to library is,
+* sizeof(struct rte_reorder_buffer) + (2 * size * sizeof(struct 
rte_mbuf *));
+* Otherwise error will be thrown
+*/
+   unsigned int mzsize = 262336;
+   b = rte_reorder_init(b, mzsize, "PKT1", REORDER_BUFFER_SIZE);
+   TEST_ASSERT_EQUAL(b, NULL, "No error on init with NULL buffer.");
+   TEST_ASSERT_EQUAL(rte_errno, EINVAL, "No error on init with NULL buffer.");
+   return 0;
+}
+
+static int
+test_reorder_init_inval_mzsize(void)
+{
+   struct rte_reorder_buffer *b = NULL;
+   unsigned int mzsize =  100;
+   b = rte_malloc(NULL,

[dpdk-dev] [PATCH 3/3] librte_reorder: New sample app for reorder library

2015-01-07 Thread Reshma Pattan
From: Reshma Pattan 

* The sample application consists of RX, worker and TX threads.
* The RX thread sets the seqn field of each mbuf it receives from the
  driver. Marked mbufs are enqueued to a multi-consumer ring.
* Worker threads dequeue mbufs from the multi-consumer ring and XOR
  the input port value of each mbuf. Processed mbufs are enqueued to
  another ring for TX.
* The TX thread dequeues the mbufs from that ring and hands them to
  the reorder library for reordering before sending them out.

Signed-off-by: Reshma Pattan 
---
 examples/packet_ordering/Makefile |  50 +++
 examples/packet_ordering/main.c   | 637 ++
 2 files changed, 687 insertions(+)
 create mode 100644 examples/packet_ordering/Makefile
 create mode 100644 examples/packet_ordering/main.c

diff --git a/examples/packet_ordering/Makefile b/examples/packet_ordering/Makefile
new file mode 100644
index 000..44bd2e1
--- /dev/null
+++ b/examples/packet_ordering/Makefile
@@ -0,0 +1,50 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ifeq ($(RTE_SDK),)
+$(error "Please define RTE_SDK environment variable")
+endif
+
+# Default target, can be overridden by command line or environment
+RTE_TARGET ?= x86_64-ivshmem-linuxapp-gcc
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# binary name
+APP = packet_ordering
+
+# all source are stored in SRCS-y
+SRCS-y := main.c
+
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+
+include $(RTE_SDK)/mk/rte.extapp.mk
diff --git a/examples/packet_ordering/main.c b/examples/packet_ordering/main.c
new file mode 100644
index 000..8b65275
--- /dev/null
+++ b/examples/packet_ordering/main.c
@@ -0,0 +1,637 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF

[dpdk-dev] [PATCH 1/3] librte_reorder: New reorder library

2015-01-07 Thread Neil Horman
On Wed, Jan 07, 2015 at 04:39:11PM +, Reshma Pattan wrote:
> From: Reshma Pattan 
> 
> 1) New library to reorder out-of-order mbufs based on the mbuf
> sequence number. The library uses a reorder buffer structure, which
> in turn uses two circular buffers called the ready and order buffers.
> * rte_reorder_create API creates an instance of the reorder buffer.
> * rte_reorder_init API initializes the given reorder buffer instance.
> * rte_reorder_reset API resets the given reorder buffer instance.
> * rte_reorder_insert API inserts the mbuf into the order circular buffer.
> * rte_reorder_fill_overflow moves mbufs from the order buffer to the
>   ready buffer to accommodate early packets in the order buffer.
> * rte_reorder_drain API drains reordered mbufs from the order and
>   ready buffers.
> 
> Signed-off-by: Reshma Pattan 
> Signed-off-by: Richardson Bruce 
> ---
>  config/common_bsdapp   |   5 +
>  config/common_linuxapp |   5 +
>  lib/Makefile   |   1 +
>  lib/librte_eal/common/include/rte_tailq_elem.h |   2 +
>  lib/librte_mbuf/rte_mbuf.h |   3 +
>  lib/librte_reorder/Makefile|  50 +++
>  lib/librte_reorder/rte_reorder.c   | 464 +
>  lib/librte_reorder/rte_reorder.h   | 184 ++
>  8 files changed, 714 insertions(+)
>  create mode 100644 lib/librte_reorder/Makefile
>  create mode 100644 lib/librte_reorder/rte_reorder.c
>  create mode 100644 lib/librte_reorder/rte_reorder.h
> +
> +int
> +rte_reorder_insert(struct rte_reorder_buffer *b, struct rte_mbuf *mbuf)
> +{
> + uint32_t offset, position;
> + struct cir_buffer *order_buf = &b->order_buf;
> +
> + /*
> +  * calculate the offset from the head pointer we need to go.
> +  * The subtraction takes care of the sequence number wrapping.
> +  * For example (using 16-bit for brevity):
> +  *  min_seqn  = 0xFFFD
> +  *  mbuf_seqn = 0x0010
> +  *  offset= 0x0010 - 0xFFFD = 0x13
> +  */
> + offset = mbuf->seqn - b->min_seqn;
> +
> + /*
> +  * action to take depends on offset.
> +  * offset < buffer->size: the mbuf fits within the current window of
> +  *sequence numbers we can reorder. EXPECTED CASE.
> +  * offset > buffer->size: the mbuf is outside the current window. There
> +  *are a number of cases to consider:
> +  *1. The packet sequence is just outside the window, then we need
> +  *   to see about shifting the head pointer and taking any ready
> +  *   to return packets out of the ring. If there was a delayed
> +  *   or dropped packet preventing drains from shifting the window
> +  *   this case will skip over the dropped packet instead, and any
> +  *   packets dequeued here will be returned on the next drain call.
> +  *2. The packet sequence number is vastly outside our window, taken
> +  *   here as having offset greater than twice the buffer size. In
> +  *   this case, the packet is probably an old or late packet that
> +  *   was previously skipped, so just enqueue the packet for
> +  *   immediate return on the next drain call, or else return error.
> +  */
> + if (offset < b->order_buf.size) {
> + position = (order_buf->head + offset) & order_buf->mask;
> + order_buf->entries[position] = mbuf;
> + } else if (offset < 2 * b->order_buf.size) {
> + if (rte_reorder_fill_overflow(b, offset - order_buf->size) <
> + offset - order_buf->size) {
> + /* Put in handling for enqueue straight to output */
> + rte_errno = ENOSPC;
> + return -1;
> + }
> + offset = mbuf->seqn - b->min_seqn;
> + position = (order_buf->head + offset) & order_buf->mask;
> + order_buf->entries[position] = mbuf;
> + } else {
> + /* Put in handling for enqueue straight to output */
> + rte_errno = ERANGE;
> + return -1;
> + }
How does this work if you get two packets with the same sequence number?  That
situation seems like it would happen frequently with your example app, and from
my read of the above, you just wind up overwriting the same pointer in the
entries array here, which leads to silent packet loss.
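
To make the concern concrete, here is a minimal sketch rather than the patch's code. It assumes a hypothetical mini_reorder structure with a fixed 8-entry window and a hypothetical mini_insert() that computes the same wrapping offset as the quoted code but refuses to overwrite an occupied slot. The names, the window size and the -2 return value are illustrative assumptions; how rte_reorder_insert should really treat duplicates is for the author to decide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the order buffer; size must be a power of two. */
#define WINDOW_SIZE 8u

struct mini_reorder {
	void *entries[WINDOW_SIZE]; /* NULL means the slot is free */
	uint32_t head;              /* index of the slot holding min_seqn */
	uint32_t min_seqn;          /* lowest sequence number in the window */
};

/*
 * Unsigned subtraction wraps modulo 2^32, so the offset stays small
 * even when seqn has wrapped past zero and min_seqn has not.
 */
static uint32_t
seqn_offset(uint32_t seqn, uint32_t min_seqn)
{
	return seqn - min_seqn;
}

/*
 * Returns 0 on success, -1 if seqn falls outside the window, and -2 if
 * the slot is already occupied, i.e. a duplicate sequence number.
 */
static int
mini_insert(struct mini_reorder *b, void *pkt, uint32_t seqn)
{
	uint32_t offset = seqn_offset(seqn, b->min_seqn);
	uint32_t pos;

	assert(pkt != NULL);

	if (offset >= WINDOW_SIZE)
		return -1;

	pos = (b->head + offset) & (WINDOW_SIZE - 1);
	if (b->entries[pos] != NULL)
		return -2; /* do not silently drop the earlier packet */

	b->entries[pos] = pkt;
	return 0;
}
```

With a check like this, the caller at least learns about the collision and can free or forward the second mbuf instead of leaking the first one.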



[dpdk-dev] [PATCH 2/2] devargs: remove limit on parameters length

2015-01-07 Thread Stephen Hemminger
On Wed,  7 Jan 2015 14:03:29 +0100
David Marchand  wrote:

> + buf = strdup(devargs_str);
> + if (buf == NULL) {
> + RTE_LOG(ERR, EAL, "cannot allocate temp memory for devargs\n");
> + goto fail;
> + }
> +

If the string is only used in the same function, you might consider using
strdupa(), which avoids worrying about freeing it in error paths.
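
A rough sketch of what that could look like, assuming glibc (strdupa() is a GNU extension and needs _GNU_SOURCE). The helper count_devargs_fields() is hypothetical and is not part of the devargs code; it only shows that no free() is needed on any return path. One caveat worth weighing: strdupa() allocates on the stack, so with the length limit removed, a very long devargs string could exhaust the stack.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* required for strdupa() on glibc */
#endif
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: count the comma-separated fields in a devargs
 * string. The temporary copy lives on the stack and is released
 * automatically when the function returns.
 */
static int
count_devargs_fields(const char *devargs_str)
{
	char *buf, *tok, *save = NULL;
	int n = 0;

	assert(devargs_str != NULL);

	buf = strdupa(devargs_str); /* stack copy, never passed to free() */

	for (tok = strtok_r(buf, ",", &save); tok != NULL;
	     tok = strtok_r(NULL, ",", &save))
		n++;

	return n; /* no cleanup label needed */
}
```

Because the copy's lifetime ends with the function, this only fits the case Stephen describes, where the string is not kept after the function returns.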