[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread zimeiw
hi,


I have ported most of the FreeBSD TCP/IP stack to DPDK. The new TCP/IP stack is
based on DPDK rte_mbuf, rte_ring, rte_memory and rte_table, which makes it
faster at forwarding packets.

The following features are ready:

Netdp initialization
Ether layer
ARP
IP layer
Routing
ICMP
Commands for adding, deleting, showing IP address
Commands for adding, deleting, showing static route

Next steps:

Porting UDP to netdp.
Porting TCP to netdp.
Porting socket to netdp.


You can find the code at: https://github.com/dpdk-net/netdp





[dpdk-dev] [PATCH 03/13] mbuf: add packet_type field

2014-09-09 Thread Liu, Jijiang
Hi Olivier,

> -----Original Message-----
> From: Zhu, Heqing
> Sent: Tuesday, September 09, 2014 9:48 AM
> To: Wu, Jingjing; Liu, Jijiang; Zhang, Helin
> Cc: Zhu, Heqing
> Subject: FW: [dpdk-dev] [PATCH 03/13] mbuf: add packet_type field
> 
> One of you need respond to this thread? Please make the answer generic and
> easy to understand/accept if possible.
> 
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier MATZ
> Sent: Monday, September 08, 2014 7:17 PM
> To: Yerden Zhumabekov; Richardson, Bruce; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 03/13] mbuf: add packet_type field
> 
> Hi Yerden,
> 
> On 09/08/2014 12:33 PM, Yerden Zhumabekov wrote:
> > 08.09.2014 16:17, Olivier MATZ wrote:
> >>> --- a/lib/librte_mbuf/rte_mbuf.h
> >>> +++ b/lib/librte_mbuf/rte_mbuf.h
> >>> @@ -146,7 +146,7 @@ struct rte_mbuf {
> >>>   uint32_t reserved1; /**< Unused field. Required for padding */
> >>>
> >>>   /* remaining bytes are set on RX when pulling packet from descriptor */
> >>> - uint16_t reserved2; /**< Unused field. Required for padding */
> >>> + uint16_t packet_type;   /**< Type of packet, e.g. protocols used */
> >>>   uint16_t data_len;  /**< Amount of data in segment buffer. */
> >>>   uint32_t pkt_len;   /**< Total pkt len: sum of all segments. */
> >>>   uint16_t l3_len:9;  /**< L3 (IP) Header Length. */
> >>>
> >> This patch adds a new fields that nobody uses. So why should we add it ?
> >
> > I would use it :)
> > It's useful to store the IP protocol number (UDP, TCP etc) and version
> > of IP (4, 6) and then relay packet to specific handler.
> 
> I'm not saying this field is useless. But even if it's useful for some 
> applications
> like yours, it does not mean that it should go in the generic mbuf structure.

Some NICs, for example Intel Fortville, support many packet types, and the
hardware fills the packet type value into a field of the receive descriptor.
An application can easily work out the format of an incoming packet if it
knows the packet type. This is a common approach to analyzing packet formats.

> Also, for a new field, we should define who is in charge of filling it.
> Is is the driver? Does it mean that all drivers have to be modified to fill 
> it? Or is
> it just a placeholder for applications? In this case, shouldn't we use
> application-specific metadata?

Yes, the driver is in charge of filling it if the NIC has a packet type
field in the receive descriptor.

> In the other direction (TX), we would also need
> to define if this field must be filled by the application before transmitting 
> a
> mbuf to a driver.

Normally, the TX side does not care about the packet type; instead, offload
feature flags are used on the TX side.
On the RX side, the NIC hardware analyses the incoming packet, knows what the
packet type is, and fills the packet type value into the receive descriptor.
Of course, I don't object to adding a packet type on the TX side if there is
enough TX space in the mbuf, but from the present point of view it is not
necessary.
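
As a rough illustration of the RX-side use described above, the sketch below
shows how an application could choose a handler from a driver-filled packet
type instead of parsing headers itself. The names demo_mbuf, DEMO_PTYPE_* and
demo_dispatch() are hypothetical and made up for this example; they are
neither the real rte_mbuf layout nor the Fortville packet type encoding.

```c
#include <stdint.h>

/* Hypothetical packet type IDs; real hardware defines its own encoding. */
enum demo_ptype {
	DEMO_PTYPE_UNKNOWN = 0,
	DEMO_PTYPE_IPV4_UDP,
	DEMO_PTYPE_IPV4_TCP,
	DEMO_PTYPE_IPV6_UDP,
	DEMO_PTYPE_IPV6_TCP,
};

/* Hypothetical, cut-down stand-in for struct rte_mbuf. */
struct demo_mbuf {
	uint16_t packet_type;	/* filled by the driver on RX */
	uint16_t data_len;
	uint32_t pkt_len;
};

enum demo_handler {
	DEMO_HANDLER_SLOW_PATH = 0,
	DEMO_HANDLER_UDP,
	DEMO_HANDLER_TCP,
};

/* Pick a handler from the packet type alone, without touching headers. */
static enum demo_handler demo_dispatch(const struct demo_mbuf *m)
{
	switch (m->packet_type) {
	case DEMO_PTYPE_IPV4_UDP:
	case DEMO_PTYPE_IPV6_UDP:
		return DEMO_HANDLER_UDP;
	case DEMO_PTYPE_IPV4_TCP:
	case DEMO_PTYPE_IPV6_TCP:
		return DEMO_HANDLER_TCP;
	default:
		/* Unknown or unsupported type: fall back to software parsing. */
		return DEMO_HANDLER_SLOW_PATH;
	}
}
```

Any value the application does not recognize falls through to the slow path,
so a driver that never fills the field would still work in this sketch.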

> Regards,
> Olivier

Regards,
Jijiang Liu




[dpdk-dev] [PATCH 03/13] mbuf: add packet_type field

2014-09-09 Thread Zhang, Helin


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier MATZ
> Sent: Monday, September 8, 2014 7:17 PM
> To: Yerden Zhumabekov; Richardson, Bruce; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 03/13] mbuf: add packet_type field
> 
> Hi Yerden,
> 
> On 09/08/2014 12:33 PM, Yerden Zhumabekov wrote:
> > 08.09.2014 16:17, Olivier MATZ wrote:
> >>> --- a/lib/librte_mbuf/rte_mbuf.h
> >>> +++ b/lib/librte_mbuf/rte_mbuf.h
> >>> @@ -146,7 +146,7 @@ struct rte_mbuf {
> >>>   uint32_t reserved1; /**< Unused field. Required for padding */
> >>>
> >>>   /* remaining bytes are set on RX when pulling packet from descriptor */
> >>> - uint16_t reserved2; /**< Unused field. Required for padding */
> >>> + uint16_t packet_type;   /**< Type of packet, e.g. protocols used */
> >>>   uint16_t data_len;  /**< Amount of data in segment buffer. */
> >>>   uint32_t pkt_len;   /**< Total pkt len: sum of all segments. */
> >>>   uint16_t l3_len:9;  /**< L3 (IP) Header Length. */
> >>>
> >> This patch adds a new fields that nobody uses. So why should we add it ?
> >
> > I would use it :)
> > It's useful to store the IP protocol number (UDP, TCP etc) and version
> > of IP (4, 6) and then relay packet to specific handler.

It is a common field which the i40e PMD will use to store the 'packet type ID'.
The i40e hardware can recognize more than a hundred packet types for received
packets, which is quite useful for an upper-layer stack or application. So this
field is quite useful and will be filled by the PMD.
ixgbe/igb have fewer than 10 packet types, which are marked in the offload
flags. From now on, it would be better to have a new field here to hold the
hardware-offloaded packet type, and it could be used by future NICs.

> 
> I'm not saying this field is useless. But even if it's useful for some 
> applications
> like yours, it does not mean that it should go in the generic mbuf structure.
> 
> Also, for a new field, we should define who is in charge of filling it.
> Is is the driver? Does it mean that all drivers have to be modified to fill 
> it? Or is
> it just a placeholder for applications? In this case, shouldn't we use
> application-specific metadata? In the other direction (TX), we would also need
> to define if this field must be filled by the application before transmitting 
> a mbuf
> to a driver.
Yes, the PMD will fill it. The i40e PMD will be the first one; ixgbe/igb can be
kept as they are, or modified to be consistent. The field is used on the RX
side only; whether it can also be used on the TX side can be investigated. I
think some new features in development can consider that.
Anyway, it is a quite useful field for i40e and future generations of NICs.
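
A minimal sketch of what "filled by the PMD" could look like on the RX path is
given below. It assumes a descriptor that carries an 8-bit packet type inside a
64-bit status word; the structure names, the shift and the mask are
hypothetical values chosen for the example, not the i40e descriptor definition.

```c
#include <stdint.h>

/* Hypothetical position of the packet type inside the descriptor word. */
#define DEMO_RXD_PTYPE_SHIFT	30
#define DEMO_RXD_PTYPE_MASK	(0xFFULL << DEMO_RXD_PTYPE_SHIFT)

/* Hypothetical write-back RX descriptor: one status/length/ptype word. */
struct demo_rx_desc {
	uint64_t qword1;
};

/* Hypothetical, cut-down packet buffer with a packet_type field. */
struct demo_pkt {
	uint16_t packet_type;
};

/* Copy the hardware-reported packet type from the descriptor to the buffer. */
static void demo_rxd_to_pkt(const struct demo_rx_desc *rxd, struct demo_pkt *pkt)
{
	pkt->packet_type = (uint16_t)((rxd->qword1 & DEMO_RXD_PTYPE_MASK) >>
				      DEMO_RXD_PTYPE_SHIFT);
}
```

An 8-bit field leaves room for up to 256 type IDs, which is enough for the
"more than a hundred" types mentioned above.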
> 
> Regards,
> Olivier


[dpdk-dev] initialization order in rte_eal_init()

2014-09-09 Thread Tetsuya Mukawa
Hi,

I have a question about the initialization order in rte_eal_init().
It seems some lcore-related functions are called between
rte_eal_dev_init(PMD_INIT_PRE_PCI_PROBE) and rte_eal_pci_probe().
Is there any reason to do so?

int
rte_eal_init(int argc, char **argv)
{

	(snip)

	if (rte_eal_dev_init(PMD_INIT_PRE_PCI_PROBE) < 0)
		rte_panic("Cannot init pmd devices\n");

	RTE_LCORE_FOREACH_SLAVE(i) {

		/*
		 * create communication pipes between master thread
		 * and children
		 */
		if (pipe(lcore_config[i].pipe_master2slave) < 0)
			rte_panic("Cannot create pipe\n");
		if (pipe(lcore_config[i].pipe_slave2master) < 0)
			rte_panic("Cannot create pipe\n");

		lcore_config[i].state = WAIT;

		/* create a thread for each lcore */
		ret = pthread_create(&lcore_config[i].thread_id, NULL,
				     eal_thread_loop, NULL);
		if (ret != 0)
			rte_panic("Cannot create thread\n");
	}

	/*
	 * Launch a dummy function on all slave lcores, so that master lcore
	 * knows they are all ready when this function returns.
	 */
	rte_eal_mp_remote_launch(sync_func, NULL, SKIP_MASTER);
	rte_eal_mp_wait_lcore();

	/* Probe & Initialize PCI devices */
	if (rte_eal_pci_probe())
		rte_panic("Cannot probe PCI\n");

	/* Initialize any outstanding devices */
	if (rte_eal_dev_init(PMD_INIT_POST_PCI_PROBE) < 0)
		rte_panic("Cannot init pmd devices\n");

	return fctret;
}

Thanks,
Tetsuya


[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Vadim Suraev
I've ported the Linux kernel TCP/IP stack to user space and integrated it with
DPDK. The source, documentation and roadmap will be published (and
announced) within a few days.
Regards,
Vadim
On Sep 9, 2014 9:20 AM, "Matthew Hall"  wrote:

> On Tue, Sep 09, 2014 at 08:49:44AM +0800, zimeiw wrote:
> > I have porting major FreeBSD tcp/ip stack to dpdk. new tcp/ip stack is
> based
> > on dpdk rte_mbuf, rte_ring, rte_memory and rte_table. it is faster to
> > forwarding packets.
>
> Hello,
>
> This is awesome work to be doing and badly needed to use DPDK for any L4
> purposes where it is very limited. I'll be following your progress.
>
> You didn't mention your name, and compare your work with
> https://github.com/rumpkernel/dpdk-rumptcpip/ , and talk about behavior /
> performance, and how long you think it'll take. I'm curious if you can give
> some more comments.
>
> I'm implementing an RX-side very basic stack myself... but I'm not using
> BSD
> standard APIs or doing TX-side like yours will have.
>
> Matthew.
>


[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Zhang, Helin


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Vadim Suraev
> Sent: Tuesday, September 9, 2014 2:30 PM
> To: Matthew Hall
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] TCP/IP stack for DPDK
> 
> I've ported the Linux kernel TCP/IP stack to user space and integrated with
> DPDK,  the source and documentation and the roadmap will be published (and
> announced) within few days.

Is there any license issue with porting the Linux kernel stack into DPDK?

> Regards,
> Vadim
> On Sep 9, 2014 9:20 AM, "Matthew Hall"  wrote:
> 
> > On Tue, Sep 09, 2014 at 08:49:44AM +0800, zimeiw wrote:
> > > I have porting major FreeBSD tcp/ip stack to dpdk. new tcp/ip stack
> > > is
> > based
> > > on dpdk rte_mbuf, rte_ring, rte_memory and rte_table. it is faster
> > > to forwarding packets.
> >
> > Hello,
> >
> > This is awesome work to be doing and badly needed to use DPDK for any
> > L4 purposes where it is very limited. I'll be following your progress.
> >
> > You didn't mention your name, and compare your work with
> > https://github.com/rumpkernel/dpdk-rumptcpip/ , and talk about
> > behavior / performance, and how long you think it'll take. I'm curious
> > if you can give some more comments.
> >
> > I'm implementing an RX-side very basic stack myself... but I'm not
> > using BSD standard APIs or doing TX-side like yours will have.
> >
> > Matthew.
> >

Regards,
Helin


[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Vadim Suraev
IMHO, since the GPL is more restrictive, the source must remain open.
On Sep 9, 2014 9:39 AM, "Zhang, Helin"  wrote:

>
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Vadim Suraev
> > Sent: Tuesday, September 9, 2014 2:30 PM
> > To: Matthew Hall
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] TCP/IP stack for DPDK
> >
> > I've ported the Linux kernel TCP/IP stack to user space and integrated
> with
> > DPDK,  the source and documentation and the roadmap will be published
> (and
> > announced) within few days.
>
> Any license issue of porting Linux kernel stack into DPDK?
>
> > Regards,
> > Vadim
> > On Sep 9, 2014 9:20 AM, "Matthew Hall"  wrote:
> >
> > > On Tue, Sep 09, 2014 at 08:49:44AM +0800, zimeiw wrote:
> > > > I have porting major FreeBSD tcp/ip stack to dpdk. new tcp/ip stack
> > > > is
> > > based
> > > > on dpdk rte_mbuf, rte_ring, rte_memory and rte_table. it is faster
> > > > to forwarding packets.
> > >
> > > Hello,
> > >
> > > This is awesome work to be doing and badly needed to use DPDK for any
> > > L4 purposes where it is very limited. I'll be following your progress.
> > >
> > > You didn't mention your name, and compare your work with
> > > https://github.com/rumpkernel/dpdk-rumptcpip/ , and talk about
> > > behavior / performance, and how long you think it'll take. I'm curious
> > > if you can give some more comments.
> > >
> > > I'm implementing an RX-side very basic stack myself... but I'm not
> > > using BSD standard APIs or doing TX-side like yours will have.
> > >
> > > Matthew.
> > >
>
> Regards,
> Helin
>


[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Zhang, Helin
That means your great work under a GPL/LGPL license will not appear in the
DPDK main line, as DPDK is always BSD-licensed.

Regards,
Helin

From: Vadim Suraev [mailto:vadim.sur...@gmail.com]
Sent: Tuesday, September 9, 2014 2:43 PM
To: Zhang, Helin
Cc: Matthew Hall; dev at dpdk.org
Subject: RE: [dpdk-dev] TCP/IP stack for DPDK


IMHO, since GPL is more restrictive so the source must remain open
On Sep 9, 2014 9:39 AM, "Zhang, Helin" <helin.zhang at intel.com> wrote:


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Vadim Suraev
> Sent: Tuesday, September 9, 2014 2:30 PM
> To: Matthew Hall
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] TCP/IP stack for DPDK
>
> I've ported the Linux kernel TCP/IP stack to user space and integrated with
> DPDK,  the source and documentation and the roadmap will be published (and
> announced) within few days.

Any license issue of porting Linux kernel stack into DPDK?

> Regards,
> Vadim
> On Sep 9, 2014 9:20 AM, "Matthew Hall" <mhall at mhcomputing.net> wrote:
>
> > On Tue, Sep 09, 2014 at 08:49:44AM +0800, zimeiw wrote:
> > > I have porting major FreeBSD tcp/ip stack to dpdk. new tcp/ip stack
> > > is
> > based
> > > on dpdk rte_mbuf, rte_ring, rte_memory and rte_table. it is faster
> > > to forwarding packets.
> >
> > Hello,
> >
> > This is awesome work to be doing and badly needed to use DPDK for any
> > L4 purposes where it is very limited. I'll be following your progress.
> >
> > You didn't mention your name, and compare your work with
> > https://github.com/rumpkernel/dpdk-rumptcpip/ , and talk about
> > behavior / performance, and how long you think it'll take. I'm curious
> > if you can give some more comments.
> >
> > I'm implementing an RX-side very basic stack myself... but I'm not
> > using BSD standard APIs or doing TX-side like yours will have.
> >
> > Matthew.
> >

Regards,
Helin


[dpdk-dev] [PATCH 00/15] i40e base driver update

2014-09-09 Thread Helin Zhang
Here is the update of the i40e base driver. It also includes a few
necessary related code changes in the i40e PMD.

Helin Zhang (15):
  i40e: make the indentation more consistent in share code
  i40e: support nvmupdate by default
  i40e: remove useless code which was written for Solaris
  i40e: remove test code for 'ethtool'
  i40e: force a shifted '1' to be 'unsigned'
  i40e: remove useless code for pre-boot support
  i40e: Get rid of sparse warnings, and remove unreachable code
  i40e: remove code which is for software validation only
  i40e: remove code for TPH (TLP Processing Hints)
  i40e: support of 10G base T
  i40e: expose debug_write_register request
  i40e: workaround of get_firmware_version, and enhancements
  i40e: Use get_link_status to report FC settings
  i40e: fix and enhancement in arq_event_info struct
  i40e: support redefined struct of 'i40e_arq_event_info'

 lib/librte_pmd_i40e/i40e/i40e_adminq.c     |   55 +-
 lib/librte_pmd_i40e/i40e/i40e_adminq.h     |    5 +-
 lib/librte_pmd_i40e/i40e/i40e_adminq_cmd.h | 2132 ++--
 lib/librte_pmd_i40e/i40e/i40e_common.c     |  173 +--
 lib/librte_pmd_i40e/i40e/i40e_dcb.c        |  625
 lib/librte_pmd_i40e/i40e/i40e_dcb.h        |  103 --
 lib/librte_pmd_i40e/i40e/i40e_diag.c       |   10 -
 lib/librte_pmd_i40e/i40e/i40e_hmc.h        |    5 +-
 lib/librte_pmd_i40e/i40e/i40e_lan_hmc.c    |  227 +--
 lib/librte_pmd_i40e/i40e/i40e_lan_hmc.h    |   14 -
 lib/librte_pmd_i40e/i40e/i40e_nvm.c        |  120 +-
 lib/librte_pmd_i40e/i40e/i40e_prototype.h  |   19 +-
 lib/librte_pmd_i40e/i40e/i40e_type.h       |   49 +-
 lib/librte_pmd_i40e/i40e_ethdev.c          |    8 +-
 lib/librte_pmd_i40e/i40e_ethdev_vf.c       |   10 +-
 15 files changed, 1242 insertions(+), 2313 deletions(-)

-- 
1.8.1.4



[dpdk-dev] [PATCH 01/15] i40e: make the indentation more consistent in share code

2014-09-09 Thread Helin Zhang
In the shared code, tabs rather than spaces are used to align values.
The changes in i40e_adminq_cmd.h make the indentation more
consistent with the rest of the shared code.

Signed-off-by: Helin Zhang 
Reviewed-by: Chen Jing 
---
 lib/librte_pmd_i40e/i40e/i40e_adminq_cmd.h | 2132 ++--
 1 file changed, 1066 insertions(+), 1066 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e/i40e_adminq_cmd.h b/lib/librte_pmd_i40e/i40e/i40e_adminq_cmd.h
index d7f65bc..5ea9b7d 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_adminq_cmd.h
+++ b/lib/librte_pmd_i40e/i40e/i40e_adminq_cmd.h
@@ -40,8 +40,8 @@ POSSIBILITY OF SUCH DAMAGE.
  * This file needs to comply with the Linux Kernel coding style.
  */

-#define I40E_FW_API_VERSION_MAJOR  0x0001
-#define I40E_FW_API_VERSION_MINOR  0x0002
+#define I40E_FW_API_VERSION_MAJOR  0x0001
+#define I40E_FW_API_VERSION_MINOR  0x0002

 struct i40e_aq_desc {
__le16 flags;
@@ -73,216 +73,216 @@ struct i40e_aq_desc {
  */

 /* command flags and offsets*/
-#define I40E_AQ_FLAG_DD_SHIFT  0
-#define I40E_AQ_FLAG_CMP_SHIFT 1
-#define I40E_AQ_FLAG_ERR_SHIFT 2
-#define I40E_AQ_FLAG_VFE_SHIFT 3
-#define I40E_AQ_FLAG_LB_SHIFT  9
-#define I40E_AQ_FLAG_RD_SHIFT  10
-#define I40E_AQ_FLAG_VFC_SHIFT 11
-#define I40E_AQ_FLAG_BUF_SHIFT 12
-#define I40E_AQ_FLAG_SI_SHIFT  13
-#define I40E_AQ_FLAG_EI_SHIFT  14
-#define I40E_AQ_FLAG_FE_SHIFT  15
-
-#define I40E_AQ_FLAG_DD  (1 << I40E_AQ_FLAG_DD_SHIFT)  /* 0x1*/
-#define I40E_AQ_FLAG_CMP (1 << I40E_AQ_FLAG_CMP_SHIFT) /* 0x2*/
-#define I40E_AQ_FLAG_ERR (1 << I40E_AQ_FLAG_ERR_SHIFT) /* 0x4*/
-#define I40E_AQ_FLAG_VFE (1 << I40E_AQ_FLAG_VFE_SHIFT) /* 0x8*/
-#define I40E_AQ_FLAG_LB  (1 << I40E_AQ_FLAG_LB_SHIFT)  /* 0x200  */
-#define I40E_AQ_FLAG_RD  (1 << I40E_AQ_FLAG_RD_SHIFT)  /* 0x400  */
-#define I40E_AQ_FLAG_VFC (1 << I40E_AQ_FLAG_VFC_SHIFT) /* 0x800  */
-#define I40E_AQ_FLAG_BUF (1 << I40E_AQ_FLAG_BUF_SHIFT) /* 0x1000 */
-#define I40E_AQ_FLAG_SI  (1 << I40E_AQ_FLAG_SI_SHIFT)  /* 0x2000 */
-#define I40E_AQ_FLAG_EI  (1 << I40E_AQ_FLAG_EI_SHIFT)  /* 0x4000 */
-#define I40E_AQ_FLAG_FE  (1 << I40E_AQ_FLAG_FE_SHIFT)  /* 0x8000 */
+#define I40E_AQ_FLAG_DD_SHIFT  0
+#define I40E_AQ_FLAG_CMP_SHIFT 1
+#define I40E_AQ_FLAG_ERR_SHIFT 2
+#define I40E_AQ_FLAG_VFE_SHIFT 3
+#define I40E_AQ_FLAG_LB_SHIFT  9
+#define I40E_AQ_FLAG_RD_SHIFT  10
+#define I40E_AQ_FLAG_VFC_SHIFT 11
+#define I40E_AQ_FLAG_BUF_SHIFT 12
+#define I40E_AQ_FLAG_SI_SHIFT  13
+#define I40E_AQ_FLAG_EI_SHIFT  14
+#define I40E_AQ_FLAG_FE_SHIFT  15
+
+#define I40E_AQ_FLAG_DD	(1 << I40E_AQ_FLAG_DD_SHIFT)  /* 0x1    */
+#define I40E_AQ_FLAG_CMP	(1 << I40E_AQ_FLAG_CMP_SHIFT) /* 0x2    */
+#define I40E_AQ_FLAG_ERR	(1 << I40E_AQ_FLAG_ERR_SHIFT) /* 0x4    */
+#define I40E_AQ_FLAG_VFE	(1 << I40E_AQ_FLAG_VFE_SHIFT) /* 0x8    */
+#define I40E_AQ_FLAG_LB	(1 << I40E_AQ_FLAG_LB_SHIFT)  /* 0x200  */
+#define I40E_AQ_FLAG_RD	(1 << I40E_AQ_FLAG_RD_SHIFT)  /* 0x400  */
+#define I40E_AQ_FLAG_VFC	(1 << I40E_AQ_FLAG_VFC_SHIFT) /* 0x800  */
+#define I40E_AQ_FLAG_BUF	(1 << I40E_AQ_FLAG_BUF_SHIFT) /* 0x1000 */
+#define I40E_AQ_FLAG_SI	(1 << I40E_AQ_FLAG_SI_SHIFT)  /* 0x2000 */
+#define I40E_AQ_FLAG_EI	(1 << I40E_AQ_FLAG_EI_SHIFT)  /* 0x4000 */
+#define I40E_AQ_FLAG_FE	(1 << I40E_AQ_FLAG_FE_SHIFT)  /* 0x8000 */

 /* error codes */
 enum i40e_admin_queue_err {
-   I40E_AQ_RC_OK   = 0,/* success */
-   I40E_AQ_RC_EPERM= 1,/* Operation not permitted */
-   I40E_AQ_RC_ENOENT   = 2,/* No such element */
-   I40E_AQ_RC_ESRCH= 3,/* Bad opcode */
-   I40E_AQ_RC_EINTR= 4,/* operation interrupted */
-   I40E_AQ_RC_EIO  = 5,/* I/O error */
-   I40E_AQ_RC_ENXIO= 6,/* No such resource */
-   I40E_AQ_RC_E2BIG= 7,/* Arg too long */
-   I40E_AQ_RC_EAGAIN   = 8,/* Try again */
-   I40E_AQ_RC_ENOMEM   = 9,/* Out of memory */
-   I40E_AQ_RC_EACCES   = 10,   /* Permission denied */
-   I40E_AQ_RC_EFAULT   = 11,   /* Bad address */
-   I40E_AQ_RC_EBUSY= 12,   /* Device or resource busy */
-   I40E_AQ_RC_EEXIST   = 13,   /* object already exists */
-   I40E_AQ_RC_EINVAL   = 14,   /* Invalid argument */
-   I40E_AQ_RC_ENOTTY   = 15,   /* Not a typewriter */
-   I40E_AQ_RC_ENOSPC   = 16,   /* No space left or alloc failure */
-   I40E_AQ_RC_ENOSYS   = 17,   /* Function not implemented */
-   I40E_AQ_RC_ERANGE   = 18,   /* Parameter out of range */
-   I40E_AQ_RC_EFLUSHED = 19,   /* Cmd flushed because of prev cmd error */
-   I40E_AQ_RC_BAD_ADDR = 20,   /* Descriptor contains a bad pointer */
-   I40E_AQ_RC_EMODE= 21,   /* Op not allowed in current dev mode */
-   I40E_AQ_RC_EFBIG= 22,   /* File too large */
+   I40E_AQ_RC_OK   = 0,  /* success */
+   I40E_AQ_RC_EPERM= 1,  /* Operation not permitted */

[dpdk-dev] [PATCH 02/15] i40e: support nvmupdate by default

2014-09-09 Thread Helin Zhang
'nvmupdate' is intended to support the userland NVMUpdate tool for the
Fortville EEPROM. These code changes remove the conditional
compile macro and support it by default. In addition, all 'errno'
parameters are renamed to 'err' to avoid any compile warning or error.
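
For background on the rename: 'errno' is not an ordinary identifier. The C
standard allows <errno.h> to define it as a macro, and common C libraries
expand it to an expression, so a parameter declared as 'int *errno' may fail
to compile once that header is visible. The sketch below shows the 'err'
out-parameter pattern; demo_nvm_read() and DEMO_NVM_SIZE are hypothetical
names for illustration and are not part of the i40e shared code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NVM_SIZE 16u	/* hypothetical size of the fake NVM image */

static const uint8_t demo_nvm[DEMO_NVM_SIZE] = { 0xde, 0xad, 0xbe, 0xef };

/*
 * Read one byte of the fake NVM. A POSIX-style error code is returned
 * through 'err'; naming that parameter 'errno' instead would collide
 * with the errno macro from <errno.h>.
 */
static int demo_nvm_read(uint32_t offset, uint8_t *byte, int *err)
{
	/* assume success */
	*err = 0;

	if (byte == NULL || offset >= DEMO_NVM_SIZE) {
		*err = -EINVAL;
		return -1;
	}

	*byte = demo_nvm[offset];
	return 0;
}
```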

Signed-off-by: Helin Zhang 
Reviewed-by: Chen Jing 
---
 lib/librte_pmd_i40e/i40e/i40e_adminq.h |   2 -
 lib/librte_pmd_i40e/i40e/i40e_nvm.c| 120 -
 2 files changed, 59 insertions(+), 63 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e/i40e_adminq.h b/lib/librte_pmd_i40e/i40e/i40e_adminq.h
index 3a59faa..27f2843 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_adminq.h
+++ b/lib/librte_pmd_i40e/i40e/i40e_adminq.h
@@ -110,7 +110,6 @@ struct i40e_adminq_info {
enum i40e_admin_queue_err asq_last_status;
enum i40e_admin_queue_err arq_last_status;
 };
-#ifdef I40E_NVMUPD_SUPPORT

 /**
  * i40e_aq_rc_to_posix - convert errors to user-land codes
@@ -146,7 +145,6 @@ STATIC inline int i40e_aq_rc_to_posix(u16 aq_rc)

return aq_to_posix[aq_rc];
 }
-#endif

 /* general information */
 #define I40E_AQ_LARGE_BUF  512
diff --git a/lib/librte_pmd_i40e/i40e/i40e_nvm.c b/lib/librte_pmd_i40e/i40e/i40e_nvm.c
index 876c451..c62f5eb 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_nvm.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_nvm.c
@@ -478,29 +478,28 @@ enum i40e_status_code i40e_validate_nvm_checksum(struct i40e_hw *hw,
 i40e_validate_nvm_checksum_exit:
return ret_code;
 }
-#ifdef I40E_NVMUPD_SUPPORT

 STATIC enum i40e_status_code i40e_nvmupd_state_init(struct i40e_hw *hw,
struct i40e_nvm_access *cmd,
-   u8 *bytes, int *errno);
+   u8 *bytes, int *err);
 STATIC enum i40e_status_code i40e_nvmupd_state_reading(struct i40e_hw *hw,
struct i40e_nvm_access *cmd,
-   u8 *bytes, int *errno);
+   u8 *bytes, int *err);
 STATIC enum i40e_status_code i40e_nvmupd_state_writing(struct i40e_hw *hw,
struct i40e_nvm_access *cmd,
-   u8 *bytes, int *errno);
+   u8 *bytes, int *err);
 STATIC enum i40e_nvmupd_cmd i40e_nvmupd_validate_command(struct i40e_hw *hw,
struct i40e_nvm_access *cmd,
-   int *errno);
+   int *err);
 STATIC enum i40e_status_code i40e_nvmupd_nvm_erase(struct i40e_hw *hw,
   struct i40e_nvm_access *cmd,
-  int *errno);
+  int *err);
 STATIC enum i40e_status_code i40e_nvmupd_nvm_write(struct i40e_hw *hw,
   struct i40e_nvm_access *cmd,
-  u8 *bytes, int *errno);
+  u8 *bytes, int *err);
 STATIC enum i40e_status_code i40e_nvmupd_nvm_read(struct i40e_hw *hw,
  struct i40e_nvm_access *cmd,
- u8 *bytes, int *errno);
+ u8 *bytes, int *err);
 STATIC inline u8 i40e_nvmupd_get_module(u32 val)
 {
return (u8)(val & I40E_NVM_MOD_PNT_MASK);
@@ -515,38 +514,38 @@ STATIC inline u8 i40e_nvmupd_get_transaction(u32 val)
  * @hw: pointer to hardware structure
  * @cmd: pointer to nvm update command
  * @bytes: pointer to the data buffer
- * @errno: pointer to return error code
+ * @err: pointer to return error code
  *
  * Dispatches command depending on what update state is current
  **/
 enum i40e_status_code i40e_nvmupd_command(struct i40e_hw *hw,
  struct i40e_nvm_access *cmd,
- u8 *bytes, int *errno)
+ u8 *bytes, int *err)
 {
enum i40e_status_code status;

DEBUGFUNC("i40e_nvmupd_command");

/* assume success */
-   *errno = 0;
+   *err = 0;

switch (hw->nvmupd_state) {
case I40E_NVMUPD_STATE_INIT:
-   status = i40e_nvmupd_state_init(hw, cmd, bytes, errno);
+   status = i40e_nvmupd_state_init(hw, cmd, bytes, err);
break;

case I40E_NVMUPD_STATE_READING:
-   status = i40e_nvmupd_state_reading(hw, cmd, bytes, errno);
+   status = i40e_nvmupd_state_reading(hw, cmd, bytes, err);
break;

case I40E_NVMUPD_STATE_WRITING:
-   status = i40e_nvmupd_sta

[dpdk-dev] [PATCH 03/15] i40e: remove useless code which was written for Solaris

2014-09-09 Thread Helin Zhang
The code wrapped in '#ifdef DMA_SYNC_SUPPORT' was written specifically
for Solaris; it is not needed by other platforms, including DPDK.

Signed-off-by: Helin Zhang 
Reviewed-by: Chen Jing 
---
 lib/librte_pmd_i40e/i40e/i40e_adminq.c | 19 ---
 1 file changed, 19 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e/i40e_adminq.c b/lib/librte_pmd_i40e/i40e/i40e_adminq.c
index d078cea..9b5a294 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_adminq.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_adminq.c
@@ -690,9 +690,6 @@ u16 i40e_clean_asq(struct i40e_hw *hw)

desc = I40E_ADMINQ_DESC(*asq, ntc);
details = I40E_ADMINQ_DETAILS(*asq, ntc);
-#ifdef DMA_SYNC_SUPPORT
-   I40E_DMA_SYNC(&hw->aq.asq.desc_buf, I40E_SYNC_FORKERNEL);
-#endif /* DMA_SYNC_SUPPORT */
while (rd32(hw, hw->aq.asq.head) != ntc) {
i40e_debug(hw, I40E_DEBUG_AQ_MESSAGE,
   "%s: ntc %d head %d.\n", __FUNCTION__, ntc,
@@ -866,14 +863,8 @@ enum i40e_status_code i40e_asq_send_command(struct i40e_hw *hw,
CPU_TO_LE32(I40E_HI_DWORD(dma_buff->pa));
desc_on_ring->params.external.addr_low =
CPU_TO_LE32(I40E_LO_DWORD(dma_buff->pa));
-#ifdef DMA_SYNC_SUPPORT
-   I40E_DMA_SYNC(dma_buff, I40E_SYNC_FORDEVICE);
-#endif /* DMA_SYNC_SUPPORT */
}

-#ifdef DMA_SYNC_SUPPORT
-   I40E_DMA_SYNC(&hw->aq.asq.desc_buf, I40E_SYNC_FORDEVICE);
-#endif /* DMA_SYNC_SUPPORT */
/* bump the tail */
i40e_debug(hw, I40E_DEBUG_AQ_MESSAGE, "AQTX: desc and buffer:\n");
i40e_debug_aq(hw, I40E_DEBUG_AQ_COMMAND, (void *)desc_on_ring, buff);
@@ -904,9 +895,6 @@ enum i40e_status_code i40e_asq_send_command(struct i40e_hw *hw,

/* if ready, copy the desc back to temp */
if (i40e_asq_done(hw)) {
-#ifdef DMA_SYNC_SUPPORT
-   I40E_DMA_SYNC(&hw->aq.asq.desc_buf, I40E_SYNC_FORKERNEL);
-#endif /* DMA_SYNC_SUPPORT */
i40e_memcpy(desc, desc_on_ring, sizeof(struct i40e_aq_desc),
I40E_DMA_TO_NONDMA);
if (buff != NULL)
@@ -995,9 +983,6 @@ enum i40e_status_code i40e_clean_arq_element(struct i40e_hw *hw,
u16 datalen;
u16 flags;
u16 ntu;
-#ifdef DMA_SYNC_SUPPORT
-   I40E_DMA_SYNC(&hw->aq.arq.desc_buf, I40E_SYNC_FORKERNEL);
-#endif /* DMA_SYNC_SUPPORT */

/* take the lock before we start messing with the ring */
i40e_acquire_spinlock(&hw->aq.arq_spinlock);
@@ -1016,10 +1001,6 @@ enum i40e_status_code i40e_clean_arq_element(struct i40e_hw *hw,
/* now clean the next descriptor */
desc = I40E_ADMINQ_DESC(hw->aq.arq, ntc);
desc_idx = ntc;
-#ifdef DMA_SYNC_SUPPORT
-   I40E_DMA_SYNC(&hw->aq.arq.r.arq_bi[desc_idx], I40E_SYNC_FORKERNEL);
-#endif /* DMA_SYNC_SUPPORT */
-
flags = LE16_TO_CPU(desc->flags);
if (flags & I40E_AQ_FLAG_ERR) {
ret_code = I40E_ERR_ADMIN_QUEUE_ERROR;
-- 
1.8.1.4



[dpdk-dev] [PATCH 04/15] i40e: remove test code for 'ethtool'

2014-09-09 Thread Helin Zhang
The code wrapped in '#ifdef ETHTOOL_TEST' in i40e_diag.c is for
ethtool testing only; it is no longer needed and can be removed.

Signed-off-by: Helin Zhang 
Reviewed-by: Chen Jing 
---
 lib/librte_pmd_i40e/i40e/i40e_diag.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e/i40e_diag.c b/lib/librte_pmd_i40e/i40e/i40e_diag.c
index f24bf81..167fcf8 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_diag.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_diag.c
@@ -71,11 +71,6 @@ static enum i40e_status_code i40e_diag_reg_pattern_test(struct i40e_hw *hw,
wr32(hw, reg, (pat & mask));
val = rd32(hw, reg);
if ((val & mask) != (pat & mask)) {
-#ifdef ETHTOOL_TEST
-   i40e_debug(hw, I40E_DEBUG_DIAG,
-  "%s: reg pattern test failed - reg 0x%08x pat 0x%08x val 0x%08x\n",
-  __func__, reg, pat, val);
-#endif
return I40E_ERR_DIAG_TEST_FAILED;
}
}
@@ -83,11 +78,6 @@ static enum i40e_status_code i40e_diag_reg_pattern_test(struct i40e_hw *hw,
wr32(hw, reg, orig_val);
val = rd32(hw, reg);
if (val != orig_val) {
-#ifdef ETHTOOL_TEST
-   i40e_debug(hw, I40E_DEBUG_DIAG,
-  "%s: reg restore test failed - reg 0x%08x orig_val 0x%08x val 0x%08x\n",
-  __func__, reg, orig_val, val);
-#endif
return I40E_ERR_DIAG_TEST_FAILED;
}

-- 
1.8.1.4



[dpdk-dev] [PATCH 07/15] i40e: Get rid of sparse warnings, and remove unreachable code

2014-09-09 Thread Helin Zhang
Some variables hold little-endian values. Declaring them with the
'__le' prefixed types (__le16, __le32, __le64) removes warnings during
sparse checks. In addition, remove some unreachable 'break' statements,
and add 'UL' to a couple of constants.
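
To illustrate why the endian-annotated types help, the sketch below keeps a
value that holds raw little-endian bytes in a separate type and converts
explicitly at the boundary. The type demo_le16 and the demo_* helpers are
hypothetical stand-ins, not the i40e or kernel definitions, and a plain
typedef like this is only documentation for a normal compiler; the actual
checking is done by tools such as sparse on properly annotated types.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical marker type: the bytes are in little-endian order. */
typedef uint16_t demo_le16;

/* Convert a CPU-order value to little-endian storage, on any host. */
static demo_le16 demo_cpu_to_le16(uint16_t v)
{
	uint8_t b[2] = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
	demo_le16 out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Convert little-endian storage back to a CPU-order value. */
static uint16_t demo_le16_to_cpu(demo_le16 v)
{
	uint8_t b[2];

	memcpy(b, &v, sizeof(b));
	return (uint16_t)(b[0] | (b[1] << 8));
}

/* Merge a masked, shifted field into a 16-bit little-endian word at dest. */
static void demo_write_field(uint8_t *dest, uint16_t value, uint16_t mask,
			     unsigned int shift)
{
	uint16_t src_word = (uint16_t)((value & mask) << shift);
	uint16_t shifted_mask = (uint16_t)(mask << shift);
	demo_le16 dest_word;	/* same byte order as the destination buffer */

	memcpy(&dest_word, dest, sizeof(dest_word));
	dest_word &= (demo_le16)~demo_cpu_to_le16(shifted_mask);
	dest_word |= demo_cpu_to_le16(src_word);
	memcpy(dest, &dest_word, sizeof(dest_word));
}
```

Only dest_word carries the little-endian type here, because it is the only
variable that ever holds bytes exactly as they sit in the buffer.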

Signed-off-by: Helin Zhang 
Reviewed-by: Chen Jing 
---
 lib/librte_pmd_i40e/i40e/i40e_lan_hmc.c | 24 ++--
 lib/librte_pmd_i40e/i40e/i40e_lan_hmc.h |  1 -
 2 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e/i40e_lan_hmc.c b/lib/librte_pmd_i40e/i40e/i40e_lan_hmc.c
index 9f98d6d..b08534b 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_lan_hmc.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_lan_hmc.c
@@ -424,7 +424,6 @@ enum i40e_status_code i40e_create_lan_hmc_object(struct i40e_hw *hw,
default:
ret_code = I40E_ERR_INVALID_SD_TYPE;
goto exit;
-   break;
}
}
}
@@ -509,7 +508,6 @@ try_type_paged:
DEBUGOUT1("i40e_configure_lan_hmc: Unknown SD type: %d\n",
  ret_code);
goto configure_lan_hmc_out;
-   break;
}

/* Configure and program the FPM registers so objects can be created */
@@ -803,9 +801,10 @@ static void i40e_write_word(u8 *hmc_bits,
struct i40e_context_ele *ce_info,
u8 *src)
 {
-   u16 src_word, dest_word, mask;
+   u16 src_word, mask;
u8 *from, *dest;
u16 shift_width;
+   __le16 dest_word;

/* copy from the next struct field */
from = src + ce_info->offset;
@@ -846,9 +845,10 @@ static void i40e_write_dword(u8 *hmc_bits,
 struct i40e_context_ele *ce_info,
 u8 *src)
 {
-   u32 src_dword, dest_dword, mask;
+   u32 src_dword, mask;
u8 *from, *dest;
u16 shift_width;
+   __le32 dest_dword;

/* copy from the next struct field */
from = src + ce_info->offset;
@@ -897,9 +897,10 @@ static void i40e_write_qword(u8 *hmc_bits,
 struct i40e_context_ele *ce_info,
 u8 *src)
 {
-   u64 src_qword, dest_qword, mask;
+   u64 src_qword, mask;
u8 *from, *dest;
u16 shift_width;
+   __le64 dest_qword;

/* copy from the next struct field */
from = src + ce_info->offset;
@@ -914,7 +915,7 @@ static void i40e_write_qword(u8 *hmc_bits,
if (ce_info->width < 64)
mask = ((u64)1 << ce_info->width) - 1;
else
-   mask = 0x;
+   mask = 0xUL;

/* don't swizzle the bits until after the mask because the mask bits
 * will be in a different bit position on big endian machines
@@ -985,9 +986,10 @@ static void i40e_read_word(u8 *hmc_bits,
   struct i40e_context_ele *ce_info,
   u8 *dest)
 {
-   u16 src_word, dest_word, mask;
+   u16 dest_word, mask;
u8 *src, *target;
u16 shift_width;
+   __le16 src_word;

/* prepare the bits and mask */
shift_width = ce_info->lsb % 8;
@@ -1028,9 +1030,10 @@ static void i40e_read_dword(u8 *hmc_bits,
struct i40e_context_ele *ce_info,
u8 *dest)
 {
-   u32 src_dword, dest_dword, mask;
+   u32 dest_dword, mask;
u8 *src, *target;
u16 shift_width;
+   __le32 src_dword;

/* prepare the bits and mask */
shift_width = ce_info->lsb % 8;
@@ -1080,9 +1083,10 @@ static void i40e_read_qword(u8 *hmc_bits,
struct i40e_context_ele *ce_info,
u8 *dest)
 {
-   u64 src_qword, dest_qword, mask;
+   u64 dest_qword, mask;
u8 *src, *target;
u16 shift_width;
+   __le64 src_qword;

/* prepare the bits and mask */
shift_width = ce_info->lsb % 8;
@@ -1094,7 +1098,7 @@ static void i40e_read_qword(u8 *hmc_bits,
if (ce_info->width < 64)
mask = ((u64)1 << ce_info->width) - 1;
else
-   mask = 0x;
+   mask = 0xUL;

/* shift to correct alignment */
mask <<= shift_width;
diff --git a/lib/librte_pmd_i40e/i40e/i40e_lan_hmc.h b/lib/librte_pmd_i40e/i40e/i40e_lan_hmc.h
index f0f0f89..70ef65c 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_lan_hmc.h
+++ b/lib/librte_pmd_i40e/i40e/i40e_lan_hmc.h
@@ -36,7 +36,6 @@ POSSIBILITY OF SUCH DAMAGE.

 /* forward-declare the HW struct for the compiler */
 struct i40e_hw;
-enum i40e_status_code;

 /* HMC element context information */

-- 
1.8.1.4
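
The mask logic visible in i40e_write_qword() and i40e_read_qword() above can be sketched in isolation. This is a minimal sketch, not the driver code: the helper name field_mask() is hypothetical, and UINT64_MAX stands in for the all-ones constant, whose digits are cut off in this archived copy of the patch.

```c
#include <stdint.h>

/* Hypothetical helper: return a mask with the low `width` bits set. */
static uint64_t field_mask(unsigned int width)
{
	/* For fields narrower than 64 bits, (1 << width) - 1 sets the low bits. */
	if (width < 64)
		return ((uint64_t)1 << width) - 1;

	/*
	 * Shifting a 64-bit value by 64 is undefined behavior in C, so a
	 * full-width field needs an explicit all-ones constant instead.
	 * UINT64_MAX is an assumption standing in for the patch's literal.
	 */
	return UINT64_MAX;
}
```

For example, field_mask(9) yields 0x1FF, which matches the width of a 9-bit field such as l3_len.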



[dpdk-dev] [PATCH 08/15] i40e: remove code which is for software validation only

2014-09-09 Thread Helin Zhang
The code wrapped in '#ifdef I40E_DCB_SW' is currently used for software
validation only, so it should be removed entirely.

Signed-off-by: Helin Zhang 
Reviewed-by: Chen Jing 
---
 lib/librte_pmd_i40e/i40e/i40e_common.c|  27 --
 lib/librte_pmd_i40e/i40e/i40e_dcb.c   | 625 --
 lib/librte_pmd_i40e/i40e/i40e_dcb.h   | 103 -
 lib/librte_pmd_i40e/i40e/i40e_prototype.h |   6 -
 lib/librte_pmd_i40e/i40e/i40e_type.h  |  40 --
 5 files changed, 801 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e/i40e_common.c 
b/lib/librte_pmd_i40e/i40e/i40e_common.c
index 4254aad..4f11542 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_common.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_common.c
@@ -4575,33 +4575,6 @@ enum i40e_status_code i40e_aq_set_oem_mode(struct 
i40e_hw *hw,

return status;
 }
-#ifdef I40E_DCB_SW
-
-/**
- * i40e_aq_suspend_port_tx
- * @hw: pointer to the hardware structure
- * @seid: port seid
- * @cmd_details: pointer to command details structure or NULL
- *
- * Suspend port's Tx traffic
- **/
-enum i40e_status_code i40e_aq_suspend_port_tx(struct i40e_hw *hw, u16 seid,
-   struct i40e_asq_cmd_details *cmd_details)
-{
-   struct i40e_aq_desc desc;
-   enum i40e_status_code status;
-   struct i40e_aqc_tx_sched_ind *cmd =
-   (struct i40e_aqc_tx_sched_ind *)&desc.params.raw;
-
-   i40e_fill_default_direct_cmd_desc(&desc, i40e_aqc_opc_suspend_port_tx);
-
-   cmd->vsi_seid = CPU_TO_LE16(seid);
-
-   status = i40e_asq_send_command(hw, &desc, NULL, 0, cmd_details);
-
-   return status;
-}
-#endif /* I40E_DCB_SW */

 /**
  * i40e_aq_resume_port_tx
diff --git a/lib/librte_pmd_i40e/i40e/i40e_dcb.c 
b/lib/librte_pmd_i40e/i40e/i40e_dcb.c
index 435cf80..d067028 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_dcb.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_dcb.c
@@ -477,628 +477,3 @@ enum i40e_status_code i40e_init_dcb(struct i40e_hw *hw)

return ret;
 }
-#ifdef I40E_DCB_SW
-
-/**
- * i40e_dcbx_event_handler
- * @hw: pointer to the hw struct
- * @e: event data to be processed (LLDPDU)
- *
- * Process LLDP MIB Change event from the Firmware
- **/
-enum i40e_status_code i40e_process_lldp_event(struct i40e_hw *hw,
- struct i40e_arq_event_info *e)
-{
-   enum i40e_status_code ret = I40E_SUCCESS;
-   UNREFERENCED_2PARAMETER(hw, e);
-
-   return ret;
-}
-
-/**
- * i40e_dcb_hw_rx_fifo_config
- * @hw: pointer to the hw struct
- * @ets_mode: Strict Priority or Round Robin mode
- * @non_ets_mode: Strict Priority or Round Robin
- * @max_exponent: Exponent to calculate max refill credits
- * @lltc_map: Low latency TC bitmap
- *
- * Configure HW Rx FIFO as part of DCB configuration.
- **/
-void i40e_dcb_hw_rx_fifo_config(struct i40e_hw *hw,
-   enum i40e_dcb_arbiter_mode ets_mode,
-   enum i40e_dcb_arbiter_mode non_ets_mode,
-   u32 max_exponent,
-   u8 lltc_map)
-{
-   u32 reg = 0;
-
-   reg = rd32(hw, I40E_PRTDCB_RETSC);
-
-   reg &= ~I40E_PRTDCB_RETSC_ETS_MODE_MASK;
-   reg |= ((u32)ets_mode << I40E_PRTDCB_RETSC_ETS_MODE_SHIFT) &
-   I40E_PRTDCB_RETSC_ETS_MODE_MASK;
-
-   reg &= ~I40E_PRTDCB_RETSC_NON_ETS_MODE_MASK;
-   reg |= ((u32)non_ets_mode << I40E_PRTDCB_RETSC_NON_ETS_MODE_SHIFT) &
-   I40E_PRTDCB_RETSC_NON_ETS_MODE_MASK;
-
-   reg &= ~I40E_PRTDCB_RETSC_ETS_MAX_EXP_MASK;
-   reg |= (max_exponent << I40E_PRTDCB_RETSC_ETS_MAX_EXP_SHIFT) &
-   I40E_PRTDCB_RETSC_ETS_MAX_EXP_MASK;
-
-   reg &= ~I40E_PRTDCB_RETSC_LLTC_MASK;
-   reg |= (lltc_map << I40E_PRTDCB_RETSC_LLTC_SHIFT) &
-   I40E_PRTDCB_RETSC_LLTC_MASK;
-   wr32(hw, I40E_PRTDCB_RETSC, reg);
-}
-
-/**
- * i40e_dcb_hw_rx_cmd_monitor_config
- * @hw: pointer to the hw struct
- * @num_tc: Total number of traffic class
- * @num_ports: Total number of ports on device
- *
- * Configure HW Rx command monitor as part of DCB configuration.
- **/
-void i40e_dcb_hw_rx_cmd_monitor_config(struct i40e_hw *hw,
-  u8 num_tc, u8 num_ports)
-{
-   u32 threshold = 0;
-   u32 fifo_size = 0;
-   u32 reg = 0;
-
-   /* Set the threshold and fifo_size based on number of ports */
-   switch (num_ports) {
-   case 1:
-   threshold = 0xF;
-   fifo_size = 0x10;
-   break;
-   case 2:
-   if (num_tc > 4) {
-   threshold = 0xC;
-   fifo_size = 0x8;
-   } else {
-   threshold = 0xF;
-   fifo_size = 0x10;
-   }
-   break;
-   case 4:
-   if (num_tc > 4) {
-   threshold = 0x6;
-   fifo_size = 0x4;
-   } else {
-   threshold = 0x9;
- 

[dpdk-dev] [PATCH 11/15] i40e: expose debug_write_register request

2014-09-09 Thread Helin Zhang
The firmware API request for writing to hardware registers should be
exposed to the driver. The new API 'i40e_aq_debug_write_register'
is introduced for that.

Signed-off-by: Helin Zhang 
Reviewed-by: Chen Jing 
---
 lib/librte_pmd_i40e/i40e/i40e_common.c| 29 +
 lib/librte_pmd_i40e/i40e/i40e_prototype.h |  7 +--
 2 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e/i40e_common.c 
b/lib/librte_pmd_i40e/i40e/i40e_common.c
index 84af47a..d901c8d 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_common.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_common.c
@@ -2358,6 +2358,35 @@ enum i40e_status_code i40e_aq_send_msg_to_vf(struct 
i40e_hw *hw, u16 vfid,
 }

 /**
+ * i40e_aq_debug_write_register
+ * @hw: pointer to the hw struct
+ * @reg_addr: register address
+ * @reg_val: register value
+ * @cmd_details: pointer to command details structure or NULL
+ *
+ * Write to a register using the admin queue commands
+ **/
+enum i40e_status_code i40e_aq_debug_write_register(struct i40e_hw *hw,
+   u32 reg_addr, u64 reg_val,
+   struct i40e_asq_cmd_details *cmd_details)
+{
+   struct i40e_aq_desc desc;
+   struct i40e_aqc_debug_reg_read_write *cmd =
+   (struct i40e_aqc_debug_reg_read_write *)&desc.params.raw;
+   enum i40e_status_code status;
+
+   i40e_fill_default_direct_cmd_desc(&desc, i40e_aqc_opc_debug_write_reg);
+
+   cmd->address = CPU_TO_LE32(reg_addr);
+   cmd->value_high = CPU_TO_LE32((u32)(reg_val >> 32));
+   cmd->value_low = CPU_TO_LE32((u32)(reg_val & 0x));
+
+   status = i40e_asq_send_command(hw, &desc, NULL, 0, cmd_details);
+
+   return status;
+}
+
+/**
  * i40e_aq_get_hmc_resource_profile
  * @hw: pointer to the hw struct
  * @profile: type of profile the HMC is to be set as
diff --git a/lib/librte_pmd_i40e/i40e/i40e_prototype.h 
b/lib/librte_pmd_i40e/i40e/i40e_prototype.h
index e559569..f819f9a 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_prototype.h
+++ b/lib/librte_pmd_i40e/i40e/i40e_prototype.h
@@ -90,6 +90,9 @@ enum i40e_status_code i40e_aq_get_firmware_version(struct 
i40e_hw *hw,
u16 *fw_major_version, u16 *fw_minor_version,
u16 *api_major_version, u16 *api_minor_version,
struct i40e_asq_cmd_details *cmd_details);
+enum i40e_status_code i40e_aq_debug_write_register(struct i40e_hw *hw,
+   u32 reg_addr, u64 reg_val,
+   struct i40e_asq_cmd_details *cmd_details);
 enum i40e_status_code i40e_aq_set_phy_debug(struct i40e_hw *hw, u8 cmd_flags,
struct i40e_asq_cmd_details *cmd_details);
 enum i40e_status_code i40e_aq_set_default_vsi(struct i40e_hw *hw, u16 vsi_id,
@@ -103,11 +106,11 @@ enum i40e_status_code i40e_aq_set_phy_config(struct 
i40e_hw *hw,
struct i40e_asq_cmd_details *cmd_details);
 enum i40e_status_code i40e_set_fc(struct i40e_hw *hw, u8 *aq_failures,
  bool atomic_reset);
+enum i40e_status_code i40e_aq_set_phy_int_mask(struct i40e_hw *hw, u16 mask,
+   struct i40e_asq_cmd_details *cmd_details);
 enum i40e_status_code i40e_aq_set_mac_config(struct i40e_hw *hw,
u16 max_frame_size, bool crc_en, u16 pacing,
struct i40e_asq_cmd_details *cmd_details);
-enum i40e_status_code i40e_aq_set_phy_int_mask(struct i40e_hw *hw, u16 mask,
-   struct i40e_asq_cmd_details *cmd_details);
 enum i40e_status_code i40e_aq_get_local_advt_reg(struct i40e_hw *hw,
u64 *advt_reg,
struct i40e_asq_cmd_details *cmd_details);
-- 
1.8.1.4
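
The way i40e_aq_debug_write_register() splits the 64-bit register value into two 32-bit descriptor words can be shown on its own. This is a rough sketch under stated assumptions: the struct and function names are hypothetical, the low-word mask is written out as 0xFFFFFFFF because the literal is truncated in this archived copy, and the CPU_TO_LE32 byte-order conversion done by the real code is left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two value words of the AQ command. */
struct reg_write_words {
	uint32_t value_high;
	uint32_t value_low;
};

/* Split a 64-bit register value into its upper and lower 32-bit halves. */
static struct reg_write_words split_reg_val(uint64_t reg_val)
{
	struct reg_write_words w;

	w.value_high = (uint32_t)(reg_val >> 32);
	/* Assumed mask: keep only the low 32 bits. */
	w.value_low = (uint32_t)(reg_val & 0xFFFFFFFFu);
	return w;
}
```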



[dpdk-dev] [PATCH 12/15] i40e: workaround of get_firmware_version, and enhancements

2014-09-09 Thread Helin Zhang
The workaround fixes the reported API version in software if the FW is 4.2 or later.
In addition, an unreachable 'break' statement has been removed.

Signed-off-by: Helin Zhang 
Reviewed-by: Chen Jing 
---
 lib/librte_pmd_i40e/i40e/i40e_adminq.c |  5 ++---
 lib/librte_pmd_i40e/i40e/i40e_common.c | 12 ++--
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e/i40e_adminq.c 
b/lib/librte_pmd_i40e/i40e/i40e_adminq.c
index 9b5a294..80da710 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_adminq.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_adminq.c
@@ -879,7 +879,6 @@ enum i40e_status_code i40e_asq_send_command(struct i40e_hw 
*hw,
 */
if (!details->async && !details->postpone) {
u32 total_delay = 0;
-   u32 delay_len = 1;

do {
/* AQ designers suggest use of head for better
@@ -888,8 +887,8 @@ enum i40e_status_code i40e_asq_send_command(struct i40e_hw 
*hw,
if (i40e_asq_done(hw))
break;
/* ugh! delay while spin_lock */
-   i40e_msec_delay(delay_len);
-   total_delay += delay_len;
+   i40e_msec_delay(1);
+   total_delay++;
} while (total_delay < hw->aq.asq_cmd_timeout);
}

diff --git a/lib/librte_pmd_i40e/i40e/i40e_common.c 
b/lib/librte_pmd_i40e/i40e/i40e_common.c
index d901c8d..60ca943 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_common.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_common.c
@@ -571,7 +571,6 @@ enum i40e_status_code i40e_init_shared_code(struct i40e_hw 
*hw)
break;
default:
return I40E_ERR_DEVICE_NOT_SUPPORTED;
-   break;
}

hw->phy.get_link_info = true;
@@ -872,6 +871,7 @@ enum i40e_status_code i40e_pf_reset(struct i40e_hw *hw)

i40e_clear_pxe_mode(hw);

+
return I40E_SUCCESS;
 }

@@ -1946,6 +1946,14 @@ enum i40e_status_code 
i40e_aq_get_firmware_version(struct i40e_hw *hw,
*api_major_version = LE16_TO_CPU(resp->api_major);
if (api_minor_version != NULL)
*api_minor_version = LE16_TO_CPU(resp->api_minor);
+
+   /* A workaround to fix the API version in SW */
+   if (api_major_version && api_minor_version &&
+   fw_major_version && fw_minor_version &&
+   ((*api_major_version == 1) && (*api_minor_version == 1)) &&
+   (((*fw_major_version == 4) && (*fw_minor_version >= 2)) ||
+(*fw_major_version > 4)))
+   *api_minor_version = 2;
}

return status;
@@ -4713,6 +4721,7 @@ enum i40e_status_code i40e_aq_send_msg_to_pf(struct 
i40e_hw *hw,
struct i40e_asq_cmd_details *cmd_details)
 {
struct i40e_aq_desc desc;
+   struct i40e_asq_cmd_details details;
enum i40e_status_code status;

i40e_fill_default_direct_cmd_desc(&desc, i40e_aqc_opc_send_msg_to_pf);
@@ -4727,7 +4736,6 @@ enum i40e_status_code i40e_aq_send_msg_to_pf(struct 
i40e_hw *hw,
desc.datalen = CPU_TO_LE16(msglen);
}
if (!cmd_details) {
-   struct i40e_asq_cmd_details details;
i40e_memset(&details, 0, sizeof(details), I40E_NONDMA_MEM);
details.async = true;
cmd_details = &details;
-- 
1.8.1.4
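
The version check added to i40e_aq_get_firmware_version() can be read as a small pure function. The sketch below restates the condition from the patch without the NULL-pointer guards; the function name fixup_api_minor() is hypothetical and not part of the driver.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the API minor version to report.
 * If the API reports 1.1 but the firmware is 4.2 or later,
 * the minor version is corrected to 2, as in the patch.
 */
static uint16_t fixup_api_minor(uint16_t fw_major, uint16_t fw_minor,
				uint16_t api_major, uint16_t api_minor)
{
	int api_is_1_1 = (api_major == 1) && (api_minor == 1);
	int fw_is_4_2_or_later = (fw_major > 4) ||
				 ((fw_major == 4) && (fw_minor >= 2));

	if (api_is_1_1 && fw_is_4_2_or_later)
		return 2;

	return api_minor;
}
```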



[dpdk-dev] [PATCH 15/15] i40e: support redefined struct of 'i40e_arq_event_info'

2014-09-09 Thread Helin Zhang
Since the 'i40e_arq_event_info' struct in the shared code has
been redefined, corresponding changes are needed in the PMD to
support that.

Signed-off-by: Helin Zhang 
Reviewed-by: Chen Jing 
---
 lib/librte_pmd_i40e/i40e_ethdev.c|  8 +++-
 lib/librte_pmd_i40e/i40e_ethdev_vf.c | 10 +-
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c 
b/lib/librte_pmd_i40e/i40e_ethdev.c
index 4e65ca4..ed73389 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -3334,8 +3334,8 @@ i40e_dev_handle_aq_msg(struct rte_eth_dev *dev)
uint16_t pending, opcode;
int ret;

-   info.msg_size = I40E_AQ_BUF_SZ;
-   info.msg_buf = rte_zmalloc("msg_buffer", I40E_AQ_BUF_SZ, 0);
+   info.buf_len = I40E_AQ_BUF_SZ;
+   info.msg_buf = rte_zmalloc("msg_buffer", info.buf_len, 0);
if (!info.msg_buf) {
PMD_DRV_LOG(ERR, "Failed to allocate mem\n");
return;
@@ -3360,15 +3360,13 @@ i40e_dev_handle_aq_msg(struct rte_eth_dev *dev)
rte_le_to_cpu_32(info.desc.cookie_high),
rte_le_to_cpu_32(info.desc.cookie_low),
info.msg_buf,
-   info.msg_size);
+   info.msg_len);
break;
default:
PMD_DRV_LOG(ERR, "Request %u is not supported yet\n",
opcode);
break;
}
-   /* Reset the buffer after processing one */
-   info.msg_size = I40E_AQ_BUF_SZ;
}
rte_free(info.msg_buf);
 }
diff --git a/lib/librte_pmd_i40e/i40e_ethdev_vf.c 
b/lib/librte_pmd_i40e/i40e_ethdev_vf.c
index d8552ad..b639486 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev_vf.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev_vf.c
@@ -78,6 +78,7 @@
 struct i40evf_arq_msg_info {
enum i40e_virtchnl_ops ops;
enum i40e_status_code result;
+   uint16_t buf_len;
uint16_t msg_len;
uint8_t *msg;
 };
@@ -226,8 +227,8 @@ i40evf_parse_pfmsg(struct i40e_vf *vf,
} else {
/* async reply msg on command issued by vf previously */
ret = I40EVF_MSG_CMD;
-   /* Actual buffer length read from PF */
-   data->msg_len = event->msg_size;
+   /* Actual data length read from PF */
+   data->msg_len = event->msg_len;
}
/* fill the ops and result to notify VF */
data->result = retval;
@@ -248,7 +249,7 @@ i40evf_read_pfmsg(struct rte_eth_dev *dev, struct 
i40evf_arq_msg_info *data)
int ret;
enum i40evf_aq_result result = I40EVF_MSG_NON;

-   event.msg_size = data->msg_len;
+   event.buf_len = data->buf_len;
event.msg_buf = data->msg;
ret = i40e_clean_arq_element(hw, &event, NULL);
/* Can't read any msg from adminQ */
@@ -282,7 +283,6 @@ i40evf_wait_cmd_done(struct rte_eth_dev *dev,
/* Delay some time first */
rte_delay_ms(ASQ_DELAY_MS);
ret = i40evf_read_pfmsg(dev, data);
-
if (ret == I40EVF_MSG_CMD)
return 0;
else if (ret == I40EVF_MSG_ERR)
@@ -332,7 +332,7 @@ i40evf_execute_vf_cmd(struct rte_eth_dev *dev, struct 
vf_cmd_info *args)
return -1;

info.msg = args->out_buffer;
-   info.msg_len = args->out_size;
+   info.buf_len = args->out_size;
info.ops = I40E_VIRTCHNL_OP_UNKNOWN;
info.result = I40E_SUCCESS;

-- 
1.8.1.4



[dpdk-dev] [PATCH 13/15] i40e: Use get_link_status to report FC settings

2014-09-09 Thread Helin Zhang
The fix is to use get_link_status rather than get_phy_capabilities
for reporting FC settings.

Signed-off-by: Helin Zhang 
Reviewed-by: Chen Jing 
---
 lib/librte_pmd_i40e/i40e/i40e_common.c | 38 --
 lib/librte_pmd_i40e/i40e/i40e_type.h   |  8 ---
 2 files changed, 18 insertions(+), 28 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e/i40e_common.c 
b/lib/librte_pmd_i40e/i40e/i40e_common.c
index 60ca943..ffd68a5 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_common.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_common.c
@@ -1203,7 +1203,7 @@ enum i40e_status_code i40e_set_fc(struct i40e_hw *hw, u8 
*aq_failures,
status = i40e_aq_get_phy_capabilities(hw, false, false, &abilities,
  NULL);
if (status) {
-   *aq_failures |= I40E_SET_FC_AQ_FAIL_GET1;
+   *aq_failures |= I40E_SET_FC_AQ_FAIL_GET;
return status;
}

@@ -1228,31 +1228,19 @@ enum i40e_status_code i40e_set_fc(struct i40e_hw *hw, 
u8 *aq_failures,

if (status)
*aq_failures |= I40E_SET_FC_AQ_FAIL_SET;
-
-   /* Get the abilities to set hw->fc.current_mode correctly */
-   status = i40e_aq_get_phy_capabilities(hw, false, false,
- &abilities, NULL);
-   if (status) {
-   /* Wait a little bit and try once more */
-   i40e_msec_delay(1000);
-   status = i40e_aq_get_phy_capabilities(hw, false, false,
- &abilities, NULL);
-   }
-   if (status) {
-   *aq_failures |= I40E_SET_FC_AQ_FAIL_GET2;
-   return status;
-   }
}
-   /* Copy the what was returned from get capabilities into fc */
-   if ((abilities.abilities & I40E_AQ_PHY_FLAG_PAUSE_TX) &&
-   (abilities.abilities & I40E_AQ_PHY_FLAG_PAUSE_RX))
-   hw->fc.current_mode = I40E_FC_FULL;
-   else if (abilities.abilities & I40E_AQ_PHY_FLAG_PAUSE_TX)
-   hw->fc.current_mode = I40E_FC_TX_PAUSE;
-   else if (abilities.abilities & I40E_AQ_PHY_FLAG_PAUSE_RX)
-   hw->fc.current_mode = I40E_FC_RX_PAUSE;
-   else
-   hw->fc.current_mode = I40E_FC_NONE;
+   /* Update the link info */
+   status = i40e_update_link_info(hw, true);
+   if (status) {
+   /* Wait a little bit (on 40G cards it sometimes takes a really
+* long time for link to come back from the atomic reset)
+* and try once more
+*/
+   i40e_msec_delay(1000);
+   status = i40e_update_link_info(hw, true);
+   }
+   if (status)
+   *aq_failures |= I40E_SET_FC_AQ_FAIL_UPDATE;

return status;
 }
diff --git a/lib/librte_pmd_i40e/i40e/i40e_type.h 
b/lib/librte_pmd_i40e/i40e/i40e_type.h
index 737a4c1..bb87640 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_type.h
+++ b/lib/librte_pmd_i40e/i40e/i40e_type.h
@@ -68,8 +68,10 @@ POSSIBILITY OF SUCH DAMAGE.
 (d) == I40E_DEV_ID_QSFP_B  || \
 (d) == I40E_DEV_ID_QSFP_C)

+#ifndef I40E_MASK
 /* I40E_MASK is a macro used on 32 bit registers */
 #define I40E_MASK(mask, shift) (mask << shift)
+#endif

 #define I40E_MAX_PF 16
 #define I40E_MAX_PF_VSI 64
@@ -216,10 +218,10 @@ enum i40e_fc_mode {

 enum i40e_set_fc_aq_failures {
I40E_SET_FC_AQ_FAIL_NONE = 0,
-   I40E_SET_FC_AQ_FAIL_GET1 = 1,
+   I40E_SET_FC_AQ_FAIL_GET = 1,
I40E_SET_FC_AQ_FAIL_SET = 2,
-   I40E_SET_FC_AQ_FAIL_GET2 = 4,
-   I40E_SET_FC_AQ_FAIL_SET_GET = 6
+   I40E_SET_FC_AQ_FAIL_UPDATE = 4,
+   I40E_SET_FC_AQ_FAIL_SET_UPDATE = 6
 };

 enum i40e_vsi_type {
-- 
1.8.1.4
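
The renamed failure codes are bit flags that i40e_set_fc() ORs into *aq_failures, which is why the combined value 6 equals SET | UPDATE. The following sketch mirrors the enum values from the patch; the helper fc_stage_failed() is hypothetical and only illustrates how a caller might test a single stage.

```c
#include <stdint.h>

/* Values copied from the patched enum i40e_set_fc_aq_failures. */
enum i40e_set_fc_aq_failures {
	I40E_SET_FC_AQ_FAIL_NONE = 0,
	I40E_SET_FC_AQ_FAIL_GET = 1,
	I40E_SET_FC_AQ_FAIL_SET = 2,
	I40E_SET_FC_AQ_FAIL_UPDATE = 4,
	I40E_SET_FC_AQ_FAIL_SET_UPDATE = 6
};

/* Hypothetical helper: non-zero if the given stage is flagged as failed. */
static int fc_stage_failed(uint8_t aq_failures,
			   enum i40e_set_fc_aq_failures stage)
{
	return (aq_failures & (uint8_t)stage) != 0;
}
```

A caller that sees both the SET and UPDATE bits set can compare the whole value against I40E_SET_FC_AQ_FAIL_SET_UPDATE instead of testing each bit.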



[dpdk-dev] [PATCH 05/15] i40e: force a shifted '1' to be 'unsigned'

2014-09-09 Thread Helin Zhang
Force a shifted '1' to be 'unsigned' to avoid shifting a signed int.

Signed-off-by: Helin Zhang 
Reviewed-by: Chen Jing 
---
 lib/librte_pmd_i40e/i40e/i40e_hmc.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e/i40e_hmc.h 
b/lib/librte_pmd_i40e/i40e/i40e_hmc.h
index d1d084a..eb629fc 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_hmc.h
+++ b/lib/librte_pmd_i40e/i40e/i40e_hmc.h
@@ -38,7 +38,6 @@ POSSIBILITY OF SUCH DAMAGE.

 /* forward-declare the HW struct for the compiler */
 struct i40e_hw;
-enum i40e_status_code;

 #define I40E_HMC_INFO_SIGNATURE 0x484D5347 /* HMSG */
 #define I40E_HMC_PD_CNT_IN_SD  512
@@ -135,7 +134,7 @@ struct i40e_hmc_info {
type) == I40E_SD_TYPE_PAGED) ? 0 : 1) <<\
I40E_PFHMC_SDDATALOW_PMSDTYPE_SHIFT) |  \
(1 << I40E_PFHMC_SDDATALOW_PMSDVALID_SHIFT);\
-   val3 = (sd_index) | (1 << I40E_PFHMC_SDCMD_PMSDWR_SHIFT);   \
+   val3 = (sd_index) | (1u << I40E_PFHMC_SDCMD_PMSDWR_SHIFT);  \
wr32((hw), I40E_PFHMC_SDDATAHIGH, val1);\
wr32((hw), I40E_PFHMC_SDDATALOW, val2); \
wr32((hw), I40E_PFHMC_SDCMD, val3); \
@@ -154,7 +153,7 @@ struct i40e_hmc_info {
I40E_PFHMC_SDDATALOW_PMSDBPCOUNT_SHIFT) |   \
type) == I40E_SD_TYPE_PAGED) ? 0 : 1) <<\
I40E_PFHMC_SDDATALOW_PMSDTYPE_SHIFT);   \
-   val3 = (sd_index) | (1 << I40E_PFHMC_SDCMD_PMSDWR_SHIFT);   \
+   val3 = (sd_index) | (1u << I40E_PFHMC_SDCMD_PMSDWR_SHIFT);  \
wr32((hw), I40E_PFHMC_SDDATAHIGH, 0);   \
wr32((hw), I40E_PFHMC_SDDATALOW, val2); \
wr32((hw), I40E_PFHMC_SDCMD, val3); \
-- 
1.8.1.4
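
The reason for the '1u' change can be illustrated with a small standalone function. In C, left-shifting the signed int 1 into the sign bit (a shift of 31 on a 32-bit int) is undefined behavior, while the same shift on an unsigned value is well defined. The helper name set_write_bit() is hypothetical, and the shift of 31 used in the example below is an assumption rather than a value taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: OR a single "write" flag bit into a command word.
 * Using 1u keeps the shift in unsigned arithmetic, so shifting into
 * bit 31 is well defined.
 */
static uint32_t set_write_bit(uint32_t sd_index, unsigned int shift)
{
	return sd_index | (1u << shift);
}
```

With an assumed shift of 31, set_write_bit(5, 31) gives 0x80000005.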



[dpdk-dev] [PATCH 06/15] i40e: remove useless code for pre-boot support

2014-09-09 Thread Helin Zhang
The code wrapped in '#ifdef PREBOOT_SUPPORT' was added for
queue context initialization specifically for A0 silicon.
As A0 silicon has been obsolete for a long time, the code
should be removed entirely. In addition, the checks of
'QV_RELEASE' and 'PREBOOT_SUPPORT' are no longer needed and
can be removed.

Signed-off-by: Helin Zhang 
Reviewed-by: Chen Jing 
---
 lib/librte_pmd_i40e/i40e/i40e_common.c  |   3 -
 lib/librte_pmd_i40e/i40e/i40e_lan_hmc.c | 203 
 lib/librte_pmd_i40e/i40e/i40e_lan_hmc.h |  13 --
 3 files changed, 219 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e/i40e_common.c 
b/lib/librte_pmd_i40e/i40e/i40e_common.c
index 6cdc0ff..4254aad 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_common.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_common.c
@@ -922,11 +922,8 @@ enum i40e_status_code i40e_pf_reset(struct i40e_hw *hw)
}
}

-#if !defined(QV_RELEASE) && !defined(PREBOOT_SUPPORT)
i40e_clear_pxe_mode(hw);

-#endif
-
return I40E_SUCCESS;
 }

diff --git a/lib/librte_pmd_i40e/i40e/i40e_lan_hmc.c 
b/lib/librte_pmd_i40e/i40e/i40e_lan_hmc.c
index d5e7d44..9f98d6d 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_lan_hmc.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_lan_hmc.c
@@ -1411,206 +1411,3 @@ enum i40e_status_code 
i40e_set_lan_rx_queue_context(struct i40e_hw *hw,
return i40e_set_hmc_context(context_bytes,
i40e_hmc_rxq_ce_info, (u8 *)s);
 }
-#ifdef PREBOOT_SUPPORT
-
-/* Definitions for PFM bypass registers */
-
-/* Each context sub-line consists of 128 bits (16 bytes) of data*/
-#define SUB_LINE_LENGTH  0x10
-
-#define LANCTXCTL_WR 0x1
-#define LANCTXCTL_INVALIDATE 0x2
-#define LANCTXCTL_QUEUE_TYPE_TX  0x1
-#define LANCTXCTL_QUEUE_TYPE_RX  0x0
-
-#define LANCTXSTAT_DELAY 100
-
-/**
- * i40e_write_queue_context_directly
- * @hw: the hardware struct
- * @queue: the absolute queue number
- * @context_bytes: data to write as a queue context
- * @hmc_type: queue type
- *
- * Write the HMC context for the queue using direct queue context programming
- **/
-static enum i40e_status_code i40e_write_queue_context_directly(struct i40e_hw 
*hw,
-   u16 queue, u8 *context_bytes,
-   enum i40e_hmc_lan_rsrc_type hmc_type)
-{
-   u32 length = 0;
-   u32 queue_type = 0;
-   u32 sub_line = 0;
-   u32 i = 0;
-   u32 cnt = 0;
-   u32 *ptr = NULL;
-   enum i40e_status_code ret_code = I40E_SUCCESS;
-
-   switch (hmc_type) {
-   case I40E_HMC_LAN_RX:
-   length = I40E_HMC_OBJ_SIZE_RXQ;
-   queue_type = LANCTXCTL_QUEUE_TYPE_RX;
-   break;
-   case I40E_HMC_LAN_TX:
-   length = I40E_HMC_OBJ_SIZE_TXQ;
-   queue_type = LANCTXCTL_QUEUE_TYPE_TX;
-   break;
-   default:
-   return I40E_NOT_SUPPORTED;
-   }
-
-   ptr = (u32 *)context_bytes;
-
-   for (sub_line = 0; sub_line < (length / SUB_LINE_LENGTH); sub_line++) {
-   u32 reg;
-
-   for (i = 0; i < 4; i++)
-   wr32(hw, I40E_PFCM_LANCTXDATA(i), *ptr++);
-   reg = (LANCTXCTL_WR << I40E_PFCM_LANCTXCTL_OP_CODE_SHIFT) |
- (queue_type << I40E_PFCM_LANCTXCTL_QUEUE_TYPE_SHIFT) |
- (sub_line << I40E_PFCM_LANCTXCTL_SUB_LINE_SHIFT) |
- (queue << I40E_PFCM_LANCTXCTL_QUEUE_NUM_SHIFT);
-   wr32(hw, I40E_PFCM_LANCTXCTL, reg);
-
-   cnt = 0;
-   while (cnt++ <= LANCTXSTAT_DELAY) {
-   reg = rd32(hw, I40E_PFCM_LANCTXSTAT);
-   if (reg)
-   break;
-   i40e_usec_delay(1);
-   };
-
-   if ((reg & I40E_PFCM_LANCTXSTAT_CTX_DONE_MASK) == 0) {
-   ret_code = I40E_ERR_CONFIG;
-   break;
-   }
-   }
-   return ret_code;
-}
-
-/**
- * i40e_invalidate_queue_context_directly
- * @hw: the hardware struct
- * @queue: the absolute queue number
- * @hmc_type: queue type
- *
- * Clear the HMC context for the queue using direct queue context programming
- **/
-static enum i40e_status_code i40e_invalidate_queue_context_directly(struct 
i40e_hw *hw,
-   u16 queue,
-   enum i40e_hmc_lan_rsrc_type hmc_type)
-{
-   u8 queue_type = 0;
-   u32 reg = 0;
-   u32 cnt = 0;
-   enum i40e_status_code ret_code = I40E_SUCCESS;
-
-   switch (hmc_type) {
-   case I40E_HMC_LAN_RX:
-   queue_type = LANCTXCTL_QUEUE_TYPE_RX;
-   break;
-   case I40E_HMC_LAN_TX:
-   queue_type = LANCTXCTL_QUEUE_TYPE_TX;
-   break;
-   default:
-   return I40E_NOT_SUPPORTED;
-   }
-   reg = (LANCTXCTL_INVALIDATE << I40E_PFCM_LANCTXCT

[dpdk-dev] [PATCH 09/15] i40e: remove code for TPH (TLP Processing Hints)

2014-09-09 Thread Helin Zhang
The code wrapped in '#ifdef I40E_TPH_SUPPORT' was added
to check if 'TPH' is supported, and enable it. It is not
used currently and can be removed.

Signed-off-by: Helin Zhang 
Reviewed-by: Chen Jing 
---
 lib/librte_pmd_i40e/i40e/i40e_common.c | 55 --
 1 file changed, 55 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e/i40e_common.c 
b/lib/librte_pmd_i40e/i40e/i40e_common.c
index 4f11542..7e750ec 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_common.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_common.c
@@ -542,61 +542,6 @@ struct i40e_rx_ptype_decoded i40e_ptype_lookup[] = {
I40E_PTT_UNUSED_ENTRY(255)
 };

-#ifdef I40E_TPH_SUPPORT
-
-/**
- * i40e_tph_present
- * @hw: pointer to the hw struct
- *
- * Check to see if TPH capability is present.
- **/
-bool i40e_tph_present(struct i40e_hw *hw)
-{
-   u32 capsup = rd32(hw, I40E_GLPCI_CAPSUP);
-
-   return capsup & I40E_GLPCI_CAPSUP_TPH_EN_MASK;
-}
-
-/**
- * i40e_enable_tph
- * @hw: pointer to the hw struct
- * @tph_control: contents of TPH Requester Control Register
- *
- * Check to see if TPH can be enabled; if so, enable it.
- **/
-bool i40e_enable_tph(struct i40e_hw *hw, u32 tph_control)
-{
-   u32 gltph, st_mode, permit;
-
-   /* check that TPH is permitted */
-   permit = (tph_control & I40E_TPH_REQ_ENA_MASK)
->> I40E_TPH_REQ_ENA_SHIFT;
-   if (!(permit & I40E_TPH_REQ_PERMIT))
-   return false;
-
-   /* check for valid ST mode */
-   st_mode = tph_control & I40E_TPH_ST_MODE_MASK;
-   if ((st_mode != I40E_TPH_MODE_NOTABLE) &&
-   (st_mode != I40E_TPH_MODE_DEVSPEC))
-   return false;
-
-   /* TPH may be enabled */
-   gltph = rd32(hw, I40E_GLTPH_CTRL);
-
-   /* turn off device-specific */
-   if (st_mode != I40E_TPH_MODE_DEVSPEC)
-   gltph &= ~I40E_GLTPH_CTRL_TPH_DEVSPEC_MASK;
-
-   /* This enables TPH for all queues for the given types of operation.
-* Additional enabling is done per-queue in setup of the queue contexts.
-*/
-   gltph |= I40E_GLTPH_CTRL_DESC_PH_MASK; /* descriptor reads/writes */
-   gltph |= I40E_GLTPH_CTRL_DATA_PH_MASK; /* data reads/writes */
-   wr32(hw, I40E_GLTPH_CTRL, gltph);
-
-   return true;
-}
-#endif /* I40E_TPH_SUPPORT */
 #ifndef VF_DRIVER

 /**
-- 
1.8.1.4



[dpdk-dev] [PATCH 10/15] i40e: support of 10G base T

2014-09-09 Thread Helin Zhang
10G base T type support is added.

Signed-off-by: Helin Zhang 
Reviewed-by: Chen Jing 
---
 lib/librte_pmd_i40e/i40e/i40e_common.c | 3 +++
 lib/librte_pmd_i40e/i40e/i40e_type.h   | 1 +
 2 files changed, 4 insertions(+)

diff --git a/lib/librte_pmd_i40e/i40e/i40e_common.c 
b/lib/librte_pmd_i40e/i40e/i40e_common.c
index 7e750ec..84af47a 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_common.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_common.c
@@ -63,6 +63,7 @@ STATIC enum i40e_status_code i40e_set_mac_type(struct i40e_hw 
*hw)
case I40E_DEV_ID_QSFP_A:
case I40E_DEV_ID_QSFP_B:
case I40E_DEV_ID_QSFP_C:
+   case I40E_DEV_ID_10G_BASE_T:
hw->mac.type = I40E_MAC_XL710;
break;
case I40E_DEV_ID_VF:
@@ -762,6 +763,8 @@ STATIC enum i40e_media_type i40e_get_media_type(struct 
i40e_hw *hw)
switch (hw->phy.link_info.phy_type) {
case I40E_PHY_TYPE_10GBASE_SR:
case I40E_PHY_TYPE_10GBASE_LR:
+   case I40E_PHY_TYPE_1000BASE_SX:
+   case I40E_PHY_TYPE_1000BASE_LX:
case I40E_PHY_TYPE_40GBASE_SR4:
case I40E_PHY_TYPE_40GBASE_LR4:
media = I40E_MEDIA_TYPE_FIBER;
diff --git a/lib/librte_pmd_i40e/i40e/i40e_type.h 
b/lib/librte_pmd_i40e/i40e/i40e_type.h
index 004967a..737a4c1 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_type.h
+++ b/lib/librte_pmd_i40e/i40e/i40e_type.h
@@ -60,6 +60,7 @@ POSSIBILITY OF SUCH DAMAGE.
 #define I40E_DEV_ID_QSFP_A 0x1583
 #define I40E_DEV_ID_QSFP_B 0x1584
 #define I40E_DEV_ID_QSFP_C 0x1585
+#define I40E_DEV_ID_10G_BASE_T 0x1586
 #define I40E_DEV_ID_VF 0x154C
 #define I40E_DEV_ID_VF_HV  0x1571

-- 
1.8.1.4



[dpdk-dev] [PATCH 14/15] i40e: fix and enhancement in arq_event_info struct

2014-09-09 Thread Helin Zhang
Overloading the 'msg_size' field in the 'arq_event_info' struct
is a bad idea. It leads to bugs when the structure is used in a
loop, since the input value (buffer size) is overwritten by the
output value (actual message length). The fix introduces a new
'buf_len' field for the buffer size, and renames the 'msg_size'
field to 'msg_len' for the actual message size.

Signed-off-by: Helin Zhang 
Reviewed-by: Chen Jing 
---
 lib/librte_pmd_i40e/i40e/i40e_adminq.c| 33 ---
 lib/librte_pmd_i40e/i40e/i40e_adminq.h|  3 ++-
 lib/librte_pmd_i40e/i40e/i40e_common.c|  8 ++--
 lib/librte_pmd_i40e/i40e/i40e_prototype.h |  6 ++
 4 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e/i40e_adminq.c 
b/lib/librte_pmd_i40e/i40e/i40e_adminq.c
index 80da710..e098ed6 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_adminq.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_adminq.c
@@ -867,7 +867,8 @@ enum i40e_status_code i40e_asq_send_command(struct i40e_hw 
*hw,

/* bump the tail */
i40e_debug(hw, I40E_DEBUG_AQ_MESSAGE, "AQTX: desc and buffer:\n");
-   i40e_debug_aq(hw, I40E_DEBUG_AQ_COMMAND, (void *)desc_on_ring, buff);
+   i40e_debug_aq(hw, I40E_DEBUG_AQ_COMMAND, (void *)desc_on_ring,
+ buff, buff_size);
(hw->aq.asq.next_to_use)++;
if (hw->aq.asq.next_to_use == hw->aq.asq.count)
hw->aq.asq.next_to_use = 0;
@@ -917,11 +918,9 @@ enum i40e_status_code i40e_asq_send_command(struct i40e_hw 
*hw,
hw->aq.asq_last_status = (enum i40e_admin_queue_err)retval;
}

-   if (LE16_TO_CPU(desc->datalen) == buff_size) {
-   i40e_debug(hw, I40E_DEBUG_AQ_MESSAGE,
-  "AQTX: desc and buffer writeback:\n");
-   i40e_debug_aq(hw, I40E_DEBUG_AQ_COMMAND, (void *)desc, buff);
-   }
+   i40e_debug(hw, I40E_DEBUG_AQ_MESSAGE,
+  "AQTX: desc and buffer writeback:\n");
+   i40e_debug_aq(hw, I40E_DEBUG_AQ_COMMAND, (void *)desc, buff, buff_size);

/* update the error if time out occurred */
if ((!cmd_completed) &&
@@ -1000,6 +999,7 @@ enum i40e_status_code i40e_clean_arq_element(struct 
i40e_hw *hw,
/* now clean the next descriptor */
desc = I40E_ADMINQ_DESC(hw->aq.arq, ntc);
desc_idx = ntc;
+
flags = LE16_TO_CPU(desc->flags);
if (flags & I40E_AQ_FLAG_ERR) {
ret_code = I40E_ERR_ADMIN_QUEUE_ERROR;
@@ -1009,19 +1009,20 @@ enum i40e_status_code i40e_clean_arq_element(struct 
i40e_hw *hw,
   I40E_DEBUG_AQ_MESSAGE,
   "AQRX: Event received with error 0x%X.\n",
   hw->aq.arq_last_status);
-   } else {
-   i40e_memcpy(&e->desc, desc, sizeof(struct i40e_aq_desc),
-   I40E_DMA_TO_NONDMA);
-   datalen = LE16_TO_CPU(desc->datalen);
-   e->msg_size = min(datalen, e->msg_size);
-   if (e->msg_buf != NULL && (e->msg_size != 0))
-   i40e_memcpy(e->msg_buf,
-   hw->aq.arq.r.arq_bi[desc_idx].va,
-   e->msg_size, I40E_DMA_TO_NONDMA);
}

+   i40e_memcpy(&e->desc, desc, sizeof(struct i40e_aq_desc),
+   I40E_DMA_TO_NONDMA);
+   datalen = LE16_TO_CPU(desc->datalen);
+   e->msg_len = min(datalen, e->buf_len);
+   if (e->msg_buf != NULL && (e->msg_len != 0))
+   i40e_memcpy(e->msg_buf,
+   hw->aq.arq.r.arq_bi[desc_idx].va,
+   e->msg_len, I40E_DMA_TO_NONDMA);
+
i40e_debug(hw, I40E_DEBUG_AQ_MESSAGE, "AQRX: desc and buffer:\n");
-   i40e_debug_aq(hw, I40E_DEBUG_AQ_COMMAND, (void *)desc, e->msg_buf);
+   i40e_debug_aq(hw, I40E_DEBUG_AQ_COMMAND, (void *)desc, e->msg_buf,
+ hw->aq.arq_buf_size);

/* Restore the original datalen and buffer address in the desc,
 * FW updates datalen to indicate the event message
diff --git a/lib/librte_pmd_i40e/i40e/i40e_adminq.h 
b/lib/librte_pmd_i40e/i40e/i40e_adminq.h
index 27f2843..ea611bd 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_adminq.h
+++ b/lib/librte_pmd_i40e/i40e/i40e_adminq.h
@@ -83,7 +83,8 @@ struct i40e_asq_cmd_details {
 /* ARQ event information */
 struct i40e_arq_event_info {
struct i40e_aq_desc desc;
-   u16 msg_size;
+   u16 msg_len;
+   u16 buf_len;
u8 *msg_buf;
 };

diff --git a/lib/librte_pmd_i40e/i40e/i40e_common.c 
b/lib/librte_pmd_i40e/i40e/i40e_common.c
index ffd68a5..ffaa777 100644
--- a/lib/librte_pmd_i40e/i40e/i40e_common.c
+++ b/lib/librte_pmd_i40e/i40e/i40e_common.c
@@ -89,13 +89,15 @@ STATIC enum i40e_status_code i40e_set_mac_type(struct 
i40e_hw *hw)
  * @mask: debug mask
  * @desc: pointer to admin queue descriptor
  * @buffer: pointer to command buffer
+ * @buf_len: max length of buffer
  *
  * Dump
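
The bug described in this patch's commit message, and the buf_len/msg_len split that fixes it, can be reproduced with a simplified model. This is only a sketch: struct event_info and receive_event() are hypothetical stand-ins for i40e_arq_event_info and the copy step in i40e_clean_arq_element(), with the descriptor and DMA details left out.

```c
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct i40e_arq_event_info after the patch. */
struct event_info {
	uint16_t buf_len;	/* input: capacity of msg_buf, never modified */
	uint16_t msg_len;	/* output: length of the message just received */
	uint8_t *msg_buf;
};

/* Hypothetical receive step: copy at most buf_len bytes into msg_buf. */
static void receive_event(struct event_info *e,
			  const uint8_t *data, uint16_t datalen)
{
	/* The output length is clamped by the capacity, not by a prior output. */
	e->msg_len = datalen < e->buf_len ? datalen : e->buf_len;
	if (e->msg_buf != NULL && e->msg_len != 0)
		memcpy(e->msg_buf, data, e->msg_len);
}
```

Because buf_len is never overwritten, a short message followed by a longer one in the same loop is no longer truncated to the length of the first, and the caller does not need to reset the size after each iteration.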

[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread zimeiw


hi,


The netdp stack uses rte_mbuf directly, so no packets are copied from the DPDK
port queue to the netdp stack. netdp forwarding performance is the same as FreeBSD's.



At 2014-09-09 02:20:16, "Matthew Hall"  wrote:
>On Tue, Sep 09, 2014 at 08:49:44AM +0800, zimeiw wrote:
>> I have porting major FreeBSD tcp/ip stack to dpdk. new tcp/ip stack is based 
>> on dpdk rte_mbuf, rte_ring, rte_memory and rte_table. it is faster to 
>> forwarding packets.
>
>Hello,
>
>This is awesome work to be doing and badly needed to use DPDK for any L4 
>purposes where it is very limited. I'll be following your progress.
>
>You didn't mention your name, and compare your work with 
>https://github.com/rumpkernel/dpdk-rumptcpip/ , and talk about behavior / 
>performance, and how long you think it'll take. I'm curious if you can give 
>some more comments.
>
>I'm implementing an RX-side very basic stack myself... but I'm not using BSD 
>standard APIs or doing TX-side like yours will have.
>
>Matthew.


[dpdk-dev] [PATCH 03/13] mbuf: add packet_type field

2014-09-09 Thread Olivier MATZ
Hello,

On 09/09/2014 05:59 AM, Zhang, Helin wrote:
> It is a common field which i40e PMD will use it to store the 'packet type 
> ID'. i40e
> hardware can recognize more than a hundred of packet types of received 
> packets,
> this is quite useful for upper layer stack or application. So this field is 
> quite useful
> and will be filled by PMD.
> In ixgbe/igb, it has less than 10 packet types which are marked in offload 
> flags. From
> now on, it would be better to have new field here to put the hardware 
> offloaded
> packet type in and it could be used for future NICs.
>
>>
>> I'm not saying this field is useless. But even if it's useful for some 
>> applications
>> like yours, it does not mean that it should go in the generic mbuf structure.
>>
>> Also, for a new field, we should define who is in charge of filling it.
>> Is is the driver? Does it mean that all drivers have to be modified to fill 
>> it? Or is
>> it just a placeholder for applications? In this case, shouldn't we use
>> application-specific metadata? In the other direction (TX), we would also 
>> need
>> to define if this field must be filled by the application before 
>> transmitting a mbuf
>> to a driver.
> Yes, PMD will fill it. I40e PMD will be the first one, ixgbe/igb can be kept 
> as it is, or
> modified to be consistent. It is used for RX side only, and for TX side, it 
> can be
> investigated to see if it can be used also. I think some new features in 
> development
> can think of that.
> Anyway, it is a quite useful field for i40e and future generation of NICs.

To me, having the support in a hardware for that feature is not a
sufficient reason for adding this field. There are many hardware
features that will never be integrated in dpdk.

This first version of the patch:

- just adds a field that is not used by any code, so it is useless.
   At least testpmd or an application example should show how to
   use it.

- does not describe what enhancement is provided by adding the
   field (performance? in this case, numbers + use case would help
   to convince people).

- does not describe what can be the content of the field. Is it
   a protocol number?

- does not explain if all drivers must fill this field. If yes,
   the patch has to update all drivers. If not, something must be
   done to mark the packet field as unknown by default.

Regards,
Olivier



[dpdk-dev] dpdk 1.6 insmod rte_kni.ko error

2014-09-09 Thread Zhang, Jerry
>-Original Message-
>From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of rong wen
>Sent: Saturday, September 6, 2014 8:07 PM
>To: dev
>Subject: [dpdk-dev] dpdk 1.6 insmod rte_kni.ko error
>
>Hi All,
>
>I am a new guy to use dpdk. The kernel version is 2.6.32.
>
>After I build the rte_kni.ko, I run insmod rte_kni.ko, error message is:
>
>Unloading any existing DPDK KNI module
>Loading DPDK KNI module
>insmod: error inserting
>'/home/wenrong.wr/dpdk_dev_r1-6-0/x86_64-default-linuxapp-gcc/kmod/rte_k
>ni.ko':
>-1 Unknown symbol in module
>## ERROR: Could not load kmod/rte_kni.ko.
>
>The dmesg output is:
>
>[181831.351549] rte_kni: Unknown symbol hwmon_device_register
>[181831.353671] rte_kni: Unknown symbol hwmon_device_unregister
>[182558.257153] rte_kni: Unknown symbol hwmon_device_register
>[182558.259288] rte_kni: Unknown symbol hwmon_device_unregister
>

Did you build and insmod kni.ko on the same machine?
If you did, then maybe the hwmon module was not built into your kernel.
Try the command 'modprobe hwmon'.


[dpdk-dev] [PATCH 03/13] mbuf: add packet_type field

2014-09-09 Thread Zhang, Helin


> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Tuesday, September 9, 2014 4:03 PM
> To: Zhang, Helin; Yerden Zhumabekov; Richardson, Bruce; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 03/13] mbuf: add packet_type field
> 
> Hello,
> 
> On 09/09/2014 05:59 AM, Zhang, Helin wrote:
> > It is a common field which i40e PMD will use it to store the 'packet
> > type ID'. i40e hardware can recognize more than a hundred of packet
> > types of received packets, this is quite useful for upper layer stack
> > or application. So this field is quite useful and will be filled by PMD.
> > In ixgbe/igb, it has less than 10 packet types which are marked in
> > offload flags. From now on, it would be better to have new field here
> > to put the hardware offloaded packet type in and it could be used for future
> NICs.
> >
> >>
> >> I'm not saying this field is useless. But even if it's useful for
> >> some applications like yours, it does not mean that it should go in the
> generic mbuf structure.
> >>
> >> Also, for a new field, we should define who is in charge of filling it.
> >> Is is the driver? Does it mean that all drivers have to be modified
> >> to fill it? Or is it just a placeholder for applications? In this
> >> case, shouldn't we use application-specific metadata? In the other
> >> direction (TX), we would also need to define if this field must be
> >> filled by the application before transmitting a mbuf to a driver.
> > Yes, PMD will fill it. I40e PMD will be the first one, ixgbe/igb can
> > be kept as it is, or modified to be consistent. It is used for RX side
> > only, and for TX side, it can be investigated to see if it can be used
> > also. I think some new features in development can think of that.
> > Anyway, it is a quite useful field for i40e and future generation of NICs.
> 
> To me, having the support in a hardware for that feature is not a sufficient
> reason for adding this field. There are many hardware features that will never
> be integrated in dpdk.

At least this field is quite important for i40e.
For example, packet type 43 means that the hardware recognized the packet as a
VXLAN packet. To avoid checking the packet type in software, the hardware can
offload that check and write the packet type ID into that field.
It cannot be put in ol_flags any more, as the hardware can recognize more than
100 packet types. Without this field, the vxlan feature cannot be implemented
at all.

> 
> This first version of the patch:
> 
> - just adds a field that is not used by any code, so it is useless.
>At least testpmd or an application example should show how to
>use it.

It will be used at least in the vxlan feature, which is in development. Without
it, vxlan cannot be completed. So this is a very important field for i40e and
future NICs.

>
> - does not describe what enhancement is provided by adding the
>field (performance? in this case, numbers + use case would help
>to convince people).

I40e hardware can classify received packets into different packet types, and
it can recognize about 256 of them. This is not an enhancement; it is the key
to at least the i40e features.

> 
> - does not describe what can be the content of the field. Is it
>a protocol number?
> 

The packet type is an offload feature of the hardware. Each value identifies
one type of packet recognized by the i40e hardware, e.g.

| Packet type | Description             |
| 0           | Reserved                |
| 1           | MAC, PAY2               |
| 2           | MAC, TimeSync, PAY2     |
...
| 43          | MAC, IPV4, GRENAT, PAY3 |

> - does not explain if all drivers must fill this field. If yes,
>the patch has to update all drivers. If not, something must be
>done to mark the packet field as unknown by default.
> 
I40e needs it.
igb/ixgbe can be changed to support it, but that is not mandatory, as ol_flags
can represent their packet types.
Actually ixgbe and igb have packet types too, but there are fewer than 10 of
them, so they can be put in ol_flags. For i40e and future NICs, the number
could be 256 or more, and ol_flags does not have that many bits. The best idea
is to write the packet type ID directly into a field.
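
To make the ID-versus-flag-bits point concrete, here is a minimal sketch. It
does not use the real rte_mbuf: struct demo_mbuf, demo_ptype_name() and
demo_id_bits_needed() are hypothetical names, the 16-bit field width follows
the patch, and only the four table rows quoted above are filled in.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the mbuf field added by the patch. */
struct demo_mbuf {
	uint16_t packet_type; /* packet type ID written by the PMD on RX */
};

/* Only the rows quoted in this mail; every other ID is left unlisted. */
static const char *demo_ptype_name(uint16_t ptype)
{
	switch (ptype) {
	case 0:  return "Reserved";
	case 1:  return "MAC, PAY2";
	case 2:  return "MAC, TimeSync, PAY2";
	case 43: return "MAC, IPV4, GRENAT, PAY3";
	default: return "unlisted";
	}
}

/*
 * Bits needed to store one of `ntypes` IDs, i.e. ceil(log2(ntypes)).
 * A one-bit-per-type flag encoding would need `ntypes` bits instead.
 */
static unsigned demo_id_bits_needed(uint32_t ntypes)
{
	unsigned bits = 0;

	assert(ntypes <= (1u << 31)); /* keep the shift below well defined */
	while ((1u << bits) < ntypes)
		bits++;
	return bits;
}
```

With 256 packet types, an ID needs 8 bits while a flag-per-type encoding would
need 256 bits, which is more than a 64-bit ol_flags can hold.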

> Regards,
> Olivier

Regards,
Helin



[dpdk-dev] [PATCH v2 3/6] mbuf: remove rte_ctrlmbuf

2014-09-09 Thread Richardson, Bruce
> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Monday, September 08, 2014 9:22 AM
> To: Richardson, Bruce; dev at dpdk.org
> Subject: Re: [PATCH v2 3/6] mbuf: remove rte_ctrlmbuf
> 
> Hi Bruce,
> 
> On 08/28/2014 05:42 PM, Bruce Richardson wrote:
> > From: Olivier Matz 
> >
> > The initial role of rte_ctrlmbuf is to carry generic messages (data
> > pointer + data length) but it's not used by the DPDK or it applications.
> > Keeping it implies:
> >   - loosing 1 byte in the rte_mbuf structure
> >   - having some dead code rte_mbuf.[ch]
> >
> > This patch removes this feature. Thanks to it, it is now possible to
> > simplify the rte_mbuf structure by merging the rte_pktmbuf structure
> > in it. This is done in next commit.
> >
> > Signed-off-by: Olivier Matz 
> >
> > * Updated patch to HEAD.
> > * Modified patch to retain the old function names for ctrl mbufs as
> >   macros. This helps with app compatibility, and allows the concept
> >   of a control mbuf to be reintroduced via a single-bit flag in
> >   a future change.
> > * Updated the packet framework ip_pipeline example application to
> >   work following this change.
> >
> > Changes in v2:
> > * Fixed whitespace errors introduced by this patch flagged by checkpatch
> >
> > Signed-off-by: Bruce Richardson 
> 
> To be honest, I'm not convinced that keeping the old function names
> is really required, but I suppose you had good reasons to reintroduce
> them. Just for information, is it for compatibility purpose or is there
> a real wish to reintroduce a sort of control mbuf in the future ?
> 
> Acked-by: Olivier Matz 

Compatibility primarily. However, it's a useful enough concept, and can be 
controlled by having a single-bit flag as done in my second patch set.

/Bruce


[dpdk-dev] [PATCH 04/13] mbuf: expand ol_flags field to 64-bits

2014-09-09 Thread Richardson, Bruce
> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Monday, September 08, 2014 11:26 AM
> To: Richardson, Bruce; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 04/13] mbuf: expand ol_flags field to 64-bits
> 
> Hi Bruce,
> 
> On 09/03/2014 05:49 PM, Bruce Richardson wrote:
> > The offload flags field (ol_flags) was 16-bits and had no further room
> > for expansion. This patch increases the field size to 64-bits, using up
> > the remaining reserved space in the single-cache-line mbuf.
> >
> > NOTE: none of the values for existing flags have been changed, i.e. no
> > new numbers have been explicitly reserved between existing flag
> > definitions.
> >
> > Signed-off-by: Bruce Richardson 
> 
> The initial series I've proposed [1][2] had on more enhancement: the
> first patch [1] allowed to remove the definition of flag names in
> testpmd. Indeed, this is not really good because they must be kept
> synchronized with the flags in rte_mbuf. What do you think about this
> patch? Should it be integrated in your series? Or later? Or never? ;)

No, it is a good change - I've just kept it out of my series for simplicity, as
I'm largely trying to keep the scope as small as possible. I would love to see
that go in as a separate patch, maybe once the mbuf rework is finished.

> 
> The second patch [2] changes the value of the flags. This is not needed
> now, but if we do it in the future, we should not forget to change
> app/test-pmd/cmdline.c accordingly. Maybe this could go in your patch
> directly as it does not hurt?

As above for now. Right now I'm just trying to get the structure worked out, 
and deal with any performance regressions that are found (such as what Pablo 
found last Friday :-( ). 

/Bruce

> 
> Olivier
> 
> 
> [1] http://dpdk.org/ml/archives/dev/2014-May/002545.html
> [2] http://dpdk.org/ml/archives/dev/2014-May/002546.html


[dpdk-dev] [PATCH 07/13] mbuf: use macros only to access the mbuf metadata

2014-09-09 Thread Richardson, Bruce
> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Monday, September 08, 2014 1:06 PM
> To: Richardson, Bruce; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 07/13] mbuf: use macros only to access the
> mbuf metadata
> 
> Hi Bruce,
> 
> On 09/03/2014 05:49 PM, Bruce Richardson wrote:
> > Removed the explicit zero-sized metadata definition at the end of the
> > mbuf data structure. Updated the metadata macros to take account of this
> > change so that all existing code which uses those macros still works.
> >
> > Signed-off-by: Bruce Richardson 
> > ---
> >  lib/librte_mbuf/rte_mbuf.h | 22 --
> >  1 file changed, 8 insertions(+), 14 deletions(-)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 5260001..ca66d9a 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -166,31 +166,25 @@ struct rte_mbuf {
> > struct rte_mempool *pool; /**< Pool from which mbuf was allocated.
> */
> > struct rte_mbuf *next;/**< Next segment of scattered packet. */
> >
> > -   union {
> > -   uint8_t metadata[0];
> > -   uint16_t metadata16[0];
> > -   uint32_t metadata32[0];
> > -   uint64_t metadata64[0];
> > -   } __rte_cache_aligned;
> >  } __rte_cache_aligned;
> >
> >  #define RTE_MBUF_METADATA_UINT8(mbuf, offset)  \
> > -   (mbuf->metadata[offset])
> > +   (((uint8_t *)&(mbuf)[1])[offset])
> >  #define RTE_MBUF_METADATA_UINT16(mbuf, offset) \
> > -   (mbuf->metadata16[offset/sizeof(uint16_t)])
> > +   (((uint16_t *)&(mbuf)[1])[offset/sizeof(uint16_t)])
> >  #define RTE_MBUF_METADATA_UINT32(mbuf, offset) \
> > -   (mbuf->metadata32[offset/sizeof(uint32_t)])
> > +   (((uint32_t *)&(mbuf)[1])[offset/sizeof(uint32_t)])
> >  #define RTE_MBUF_METADATA_UINT64(mbuf, offset) \
> > -   (mbuf->metadata64[offset/sizeof(uint64_t)])
> > +   (((uint64_t *)&(mbuf)[1])[offset/sizeof(uint64_t)])
> >
> >  #define RTE_MBUF_METADATA_UINT8_PTR(mbuf, offset)  \
> > -   (&mbuf->metadata[offset])
> > +   (&RTE_MBUF_METADATA_UINT8(mbuf, offset))
> >  #define RTE_MBUF_METADATA_UINT16_PTR(mbuf, offset) \
> > -   (&mbuf->metadata16[offset/sizeof(uint16_t)])
> > +   (&RTE_MBUF_METADATA_UINT16(mbuf, offset))
> >  #define RTE_MBUF_METADATA_UINT32_PTR(mbuf, offset) \
> > -   (&mbuf->metadata32[offset/sizeof(uint32_t)])
> > +   (&RTE_MBUF_METADATA_UINT32(mbuf, offset))
> >  #define RTE_MBUF_METADATA_UINT64_PTR(mbuf, offset) \
> > -   (&mbuf->metadata64[offset/sizeof(uint64_t)])
> > +   (&RTE_MBUF_METADATA_UINT64(mbuf, offset))
> >
> >  /**
> >   * Given the buf_addr returns the pointer to corresponding mbuf.
> >
> 
> I think it goes in the good direction. So:
> Acked-by: Olivier Matz 
> 
> Just one question: why not removing RTE_MBUF_METADATA*() macros?
> I'd just provide one macro that gives a (void*) to the first byte
> after the mbuf structure.
> 
> The format of the metadata is up to the application, that usually
> casts (m + 1) into a private structure, making the macros not very
> useful. I suggest to move these macros outside rte_mbuf.h, in the
> application-specific or library-specific header, what do you think?
> 
> Regards,
> Olivier
> 
Yes, I'll look into that.

/Bruce
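
A minimal sketch of the single-accessor idea suggested above, assuming a
stand-in structure rather than the real rte_mbuf. All names here (struct
demo_mbuf, DEMO_MBUF_PRIV, struct demo_app_meta and the helpers) are
hypothetical and only illustrate casting (m + 1) to an application-private
structure.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rte_mbuf; only its size matters here. */
struct demo_mbuf {
	void *buf_addr;
	uint16_t data_len;
	uint32_t pkt_len;
};

/* Single accessor: pointer to the first byte after the mbuf structure. */
#define DEMO_MBUF_PRIV(m) ((void *)((m) + 1))

/* Application-defined metadata layout, stored right behind the mbuf. */
struct demo_app_meta {
	uint32_t flow_id;
	uint16_t in_port;
};

/* Allocate one zeroed mbuf with room for the metadata behind it. */
static struct demo_mbuf *demo_alloc(void)
{
	return calloc(1, sizeof(struct demo_mbuf) + sizeof(struct demo_app_meta));
}

/* Store application metadata in the private area. */
static void demo_set_flow(struct demo_mbuf *m, uint32_t flow_id, uint16_t in_port)
{
	struct demo_app_meta *meta;

	assert(m != NULL);
	meta = DEMO_MBUF_PRIV(m);
	meta->flow_id = flow_id;
	meta->in_port = in_port;
}

/* Read the flow ID back from the private area. */
static uint32_t demo_get_flow(struct demo_mbuf *m)
{
	const struct demo_app_meta *meta = DEMO_MBUF_PRIV(m);

	return meta->flow_id;
}
```

With this shape the metadata format lives entirely in the application's own
header, and the mbuf header only has to say where the private area starts.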


[dpdk-dev] [PATCH 0/6] Mbuf structure Rework, part 1

2014-09-09 Thread Richardson, Bruce


> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Monday, September 08, 2014 1:33 PM
> To: Richardson, Bruce; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/6] Mbuf structure Rework, part 1
> 
> Hi Bruce,
> 
> On 08/27/2014 05:50 PM, Bruce Richardson wrote:
> > This patch set does some initial pre-work to prepare the mbuf data structure
> > (and ixgbe vector driver to a lesser extent) for more major changes which
> > will follow on in a subsequent patch set. [See previous RFC patch set for
> > more indications of the future coming changes].
> >
> > The main changes here are the flattening out of the mbuf data structure, 
> > with
> > much of it based off work by Olivier. The ctrlmbuf and pktmbuf structures 
> > are
> > now gone, as is the vlan_macip structure. However, in this set, the concept
> > of having a separate ctrl mbuf type is kept around. The plan is in a later 
> > set
> > when we expand the flags field to 64-bits, we can use a single bit in the 
> > flags
> > to indicate a control packet. For now, though, the ctrlmbuf functions and
> macros
> > just are aliases for the pktmbuf equivalents as much as possible.
> 
> I'm wondering it "struct rte_kni_mbuf" should be updated
> accordingly each time "struct mbuf" is modified.
> 
> Regards,
> Olivier

Yes, it should. There are no changes needed for the part 1 patch set (6 
patches), as it does not change the actual field layout, it just flattens the C 
definitions. I've already started working on an update to the second patch set 
(13 patches) including kni changes as part of it.

/Bruce


[dpdk-dev] [PATCH 03/13] mbuf: add packet_type field

2014-09-09 Thread Richardson, Bruce


> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Tuesday, September 09, 2014 9:03 AM
> To: Zhang, Helin; Yerden Zhumabekov; Richardson, Bruce; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 03/13] mbuf: add packet_type field
> 
> Hello,
> 
> On 09/09/2014 05:59 AM, Zhang, Helin wrote:
> > It is a common field which i40e PMD will use it to store the 'packet type 
> > ID'.
> i40e
> > hardware can recognize more than a hundred of packet types of received
> packets,
> > this is quite useful for upper layer stack or application. So this field is 
> > quite
> useful
> > and will be filled by PMD.
> > In ixgbe/igb, it has less than 10 packet types which are marked in offload 
> > flags.
> From
> > now on, it would be better to have new field here to put the hardware
> offloaded
> > packet type in and it could be used for future NICs.
> >
> >>
> >> I'm not saying this field is useless. But even if it's useful for some 
> >> applications
> >> like yours, it does not mean that it should go in the generic mbuf 
> >> structure.
> >>
> >> Also, for a new field, we should define who is in charge of filling it.
> >> Is is the driver? Does it mean that all drivers have to be modified to 
> >> fill it? Or
> is
> >> it just a placeholder for applications? In this case, shouldn't we use
> >> application-specific metadata? In the other direction (TX), we would also
> need
> >> to define if this field must be filled by the application before 
> >> transmitting a
> mbuf
> >> to a driver.
> > Yes, PMD will fill it. I40e PMD will be the first one, ixgbe/igb can be 
> > kept as it
> is, or
> > modified to be consistent. It is used for RX side only, and for TX side, it 
> > can be
> > investigated to see if it can be used also. I think some new features in
> development
> > can think of that.
> > Anyway, it is a quite useful field for i40e and future generation of NICs.
> 
> To me, having the support in a hardware for that feature is not a
> sufficient reason for adding this field. There are many hardware
> features that will never be integrated in dpdk.
> 
> This first version of the patch:
> 
> - just adds a field that is not used by any code, so it is useless.
>At least testpmd or an application example should show how to
>use it.
> 
> - does not describe what enhancement is provided by adding the
>field (performance? in this case, numbers + use case would help
>to convince people).
> 
> - does not describe what can be the content of the field. Is it
>a protocol number?
> 
> - does not explain if all drivers must fill this field. If yes,
>the patch has to update all drivers. If not, something must be
>done to mark the packet field as unknown by default.
> 
> Regards,
> Olivier

Hi,

Points taken. Really, this patch doesn't belong in this set as I had planned 
things and better belongs in patch set 3 (coming soon, I hope) which should 
propose the new field additions. I simply put it here to avoid having to start 
renumbering and renaming reserved fields in the structure, but that is possibly 
the lesser of the two evils.

However, with regard to adding new fields, I would like to have some
agreement that I can add fields without actually pushing the patch that uses
them - so long as sufficient rationale is provided for the field and there
is a soon-pending change to actually use it. This patch did not meet the
criteria for explanation, but if updated to do so, I would like to have this
patch accepted on the basis of that explanation so as to enable those working
on the drivers to make use of it.

Regards,
/Bruce


[dpdk-dev] Defaults for rte_hash

2014-09-09 Thread Matthew Hall
Hello,

I was looking at the code which inits rte_hash objects in examples/l3fwd. It's 
using approx. 1M to 4M hash 'entries' depending on 32-bit vs 64-bit, but it's 
setting the 'bucket_entries' to just 4.

Normally I'm used to using somewhat deeper hash buckets than that... it seems 
like having a zillion little tiny hash buckets would cause more TLB pressure 
and memory overhead... or does 4 get shifted / exponentiated into 2**4 ?

The documentation in http://dpdk.org/doc/api/structrte__hash__parameters.html 
and http://dpdk.org/doc/api/rte__hash_8h.html isn't that clear... is there a 
better place to look for this?

In my case I'm looking to create a table of 4M or 8M entries, containing 
tables of security threat IPs / domains, to be detected in the traffic. So it 
would be good to have some understanding how not to waste a ton of memory on a 
table this huge without making it run super slow either.

Did anybody have some experience with how to get this right?

Another thing... the LPM table uses 16-bit Hop IDs. But I would probably have 
more than 64K CIDR blocks of badness on the Internet available to me for 
analysis. How would I cope with this, besides just letting some attackers 
escape unnoticed? ;)

Have we got some kind of structure which allows a greater number of CIDRs even 
if it's not quite as fast?

Thanks,
Matthew.


[dpdk-dev] Defaults for rte_hash

2014-09-09 Thread Richardson, Bruce
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Matthew Hall
> Sent: Tuesday, September 09, 2014 11:32 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] Defaults for rte_hash
> 
> Hello,
> 
> I was looking at the code which inits rte_hash objects in examples/l3fwd. It's
> using approx. 1M to 4M hash 'entries' depending on 32-bit vs 64-bit, but it's
> setting the 'bucket_entries' to just 4.
> 
> Normally I'm used to using somewhat deeper hash buckets than that... it seems
> like having a zillion little tiny hash buckets would cause more TLB pressure
> and memory overhead... or does 4 get shifted / exponentiated into 2**4 ?
> 
> The documentation in
> http://dpdk.org/doc/api/structrte__hash__parameters.html
> and http://dpdk.org/doc/api/rte__hash_8h.html isn't that clear... is there a
> better place to look for this?
> 
> In my case I'm looking to create a table of 4M or 8M entries, containing
> tables of security threat IPs / domains, to be detected in the traffic. So it
> would be good to have some understanding how not to waste a ton of memory
> on a
> table this huge without making it run super slow either.
> 
> Did anybody have some experience with how to get this right?

It might be worth looking too at the hash table structures in the librte_table 
directory for packet framework. These should give better scalability across 
millions of flows than the existing rte_hash implementation. [We're looking 
here to provide in the future a similar, more scalable, hash table 
implementation with an API like that of rte_hash, but that is still under 
development here at the moment.]

> 
> Another thing... the LPM table uses 16-bit Hop IDs. But I would probably have
> more than 64K CIDR blocks of badness on the Internet available to me for
> analysis. How would I cope with this, besides just letting some attackers
> escape unnoticed? ;)

Actually, I think the next hop field in the lpm implementation is only 8-bits, 
not 16 :-). Each lpm entry is only 16-bits in total.
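
As a rough illustration of what a 16-bit entry with an 8-bit next hop implies,
here is a small sketch. struct demo_lpm_entry and demo_id_space() are
hypothetical names, and the split of the second byte into bookkeeping bits is
an assumption made for illustration, not a copy of the library's definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-bit entry: 8 bits of next hop, 8 bits of bookkeeping. */
struct demo_lpm_entry {
	uint8_t next_hop;      /* at most 256 distinct next-hop IDs */
	uint8_t valid     : 1; /* entry holds a rule */
	uint8_t ext_entry : 1; /* entry points into a second-level table */
	uint8_t depth     : 6; /* prefix length of the stored rule */
};

/* Number of distinct values an n-bit field can hold. */
static uint32_t demo_id_space(unsigned bits)
{
	assert(bits < 32);
	return 1u << bits;
}
```

An 8-bit field gives 256 possible next-hop IDs, against the 65536 that a
16-bit field would give.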

> 
> Have we got some kind of structure which allows a greater number of CIDRs
> even
> if it's not quite as fast?
> 
> Thanks,
> Matthew.


[dpdk-dev] Defaults for rte_hash

2014-09-09 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Richardson, Bruce
> Sent: Tuesday, September 09, 2014 11:45 AM
> To: Matthew Hall; dev at dpdk.org
> Subject: Re: [dpdk-dev] Defaults for rte_hash
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Matthew Hall
> > Sent: Tuesday, September 09, 2014 11:32 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] Defaults for rte_hash
> >
> > Hello,
> >
> > I was looking at the code which inits rte_hash objects in examples/l3fwd.
> It's
> > using approx. 1M to 4M hash 'entries' depending on 32-bit vs 64-bit, but 
> > it's
> > setting the 'bucket_entries' to just 4.
> >
> > Normally I'm used to using somewhat deeper hash buckets than that... it
> seems
> > like having a zillion little tiny hash buckets would cause more TLB pressure
> > and memory overhead... or does 4 get shifted / exponentiated into 2**4 ?
> >

That 4 is not shifted, so it really is 4 entries/bucket. The maximum number of
entries per bucket you can use is 16, as a bucket will be as big as a cache
line.
Regardless of the number of entries per bucket, the memory size remains the
same. However, with 4 entries/bucket and a 16-byte key, all keys stored for a
bucket fit in one cache line, so performance looks to be better in this case
(although a non-optimal hash function could mean that not all keys can be
stored, as the chances of filling a bucket are higher).
Anyway, for this example, 4 entries/bucket looks like a good number to me.
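
The arithmetic behind that can be sketched as follows. This is a minimal
illustration with hypothetical helper names; the 64-byte cache line is an
assumption, and only key storage is modelled, not the library's real memory
layout.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 64u /* assumed cache-line size in bytes */

/* Bytes of key storage in one bucket. */
static uint32_t demo_bucket_key_bytes(uint32_t bucket_entries, uint32_t key_len)
{
	return bucket_entries * key_len;
}

/* Number of buckets for a table of `entries` slots. */
static uint32_t demo_num_buckets(uint32_t entries, uint32_t bucket_entries)
{
	assert(bucket_entries != 0 && entries % bucket_entries == 0);
	return entries / bucket_entries;
}

/* Total key storage: buckets times bytes per bucket. */
static uint64_t demo_total_key_bytes(uint32_t entries, uint32_t bucket_entries,
				     uint32_t key_len)
{
	return (uint64_t)demo_num_buckets(entries, bucket_entries) *
	       demo_bucket_key_bytes(bucket_entries, key_len);
}
```

For 1M entries with 16-byte keys, 4 entries/bucket gives 64 bytes of keys per
bucket (one assumed cache line) and 16 entries/bucket gives 256 bytes per
bucket, while the total key storage is the same in both cases.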

> > The documentation in
> > http://dpdk.org/doc/api/structrte__hash__parameters.html
> > and http://dpdk.org/doc/api/rte__hash_8h.html isn't that clear... is there a
> > better place to look for this?
> >
> > In my case I'm looking to create a table of 4M or 8M entries, containing
> > tables of security threat IPs / domains, to be detected in the traffic. So 
> > it
> > would be good to have some understanding how not to waste a ton of
> memory
> > on a
> > table this huge without making it run super slow either.
> >
> > Did anybody have some experience with how to get this right?
> 
> It might be worth looking too at the hash table structures in the librte_table
> directory for packet framework. These should give better scalability across
> millions of flows than the existing rte_hash implementation. [We're looking
> here to provide in the future a similar, more scalable, hash table
> implementation with an API like that of rte_hash, but that is still under
> development here at the moment.]
> 
> >
> > Another thing... the LPM table uses 16-bit Hop IDs. But I would probably
> have
> > more than 64K CIDR blocks of badness on the Internet available to me for
> > analysis. How would I cope with this, besides just letting some attackers
> > escape unnoticed? ;)
> 
> Actually, I think the next hop field in the lpm implementation is only 8-bits,
> not 16 :-). Each lpm entry is only 16-bits in total.
> 
> >
> > Have we got some kind of structure which allows a greater number of
> CIDRs
> > even
> > if it's not quite as fast?
> >
> > Thanks,
> > Matthew.


[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Alexander Nasonov
Matthew Hall wrote:
> However despite this issue, there are some cases where the Linux stack is 
> greatly superior to the BSD one although normally the opposite is the case... 
> AF_NETLINK for configuring 10,000+ IP addresses, especially for L4-L7 
> performance testing, would be one possible example of this. Another potential 
> example would be the BPF JIT compiler if you want to combine BPF filters with 
> DPDK (something I'm doing right now in my own code actually).

BPF JIT is available in NetBSD too. It should be quite straightforward to
enable it in the rump-dpdk kernel.

Alex


[dpdk-dev] [PATCH v3 6/6] mbuf: flatten struct vlan_macip into mbuf struct

2014-09-09 Thread Bruce Richardson
The vlan_macip structure combined a vlan tag id with l2 and l3 header
lengths for tracking offloads. However, this structure was only used as
a unit by the e1000 and ixgbe drivers, not generally.

This patch removes the structure from the mbuf header and places the
fields into the mbuf structure directly at the required point, without
any net effect on the structure layout. This allows us to treat the vlan
tags and header length fields as separate for future mbuf changes. The
drivers which were written to use the combined structure still do so,
using a driver-local definition of it.

Changes in V2:
* None

Changes in V3:
* minor comment cleanup following review by Olivier
* reduce perf regression caused by splitting vlan_macip field. This is
  done by providing a single uint16_t value to allow writing/clearing
  the l2 and l3 lengths together. There is still a small perf hit to the
  slow path TX due to the reads from vlan_tci and l2/l3 lengths being
  separated. (<5% in my tests with testpmd with no extra params). 
  Unfortunately, this cannot be eliminated, without restoring the vlan
  tags and l2/l3 lengths as a combined 32-bit field. This would prevent
  us from ever looking to move those fields about and is an artificial tie
  that applies only for performance in igb and ixgbe drivers. Therefore,
  this patch keeps the vlan_tci field separate from the lengths as the
  best solution going forward.

Signed-off-by: Bruce Richardson 
V2 Acked-by: Olivier Matz 

fixup cleanup
---
 app/test-pmd/csumonly.c |  4 +--
 app/test-pmd/flowgen.c  | 14 -
 app/test-pmd/macfwd.c   |  6 ++--
 app/test-pmd/macswap.c  |  6 ++--
 app/test-pmd/rxonly.c   |  3 +-
 app/test-pmd/testpmd.c  |  3 +-
 app/test-pmd/txonly.c   |  6 ++--
 app/test/packet_burst_generator.c   | 10 +++
 examples/ip_fragmentation/main.c|  2 +-
 examples/ip_pipeline/pipeline_rx.c  |  4 +--
 examples/ip_pipeline/pipeline_tx.c  |  2 +-
 examples/ip_reassembly/main.c   |  8 +++---
 examples/ipv4_multicast/main.c  |  3 +-
 examples/vhost/main.c   |  6 ++--
 lib/librte_ip_frag/ip_frag_common.h |  3 +-
 lib/librte_ip_frag/rte_ipv4_fragmentation.c |  2 +-
 lib/librte_ip_frag/rte_ipv4_reassembly.c|  6 ++--
 lib/librte_ip_frag/rte_ipv6_reassembly.c|  5 ++--
 lib/librte_mbuf/rte_mbuf.h  | 36 ---
 lib/librte_pmd_e1000/em_rxtx.c  | 44 +
 lib/librte_pmd_e1000/igb_rxtx.c | 37 ++--
 lib/librte_pmd_i40e/i40e_rxtx.c | 14 -
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c   | 17 ++-
 lib/librte_pmd_ixgbe/ixgbe_rxtx.h   | 22 ++-
 lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c   |  6 ++--
 25 files changed, 158 insertions(+), 111 deletions(-)

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 655b6d8..28b66f5 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -432,8 +432,8 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
}

/* Combine the packet header write. VLAN is not consider here */
-   mb->vlan_macip.f.l2_len = l2_len;
-   mb->vlan_macip.f.l3_len = l3_len;
+   mb->l2_len = l2_len;
+   mb->l3_len = l3_len;
mb->ol_flags = ol_flags;
}
nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_rx);
diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index 17dbf83..b091b6d 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -205,13 +205,13 @@ pkt_burst_flow_gen(struct fwd_stream *fs)
udp_hdr->dgram_len  = RTE_CPU_TO_BE_16(pkt_size -
   sizeof(*eth_hdr) -
   sizeof(*ip_hdr));
-   pkt->nb_segs= 1;
-   pkt->pkt_len= pkt_size;
-   pkt->ol_flags   = ol_flags;
-   pkt->vlan_macip.f.vlan_tci  = vlan_tci;
-   pkt->vlan_macip.f.l2_len= sizeof(struct ether_hdr);
-   pkt->vlan_macip.f.l3_len= sizeof(struct ipv4_hdr);
-   pkts_burst[nb_pkt]  = pkt;
+   pkt->nb_segs= 1;
+   pkt->pkt_len= pkt_size;
+   pkt->ol_flags   = ol_flags;
+   pkt->vlan_tci   = vlan_tci;
+   pkt->l2_len = sizeof(struct ether_hdr);
+   pkt->l3_len = sizeof(struct ipv4_hdr);
+   pkts_burst[nb_pkt]  = pkt;

next_flow = (next_flow + 1) % cfg_n_flows;
}
diff --git a/app/test-p

[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Stephen Hemminger
Porting Linux stack to DPDK opens up a licensing can of worms.
Linux code is GPLv2, and DPDK code is BSD. Any combination of the two would
end up
being covered by the Linux GPLv2 license.

On Mon, Sep 8, 2014 at 11:30 PM, Vadim Suraev 
wrote:

> I've ported the Linux kernel TCP/IP stack to user space and integrated with
> DPDK,  the source and documentation and the roadmap will be published (and
> announced) within few days.
> Regards,
> Vadim
> On Sep 9, 2014 9:20 AM, "Matthew Hall"  wrote:
>
> > On Tue, Sep 09, 2014 at 08:49:44AM +0800, zimeiw wrote:
> > > I have porting major FreeBSD tcp/ip stack to dpdk. new tcp/ip stack is
> > based
> > > on dpdk rte_mbuf, rte_ring, rte_memory and rte_table. it is faster to
> > > forwarding packets.
> >
> > Hello,
> >
> > This is awesome work to be doing and badly needed to use DPDK for any L4
> > purposes where it is very limited. I'll be following your progress.
> >
> > You didn't mention your name, and compare your work with
> > https://github.com/rumpkernel/dpdk-rumptcpip/ , and talk about behavior
> /
> > performance, and how long you think it'll take. I'm curious if you can
> give
> > some more comments.
> >
> > I'm implementing an RX-side very basic stack myself... but I'm not using
> > BSD
> > standard APIs or doing TX-side like yours will have.
> >
> > Matthew.
> >
>


[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Jim Thompson

> On Sep 9, 2014, at 5:16 AM, Alexander Nasonov  wrote:
> 
> Matthew Hall wrote:
>> However despite this issue, there are some cases where the Linux stack is 
>> greatly superior to the BSD one although normally the opposite is the 
>> case... 
>> AF_NETLINK for configuring 10,000+ IP addresses, especially for L4-L7 
>> performance testing, would be one possible example of this. Another 
>> potential 
>> example would be the BPF JIT compiler if you want to combine BPF filters 
>> with 
>> DPDK (something I'm doing right now in my own code actually).
> 
> BPF JIT is available in NetBSD too. It should be quite straightforward to
> enable it in the rump-dpdk kernel.


BPF JIT, or even pflua[1] should be straight-forward to put on top of DPDK.  
(It's straight-forward to do on top of netmap.)

jim

[1] https://github.com/Igalia/pflua-bench


[dpdk-dev] [PATCH 03/13] mbuf: add packet_type field

2014-09-09 Thread Jim Thompson

> On Sep 8, 2014, at 4:17 AM, Olivier MATZ  wrote:
> 
> Hi Yerden,
> 
> On 09/08/2014 12:33 PM, Yerden Zhumabekov wrote:
>> 08.09.2014 16:17, Olivier MATZ wrote:
>>>> --- a/lib/librte_mbuf/rte_mbuf.h
>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
>>>> @@ -146,7 +146,7 @@ struct rte_mbuf {
>>>>   uint32_t reserved1; /**< Unused field. Required for padding */
>>>>
>>>>   /* remaining bytes are set on RX when pulling packet from descriptor */
>>>> - uint16_t reserved2; /**< Unused field. Required for padding */
>>>> + uint16_t packet_type;   /**< Type of packet, e.g. protocols used */
>>>>   uint16_t data_len;  /**< Amount of data in segment buffer. */
>>>>   uint32_t pkt_len;   /**< Total pkt len: sum of all segments. */
>>>>   uint16_t l3_len:9;  /**< L3 (IP) Header Length. */
>>>>
>>> This patch adds a new field that nobody uses. So why should we add it?
>> 
>> I would use it :)
>> It's useful to store the IP protocol number (UDP, TCP etc) and version
>> of IP (4, 6) and then relay packet to specific handler.
> 
> I'm not saying this field is useless. But even if it's useful
> for some applications like yours, it does not mean that it should go in
> the generic mbuf structure.
> 
> Also, for a new field, we should define who is in charge of filling it.
> Is it the driver? Does it mean that all drivers have to be modified to
> fill it? Or is it just a placeholder for applications? In this case,
> shouldn't we use application-specific metadata? In the other direction
> (TX), we would also need to define if this field must be filled by the
> application before transmitting a mbuf to a driver.


Funny, but these new fields (and extended mbuf) were prominent during the dpdk 
summit.

I think it's going to be quite useful.
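
Yerden's use case quoted above (store the IP version and the L4 protocol
number, then relay the packet to a specific handler) could look roughly like
the sketch below. The bit layout, the sketch_* names and the handler enum are
assumptions made for illustration only; the patch under discussion just adds a
16-bit packet_type field and does not define this encoding.

```c
#include <assert.h>
#include <stdint.h>

/* IANA protocol numbers for the two L4 protocols named in the thread. */
enum { SKETCH_PROTO_TCP = 6, SKETCH_PROTO_UDP = 17 };

/* Hypothetical handlers an application might dispatch to. */
enum sketch_handler {
    SKETCH_H_TCP4, SKETCH_H_UDP4,
    SKETCH_H_TCP6, SKETCH_H_UDP6,
    SKETCH_H_OTHER
};

/* Assumed layout (not DPDK's): high byte = IP version, low byte = L4 protocol. */
static inline uint16_t sketch_ptype_make(uint8_t ip_version, uint8_t l4_proto)
{
    assert(ip_version == 4 || ip_version == 6);
    return (uint16_t)((ip_version << 8) | l4_proto);
}

static inline uint8_t sketch_ptype_ip_version(uint16_t ptype)
{
    return (uint8_t)(ptype >> 8);
}

static inline uint8_t sketch_ptype_l4_proto(uint16_t ptype)
{
    return (uint8_t)(ptype & 0xff);
}

/* Pick a handler from the 16-bit value alone, without re-parsing headers. */
static inline enum sketch_handler sketch_pick_handler(uint16_t ptype)
{
    switch (ptype) {
    case (4 << 8) | SKETCH_PROTO_TCP: return SKETCH_H_TCP4;
    case (4 << 8) | SKETCH_PROTO_UDP: return SKETCH_H_UDP4;
    case (6 << 8) | SKETCH_PROTO_TCP: return SKETCH_H_TCP6;
    case (6 << 8) | SKETCH_PROTO_UDP: return SKETCH_H_UDP6;
    default:                          return SKETCH_H_OTHER;
    }
}
```

Whoever parses the headers would call sketch_ptype_make() once per packet and
store the result in the 16-bit field; later stages switch on it. Whether that
"whoever" is the driver or the application is exactly the open question
Olivier raises.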



[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Vadim Suraev
IMHO the licensing worms only prevent selling the source code; the port
may still be useful.
On Sep 9, 2014 5:54 PM, "Stephen Hemminger" 
wrote:

> Porting Linux stack to DPDK opens up a licensing can of worms.
> Linux code is GPLv2, and DPDK code is BSD. Any combination of the two
> would end up
> being covered by the Linux GPLv2 license.
>
> On Mon, Sep 8, 2014 at 11:30 PM, Vadim Suraev 
> wrote:
>
>> I've ported the Linux kernel TCP/IP stack to user space and integrated
>> with
>> DPDK,  the source and documentation and the roadmap will be published (and
>> announced) within few days.
>> Regards,
>> Vadim
>> On Sep 9, 2014 9:20 AM, "Matthew Hall"  wrote:
>>
>> > On Tue, Sep 09, 2014 at 08:49:44AM +0800, zimeiw wrote:
>> > > I have porting major FreeBSD tcp/ip stack to dpdk. new tcp/ip stack is
>> > based
>> > > on dpdk rte_mbuf, rte_ring, rte_memory and rte_table. it is faster to
>> > > forwarding packets.
>> >
>> > Hello,
>> >
>> > This is awesome work to be doing and badly needed to use DPDK for any L4
>> > purposes where it is very limited. I'll be following your progress.
>> >
>> > You didn't mention your name, and compare your work with
>> > https://github.com/rumpkernel/dpdk-rumptcpip/ , and talk about
>> behavior /
>> > performance, and how long you think it'll take. I'm curious if you can
>> give
>> > some more comments.
>> >
>> > I'm implementing an RX-side very basic stack myself... but I'm not using
>> > BSD
>> > standard APIs or doing TX-side like yours will have.
>> >
>> > Matthew.
>> >
>>
>
>


[dpdk-dev] [PATCH v3 6/6] mbuf: flatten struct vlan_macip into mbuf struct

2014-09-09 Thread De Lara Guarch, Pablo


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> Sent: Tuesday, September 09, 2014 3:41 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 6/6] mbuf: flatten struct vlan_macip into
> mbuf struct
> 
> The vlan_macip structure combined a vlan tag id with l2 and l3 headers
> lengths for tracking offloads. However, this structure was only used as
> a unit by the e1000 and ixgbe drivers, not generally.
> 
> This patch removes the structure from the mbuf header and places the
> fields into the mbuf structure directly at the required point, without
> any net effect on the structure layout. This allows us to treat the vlan
> tags and header length fields as separate for future mbuf changes. The
> drivers which were written to use the combined structure still do so,
> using a driver-local definition of it.
> 
> Changes in V2:
> * None
> 
> Changes in V3:
> * minor comment cleanup following review by Olivier
> * reduce perf regression caused by splitting vlan_macip field. This is
>   done by providing a single uint16_t value to allow writing/clearing
>   the l2 and l3 lengths together. There is still a small perf hit to the
>   slow path TX due to the reads from vlan_tci and l2/l3 lengths being
>   separated. (<5% in my tests with testpmd with no extra params).
>   Unfortunately, this cannot be eliminated, without restoring the vlan
>   tags and l2/l3 lengths as a combined 32-bit field. This would prevent
>   us from ever looking to move those fields about and is an artificial tie
>   that applies only for performance in igb and ixgbe drivers. Therefore,
>   this patch keeps the vlan_tci field separate from the lengths as the
>   best solution going forward.
> 
> Signed-off-by: Bruce Richardson 
> V2 Acked-by: Olivier Matz 

V3 Acked-by: Pablo de Lara 
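
The V3 note above about "providing a single uint16_t value to allow
writing/clearing the l2 and l3 lengths together" can be pictured with a union
like the one below. This is a minimal sketch, not the patch itself: the field
names vlan_tci, l2_len and l3_len come from the thread, the 9-bit width of
l3_len comes from the quoted rte_mbuf.h, while the struct name, the 7-bit
l2_len width, the l2_l3_len name and the helpers are assumptions. It relies on
C11 anonymous structs/unions and on compiler support for uint16_t bit-fields
(as in GCC and Clang).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant slice of the mbuf. */
struct sketch_mbuf_lens {
    uint16_t vlan_tci;          /* kept separate from the lengths */
    union {
        uint16_t l2_l3_len;     /* both lengths viewed as one 16-bit value */
        struct {
            uint16_t l3_len:9;  /* L3 (IP) header length */
            uint16_t l2_len:7;  /* L2 (Ethernet) header length, width assumed */
        };
    };
};

/* Clear both lengths with a single 16-bit store. */
static inline void sketch_clear_lens(struct sketch_mbuf_lens *m)
{
    m->l2_l3_len = 0;
}

/* Set the lengths individually; values must fit their bit-fields. */
static inline void sketch_set_lens(struct sketch_mbuf_lens *m,
                                   uint16_t l2, uint16_t l3)
{
    assert(l2 < (1u << 7) && l3 < (1u << 9));
    m->l2_len = l2;
    m->l3_len = l3;
}
```

The point of the combined view is that a driver can reset or copy both lengths
in one access, while vlan_tci stays a separate field that can later move
elsewhere in the structure.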



[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Jim Thompson
Then you don't understand licensing.

The GPL has a requirement that you make one of two offers:

The fourth section for version 2 of the license and the seventh section of 
version 3 require that programs distributed as pre-compiled binaries are 
accompanied by a copy of the source code, or a written offer *valid for any 
third party* to obtain the source code via the same mechanism as the 
pre-compiled binary.

You can't sell the source, you have to make it available, either with the 
binary, or to anyone who asks.

There are other terms and conditions with the GPL (patent licenses, etc.)

Jim

> On Sep 9, 2014, at 8:19 AM, Vadim Suraev  wrote:
> 
> IMHO the licensing worms only prevent selling the source code; the port
> may still be useful.
> On Sep 9, 2014 5:54 PM, "Stephen Hemminger" 
> wrote:
> 
>> Porting Linux stack to DPDK opens up a licensing can of worms.
>> Linux code is GPLv2, and DPDK code is BSD. Any combination of the two
>> would end up
>> being covered by the Linux GPLv2 license.
>> 
>> On Mon, Sep 8, 2014 at 11:30 PM, Vadim Suraev 
>> wrote:
>> 
>>> I've ported the Linux kernel TCP/IP stack to user space and integrated
>>> with
>>> DPDK,  the source and documentation and the roadmap will be published (and
>>> announced) within few days.
>>> Regards,
>>> Vadim
>>> On Sep 9, 2014 9:20 AM, "Matthew Hall"  wrote:
>>> 
>>>> On Tue, Sep 09, 2014 at 08:49:44AM +0800, zimeiw wrote:
>>>>> I have porting major FreeBSD tcp/ip stack to dpdk. new tcp/ip stack is
>>>> based
>>>>> on dpdk rte_mbuf, rte_ring, rte_memory and rte_table. it is faster to
>>>>> forwarding packets.
>>>>
>>>> Hello,
>>>>
>>>> This is awesome work to be doing and badly needed to use DPDK for any L4
>>>> purposes where it is very limited. I'll be following your progress.
>>>>
>>>> You didn't mention your name, and compare your work with
>>>> https://github.com/rumpkernel/dpdk-rumptcpip/ , and talk about
>>> behavior /
>>>> performance, and how long you think it'll take. I'm curious if you can
>>> give
>>>> some more comments.
>>>>
>>>> I'm implementing an RX-side very basic stack myself... but I'm not using
>>>> BSD
>>>> standard APIs or doing TX-side like yours will have.
>>>>
>>>> Matthew.
>>>>
>>> 
>> 
>> 



[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Vadim Suraev
#You can't sell the source, you have to make it available, either with the
binary, or to anyone who asks#
But I didn't say I want to sell it, and I am opening all the source.
On Sep 9, 2014 6:26 PM, "Jim Thompson"  wrote:

> Then you don't understand licensing.
>
> The GPL has a requirement that you make one of two offers:
>
> The fourth section for version 2 of the license and the seventh section of
> version 3 require that programs distributed as pre-compiled binaries are
> accompanied by a copy of the source code, or a written offer *valid for any
> third party* to obtain the source code via the same mechanism as the
> pre-compiled binary.
>
> You can't sell the source, you have to make it available, either with the
> binary, or to anyone who asks.
>
> There are other terms and conditions with the GPL (patent licenses, etc.)
>
> Jim
>
> On Sep 9, 2014, at 8:19 AM, Vadim Suraev  wrote:
>
> IMHO the licensing worms only prevent selling the source code; the port
> may still be useful.
> On Sep 9, 2014 5:54 PM, "Stephen Hemminger" 
> wrote:
>
> Porting Linux stack to DPDK opens up a licensing can of worms.
> Linux code is GPLv2, and DPDK code is BSD. Any combination of the two
> would end up
> being covered by the Linux GPLv2 license.
>
> On Mon, Sep 8, 2014 at 11:30 PM, Vadim Suraev 
> wrote:
>
> I've ported the Linux kernel TCP/IP stack to user space and integrated
> with
> DPDK,  the source and documentation and the roadmap will be published (and
> announced) within few days.
> Regards,
> Vadim
> On Sep 9, 2014 9:20 AM, "Matthew Hall"  wrote:
>
> On Tue, Sep 09, 2014 at 08:49:44AM +0800, zimeiw wrote:
>
> I have porting major FreeBSD tcp/ip stack to dpdk. new tcp/ip stack is
>
> based
>
> on dpdk rte_mbuf, rte_ring, rte_memory and rte_table. it is faster to
> forwarding packets.
>
>
> Hello,
>
> This is awesome work to be doing and badly needed to use DPDK for any L4
> purposes where it is very limited. I'll be following your progress.
>
> You didn't mention your name, and compare your work with
> https://github.com/rumpkernel/dpdk-rumptcpip/ , and talk about
>
> behavior /
>
> performance, and how long you think it'll take. I'm curious if you can
>
> give
>
> some more comments.
>
> I'm implementing an RX-side very basic stack myself... but I'm not using
> BSD
> standard APIs or doing TX-side like yours will have.
>
> Matthew.
>
>
>
>
>
>


[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Jeff Shaw
On Tue, Sep 09, 2014 at 08:49:44AM +0800, zimeiw wrote:
> hi,
> 
> 
> I  have porting major FreeBSD tcp/ip stack to dpdk. new tcp/ip stack is based 
> on dpdk rte_mbuf, rte_ring, rte_memory and rte_table. it is faster to 
> forwarding packets.
> 
> Below feature are ready:
> 
> Netdp initialize
> Ether layer
> ARP
> IP layer
> Routing
> ICMP
> Commands for adding, deleting, showing IP address
> Commands for adding, deleting, showing static route
> Next planning:
> Porting udp to netdp.
> 
> Porting tcp to netdp.
> Porting socket to netdp.
> 
> 
> You can find the code from the link: https://github.com/dpdk-net/netdp
> 
> 
> 
Hi zimeiw, when will you be posting the source code to github? I can only find 
a static lib and some header files.
Thanks,
Jeff


[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Alexander Nasonov
Jim Thompson wrote:
> BPF JIT, or even pflua[1] should be straight-forward to put on top of DPDK.  
> (It's straight-forward to do on top of netmap.)
> 
> jim
> 
> [1] https://github.com/Igalia/pflua-bench

Glad to see LuaJIT here. I hope DPDK will eventually add support for
LuaJIT.

Alex


[dpdk-dev] Defaults for rte_hash

2014-09-09 Thread Matthew Hall
On Tue, Sep 09, 2014 at 11:42:40AM +, De Lara Guarch, Pablo wrote:
> That 4 is not shifted, so it is actually 4 entries/bucket. Actually, the 
> maximum number of entries you can use is 16, as a bucket will be as big as a 
> cache line. However, regardless of the number of entries, memory size will 
> remain the same, but using 4 entries/bucket with a 16-byte key, all keys 
> stored for a bucket will fit in a cache line, so performance looks to be 
> better in this case (although a non-optimal hash function could make it 
> impossible to store all keys, as the chances of filling a bucket are higher). 
> Anyway, for this example, 4 entries/bucket looks like a good number to me.

So, a general purpose hash usually has some kind of conflict resolution when a 
bucket is full rather than just tossing out entries. It could be open 
addressing, chaining, secondary hashing, etc.

If I'm putting security indicators into a bucket and the buckets just toss 
stuff out without warning that's a security problem. Same thing could be true 
for firewall tables.

Also, if we're assuming a 16-byte key, what happens when I want to do matching 
against www.badness.com or www.this-is-a-really-long-malware-domain.net ?

Did anybody have a performant general purpose hash table for DPDK that doesn't 
have problems with bigger keys or depth issues in a bucket?

Matthew.
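
To make the trade-off in this exchange concrete, here is a toy fixed-bucket
table in C. It is not the rte_hash API or its internal layout; every sketch_*
name and constant is hypothetical. It only illustrates the two points raised:
4 entries x 16-byte keys = 64 bytes of key storage per bucket (the size of a
typical x86 cache line), and an insert into a full bucket has to report
failure to the caller because there is no chaining or probing to fall back on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_KEY_LEN            16
#define SKETCH_ENTRIES_PER_BUCKET 4
#define SKETCH_NUM_BUCKETS        8   /* must be a power of two */

struct sketch_bucket {
    uint8_t used[SKETCH_ENTRIES_PER_BUCKET];
    /* 4 entries * 16 bytes = 64 bytes of keys per bucket */
    uint8_t keys[SKETCH_ENTRIES_PER_BUCKET][SKETCH_KEY_LEN];
};

struct sketch_table {
    struct sketch_bucket buckets[SKETCH_NUM_BUCKETS];
};

/* FNV-1a over the fixed-size key; any reasonable hash would do here. */
static uint32_t sketch_hash(const uint8_t *key)
{
    uint32_t h = 2166136261u;
    for (int i = 0; i < SKETCH_KEY_LEN; i++) {
        h ^= key[i];
        h *= 16777619u;
    }
    return h;
}

/* Returns the slot index (0..3) of the new or existing key,
 * or -1 when the bucket is full: the caller must handle that case. */
static int sketch_add(struct sketch_table *t, const uint8_t *key)
{
    assert(t != NULL && key != NULL);
    struct sketch_bucket *b =
        &t->buckets[sketch_hash(key) & (SKETCH_NUM_BUCKETS - 1)];
    int free_slot = -1;

    for (int i = 0; i < SKETCH_ENTRIES_PER_BUCKET; i++) {
        if (b->used[i]) {
            if (memcmp(b->keys[i], key, SKETCH_KEY_LEN) == 0)
                return i;               /* already present */
        } else if (free_slot < 0) {
            free_slot = i;
        }
    }
    if (free_slot < 0)
        return -1;                      /* bucket full, nothing is evicted */
    memcpy(b->keys[free_slot], key, SKETCH_KEY_LEN);
    b->used[free_slot] = 1;
    return free_slot;
}

/* Zero-pad a string into a fixed 16-byte key; -1 if it does not fit. */
static int sketch_key_from_str(uint8_t *key, const char *s)
{
    size_t len = strlen(s);
    if (len > SKETCH_KEY_LEN)
        return -1;
    memset(key, 0, SKETCH_KEY_LEN);
    memcpy(key, s, len);
    return 0;
}
```

In this sketch "www.badness.com" (15 bytes) fits a 16-byte key, while
"www.this-is-a-really-long-malware-domain.net" does not, so longer names would
need either a larger key size or a fixed-size digest of the name used as the
key. A security-sensitive caller would treat the -1 from sketch_add() as an
error to log or escalate rather than ignore.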


[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Matthew Hall
On Tue, Sep 09, 2014 at 07:54:19AM -0700, Stephen Hemminger wrote:
> Porting Linux stack to DPDK opens up a licensing can of worms.
> Linux code is GPLv2, and DPDK code is BSD. Any combination of the two would
> end up
> being covered by the Linux GPLv2 license.

It would be a can of worms for a closed-source app. Which is why some others 
have used the BSD stack. But it doesn't mean it isn't useful code.

Matthew.


[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Matthew Hall
On Tue, Sep 09, 2014 at 08:00:32AM -0700, Jim Thompson wrote:
> BPF JIT, or even pflua[1] should be straight-forward to put on top of DPDK.  
> (It's straight-forward to do on top of netmap.)
> 
> jim

The pflua guys made a user-space copy of Linux BPF JIT. I'm planning to use 
that because it was almost as fast as pflua with a lot fewer usage headaches 
and dependencies.

I'm making an MIT licensed app... so it isn't an issue for me personally if 
there is some GPL2 Linux code present. I don't think anybody made a non-rump 
version of the BSD one yet or I'd use that... I'm trying not to stray too far 
from the app's original purposes until it has some working features present.

Until that time comes, I just started out with libpcap offline mode BPF for 
development purposes because it's standard and already available, and allows 
operations upon raw packet pointers with no issues at all.

Matthew.


[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Alexander Nasonov
Matthew Hall wrote:
> The pflua guys made a user-space copy of Linux BPF JIT. I'm planning to use 
> that because it was almost as fast as pflua with a lot fewer usage headaches 
> and dependencies.

Ah, I see.

> I'm making an MIT licensed app... so it isn't an issue for me personally if 
> there is some GPL2 Linux code present. I don't think anybody made a non-rump 
> version of the BSD one yet or I'd use that... I'm trying not to stray too far 
> from the app's original purposes until it has some working features present.

sys/net/bpfjit.c in NetBSD should be very easy to adapt to Linux.
I was often testing it on Linux in userspace (without mbuf support).
At the moment, I'm only allowed to work on some NetBSD projects and
I can't adapt bpfjit to anything outside of NetBSD but when I last
compiled bpfjit on Linux, it took me about a minute to fix compilation.

Please try github version, it's not up-to-date but it worked on Linux:

https://github.com/alnsn/bpfjit

Alex


[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Matthew Hall
On Tue, Sep 09, 2014 at 10:30:01PM +0100, Alexander Nasonov wrote:
> sys/net/bpfjit.c in NetBSD should be very easy to adapt to Linux.
> I was often testing it on Linux in userspace (without mbuf support).
> At the moment, I'm only allowed to work on some NetBSD projects and
> I can't adapt bpfjit to anything outside of NetBSD but when I last
> compiled bpfjit on Linux, it took me about a minute to fix compilation.
> 
> Please try github version, it's not up-to-date but it worked on Linux:
> 
> https://github.com/alnsn/bpfjit
> 
> Alex

Alex,

You rock, thanks for supplying this, I'll be sure to use it along with 
upstream changes from BSD to get a friendlier license for users of my code, 
whoever they might eventually be.

If I forked this from you and updated it to the latest code periodically for 
performance, security, and features, would you accept the pull requests?

Matthew.


[dpdk-dev] TCP/IP stack for DPDK

2014-09-09 Thread Alexander Nasonov
Matthew Hall wrote:
> Alex,
> 
> You rock, thanks for supplying this, I'll be sure to use it along with 
> upstream changes from BSD to get a friendlier license for users of my code, 
> whoever they might eventually be.
> 
> If I forked this from you and updated it to the latest code periodically for 
> performance, security, and features, would you accept the pull requests?

I think it shouldn't be a problem.

PS my stuff depends on sljit which is also BSD-licensed.

Alex