[dpdk-dev] [PATCH 2/7] rte_sched: use reserved field to allow more VLAN's

2015-02-03 Thread Ananyev, Konstantin


> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Monday, February 02, 2015 10:32 PM
> To: Ananyev, Konstantin
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: Re: [dpdk-dev] [PATCH 2/7] rte_sched: use reserved field to allow 
> more VLAN's
> 
> On Mon, 2 Feb 2015 14:21:58 +
> "Ananyev, Konstantin"  wrote:
> 
> > Hi Stephen,
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen Hemminger
> > > Sent: Sunday, February 01, 2015 10:04 AM
> > > To: dev at dpdk.org
> > > Cc: Stephen Hemminger
> > > Subject: [dpdk-dev] [PATCH 2/7] rte_sched: use reserved field to allow 
> > > more VLAN's
> > >
> > > From: Stephen Hemminger 
> > >
> > > The QoS subport is limited to 8 bits in original code.
> > > But customers demanded ability to support full number of VLAN's (4096)
> > > therefore use reserved field of mbuf for this field instead
> > > of packing inside other classify portions.
> > >
> > > Signed-off-by: Stephen Hemminger 
> > > ---
> > >  lib/librte_mbuf/rte_mbuf.h   |  2 +-
> > >  lib/librte_sched/rte_sched.h | 31 ---
> > >  2 files changed, 21 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > index 16059c6..b6b08f4 100644
> > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > @@ -242,7 +242,7 @@ struct rte_mbuf {
> > >   uint16_t data_len;/**< Amount of data in segment buffer. */
> > >   uint32_t pkt_len; /**< Total pkt len: sum of all segments. */
> > >   uint16_t vlan_tci;/**< VLAN Tag Control Identifier (CPU order) 
> > > */
> > > - uint16_t reserved;
> > > + uint16_t subport; /**< SCHED Subport ID */
> >
> > As I remember, we keep these 2 reserved bytes for RX double VLAN tag
> > offload.
> > So it is probably not a good idea to use them for something that is
> > rte_sched specific.
> > If you really need extra space for rte_sched fields inside the mbuf,
> > can't you move them into the second cache line?
> > Or maybe you can use userdata, either to store the sched information
> > directly or as a pointer to some external memory location?
> > Another possibility: union mbuf.hash is 64 bits now, while sched uses
> > only 32 bits.
> > So maybe you can rearrange it to make sched 64 bits too?
> > Something like:
> >
> > union {
> >     uint32_t rss;     /**< RSS hash result if RSS enabled */
> >     struct {
> >         union {
> >             struct {
> >                 uint16_t hash;
> >                 uint16_t id;
> >             };
> >             uint32_t lo;
> >             /**< Second 4 flexible bytes */
> >         };
> >         uint32_t hi;
> >         /**< First 4 flexible bytes or FD ID, dependent on
> >              PKT_RX_FDIR_* flag in ol_flags. */
> >     } fdir;           /**< Filter identifier if FDIR enabled */
> > -   uint32_t sched;   /**< Hierarchical scheduler */
> > +   uint64_t sched;   /**< Hierarchical scheduler */
> >     uint32_t usr;     /**< User defined tags. See
> >                            @rte_distributor_process */
> > } hash;               /**< hash information */
> 
> Increasing the size of that union totally breaks other alignment and is a
> non-starter.

struct fdir is already 64 bits wide.
Though yes, we can't use uint64_t directly, as it would break alignment.
I totally forgot about that.
But nothing prevents you from doing:

struct { uint32_t lo, hi;} sched;

 right?
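
A minimal C sketch of the alignment point above. The union and struct names are hypothetical stand-ins for the mbuf hash union, not the DPDK definitions. It contrasts a plain uint64_t member, which may raise the union's alignment requirement, with a pair of 32-bit halves, which keeps the alignment of uint32_t at the same 8-byte size.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 64-bit fdir member of mbuf.hash. */
struct fdir_sketch {
    uint32_t lo;
    uint32_t hi;
};

/* Variant A: sched as a plain uint64_t; alignment may rise to 8 bytes. */
union hash_u64_sketch {
    uint32_t rss;
    struct fdir_sketch fdir;
    uint64_t sched;
    uint32_t usr;
};

/* Variant B: sched as two 32-bit halves; alignment stays that of uint32_t. */
union hash_split_sketch {
    uint32_t rss;
    struct fdir_sketch fdir;
    struct { uint32_t lo, hi; } sched;
    uint32_t usr;
};

/* Store a 64-bit scheduler value into the two halves. */
static inline void sched_set64(union hash_split_sketch *h, uint64_t v)
{
    h->sched.lo = (uint32_t)v;
    h->sched.hi = (uint32_t)(v >> 32);
}

/* Rebuild the 64-bit scheduler value from the two halves. */
static inline uint64_t sched_get64(const union hash_split_sketch *h)
{
    return ((uint64_t)h->sched.hi << 32) | h->sched.lo;
}
```

Under these assumptions the split variant still carries 64 bits of scheduler data, at the cost of two 32-bit accesses instead of one 64-bit access.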

> 
> The reserved field is not used in upstream merged code and therefore is fair game.

As you can see, that reserved field lies inside the first 16B starting at
rx_descriptor_fields1.
So hopefully we will be able to load it from the RX descriptor in one SSE
load/store, together with the other RXD fields.
Anyway, these 16B are supposed to contain fields that are filled in from the
RXD (as the name suggests).

> First to claim it wins.

Wins what?
Sorry, but you can't pollute mbuf structure with whatever you like.
So NACK for now.

Konstantin




[dpdk-dev] mmap failed: Cannot allocate memory when init dpdk eal

2015-02-03 Thread Zhang, Jerry
Hi,

   Please provide the environment info, such as kernel version and DPDK version,
and detailed steps to reproduce.

   Thanks!

>-----Original Message-----
>From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of zhangsha (A)
>Sent: Friday, January 30, 2015 7:40 PM
>To: dev at dpdk.org
>Subject: [dpdk-dev] mmap failed: Cannot allocate memory when init dpdk eal
>
>Hi all,
>
>I am hitting the following mmap failure when initializing the DPDK EAL.
>
>Fri Jan 30 09:03:29 2015:EAL: Setting up memory...
>Fri Jan 30 09:03:34 2015:EAL: map_all_hugepages(): mmap failed: Cannot allocate memory
>Fri Jan 30 09:03:34 2015:EAL: Failed to mmap 2 MB hugepages
>Fri Jan 30 09:03:34 2015:EAL: Cannot init memory
>
>Before I ran the demo, the free hugepages on my host were:
>
>cat /proc/meminfo
>MemTotal:   132117056 kB
>MemFree:122040292 kB
>Buffers:   10984 kB
>Cached:   123056 kB
>SwapCached:0 kB
>Active:   120812 kB
>Inactive:  85860 kB
>Active(anon):  79488 kB
>Inactive(anon):  364 kB
>Active(file):  41324 kB
>Inactive(file):85496 kB
>Unevictable:   23576 kB
>Mlocked:   23576 kB
>SwapTotal: 0 kB
>SwapFree:  0 kB
>Dirty:  2576 kB
>Writeback: 0 kB
>AnonPages: 96236 kB
>Mapped:19936 kB
>Shmem:   552 kB
>Slab: 101344 kB
>SReclaimable:  24164 kB
>SUnreclaim:77180 kB
>KernelStack:2544 kB
>PageTables: 4180 kB
>NFS_Unstable:  0 kB
>Bounce:0 kB
>WritebackTmp:  0 kB
>CommitLimit:61864224 kB
>Committed_AS: 585844 kB
>VmallocTotal:   34359738367 kB
>VmallocUsed:  518656 kB
>VmallocChunk:   34292133264 kB
>HardwareCorrupted: 0 kB
>AnonHugePages:  4096 kB
>HugePages_Total:4096
>HugePages_Free: 4096
>HugePages_Rsvd:0
>HugePages_Surp:0
>Hugepagesize:   2048 kB
>DirectMap4k:   96256 kB
>DirectMap2M: 6178816 kB
>DirectMap1G:127926272 kB
>
>And after the demo executed, I got the hugepages like this:
>
>cat /proc/meminfo
>MemTotal:   132117056 kB
>MemFree:117325180 kB
>Buffers:   33508 kB
>Cached:   721912 kB
>SwapCached:0 kB
>Active:  4217712 kB
>Inactive: 540956 kB
>Active(anon):4019068 kB
>Inactive(anon):   121136 kB
>Active(file): 198644 kB
>Inactive(file):   419820 kB
>Unevictable:   23908 kB
>Mlocked:   23908 kB
>SwapTotal: 0 kB
>SwapFree:  0 kB
>Dirty:  2856 kB
>Writeback: 0 kB
>AnonPages:   4035184 kB
>Mapped:   160292 kB
>Shmem:122100 kB
>Slab: 177908 kB
>SReclaimable:  64808 kB
>SUnreclaim:   113100 kB
>KernelStack:7560 kB
>PageTables:62128 kB
>NFS_Unstable:  0 kB
>Bounce:0 kB
>WritebackTmp:  0 kB
>CommitLimit:61864224 kB
>Committed_AS:8789664 kB
>VmallocTotal:   34359738367 kB
>VmallocUsed:  527296 kB
>VmallocChunk:   34292122604 kB
>HardwareCorrupted: 0 kB
>AnonHugePages:262144 kB
>HugePages_Total:4096
>HugePages_Free: 2048
>HugePages_Rsvd:0
>HugePages_Surp:0
>Hugepagesize:   2048 kB
>DirectMap4k:  141312 kB
>DirectMap2M: 9279488 kB
>DirectMap1G:124780544 kB
>
>Only the hugepages belonging to node1 were mapped. Why am I told that the host
>(running a 64-bit OS) cannot allocate memory while node0 still has 2048 free
>hugepages?
>Has anyone encountered a similar problem?
>Any response will be appreciated!
>Thanks!
>
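
/proc/meminfo reports only system-wide hugepage totals, so it cannot show how the free pages are split between NUMA nodes. A minimal C sketch, assuming a Linux host that exposes the per-node sysfs counters under /sys/devices/system/node, reads the free hugepage count for one node. The helper names are hypothetical and not part of DPDK.

```c
#include <stdio.h>

/* Read one integer counter from a sysfs-style file; returns -1 on failure. */
long read_counter(const char *path)
{
    FILE *f = fopen(path, "r");
    long value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/*
 * Return the number of free hugepages of size page_kb (in kB) on the given
 * NUMA node, or -1 if the node or page size is not present on this host.
 */
long node_free_hugepages(int node, unsigned page_kb)
{
    char path[128];

    snprintf(path, sizeof(path),
             "/sys/devices/system/node/node%d/hugepages/hugepages-%ukB/free_hugepages",
             node, page_kb);
    return read_counter(path);
}
```

Calling node_free_hugepages(0, 2048) and node_free_hugepages(1, 2048) before and after the failing run would show which node the 2048 consumed pages came from.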



[dpdk-dev] [PATCH v6 12/13] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-03 Thread Tetsuya Mukawa
On 2015/02/02 15:22, Qiu, Michael wrote:
> On 2/2/2015 1:43 PM, Qiu, Michael wrote:
>> On 2/1/2015 12:02 PM, Tetsuya Mukawa wrote:
>>> These functions are used for attaching or detaching a port.
>>> When rte_eal_dev_attach() is called, the function tries to interpret the
>>> device name as a PCI address. If this succeeds,
>>> rte_eal_dev_attach() attaches a physical device port. If not, it attaches
>>> a virtual device port.
>>> When rte_eal_dev_detach() is called, the function gets the device type
>>> of this port to know whether the port came from a physical or a virtual
>>> device. Then the specific detaching function is called.
>>>
>>> v5:
>>> - Change function names like below.
>>>   rte_eal_dev_find_and_invoke() to rte_eal_vdev_find_and_invoke().
>>>   rte_eal_dev_invoke() to rte_eal_vdev_invoke().
>>> - Add code to handle a return value of rte_eal_devargs_remove().
>>> - Fix pci address format in rte_eal_dev_detach().
>>> v4:
>>> - Fix comment.
>>> - Add error checking.
>>> - Fix indent of 'if' statement.
>>> - Change function name.
>>>
> [...]
>
>>> +/* attach the new virtual device, then store port_id of the device */
>>> +static int
>>> +rte_eal_dev_attach_vdev(const char *vdevargs, uint8_t *port_id)
>>> +{
>>> +   char *args;
>>> +   uint8_t new_port_id;
>>> +   struct rte_eth_dev devs[RTE_MAX_ETHPORTS];
>>> +
>>> +   if ((vdevargs == NULL) || (port_id == NULL))
>>> +   goto err0;
>>> +
>>> +   args = strdup(vdevargs);
>>> +   if (args == NULL)
>>> +   goto err0;
>>> +
>>> +   /* save current port status */
>>> +   rte_eth_dev_save(devs);
>>> +   /* add the vdevargs to devargs_list */
>>> +   if (rte_eal_devargs_add(RTE_DEVTYPE_VIRTUAL, args))
>>> +   goto err1;
>>> +   /* parse vdevargs, then retrieve device name */
>>> +   get_vdev_name(args);
>>> +   /* walk around dev_driver_list to find the driver of the device,
>>> +* then invoke probe function o the driver */
>>> +   if (rte_eal_vdev_find_and_invoke(args, RTE_EAL_INVOKE_TYPE_PROBE))
>>> +   goto err2;
>>> +   /* get port_id enabled by above procedures */
>>> +   if (rte_eth_dev_get_changed_port(devs, &new_port_id))
>>> +   goto err2;
>>> +
>>> +   free(args);
>>> +   *port_id = new_port_id;
>>> +   return 0;
>>> +err2:
>>> +   rte_eal_devargs_remove(RTE_DEVTYPE_VIRTUAL, args);
>>> +err1:
>>> +   free(args);
>>> +err0:
>>> +   RTE_LOG(ERR, EAL, "Drver, cannot detach the device\n");
> Here "cannot detach the device\n" should be "cannot attach the device" I
> think.

Hi Michael,

Thanks, I will fix above error message.
Also I will fix my "Drver" typos.

Tetsuya

>> Here also "Drver",
>>
>>
>> Thanks,
>> Michael
>>> +   return -1;
>>> +}
>>> +
>>> +/* detach the new virtual device, then store the name of the device */
>>> +static int
>>> +rte_eal_dev_detach_vdev(uint8_t port_id, char *vdevname)
>>> +{
>>> +   char name[RTE_ETH_NAME_MAX_LEN];
>>> +
>>> +   if (vdevname == NULL)
>>> +   goto err;
>>> +
>>> +   /* check whether the driver supports detach feature, or not */
>>> +   if (rte_eth_dev_check_detachable(port_id))
>>> +   goto err;
>>> +
>>> +   /* get device name by port id */
>>> +   if (rte_eth_dev_get_name_by_port(port_id, name))
>>> +   goto err;
>>> +   /* walk around dev_driver_list to find the driver of the device,
>>> +* then invoke close function o the driver */
>>> +   if (rte_eal_vdev_find_and_invoke(name, RTE_EAL_INVOKE_TYPE_CLOSE))
>>> +   goto err;
>>> +   /* remove the vdevname from devargs_list */
>>> +   if (rte_eal_devargs_remove(RTE_DEVTYPE_VIRTUAL, name))
>>> +   goto err;
>>> +
>>> +   strncpy(vdevname, name, sizeof(name));
>>> +   return 0;
>>> +err:
>>> +   RTE_LOG(ERR, EAL, "Drver, cannot detach the device\n");
>>> +   return -1;
>>> +}
>>> +
>>> +/* attach the new device, then store port_id of the device */
>>> +int
>>> +rte_eal_dev_attach(const char *devargs, uint8_t *port_id)
>>> +{
>>> +   struct rte_pci_addr addr;
>>> +
>>> +   if ((devargs == NULL) || (port_id == NULL))
>>> +   return -EINVAL;
>>> +
>>> +   if (eal_parse_pci_DomBDF(devargs, &addr) == 0)
>>> +   return rte_eal_dev_attach_pdev(&addr, port_id);
>>> +   else
>>> +   return rte_eal_dev_attach_vdev(devargs, port_id);
>>> +}
>>> +
>>> +/* detach the device, then store the name of the device */
>>> +int
>>> +rte_eal_dev_detach(uint8_t port_id, char *name)
>>> +{
>>> +   struct rte_pci_addr addr;
>>> +   int ret;
>>> +
>>> +   if (name == NULL)
>>> +   return -EINVAL;
>>> +
>>> +   if (rte_eth_dev_get_device_type(port_id) == RTE_ETH_DEV_PHYSICAL) {
>>> +   ret = rte_eth_dev_get_addr_by_port(port_id, &addr);
>>> +   if (ret < 0)
>>> +   return ret;
>>> +
>>> +   ret = rte_eal_dev_detach_pdev(port_id, &addr);
>>> +   if (ret == 0)
>>> +   snprintf(name, RTE_ETH_NAME_MAX_LEN,
>>> +   "%04x:%02x:%02x.%d",
>>> +   addr.domai
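
The dispatch in rte_eal_dev_attach() above (treat the argument as a PCI address if it parses as one, otherwise as virtual device args) can be sketched in isolation. This is a simplified illustration with hypothetical names. It does not reproduce eal_parse_pci_DomBDF(), and it assumes the Domain:Bus:Device.Function text form implied by the "%04x:%02x:%02x.%d" format string in the patch.

```c
#include <stdio.h>

/* Hypothetical, simplified PCI address for illustration only. */
struct pci_addr_sketch {
    unsigned domain, bus, devid, function;
};

enum dev_kind_sketch { DEV_KIND_PCI, DEV_KIND_VIRTUAL };

/* Parse "DDDD:BB:DD.F"; returns 0 on success, -1 if the text does not match. */
int parse_dom_bdf_sketch(const char *s, struct pci_addr_sketch *out)
{
    unsigned d, b, v, f;
    int consumed = 0;

    if (s == NULL || out == NULL)
        return -1;
    if (sscanf(s, "%4x:%2x:%2x.%1u%n", &d, &b, &v, &f, &consumed) != 4)
        return -1;
    /* Reject trailing characters and out-of-range device/function numbers. */
    if (s[consumed] != '\0' || v > 31 || f > 7)
        return -1;

    out->domain = d;
    out->bus = b;
    out->devid = v;
    out->function = f;
    return 0;
}

/* Decide which attach path a devargs string would take. */
enum dev_kind_sketch classify_devargs(const char *devargs)
{
    struct pci_addr_sketch addr;

    if (parse_dom_bdf_sketch(devargs, &addr) == 0)
        return DEV_KIND_PCI;
    return DEV_KIND_VIRTUAL;
}
```

With this sketch, "0000:02:00.0" takes the PCI path and "eth_pcap0,iface=eth0" takes the virtual device path.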

[dpdk-dev] [PATCH v6] testpmd: Add port hotplug support

2015-02-03 Thread Tetsuya Mukawa
On 2015/02/02 20:33, Iremonger, Bernard wrote
>>  /*
>>   * Work-around of a compilation error with ICC on invocations of the
>>   * rte_be_to_cpu_16() function.
>> diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>> b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>> index 218835a..1cacbcf 100644
>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
> Hi Tetsuya,
>
> The doc changes should be in a separate commit using the "doc: explanation"
> commit line.
>
>> @@ -808,6 +808,63 @@ The following sections show functions for configuring 
>> ports.
>>
>>  Port configuration changes only become active when forwarding is 
>> started/restarted.
>>
>> +port attach
>> +~~~~~~~~~~~
>> +
>> +Attach a port specified by pci address or virtual device args.
>> +
>> +To attach a new pci device, the device should be recognized by kernel first.
>> +Then it should be moved under DPDK management.
>> +Finally the port can be attached to testpmd.
>> +On the other hand, to attach a port created by virtual device, above steps 
>> are not needed.
>> +
>> +port attach (identifier)
>> +
>> +For example, to attach a port that pci address is :02:00.0.
> Reword " port that pci address is "  to "port whose pci address is"

Hi Bernard,

Thanks, I will fix the comments below as you suggest.

Tetsuya

>> +
>> +.. code-block:: console
>> +
>> +testpmd> port attach :02:00.0
>> +Attaching a new port...
>> +... snip ...
>> +Port 0 is attached. Now total ports is 1
>> +Done
>> +
>> +For example, to attach a port created by pcap PMD.
>> +
>> +.. code-block:: console
>> +
>> +testpmd> port attach eth_pcap0,iface=eth0
>> +Attaching a new port...
>> +... snip ...
>> +Port 0 is attached. Now total ports is 1
>> +Done
>> +
>> +In this case, identifier is "eth_pcap0,iface=eth0".
>> +This identifier format is the same as "--vdev" format of DPDK applications.
>> +
>> +port detach
>> +~~~~~~~~~~~
>> +
>> +Detach a specific port.
>> +
>> +Before detaching a port, the port should be closed.
>> +Also to remove a pci device completely from the system, first detach the 
>> port from testpmd.
>> +Then the device should be moved under kernel management.
>> +Finally the device can be remove using kernel pci hotplug functionality.
> Reword "remove" to "removed"
>
>> +On the other hand, to remove a port created by virtual device, above steps 
>> are not needed.
> Reword " created by virtual device" to "created by a virtual device"
>
>> +
>> +port detach (port_id)
>> +
>> +For example, to detach a port 0.
>> +
>> +.. code-block:: console
>> +
>> +testpmd> port detach 0
>> +Detaching a port...
>> +... snip ...
>> +Done
>> +
>>  port start
>>  ~~~~~~~~~~
>>
>> --
>> 1.9.1
> Regards,
>
> Bernard.
>



[dpdk-dev] [PATCH v6] testpmd: Add port hotplug support

2015-02-03 Thread Tetsuya Mukawa
On 2015/02/02 20:57, Thomas Monjalon wrote:
> 2015-02-02 11:33, Iremonger, Bernard:
>> From: Tetsuya Mukawa [mailto:mukawa at igel.co.jp]
>>> --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>>> +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
>> Hi Tetsuya,
>>
>> The doc changes should be in a separate commit using the "doc: explanation"
>> commit line.
> I agree that new docs should be in a separate commit.
> Though when updating some code (like here), it's a good idea to update the doc
> in the same commit. Then the change is atomic.
>
> If you want to figure out which patches update the doc, it should be
> possible to filter on "+++ b/doc/".
>

Hi Thomas,

I appreciate your comment. I will follow this guideline.

Thanks,
Tetsuya


[dpdk-dev] site down?

2015-02-03 Thread Masaru Oki
I got the message below:

myhost:~/src/dpdk$ git pull
remote: error: inflate: data stream error (incorrect header check)
remote: error: corrupt loose object 'a09f3e4c50467512970519943d26d9c5753584e0'
remote: fatal: failed to read object
a09f3e4c50467512970519943d26d9c5753584e0: Operation not permitted
remote: aborting due to possible repository corruption on the remote side.
fatal: protocol error: bad pack header

Please run 'git fsck' on dpdk.org.
http://stackoverflow.com/questions/4170317/git-pull-error-remote-object-is-corrupted

thank you.

2015-02-03 6:55 GMT+09:00 Thomas Monjalon :
> 2015-02-02 20:10, Vipin Agrawal:
>> I've been trying to connect to download the 1.6 version.
>
> You should try to download a newer version :)
>
>> Does anybody have a status on dpdk.org?
>
> Yes it was down but now the problem seems to be fixed.
> We are going to investigate why the kernel has crashed.
> It may be due to a recent upgrade of the allocated resources.
>
> Sorry for the inconvenience
> --
> Thomas


[dpdk-dev] [PATCH v9 5/5] app/testpmd: add commands to support hash functions

2015-02-03 Thread Zhang, Helin
Hi Thomas

Yes, I agree with you. Documentation is needed. I will update it soon, together
with the other items I need to update. Thanks for the reminder!

Regards,
Helin

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, February 2, 2015 10:57 PM
> To: Zhang, Helin
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v9 5/5] app/testpmd: add commands to support
> hash functions
> 
> 2015-01-22 15:36, Helin Zhang:
> > To demonstrate the hash filter control, commands are added.
> > They are,
> > - get_sym_hash_ena_per_port
> > - set_sym_hash_ena_per_port
> > - get_hash_global_config
> > - set_hash_global_config
> >
> > Signed-off-by: Helin Zhang 
> > ---
> >  app/test-pmd/cmdline.c | 333
> > +
> >  1 file changed, 333 insertions(+)
> 
> The new testpmd functions should be documented in
> doc/guides/testpmd_app_ug/testpmd_funcs.rst.
> The patch won't be blocked waiting for documentation because it is an old
> patchset and the doc requirement is newer.
> Please, don't forget to submit a doc patch.
> --
> Thomas


[dpdk-dev] [PATCH v6 10/13] eal/pci: Cleanup pci driver initialization code

2015-02-03 Thread Qiu, Michael
On 2/1/2015 12:02 PM, Tetsuya Mukawa wrote:
> - Add rte_eal_pci_close_one_driver()
>   The function is used for closing the specified driver and device.
> - Add pci_invoke_all_drivers()
>   The function is based on pci_probe_all_drivers, but it can close
>   drivers as well as probe them.
> - Add pci_close_all_drivers()
>   The function tries to find a driver for the specified device, and
>   then closes the driver.
> - Add rte_eal_pci_probe_one() and rte_eal_pci_close_one()
>   The functions are used to probe and close a device.
>   First the function tries to find a device that has the specified
>   PCI address. Then it probes or closes the device.
>
> v5:
> - Remove RTE_EAL_INVOKE_TYPE_UNKNOWN, because it's unused.
> v4:
> - Fix parameter checking.
> - Fix indent of 'if' statement.
>
> Signed-off-by: Tetsuya Mukawa 
> ---
>  lib/librte_eal/common/eal_common_pci.c  | 90 
> +
>  lib/librte_eal/common/eal_private.h | 24 +
>  lib/librte_eal/common/include/rte_pci.h | 33 
>  lib/librte_eal/linuxapp/eal/eal_pci.c   | 69 +
>  4 files changed, 206 insertions(+), 10 deletions(-)
>
> diff --git a/lib/librte_eal/common/eal_common_pci.c 
> b/lib/librte_eal/common/eal_common_pci.c
> index a89f5c3..7c9b8c5 100644
> --- a/lib/librte_eal/common/eal_common_pci.c
> +++ b/lib/librte_eal/common/eal_common_pci.c
> @@ -99,19 +99,27 @@ static struct rte_devargs *pci_devargs_lookup(struct 
> rte_pci_device *dev)
>   return NULL;
>  }
>  
> -/*
> - * If vendor/device ID match, call the devinit() function of all
> - * registered driver for the given device. Return -1 if initialization
> - * failed, return 1 if no driver is found for this device.
> - */
>  static int
> -pci_probe_all_drivers(struct rte_pci_device *dev)
> +pci_invoke_all_drivers(struct rte_pci_device *dev,
> + enum rte_eal_invoke_type type)
>  {
>   struct rte_pci_driver *dr = NULL;
> - int rc;
> + int rc = 0;
> +
> + if ((dev == NULL) || (type >= RTE_EAL_INVOKE_TYPE_MAX))
> + return -1;
>  
>   TAILQ_FOREACH(dr, &pci_driver_list, next) {
> - rc = rte_eal_pci_probe_one_driver(dr, dev);
> + switch (type) {
> + case RTE_EAL_INVOKE_TYPE_PROBE:
> + rc = rte_eal_pci_probe_one_driver(dr, dev);
> + break;
> + case RTE_EAL_INVOKE_TYPE_CLOSE:
> + rc = rte_eal_pci_close_one_driver(dr, dev);
> + break;
> + default:
> + return -1;
> + }
>   if (rc < 0)
>   /* negative value is an error */
>   return -1;
> @@ -123,6 +131,66 @@ pci_probe_all_drivers(struct rte_pci_device *dev)
>   return 1;
>  }
>  
> +#ifdef ENABLE_HOTPLUG
> +static int
> +rte_eal_pci_invoke_one(struct rte_pci_addr *addr,
> + enum rte_eal_invoke_type type)
> +{
> + struct rte_pci_device *dev = NULL;
> + int ret = 0;
> +
> + if ((addr == NULL) || (type >= RTE_EAL_INVOKE_TYPE_MAX))
> + return -1;
> +
> + TAILQ_FOREACH(dev, &pci_device_list, next) {
> + if (eal_compare_pci_addr(&dev->addr, addr))
> + continue;
> +
> + ret = pci_invoke_all_drivers(dev, type);
> + if (ret < 0)
> + goto invoke_err_return;
> +
> + if (type == RTE_EAL_INVOKE_TYPE_CLOSE)
> + goto remove_dev;
> +
> + return 0;
> + }
> +
> + return -1;
> +
> +invoke_err_return:
> + RTE_LOG(WARNING, EAL, "Requested device " PCI_PRI_FMT
> + " cannot be used\n", dev->addr.domain, dev->addr.bus,
> + dev->addr.devid, dev->addr.function);
> + return -1;
> +
> +remove_dev:
> + TAILQ_REMOVE(&pci_device_list, dev, next);
> + return 0;
> +}
> +
> +
> +/*
> + * Find the pci device specified by pci address, then invoke probe function of
> + * the driver of the devive.
> + */
> +int
> +rte_eal_pci_probe_one(struct rte_pci_addr *addr)
> +{
> + return rte_eal_pci_invoke_one(addr, RTE_EAL_INVOKE_TYPE_PROBE);
> +}
> +
> +/*
> + * Find the pci device specified by pci address, then invoke close function of
> + * the driver of the devive.
> + */
> +int
> +rte_eal_pci_close_one(struct rte_pci_addr *addr)
> +{
> + return rte_eal_pci_invoke_one(addr, RTE_EAL_INVOKE_TYPE_CLOSE);
> +}
> +#endif /* ENABLE_HOTPLUG */
> +
>  /*
>   * Scan the content of the PCI bus, and call the devinit() function for
>   * all registered drivers that have a matching entry in its id_table
> @@ -148,10 +216,12 @@ rte_eal_pci_probe(void)
>  
>   /* probe all or only whitelisted devices */
>   if (probe_all)
> - ret = pci_probe_all_drivers(dev);
> + ret = pci_invoke_all_drivers(dev,
> + RTE_EAL_INVOKE_TYPE_PROBE);
>   

[dpdk-dev] [PATCH v6 07/13] ethdev: Add functions that will be used by port hotplug functions

2015-02-03 Thread Qiu, Michael
On 2/1/2015 12:02 PM, Tetsuya Mukawa wrote:
> The patch adds the following functions.
>
> - rte_eth_dev_save()
>   The function is used for saving the current rte_eth_dev structures.
> - rte_eth_dev_get_changed_port()
>   The function receives the saved rte_eth_dev structures, then compares
>   them with the current values to find which port was actually
>   attached or detached.
> - rte_eth_dev_get_addr_by_port()
>   The function returns the PCI address of an ethdev specified by port
>   identifier.
> - rte_eth_dev_get_port_by_addr()
>   The function returns the port identifier of an ethdev specified by
>   PCI address.
> - rte_eth_dev_get_name_by_port()
>   The function returns the unique identifier name of an ethdev
>   specified by port identifier.
> - rte_eth_dev_check_detachable()
>   The function returns whether a PMD supports the detach function.
>
> The patch also changes the scope of rte_eth_dev_allocated() to global,
> because virtual PMDs will call this function to support port hotplug.
>
> v5:
> - Fix return value of below functions.
>   rte_eth_dev_get_changed_port().
>   rte_eth_dev_get_port_by_addr().
> v4:
> - Add parameter checking.
> v3:
> - Fix if-condition bug while comparing pci addresses.
> - Add error checking codes.
> Reported-by: Mark Enright 
>
> Signed-off-by: Tetsuya Mukawa 
> ---
>  lib/librte_ether/rte_ethdev.c | 98 
> ++-
>  lib/librte_ether/rte_ethdev.h | 80 +++
>  2 files changed, 177 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 7bed901..5aded10 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -206,7 +206,7 @@ rte_eth_dev_data_alloc(void)
>   RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data));
>  }
>  
> -static struct rte_eth_dev *
> +struct rte_eth_dev *
>  rte_eth_dev_allocated(const char *name)
>  {
>   unsigned i;
> @@ -426,6 +426,102 @@ rte_eth_dev_count(void)
>   return (nb_ports);
>  }
>  
> +void
> +rte_eth_dev_save(struct rte_eth_dev *devs)
> +{
> + if (devs == NULL)
> + return;
> +
> + /* save current rte_eth_devices */
> + memcpy(devs, rte_eth_devices,
> + sizeof(struct rte_eth_dev) * RTE_MAX_ETHPORTS);
> +}
> +
> +int
> +rte_eth_dev_get_changed_port(struct rte_eth_dev *devs, uint8_t *port_id)
> +{
> + if ((devs == NULL) || (port_id == NULL))
> + return -EINVAL;
> +
> + /* check which port was attached or detached */
> + for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++, devs++) {
> + if (rte_eth_devices[*port_id].attached ^ devs->attached)
> + return 0;
> + }
> + return -ENODEV;
> +}
> +
> +int
> +rte_eth_dev_get_addr_by_port(uint8_t port_id, struct rte_pci_addr *addr)
> +{
> + if (rte_eth_dev_validate_port(port_id, TRACE) == DEV_INVALID)
> + return -EINVAL;
> +
> + if (addr == NULL) {
> + PMD_DEBUG_TRACE("Null pointer is specified\n");
> + return -EINVAL;
> + }
> +
> + *addr = rte_eth_devices[port_id].pci_dev->addr;
> + return 0;
> +}
> +
> +int
> +rte_eth_dev_get_port_by_addr(struct rte_pci_addr *addr, uint8_t *port_id)
> +{
> + struct rte_pci_addr *tmp;
> +
> + if ((addr == NULL) || (port_id == NULL)) {
> + PMD_DEBUG_TRACE("Null pointer is specified\n");
> + return -EINVAL;
> + }
> +
> + for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++) {
> + if (!rte_eth_devices[*port_id].attached)
> + continue;
> + if (!rte_eth_devices[*port_id].pci_dev)
> + continue;
> + tmp = &rte_eth_devices[*port_id].pci_dev->addr;
> + if (eal_compare_pci_addr(tmp, addr) == 0)
> + return 0;
> + }
> + return -ENODEV;
> +}
> +
> +int
> +rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
> +{
> + char *tmp;
> +
> + if (rte_eth_dev_validate_port(port_id, TRACE) == DEV_INVALID)
> + return -EINVAL;
> +
> + if (name == NULL) {
> + PMD_DEBUG_TRACE("Null pointer is specified\n");
> + return -EINVAL;
> + }
> +
> + /* shouldn't check 'rte_eth_devices[i].data',
> +  * because it might be overwritten by VDEV PMD */
> + tmp = rte_eth_dev_data[port_id].name;
> + strncpy(name, tmp, strlen(tmp) + 1);
> + return 0;
> +}
> +
> +int
> +rte_eth_dev_check_detachable(uint8_t port_id)
> +{
> + uint32_t drv_flags;
> +
> + if (port_id >= RTE_MAX_ETHPORTS) {
> + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> + return -EINVAL;
> + }
> +
> + drv_flags = rte_eth_devices[port_id].driver->pci_drv.drv_flags;

It would be better here to add pt_driver for pci_dev type ports.

Thanks,
Michael
> + return !(drv_flags & RTE_PCI_DRV_DETACHABL
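
The save-then-compare idea behind rte_eth_dev_save() and rte_eth_dev_get_changed_port() can be shown on a reduced model. The sketch below uses a hypothetical port table that holds only an attached flag. It is not the ethdev code, and unlike the patch it writes *port_id only when a changed port is found.

```c
#include <stdint.h>
#include <string.h>

#define MAX_PORTS_SKETCH 32

/* Hypothetical, reduced port record: only the attached flag is modelled. */
struct port_sketch {
    uint8_t attached;
};

/* Stand-in for the global device table. */
struct port_sketch ports_sketch[MAX_PORTS_SKETCH];

/* Take a snapshot of the whole table before an attach or detach attempt. */
void ports_save(struct port_sketch *snap)
{
    if (snap == NULL)
        return;
    memcpy(snap, ports_sketch, sizeof(ports_sketch));
}

/*
 * Compare the snapshot with the live table and report the first port whose
 * attached flag differs. Returns 0 and sets *port_id on success, -1 if no
 * port changed or an argument is NULL.
 */
int ports_find_changed(const struct port_sketch *snap, uint8_t *port_id)
{
    unsigned i;

    if (snap == NULL || port_id == NULL)
        return -1;

    for (i = 0; i < MAX_PORTS_SKETCH; i++) {
        if (ports_sketch[i].attached != snap[i].attached) {
            *port_id = (uint8_t)i;
            return 0;
        }
    }
    return -1;
}
```

The snapshot approach reports only the first differing port, so it assumes that a single port is attached or detached between the save and the compare.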

[dpdk-dev] [PATCH 01/17] mbuf: add definitions of unified packet types

2015-02-03 Thread Zhang, Helin


> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Monday, February 2, 2015 7:18 PM
> To: Zhang, Helin; dev at dpdk.org
> Cc: Stephen Hemminger
> Subject: Re: [dpdk-dev] [PATCH 01/17] mbuf: add definitions of unified packet
> types
> 
> Hi Helin,
> 
> On 02/02/2015 02:43 AM, Zhang, Helin wrote:
> >>> +/*
> >>> + * Sixteen bits are divided into several fields to mark packet types.
> >>> + * Note that each field is indexical.
> >>> + * - Bit 3:0 is for tunnel types.
> >>> + * - Bit 7:4 is for L3 or outer L3 (for tunneling case) types.
> >>> + * - Bit 10:8 is for L4 types. It can also be used for inner L4 types for
> >>> + *   tunneling packets.
> >>> + * - Bit 13:11 is for inner L3 types.
> >>> + * - Bit 15:14 is reserved.
> >>
> >> Is there a reason why using this specific order?
> > Yes, to support ixgbe Vector PMD, outer L3 types and L4 types need to
> > be contiguous and in this order.
> 
> When you say "need to be", do you mean it's impossible to do in another
> manner or just that it would be slower?
It was designed this way; otherwise, a performance drop must be expected.
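
As an illustration of the 16-bit layout quoted above (tunnel in bits 3:0, L3 in bits 7:4, L4 in bits 10:8, inner L3 in bits 13:11), here is a minimal C sketch of packing and extracting the fields. The macro and function names are hypothetical and are not the RTE_PTYPE_* definitions from the patch. The combined L3+L4 mask shows the contiguous 7 bits that the vector PMD argument relies on.

```c
#include <stdint.h>

/* Hypothetical masks and shifts following the layout described in the patch. */
#define PT_TUNNEL_SHIFT    0
#define PT_TUNNEL_MASK     0x000fu  /* bits 3:0 */
#define PT_L3_SHIFT        4
#define PT_L3_MASK         0x00f0u  /* bits 7:4 */
#define PT_L4_SHIFT        8
#define PT_L4_MASK         0x0700u  /* bits 10:8 */
#define PT_INNER_L3_SHIFT  11
#define PT_INNER_L3_MASK   0x3800u  /* bits 13:11 */
#define PT_L3_L4_MASK      0x07f0u  /* bits 10:4, the contiguous L3 + L4 span */

/* Build a packet type value from its four field indexes. */
static inline uint16_t pt_pack(unsigned tunnel, unsigned l3, unsigned l4,
                               unsigned inner_l3)
{
    return (uint16_t)(((tunnel << PT_TUNNEL_SHIFT) & PT_TUNNEL_MASK) |
                      ((l3 << PT_L3_SHIFT) & PT_L3_MASK) |
                      ((l4 << PT_L4_SHIFT) & PT_L4_MASK) |
                      ((inner_l3 << PT_INNER_L3_SHIFT) & PT_INNER_L3_MASK));
}

/* Extract one field given its mask and shift. */
static inline unsigned pt_field(uint16_t ptype, unsigned mask, unsigned shift)
{
    return (ptype & mask) >> shift;
}

/* Extract outer L3 and L4 together as one 7-bit value. */
static inline unsigned pt_l3_l4(uint16_t ptype)
{
    return (ptype & PT_L3_L4_MASK) >> PT_L3_SHIFT;
}
```

Because each field holds an index rather than a bit flag, a 4-bit field can name up to 15 types plus "unknown", which is why 6 supported L3 types fit in either a 3-bit or a 4-bit field.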

> 
> >> Also, there are 4 bits for outer L3 types and 3 bits for inner L3
> >> types, but both of them have 6 different supported types. Is it 
> >> intentional?
> > Yes, it is to support ixgbe Vector PMD. Contiguous 7 bits are needed, though
> > 1 bit is wasted.
> 
> To be honest, I'm always a bit surprised that in dpdk we prefer having a strange
> API just because it's faster or easier to do on one specific driver (usually
> i40e or ixgbe). Unfortunately, trying to optimize the API for one driver may
> result in making the rest of the code (application and other drivers) slower
> and more complex.
As I understand it, 'faster' is what most DPDK customers want; otherwise they
would not need DPDK. Different hardware has different capabilities, and I am
trying to unify at least the packet types to make things easier.

> 
> In your proposal, there is no inner l4_type. I consider it as useful as the
> other fields. From what I see, there are only 2 bits left. What do you think
> about changing the packet type to 64 bits now?
For tunneling cases, L4_type is the inner L4 type; the outer L4 type is not
needed, as it can be expressed in the tunnel type.
I expect that 64 bits will be needed in the future, but for now I don't see any
strong demand for that from currently supported hardware.
In addition, there is no free bit in the first cache line of the mbuf header, so
mbuf changes are needed to expand it. I'd prefer to do it later to make things
easier.

> 
> From an API point of view, I think it would be good to have the same structure
> for inner and outer types. For instance (this is just an example):
> 
> union layer_pkt_type {
>   struct {
>   uint16_t l2_type:4;
>   uint16_t l3_type:4;
>   uint16_t l4_type:4;
>   uint16_t tun_type:4;
>   };
>   uint16_t u16;
> };
> 
> struct pkt_type {
>   union layer_pkt_type outer;
>   union layer_pkt_type inner;
> };
> 
> When your application decapsulates tunnels, you can just do outer = inner and
> enter into the same code.
Expanding packet_type is not easy, as there are no free bits in the first cache
line.
Is there any tunnel type in the inner packet? Wouldn't that field be wasted?
Is the L2 type really needed? I don't know.
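
For reference, the "outer = inner" decapsulation step from the quoted proposal can be written out as a small C sketch. The union and struct are taken from the quoted example. The helper name and the clearing of the inner layer are assumptions added for illustration, and uint16_t bit-fields rely on compiler support (GCC and Clang accept them).

```c
#include <stdint.h>

/* Per-layer packet type, as in the quoted proposal. */
union layer_pkt_type {
    struct {
        uint16_t l2_type:4;
        uint16_t l3_type:4;
        uint16_t l4_type:4;
        uint16_t tun_type:4;
    };
    uint16_t u16;
};

struct pkt_type {
    union layer_pkt_type outer;
    union layer_pkt_type inner;
};

/*
 * Hypothetical helper: after stripping the tunnel header, the former inner
 * layer becomes the outer layer. Clearing the inner layer afterwards is an
 * assumption of this sketch, not part of the quoted proposal.
 */
static inline void pkt_type_decap(struct pkt_type *pt)
{
    pt->outer = pt->inner;
    pt->inner.u16 = 0;
}
```

In this form the structure takes 32 bits instead of 16, which is the expansion cost raised in the reply above.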

> 
> 
> >>> + * RTE_PTYPE_L3_IPV6, RTE_PTYPE_L3_IPV6_EXT, RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP
> >>> + * and RTE_PTYPE_L4_SCTP should be kept as below in a contiguous 7 bits.
> >>> + *
> >>> + * Note that L3 types values are selected for checking IPV4/IPV6 header from
> >>> + * performance point of view. Reading annotations of RTE_ETH_IS_IPV4_HDR and
> >>> + * RTE_ETH_IS_IPV6_HDR is needed for any future changes of L3 type values.
> >>> + */
> >>> +#define RTE_PTYPE_UNKNOWN               0x /* 0b */
> >>> +/* bit 3:0 for tunnel types */
> >>> +#define RTE_PTYPE_TUNNEL_IP             0x0001 /* 0b0001 */
> >>> +#define RTE_PTYPE_TUNNEL_TCP            0x0002 /* 0b0010 */
> >>> +#define RTE_PTYPE_TUNNEL_UDP            0x0003 /* 0b0011 */
> >>> +#define RTE_PTYPE_TUNNEL_GRE            0x0004 /* 0b0100 */
> >>> +#define RTE_PTYPE_TUNNEL_VXLAN          0x0005 /* 0b0101 */
> >>> +#define RTE_PTYPE_TUNNEL_NVGRE          0x0006 /* 0b0110 */
> >>> +#define RTE_PTYPE_TUNNEL_GENEVE         0x0007 /* 0b0111 */
> >>> +#define RTE_PTYPE_TUNNEL_GRENAT         0x0008 /* 0b1000 */
> >>> +#define RTE_PTYPE_TUNNEL_GRENAT_MAC     0x0009 /* 0b1001 */
> >>> +#define RTE_PTYPE_TUNNEL_GRENAT_MACVLAN 0x000a /* 0b1010 */
> >>> +#define RTE_PTYPE_TUNNEL_MASK           0x000f /* 0b */
> >>> +/* bit 7:4 for L3 types */
> >>> +#define RTE_PT

[dpdk-dev] [PATCH 00/17] unified packet type

2015-02-03 Thread Zhang, Helin


> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Tuesday, February 3, 2015 1:20 AM
> To: Olivier MATZ; Zhang, Helin; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 00/17] unified packet type
> 
> Hi Olivier,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier MATZ
> > Sent: Monday, February 02, 2015 11:38 AM
> > To: Zhang, Helin; dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 00/17] unified packet type
> >
> > Hi Helin,
> >
> > On 02/02/2015 03:44 AM, Zhang, Helin wrote:
> >  Let's take a simple example. Imagine a hardware-1 that is able to
> >  recognize an IP packet by checking the ethertype and that the IP
> >  version is set to 4.
> >  Another hardware-2 recognize an IP packet by checking the
> >  ethertype, the IP version and that the IP length is correct compared to
> m_len(m).
> > 
> >  For the same packet, both hardwares will return
> >  RTE_PTYPE_L3_IPV4, but they don't do the same checks on the
> >  packet. As I want my application behave exactly the same whatever
> >  the hardware, I need to know what checks are done in hardware, so
> >  I can decide what checks must be done in my application.
> > 
> >  Example of definition: RTE_PTYPE_L3_IPV4 means that ethertype is
> >  0x0800 and IP.version is 4.
> > 
> >  It means that I can skip these 2 tests in my application if I
> >  have this packet_type, but all other checks must be done in
> >  software (ip length, flags, checksum, ...)
> > 
> >  For each packet type, we need a definition like above, and we
> >  must check that all drivers setting a packet type behave like 
> >  described.
> > > > Hmm, I think packet_type may need to be renamed to something else, like
> > > > offload_packet_type.
> > > > It is just for hardware-reported packet type information. It is not
> > > > for all information about a packet.
> > > > As different hardware may have different capabilities, different
> > > > hardware cannot report the same thing in the mbuf for the same packet.
> > > > With your question, I think hardware capability flags may be
> > > > needed. Applications can query the packet type capabilities on each
> > > > port, and then know what packet type information can be reported
> > > > by the corresponding hardware.
> > > > What do you think? Do you have any better ideas?
> >
> > I'm not sure renaming the field would change something here.
> >
> > The high-level question is: how a software can take advantage of this
> > information given by the hardware? If the same packet_type does not
> > have the same meaning depending on the hardware, it's not worth having
> > this info.
> >
> > I think the API should describe for each packet type what can be
> > expected by the application. Here is an example. When a driver sets
> > the
> > RTE_PTYPE_L3_IPV4 type, it means that:
> >
> > - the layer 3 is identified as IP by underlying layer (ex: ethertype=IP
> >   if layer 2 is ethernet)
> > - the IP version field is 4
> > - there is no IP options (i.e the size of header is 20)
> 
> Yes, I suppose that's what supported HW can guarantee when
> RTE_PTYPE_L3_IPV4 is set.
> 
> > - the checksum field has been verified by hw, and if wrong, the
> >   flag PKT_RX_IP_CKSUM_BAD is set
> 
> Hmm, why is that?
> As I remember on many devices it is configurable by SW should HW do RX
> checksum offload or not.
> From DPDK point of view there is hw_ip_checksum field in rte_eth_rxmode.
> So it is a possible situation, when at RX HW does packet type determination,
> but doesn't make L3/L4 checksum calculation.
> 
> I suppose for checksum(s) it should be a separate flags (in ol_flags) with 3
> possible values:
> CKSUM_UNKNOWN, CKSUM_BAD, CKSUM_OK.
> 
> Konstantin

I think packet type and checksum are totally different things in DPDK, though
they might have dependencies in hardware.
Checksum good/bad is still indicated in ol_flags. Packet type has nothing to do
with checksum.

Regards,
Helin

> 
> >
> > If the hardware is not able to give all this information, there are
> > 2 solutions:
> > - do the remaining tests in the driver
> > - or set l3 pkt_type to unknown
> >
> > All other conditions that are not described in the API should be
> > > checked by the application if it needs the information (ex: check that
> > IP dest address is legal, that ip->len is >= 20, ...).
> >
> >
> > If we are able to describe this for all packet types, it would really
> > help application to take advantage of these packet types.
> >
> > Regards,
> > Olivier
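
To make the proposed contract concrete, here is a minimal sketch of the checks that would remain in software once a driver has reported RTE_PTYPE_L3_IPV4 under the definition discussed above (ethertype and IP version already verified by hardware). The function name and the constant are hypothetical and are not part of any DPDK API; the sketch only assumes the standard IPv4 header layout, where the total length is stored big-endian in bytes 2 and 3.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimum IPv4 header size in bytes (header without options). */
#define IPV4_MIN_HDR_LEN_SKETCH 20

/*
 * Hypothetical helper: checks left to the application when the packet
 * type already guarantees "ethertype is IP" and "IP version is 4".
 * l3 points to the start of the IP header, avail is the number of
 * bytes available from that point. Returns 1 if the length fields are
 * plausible, 0 otherwise.
 */
static int
ipv4_remaining_checks_sketch(const uint8_t *l3, size_t avail)
{
	size_t total_len;

	if (l3 == NULL || avail < IPV4_MIN_HDR_LEN_SKETCH)
		return 0;

	/* total length field: bytes 2..3, network byte order */
	total_len = ((size_t)l3[2] << 8) | l3[3];

	/* ip->len must cover at least the header and fit in the data */
	return total_len >= IPV4_MIN_HDR_LEN_SKETCH && total_len <= avail;
}
```

Checksum status is deliberately left out of this sketch, since the thread treats it as a separate indication in ol_flags.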


[dpdk-dev] [PATCH v6 10/13] eal/pci: Cleanup pci driver initialization code

2015-02-03 Thread Tetsuya Mukawa
On 2015/02/03 11:35, Qiu, Michael wrote:
> On 2/1/2015 12:02 PM, Tetsuya Mukawa wrote:
>> - Add rte_eal_pci_close_one_driver()
>>   The function is used for closing the specified driver and device.
>> - Add pci_invoke_all_drivers()
>>   The function is based on pci_probe_all_drivers. But it can not only
>>   probe but also close drivers.
>> - Add pci_close_all_drivers()
>>   The function tries to find a driver for the specified device, and
>>   then close the driver.
>> - Add rte_eal_pci_probe_one() and rte_eal_pci_close_one()
>>   The functions are used to probe and close a device.
>>   First the function tries to find a device that has the specified
>>   PCI address. Then, probe or close the device.
>>
>> v5:
>> - Remove RTE_EAL_INVOKE_TYPE_UNKNOWN, because it's unused.
>> v4:
>> - Fix parameter checking.
>> - Fix indent of 'if' statement.
>>
>> Signed-off-by: Tetsuya Mukawa 
>> ---
>>  lib/librte_eal/common/eal_common_pci.c  | 90 
>> +
>>  lib/librte_eal/common/eal_private.h | 24 +
>>  lib/librte_eal/common/include/rte_pci.h | 33 
>>  lib/librte_eal/linuxapp/eal/eal_pci.c   | 69 +
>>  4 files changed, 206 insertions(+), 10 deletions(-)
>>
>> diff --git a/lib/librte_eal/common/eal_common_pci.c 
>> b/lib/librte_eal/common/eal_common_pci.c
>> index a89f5c3..7c9b8c5 100644
>> --- a/lib/librte_eal/common/eal_common_pci.c
>> +++ b/lib/librte_eal/common/eal_common_pci.c
>> @@ -99,19 +99,27 @@ static struct rte_devargs *pci_devargs_lookup(struct 
>> rte_pci_device *dev)
>>  return NULL;
>>  }
>>  
>> -/*
>> - * If vendor/device ID match, call the devinit() function of all
>> - * registered driver for the given device. Return -1 if initialization
>> - * failed, return 1 if no driver is found for this device.
>> - */
>>  static int
>> -pci_probe_all_drivers(struct rte_pci_device *dev)
>> +pci_invoke_all_drivers(struct rte_pci_device *dev,
>> +enum rte_eal_invoke_type type)
>>  {
>>  struct rte_pci_driver *dr = NULL;
>> -int rc;
>> +int rc = 0;
>> +
>> +if ((dev == NULL) || (type >= RTE_EAL_INVOKE_TYPE_MAX))
>> +return -1;
>>  
>>  TAILQ_FOREACH(dr, &pci_driver_list, next) {
>> -rc = rte_eal_pci_probe_one_driver(dr, dev);
>> +switch (type) {
>> +case RTE_EAL_INVOKE_TYPE_PROBE:
>> +rc = rte_eal_pci_probe_one_driver(dr, dev);
>> +break;
>> +case RTE_EAL_INVOKE_TYPE_CLOSE:
>> +rc = rte_eal_pci_close_one_driver(dr, dev);
>> +break;
>> +default:
>> +return -1;
>> +}
>>  if (rc < 0)
>>  /* negative value is an error */
>>  return -1;
>> @@ -123,6 +131,66 @@ pci_probe_all_drivers(struct rte_pci_device *dev)
>>  return 1;
>>  }
>>  
>> +#ifdef ENABLE_HOTPLUG
>> +static int
>> +rte_eal_pci_invoke_one(struct rte_pci_addr *addr,
>> +enum rte_eal_invoke_type type)
>> +{
>> +struct rte_pci_device *dev = NULL;
>> +int ret = 0;
>> +
>> +if ((addr == NULL) || (type >= RTE_EAL_INVOKE_TYPE_MAX))
>> +return -1;
>> +
>> +TAILQ_FOREACH(dev, &pci_device_list, next) {
>> +if (eal_compare_pci_addr(&dev->addr, addr))
>> +continue;
>> +
>> +ret = pci_invoke_all_drivers(dev, type);
>> +if (ret < 0)
>> +goto invoke_err_return;
>> +
>> +if (type == RTE_EAL_INVOKE_TYPE_CLOSE)
>> +goto remove_dev;
>> +
>> +return 0;
>> +}
>> +
>> +return -1;
>> +
>> +invoke_err_return:
>> +RTE_LOG(WARNING, EAL, "Requested device " PCI_PRI_FMT
>> +" cannot be used\n", dev->addr.domain, dev->addr.bus,
>> +dev->addr.devid, dev->addr.function);
>> +return -1;
>> +
>> +remove_dev:
>> +TAILQ_REMOVE(&pci_device_list, dev, next);
>> +return 0;
>> +}
>> +
>> +
>> +/*
>> + * Find the pci device specified by pci address, then invoke probe function of
>> + * the driver of the device.
>> + */
>> +int
>> +rte_eal_pci_probe_one(struct rte_pci_addr *addr)
>> +{
>> +return rte_eal_pci_invoke_one(addr, RTE_EAL_INVOKE_TYPE_PROBE);
>> +}
>> +
>> +/*
>> + * Find the pci device specified by pci address, then invoke close function of
>> + * the driver of the device.
>> + */
>> +int
>> +rte_eal_pci_close_one(struct rte_pci_addr *addr)
>> +{
>> +return rte_eal_pci_invoke_one(addr, RTE_EAL_INVOKE_TYPE_CLOSE);
>> +}
>> +#endif /* ENABLE_HOTPLUG */
>> +
>>  /*
>>   * Scan the content of the PCI bus, and call the devinit() function for
>>   * all registered drivers that have a matching entry in its id_table
>> @@ -148,10 +216,12 @@ rte_eal_pci_probe(void)
>>  
>>  /* probe all or only whitelisted devices */
>>  if (probe_all)
>> -ret = pci_probe_all_drivers(dev

[dpdk-dev] [PATCH v6 07/13] ethdev: Add functions that will be used by port hotplug functions

2015-02-03 Thread Tetsuya Mukawa
On 2015/02/03 11:37, Qiu, Michael wrote:
> On 2/1/2015 12:02 PM, Tetsuya Mukawa wrote:
>> The patch adds following functions.
>>
>> - rte_eth_dev_save()
>>   The function is used for saving current rte_eth_dev structures.
>> - rte_eth_dev_get_changed_port()
>>   The function receives the rte_eth_dev structures, then compare
>>   these with current values to know which port is actually
>>   attached or detached.
>> - rte_eth_dev_get_addr_by_port()
>>   The function returns a pci address of a ethdev specified by port
>>   identifier.
>> - rte_eth_dev_get_port_by_addr()
>>   The function returns a port identifier of a ethdev specified by
>>   pci address.
>> - rte_eth_dev_get_name_by_port()
>>   The function returns a unique identifier name of a ethdev
>>   specified by port identifier.
>> - Add rte_eth_dev_check_detachable()
>>   The function returns whether a PMD supports detach function.
>>
>> Also the patch changes scope of rte_eth_dev_allocated() to global.
>> This function will be called by virtual PMDs to support port hotplug.
>> So change scope of the function to global.
>>
>> v5:
>> - Fix return value of below functions.
>>   rte_eth_dev_get_changed_port().
>>   rte_eth_dev_get_port_by_addr().
>> v4:
>> - Add parameter checking.
>> v3:
>> - Fix if-condition bug while comparing pci addresses.
>> - Add error checking codes.
>> Reported-by: Mark Enright 
>>
>> Signed-off-by: Tetsuya Mukawa 
>> ---
>>  lib/librte_ether/rte_ethdev.c | 98 
>> ++-
>>  lib/librte_ether/rte_ethdev.h | 80 +++
>>  2 files changed, 177 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
>> index 7bed901..5aded10 100644
>> --- a/lib/librte_ether/rte_ethdev.c
>> +++ b/lib/librte_ether/rte_ethdev.c
>> @@ -206,7 +206,7 @@ rte_eth_dev_data_alloc(void)
>>  RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data));
>>  }
>>  
>> -static struct rte_eth_dev *
>> +struct rte_eth_dev *
>>  rte_eth_dev_allocated(const char *name)
>>  {
>>  unsigned i;
>> @@ -426,6 +426,102 @@ rte_eth_dev_count(void)
>>  return (nb_ports);
>>  }
>>  
>> +void
>> +rte_eth_dev_save(struct rte_eth_dev *devs)
>> +{
>> +if (devs == NULL)
>> +return;
>> +
>> +/* save current rte_eth_devices */
>> +memcpy(devs, rte_eth_devices,
>> +sizeof(struct rte_eth_dev) * RTE_MAX_ETHPORTS);
>> +}
>> +
>> +int
>> +rte_eth_dev_get_changed_port(struct rte_eth_dev *devs, uint8_t *port_id)
>> +{
>> +if ((devs == NULL) || (port_id == NULL))
>> +return -EINVAL;
>> +
>> +/* check which port was attached or detached */
>> +for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++, devs++) {
>> +if (rte_eth_devices[*port_id].attached ^ devs->attached)
>> +return 0;
>> +}
>> +return -ENODEV;
>> +}
>> +
>> +int
>> +rte_eth_dev_get_addr_by_port(uint8_t port_id, struct rte_pci_addr *addr)
>> +{
>> +if (rte_eth_dev_validate_port(port_id, TRACE) == DEV_INVALID)
>> +return -EINVAL;
>> +
>> +if (addr == NULL) {
>> +PMD_DEBUG_TRACE("Null pointer is specified\n");
>> +return -EINVAL;
>> +}
>> +
>> +*addr = rte_eth_devices[port_id].pci_dev->addr;
>> +return 0;
>> +}
>> +
>> +int
>> +rte_eth_dev_get_port_by_addr(struct rte_pci_addr *addr, uint8_t *port_id)
>> +{
>> +struct rte_pci_addr *tmp;
>> +
>> +if ((addr == NULL) || (port_id == NULL)) {
>> +PMD_DEBUG_TRACE("Null pointer is specified\n");
>> +return -EINVAL;
>> +}
>> +
>> +for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++) {
>> +if (!rte_eth_devices[*port_id].attached)
>> +continue;
>> +if (!rte_eth_devices[*port_id].pci_dev)
>> +continue;
>> +tmp = &rte_eth_devices[*port_id].pci_dev->addr;
>> +if (eal_compare_pci_addr(tmp, addr) == 0)
>> +return 0;
>> +}
>> +return -ENODEV;
>> +}
>> +
>> +int
>> +rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
>> +{
>> +char *tmp;
>> +
>> +if (rte_eth_dev_validate_port(port_id, TRACE) == DEV_INVALID)
>> +return -EINVAL;
>> +
>> +if (name == NULL) {
>> +PMD_DEBUG_TRACE("Null pointer is specified\n");
>> +return -EINVAL;
>> +}
>> +
>> +/* shouldn't check 'rte_eth_devices[i].data',
>> + * because it might be overwritten by VDEV PMD */
>> +tmp = rte_eth_dev_data[port_id].name;
>> +strncpy(name, tmp, strlen(tmp) + 1);
>> +return 0;
>> +}
>> +
>> +int
>> +rte_eth_dev_check_detachable(uint8_t port_id)
>> +{
>> +uint32_t drv_flags;
>> +
>> +if (port_id >= RTE_MAX_ETHPORTS) {
>> +PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
>> +return -EINVAL;
>> +}
>> +
>> +drv_flags = rte_eth_devices[port_id].driver->pci_drv.drv

[dpdk-dev] [PATCH v6 10/13] eal/pci: Cleanup pci driver initialization code

2015-02-03 Thread Qiu, Michael
On 2/3/2015 12:07 PM, Tetsuya Mukawa wrote:
> On 2015/02/03 11:35, Qiu, Michael wrote:
>> On 2/1/2015 12:02 PM, Tetsuya Mukawa wrote:
>>> - Add rte_eal_pci_close_one_driver()
>>>   The function is used for closing the specified driver and device.
>>> - Add pci_invoke_all_drivers()

[...]
>>>  
>>> +#ifdef ENABLE_HOTPLUG
>>> +/*
>>> + * If vendor/device ID match, call the devuninit() function of the
>>> + * driver.
>>> + */
>>> +int
>>> +rte_eal_pci_close_one_driver(struct rte_pci_driver *dr,
>>> +   struct rte_pci_device *dev)
>>> +{
>>> +   struct rte_pci_id *id_table;
>>> +
>>> +   if ((dr == NULL) || (dev == NULL))
>>> +   return -EINVAL;
>>> +
>>> +   for (id_table = dr->id_table ; id_table->vendor_id != 0; id_table++) {
>>> +
>>> +   /* check if device's identifiers match the driver's ones */
>>> +   if (id_table->vendor_id != dev->id.vendor_id &&
>>> +   id_table->vendor_id != PCI_ANY_ID)
>>> +   continue;
>>> +   if (id_table->device_id != dev->id.device_id &&
>>> +   id_table->device_id != PCI_ANY_ID)
>>> +   continue;
>>> +   if (id_table->subsystem_vendor_id !=
>>> +   dev->id.subsystem_vendor_id &&
>>> +   id_table->subsystem_vendor_id != PCI_ANY_ID)
>>> +   continue;
>>> +   if (id_table->subsystem_device_id !=
>>> +   dev->id.subsystem_device_id &&
>>> +   id_table->subsystem_device_id != PCI_ANY_ID)
>>> +   continue;
>>> +
>>> +   struct rte_pci_addr *loc = &dev->addr;
>>> +
>>> +   RTE_LOG(DEBUG, EAL,
>>> +   "PCI device "PCI_PRI_FMT" on NUMA socket %i\n",
>>> +   loc->domain, loc->bus, loc->devid,
>>> +   loc->function, dev->numa_node);
>>> +
>>> +   RTE_LOG(DEBUG, EAL, "  remove driver: %x:%x %s\n",
>>> +   dev->id.vendor_id, dev->id.device_id,
>>> +   dr->name);
>>> +
>>> +   /* call the driver devuninit() function */
>>> +   if (dr->devuninit && (dr->devuninit(dr, dev) < 0))
>>> +   return -1;  /* negative value is an error */
>>> +
>>> +   /* clear driver structure */
>>> +   dev->driver = NULL;
>>> +
>>> +   if (dr->drv_flags & RTE_PCI_DRV_NEED_MAPPING)
>>> +   /* unmap resources for devices that use igb_uio */
>>> +   pci_unmap_device(dev);
>> Hi Tetsuya,
>>
>> I have one question. As the code shows, pci_unmap_device() checks
>> pt_driver.
>>
>> But assume we now try to detach a vfio device: after the "unsupported"
>> error message is printed, is this port still workable?
>>
>> I think this port will be unworkable, am I right?
>>
>> But actually, we should keep it workable.
>>
>> My suggestion is to add a check in rte_eth_dev_check_detachable() for
>> pci_device ports.
> Hi Michael,
>
> I appreciate your comment.
> In the function "rte_eal_dev_detach_pdev()",
> "rte_eth_dev_check_detachable()" is already called.

What I mean is to check the pt_driver of the pci_dev in
rte_eth_dev_check_detachable(), so that the hotplug framework will not
affect vfio devices, just as I replied in another mail.

The current logic will affect vfio devices if we try to detach one (I have
not actually tested this, it is just what the logic shows), am I right?

Thanks,
Michael
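
The suggestion above could look roughly like the following sketch. All types and names here are hypothetical stand-ins, not the real DPDK structures, and the return convention (0 means detachable, 1 means not) is an assumption made only for illustration. The point is that the kernel driver check happens before any close or unmap step runs, so a vfio-bound port is refused early and stays usable.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel driver a PCI device is bound to. */
enum pt_driver_sketch { PT_SKETCH_UNKNOWN, PT_SKETCH_IGB_UIO, PT_SKETCH_VFIO };

/* Hypothetical stand-in for a PCI device. */
struct pci_dev_sketch {
	enum pt_driver_sketch pt_driver;
};

/* Hypothetical stand-in for an ethdev port. */
struct eth_dev_sketch {
	const struct pci_dev_sketch *pci_dev; /* NULL for virtual devices */
	unsigned int drv_detachable;          /* PMD advertises detach support */
};

/* Return 0 if the port may be detached, 1 otherwise (assumed convention). */
static int
check_detachable_sketch(const struct eth_dev_sketch *dev)
{
	if (dev == NULL)
		return 1;

	/* Physical port: only igb_uio-bound devices are handled here. */
	if (dev->pci_dev != NULL &&
	    dev->pci_dev->pt_driver != PT_SKETCH_IGB_UIO)
		return 1;

	return dev->drv_detachable ? 0 : 1;
}
```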

> But in the future, someone may want to reuse
> "rte_eal_pci_close_one_driver()".
> So I will add the checking like your suggestion.
>
> Thanks,
> Tetsuya
>
>> Thanks
>> Michael
>>
>>> +
>>> +   return 0;
>>> +   }
>>> +   /* return positive value if driver is not found */
>>> +   return 1;
>>> +}
>>> +#else /* ENABLE_HOTPLUG */
>>> +int
>>> +rte_eal_pci_close_one_driver(struct rte_pci_driver *dr __rte_unused,
>>> +   struct rte_pci_device *dev __rte_unused)
>>> +{
>>> +   RTE_LOG(ERR, EAL, "Hotplug support isn't enabled\n");
>>> +   return -1;
>>> +}
>>> +#endif /* ENABLE_HOTPLUG */
>>> +
>>>  /* Init the PCI EAL subsystem */
>>>  int
>>>  rte_eal_pci_init(void)
>
>



[dpdk-dev] [PATCH 4/7] ethdev: fix of calculating the size of flow type mask array

2015-02-03 Thread Zhang, Helin
Hi Thomas

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, February 2, 2015 11:31 PM
> To: Zhang, Helin
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 4/7] ethdev: fix of calculating the size of 
> flow
> type mask array
> 
> Hi Helin,
> 
> 2015-01-19 14:56, Helin Zhang:
> > +#define UINT32_BIT (CHAR_BIT * sizeof(uint32_t))
> 
> I don't understand how UINT32_BIT is better than a simple sizeof(uint32_t)?
UINT32_BIT is 32, while sizeof(uint32_t) is 4. They are different.

Regards,
Helin

> 
> --
> Thomas


[dpdk-dev] [PATCH 5/7] ethdev: unification of flow types

2015-02-03 Thread Zhang, Helin


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, February 2, 2015 11:39 PM
> To: Zhang, Helin
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 5/7] ethdev: unification of flow types
> 
> Hi Helin,
> 
> 2015-01-19 14:56, Helin Zhang:
> > Flow types was defined actually for i40e hardware specifically, and
> > wasn't able to be used for defining RSS offload types of all PMDs. It
> > removed the enum flow types, and uses macros instead with new names.
> > The new macros can be used for defining RSS offload types later. Also
> > modifications are made in i40e and testpmd accordingly.
> >
> > Signed-off-by: Helin Zhang 
> [...]
> > --- a/lib/librte_ether/rte_eth_ctrl.h
> > +++ b/lib/librte_ether/rte_eth_ctrl.h
> > @@ -46,6 +46,35 @@
> >  extern "C" {
> >  #endif
> >
> > +/*
> > + * A packet can be identified by hardware as different flow types. Different
> > + * NIC hardwares may support different flow types.
> > + * Basically, the NIC hardware identifies the flow type as deep protocol as
> > + * possible, and exclusively. For example, if a packet is identified as
> > + * 'ETH_FLOW_TYPE_NONFRAG_IPV4_TCP', it will not be any of other flow types,
> > + * though it is an actual IPV4 packet.
> > + * Note that the flow types are used to define RSS offload types in
> > + * rte_ethdev.h.
> > + */
> > +#define ETH_FLOW_TYPE_UNKNOWN             0
> > +#define ETH_FLOW_TYPE_IPV4                1
> > +#define ETH_FLOW_TYPE_FRAG_IPV4           2
> > +#define ETH_FLOW_TYPE_NONFRAG_IPV4_TCP    3
> > +#define ETH_FLOW_TYPE_NONFRAG_IPV4_UDP    4
> > +#define ETH_FLOW_TYPE_NONFRAG_IPV4_SCTP   5
> > +#define ETH_FLOW_TYPE_NONFRAG_IPV4_OTHER  6
> > +#define ETH_FLOW_TYPE_IPV6                7
> > +#define ETH_FLOW_TYPE_FRAG_IPV6           8
> > +#define ETH_FLOW_TYPE_NONFRAG_IPV6_TCP    9
> > +#define ETH_FLOW_TYPE_NONFRAG_IPV6_UDP   10
> > +#define ETH_FLOW_TYPE_NONFRAG_IPV6_SCTP  11
> > +#define ETH_FLOW_TYPE_NONFRAG_IPV6_OTHER 12
> > +#define ETH_FLOW_TYPE_L2_PAYLOAD         13
> > +#define ETH_FLOW_TYPE_IPV6_EX            14
> > +#define ETH_FLOW_TYPE_IPV6_TCP_EX        15
> > +#define ETH_FLOW_TYPE_IPV6_UDP_EX        16
> > +#define ETH_FLOW_TYPE_MAX                17
> 
> Why not using an enum?
An enum is an 'int', which needs 32 bits, while flow_type is just 16 bits.
The old definition, which is not an enum, was in rte_ethdev.h; the enum one was
added recently in rte_eth_ctrl.h.
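
The size argument can be sketched as follows. All names ending in _sketch are hypothetical and do not come from the patch. Note that the size of an enum type is implementation-defined in C; it is commonly the size of int (4 bytes) with GCC and Clang, which is the assumption behind the comments here.

```c
#include <stdint.h>

/* Hypothetical enum-based flow type, usually as wide as an int. */
enum flow_type_enum_sketch {
	FLOW_SKETCH_UNKNOWN = 0,
	FLOW_SKETCH_IPV4,
	FLOW_SKETCH_MAX,
};

/* Macro-based flow type value, stored in a field of any chosen width. */
#define FLOW_SKETCH_NONFRAG_IPV4_TCP 3

/* Field typed as the enum: its width is chosen by the compiler. */
struct input_enum_sketch {
	enum flow_type_enum_sketch flow_type;
	uint16_t other;
};

/* Field typed as uint16_t: exactly 16 bits regardless of the compiler. */
struct input_u16_sketch {
	uint16_t flow_type;
	uint16_t other;
};
```

With a typical compiler, sizeof(struct input_u16_sketch) is 4, while the enum-based variant is larger because of the wider field plus padding.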

> Nitpicking: numbers from 0 to 9 should be right aligned.
In my source file, they are right aligned.

> 
> >  /**
> >   * Feature filter types
> >   */
> > @@ -179,24 +208,6 @@ struct rte_eth_tunnel_filter_conf {
> >  #define RTE_ETH_FDIR_MAX_FLEXLEN 16 /** < Max length of
> flexbytes. */
> >
> >  /**
> > - * Flow type
> > - */
> > -enum rte_eth_flow_type {
> > -   RTE_ETH_FLOW_TYPE_NONE = 0,
> > -   RTE_ETH_FLOW_TYPE_UDPV4,
> > -   RTE_ETH_FLOW_TYPE_TCPV4,
> > -   RTE_ETH_FLOW_TYPE_SCTPV4,
> > -   RTE_ETH_FLOW_TYPE_IPV4_OTHER,
> > -   RTE_ETH_FLOW_TYPE_FRAG_IPV4,
> > -   RTE_ETH_FLOW_TYPE_UDPV6,
> > -   RTE_ETH_FLOW_TYPE_TCPV6,
> > -   RTE_ETH_FLOW_TYPE_SCTPV6,
> > -   RTE_ETH_FLOW_TYPE_IPV6_OTHER,
> > -   RTE_ETH_FLOW_TYPE_FRAG_IPV6,
> > -   RTE_ETH_FLOW_TYPE_MAX = 64,
> > -};
> 
> You are renaming the prefix RTE_ETH_FLOW_TYPE_ to ETH_FLOW_TYPE.
> As this is an exported enum (in the API), we should keep RTE_ prefix.
> If you are trying to shorten the names, I suggest RTE_ETH_FLOW_.
OK. Starting with RTE_ETH_FLOW_ is fine with me. I will modify it in v2.

> 
> [...]
> >  struct rte_eth_fdir_input {
> > -   enum rte_eth_flow_type flow_type;  /**< Type of flow */
> > +   uint16_t flow_type;  /**< Type of flow */
> [...]
> >  struct rte_eth_fdir_flex_mask {
> > -   enum rte_eth_flow_type flow_type;  /**< Flow type */
> > +   uint16_t flow_type;  /**< Flow type */
> 
> I think this comment is useless ;)
OK. I will remove it. Thanks!

Regards,
Helin

> 
> --
> Thomas


[dpdk-dev] [PATCH v6] testpmd: Add port hotplug support

2015-02-03 Thread Qiu, Michael
On 2/1/2015 12:02 PM, Tetsuya Mukawa wrote:
> The patch introduces following commands.
> - port attach [ident]
> - port detach [port_id]
>  - attach: attaching a port
>  - detach: detaching a port
>  - ident: pci address of physical device.
>   Or device name and parameters of virtual device.
>  (ex. :02:00.0, eth_pcap0,iface=eth0)
>  - port_id: port identifier
>
> v5:
> - Add testpmd documentation.
>   (Thanks to Iremonger, Bernard)
> v4:
>  - Fix strings of command help.
>
> Signed-off-by: Tetsuya Mukawa 
> ---
>  app/test-pmd/cmdline.c  | 133 +++
>  app/test-pmd/config.c   | 116 +---
>  app/test-pmd/parameters.c   |  22 ++-
>  app/test-pmd/testpmd.c  | 199 
> +---
>  app/test-pmd/testpmd.h  |  18 ++-
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  57 
>  6 files changed, 415 insertions(+), 130 deletions(-)
>
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index 4beb404..2f813d8 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -572,6 +572,12 @@ static void cmd_help_long_parsed(void *parsed_result,
>   "port close (port_id|all)\n"
>   "Close all ports or port_id.\n\n"
>  
> + "port attach (ident)\n"
> + "Attach physical or virtual dev by pci address or 
> virtual device name\n\n"
> +
> + "port detach (port_id)\n"
> + "Detach physical or virtual dev by port_id\n\n"
> +
>   "port config (port_id|all)"
>   " speed (10|100|1000|1|4|auto)"
>   " duplex (half|full|auto)\n"
> @@ -848,6 +854,89 @@ cmdline_parse_inst_t cmd_operate_specific_port = {
>   },
>  };
>  
> +/* *** attach a specified port *** */
> +struct cmd_operate_attach_port_result {
> + cmdline_fixed_string_t port;
> + cmdline_fixed_string_t keyword;
> + cmdline_fixed_string_t identifier;
> +};
> +
> +static void cmd_operate_attach_port_parsed(void *parsed_result,
> + __attribute__((unused)) struct cmdline *cl,
> + __attribute__((unused)) void *data)
> +{
> + struct cmd_operate_attach_port_result *res = parsed_result;
> +
> + if (!strcmp(res->keyword, "attach"))
> + attach_port(res->identifier);
> + else
> + printf("Unknown parameter\n");
> +}
> +
> +cmdline_parse_token_string_t cmd_operate_attach_port_port =
> + TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
> + port, "port");
> +cmdline_parse_token_string_t cmd_operate_attach_port_keyword =
> + TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
> + keyword, "attach");
> +cmdline_parse_token_string_t cmd_operate_attach_port_identifier =
> + TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
> + identifier, NULL);
> +
> +cmdline_parse_inst_t cmd_operate_attach_port = {
> + .f = cmd_operate_attach_port_parsed,
> + .data = NULL,
> + .help_str = "port attach identifier, "
> + "identifier: pci address or virtual dev name",
> + .tokens = {
> + (void *)&cmd_operate_attach_port_port,
> + (void *)&cmd_operate_attach_port_keyword,
> + (void *)&cmd_operate_attach_port_identifier,
> + NULL,
> + },
> +};
> +
> +/* *** detach a specified port *** */
> +struct cmd_operate_detach_port_result {
> + cmdline_fixed_string_t port;
> + cmdline_fixed_string_t keyword;
> + uint8_t port_id;
> +};
> +
> +static void cmd_operate_detach_port_parsed(void *parsed_result,
> + __attribute__((unused)) struct cmdline *cl,
> + __attribute__((unused)) void *data)
> +{
> + struct cmd_operate_detach_port_result *res = parsed_result;
> +
> + if (!strcmp(res->keyword, "detach"))
> + detach_port(res->port_id);
> + else
> + printf("Unknown parameter\n");
> +}
> +
> +cmdline_parse_token_string_t cmd_operate_detach_port_port =
> + TOKEN_STRING_INITIALIZER(struct cmd_operate_detach_port_result,
> + port, "port");
> +cmdline_parse_token_string_t cmd_operate_detach_port_keyword =
> + TOKEN_STRING_INITIALIZER(struct cmd_operate_detach_port_result,
> + keyword, "detach");
> +cmdline_parse_token_num_t cmd_operate_detach_port_port_id =
> + TOKEN_NUM_INITIALIZER(struct cmd_operate_detach_port_result,
> + port_id, UINT8);
> +
> +cmdline_parse_inst_t cmd_operate_detach_port = {
> + .f = cmd_operate_detach_port_parsed,
> + .data = NULL,
> + .help_str = "port detach port_id",
> + .tokens = {
> + (void *)&cmd_operate_detach_port_port,
> +  

[dpdk-dev] [PATCH v2 1/2] eal: sort and align options lists

2015-02-03 Thread David Marchand
Two little comments.

On Mon, Feb 2, 2015 at 6:44 PM, Thomas Monjalon 
wrote:

> @@ -578,37 +579,36 @@ eal_check_common_options(struct internal_config
> *internal_cfg)
>  void
>  eal_common_usage(void)
>  {
> -   printf("-c COREMASK -n NUM [-m NB] [-r NUM] [-b
> ]"
> -  "[--proc-type primary|secondary|auto]\n\n"
> +   printf("-c COREMASK|-l CORELIST -n CHANNELS [options]\n\n"
>"EAL common options:\n"
> -  "  -c COREMASK  : A hexadecimal bitmask of cores to run
> on\n"
> -  "  -l CORELIST  : List of cores to run on\n"
> -  " The argument format is
> [-c2][,c3[-c4],...]\n"
>

[snip]


>
> +  "  -n NUM  Number of memory channels\n"
>

Not really a problem, but for consistency : here, you are talking about
NUM, while at first, you wrote -n CHANNELS.


[snip]


> /* first long only option value must be >= 256, so that we won't
>  * conflict with short options */
> OPT_LONG_MIN_NUM = 256,
> -#define OPT_HUGE_DIR"huge-dir"
> -   OPT_HUGE_DIR_NUM = OPT_LONG_MIN_NUM,
> -#define OPT_MASTER_LCORE "master-lcore"
> +#define OPT_BASE_VIRTADDR "base-virtaddr"
> +   OPT_BASE_VIRTADDR_NUM,
>

Why skip the first entry?
AFAIK, OPT_BASE_VIRTADDR_NUM will be set to 257. Is it to avoid having to move
this "= OPT_LONG_MIN_NUM" each time we add a new long option at the top of the
enum?
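
For reference, the numbering follows from how C assigns enumerator values: an enumerator without an explicit value gets the previous value plus one. The sketch below reuses OPT_LONG_MIN_NUM and OPT_BASE_VIRTADDR_NUM from the quoted patch; OPT_NEXT_SKETCH_NUM is a hypothetical placeholder for whatever option follows.

```c
enum {
	/* first long-only option value must be >= 256, so that it does
	 * not conflict with short options */
	OPT_LONG_MIN_NUM = 256,
	/* no explicit value: previous enumerator + 1, i.e. 257 */
	OPT_BASE_VIRTADDR_NUM,
	/* hypothetical next option: 258 */
	OPT_NEXT_SKETCH_NUM,
};
```

With this layout, inserting a new option before OPT_BASE_VIRTADDR_NUM needs no change to the "= 256" line; the cost is that the value 256 itself is never used by an option.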


The rest looks good to me.
Acked-by: David Marchand 

-- 
David Marchand


[dpdk-dev] [PATCH v2 2/2] eal: add help option

2015-02-03 Thread David Marchand
On Mon, Feb 2, 2015 at 6:44 PM, Thomas Monjalon 
wrote:
[snip]

> @@ -340,6 +342,9 @@ eal_parse_args(int argc, char **argv)
> continue;
>
> switch (opt) {
> +   case 'h':
> +   eal_usage(prgname);
> +   exit(EXIT_SUCCESS);
> default:
> if (opt < OPT_LONG_MIN_NUM && isprint(opt)) {
> RTE_LOG(ERR, EAL, "Option %c is not
> supported "
>
[snip]


> @@ -534,6 +536,10 @@ eal_parse_args(int argc, char **argv)
> continue;
>
> switch (opt) {
> +   case 'h':
> +   eal_usage(prgname);
> +   exit(EXIT_SUCCESS);
> +
> /* force loading of external driver */
> case 'd':
> solib = malloc(sizeof(*solib));
>

Why not move those two into the common parser?


-- 
David Marchand


[dpdk-dev] [PATCH 01/17] mbuf: add definitions of unified packet types

2015-02-03 Thread Zhang, Helin


> -Original Message-
> From: Zhang, Helin
> Sent: Tuesday, February 3, 2015 11:19 AM
> To: Olivier MATZ; dev at dpdk.org
> Cc: Stephen Hemminger
> Subject: RE: [dpdk-dev] [PATCH 01/17] mbuf: add definitions of unified packet
> types
> 
> 
> 
> > -Original Message-
> > From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> > Sent: Monday, February 2, 2015 7:18 PM
> > To: Zhang, Helin; dev at dpdk.org
> > Cc: Stephen Hemminger
> > Subject: Re: [dpdk-dev] [PATCH 01/17] mbuf: add definitions of unified
> > packet types
> >
> > Hi Helin,
> >
> > On 02/02/2015 02:43 AM, Zhang, Helin wrote:
> > >>> +/*
> > >>> + * Sixteen bits are divided into several fields to mark packet types.
> > >>> +Note that
> > >>> + * each field is indexical.
> > >>> + * - Bit 3:0 is for tunnel types.
> > >>> + * - Bit 7:4 is for L3 or outer L3 (for tunneling case) types.
> > >>> + * - Bit 10:8 is for L4 types. It can also be used for inner L4 types 
> > >>> for
> > >>> + *   tunneling packets.
> > >>> + * - Bit 13:11 is for inner L3 types.
> > >>> + * - Bit 15:14 is reserved.
> > >>
> > >> Is there a reason why using this specific order?
> > > Yes, to support ixgbe Vector PMD, outer L3 types and L4 types need
> > > to be contiguous and in this order.
> >
> > When you say "need to be", do you mean it's impossible to do in
> > another manner or just that it would be slower?
> It was designed to be like this, otherwise, performance drop must be expected.
> 
> >
> > >> Also, there are 4 bits for outer L3 types and 3 bits for inner L3
> > >> types, but both of them have 6 different supported types. Is it 
> > >> intentional?
> > > Yes, it is to support ixgbe Vector PMD. Contiguous 7 bits are
> > > needed, though
> > 1 bit wasted.
> >
> > To be honnest, I'm always a surprised that in dpdk we prefer having a
> > strange API just because it's faster or easier to do on one specific
> > driver (usually i40e or ixgbe). Unfortunately, trying to optimize the
> > API for one driver may result in making the rest of the code
> > (application and other drivers) slower and more complex.
> Based on my understanding, 'faster' is most of DPDK customers wanted.
> Otherwise, they don't need DPDK. Different hardware must have different
> capabilities, I am trying to unify at least packet types to get things easier.
> 
> >
> > In your proposition, there is no inner l4_type. I consider it's as
> > useful as the other fields. From what I see, there are only 2 bits
> > left. What do you think about changing the packet type to 64 bits now?
> For tunneling cases, L4_type is for inner L4 type, outer L4 type is not 
> needed, as
> it can be in tunnel type.
> I can expect 64 bits are needed in the future. But for now, I don't see any
> strong demand on that for currently supported hardware.
> In addition, there is no free bit in the first cache line of mbuf header, mbuf
> changes are needed to expand it. I'd prefer to do it later to make things 
> easier.
Sorry, I misremembered the usage of the first cache line of mbuf. It still has
some free space. Based on this, enlarging the packet type (to 32 or 64 bits)
might be good.

> 
> >
> > From an API point of view, I think it would be good to have the same
> > structure for inner and outer types. For instance (this is just an example):
> >
> > union layer_pkt_type {
> > struct {
> > uint16_t l2_type:4;
> > uint16_t l3_type:4;
> > uint16_t l4_type:4;
> > uint16_t tun_type:4;
> > };
> > uint16_t u16;
> > };
> >
> > struct pkt_type {
> > union layer_pkt_type outer;
> > union layer_pkt_type inner;
> > };
> >
> > When your application decapsulates tunnels, you can just do outer =
> > inner and enter into the same code.
> Expanding packet_type is not easy, as there is no free bits in the first cache
> line.
> Is there any tunnel type in inner packet? Is it a waste?
> Is L2 type really needed? I don't know.
If the mbuf is not short of space after all, a definition like yours might be good.
But tun_type is not required for the inner packet, so I'd prefer to define only
the fields that are needed, taking Vector PMD support into account. It seems
32 bits might be enough, as below:
struct pkt_type {
    uint32_t l2_type:4;
    uint32_t l3_type:4;
    uint32_t l4_type:4;
    uint32_t tun_type:4;
    uint32_t inner_l2_type:4;
    uint32_t inner_l3_type:4;
    uint32_t inner_l4_type:4;
};
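
As an illustration only, the 32-bit layout above could be exercised as in the
sketch below. The EX_* values and the pkt_type_decap() helper are hypothetical
names made up for this example; they are not the real RTE_PTYPE_* numbering or
any DPDK API. Bit-field packing is compiler-dependent, so the 4-byte size is an
assumption that holds for GCC and Clang on common targets.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Seven 4-bit fields: 28 of the 32 bits are used, 4 stay spare. */
struct pkt_type {
	uint32_t l2_type:4;
	uint32_t l3_type:4;
	uint32_t l4_type:4;
	uint32_t tun_type:4;
	uint32_t inner_l2_type:4;
	uint32_t inner_l3_type:4;
	uint32_t inner_l4_type:4;
};

/* Hypothetical type codes, for illustration only. */
enum { EX_L3_IPV4 = 1, EX_L4_UDP = 2, EX_L4_TCP = 3, EX_TUN_VXLAN = 4 };

/*
 * After an application strips the tunnel header, the inner types become
 * the outer types and the tunnel/inner fields are cleared.
 */
static struct pkt_type
pkt_type_decap(struct pkt_type t)
{
	struct pkt_type out;

	memset(&out, 0, sizeof(out));
	out.l2_type = t.inner_l2_type;
	out.l3_type = t.inner_l3_type;
	out.l4_type = t.inner_l4_type;
	return out;
}
```

With a flat layout like this, decapsulation needs three field copies rather
than the single "outer = inner" assignment of the union-based proposal; that is
the trade-off for fitting everything into 32 bits.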

Regards,
Helin

> 
> >
> >
> > >>> + * RTE_PTYPE_L3_IPV6, RTE_PTYPE_L3_IPV6_EXT, RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP
> > >>> + * and RTE_PTYPE_L4_SCTP should be kept as below in a contiguous 7 bits.
> > >>> + *
> > >>> + * Note that L3 types values are selected for checking IPV4/IPV6 header from
> > >>> + * performance point of view. Reading annotations of RTE_ETH_IS_IPV4_HDR and
> > >>> + * RTE_ETH_IS_IPV6_HDR is needed for any future changes of L3 type values.
> > >>> + */
> > >>> +#define RTE_PTYPE_UN

[dpdk-dev] [PATCH 03/18] fm10k: Add empty fm10k files

2015-02-03 Thread Chen, Jing D
Hi Neil,

> -Original Message-
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Monday, February 02, 2015 9:39 PM
> To: Chen, Jing D
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 03/18] fm10k: Add empty fm10k files
> 
> On Mon, Feb 02, 2015 at 05:34:43AM +, Chen, Jing D wrote:
> > Hi Neil,
> >
> > > -Original Message-
> > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > Sent: Saturday, January 31, 2015 10:02 PM
> > > To: Chen, Jing D
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 03/18] fm10k: Add empty fm10k files
> > >
> > > On Fri, Jan 30, 2015 at 01:07:19PM +0800, Chen Jing D(Mark) wrote:
> > > > From: Jeff Shaw 
> > > >
> > > > Define macros and basic data structure.
> > > > Define rte_log wrapper functions.
> > > >
> > > > Signed-off-by: Jeff Shaw 
> > > > Signed-off-by: Chen Jing D(Mark) 
> > > > ---
> > > >  lib/librte_pmd_fm10k/Makefile |   96 
> > > >  lib/librte_pmd_fm10k/fm10k.h  |  224
> > > +
> > > >  lib/librte_pmd_fm10k/fm10k_logs.h |   66 +++
> > > >  3 files changed, 386 insertions(+), 0 deletions(-)
> > > >  create mode 100644 lib/librte_pmd_fm10k/Makefile
> > > >  create mode 100644 lib/librte_pmd_fm10k/fm10k.h
> > > >  create mode 100644 lib/librte_pmd_fm10k/fm10k_ethdev.c
> > > >  create mode 100644 lib/librte_pmd_fm10k/fm10k_logs.h
> > > >  create mode 100644 lib/librte_pmd_fm10k/fm10k_rxtx.c
> > > >
> > > Why are you adding empty files?
> >
> > The 2 ".c" files are empty while the 2 ".h" files include code. "Makefile"
> > includes rules to compile the ".c" files, I don't like to break the compile
> > for every single patch, that's why the 2 ".c" files are added in this patch.
> >
> That doesn't really answer the question.  There's no need to add empty files
> here.  Just add the headers alone and add the empty files on the first commit
> where you have code to put in them.  Adjust the makefile so that you add
> them into the compilation in the same commit that you populate the file to
> avoid a FTBFS error.
> Neil

Got you. I'll add the content with new files. Thanks!

> 
> > >
> > > Neil
> >
> > Thanks for your comments.
> > Mark
> >


[dpdk-dev] [PATCH v6] testpmd: Add port hotplug support

2015-02-03 Thread Qiu, Michael
On 2/1/2015 12:02 PM, Tetsuya Mukawa wrote:
> The patch introduces following commands.
> - port attach [ident]
> - port detach [port_id]
>  - attach: attaching a port
>  - detach: detaching a port
>  - ident: pci address of physical device.
>   Or device name and parameters of virtual device.
>  (ex. :02:00.0, eth_pcap0,iface=eth0)
>  - port_id: port identifier
>
> v5:
> - Add testpmd documentation.
>   (Thanks to Iremonger, Bernard)
> v4:
>  - Fix strings of command help.
>
> Signed-off-by: Tetsuya Mukawa 
> ---
>  app/test-pmd/cmdline.c  | 133 +++
>  app/test-pmd/config.c   | 116 +---
>  app/test-pmd/parameters.c   |  22 ++-
>  app/test-pmd/testpmd.c  | 199 
> +---
>  app/test-pmd/testpmd.h  |  18 ++-
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  57 
>  6 files changed, 415 insertions(+), 130 deletions(-)
>
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index 4beb404..2f813d8 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -572,6 +572,12 @@ static void cmd_help_long_parsed(void *parsed_result,
>   "port close (port_id|all)\n"
>   "Close all ports or port_id.\n\n"
>  
> + "port attach (ident)\n"
> + "Attach physical or virtual dev by pci address or virtual device name\n\n"
> +
> + "port detach (port_id)\n"
> + "Detach physical or virtual dev by port_id\n\n"
> +
>   "port config (port_id|all)"
>   " speed (10|100|1000|1|4|auto)"
>   " duplex (half|full|auto)\n"
> @@ -848,6 +854,89 @@ cmdline_parse_inst_t cmd_operate_specific_port = {
>   },
>  };
>  
> +/* *** attach a specified port *** */
> +struct cmd_operate_attach_port_result {
> + cmdline_fixed_string_t port;
> + cmdline_fixed_string_t keyword;
> + cmdline_fixed_string_t identifier;
> +};
> +
> +static void cmd_operate_attach_port_parsed(void *parsed_result,
> + __attribute__((unused)) struct cmdline *cl,
> + __attribute__((unused)) void *data)
> +{
> + struct cmd_operate_attach_port_result *res = parsed_result;
> +
> + if (!strcmp(res->keyword, "attach"))
> + attach_port(res->identifier);
> + else
> + printf("Unknown parameter\n");
> +}
> +
> +cmdline_parse_token_string_t cmd_operate_attach_port_port =
> + TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
> + port, "port");
> +cmdline_parse_token_string_t cmd_operate_attach_port_keyword =
> + TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
> + keyword, "attach");
> +cmdline_parse_token_string_t cmd_operate_attach_port_identifier =
> + TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
> + identifier, NULL);
> +
> +cmdline_parse_inst_t cmd_operate_attach_port = {
> + .f = cmd_operate_attach_port_parsed,
> + .data = NULL,
> + .help_str = "port attach identifier, "
> + "identifier: pci address or virtual dev name",
> + .tokens = {
> + (void *)&cmd_operate_attach_port_port,
> + (void *)&cmd_operate_attach_port_keyword,
> + (void *)&cmd_operate_attach_port_identifier,
> + NULL,
> + },
> +};
> +
> +/* *** detach a specified port *** */
> +struct cmd_operate_detach_port_result {
> + cmdline_fixed_string_t port;
> + cmdline_fixed_string_t keyword;
> + uint8_t port_id;
> +};
> +
> +static void cmd_operate_detach_port_parsed(void *parsed_result,
> + __attribute__((unused)) struct cmdline *cl,
> + __attribute__((unused)) void *data)
> +{
> + struct cmd_operate_detach_port_result *res = parsed_result;
> +
> + if (!strcmp(res->keyword, "detach"))
> + detach_port(res->port_id);
> + else
> + printf("Unknown parameter\n");
> +}
> +
> +cmdline_parse_token_string_t cmd_operate_detach_port_port =
> + TOKEN_STRING_INITIALIZER(struct cmd_operate_detach_port_result,
> + port, "port");
> +cmdline_parse_token_string_t cmd_operate_detach_port_keyword =
> + TOKEN_STRING_INITIALIZER(struct cmd_operate_detach_port_result,
> + keyword, "detach");
> +cmdline_parse_token_num_t cmd_operate_detach_port_port_id =
> + TOKEN_NUM_INITIALIZER(struct cmd_operate_detach_port_result,
> + port_id, UINT8);
> +
> +cmdline_parse_inst_t cmd_operate_detach_port = {
> + .f = cmd_operate_detach_port_parsed,
> + .data = NULL,
> + .help_str = "port detach port_id",
> + .tokens = {
> + (void *)&cmd_operate_detach_port_port,
> +  

[dpdk-dev] vhost: virtio-net rx-ring stop work after work many hours, bug?

2015-02-03 Thread Linhaifeng
I found that, since 2bbb811, the new code tries to notify the guest after
sending each packet, so this bug no longer exists.

static inline uint32_t __attribute__((always_inline)) 
virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,
struct rte_mbuf **pkts, uint32_t count) {
... ...

for (pkt_idx = 0; pkt_idx < count; pkt_idx++) {

... ...

/* Kick the guest if necessary. */
if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
eventfd_write((int)vq->kickfd, 1);
}

return count;
}
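
For readers unfamiliar with the kick mechanism, the following is a minimal
standalone sketch of an eventfd-based notification with a suppression flag. It
is not the vhost code: EX_NO_INTERRUPT, kick_if_needed() and drain_kicks() are
hypothetical names used only here, with EX_NO_INTERRUPT standing in for
VRING_AVAIL_F_NO_INTERRUPT. Only the Linux eventfd calls are real API.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Stand-in for VRING_AVAIL_F_NO_INTERRUPT (illustrative value). */
#define EX_NO_INTERRUPT 1

/*
 * Signal the peer through the eventfd unless it asked for notifications
 * to be suppressed. Returns 1 if a kick was sent, 0 if suppressed,
 * -1 if the write failed.
 */
static int
kick_if_needed(int fd, uint16_t avail_flags)
{
	if (avail_flags & EX_NO_INTERRUPT)
		return 0;
	return eventfd_write(fd, 1) == 0 ? 1 : -1;
}

/*
 * Receiver side: read and reset the eventfd counter. Returns the number
 * of kicks accumulated since the last read, or -1 if the read failed
 * (for a non-blocking fd this includes "no kick pending").
 */
static int64_t
drain_kicks(int fd)
{
	eventfd_t n;

	if (eventfd_read(fd, &n) != 0)
		return -1;
	return (int64_t)n;
}
```

Because the eventfd counter accumulates, several kicks sent before the receiver
wakes up are delivered as one readable event whose value is their sum.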

thank you very much!

On 2015/1/27 15:57, Linhaifeng wrote:
> Hi,all
> 
> I use vhost-user to send data to a VM. At first it works well, but after many
> hours the VM can no longer receive data, although it can still send data.
> 
> (gdb)p avail_idx
> $4 = 2668
> (gdb)p free_entries
> $5 = 0
> (gdb)l
> /* check that we have enough buffers */
> if (unlikely(count > free_entries))
> count = free_entries;
> 
> if (count == 0){
> int b=0;
> if(b) { // when set b=1 to notify guest rx_ring will restart to work
> if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT)) {
> 
> eventfd_write(vq->callfd, 1);
> }
> }
> return 0;
> }
> 
> some info i print in guest:
> 
> net eth3:vi->num=199
> net eth3:rvq info: num_free=57, used->idx=2668, avail->idx=2668
> net eth3:svq info: num_free=254, used->idx=1644, avail->idx=1644
> 
> net eth3:vi->num=199
> net eth3:rvq info: num_free=57, used->idx=2668, avail->idx=2668
> net eth3:svq info: num_free=254, used->idx=1645, avail->idx=1645
> 
> net eth3:vi->num=199
> net eth3:rvq info: num_free=57, used->idx=2668, avail->idx=2668
> net eth3:svq info: num_free=254, used->idx=1646, avail->idx=1646
> 
> # free
>              total       used       free     shared    buffers     cached
> Mem:       3924100     337252    3586848          0      95984     138060
> -/+ buffers/cache:     103208    3820892
> Swap:       970748          0     970748
> 
> I have two questions:
> 1. Should we notify the guest when there is no buffer in vq->avail?
> 2. Why does virtio_net stop filling avail?
> 
> 
> 
> 
> 
> 

-- 
Regards,
Haifeng



[dpdk-dev] [PATCH 1/2] rte_ethdev: update link status (speed, duplex, link_up) after rte_eth_dev_start

2015-02-03 Thread Jia Yu
My answer to Helin's comments:

This patch is needed for bond slave devices or other devices, when the LSC
interrupt is enabled.
 1. slave_configure()  -> slave_eth_dev->[...].lsc = 1

 2. rte_eth_link_get() reads dev_link from eth_dev when the LSC interrupt is
enabled. However, the dev_link on eth_dev has not been initialized and
shows the link down state. This patch initializes the device's dev_link at
rte_eth_dev_start time.
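
The behaviour described in the two points above can be modelled with a small
standalone sketch. All ex_* names are hypothetical stand-ins invented for this
example (they are not DPDK types or functions); the hw field simply plays the
role of the hardware link register.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct ex_link {
	int speed;	/* link speed in Mbps */
	int up;		/* 1 if link is up */
};

struct ex_dev {
	int lsc_enabled;	/* models dev_conf.intr_conf.lsc */
	struct ex_link link;	/* cached status that readers see */
	struct ex_link hw;	/* stand-in for the hardware register */
	int (*link_update)(struct ex_dev *dev, int wait_to_complete);
};

/* Copy the "hardware" state into the cached link status. */
static int
ex_link_update(struct ex_dev *dev, int wait_to_complete)
{
	(void)wait_to_complete;
	dev->link = dev->hw;
	return 0;
}

/*
 * Start the device. With LSC enabled, readers only see the cached
 * status, so it is refreshed once here; otherwise it would stay at its
 * zero-initialized "down" value until the first LSC interrupt.
 */
static int
ex_dev_start(struct ex_dev *dev)
{
	if (dev->lsc_enabled != 0) {
		if (dev->link_update == NULL)
			return -ENOTSUP;
		dev->link_update(dev, 0);
	}
	return 0;
}
```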

Please let me know if you have further questions/comments.

Thanks,
Jia

On 1/30/15, 2:28 AM, "Thomas Monjalon"  wrote:

>Jia, any news on this patchset?
>
>2014-11-12 03:57, Zhang, Helin:
>> Hi Jia
>> 
>> > -Original Message-
>> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jia Yu
>> > Sent: Saturday, November 8, 2014 1:32 AM
>> > To: dev at dpdk.org
>> > Subject: [dpdk-dev] [PATCH 1/2] rte_ethdev: update link status (speed, duplex,
>> > link_up) after rte_eth_dev_start
>> > 
>> > Since the LSC interrupt is disabled by PMD drivers, link status in
>> > rte_eth_device is always down.
>> If the LSC interrupt is disabled by default, the driver polls the link status
>> during initialization or in dev_start, so the link status should be correct,
>> if I am not wrong.
>> 
>> > Bond slave_configure() enables the LSC interrupt on devices to get a
>> > notification if link status changes. However, the LSC interrupt at device
>> > start time is still lost.
>> Before enabling the interrupt for LSC, the link status should be polled. So
>> after the port startup, the link status should be there.
>> 
>> > 
>> > In this fix, call link_update to read link status from the hardware
>> > register at device start time.
>> Could you help to explain this code change a bit more? Why do we need it?
>> 
>> > 
>> > Issue:
>> > Change-Id: Ib57a1c9114f922485c7b0f4338bfe7b3d3f87d65
>> > Signed-off-by: Jia Yu 
>> > ---
>> >  lib/librte_ether/rte_ethdev.c | 4 
>> >  1 file changed, 4 insertions(+)
>> > 
>> > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
>> > index ff1c769..6c01b02 100644
>> > --- a/lib/librte_ether/rte_ethdev.c
>> > +++ b/lib/librte_ether/rte_ethdev.c
>> > @@ -869,6 +869,10 @@ rte_eth_dev_start(uint8_t port_id)
>> > 
>> >rte_eth_dev_config_restore(port_id);
>> > 
>> > +  if (dev->data->dev_conf.intr_conf.lsc != 0) {
>> > +  FUNC_PTR_OR_ERR_RET(*dev->dev_ops->link_update, -ENOTSUP);
>> > +  (*dev->dev_ops->link_update)(dev, 0);
>> > +  }
>> >return 0;
>> >  }
>> > 
>> > --
>> > 1.9.1
>> 
>> Regards,
>> Helin
>



[dpdk-dev] [PATCH v6 10/13] eal/pci: Cleanup pci driver initialization code

2015-02-03 Thread Tetsuya Mukawa
On 2015/02/03 14:05, Qiu, Michael wrote:
> On 2/3/2015 12:07 PM, Tetsuya Mukawa wrote:
>> On 2015/02/03 11:35, Qiu, Michael wrote:
>>> On 2/1/2015 12:02 PM, Tetsuya Mukawa wrote:
 - Add rte_eal_pci_close_one_driver()
   The function is used for closing the specified driver and device.
 - Add pci_invoke_all_drivers()
> [...]
  
 +#ifdef ENABLE_HOTPLUG
 +/*
 + * If vendor/device ID match, call the devuninit() function of the
 + * driver.
 + */
 +int
 +rte_eal_pci_close_one_driver(struct rte_pci_driver *dr,
 +  struct rte_pci_device *dev)
 +{
 +  struct rte_pci_id *id_table;
 +
 +  if ((dr == NULL) || (dev == NULL))
 +  return -EINVAL;
 +
 +  for (id_table = dr->id_table ; id_table->vendor_id != 0; id_table++) {
 +
 +  /* check if device's identifiers match the driver's ones */
 +  if (id_table->vendor_id != dev->id.vendor_id &&
 +  id_table->vendor_id != PCI_ANY_ID)
 +  continue;
 +  if (id_table->device_id != dev->id.device_id &&
 +  id_table->device_id != PCI_ANY_ID)
 +  continue;
 +  if (id_table->subsystem_vendor_id !=
 +  dev->id.subsystem_vendor_id &&
 +  id_table->subsystem_vendor_id != PCI_ANY_ID)
 +  continue;
 +  if (id_table->subsystem_device_id !=
 +  dev->id.subsystem_device_id &&
 +  id_table->subsystem_device_id != PCI_ANY_ID)
 +  continue;
 +
 +  struct rte_pci_addr *loc = &dev->addr;
 +
 +  RTE_LOG(DEBUG, EAL,
 +  "PCI device "PCI_PRI_FMT" on NUMA socket %i\n",
 +  loc->domain, loc->bus, loc->devid,
 +  loc->function, dev->numa_node);
 +
 +  RTE_LOG(DEBUG, EAL, "  remove driver: %x:%x %s\n",
 +  dev->id.vendor_id, dev->id.device_id,
 +  dr->name);
 +
 +  /* call the driver devuninit() function */
 +  if (dr->devuninit && (dr->devuninit(dr, dev) < 0))
 +  return -1;  /* negative value is an error */
 +
 +  /* clear driver structure */
 +  dev->driver = NULL;
 +
 +  if (dr->drv_flags & RTE_PCI_DRV_NEED_MAPPING)
 +  /* unmap resources for devices that use igb_uio */
 +  pci_unmap_device(dev);
>>> Hi, Tetsuya
>>>
>>> I have one question. As the code shows, pci_unmap_device() checks
>>> pt_driver.
>>>
>>> But assume we try to detach a vfio device: after the "unsupported" error
>>> message is printed, is the port still workable?
>>>
>>> I think the port will be unworkable, am I right?
>>>
>>> But actually, we should keep it workable.
>>>
>>> My suggestion is to add a check in  rte_eth_dev_check_detachable() for
>>> pci_device port.
>> Hi Michael,
>>
>> I appreciate your comment.
>> In the function called "rte_eal_dev_detach_pdev()",
>> "rte_eth_dev_check_detachable()" is already called.
> What I mean is to check the pt_driver for pci_dev in
> rte_eth_dev_check_detachable(), so that the hotplug framework will not
> affect vfio devices, just as I replied in another mail.
>
> The current logic will affect vfio devices on a detach attempt (I have not
> really tested this, it is just what the logic shows), am I right?

Thanks, I've got your point.
Yes, you are right. I will fix it.

Tetsuya

> Thanks,
> Michael
>  
>> But in the future, someone may want to reuse
>> "rte_eal_pci_close_one_driver()".
>> So I will add the checking like your suggestion.
>>
>> Thanks,
>> Tetsuya
>>
>>> Thanks
>>> Michael
>>>
 +
 +  return 0;
 +  }
 +  /* return positive value if driver is not found */
 +  return 1;
 +}
 +#else /* ENABLE_HOTPLUG */
 +int
 +rte_eal_pci_close_one_driver(struct rte_pci_driver *dr __rte_unused,
 +  struct rte_pci_device *dev __rte_unused)
 +{
 +  RTE_LOG(ERR, EAL, "Hotplug support isn't enabled\n");
 +  return -1;
 +}
 +#endif /* ENABLE_HOTPLUG */
 +
  /* Init the PCI EAL subsystem */
  int
  rte_eal_pci_init(void)
>>



[dpdk-dev] [PATCH v2 0/5] Interrupt mode for PMD

2015-02-03 Thread Zhou Danny
v2 changes
- Fix compilation issue in Makefile for missed header file.
- Consolidate internal and community review comments of v1 patch set.

This patch series introduces low-latency one-shot rx interrupts into DPDK, with
an example of polling/interrupt mode switch control.

The DPDK userspace interrupt notification and handling mechanism is based on
UIO, with the following limitations:
1) It is designed to handle the LSC interrupt only, with an inefficient
suspended pthread wakeup procedure (e.g. UIO wakes up the LSC interrupt
handling thread, which then wakes up the DPDK polling thread). This introduces
non-deterministic wakeup latency for the DPDK polling thread, as well as packet
latency if it is used to handle Rx interrupts.
2) UIO only supports a single interrupt vector, which has to be shared by the
LSC interrupt and the interrupts assigned to dedicated rx queues.

This patchset includes the following features:
1) Enable one-shot rx queue interrupts in the ixgbe PMD (PF & VF) and the igb
PMD (PF only).
2) Build on top of the VFIO mechanism instead of UIO, so it can support
up to 64 interrupt vectors for rx queue interrupts.
3) Have one DPDK polling thread handle each Rx queue interrupt with a dedicated
VFIO eventfd, which eliminates non-deterministic pthread wakeup latency in
user space.
4) Demonstrate the interrupt control APIs and a userspace NAPI-like
polling/interrupt switch algorithm in the L3fwd-power example.
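
To make the per-queue eventfd idea concrete, here is a minimal sketch of a
polling thread blocking on one queue's eventfd with epoll. It uses only
standard Linux calls; watch_queue_fd() and wait_rx_event() are hypothetical
helper names for this example and are not functions from the patch set. In the
real design the eventfd would be signalled by VFIO on an rx interrupt; here any
writer to the fd plays that role.

```c
#include <assert.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Create an epoll instance watching one rx queue eventfd.
 * Returns the epoll fd, or -1 on failure.
 */
static int
watch_queue_fd(int queue_fd)
{
	struct epoll_event ev;
	int epfd = epoll_create1(0);

	if (epfd < 0)
		return -1;
	memset(&ev, 0, sizeof(ev));
	ev.events = EPOLLIN;
	ev.data.fd = queue_fd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, queue_fd, &ev) < 0) {
		close(epfd);
		return -1;
	}
	return epfd;
}

/*
 * Sleep until the queue's eventfd fires or timeout_ms expires.
 * Returns 1 if an event arrived (the counter is drained so the next
 * wait blocks again), 0 on timeout, -1 on error.
 */
static int
wait_rx_event(int epfd, int timeout_ms)
{
	struct epoll_event ev;
	eventfd_t cnt;
	int n = epoll_wait(epfd, &ev, 1, timeout_ms);

	if (n <= 0)
		return n;
	if (eventfd_read(ev.data.fd, &cnt) != 0)
		return -1;
	return 1;
}
```

A NAPI-like loop would poll the queue while packets keep arriving, and only
after a run of empty polls enable the queue interrupt and call something like
wait_rx_event() before resuming polling.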

Known limitations:
1) It does not work for UIO, because a single interrupt eventfd shared by the
LSC and rx queue interrupt handlers causes a mess.
2) The LSC interrupt is not supported by the VF driver, so it is disabled by
default in L3fwd-power for now. Feel free to turn it on if you want to support
both LSC and rx queue interrupts on a PF.

Danny Zhou (5):
  ethdev: add rx interrupt enable/disable functions
  ixgbe: enable rx queue interrupts for both PF and VF
  igb: enable rx queue interrupts for PF
  eal: add per rx queue interrupt handling based on VFIO
  l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode
switch

 examples/l3fwd-power/main.c| 141 +---
 lib/librte_eal/common/include/rte_eal.h|  12 +
 lib/librte_eal/linuxapp/eal/Makefile   |   1 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c   | 181 +++---
 lib/librte_eal/linuxapp/eal/eal_pci_vfio.c |  11 +-
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |   4 +
 lib/librte_ether/rte_ethdev.c  |  45 +++
 lib/librte_ether/rte_ethdev.h  |  57 
 lib/librte_pmd_e1000/e1000/e1000_hw.h  |   3 +
 lib/librte_pmd_e1000/e1000_ethdev.h|   6 +
 lib/librte_pmd_e1000/igb_ethdev.c  | 230 +++--
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c| 377 -
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h|   9 +
 13 files changed, 965 insertions(+), 112 deletions(-)

-- 
1.8.1.4



[dpdk-dev] [PATCH v2 1/5] ethdev: add rx interrupt enable/disable functions

2015-02-03 Thread Zhou Danny
Add two dev_ops functions to enable and disable rx queue interrupts

Signed-off-by: Danny Zhou 
Tested-by: Yong Liu 
---
 lib/librte_ether/rte_ethdev.c | 45 ++
 lib/librte_ether/rte_ethdev.h | 57 +++
 2 files changed, 102 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index ea3a1fb..dd66cd9 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -2825,6 +2825,51 @@ _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
}
rte_spinlock_unlock(&rte_eth_dev_cb_lock);
 }
+
+int
+rte_eth_dev_rx_queue_intr_enable(uint8_t port_id,
+   uint16_t queue_id)
+{
+   struct rte_eth_dev *dev;
+
+   if (port_id >= nb_ports) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return (-ENODEV);
+   }
+
+   dev = &rte_eth_devices[port_id];
+   if (dev == NULL) {
+   PMD_DEBUG_TRACE("Invalid port device\n");
+   return (-ENODEV);
+   }
+
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
+   (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
+   return 0;
+}
+
+int
+rte_eth_dev_rx_queue_intr_disable(uint8_t port_id,
+   uint16_t queue_id)
+{
+   struct rte_eth_dev *dev;
+
+   if (port_id >= nb_ports) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return (-ENODEV);
+   }
+
+   dev = &rte_eth_devices[port_id];
+   if (dev == NULL) {
+   PMD_DEBUG_TRACE("Invalid port device\n");
+   return (-ENODEV);
+   }
+
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_disable, -ENOTSUP);
+   (*dev->dev_ops->rx_queue_intr_disable)(dev, queue_id);
+   return 0;
+}
+
 #ifdef RTE_NIC_BYPASS
 int rte_eth_dev_bypass_init(uint8_t port_id)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 1200c1c..c080039 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -848,6 +848,8 @@ struct rte_eth_fdir {
 struct rte_intr_conf {
/** enable/disable lsc interrupt. 0 (default) - disable, 1 enable */
uint16_t lsc;
+   /** enable/disable rxq interrupt. 0 (default) - disable, 1 enable */
+   uint16_t rxq;
 };

 /**
@@ -1108,6 +1110,14 @@ typedef int (*eth_tx_queue_setup_t)(struct rte_eth_dev 
*dev,
const struct rte_eth_txconf *tx_conf);
 /**< @internal Setup a transmit queue of an Ethernet device. */

+typedef int (*eth_rx_enable_intr_t)(struct rte_eth_dev *dev,
+   uint16_t rx_queue_id);
+/**< @internal Enable interrupt of a receive queue of an Ethernet device. */
+
+typedef int (*eth_rx_disable_intr_t)(struct rte_eth_dev *dev,
+   uint16_t rx_queue_id);
+/**< @internal Disable interrupt of a receive queue of an Ethernet device. */
+
 typedef void (*eth_queue_release_t)(void *queue);
 /**< @internal Release memory resources allocated by given RX/TX queue. */

@@ -1444,6 +1454,8 @@ struct eth_dev_ops {
eth_queue_start_t  tx_queue_start;/**< Start TX for a queue.*/
eth_queue_stop_t   tx_queue_stop;/**< Stop TX for a queue.*/
eth_rx_queue_setup_t   rx_queue_setup;/**< Set up device RX queue.*/
+   eth_rx_enable_intr_t   rx_queue_intr_enable; /**< Enable Rx queue interrupt. */
+   eth_rx_disable_intr_t  rx_queue_intr_disable; /**< Disable Rx queue interrupt.*/
eth_queue_release_trx_queue_release;/**< Release RX queue.*/
eth_rx_queue_count_t   rx_queue_count; /**< Get Rx queue count. */
eth_rx_descriptor_done_t   rx_descriptor_done;  /**< Check rxd DD bit */
@@ -2810,6 +2822,51 @@ void _rte_eth_dev_callback_process(struct rte_eth_dev 
*dev,
enum rte_eth_event_type event);

 /**
+ * When there is no rx packet coming in Rx Queue for a long time, we can
+ * sleep lcore related to RX Queue for power saving, and enable rx interrupt
+ * to be triggered when rx packet arrives.
+ *
+ * The rte_eth_dev_rx_queue_intr_enable() function enables rx queue
+ * interrupt on specific rx queue of a port.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The index of the receive queue from which to retrieve input packets.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @return
+ *   - (0) if successful.
+ *   - (-ENOTSUP) if underlying hardware OR driver doesn't support
+ * that operation.
+ *   - (-ENODEV) if *port_id* invalid.
+ */
+int rte_eth_dev_rx_queue_intr_enable(uint8_t port_id,
+   uint16_t queue_id);
+
+/**
+ * When lcore wakes up from rx interrupt indicating packet coming, disable rx
+ * interrup

[dpdk-dev] [PATCH v2 3/5] igb: enable rx queue interrupts for PF

2015-02-03 Thread Zhou Danny
v2 changes
- Consolidate review comments related to coding style

This patch does the following for igb PF:
- Setup NIC to generate MSI-X interrupts
- Set the IVAR register to map interrupt causes to vectors
- Implement interrupt enable/disable functions

Signed-off-by: Danny Zhou 
Tested-by: Yong Liu 
---
 lib/librte_pmd_e1000/e1000/e1000_hw.h |   3 +
 lib/librte_pmd_e1000/e1000_ethdev.h   |   6 +
 lib/librte_pmd_e1000/igb_ethdev.c | 230 ++
 3 files changed, 214 insertions(+), 25 deletions(-)

diff --git a/lib/librte_pmd_e1000/e1000/e1000_hw.h b/lib/librte_pmd_e1000/e1000/e1000_hw.h
index 4dd92a3..9b999ec 100644
--- a/lib/librte_pmd_e1000/e1000/e1000_hw.h
+++ b/lib/librte_pmd_e1000/e1000/e1000_hw.h
@@ -780,6 +780,9 @@ struct e1000_mac_info {
u16 mta_reg_count;
u16 uta_reg_count;

+   u32 max_rx_queues;
+   u32 max_tx_queues;
+
/* Maximum size of the MTA register table in all supported adapters */
#define MAX_MTA_REG 128
u32 mta_shadow[MAX_MTA_REG];
diff --git a/lib/librte_pmd_e1000/e1000_ethdev.h b/lib/librte_pmd_e1000/e1000_ethdev.h
index d155e77..713ca11 100644
--- a/lib/librte_pmd_e1000/e1000_ethdev.h
+++ b/lib/librte_pmd_e1000/e1000_ethdev.h
@@ -34,6 +34,8 @@
 #ifndef _E1000_ETHDEV_H_
 #define _E1000_ETHDEV_H_

+#include 
+
 /* need update link, bit flag */
 #define E1000_FLAG_NEED_LINK_UPDATE (uint32_t)(1 << 0)
 #define E1000_FLAG_MAILBOX  (uint32_t)(1 << 1)
@@ -105,10 +107,14 @@
 #define E1000_FTQF_QUEUE_SHIFT   16
 #define E1000_FTQF_QUEUE_ENABLE  0x0100

+/* maximum number of other interrupts besides Rx & Tx interrupts */
+#define E1000_MAX_OTHER_INTR   1
+
 /* structure for interrupt relative data */
 struct e1000_interrupt {
uint32_t flags;
uint32_t mask;
+   rte_spinlock_t lock;
 };

 /* local vfta copy */
diff --git a/lib/librte_pmd_e1000/igb_ethdev.c b/lib/librte_pmd_e1000/igb_ethdev.c
index 2a268b8..7d9b103 100644
--- a/lib/librte_pmd_e1000/igb_ethdev.c
+++ b/lib/librte_pmd_e1000/igb_ethdev.c
@@ -97,6 +97,7 @@ static int  eth_igb_flow_ctrl_get(struct rte_eth_dev *dev,
 static int  eth_igb_flow_ctrl_set(struct rte_eth_dev *dev,
struct rte_eth_fc_conf *fc_conf);
 static int eth_igb_lsc_interrupt_setup(struct rte_eth_dev *dev);
+static int eth_igb_rxq_interrupt_setup(struct rte_eth_dev *dev);
 static int eth_igb_interrupt_get_status(struct rte_eth_dev *dev);
 static int eth_igb_interrupt_action(struct rte_eth_dev *dev);
 static void eth_igb_interrupt_handler(struct rte_intr_handle *handle,
@@ -191,6 +192,14 @@ static int eth_igb_filter_ctrl(struct rte_eth_dev *dev,
 enum rte_filter_op filter_op,
 void *arg);

+static int eth_igb_rx_queue_intr_enable(struct rte_eth_dev *dev, uint16_t 
queue_id);
+static int eth_igb_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t 
queue_id);
+static void eth_igb_assign_msix_vector(struct e1000_hw *hw, int8_t direction,
+   uint8_t queue, uint8_t msix_vector);
+static void eth_igb_configure_msix_intr(struct  e1000_hw *hw);
+static void eth_igb_write_ivar(struct e1000_hw *hw, uint8_t msix_vector,
+   uint8_t index, uint8_t offset);
+
 /*
  * Define VF Stats MACRO for Non "cleared on read" register
  */
@@ -250,6 +259,8 @@ static struct eth_dev_ops eth_igb_ops = {
.vlan_tpid_set= eth_igb_vlan_tpid_set,
.vlan_offload_set = eth_igb_vlan_offload_set,
.rx_queue_setup   = eth_igb_rx_queue_setup,
+   .rx_queue_intr_enable = eth_igb_rx_queue_intr_enable,
+   .rx_queue_intr_disable = eth_igb_rx_queue_intr_disable,
.rx_queue_release = eth_igb_rx_queue_release,
.rx_queue_count   = eth_igb_rx_queue_count,
.rx_descriptor_done   = eth_igb_rx_descriptor_done,
@@ -592,6 +603,16 @@ eth_igb_dev_init(__attribute__((unused)) struct eth_driver 
*eth_drv,
 eth_dev->data->port_id, pci_dev->id.vendor_id,
 pci_dev->id.device_id);

+   /* set max interrupt vfio request */
+   struct rte_eth_dev_info dev_info;
+
+   memset(&dev_info, 0, sizeof(dev_info));
+   eth_igb_infos_get(eth_dev, &dev_info);
+
+   hw->mac.max_rx_queues = dev_info.max_rx_queues;
+
+   pci_dev->intr_handle.max_intr = hw->mac.max_rx_queues + 
E1000_MAX_OTHER_INTR;
+
rte_intr_callback_register(&(pci_dev->intr_handle),
eth_igb_interrupt_handler, (void *)eth_dev);

@@ -754,7 +775,7 @@ eth_igb_start(struct rte_eth_dev *dev)
 {
struct e1000_hw *hw =
E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-   int ret, i, mask;
+   int ret, mask;
uint32_t ctrl_ext;

PMD_INIT_FUNC_TRACE();
@@ -794,6 +815,9 @@ eth_igb_start(struct rte_eth_dev *dev)
/* configure PF module if SRIOV enabled */
igb_pf_host_configure(dev);

+   /* confiugre msix 

[dpdk-dev] [PATCH v2 4/5] eal: add per rx queue interrupt handling based on VFIO

2015-02-03 Thread Zhou Danny
v2 change:
- Fix compilation issue for a missed header file
- Bug fix: free unreleased resources on the exception path before return
- Consolidate coding style related review comments

This patch does the following:
- Create multiple VFIO eventfds for rx queues.
- Handle per rx queue interrupts.
- Eliminate the unnecessary suspended DPDK polling thread wakeup mechanism
for rx interrupts by allowing the polling thread to epoll_wait for rx queue
interrupt notifications.
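
As a rough sketch of the first point, the request that binds several eventfds
to consecutive MSI-X vectors can be built as below. build_msix_irq_set() is a
hypothetical helper written for this example, not a function from the patch;
struct vfio_irq_set and the VFIO_* constants come from the Linux <linux/vfio.h>
header. The VFIO_DEVICE_SET_IRQS ioctl itself is deliberately not issued, since
that needs a real VFIO device fd.

```c
#include <assert.h>
#include <linux/vfio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build a VFIO_DEVICE_SET_IRQS request that attaches nb_fds eventfds to
 * MSI-X vectors 0..nb_fds-1. The caller frees the returned buffer.
 * Returns NULL on bad arguments or allocation failure.
 */
static struct vfio_irq_set *
build_msix_irq_set(const int *fds, uint32_t nb_fds)
{
	struct vfio_irq_set *irq_set;
	size_t len;

	if (fds == NULL || nb_fds == 0)
		return NULL;

	/* Fixed header followed by one int (eventfd) per vector. */
	len = sizeof(*irq_set) + sizeof(int) * nb_fds;
	irq_set = calloc(1, len);
	if (irq_set == NULL)
		return NULL;

	irq_set->argsz = (uint32_t)len;
	irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
			 VFIO_IRQ_SET_ACTION_TRIGGER;
	irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
	irq_set->start = 0;
	irq_set->count = nb_fds;
	memcpy(irq_set->data, fds, sizeof(int) * nb_fds);
	return irq_set;
}
```

The resulting buffer would then be passed as the argument of
ioctl(vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set), with vfio_dev_fd being the
device fd obtained from VFIO.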

Signed-off-by: Danny Zhou 
Tested-by: Yong Liu 
---
 lib/librte_eal/common/include/rte_eal.h|  12 ++
 lib/librte_eal/linuxapp/eal/Makefile   |   1 +
 lib/librte_eal/linuxapp/eal/eal_interrupts.c   | 181 -
 lib/librte_eal/linuxapp/eal/eal_pci_vfio.c |  11 +-
 .../linuxapp/eal/include/exec-env/rte_interrupts.h |   4 +
 5 files changed, 167 insertions(+), 42 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index f4ecd2e..d81331f 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -150,6 +150,18 @@ int rte_eal_iopl_init(void);
  *   - On failure, a negative error value.
  */
 int rte_eal_init(int argc, char **argv);
+
+/**
+ * @param port_id
+ *   the port id
+ * @param queue_id
+ *   the queue id
+ * @return
+ *   - On success, return 0
+ *   - On failure, returns -1.
+ */
+int rte_eal_wait_rx_intr(uint8_t port_id, uint8_t queue_id);
+
 /**
  * Usage function typedef used by the application usage function.
  *
diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 72ecf3a..325957f 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -39,6 +39,7 @@ CFLAGS += -I$(SRCDIR)/include
 CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
 CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
 CFLAGS += -I$(RTE_SDK)/lib/librte_ring
+CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf
 CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
 CFLAGS += -I$(RTE_SDK)/lib/librte_malloc
 CFLAGS += -I$(RTE_SDK)/lib/librte_ether
diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index dc2668a..74da06c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -64,6 +64,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "eal_private.h"
 #include "eal_vfio.h"
@@ -127,6 +128,7 @@ static pthread_t intr_thread;
 #ifdef VFIO_PRESENT

 #define IRQ_SET_BUF_LEN  (sizeof(struct vfio_irq_set) + sizeof(int))
+#define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + sizeof(int) * 
(VFIO_MAX_QUEUE_ID + 1))

 /* enable legacy (INTx) interrupts */
 static int
@@ -221,7 +223,7 @@ vfio_disable_intx(struct rte_intr_handle *intr_handle) {
 /* enable MSI-X interrupts */
 static int
 vfio_enable_msi(struct rte_intr_handle *intr_handle) {
-   int len, ret;
+   int len, ret, max_intr;
char irq_set_buf[IRQ_SET_BUF_LEN];
struct vfio_irq_set *irq_set;
int *fd_ptr;
@@ -230,12 +232,19 @@ vfio_enable_msi(struct rte_intr_handle *intr_handle) {

irq_set = (struct vfio_irq_set *) irq_set_buf;
irq_set->argsz = len;
-   irq_set->count = 1;
+   if ((!intr_handle->max_intr) ||
+   (intr_handle->max_intr > VFIO_MAX_QUEUE_ID))
+   max_intr = VFIO_MAX_QUEUE_ID + 1;
+   else
+   max_intr = intr_handle->max_intr;
+
+   irq_set->count = max_intr;
irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | 
VFIO_IRQ_SET_ACTION_TRIGGER;
irq_set->index = VFIO_PCI_MSI_IRQ_INDEX;
irq_set->start = 0;
fd_ptr = (int *) &irq_set->data;
-   *fd_ptr = intr_handle->fd;
+   memcpy(fd_ptr, intr_handle->queue_fd, sizeof(intr_handle->queue_fd));
+   fd_ptr[max_intr - 1] = intr_handle->fd;

ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);

@@ -244,23 +253,6 @@ vfio_enable_msi(struct rte_intr_handle *intr_handle) {
intr_handle->fd);
return -1;
}
-
-   /* manually trigger interrupt to enable it */
-   memset(irq_set, 0, len);
-   len = sizeof(struct vfio_irq_set);
-   irq_set->argsz = len;
-   irq_set->count = 1;
-   irq_set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
-   irq_set->index = VFIO_PCI_MSI_IRQ_INDEX;
-   irq_set->start = 0;
-
-   ret = ioctl(intr_handle->vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
-
-   if (ret) {
-   RTE_LOG(ERR, EAL, "Error triggering MSI interrupts for fd %d\n",
-   intr_handle->fd);
-   return -1;
-   }
return 0;
 }

@@ -292,8 +284,8 @@ vfio_disable_msi(struct rte_intr_handle *intr_handle) {
 /* enable MSI-X interrupts */
 static int
 vfio_enable_msix(struct rte_intr_handle *intr_handle) {
-   int len, ret;
-   char 

[dpdk-dev] [PATCH v2 2/5] ixgbe: enable rx queue interrupts for both PF and VF

2015-02-03 Thread Zhou Danny
v2 changes:
- Address review comments related to coding style

This patch does the following for the ixgbe PF and VF:
- Set up the NIC to generate MSI-X interrupts
- Set the IVAR register to map interrupt causes to vectors
- Implement interrupt enable/disable functions

Signed-off-by: Danny Zhou 
Signed-off-by: Yong Liu 
Tested-by: Yong Liu 
---
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 377 +++-
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h |   9 +
 2 files changed, 382 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
index b341dd0..551847d 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c
@@ -60,6 +60,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 

 #include "ixgbe_logs.h"
@@ -83,6 +84,9 @@
  */
 #define IXGBE_FC_LO    0x40

+/* Default minimum inter-interrupt interval for EITR configuration */
+#define IXGBE_MIN_INTER_INTERRUPT_INTERVAL_DEFAULT    0x79E
+
 /* Timer value included in XOFF frames. */
 #define IXGBE_FC_PAUSE 0x680

@@ -173,6 +177,7 @@ static int ixgbe_dev_rss_reta_query(struct rte_eth_dev *dev,
uint16_t reta_size);
 static void ixgbe_dev_link_status_print(struct rte_eth_dev *dev);
 static int ixgbe_dev_lsc_interrupt_setup(struct rte_eth_dev *dev);
+static int ixgbe_dev_rxq_interrupt_setup(struct rte_eth_dev *dev);
 static int ixgbe_dev_interrupt_get_status(struct rte_eth_dev *dev);
 static int ixgbe_dev_interrupt_action(struct rte_eth_dev *dev);
 static void ixgbe_dev_interrupt_handler(struct rte_intr_handle *handle,
@@ -186,11 +191,14 @@ static void ixgbe_dcb_init(struct ixgbe_hw *hw,struct ixgbe_dcb_config *dcb_conf
 /* For Virtual Function support */
 static int eth_ixgbevf_dev_init(struct eth_driver *eth_drv,
struct rte_eth_dev *eth_dev);
+static int ixgbevf_dev_interrupt_get_status(struct rte_eth_dev *dev);
+static int ixgbevf_dev_interrupt_action(struct rte_eth_dev *dev);
 static int  ixgbevf_dev_configure(struct rte_eth_dev *dev);
 static int  ixgbevf_dev_start(struct rte_eth_dev *dev);
 static void ixgbevf_dev_stop(struct rte_eth_dev *dev);
 static void ixgbevf_dev_close(struct rte_eth_dev *dev);
 static void ixgbevf_intr_disable(struct ixgbe_hw *hw);
+static void ixgbevf_intr_enable(struct ixgbe_hw *hw);
 static void ixgbevf_dev_stats_get(struct rte_eth_dev *dev,
struct rte_eth_stats *stats);
 static void ixgbevf_dev_stats_reset(struct rte_eth_dev *dev);
@@ -200,6 +208,15 @@ static void ixgbevf_vlan_strip_queue_set(struct rte_eth_dev *dev,
uint16_t queue, int on);
 static void ixgbevf_vlan_offload_set(struct rte_eth_dev *dev, int mask);
 static void ixgbevf_set_vfta_all(struct rte_eth_dev *dev, bool on);
+static void ixgbevf_dev_interrupt_handler(struct rte_intr_handle *handle,
+   void *param);
+static int ixgbevf_dev_rx_queue_intr_enable(struct rte_eth_dev *dev,
+   uint16_t queue_id);
+static int ixgbevf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev,
+uint16_t queue_id);
+static void ixgbevf_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+uint8_t queue, uint8_t msix_vector);
+static void ixgbevf_configure_msix(struct  ixgbe_hw *hw);

 /* For Eth VMDQ APIs support */
 static int ixgbe_uc_hash_table_set(struct rte_eth_dev *dev, struct
@@ -217,6 +234,14 @@ static int ixgbe_mirror_rule_set(struct rte_eth_dev *dev,
 static int ixgbe_mirror_rule_reset(struct rte_eth_dev *dev,
uint8_t rule_id);

+static int ixgbe_dev_rx_queue_intr_enable(struct rte_eth_dev *dev,
+   uint16_t queue_id);
+static int ixgbe_dev_rx_queue_intr_disable(struct rte_eth_dev *dev,
+   uint16_t queue_id);
+static void ixgbe_set_ivar_map(struct ixgbe_hw *hw, int8_t direction,
+   uint8_t queue, uint8_t msix_vector);
+static void ixgbe_configure_msix(struct  ixgbe_hw *hw);
+
 static int ixgbe_set_queue_rate_limit(struct rte_eth_dev *dev,
uint16_t queue_idx, uint16_t tx_rate);
 static int ixgbe_set_vf_rate_limit(struct rte_eth_dev *dev, uint16_t vf,
@@ -257,7 +282,7 @@ static int ixgbe_dev_filter_ctrl(struct rte_eth_dev *dev,
  */
 #define UPDATE_VF_STAT(reg, last, cur) \
 {   \
-   u32 latest = IXGBE_READ_REG(hw, reg);   \
+   uint32_t latest = IXGBE_READ_REG(hw, reg);   \
cur += latest - last;   \
last = latest;  \
 }
@@ -338,6 +363,8 @@ static struct eth_dev_ops ixgbe_eth_dev_ops = {
.tx_queue_start   = ixgbe_dev_tx_queue_start,
.tx_queue_stop= ixgbe_dev_tx_queue_stop,
.rx_queue_setup   = ixgbe_dev_rx_queue_setup,
+   .rx_queue_intr_enable = ixgbe_dev_rx_queue_int

[dpdk-dev] [PATCH v2 5/5] l3fwd-power: enable one-shot rx interrupt and polling/interrupt mode switch

2015-02-03 Thread Zhou Danny
v2 changes:
- Remove an unused function that was only there for debugging

Demonstrate how to handle per-rx-queue interrupts in a NAPI-like
implementation in userspace. The DPDK polling thread mainly works in
polling mode and switches to interrupt mode only if no packets were
received in recent polls.
Userspace interrupt notification generally takes a lot more cycles than
in the kernel, so a one-shot interrupt is used here to guarantee minimum
overhead, and the DPDK polling thread returns to polling mode immediately
once it receives an interrupt notification for an incoming packet.
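
The mode switch described above can be pictured as a small state machine. What follows is only a rough sketch under assumptions: `rx_mode`, `rx_state`, `rx_on_poll()` and `rx_on_interrupt()` are hypothetical names that do not come from the patch, and the empty-poll limit is a free parameter rather than the value the example application uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the patch. */
enum rx_mode { RX_MODE_POLL, RX_MODE_INTERRUPT };

struct rx_state {
	enum rx_mode mode;
	uint32_t zero_polls;      /* consecutive polls that returned no packets */
	uint32_t zero_poll_limit; /* empty polls tolerated before leaving poll mode */
};

/* Account for one poll that returned nb_rx packets and pick the next mode. */
static enum rx_mode
rx_on_poll(struct rx_state *st, uint16_t nb_rx)
{
	assert(st->mode == RX_MODE_POLL);

	if (nb_rx > 0) {
		/* Traffic is flowing: stay in polling mode. */
		st->zero_polls = 0;
		return st->mode;
	}

	/* Too many empty polls in a row: a real application would arm a
	 * one-shot rx interrupt here and block until it fires. */
	if (++st->zero_polls >= st->zero_poll_limit)
		st->mode = RX_MODE_INTERRUPT;

	return st->mode;
}

/* The one-shot interrupt fired: return to polling immediately. */
static enum rx_mode
rx_on_interrupt(struct rx_state *st)
{
	st->zero_polls = 0;
	st->mode = RX_MODE_POLL;
	return st->mode;
}
```

The interrupt is treated as one-shot, so `rx_on_interrupt()` never stays in interrupt mode; the cost of a userspace notification is paid once per idle period rather than once per packet.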

Signed-off-by: Danny Zhou 
Tested-by: Yong Liu 
---
 examples/l3fwd-power/main.c | 141 +++-
 1 file changed, 100 insertions(+), 41 deletions(-)

diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index f6b55b9..15f0a5a 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -75,12 +75,13 @@
 #include 
 #include 
 #include 
+#include 

 #define RTE_LOGTYPE_L3FWD_POWER RTE_LOGTYPE_USER1

 #define MAX_PKT_BURST 32

-#define MIN_ZERO_POLL_COUNT 5
+#define MIN_ZERO_POLL_COUNT 10

 /* around 100ms at 2 Ghz */
 #define TIMER_RESOLUTION_CYCLES           200000000ULL
@@ -188,6 +189,9 @@ struct lcore_rx_queue {
 #define MAX_TX_QUEUE_PER_PORT RTE_MAX_ETHPORTS
 #define MAX_RX_QUEUE_PER_PORT 128

+#define MAX_RX_QUEUE_INTERRUPT_PER_PORT 16
+
+
 #define MAX_LCORE_PARAMS 1024
 struct lcore_params {
uint8_t port_id;
@@ -214,7 +218,7 @@ static uint16_t nb_lcore_params = sizeof(lcore_params_array_default) /

 static struct rte_eth_conf port_conf = {
.rxmode = {
-   .mq_mode= ETH_MQ_RX_RSS,
+   .mq_mode = ETH_MQ_RX_RSS,
.max_rx_pkt_len = ETHER_MAX_LEN,
.split_hdr_size = 0,
.header_split   = 0, /**< Header Split disabled */
@@ -226,11 +230,14 @@ static struct rte_eth_conf port_conf = {
.rx_adv_conf = {
.rss_conf = {
.rss_key = NULL,
-   .rss_hf = ETH_RSS_IP,
+   .rss_hf = ETH_RSS_UDP,
},
},
.txmode = {
-   .mq_mode = ETH_DCB_NONE,
+   .mq_mode = ETH_MQ_TX_NONE,
+   },
+   .intr_conf = {
+   .rxq = 1, /**< rxq interrupt feature enabled */
},
 };

@@ -402,19 +409,22 @@ power_timer_cb(__attribute__((unused)) struct rte_timer *tim,
/* accumulate total execution time in us when callback is invoked */
sleep_time_ratio = (float)(stats[lcore_id].sleep_time) /
(float)SCALING_PERIOD;
-
/**
 * check whether need to scale down frequency a step if it sleep a lot.
 */
-   if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD)
-   rte_power_freq_down(lcore_id);
+   if (sleep_time_ratio >= SCALING_DOWN_TIME_RATIO_THRESHOLD) {
+   if (rte_power_freq_down)
+   rte_power_freq_down(lcore_id);
+   }
else if ( (unsigned)(stats[lcore_id].nb_rx_processed /
-   stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST)
+   stats[lcore_id].nb_iteration_looped) < MAX_PKT_BURST) {
/**
 * scale down a step if average packet per iteration less
 * than expectation.
 */
-   rte_power_freq_down(lcore_id);
+   if (rte_power_freq_down)
+   rte_power_freq_down(lcore_id);
+   }

/**
 * initialize another timer according to current frequency to ensure
@@ -707,22 +717,20 @@ l3fwd_simple_forward(struct rte_mbuf *m, uint8_t portid,

 }

-#define SLEEP_GEAR1_THRESHOLD 100
-#define SLEEP_GEAR2_THRESHOLD 1000
+#define MINIMUM_SLEEP_TIME 1
+#define SUSPEND_THRESHOLD  300

 static inline uint32_t
 power_idle_heuristic(uint32_t zero_rx_packet_count)
 {
-   /* If zero count is less than 100, use it as the sleep time in us */
-   if (zero_rx_packet_count < SLEEP_GEAR1_THRESHOLD)
-   return zero_rx_packet_count;
-   /* If zero count is less than 1000, sleep time should be 100 us */
-   else if ((zero_rx_packet_count >= SLEEP_GEAR1_THRESHOLD) &&
-   (zero_rx_packet_count < SLEEP_GEAR2_THRESHOLD))
-   return SLEEP_GEAR1_THRESHOLD;
-   /* If zero count is greater than 1000, sleep time should be 1000 us */
-   else if (zero_rx_packet_count >= SLEEP_GEAR2_THRESHOLD)
-   return SLEEP_GEAR2_THRESHOLD;
+   /* If zero count is less than SUSPEND_THRESHOLD, sleep 1 us */
+   if (zero_rx_packet_count < SUSPEND_THRESHOLD)
+   return MINIMUM_SLEEP_TIME;
+   /* Otherwise return SUSPEND_THRESHOLD. About 100 us is the minimum
+    * latency of switching from C3/C6 to C0.
+    */
+   else
+   return SUSPEND_THRESHOLD;

return 0;
 }
@@ -762,6 +770,35 

[dpdk-dev] [PATCH v2 2/2] eal: add help option

2015-02-03 Thread Thomas Monjalon
2015-02-03 07:33, David Marchand:
> On Mon, Feb 2, 2015 at 6:44 PM, Thomas Monjalon 
> wrote:
> [snip]
> > @@ -340,6 +342,9 @@ eal_parse_args(int argc, char **argv)
> > continue;
> >
> > switch (opt) {
> > +   case 'h':
> > +   eal_usage(prgname);
> > +   exit(EXIT_SUCCESS);
> > default:
> > if (opt < OPT_LONG_MIN_NUM && isprint(opt)) {
> > RTE_LOG(ERR, EAL, "Option %c is not 
> > supported "
> [snip]
> > @@ -534,6 +536,10 @@ eal_parse_args(int argc, char **argv)
> > continue;
> >
> > switch (opt) {
> > +   case 'h':
> > +   eal_usage(prgname);
> > +   exit(EXIT_SUCCESS);
> > +
> > /* force loading of external driver */
> > case 'd':
> > solib = malloc(sizeof(*solib));
> 
> Why not move those two in common parser ?

Because it calls eal_usage(), which is not callable from the common parser.
eal_usage() prints usage for both common and environment-specific options.
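
To illustrate the layering, here is a rough sketch with made-up names (`sketch_common_usage()` and `sketch_env_usage()` are hypothetical, and the option texts are placeholders, not the real help strings). The environment-specific function can call down into the common one, but the common layer has no way to print the environment-specific part, which is why the 'h' case stays in each environment's parser.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Common layer: only knows the options shared by every environment. */
static int
sketch_common_usage(char *buf, size_t len)
{
	return snprintf(buf, len, "  -h  Display this help\n");
}

/*
 * Environment layer: writes the headline, the common block, then its own
 * options. Returns the number of characters written, or -1 if buf is too
 * small.
 */
static int
sketch_env_usage(char *buf, size_t len, const char *prgname)
{
	int n, total = 0;

	n = snprintf(buf, len, "Usage: %s [options]\n", prgname);
	if (n < 0 || (size_t)n >= len)
		return -1;
	total += n;

	n = sketch_common_usage(buf + total, len - total);
	if (n < 0 || (size_t)n >= len - total)
		return -1;
	total += n;

	n = snprintf(buf + total, len - total,
		     "  -d LIB  Load an external driver\n");
	if (n < 0 || (size_t)n >= len - total)
		return -1;

	return total + n;
}
```

The sketch writes into a buffer instead of calling printf() only so that the result can be checked; the call direction is the point.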

-- 
Thomas


[dpdk-dev] [PATCH v2 1/2] eal: sort and align options lists

2015-02-03 Thread Thomas Monjalon
2015-02-03 07:26, David Marchand:
> Two little comments.
> 
> On Mon, Feb 2, 2015 at 6:44 PM, Thomas Monjalon 
> wrote:
> > @@ -578,37 +579,36 @@ eal_check_common_options(struct internal_config
> > *internal_cfg)
> >  void
> >  eal_common_usage(void)
> >  {
> > -   printf("-c COREMASK -n NUM [-m NB] [-r NUM] [-b 
> > ]"
> > -  "[--proc-type primary|secondary|auto]\n\n"
> > +   printf("-c COREMASK|-l CORELIST -n CHANNELS [options]\n\n"
> >"EAL common options:\n"
> > -  "  -c COREMASK  : A hexadecimal bitmask of cores to run on\n"
> > -  "  -l CORELIST  : List of cores to run on\n"
> > -  " The argument format is 
> > [-c2][,c3[-c4],...]\n"
> [snip]
> > +  "  -n NUM  Number of memory channels\n"
> 
> Not really a problem, but for consistency : here, you are talking about
> NUM, while at first, you wrote -n CHANNELS.

Yes, you're right. I changed the headline but not the description of this
option. Will fix.

> [snip]
> > /* first long only option value must be >= 256, so that we won't
> >  * conflict with short options */
> > OPT_LONG_MIN_NUM = 256,
> > -#define OPT_HUGE_DIR"huge-dir"
> > -   OPT_HUGE_DIR_NUM = OPT_LONG_MIN_NUM,
> > -#define OPT_MASTER_LCORE "master-lcore"
> > +#define OPT_BASE_VIRTADDR "base-virtaddr"
> > +   OPT_BASE_VIRTADDR_NUM,
> 
> Why skip the first entry ?
> Afaik, OPT_BASE_VIRTADDR_NUM will be set to 257, is it to avoid having this
> = OPT_LONG_MIN_NUM moved anytime we add a new long option at the top of the
> enum ?

Exactly, yes. I don't think we care what the first number is.
It doesn't deserve a painful assignment.
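
A minimal sketch of the idea, using only the two enumerators quoted above plus one hypothetical entry (`OPT_SKETCH_NEXT_NUM` is not in the patch): the sentinel keeps its explicit value, and every real option just follows it, so a new option can be inserted at the top without moving any assignment.

```c
#include <assert.h>

enum {
	/* first long-only option value must be >= 256, so that it cannot
	 * conflict with short options */
	OPT_LONG_MIN_NUM = 256,
	OPT_BASE_VIRTADDR_NUM, /* implicitly 257 */
	OPT_SKETCH_NEXT_NUM,   /* hypothetical next option, implicitly 258 */
};
```

The price is one unused value (256), which costs nothing.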

> The rest looks good to me.
> Acked-by: David Marchand 

Thanks
-- 
Thomas


[dpdk-dev] [PATCH 1/2] rte_ethdev: update link status (speed, duplex, link_up) after rte_eth_dev_start

2015-02-03 Thread Zhang, Helin


> -Original Message-
> From: Jia Yu [mailto:jyu at vmware.com]
> Sent: Tuesday, February 3, 2015 4:00 PM
> To: Zhang, Helin
> Cc: dev at dpdk.org; Thomas Monjalon
> Subject: Re: [dpdk-dev] [PATCH 1/2] rte_ethdev: update link status (speed,
> duplex, link_up) after rte_eth_dev_start
> 
> My answer to Helin's comments:
> 
> This patch is needed for bond slave devices or other devices, when LSC
> interrupt is enabled.
>  1. slave_configure()  -> slave_eth_dev->?.lsc = 1
> 
>  2. rte_eth_link_get() reads dev_link from eth_dev, when lsc interrupt is
> enabled. However, the dev_link on eth_dev has not be initialized and showed
> link down state. This patch initializes the device's dev_link at
> rte_eth_dev_start time.
So the link update is for bonding only? But your code change is in rte_ethdev,
so it will be used for all PMDs. For a hardware NIC (e.g. 82599), this link
update is not needed at all; it just needs to wait for the link status change
event. So can those code changes be put in librte_pmd_bond rather than in
librte_ether?

Regards,
Helin

> 
> Please let me know if you have further questions/comments.
> 
> Thanks,
> Jia
> 
> On 1/30/15, 2:28 AM, "Thomas Monjalon" 
> wrote:
> 
> >Jia, any news on this patchset?
> >
> >2014-11-12 03:57, Zhang, Helin:
> >> Hi Jia
> >>
> >> > -Original Message-
> >> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jia Yu
> >> > Sent: Saturday, November 8, 2014 1:32 AM
> >> > To: dev at dpdk.org
> >> > Subject: [dpdk-dev] [PATCH 1/2] rte_ethdev: update link status
> >>(speed, duplex,
> >> > link_up) after rte_eth_dev_start
> >> >
> >> > Since LSR interrupt is disabled by pmd drivers, link status in
> >>rte_eth_device is
> >> > always down.
> >> If LSC interrupt is disabled by default, it will poll the link status
> >>during the initialization  or in dev_start, and then the link status
> >>should he correct. If I am not wrong.
> >>
> >> > Bond slave_configure() enables LSR interrupt on devices to get
> >>notification if link
> >> > status changes. However, the LSC interrupt at device start time is
> >>still lost.
> >> Before enabling interrupt for LSC, the link status should be polled.
> >>So after the port  startup, the link status should be there.
> >>
> >> >
> >> > In this fix, call link_update to read link status from hardware
> >>register at device
> >> > start time.
> >> Could you help to explain this code changes a bit more? Why we need it?
> >>
> >> >
> >> > Issue:
> >> > Change-Id: Ib57a1c9114f922485c7b0f4338bfe7b3d3f87d65
> >> > Signed-off-by: Jia Yu 
> >> > ---
> >> >  lib/librte_ether/rte_ethdev.c | 4 
> >> >  1 file changed, 4 insertions(+)
> >> >
> >> > diff --git a/lib/librte_ether/rte_ethdev.c
> >>b/lib/librte_ether/rte_ethdev.c index
> >> > ff1c769..6c01b02 100644
> >> > --- a/lib/librte_ether/rte_ethdev.c
> >> > +++ b/lib/librte_ether/rte_ethdev.c
> >> > @@ -869,6 +869,10 @@ rte_eth_dev_start(uint8_t port_id)
> >> >
> >> >  rte_eth_dev_config_restore(port_id);
> >> >
> >> > +if (dev->data->dev_conf.intr_conf.lsc != 0) {
> >> > +FUNC_PTR_OR_ERR_RET(*dev->dev_ops->link_update,
> -ENOTSUP);
> >> > +(*dev->dev_ops->link_update)(dev, 0);
> >> > +}
> >> >  return 0;
> >> >  }
> >> >
> >> > --
> >> > 1.9.1
> >>
> >> Regards,
> >> Helin
> >



[dpdk-dev] [PATCH] testpmd: Fix wrong message when no port started

2015-02-03 Thread Michael Qiu
The log message is wrong when no port has been started.

Signed-off-by: Michael Qiu 
---
 app/test-pmd/testpmd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 773b8af..ebf9448 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1423,7 +1423,7 @@ start_port(portid_t pid)
if (need_check_link_status && !no_link_check)
check_all_ports_link_status(nb_ports, RTE_PORT_ALL);
else
-   printf("Please stop the ports first\n");
+   printf("Please start at least one port first\n");

printf("Done\n");
return 0;
-- 
1.9.3



[dpdk-dev] [PATCH 00/17] unified packet type

2015-02-03 Thread Olivier MATZ
Hi Konstantin,

On 02/02/2015 06:20 PM, Ananyev, Konstantin wrote:
>> I think the API should describe for each packet type what can be
>> expected by the application. Here is an example. When a driver sets the
>> RTE_PTYPE_L3_IPV4 type, it means that:
>>
>> - the layer 3 is identified as IP by underlying layer (ex: ethertype=IP
>>if layer 2 is ethernet)
>> - the IP version field is 4
>> - there is no IP options (i.e the size of header is 20)
>
> Yes, I suppose that's what supported HW can guarantee when RTE_PTYPE_L3_IPV4 
> is set.
>
>> - the checksum field has been verified by hw, and if wrong, the
>>flag PKT_RX_IP_CKSUM_BAD is set
>
> Hmm, why is that?
> As I remember on many devices it is configurable by SW should HW do RX 
> checksum offload or not.
>  From DPDK point of view there is hw_ip_checksum field in rte_eth_rxmode.
> So it is a possible situation, when at RX HW does packet type determination, 
> but doesn't make L3/L4
> checksum calculation.
>
> I suppose for checksum(s) it should be a separate flags (in ol_flags) with 3 
> possible values:
> CKSUM_UNKNOWN, CKSUM_BAD, CKSUM_OK.

Indeed you are right, it's probably better to have specific flags
for checksum.
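
For the sake of discussion, a three-state checksum result could be packed into two bits per layer of a 64-bit flags word. This is only a sketch: every name and bit position below is hypothetical and none of them is an existing DPDK definition.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names and bit positions; not existing DPDK definitions. */
enum sketch_cksum_status {
	SKETCH_CKSUM_UNKNOWN = 0, /* hardware did not verify the checksum */
	SKETCH_CKSUM_BAD     = 1, /* verified and found wrong */
	SKETCH_CKSUM_OK      = 2, /* verified and found correct */
};

#define SKETCH_IP_CKSUM_SHIFT 0
#define SKETCH_L4_CKSUM_SHIFT 2
#define SKETCH_CKSUM_MASK     UINT64_C(0x3)

/* Store a checksum status in the 2-bit field that starts at 'shift'. */
static inline uint64_t
sketch_set_cksum(uint64_t ol_flags, unsigned int shift,
		 enum sketch_cksum_status st)
{
	ol_flags &= ~(SKETCH_CKSUM_MASK << shift);
	return ol_flags | ((uint64_t)st << shift);
}

/* Read back the checksum status stored at 'shift'. */
static inline enum sketch_cksum_status
sketch_get_cksum(uint64_t ol_flags, unsigned int shift)
{
	return (enum sketch_cksum_status)((ol_flags >> shift) & SKETCH_CKSUM_MASK);
}
```

With UNKNOWN encoded as zero, a driver that does no checksum offload can leave the bits untouched, and the packet type stays independent of whether the checksum was verified.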

Regards,
Olivier



[dpdk-dev] [PATCH v2 2/2] eal: add help option

2015-02-03 Thread David Marchand
On Tue, Feb 3, 2015 at 9:20 AM, Thomas Monjalon 
wrote:

> 2015-02-03 07:33, David Marchand:
> > On Mon, Feb 2, 2015 at 6:44 PM, Thomas Monjalon <
> thomas.monjalon at 6wind.com>
> > wrote:
> > [snip]
> > > @@ -340,6 +342,9 @@ eal_parse_args(int argc, char **argv)
> > > continue;
> > >
> > > switch (opt) {
> > > +   case 'h':
> > > +   eal_usage(prgname);
> > > +   exit(EXIT_SUCCESS);
> > > default:
> > > if (opt < OPT_LONG_MIN_NUM && isprint(opt)) {
> > > RTE_LOG(ERR, EAL, "Option %c is not
> supported "
> > [snip]
> > > @@ -534,6 +536,10 @@ eal_parse_args(int argc, char **argv)
> > > continue;
> > >
> > > switch (opt) {
> > > +   case 'h':
> > > +   eal_usage(prgname);
> > > +   exit(EXIT_SUCCESS);
> > > +
> > > /* force loading of external driver */
> > > case 'd':
> > > solib = malloc(sizeof(*solib));
> >
> > Why not move those two in common parser ?
>
> Because it's calling eal_usage() which is not callable from common parser.
> eal_usage() print usage for common and environment-specific options.
>

Oh right ...
Then, I suppose it is fine like this.

Acked-by: David Marchand 


-- 
David Marchand


[dpdk-dev] [PATCH 01/17] mbuf: add definitions of unified packet types

2015-02-03 Thread Olivier MATZ
Hi Helin,

On 02/03/2015 07:37 AM, Zhang, Helin wrote:
>>> When your application decapsulates tunnels, you can just do outer =
>>> inner and enter into the same code.
>> Expanding packet_type is not easy, as there is no free bits in the first 
>> cache
>> line.
>> Is there any tunnel type in inner packet? Is it a waste?
>> Is L2 type really needed? I don't know.
> If it is now not short of space in mbuf, the definition as yours might be 
> good.
> But tun_type is not required for inner packet, I'd prefer to define it as 
> needed
> with taking into account the Vector PMD support. It seems 32 bits might be 
> enough,
> like below,
> struct pkt_type {
>   uint32_t l2_type:4;
>   uint32_t l3_type:4;
>   uint32_t l4_type:4;
>   uint32_t tun_type:4;
>   uint32_t inner_l2_type:4;
>   uint32_t inner_l3_type:4;
>   uint32_t inner_l4_type:4;
> }

Yes, I think a structure like this would be much better!
Maybe a union with a u32 could also help to assign the value
in one operation.
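
Something like the following is what I have in mind. It is a sketch only: the field names follow the structure quoted above, while the union, its name and the member names `all` and `f` are assumptions. Bit-field layout is implementation-defined, so the 32-bit view is meant for copying or clearing the whole value, not for decoding individual fields.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: union name and member names are assumptions. */
union sketch_pkt_type {
	uint32_t all; /* whole value, for single-operation assignment */
	struct {
		uint32_t l2_type:4;
		uint32_t l3_type:4;
		uint32_t l4_type:4;
		uint32_t tun_type:4;
		uint32_t inner_l2_type:4;
		uint32_t inner_l3_type:4;
		uint32_t inner_l4_type:4;
	} f;
};

/* Clear every field with one 32-bit store. */
static inline void
sketch_pkt_type_reset(union sketch_pkt_type *p)
{
	p->all = 0;
}
```

A driver could then build the value once per descriptor type and store it with a single assignment to `all`.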

Thanks,
Olivier



[dpdk-dev] [PATCH v6] testpmd: Add port hotplug support

2015-02-03 Thread Qiu, Michael
On 2/3/2015 2:16 PM, Qiu, Michael wrote:
> On 2/1/2015 12:02 PM, Tetsuya Mukawa wrote:
>> The patch introduces following commands.
>> - port attach [ident]
>> - port detach [port_id]
>>  - attach: attaching a port
>>  - detach: detaching a port
>>  - ident: pci address of physical device.
>>   Or device name and paramerters of virtual device.
>>  (ex. :02:00.0, eth_pcap0,iface=eth0)
>>  - port_id: port identifier
>>
>> v5:
>> - Add testpmd documentation.
>>   (Thanks to Iremonger, Bernard)
>> v4:
>>  - Fix strings of command help.
>>
>> Signed-off-by: Tetsuya Mukawa 

[...]

>> +static int
>> +port_is_closed(portid_t port_id)
>> +{
>> +if (port_id_is_invalid(port_id, ENABLED_WARN))
>> +return 0;
>> +
>> +if (ports[port_id].port_status != RTE_PORT_CLOSED)
>> +return 0;
>> +
>> +return 1;
>> +}
>> +
>> +int
>>  start_port(portid_t pid)
>>  {
>>  int diag, need_check_link_status = 0;
>> @@ -1296,8 +1347,8 @@ start_port(portid_t pid)
>>  
>>  if(dcb_config)
>>  dcb_test = 1;
>> -for (pi = 0; pi < nb_ports; pi++) {
>> -if (pid < nb_ports && pid != pi)
>> +FOREACH_PORT(pi, ports) {
>> +if (!port_id_is_invalid(pid, DISABLED_WARN) && pid != pi)
> Here may it be:
>
> if (!port_id_is_invalid(pid, DISABLED_WARN) && (pid != pi || pid == 
> RET_PORT_ALL))

Sorry, should be:

if (!port_id_is_invalid(pid, DISABLED_WARN) && pid != pi &&
    pid != (portid_t)RTE_PORT_ALL)

Otherwise, port_id_is_invalid() should check for RTE_PORT_ALL.

Thanks,
Michael

> Otherwise no port will be start by default.
>
>
> Thanks,
> Michael
>
>>  continue;
>>  
>>  port = &ports[pi];
>> @@ -1421,7 +1472,7 @@ start_port(portid_t pid)
>>  }
>>  
>>  if (need_check_link_status && !no_link_check)
>> -check_all_ports_link_status(nb_ports, RTE_PORT_ALL);
>> +check_all_ports_link_status(RTE_PORT_ALL);
>>  else
>>  printf("Please stop the ports first\n");
>>  
>> @@ -1446,8 +1497,8 @@ stop_port(portid_t pid)
>>  }
>>  printf("Stopping ports...\n");
>>  
>> -for (pi = 0; pi < nb_ports; pi++) {
>> -if (pid < nb_ports && pid != pi)
>> +FOREACH_PORT(pi, ports) {
>> +if (!port_id_is_invalid(pid, DISABLED_WARN) && pid != pi)
>>  continue;
>>  
>>  port = &ports[pi];
>> @@ -1463,7 +1514,7 @@ stop_port(portid_t pid)
>>  need_check_link_status = 1;
>>  }
>>  if (need_check_link_status && !no_link_check)
>> -check_all_ports_link_status(nb_ports, RTE_PORT_ALL);
>> +check_all_ports_link_status(RTE_PORT_ALL);
>>  
>>  printf("Done\n");
>>  }
>> @@ -1481,8 +1532,8 @@ close_port(portid_t pid)
>>  
>>  printf("Closing ports...\n");
>>  
>> -for (pi = 0; pi < nb_ports; pi++) {
>> -if (pid < nb_ports && pid != pi)
>> +FOREACH_PORT(pi, ports) {
>> +if (!port_id_is_invalid(pid, DISABLED_WARN) && pid != pi)
>>  continue;
>>  
>>  port = &ports[pi];
>> @@ -1502,31 +1553,83 @@ close_port(portid_t pid)
>>  printf("Done\n");
>>  }
>>  
>> -int
>> -all_ports_stopped(void)
>> +void
>> +attach_port(char *identifier)
>>  {
>> -portid_t pi;
>> -struct rte_port *port;
>> +portid_t i, j, pi = 0;
>>  
>> -for (pi = 0; pi < nb_ports; pi++) {
>> -port = &ports[pi];
>> -if (port->port_status != RTE_PORT_STOPPED)
>> -return 0;
>> +printf("Attaching a new port...\n");
>> +
>> +if (identifier == NULL) {
>> +printf("Invalid parameters are speficied\n");
>> +return;
>>  }
>>  
>> -return 1;
>> +if (test_done == 0) {
>> +printf("Please stop forwarding first\n");
>> +return;
>> +}
>> +
>> +if (rte_eal_dev_attach(identifier, &pi))
>> +return;
>> +
>> +ports[pi].enabled = 1;
>> +reconfig(pi, rte_eth_dev_socket_id(pi));
>> +rte_eth_promiscuous_enable(pi);
>> +
>> +nb_ports = rte_eth_dev_count();
>> +
>> +/* set_default_fwd_ports_config(); */
>> +bzero(fwd_ports_ids, sizeof(fwd_ports_ids));
>> +i = 0;
>> +FOREACH_PORT(j, ports) {
>> +fwd_ports_ids[i] = j;
>> +i++;
>> +}
>> +nb_cfg_ports = nb_ports;
>> +nb_fwd_ports++;
>> +
>> +ports[pi].port_status = RTE_PORT_STOPPED;
>> +
>> +printf("Port %d is attached. Now total ports is %d\n", pi, nb_ports);
>> +printf("Done\n");
>>  }
>>  
>> -int
>> -port_is_started(portid_t port_id)
>> +void
>> +detach_port(uint8_t port_id)
>>  {
>> -if (port_id_is_invalid(port_id))
>> -return -1;
>> +portid_t i, pi = 0;
>> +char name[RTE_ETH_NAME_MAX_LEN];
>>  
>> -if (ports[port_id].port_status != RTE_PORT_STARTED)
>> -return 0;
>> +printf("Detaching a port...\n");
>>  
>> -return 1;
>> +if (!port_is_closed(port_i

[dpdk-dev] site down?

2015-02-03 Thread Thomas Monjalon
2015-02-03 11:18, Masaru Oki:
> I got below message:
> 
> myhost:~/src/dpdk$ git pull
> remote: error: inflate: data stream error (incorrect header check)
> remote: error: corrupt loose object 'a09f3e4c50467512970519943d26d9c5753584e0'
> remote: fatal: failed to read object
> a09f3e4c50467512970519943d26d9c5753584e0: Operation not permitted
> remote: aborting due to possible repository corruption on the remote side.
> fatal: protocol error: bad pack header
> 
> please 'git fsck' on dpdk.org.
> http://stackoverflow.com/questions/4170317/git-pull-error-remote-object-is-corrupted

I restored the repository from a backup.
I hope all is fixed now.

> thank you.

Thank you for reporting.


> 2015-02-03 6:55 GMT+09:00 Thomas Monjalon :
> > 2015-02-02 20:10, Vipin Agrawal:
> >> I've been trying to connect to download the 1.6 version.
> >
> > You should try to download a newer version :)
> >
> >> Does anybody have a status on dpdk.org?
> >
> > Yes it was down but now the problem seems to be fixed.
> > We are going to investigate why the kernel has crashed.
> > It may be due to a recent upgrade of the allocated resources.
> >
> > Sorry for the inconvenience
> > --
> > Thomas



[dpdk-dev] [PATCH v6] testpmd: Add port hotplug support

2015-02-03 Thread Iremonger, Bernard

> >> +.. code-block:: console
> >> +
> >> +testpmd> port attach :02:00.0
> >> +Attaching a new port...
> >> +... snip ...
> >> +Port 0 is attached. Now total ports is 1
> >> +Done
> >> +port detach
> >> +~~~
> >> +
> >> +Detach a specific port.
> >> +
> >> +Before detaching a port, the port should be closed.
> >> +Also to remove a pci device completely from the system, first detach the 
> >> port from testpmd.
> >> +Then the device should be moved under kernel management.
> >> +Finally the device can be remove using kernel pci hotplug functionality.

Hi Tetsuya,
Reword "remove" to "removed"

> >> +On the other hand, to remove a port created by virtual device, above 
> >> steps are not needed.
 Reword " created by virtual device" to "created by a virtual device"

> >
> >> +
> >> +port detach (port_id)
> >> +
> >> +For example, to detach a port 0.
> >> +
> >> +.. code-block:: console
> >> +
> >> +testpmd> port detach 0
> >> +Detaching a port...
> >> +... snip ...
> >> +Done
> >> +
> >>  port start
> >>  ~~
> >>
> >> --
> >> 1.9.1
> > Regards,
> >
> > Bernard.
> >



[dpdk-dev] [PATCH v6] testpmd: Add port hotplug support

2015-02-03 Thread Tetsuya Mukawa
On 2015/02/03 15:59, Qiu, Michael wrote:
> On 2/1/2015 12:02 PM, Tetsuya Mukawa wrote:
>> The patch introduces following commands.
>> - port attach [ident]
>> - port detach [port_id]
>>  - attach: attaching a port
>>  - detach: detaching a port
>>  - ident: pci address of physical device.
>>   Or device name and paramerters of virtual device.
>>  (ex. :02:00.0, eth_pcap0,iface=eth0)
>>  - port_id: port identifier
>>
>> v5:
>> - Add testpmd documentation.
>>   (Thanks to Iremonger, Bernard)
>> v4:
>>  - Fix strings of command help.
>>
>> Signed-off-by: Tetsuya Mukawa 
>> ---
>>  app/test-pmd/cmdline.c  | 133 +++
>>  app/test-pmd/config.c   | 116 +---
>>  app/test-pmd/parameters.c   |  22 ++-
>>  app/test-pmd/testpmd.c  | 199 
>> +---
>>  app/test-pmd/testpmd.h  |  18 ++-
>>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  57 
>>  6 files changed, 415 insertions(+), 130 deletions(-)
>>
>> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
>> index 4beb404..2f813d8 100644
>> --- a/app/test-pmd/cmdline.c
>> +++ b/app/test-pmd/cmdline.c
>> @@ -572,6 +572,12 @@ static void cmd_help_long_parsed(void *parsed_result,
>>  "port close (port_id|all)\n"
>>  "Close all ports or port_id.\n\n"
>>  
>> +"port attach (ident)\n"
>> +"Attach physical or virtual dev by pci address or 
>> virtual device name\n\n"
>> +
>> +"port detach (port_id)\n"
>> +"Detach physical or virtual dev by port_id\n\n"
>> +
>>  "port config (port_id|all)"
>>  " speed (10|100|1000|1|4|auto)"
>>  " duplex (half|full|auto)\n"
>> @@ -848,6 +854,89 @@ cmdline_parse_inst_t cmd_operate_specific_port = {
>>  },
>>  };
>>  
>> +/* *** attach a specificied port *** */
>> +struct cmd_operate_attach_port_result {
>> +cmdline_fixed_string_t port;
>> +cmdline_fixed_string_t keyword;
>> +cmdline_fixed_string_t identifier;
>> +};
>> +
>> +static void cmd_operate_attach_port_parsed(void *parsed_result,
>> +__attribute__((unused)) struct cmdline *cl,
>> +__attribute__((unused)) void *data)
>> +{
>> +struct cmd_operate_attach_port_result *res = parsed_result;
>> +
>> +if (!strcmp(res->keyword, "attach"))
>> +attach_port(res->identifier);
>> +else
>> +printf("Unknown parameter\n");
>> +}
>> +
>> +cmdline_parse_token_string_t cmd_operate_attach_port_port =
>> +TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
>> +port, "port");
>> +cmdline_parse_token_string_t cmd_operate_attach_port_keyword =
>> +TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
>> +keyword, "attach");
>> +cmdline_parse_token_string_t cmd_operate_attach_port_identifier =
>> +TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
>> +identifier, NULL);
>> +
>> +cmdline_parse_inst_t cmd_operate_attach_port = {
>> +.f = cmd_operate_attach_port_parsed,
>> +.data = NULL,
>> +.help_str = "port attach identifier, "
>> +"identifier: pci address or virtual dev name",
>> +.tokens = {
>> +(void *)&cmd_operate_attach_port_port,
>> +(void *)&cmd_operate_attach_port_keyword,
>> +(void *)&cmd_operate_attach_port_identifier,
>> +NULL,
>> +},
>> +};
>> +
>> +/* *** detach a specificied port *** */
>> +struct cmd_operate_detach_port_result {
>> +cmdline_fixed_string_t port;
>> +cmdline_fixed_string_t keyword;
>> +uint8_t port_id;
>> +};
>> +
>> +static void cmd_operate_detach_port_parsed(void *parsed_result,
>> +__attribute__((unused)) struct cmdline *cl,
>> +__attribute__((unused)) void *data)
>> +{
>> +struct cmd_operate_detach_port_result *res = parsed_result;
>> +
>> +if (!strcmp(res->keyword, "detach"))
>> +detach_port(res->port_id);
>> +else
>> +printf("Unknown parameter\n");
>> +}
>> +
>> +cmdline_parse_token_string_t cmd_operate_detach_port_port =
>> +TOKEN_STRING_INITIALIZER(struct cmd_operate_detach_port_result,
>> +port, "port");
>> +cmdline_parse_token_string_t cmd_operate_detach_port_keyword =
>> +TOKEN_STRING_INITIALIZER(struct cmd_operate_detach_port_result,
>> +keyword, "detach");
>> +cmdline_parse_token_num_t cmd_operate_detach_port_port_id =
>> +TOKEN_NUM_INITIALIZER(struct cmd_operate_detach_port_result,
>> +port_id, UINT8);
>> +
>> +cmdline_parse_inst_t cmd_operate_detach_port = {
>> +.f = cmd_operate_detach_port_parsed,
>> +.data = NULL,
>> +.he

[dpdk-dev] [PATCH v6] testpmd: Add port hotplug support

2015-02-03 Thread Tetsuya Mukawa
On 2015/02/03 18:14, Qiu, Michael wrote:
> On 2/3/2015 2:16 PM, Qiu, Michael wrote:
>> On 2/1/2015 12:02 PM, Tetsuya Mukawa wrote:
>>> The patch introduces following commands.
>>> - port attach [ident]
>>> - port detach [port_id]
>>>  - attach: attaching a port
>>>  - detach: detaching a port
>>>  - ident: pci address of physical device.
>>>   Or device name and paramerters of virtual device.
>>>  (ex. :02:00.0, eth_pcap0,iface=eth0)
>>>  - port_id: port identifier
>>>
>>> v5:
>>> - Add testpmd documentation.
>>>   (Thanks to Iremonger, Bernard)
>>> v4:
>>>  - Fix strings of command help.
>>>
>>> Signed-off-by: Tetsuya Mukawa 
> [...]
>
>>> +static int
>>> +port_is_closed(portid_t port_id)
>>> +{
>>> +   if (port_id_is_invalid(port_id, ENABLED_WARN))
>>> +   return 0;
>>> +
>>> +   if (ports[port_id].port_status != RTE_PORT_CLOSED)
>>> +   return 0;
>>> +
>>> +   return 1;
>>> +}
>>> +
>>> +int
>>>  start_port(portid_t pid)
>>>  {
>>> int diag, need_check_link_status = 0;
>>> @@ -1296,8 +1347,8 @@ start_port(portid_t pid)
>>>  
>>> if(dcb_config)
>>> dcb_test = 1;
>>> -   for (pi = 0; pi < nb_ports; pi++) {
>>> -   if (pid < nb_ports && pid != pi)
>>> +   FOREACH_PORT(pi, ports) {
>>> +   if (!port_id_is_invalid(pid, DISABLED_WARN) && pid != pi)
>> Here may it be:
>>
>> if (!port_id_is_invalid(pid, DISABLED_WARN) && (pid != pi || pid == 
>> RTE_PORT_ALL))
> Sorry, should be:
>
> if (!port_id_is_invalid(pid, DISABLED_WARN) && pid != pi && pid != 
> (portid_t)RTE_PORT_ALL)
>
> Otherwise, should check for "RTE_PORT_ALL" in function port_id_is_invalid()

Thanks for the comment. I've found 2 issues.
(I guess the original code has the same issues.)
One is that "port_id_is_invalid" should receive "pi" instead of "pid".
The other is that the if statement is wrong, as you said.

I guess the following statement will be good.

if (port_id_is_invalid(pi, DISABLED_WARN) || (pid != pi && pid !=
(portid_t)RTE_PORT_ALL))
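
To make the intent concrete, here is a minimal sketch of that skip test. It is
not the testpmd code: the port_enabled[] table and port_is_invalid() are
hypothetical stand-ins for ports[] and port_id_is_invalid(), and MAX_PORTS,
PORT_ALL, should_skip_port() and count_selected() are names made up for the
example (the real value of RTE_PORT_ALL is not shown in this thread).

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t portid_t;

#define MAX_PORTS 4                   /* hypothetical table size */
#define PORT_ALL  ((portid_t)0xff)    /* stand-in for RTE_PORT_ALL */

/* Hypothetical stand-in for the ports[] array: 1 = enabled (attached). */
static const int port_enabled[MAX_PORTS] = { 1, 0, 1, 1 };

/* Stand-in for port_id_is_invalid(pi, DISABLED_WARN). */
static int
port_is_invalid(portid_t pi)
{
	return pi >= MAX_PORTS || !port_enabled[pi];
}

/*
 * Return 1 when the loop should skip port "pi" for a request that
 * targets "pid" (a single port, or PORT_ALL for every port).
 */
static int
should_skip_port(portid_t pid, portid_t pi)
{
	return port_is_invalid(pi) || (pid != pi && pid != PORT_ALL);
}

/* Count the ports a request for "pid" would act on. */
static int
count_selected(portid_t pid)
{
	portid_t pi;
	int n = 0;

	/* The caller passes either a table index or PORT_ALL. */
	assert(pid == PORT_ALL || pid < MAX_PORTS);

	for (pi = 0; pi < MAX_PORTS; pi++)
		if (!should_skip_port(pid, pi))
			n++;
	return n;
}
```

In this sketch a request for PORT_ALL selects ports 0, 2 and 3, a request for
port 2 selects only port 2, and a request for the disabled port 1 selects
nothing.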

How about it?

Thanks,
Tetsuya


> Thanks,
> Michael
>
>> Otherwise no port will be start by default.
>>
>>
>> Thanks,
>> Michael
>>
>>> continue;
>>>  
>>> port = &ports[pi];
>>> @@ -1421,7 +1472,7 @@ start_port(portid_t pid)
>>> }
>>>  
>>> if (need_check_link_status && !no_link_check)
>>> -   check_all_ports_link_status(nb_ports, RTE_PORT_ALL);
>>> +   check_all_ports_link_status(RTE_PORT_ALL);
>>> else
>>> printf("Please stop the ports first\n");
>>>  
>>> @@ -1446,8 +1497,8 @@ stop_port(portid_t pid)
>>> }
>>> printf("Stopping ports...\n");
>>>  
>>> -   for (pi = 0; pi < nb_ports; pi++) {
>>> -   if (pid < nb_ports && pid != pi)
>>> +   FOREACH_PORT(pi, ports) {
>>> +   if (!port_id_is_invalid(pid, DISABLED_WARN) && pid != pi)
>>> continue;
>>>  
>>> port = &ports[pi];
>>> @@ -1463,7 +1514,7 @@ stop_port(portid_t pid)
>>> need_check_link_status = 1;
>>> }
>>> if (need_check_link_status && !no_link_check)
>>> -   check_all_ports_link_status(nb_ports, RTE_PORT_ALL);
>>> +   check_all_ports_link_status(RTE_PORT_ALL);
>>>  
>>> printf("Done\n");
>>>  }
>>> @@ -1481,8 +1532,8 @@ close_port(portid_t pid)
>>>  
>>> printf("Closing ports...\n");
>>>  
>>> -   for (pi = 0; pi < nb_ports; pi++) {
>>> -   if (pid < nb_ports && pid != pi)
>>> +   FOREACH_PORT(pi, ports) {
>>> +   if (!port_id_is_invalid(pid, DISABLED_WARN) && pid != pi)
>>> continue;
>>>  
>>> port = &ports[pi];
>>> @@ -1502,31 +1553,83 @@ close_port(portid_t pid)
>>> printf("Done\n");
>>>  }
>>>  
>>> -int
>>> -all_ports_stopped(void)
>>> +void
>>> +attach_port(char *identifier)
>>>  {
>>> -   portid_t pi;
>>> -   struct rte_port *port;
>>> +   portid_t i, j, pi = 0;
>>>  
>>> -   for (pi = 0; pi < nb_ports; pi++) {
>>> -   port = &ports[pi];
>>> -   if (port->port_status != RTE_PORT_STOPPED)
>>> -   return 0;
>>> +   printf("Attaching a new port...\n");
>>> +
>>> +   if (identifier == NULL) {
>>> +   printf("Invalid parameters are specified\n");
>>> +   return;
>>> }
>>>  
>>> -   return 1;
>>> +   if (test_done == 0) {
>>> +   printf("Please stop forwarding first\n");
>>> +   return;
>>> +   }
>>> +
>>> +   if (rte_eal_dev_attach(identifier, &pi))
>>> +   return;
>>> +
>>> +   ports[pi].enabled = 1;
>>> +   reconfig(pi, rte_eth_dev_socket_id(pi));
>>> +   rte_eth_promiscuous_enable(pi);
>>> +
>>> +   nb_ports = rte_eth_dev_count();
>>> +
>>> +   /* set_default_fwd_ports_config(); */
>>> +   bzero(fwd_ports_ids, sizeof(fwd_ports_ids));
>>> +   i = 0;
>>> +   FOREACH_PORT(j, ports) {
>>> +   fwd_ports_ids[i] = j;
>>> +   i++;
>>> +   }
>>> +   nb_cfg_ports = nb_ports;
>>> +   nb_fwd_ports++;
>>> +
>>> +   ports[pi].port_status = RTE_PORT_STOPPED;
>>> +
>>> +   printf("Port %d is atta

[dpdk-dev] [PATCH v3 2/4] doc: Add Sphinx config to build pdf version of guides

2015-02-03 Thread Iremonger, Bernard
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of John McNamara
> Sent: Monday, February 2, 2015 1:16 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 2/4] doc: Add Sphinx config to build pdf 
> version of guides
> 
> Add Python Sphinx config to allow conversion of guides to Latex and then PDF 
> format.
> 
> This mainly adds metadata but also includes an override to the Latex 
> formatter to control the font size
> in code blocks.
> 
> Signed-off-by: John McNamara 
> ---
>  doc/guides/conf.py |   44 +++-
>  1 files changed, 43 insertions(+), 1 deletions(-)
> 
> diff --git a/doc/guides/conf.py b/doc/guides/conf.py
> index 385af03..9f546bd 100644
> --- a/doc/guides/conf.py
> +++ b/doc/guides/conf.py
> @@ -29,11 +29,53 @@
>  #   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> 
>  import subprocess
> +from sphinx.highlighting import PygmentsBridge
> +from pygments.formatters.latex import LatexFormatter
> 
>  project = 'DPDK'
> 
>  copyright = '2014, Intel'

Hi John,

2014 should be 2015.

Regards,

Bernard.

> 
> -version = subprocess.check_output(["make","-sRrC","../../", "showversion"])
> +version = subprocess.check_output(['make', '-sRrC', '../../', 'showversion'])
> +release = version
> 
>  master_doc = 'index'
> +
> +# Latex directives to be included directly in the latex/pdf docs.
> +latex_preamble = r"""
> +\usepackage[utf8]{inputenc}
> +\usepackage{DejaVuSansMono}
> +\usepackage[T1]{fontenc}
> +\usepackage{helvet}
> +\renewcommand{\familydefault}{\sfdefault}
> +
> +\RecustomVerbatimEnvironment{Verbatim}{Verbatim}{xleftmargin=5mm}
> +"""
> +
> +# Configuration for the latex/pdf docs.
> +latex_elements = {
> +'papersize': 'a4paper',
> +'pointsize': '11pt',
> +'preamble': latex_preamble}
> +
> +latex_documents = [
> +('index',
> + 'dpdk_doc.tex',
> + '',
> + '',
> + 'manual')]
> +
> +
> +# Temp class to override the default Latex formatter in order to modify the
> +# font size in the code/verbatim blocks.
> +class CustomLatexFormatter(LatexFormatter):
> +
> +    def __init__(self, **options):
> +
> +        super(CustomLatexFormatter, self).__init__(**options)
> +
> +        # Use the second smallest font size for code/verbatim blocks.
> +        self.verboptions = r'formatcom=\footnotesize'
> +
> +# Replace the default latex formatter.
> +PygmentsBridge.latex_formatter = CustomLatexFormatter
> --
> 1.7.4.1



[dpdk-dev] [PATCH v6] testpmd: Add port hotplug support

2015-02-03 Thread Tetsuya Mukawa
Hi Bernard,

I appreciate your checking.
I will fix like below.

Tetsuya

On 2015/02/03 19:03, Iremonger, Bernard wrote:
 +.. code-block:: console
 +
 +testpmd> port attach :02:00.0
 +Attaching a new port...
 +... snip ...
 +Port 0 is attached. Now total ports is 1
 +Done
 +port detach
 +~~~
 +
 +Detach a specific port.
 +
 +Before detaching a port, the port should be closed.
 +Also to remove a pci device completely from the system, first detach the 
 port from testpmd.
 +Then the device should be moved under kernel management.
 +Finally the device can be remove using kernel pci hotplug functionality.
> Hi Tetsuya,
> Reword "remove" to "removed"
>
 +On the other hand, to remove a port created by virtual device, above 
 steps are not needed.
>  Reword " created by virtual device" to "created by a virtual device"
>
 +
 +port detach (port_id)
 +
 +For example, to detach a port 0.
 +
 +.. code-block:: console
 +
 +testpmd> port detach 0
 +Detaching a port...
 +... snip ...
 +Done
 +
  port start
  ~~

 --
 1.9.1
>>> Regards,
>>>
>>> Bernard.
>>>



[dpdk-dev] deadline for 2.0 features proposal

2015-02-03 Thread Thomas Monjalon
Hi,

These features were planned in the 2.0 roadmap but not submitted:
- cuckoo hash
- packet distributor (phase 2)
- bifurcated driver, assuming availability in Linux kernel
- Broadcom driver
- Hyper-V driver
- i40e QoS
- i40e IEEE1588
- i40e DCB
- i40e SR-IOV switching
- i40e port mirroring

They are now in the 2.1 roadmap.
If you are working on one of these features or plan to do,
please confirm their status and the estimated integration window.

Thanks
-- 
Thomas


[dpdk-dev] vhost: virtio-net rx-ring stop work after work many hours, bug?

2015-02-03 Thread Linhaifeng


On 2015/1/27 17:37, Michael S. Tsirkin wrote:
> On Tue, Jan 27, 2015 at 03:57:13PM +0800, Linhaifeng wrote:
>> Hi,all
>>
>> I use vhost-user to send data to a VM. At first it works well, but after
>> many hours the VM cannot receive data, although it can still send data.
>>
>> (gdb)p avail_idx
>> $4 = 2668
>> (gdb)p free_entries
>> $5 = 0
>> (gdb)l
>> /* check that we have enough buffers */
>> if (unlikely(count > free_entries))
>> count = free_entries;
>>
>> if (count == 0){
>> int b=0;
>> if(b) { // when set b=1 to notify guest rx_ring will restart to work
>> if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT)) {
>>
>> eventfd_write(vq->callfd, 1);
>> }
>> }
>> return 0;
>> }
>>
>> some info i print in guest:
>>
>> net eth3:vi->num=199
>> net eth3:rvq info: num_free=57, used->idx=2668, avail->idx=2668
>> net eth3:svq info: num_free=254, used->idx=1644, avail->idx=1644
>>
>> net eth3:vi->num=199
>> net eth3:rvq info: num_free=57, used->idx=2668, avail->idx=2668
>> net eth3:svq info: num_free=254, used->idx=1645, avail->idx=1645
>>
>> net eth3:vi->num=199
>> net eth3:rvq info: num_free=57, used->idx=2668, avail->idx=2668
>> net eth3:svq info: num_free=254, used->idx=1646, avail->idx=1646
>>
>> # free
>>              total       used       free     shared    buffers     cached
>> Mem:       3924100     337252    3586848          0      95984     138060
>> -/+ buffers/cache:     103208    3820892
>> Swap:       970748          0     970748
>>
>> I have two questions:
>> 1.Should we need to notify guest when there is no buffer in vq->avail?
> 
> No unless NOTIFY_ON_EMPTY is set (most guests don't set it).

Thank you, that is new knowledge to me :)
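
For reference, the rule quoted above can be written as a small predicate. This
is only a sketch under stated assumptions: the two constants use the values I
understand the virtio specification to give (the avail-ring flag is bit 0 and
NOTIFY_ON_EMPTY is feature bit 24), while struct ring_state,
should_notify_guest() and the used_added argument are hypothetical names for
illustration, not part of the vhost library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VRING_AVAIL_F_NO_INTERRUPT 1   /* avail->flags bit: suppress interrupts */
#define VIRTIO_F_NOTIFY_ON_EMPTY   24  /* feature bit number */

/* Hypothetical snapshot of the state the backend looks at. */
struct ring_state {
	uint16_t avail_flags;   /* copy of vq->avail->flags */
	uint64_t features;      /* negotiated feature bits */
	uint16_t free_entries;  /* avail buffers not yet consumed */
};

/*
 * Return 1 if the backend should signal the guest.
 * used_added is the number of buffers just placed on the used ring.
 */
static int
should_notify_guest(const struct ring_state *rs, unsigned int used_added)
{
	assert(rs != NULL);

	/* The ring ran dry and the guest negotiated NOTIFY_ON_EMPTY. */
	if (rs->free_entries == 0 &&
	    (rs->features & (1ULL << VIRTIO_F_NOTIFY_ON_EMPTY)))
		return 1;

	/* Nothing new on the used ring, so there is nothing to announce. */
	if (used_added == 0)
		return 0;

	/* Otherwise honour the guest's interrupt suppression flag. */
	return !(rs->avail_flags & VRING_AVAIL_F_NO_INTERRUPT);
}
```

With no buffers available and NOTIFY_ON_EMPTY not negotiated, the predicate
returns 0, which is the "no" case in the answer above.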

> 
>> 2.Why virtio_net stop to fill avail?
> 
> Most likely, it didn't get an interrupt.
> 
> If so, it would be a dpdk vhost user bug.
> Which code are you using in dpdk?
> 

Hi, mst

Thank you for your reply.
Sorry, my mail filter may have a bug, so I did not see this mail until now.

I use the dpdk code before 2bbb811. I paste the code here for you to review.
(Note that the vhost_enqueue_burst and vhost_dequeue_burst functions run in poll
mode.)

My guess is that vhost_enqueue_burst used all the buffers in the rx_ring and
then tried to notify the guest to receive, but at that moment the vcpu may have
been exiting, so the guest could not receive the notification.
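
The stuck state in the gdb output can be reproduced with the index arithmetic
alone. The sketch below assumes nothing beyond the 16-bit subtraction used in
the listing (free_entries = avail_idx - res_base_idx) and the clamp that
follows it; ring_free_entries(), clamp_burst() and check_reported_state() are
hypothetical helper names, and the burst size of 32 is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Free buffers = avail index minus reserved index, modulo 2^16. */
static uint16_t
ring_free_entries(uint16_t avail_idx, uint16_t res_base_idx)
{
	return (uint16_t)(avail_idx - res_base_idx);
}

/* Limit a burst to the number of free buffers. */
static uint32_t
clamp_burst(uint32_t count, uint16_t free_entries)
{
	return count > free_entries ? free_entries : count;
}

/* The values from the gdb session: avail_idx == 2668, free_entries == 0. */
static void
check_reported_state(void)
{
	uint16_t free_entries = ring_free_entries(2668, 2668);

	assert(free_entries == 0);
	/* Every burst is clamped to 0, so the enqueue returns early. */
	assert(clamp_burst(32, free_entries) == 0);
}
```

Since free_entries is 0 while avail_idx is 2668, the reserved index must also
be 2668, so every burst is clamped to 0 and the function returns before any
notification is sent.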


/*
 * Enqueues packets to the guest virtio RX virtqueue for vhost devices.
 */
static inline uint32_t __attribute__((always_inline))
vhost_enqueue_burst(struct virtio_net *dev, struct rte_mbuf **pkts,
	unsigned count)
{
	struct vhost_virtqueue *vq;
	struct vring_desc *desc;
	struct rte_mbuf *buff;
	/* The virtio_hdr is initialised to 0. */
	struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0,0,0,0,0,0},0};
	uint64_t buff_addr = 0;
	uint64_t buff_hdr_addr = 0;
	uint32_t head[PKT_BURST_SIZE], packet_len = 0;
	uint32_t head_idx, packet_success = 0;
	uint32_t mergeable, mrg_count = 0;
	uint32_t retry = 0;
	uint16_t avail_idx, res_cur_idx;
	uint16_t res_base_idx, res_end_idx;
	uint16_t free_entries;
	uint8_t success = 0;

	LOG_DEBUG(APP, "(%"PRIu64") virtio_dev_rx()\n", dev->device_fh);
	vq = dev->virtqueue[VIRTIO_RXQ];
	count = (count > PKT_BURST_SIZE) ? PKT_BURST_SIZE : count;

	/* As many data cores may want access to available buffers, they need
	 * to be reserved. */
	do {
		res_base_idx = vq->last_used_idx_res;
		avail_idx = *((volatile uint16_t *)&vq->avail->idx);

		free_entries = (avail_idx - res_base_idx);
		/* If retry is enabled and the queue is full then we wait and
		 * retry to avoid packet loss. */
		if (unlikely(count > free_entries)) {
			for (retry = 0; retry < burst_tx_retry_num; retry++) {
				rte_delay_us(burst_tx_delay_time);
				avail_idx =
					*((volatile uint16_t *)&vq->avail->idx);
				free_entries = (avail_idx - res_base_idx);
				if (count <= free_entries)
					break;
			}
		}

		/*check that we have enough buffers*/
		if (unlikely(count > free_entries))
			count = free_entries;

		if (count == 0) // !!! when the VM cannot receive, we always return here
			return 0;

		res_end_idx = res_base_idx + count;
		/* vq->last_used_idx_res is atomically updated. */
		success = rte_atomic16_cmpset(&vq->last_used_idx_res,
				res_base_idx, res_end_idx);
	}

[dpdk-dev] vhost: virtio-net rx-ring stop work after work many hours, bug?

2015-02-03 Thread Linhaifeng


On 2015/1/28 17:51, Xie, Huawei wrote:
> 
> 
>> -Original Message-
>> From: Linhaifeng [mailto:haifeng.lin at huawei.com]
>> Sent: Tuesday, January 27, 2015 3:57 PM
>> To: dev at dpdk.org; Michael S. Tsirkin
>> Cc: lilijun; liuyongan at huawei.com; Xie, Huawei
>> Subject: vhost: virtio-net rx-ring stop work after work many hours,bug?
>>
>> Hi,all
>>
>> I use vhost-user to send data to a VM. At first it works well, but after many
>> hours the VM cannot receive data, although it can still send data.
>>
>> (gdb)p avail_idx
>> $4 = 2668
>> (gdb)p free_entries
>> $5 = 0
>> (gdb)l
>> /* check that we have enough buffers */
>> if (unlikely(count > free_entries))
>> count = free_entries;
>>
>> if (count == 0){
>> int b=0;
>> if(b) { // when set b=1 to notify guest rx_ring will restart to work
>> if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT)) {
>>
>> eventfd_write(vq->callfd, 1);
>> }
>> }
>> return 0;
>> }
>>
>> some info i print in guest:
>>
>> net eth3:vi->num=199
>> net eth3:rvq info: num_free=57, used->idx=2668, avail->idx=2668
>> net eth3:svq info: num_free=254, used->idx=1644, avail->idx=1644
>>
>> net eth3:vi->num=199
>> net eth3:rvq info: num_free=57, used->idx=2668, avail->idx=2668
>> net eth3:svq info: num_free=254, used->idx=1645, avail->idx=1645
>>
>> net eth3:vi->num=199
>> net eth3:rvq info: num_free=57, used->idx=2668, avail->idx=2668
>> net eth3:svq info: num_free=254, used->idx=1646, avail->idx=1646
>>
>> # free
>>              total       used       free     shared    buffers     cached
>> Mem:       3924100     337252    3586848          0      95984     138060
>> -/+ buffers/cache:     103208    3820892
>> Swap:       970748          0     970748
>>
>> I have two questions:
>> 1.Should we need to notify guest when there is no buffer in vq->avail?
>> 2.Why virtio_net stop to fill avail?
>>
>>
> 
> Haifeng:
> Thanks for reporting this issue.
> It might not be vhost-user specific, because as long vhost-user has received 
> all the vring information correctly, it shares the same code 
> receiving/transmitting packets with vhost-cuse.
> Are you using latest patch or the old patch?

Xie:
Sorry, I did not see this mail until now.

I use the old code, not the latest patch. The latest patch is OK because it
notifies the guest after copying each packet when merge-able is enabled. (It
may not be OK when you disable the merge-able feature.)

> 1  Do you disable merge-able feature support in vhost example? There is an 
> bug in vhost-user feature negotiation which is fixed in latest patch.  It 
> could cause guest not receive packets at all. So if you are testing only 
> using linux net device, this isn't the cause.
Yes, I disabled it.

> 2.Do you still have the spot? Could you check if there are available 
> descriptors from checking the desc ring or even dump the vring status? Check 
> the notify_on_empty flag Michael mentioned?  I find a bug in vhost library 
> when processing three or more chained descriptors. But if you never 
> re-configure eth0 with different features,  this isn't the cause.
> 3. Is this reproduce-able? Next time if you run long hours stability test, 
> could you try to disable guest virtio feature?
> -device 
> virtio-net-pci,netdev=mynet0,mac=54:00:00:54:00:01,csum=off,gso=off,guest_tso4=off,guest_tso6=off,guest_ecn=off
> 
> I have run more than ten hours' nightly test many times before, and haven't 
> met this issue. 
> We will check
> * if there is issue in the vhost code delivering interrupts to guest which
>   cause potential deadlock
> * if there are places we should but miss delivering interrupts to guest.
> 
>>
>>
>>
>>
>> --
>> Regards,
>> Haifeng
> 

-- 
Regards,
Haifeng



[dpdk-dev] [PATCH v6 00/13] Port Hotplug Framework

2015-02-03 Thread Iremonger, Bernard
> -Original Message-
> From: Tetsuya Mukawa [mailto:mukawa at igel.co.jp]
> Sent: Sunday, February 1, 2015 4:02 AM
> To: dev at dpdk.org
> Cc: Qiu, Michael; Iremonger, Bernard; Tetsuya Mukawa
> Subject: [PATCH v6 00/13] Port Hotplug Framework
> 
> This patch series adds a dynamic port hotplug framework to DPDK.
> With the patches, DPDK apps can attach or detach ports at runtime.
> 
> The basic concept of the port hotplug is like followings.
> - DPDK apps must have responsibility to manage ports.
>   DPDK apps only know which ports are attached or detached at the moment.
>   The port hotplug framework is implemented to allow DPDK apps to manage 
> ports.
>   For example, when DPDK apps call port attach function, attached port number
>   will be returned. Also DPDK apps can detach port by port number.
> - Kernel support is needed for attaching or detaching physical device ports.
>   To attach a new physical device port, the device will be recognized by
>   userspace directly I/O framework in kernel at first. Then DPDK apps can
>   call the port hotplug functions to attach ports.
>   For detaching, steps are vice versa.
> - Before detach ports, ports must be stopped and closed.
>   DPDK application must call rte_eth_dev_stop() and rte_eth_dev_close() before
>   detaching ports. These function will call finalization codes of PMDs.
>   But so far, no PMD frees all resources allocated by initialization.
>   It means PMDs are needed to be fixed to support the port hotplug.
>   'RTE_PCI_DRV_DETACHABLE' is a new flag indicating a PMD supports detaching.
>   Without this flag, detaching will be failed.
> - Mustn't affect legacy DPDK apps.
>   No DPDK EAL behavior is changed, if the port hotplug functions aren't called.
>   So all legacy DPDK apps can still work without modifications.
> 
> And a few limitations.
> - The port hotplug functions are not thread safe.
>   DPDK apps should handle it.
> - Only support Linux and igb_uio so far.
>   BSD and VFIO is not supported. I will send VFIO patches at least, but I 
> don't
>   have a plan to submit BSD patch so far.
> 
> 
> Here is port hotplug APIs.
> ---
> /**
>  * Attach a new device.
>  *
>  * @param devargs
>  *   A pointer to a strings array describing the new device
>  *   to be attached. The strings should be a pci address like
>  *   ':01:00.0' or virtual device name like 'eth_pcap0'.
>  * @param port_id
>  *  A pointer to a port identifier actually attached.
>  * @return
>  *  0 on success and port_id is filled, negative on error
>  */
> int rte_eal_dev_attach(const char *devargs, uint8_t *port_id);
> 
> /**
>  * Detach a device.
>  *
>  * @param port_id
>  *   The port identifier of the device to detach.
>  * @param devname
>  *  A pointer to a device name actually detached.
>  * @return
>  *  0 on success and devname is filled, negative on error
>  */
> int rte_eal_dev_detach(uint8_t port_id, char *devname);
> ---
> 
> This patch series are for DPDK EAL. To use port hotplug function by DPDK 
> apps, each PMD should be
> fixed to support 'RTE_PCI_DRV_DETACHABLE' flag. Please check a patch for pcap 
> PMD.
> 
> Also please check testpmd patch. It will show you how to fix your legacy 
> applications to support port
> hotplug feature.
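
The call order described in the quoted cover letter (stop, close, then detach)
can be sketched as follows. The two rte_eal_dev_* prototypes are copied from
the quoted text. Everything else is an assumption for illustration: all four
function bodies are stand-in stubs so the sketch builds without DPDK,
rte_eth_dev_stop() and rte_eth_dev_close() are assumed to take a uint8_t
port_id and return void, and states[], names[], NB_PORTS, NAME_LEN and
unplug_port() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NB_PORTS 4     /* hypothetical table size */
#define NAME_LEN 32    /* hypothetical device name buffer size */

enum port_state { PORT_FREE = 0, PORT_ATTACHED, PORT_STOPPED, PORT_CLOSED };

static enum port_state states[NB_PORTS];
static char names[NB_PORTS][NAME_LEN];

/* ---- Stand-in stubs, NOT the DPDK implementations ---- */

static int
rte_eal_dev_attach(const char *devargs, uint8_t *port_id)
{
	uint8_t i;

	for (i = 0; i < NB_PORTS; i++) {
		if (states[i] != PORT_FREE)
			continue;
		snprintf(names[i], NAME_LEN, "%s", devargs);
		states[i] = PORT_ATTACHED;
		*port_id = i;   /* report which port was attached */
		return 0;
	}
	return -1;              /* no free slot */
}

static void
rte_eth_dev_stop(uint8_t port_id)
{
	assert(port_id < NB_PORTS);
	states[port_id] = PORT_STOPPED;
}

static void
rte_eth_dev_close(uint8_t port_id)
{
	assert(port_id < NB_PORTS);
	states[port_id] = PORT_CLOSED;
}

static int
rte_eal_dev_detach(uint8_t port_id, char *devname)
{
	/* The stub refuses ports that were not closed first. */
	if (port_id >= NB_PORTS || states[port_id] != PORT_CLOSED)
		return -1;
	snprintf(devname, NAME_LEN, "%s", names[port_id]);
	states[port_id] = PORT_FREE;
	return 0;
}

/* Stop, close, then detach. devname must have room for NAME_LEN bytes. */
static int
unplug_port(uint8_t port_id, char *devname)
{
	rte_eth_dev_stop(port_id);
	rte_eth_dev_close(port_id);
	return rte_eal_dev_detach(port_id, devname);
}
```

In the stubbed model a direct detach of a freshly attached port fails, while
unplug_port() succeeds and returns the device name that was passed to attach.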
> 
> PATCH v6 changes
>  - Fix rte_eth_dev_uninit() to handle a return value of uninit
>    function of PMD. To do this, below changes also be applied.
>    - Fix a parameter of rte_eth_dev_free().
>    - Use rte_eth_dev structure as the parameter of rte_eth_dev_free().
> 
> PATCH v5 changes
>  - Add runtime check passthrough driver type, like vfio-pci, igb_uio
>and uio_pci_generic.
>This was done by Qiu, Michael. Thanks a lot.
>  - Change function names like below.
>- rte_eal_dev_find_and_invoke() to rte_eal_vdev_find_and_invoke().
>- rte_eal_dev_invoke() to rte_eal_vdev_invoke().
>  - Add code to handle a return value of rte_eal_devargs_remove().
>  - Fix pci address format in rte_eal_dev_detach().
>  - Remove RTE_EAL_INVOKE_TYPE_UNKNOWN, because it's unused.
>  - Change function definition of rte_eal_devargs_remove().
>  - Fix pci_unmap_device() to check pt_driver.
>  - Fix return value of below functions.
>- rte_eth_dev_get_changed_port().
>- rte_eth_dev_get_port_by_addr().
>  - Change parameters of rte_eth_dev_validate_port() to cleanup code.
>  - Fix pci_scan_one to handle pt_driver correctly.
>    (Thanks to Qiu, Michael for above suggestions)
> 
> PATCH v4 changes
>  - Merge patches to review easier.
>  - Fix indent of 'if' statement.
>  - Fix calculation method of eal_compare_pci_addr().
>  - Fix header file declaration.
>  - Add header file to determine if hotplug can be enabled.
>(Thanks to Qiu, Michael)
>  - Use braces with 'for' loop.
>  - Add parameter checking.
>  - Fix sanity check code
>  - Fix comments of rte_eth_dev_type.
>  - C

[dpdk-dev] [PATCH v2 1/4] mk: Add 'make doc-pdf' target to convert guide docs to pdf

2015-02-03 Thread Mcnamara, John
> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, February 2, 2015 1:35 PM
> To: Mcnamara, John
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 1/4] mk: Add 'make doc-pdf' target to
> convert guide docs to pdf
> 
> I think it's possible. Look at this link:
> http://stackoverflow.com/questions/6473660/using-sphinx-docs-how-can-i-specify-png-image-formats-for-html-builds-and-pdf-im
> In this example, SVG files are converted into PDF files, not PNG.

Hi Thomas,

Using image.* will work for latex/pdf but only if the pngs are converted in the 
source directory (as opposed to the build directory) since Sphinx requires them 
to be available at the latex compile time.

I'll modify the patch to work that way.

Regards,

John
-- 



[dpdk-dev] [PATCH v4 1/5] mk: Add 'make doc-pdf' target to convert guide docs to pdf

2015-02-03 Thread John McNamara
Added make system support for building PDF versions of
the guides. Requires Python Sphinx and TexLive Full.

Signed-off-by: John McNamara 
---
 mk/rte.sdkdoc.mk |   43 ++-
 1 files changed, 42 insertions(+), 1 deletions(-)

diff --git a/mk/rte.sdkdoc.mk b/mk/rte.sdkdoc.mk
index dabc0d6..e09628f 100644
--- a/mk/rte.sdkdoc.mk
+++ b/mk/rte.sdkdoc.mk
@@ -37,13 +37,24 @@ endif
 endif

 RTE_SPHINX_BUILD = sphinx-build
+RTE_PDFLATEX_VERBOSE := --interaction=nonstopmode
+
 ifndef V
 RTE_SPHINX_VERBOSE := -q
+RTE_PDFLATEX_VERBOSE := --interaction=batchmode
+RTE_INKSCAPE_VERBOSE := > /dev/null 2>&1
 endif
 ifeq '$V' '0'
 RTE_SPHINX_VERBOSE := -q
+RTE_PDFLATEX_VERBOSE := --interaction=batchmode
+RTE_INKSCAPE_VERBOSE := > /dev/null 2>&1
 endif

+RTE_GUIDE_PDFS := $(filter %/, $(wildcard $(RTE_SDK)/doc/guides/*/))
+RTE_GUIDE_PDFS := $(RTE_GUIDE_PDFS:$(RTE_SDK)/doc/guides%=$(RTE_OUTPUT)/doc/latex/guides%)
+RTE_GUIDE_PDFS := $(RTE_GUIDE_PDFS:%/=%.pdf)
+RTE_DEFAULT_DPI ?= 300
+
 .PHONY: help
 help:
@cat $(RTE_SDK)/doc/build-sdk-quick.txt
@@ -53,7 +64,7 @@ help:
 all: api-html guides-html

 .PHONY: clean
-clean: api-html-clean guides-html-clean
+clean: api-html-clean guides-html-clean guides-latex-clean

 .PHONY: api-html
 api-html: api-html-clean
@@ -83,3 +94,33 @@ guides-%:
@echo 'sphinx for guides...'
$(Q)$(RTE_SPHINX_BUILD) -b $* $(RTE_SPHINX_VERBOSE) \
-c $(RTE_SDK)/doc/guides $(RTE_SDK)/doc/guides $(RTE_OUTPUT)/doc/$*/guides
+
+
+pdf: $(RTE_GUIDE_PDFS)
+
+.SECONDEXPANSION:
+# Use wildcard expansion to avoid * expansion issue with make 3.82.
+$(RTE_OUTPUT)/doc/latex/guides/%.pdf: $$(wildcard $(RTE_SDK)/doc/guides/%/*.rst)
+   @echo 'creating' $* 'pdf ...'
+
+   @# Convert the svg files to png for pdflatex.
+   $(eval tmp_images = $(wildcard $(RTE_SDK)/doc/guides/$*/img/*.svg))
+   $(Q)for image in $(tmp_images:.svg=); do \
+   inkscape -d $(RTE_DEFAULT_DPI) -D -b ff \
+   -f $$image.svg -e $$image.png $(RTE_INKSCAPE_VERBOSE); \
+   done
+
+   @# Generate the latex files.
+   $(Q)$(RTE_SPHINX_BUILD) -b latex $(RTE_SPHINX_VERBOSE) \
+   -c $(RTE_SDK)/doc/guides  $(RTE_SDK)/doc/guides/$* \
+   $(RTE_OUTPUT)/doc/latex/guides/$*
+
+   @# Remove the generated png files.
+   $(Q)rm -f $(tmp_images:.svg=.png)
+
+   @# Generate the pdf files.
+   $(Q)sed -i 's/LATEXOPTS =/LATEXOPTS = $(RTE_PDFLATEX_VERBOSE)/' \
+   $(RTE_OUTPUT)/doc/latex/guides/$*/Makefile
+   $(Q)make all-pdf -s -C $(RTE_OUTPUT)/doc/latex/guides/$*
+
+   $(Q)mv $(RTE_OUTPUT)/doc/latex/guides/$*/dpdk_doc.pdf $@
-- 
1.7.4.1



[dpdk-dev] [PATCH v4 0/5] doc: Add 'make pdf' target to convert guide docs to pdf

2015-02-03 Thread John McNamara

This patch adds support for creating PDF versions of the user guides.

Specifically:

* The Programmer's Guide
* The Linux Getting Started Guide
* The FreeBSD Getting Started Guide
* The Sample Applications User Guide
* The TestPMD User Guide
* The Release Notes

The local and online Html documentation is very useful but we have had
internal and external requests from people who also liked the PDF
documentation in older releases.

The PDF generation is fully automated and uses the same Sphinx build system
and RST files used for the Html docs but uses the 'latexpdf' target. In
addition to the standard Sphinx Python modules it requires the Tex/LaTeX
toolchain. For best results it requires a TexLive 'Full' installation.

The PDF documents are generated as follows:

make pdf
# or
make doc-pdf

The PDFs aren't generated as part of the 'make doc' rule since they can take
some 1-3 minutes to build and since they have a large toolchain dependency.

V4 Changes:
* Changed RST image types to wildcard to allow Sphinx to decide
  the appropriate type.
* Changed back to calling Sphinx generated makefile to ensure that
  the pdf files are created by Sphinx make rules.

V3 Changes:
* Remove sub-directory config.py files and replaced them with metadata
  in the main doc/guides/conf.py file and a more generic make rule.
* Added *.pdf targets with *.rst dependencies.
* Call pdflatex directly (instead of from Sphinx) to control the
  verbosity of the output.

V2 Changes:
* Removes config file duplication
* Converts SVG files to PNG on the fly
* Splits the patch into distinct mk/doc parts
* Fixes issues in the RST docs that prevent PDF generation


John McNamara (5):
  mk: Add 'make doc-pdf' target to convert guide docs to pdf
  doc: Add Sphinx config to build pdf version of guides
  doc: Fix encoding of (r) character
  doc: Refactored split cell formatting in one table
  doc: Convert image extensions to wildcard

 doc/guides/conf.py |   48 ++-
 doc/guides/prog_guide/env_abstraction_layer.rst|2 +-
 .../prog_guide/i40e_ixgbe_igb_virt_func_drv.rst|8 ++--
 .../intel_dpdk_xen_based_packet_switch_sol.rst |6 +-
 doc/guides/prog_guide/ivshmem_lib.rst  |2 +-
 doc/guides/prog_guide/kernel_nic_interface.rst |8 ++--
 .../libpcap_ring_based_poll_mode_drv.rst   |2 +-
 .../prog_guide/link_bonding_poll_mode_drv_lib.rst  |   14 +++---
 doc/guides/prog_guide/lpm6_lib.rst |2 +-
 doc/guides/prog_guide/lpm_lib.rst  |2 +-
 doc/guides/prog_guide/malloc_lib.rst   |2 +-
 doc/guides/prog_guide/mbuf_lib.rst |4 +-
 doc/guides/prog_guide/mempool_lib.rst  |6 +-
 doc/guides/prog_guide/multi_proc_support.rst   |2 +-
 doc/guides/prog_guide/overview.rst |2 +-
 doc/guides/prog_guide/packet_distrib_lib.rst   |4 +-
 doc/guides/prog_guide/packet_framework.rst |   14 +++---
 .../poll_mode_drv_emulated_virtio_nic.rst  |6 +-
 .../poll_mode_drv_paravirtual_vmxnets_nic.rst  |6 +-
 doc/guides/prog_guide/qos_framework.rst|   36 +++---
 doc/guides/prog_guide/ring_lib.rst |   28 ++--
 doc/guides/rel_notes/supported_features.rst|2 +-
 doc/guides/sample_app_ug/dist_app.rst  |4 +-
 doc/guides/sample_app_ug/exception_path.rst|2 +-
 doc/guides/sample_app_ug/intel_quickassist.rst |2 +-
 doc/guides/sample_app_ug/kernel_nic_interface.rst  |4 +-
 .../sample_app_ug/l2_forward_real_virtual.rst  |4 +-
 .../sample_app_ug/l3_forward_access_ctrl.rst   |4 +-
 doc/guides/sample_app_ug/load_balancer.rst |2 +-
 doc/guides/sample_app_ug/multi_process.rst |8 ++--
 doc/guides/sample_app_ug/qos_scheduler.rst |2 +-
 doc/guides/sample_app_ug/quota_watermark.rst   |6 +-
 doc/guides/sample_app_ug/test_pipeline.rst |   34 +-
 doc/guides/sample_app_ug/vhost.rst |   10 ++--
 doc/guides/sample_app_ug/vm_power_management.rst   |4 +-
 doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst   |2 +-
 mk/rte.sdkdoc.mk   |   43 +-
 37 files changed, 216 insertions(+), 121 deletions(-)

-- 
1.7.4.1



[dpdk-dev] [PATCH v4 3/5] doc: Fix encoding of (r) character

2015-02-03 Thread John McNamara
Change encoding of (r) from Latin-1 to UTF8 to match the other
symbols in the doc and to allow it to convert cleanly to PDF.

Signed-off-by: John McNamara 
---
 doc/guides/rel_notes/supported_features.rst |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/doc/guides/rel_notes/supported_features.rst b/doc/guides/rel_notes/supported_features.rst
index 7936e93..d87fcaa 100644
--- a/doc/guides/rel_notes/supported_features.rst
+++ b/doc/guides/rel_notes/supported_features.rst
@@ -51,7 +51,7 @@ Supported Features

 *   Intel® X710 40 Gigabit Ethernet Controller

-*   Support NIC filters in addition to flow director for Intel? 1GbE and 10GbE Controllers
+*   Support NIC filters in addition to flow director for Intel® 1GbE and 10GbE Controllers

 *   Virtualization (KVM)

-- 
1.7.4.1



[dpdk-dev] [PATCH v4 4/5] doc: Refactored split cell formatting in one table

2015-02-03 Thread John McNamara
Refactored split cell in test_pipeline table to allow it to
convert cleanly to PDF.

The Sphinx/Latex converter doesn't handle split cells like the
following:

  +-------------+----------+
  | Header 1    | Header 2 |
  +=============+==========+
  |             |          |
  |             |          |
  +-------------+          |
  |             |          |
  |             |          |
  +-------------+----------+

Instead the table was refactored to a simpler format:

  +-------------+----------+
  | Header 1    | Header 2 |
  +=============+==========+
  |             |          |
  |             |          |
  +-------------+----------+
  |             |          |
  |             |          |
  +-------------+----------+

The same information was retained in the table.

Signed-off-by: John McNamara 
---
 doc/guides/sample_app_ug/test_pipeline.rst |   32 +++
 1 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/doc/guides/sample_app_ug/test_pipeline.rst b/doc/guides/sample_app_ug/test_pipeline.rst
index 867a7a7..a5fed8a 100644
--- a/doc/guides/sample_app_ug/test_pipeline.rst
+++ b/doc/guides/sample_app_ug/test_pipeline.rst
@@ -137,9 +137,9 @@ For hash tables, the following parameters can be selected:
 |   |                    | entries.                                             | hash table with the following key format:             |
 |   |                    |                                                      |                                                       |
 |   |                    |                                                      | [4-byte index, 4 bytes of 0]                          |
-+---+--------------------+------------------------------------------------------+                                                       |
-| 4 | hash-[spec]-8-ext  |  Extendible bucket hash table with 8-byte key size   | The action configured for all table entries is        |
-|   |                    |  and 16 million entries.                             | "Sendto output port", with the output port index      |
+|   |                    |                                                      |                                                       |
+|   |                    |                                                      | The action configured for all table entries is        |
+|   |                    |                                                      | "Sendto output port", with the output port index      |
 |   |                    |                                                      | uniformly distributed for the range of output ports.  |
 |   |                    |                                                      |                                                       |
 |   |                    |                                                      | The default table rule (used in the case of a lookup  |
@@ -152,13 +152,17 @@ For hash tables, the following parameters can be selected:
 |   |                    |                                                      | [destination IPv4 address, 4 bytes of 0]              |
 |   |                    |                                                      |                                                       |
 +---+--------------------+------------------------------------------------------+-------------------------------------------------------+
+| 4 | hash-[spec]-8-ext  | Extendible bucket hash table with 8-byte key size    | Same as hash-[spec]-8-lru table entries, above.       |
+|   |                    | and 16 million entries.                              |                                                       |
+|   |                    |                                                      |                                                       |
++---+--------------------+------------------------------------------------------+-------------------------------------------------------+
 | 5 | hash-[spec]-16-lru | LRU hash table with 16-byte key size and 16 million  | 16 million entries are successfully added to the hash |
 |   |                    | entries.                                             | table with the following key format:                  |
 |   |                    |                                                      |                                                       |
 |   |                    |                                                      | [4-byte index, 12 bytes of 0]                         |
-+---++-

[dpdk-dev] [PATCH v4 5/5] doc: Convert image extensions to wildcard

2015-02-03 Thread John McNamara
Changed all image.svg and image.png extensions to image.*
This allows Sphinx to decide the appropriate image type
from the available image options.

Signed-off-by: John McNamara 
---
 doc/guides/prog_guide/env_abstraction_layer.rst|2 +-
 .../prog_guide/i40e_ixgbe_igb_virt_func_drv.rst|8 ++--
 .../intel_dpdk_xen_based_packet_switch_sol.rst |6 ++--
 doc/guides/prog_guide/ivshmem_lib.rst  |2 +-
 doc/guides/prog_guide/kernel_nic_interface.rst |8 ++--
 .../libpcap_ring_based_poll_mode_drv.rst   |2 +-
 .../prog_guide/link_bonding_poll_mode_drv_lib.rst  |   14 
 doc/guides/prog_guide/lpm6_lib.rst |2 +-
 doc/guides/prog_guide/lpm_lib.rst  |2 +-
 doc/guides/prog_guide/malloc_lib.rst   |2 +-
 doc/guides/prog_guide/mbuf_lib.rst |4 +-
 doc/guides/prog_guide/mempool_lib.rst  |6 ++--
 doc/guides/prog_guide/multi_proc_support.rst   |2 +-
 doc/guides/prog_guide/overview.rst |2 +-
 doc/guides/prog_guide/packet_distrib_lib.rst   |4 +-
 doc/guides/prog_guide/packet_framework.rst |   14 
 .../poll_mode_drv_emulated_virtio_nic.rst  |6 ++--
 .../poll_mode_drv_paravirtual_vmxnets_nic.rst  |6 ++--
 doc/guides/prog_guide/qos_framework.rst|   36 ++--
 doc/guides/prog_guide/ring_lib.rst |   28 
 doc/guides/sample_app_ug/dist_app.rst  |4 +-
 doc/guides/sample_app_ug/exception_path.rst|2 +-
 doc/guides/sample_app_ug/intel_quickassist.rst |2 +-
 doc/guides/sample_app_ug/kernel_nic_interface.rst  |4 +-
 .../sample_app_ug/l2_forward_real_virtual.rst  |4 +-
 .../sample_app_ug/l3_forward_access_ctrl.rst   |4 +-
 doc/guides/sample_app_ug/load_balancer.rst |2 +-
 doc/guides/sample_app_ug/multi_process.rst |8 ++--
 doc/guides/sample_app_ug/qos_scheduler.rst |2 +-
 doc/guides/sample_app_ug/quota_watermark.rst   |6 ++--
 doc/guides/sample_app_ug/test_pipeline.rst |2 +-
 doc/guides/sample_app_ug/vhost.rst |   10 +++---
 doc/guides/sample_app_ug/vm_power_management.rst   |4 +-
 doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst   |2 +-
 34 files changed, 106 insertions(+), 106 deletions(-)

diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst 
b/doc/guides/prog_guide/env_abstraction_layer.rst
index 231e266..45791b6 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -212,4 +212,4 @@ Memory zones can be reserved with specific start address 
alignment by supplying
 The alignment value should be a power of two and not less than the cache line 
size (64 bytes).
 Memory zones can also be reserved from either 2 MB or 1 GB hugepages, provided 
that both are available on the system.

-.. |linuxapp_launch| image:: img/linuxapp_launch.svg
+.. |linuxapp_launch| image:: img/linuxapp_launch.*
diff --git a/doc/guides/prog_guide/i40e_ixgbe_igb_virt_func_drv.rst 
b/doc/guides/prog_guide/i40e_ixgbe_igb_virt_func_drv.rst
index 41e316e..a984379 100755
--- a/doc/guides/prog_guide/i40e_ixgbe_igb_virt_func_drv.rst
+++ b/doc/guides/prog_guide/i40e_ixgbe_igb_virt_func_drv.rst
@@ -542,10 +542,10 @@ which belongs to the destination VF on the VM.

 |inter_vm_comms|

-.. |perf_benchmark| image:: img/perf_benchmark.png
+.. |perf_benchmark| image:: img/perf_benchmark.*

-.. |single_port_nic| image:: img/single_port_nic.png
+.. |single_port_nic| image:: img/single_port_nic.*

-.. |inter_vm_comms| image:: img/inter_vm_comms.png
+.. |inter_vm_comms| image:: img/inter_vm_comms.*

-.. |fast_pkt_proc| image:: img/fast_pkt_proc.png
+.. |fast_pkt_proc| image:: img/fast_pkt_proc.*
diff --git a/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst 
b/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
index 1f1e04f..47841cd 100644
--- a/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
+++ b/doc/guides/prog_guide/intel_dpdk_xen_based_packet_switch_sol.rst
@@ -457,8 +457,8 @@ The packet flow is:

 packet generator->Virtio in guest VM1->switching backend->Virtio in guest 
VM2->switching backend->wire

-.. |grant_table| image:: img/grant_table.png
+.. |grant_table| image:: img/grant_table.*

-.. |grant_refs| image:: img/grant_refs.png
+.. |grant_refs| image:: img/grant_refs.*

-.. |dpdk_xen_pkt_switch| image:: img/dpdk_xen_pkt_switch.png
+.. |dpdk_xen_pkt_switch| image:: img/dpdk_xen_pkt_switch.*
diff --git a/doc/guides/prog_guide/ivshmem_lib.rst 
b/doc/guides/prog_guide/ivshmem_lib.rst
index cd2f595..c76d2b3 100644
--- a/doc/guides/prog_guide/ivshmem_lib.rst
+++ b/doc/guides/prog_guide/ivshmem_lib.rst
@@ -155,4 +155,4 @@ As a result, if the user wishes to shut down or restart the 
IVSHMEM host applica
 it is not enough to simply shut the application down.
 Th

[dpdk-dev] [PATCH v4 2/5] doc: Add Sphinx config to build pdf version of guides

2015-02-03 Thread John McNamara
Add Python Sphinx config to allow conversion of guides
to Latex and then PDF format.

This mainly adds metadata but also includes an override to the
Latex formatter to control the font size in code blocks.

Signed-off-by: John McNamara 
---
 doc/guides/conf.py |   48 +---
 1 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/doc/guides/conf.py b/doc/guides/conf.py
index 385af03..1c03b50 100644
--- a/doc/guides/conf.py
+++ b/doc/guides/conf.py
@@ -1,5 +1,5 @@
 #   BSD LICENSE
-#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
 #   All rights reserved.
 #
 #   Redistribution and use in source and binary forms, with or without
@@ -29,11 +29,53 @@
 #   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

 import subprocess
+from sphinx.highlighting import PygmentsBridge
+from pygments.formatters.latex import LatexFormatter

 project = 'DPDK'

-copyright = '2014, Intel'
+copyright = '2015, Intel'

-version = subprocess.check_output(["make","-sRrC","../../", "showversion"])
+version = subprocess.check_output(['make', '-sRrC', '../../', 'showversion'])
+release = version

 master_doc = 'index'
+
+# Latex directives to be included directly in the latex/pdf docs.
+latex_preamble = r"""
+\usepackage[utf8]{inputenc}
+\usepackage{DejaVuSansMono}
+\usepackage[T1]{fontenc}
+\usepackage{helvet}
+\renewcommand{\familydefault}{\sfdefault}
+
+\RecustomVerbatimEnvironment{Verbatim}{Verbatim}{xleftmargin=5mm}
+"""
+
+# Configuration for the latex/pdf docs.
+latex_elements = {
+    'papersize': 'a4paper',
+    'pointsize': '11pt',
+    'preamble': latex_preamble}
+
+latex_documents = [
+    ('index',
+     'dpdk_doc.tex',
+     '',
+     '',
+     'manual')]
+
+
+# Temp class to override the default Latex formatter in order to modify the
+# code/verbatim blocks.
+class CustomLatexFormatter(LatexFormatter):
+
+    def __init__(self, **options):
+
+        super(CustomLatexFormatter, self).__init__(**options)
+
+        # Use the second smallest font size for code/verbatim blocks.
+        self.verboptions = r'formatcom=\footnotesize'
+
+# Replace the default latex formatter.
+PygmentsBridge.latex_formatter = CustomLatexFormatter
-- 
1.7.4.1



[dpdk-dev] [PATCH v9 4/4] docs: Add ABI documentation

2015-02-03 Thread Thomas Monjalon
> Adding a document describing rudimentary ABI policy and adding notice space
> for any deprecation announcements
> 
> Signed-off-by: Neil Horman 

Acked-by: Thomas Monjalon 

Thanks Neil for writing the policy.

Version 2.0 will be the first to have a versioned ABI with LIBABIVER := 1.
Starting from there, we'll have to consider this ABI policy when preparing
next versions.

-- 
Thomas


[dpdk-dev] Add DSO symbol versioning to support backwards compatibility

2015-02-03 Thread Thomas Monjalon
2014-12-20 16:01, Neil Horman:
> The DPDK ABI develops and changes quickly, which makes it difficult for
> applications to keep up with the latest version of the library, especially 
> when
> it (the DPDK) is built as a set of shared objects, as applications may be 
> built
> against an older version of the library.
> 
> To mitigate this, this patch series introduces support for library and symbol
> versioning when the DPDK is built as a DSO.  Specifically, it does 4 things:
> 
> 1) Adds initial support for library versioning.  Each library now has a 
> version
> map that explicitly calls out what symbols are exported to using applications,
> and assigns version(s) to them
> 
> 2) Adds support macros so that when libraries create incompatible ABI's,
> multiple versions may be supported so that applications linked against older
> DPDK releases can continue to function
> 
> 3) Adds library soname versioning suffixes so that when ABI's must be broken 
> in
> a fashion that requires a rebuild of older applications, they will break at 
> load
> time, rather than cause unexpected issues at run time.
> 
> 4) Adds documentation for ABI policy, and provides space to document 
> deprecated
> ABI versions, so that applications might be warned of impending changes.
> 
> With these elements in place the DPDK has some support to allow for the
> extended maintenance of older API's while still allowing the freedom to
> develop new and improved API's.
> 
> Implementing this feature will require some additional effort on the part of
> developers and reviewers.  When reviewing patches, they must be checked against
> existing exports to ensure that the function prototypes are not changing.  If
> they are, the versioning macros must be used, and the library export map
> should be updated to reflect the new version of the function.
> 
> When data structures change, if those structures are application accessible,
> apis that accept or return instances of those data structures should have new
> versions created so that users of the old data structure version might 
> co-exist
> at the same time.
> 
> Note it was requested that this series be delayed until DPDK 2.0, so this is a
> repost, now that DPDK 1.8 has been tagged.
> 
> Signed-off-by: Neil Horman 
> CC: Thomas Monjalon 
> CC: "Richardson, Bruce" 
> CC: "Robert Love" 

After updating to version 2.0, and sorting symbol lists,
Applied

It's probably an important change which makes the 2.0 number meaningful.
Thanks
-- 
Thomas


[dpdk-dev] Packet drops during non-exhaustive flood with OVS and 1.8.0

2015-02-03 Thread Traynor, Kevin

> -Original Message-
> From: Andrey Korolyov [mailto:andrey at xdel.ru]
> Sent: Monday, February 2, 2015 10:53 AM
> To: dev at dpdk.org
> Cc: discuss at openvswitch.org; Traynor, Kevin
> Subject: Re: Packet drops during non-exhaustive flood with OVS and 1.8.0
> 
> On Thu, Jan 22, 2015 at 8:11 PM, Andrey Korolyov  wrote:
> > On Wed, Jan 21, 2015 at 8:02 PM, Andrey Korolyov  wrote:
> >> Hello,
> >>
> >> I observed that the latest OVS with dpdk-1.8.0 and igb_uio starts to
> >> drop packets earlier than a regular Linux ixgbe 10G interface, setup
> >> follows:
> >>
> >> receiver/forwarder:
> >> - 8 core/2 head system with E5-2603v2, cores 1-3 are given to OVS 
> >> exclusively
> >> - n-dpdk-rxqs=6, rx scattering is not enabled
> >> - x520 da
> >> - 3.10/3.18 host kernel
> >> - during 'legacy mode' testing, queue interrupts are scattered through all 
> >> cores
> >>
> >> sender:
> >> - 16-core E52630, netmap framework for packet generation
> >> - pkt-gen -f tx -i eth2 -s 10.6.9.0-10.6.9.255 -d
> >> 10.6.10.0-10.6.10.255 -S 90:e2:ba:84:19:a0 -D 90:e2:ba:85:06:07 -R
> >> 1100, results in 11Mpps 60-byte packet flood, there are constant
> >> values during test.
> >>
> >> OVS contains only single drop rule at the moment:
> >> ovs-ofctl add-flow br0 in_port=1,actions=DROP
> >>
> >> Packet generator was launched for tens of seconds for both Linux stack
> >> and OVS+DPDK cases, resulting in zero drop/error count on the
> >> interface in first, along with same counter values on pktgen and host
> >> interface stat (means that the none of generated packets are
> >> unaccounted).
> >>
> >> I selected rate for about 11M because OVS starts to do packet drop
> >> around this value, after same short test interface stat shows
> >> following:
> >>
> >> statistics  : {collisions=0, rx_bytes=22003928768,
> >> rx_crc_err=0, rx_dropped=0, rx_errors=10694693, rx_frame_err=0,
> >> rx_over_err=0, rx_packets=343811387, tx_bytes=0, tx_dropped=0,
> >> tx_errors=0, tx_packets=0}
> >>
> >> pktgen side:
> >> Sent 354506080 packets, 60 bytes each, in 32.23 seconds.
> >> Speed: 11.00 Mpps Bandwidth: 5.28 Gbps (raw 7.39 Gbps)
> >>
> >> If rate will be increased up to 13-14Mpps, the relative error/overall
> >> ratio will rise up to a one third. So far OVS on dpdk shows perfect
> >> results and I do not want to reject this solution due to exhaustive
> >> behavior like described one, so I`m open for any suggestions to
> >> improve the situation (except using 1.7 branch :) ).
> >
> > At a glance it looks like there is a problem with pmd threads, as they
> > starting to consume about five thousandth of sys% on a dedicated cores
> > during flood but in theory they should not. Any ideas for
> > debugging/improving this situation are very welcomed!
> 
> Since my last message I have tried a couple of different
> configurations, but packet loss starts to happen as early as
> 7-8Mpps. It looks like the bulk processing which has been in the
> OVS-DPDK distro is missing from the series of patches
> (http://openvswitch.org/pipermail/dev/2014-December/049722.html,
> http://openvswitch.org/pipermail/dev/2014-December/049723.html).
> Before implementing this, I would like to know if there are any
> obvious (not for me unfortunately) clues on this performance issue.

These patches are to enable DPDK 1.8 only. What 'bulk processing' are you
referring to?
By default there is a batch size of 192 in netdev-dpdk for rx from the NIC -
the linked patch doesn't change this, just the DPDK version.

The main things to consider are to use isolcpus, pin the pmd thread and keep
everything on 1 NUMA socket. At 11 Mpps without packet loss on that processor
I suspect you are doing those things already.

> 
> Thanks!


[dpdk-dev] Packet drops during non-exhaustive flood with OVS and 1.8.0

2015-02-03 Thread Andrey Korolyov
> These patches are to enable DPDK 1.8 only. What 'bulk processing' are you
> referring to?
> By default there is a batch size of 192 in netdev-dpdk for rx from the NIC -
> the linked patch doesn't change this, just the DPDK version.

Sorry, I referred to the wrong part there: bulk transmission, which is
clearly not involved in my case. The idea was that conditionally
enabling prefetch for rx queues (BULK_ALLOC) may help somehow, but
it will probably mask the issue instead of solving it directly. By my
understanding, a strict drop rule should have zero impact on the main
ovs thread (and this is true) and work just fine at line rate
(this is not).

>
> The main things to consider are to use isolcpus, pin the pmd thread and keep
> everything on 1 NUMA socket. At 11 Mpps without packet loss on that processor
> I suspect you are doing those things already.

Yes, with all tuning improvements I was able to do this, but the bare
Linux stack on the same machine is able to handle 12Mpps and there are
absolutely no hints of what exactly is being congested.


[dpdk-dev] [PATCH v6 00/13] Port Hotplug Framework

2015-02-03 Thread Iremonger, Bernard


> -Original Message-
> From: Tetsuya Mukawa [mailto:mukawa at igel.co.jp]
> Sent: Sunday, February 1, 2015 4:02 AM
> To: dev at dpdk.org
> Cc: Qiu, Michael; Iremonger, Bernard; Tetsuya Mukawa
> Subject: [PATCH v6 00/13] Port Hotplug Framework
> 
> This patch series adds a dynamic port hotplug framework to DPDK.
> With the patches, DPDK apps can attach or detach ports at runtime.
> 
> The basic concepts of the port hotplug are as follows.
> - DPDK apps are responsible for managing ports.
>   DPDK apps only know which ports are attached or detached at the moment.
>   The port hotplug framework is implemented to allow DPDK apps to manage
>   ports.
>   For example, when a DPDK app calls the port attach function, the attached
>   port number is returned. DPDK apps can also detach a port by port number.
> - Kernel support is needed for attaching or detaching physical device ports.
>   To attach a new physical device port, the device must first be recognized
>   by a userspace direct I/O framework in the kernel. Then DPDK apps can
>   call the port hotplug functions to attach ports.
>   For detaching, the steps are reversed.
> - Before detaching ports, the ports must be stopped and closed.
>   A DPDK application must call rte_eth_dev_stop() and rte_eth_dev_close()
>   before detaching ports. These functions call the finalization code of PMDs.
>   But so far, no PMD frees all resources allocated by initialization.
>   This means PMDs need to be fixed to support the port hotplug.
>   'RTE_PCI_DRV_DETACHABLE' is a new flag indicating that a PMD supports
>   detaching. Without this flag, detaching will fail.
> - Legacy DPDK apps must not be affected.
>   No DPDK EAL behavior is changed if the port hotplug functions aren't called,
>   so all legacy DPDK apps can still work without modifications.
> 
> There are a few limitations.
> - The port hotplug functions are not thread safe.
>   DPDK apps should handle it.
> - Only Linux and igb_uio are supported so far.
>   BSD and VFIO are not supported. I will send VFIO patches at least, but I
>   don't have a plan to submit a BSD patch so far.
> 
> 
> Here is port hotplug APIs.
> ---
> /**
>  * Attach a new device.
>  *
>  * @param devargs
>  *   A pointer to a strings array describing the new device
>  *   to be attached. The strings should be a pci address like
>  *   ':01:00.0' or virtual device name like 'eth_pcap0'.
>  * @param port_id
>  *  A pointer to a port identifier actually attached.
>  * @return
>  *  0 on success and port_id is filled, negative on error
>  */
> int rte_eal_dev_attach(const char *devargs, uint8_t *port_id);
> 
> /**
>  * Detach a device.
>  *
>  * @param port_id
>  *   The port identifier of the device to detach.
>  * @param devname
>  *  A pointer to a device name actually detached.
>  * @return
>  *  0 on success and devname is filled, negative on error
>  */
> int rte_eal_dev_detach(uint8_t port_id, char *devname);
> ---
> 
> This patch series is for the DPDK EAL. For DPDK apps to use the port hotplug
> functions, each PMD should be fixed to support the 'RTE_PCI_DRV_DETACHABLE'
> flag. Please check the patch for the pcap PMD.
> 
> Also please check the testpmd patch. It shows how to fix your legacy
> applications to support the port hotplug feature.
> 
> PATCH v6 changes
>  - Fix rte_eth_dev_uninit() to handle a return value of the uninit
>    function of the PMD. To do this, the changes below are also applied.
>    - Fix a parameter of rte_eth_dev_free().
>    - Use the rte_eth_dev structure as the parameter of rte_eth_dev_free().
> 
> PATCH v5 changes
>  - Add runtime check of passthrough driver type, like vfio-pci, igb_uio
>    and uio_pci_generic.
>    This was done by Qiu, Michael. Thanks a lot.
>  - Change function names like below.
>    - rte_eal_dev_find_and_invoke() to rte_eal_vdev_find_and_invoke().
>    - rte_eal_dev_invoke() to rte_eal_vdev_invoke().
>  - Add code to handle a return value of rte_eal_devargs_remove().
>  - Fix pci address format in rte_eal_dev_detach().
>  - Remove RTE_EAL_INVOKE_TYPE_UNKNOWN, because it's unused.
>  - Change function definition of rte_eal_devargs_remove().
>  - Fix pci_unmap_device() to check pt_driver.
>  - Fix return value of below functions.
>    - rte_eth_dev_get_changed_port().
>    - rte_eth_dev_get_port_by_addr().
>  - Change parameters of rte_eth_dev_validate_port() to cleanup code.
>  - Fix pci_scan_one to handle pt_driver correctly.
>    (Thanks to Qiu, Michael for above suggestions)
> 
> PATCH v4 changes
>  - Merge patches to review easier.
>  - Fix indent of 'if' statement.
>  - Fix calculation method of eal_compare_pci_addr().
>  - Fix header file declaration.
>  - Add header file to determine if hotplug can be enabled.
>    (Thanks to Qiu, Michael)
>  - Use braces with 'for' loop.
>  - Add parameter checking.
>  - Fix sanity check code
>  - Fix comments of rte_eth_dev_type.
>  -
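
The attach/detach flow described in the cover letter above can be sketched
as a toy model. This is only an illustration under stated assumptions: every
toy_* name is hypothetical, the port table is a plain array, and none of this
is the DPDK implementation. It shows the shape of the two proposed calls and
the ordering rule that a port must be stopped and closed before it is detached.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_PORTS 4
#define TOY_NAME_LEN  32

/* Hypothetical, reduced port life cycle for illustration only. */
enum toy_port_state { TOY_FREE, TOY_ATTACHED, TOY_CLOSED };

struct toy_port {
	enum toy_port_state state;
	char name[TOY_NAME_LEN];
};

static struct toy_port toy_ports[TOY_MAX_PORTS];

/* Attach: claim the first free slot and report its index as the port id. */
static int toy_dev_attach(const char *devargs, uint8_t *port_id)
{
	assert(devargs != NULL && port_id != NULL);

	for (uint8_t i = 0; i < TOY_MAX_PORTS; i++) {
		if (toy_ports[i].state != TOY_FREE)
			continue;
		strncpy(toy_ports[i].name, devargs, TOY_NAME_LEN - 1);
		toy_ports[i].name[TOY_NAME_LEN - 1] = '\0';
		toy_ports[i].state = TOY_ATTACHED;
		*port_id = i;
		return 0;
	}
	return -1; /* no free port */
}

/* Models rte_eth_dev_stop() followed by rte_eth_dev_close(). */
static int toy_dev_stop_and_close(uint8_t port_id)
{
	if (port_id >= TOY_MAX_PORTS || toy_ports[port_id].state != TOY_ATTACHED)
		return -1;
	toy_ports[port_id].state = TOY_CLOSED;
	return 0;
}

/* Detach: refused unless the port was stopped and closed first. */
static int toy_dev_detach(uint8_t port_id, char *devname)
{
	assert(devname != NULL);

	if (port_id >= TOY_MAX_PORTS || toy_ports[port_id].state != TOY_CLOSED)
		return -1;
	/* The caller's devname buffer must hold TOY_NAME_LEN bytes. */
	memcpy(devname, toy_ports[port_id].name, TOY_NAME_LEN);
	toy_ports[port_id].state = TOY_FREE;
	return 0;
}
```

In this model a caller attaches with a device string such as 'eth_pcap0',
receives a port id, and a detach attempt fails until toy_dev_stop_and_close()
has been called for that port.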

[dpdk-dev] [PATCH 1/2] rte_ethdev: update link status (speed, duplex, link_up) after rte_eth_dev_start

2015-02-03 Thread Jia Yu
Helin,

Thanks for the comment. Any device that enables LSC needs this fix; otherwise
each of them needs to call link_update separately. We cannot assume that only
Bond enables the LSC interrupt.

The fix will not be used for all PMDs, as it explicitly checks if
(dev->data->dev_conf.intr_conf.lsc != 0). Therefore, for a hardware NIC
(e.g. 82599) that disables lsc by default, the link_update callback will
not be executed. Please let me know if you have other concerns.
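
To make the condition concrete, here is a minimal sketch of the check,
written outside DPDK. It is a model under stated assumptions, not the ethdev
code: toy_eth_dev, toy_link, toy_dev_start() and toy_link_update_10g() are
hypothetical names, and the device is reduced to one flag, one cached link
state and one callback.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ethdev structures; not the DPDK types. */
struct toy_link {
	int speed_mbps;
	int up;
};

struct toy_eth_dev {
	int lsc_enabled;                          /* models dev_conf.intr_conf.lsc */
	struct toy_link link;                     /* cached state the application reads */
	int (*link_update)(struct toy_eth_dev *); /* models dev_ops->link_update */
};

/* Example callback: pretend the hardware reports a 10G link that is up. */
static int toy_link_update_10g(struct toy_eth_dev *dev)
{
	dev->link.speed_mbps = 10000;
	dev->link.up = 1;
	return 0;
}

/* Models the tail of rte_eth_dev_start() with the proposed check added. */
static int toy_dev_start(struct toy_eth_dev *dev)
{
	assert(dev != NULL);

	if (dev->lsc_enabled != 0) {
		/* Fail with -ENOTSUP when the driver has no link_update callback. */
		if (dev->link_update == NULL)
			return -ENOTSUP;
		/* Read the link state once so the cached copy is valid. */
		dev->link_update(dev);
	}
	return 0;
}
```

With lsc_enabled set to 0 the callback is never invoked and the cached link
state is left untouched, which is the behaviour described above for NICs that
leave lsc disabled.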

Thanks,
Jia

On 2/3/15, 12:35 AM, "Zhang, Helin"  wrote:

>
>
>> -Original Message-
>> From: Jia Yu [mailto:jyu at vmware.com]
>> Sent: Tuesday, February 3, 2015 4:00 PM
>> To: Zhang, Helin
>> Cc: dev at dpdk.org; Thomas Monjalon
>> Subject: Re: [dpdk-dev] [PATCH 1/2] rte_ethdev: update link status
>>(speed,
>> duplex, link_up) after rte_eth_dev_start
>> 
>> My answer to Helin's comments:
>> 
>> This patch is needed for bond slave devices or other devices, when LSC
>> interrupt is enabled.
>>  1. slave_configure()  -> slave_eth_dev->?.lsc = 1
>> 
>>  2. rte_eth_link_get() reads dev_link from eth_dev when the lsc interrupt is
>> enabled. However, the dev_link on eth_dev has not been initialized and shows
>> the link-down state. This patch initializes the device's dev_link at
>> rte_eth_dev_start time.
>So the link update is for bond only? But your code change is in rte_ethdev,
>so it will be used for all PMDs. For a hardware NIC (e.g. 82599), this link
>update is not needed at all. It just needs to wait for the link status change
>event. So can those code changes be put in librte_pmd_bond, not in
>librte_ether?
>
>Regards,
>Helin
>
>> 
>> Please let me know if you have further questions/comments.
>> 
>> Thanks,
>> Jia
>> 
>> On 1/30/15, 2:28 AM, "Thomas Monjalon" 
>> wrote:
>> 
>> >Jia, any news on this patchset?
>> >
>> >2014-11-12 03:57, Zhang, Helin:
>> >> Hi Jia
>> >>
>> >> > -Original Message-
>> >> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jia Yu
>> >> > Sent: Saturday, November 8, 2014 1:32 AM
>> >> > To: dev at dpdk.org
>> >> > Subject: [dpdk-dev] [PATCH 1/2] rte_ethdev: update link status
>> >>(speed, duplex,
>> >> > link_up) after rte_eth_dev_start
>> >> >
>> >> > Since the LSC interrupt is disabled by pmd drivers, link status in
>> >> > rte_eth_device is always down.
>> >> If the LSC interrupt is disabled by default, it will poll the link status
>> >> during initialization or in dev_start, and then the link status
>> >> should be correct, if I am not wrong.
>> >>
>> >> > Bond slave_configure() enables the LSC interrupt on devices to get
>> >> > notification if link status changes. However, the LSC interrupt at
>> >> > device start time is still lost.
>> >> Before enabling interrupt for LSC, the link status should be polled.
>> >>So after the port  startup, the link status should be there.
>> >>
>> >> >
>> >> > In this fix, call link_update to read link status from hardware
>> >>register at device
>> >> > start time.
>> >> Could you help to explain these code changes a bit more? Why do we need
>> >> them?
>> >>
>> >> >
>> >> > Issue:
>> >> > Change-Id: Ib57a1c9114f922485c7b0f4338bfe7b3d3f87d65
>> >> > Signed-off-by: Jia Yu 
>> >> > ---
>> >> >  lib/librte_ether/rte_ethdev.c | 4 
>> >> >  1 file changed, 4 insertions(+)
>> >> >
>> >> > diff --git a/lib/librte_ether/rte_ethdev.c
>> >>b/lib/librte_ether/rte_ethdev.c index
>> >> > ff1c769..6c01b02 100644
>> >> > --- a/lib/librte_ether/rte_ethdev.c
>> >> > +++ b/lib/librte_ether/rte_ethdev.c
>> >> > @@ -869,6 +869,10 @@ rte_eth_dev_start(uint8_t port_id)
>> >> >
>> >> > rte_eth_dev_config_restore(port_id);
>> >> >
>> >> > +   if (dev->data->dev_conf.intr_conf.lsc != 0) {
>> >> > +   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->link_update,
>> -ENOTSUP);
>> >> > +   (*dev->dev_ops->link_update)(dev, 0);
>> >> > +   }
>> >> > return 0;
>> >> >  }
>> >> >
>> >> > --
>> >> > 1.9.1
>> >>
>> >> Regards,
>> >> Helin
>> >
>



[dpdk-dev] deadline for 2.0 features proposal

2015-02-03 Thread Bruce Richardson
On Tue, Feb 03, 2015 at 11:40:41AM +0100, Thomas Monjalon wrote:
> Hi,
> 
> These features were planned in the 2.0 roadmap but not submitted:
>   - cuckoo hash
>   - packet distributor (phase 2)
>   - bifurcated driver, assuming availability in Linux kernel
>   - Broadcom driver
>   - Hyper-V driver
>   - i40e QoS
>   - i40e IEEE1588
>   - i40e DCB
>   - i40e SR-IOV switching
>   - i40e port mirroring
> 
> They are now in the 2.1 roadmap.
> If you are working on one of these features or plan to do,
> please confirm their status and the estimated integration window.
> 
> Thanks
> -- 
> Thomas

There are no enhancements to the packet distributor lib which will be ready for
inclusion inside the 2.0 timeframe.

Regards,
/Bruce


[dpdk-dev] Add DSO symbol versioning to support backwards compatibility

2015-02-03 Thread Neil Horman
On Tue, Feb 03, 2015 at 05:01:51PM +0100, Thomas Monjalon wrote:
> 2014-12-20 16:01, Neil Horman:
> > The DPDK ABI develops and changes quickly, which makes it difficult for
> > applications to keep up with the latest version of the library, especially 
> > when
> > it (the DPDK) is built as a set of shared objects, as applications may be 
> > built
> > against an older version of the library.
> > 
> > To mitigate this, this patch series introduces support for library and 
> > symbol
> > versioning when the DPDK is built as a DSO.  Specifically, it does 4 things:
> > 
> > 1) Adds initial support for library versioning.  Each library now has a 
> > version
> > map that explicitly calls out what symbols are exported to using 
> > applications,
> > and assigns version(s) to them
> > 
> > 2) Adds support macros so that when libraries create incompatible ABI's,
> > multiple versions may be supported so that applications linked against older
> > DPDK releases can continue to function
> > 
> > 3) Adds library soname versioning suffixes so that when ABI's must be 
> > broken in
> > a fashion that requires a rebuild of older applications, they will break at 
> > load
> > time, rather than cause unexpected issues at run time.
> > 
> > 4) Adds documentation for ABI policy, and provides space to document 
> > deprecated
> > ABI versions, so that applications might be warned of impending changes.
> > 
> > With these elements in place the DPDK has some support to allow for the
> > extended maintenance of older API's while still allowing the freedom to
> > develop new and improved API's.
> > 
> > Implementing this feature will require some additional effort on the part of
> > developers and reviewers.  When reviewing patches, they must be checked against
> > existing exports to ensure that the function prototypes are not changing.
> > If they are, the versioning macros must be used, and the library export map
> > should be updated to reflect the new version of the function.
> > 
> > When data structures change, if those structures are application accessible,
> > apis that accept or return instances of those data structures should have 
> > new
> > versions created so that users of the old data structure version might 
> > co-exist
> > at the same time.
> > 
> > Note it was requested that this series be delayed until DPDK 2.0, so this 
> > is a
> > repost, now that DPDK 1.8 has been tagged.
> > 
> > Signed-off-by: Neil Horman 
> > CC: Thomas Monjalon 
> > CC: "Richardson, Bruce" 
> > CC: "Robert Love" 
> 
> After updating to version 2.0, and sorting symbol lists,
> Applied
> 
> It's probably an important change which makes the 2.0 number meaningful.
> Thanks
> -- 
> Thomas
> 
Thank you Thomas!  Just as a heads up, there may be some inconsistencies
resulting from patches that you merged between the time of my last repost and
the time of this merge.  They'll be easy to fix as they will just amount to
symbol additions/removals from the various version map files.  So please cc me on
any suspicious build breaks in the next few days when building shared objects,
and I'll fix them up ASAP.

Neil



[dpdk-dev] deadline for 2.0 features proposal

2015-02-03 Thread Thomas Monjalon
2015-02-03 12:04, Bruce Richardson:
> On Tue, Feb 03, 2015 at 11:40:41AM +0100, Thomas Monjalon wrote:
> > Hi,
> > 
> > These features were planned in the 2.0 roadmap but not submitted:
> > - cuckoo hash
> > - packet distributor (phase 2)
> > - bifurcated driver, assuming availability in Linux kernel
> > - Broadcom driver
> > - Hyper-V driver
> > - i40e QoS
> > - i40e IEEE1588
> > - i40e DCB
> > - i40e SR-IOV switching
> > - i40e port mirroring
> > 
> > They are now in the 2.1 roadmap.
> > If you are working on one of these features or plan to do,
> > please confirm their status and the estimated integration window.
> > 
> > Thanks
> 
> There are no enhancements to the packet distributor lib which will be ready 
> for
> inclusion inside the 2.0 timeframe.

But will there be something new for 2.1?
What does "phase 2" mean?

Thanks
-- 
Thomas


[dpdk-dev] [RFC PATCH] rte_timer: Fix rte_timer_reset return value

2015-02-03 Thread rsanfo...@gmail.com
From: Robert Sanford 

- API rte_timer_reset() should return -1 when the timer is in the
RUNNING or CONFIG state. Instead, it ignores the return value of
internal function __rte_timer_reset() and always returns 0.
We change rte_timer_reset() to return the value returned by
__rte_timer_reset().

- Change API rte_timer_reset_sync() to invoke rte_pause() while
spin-waiting for rte_timer_reset() to succeed.

- Enhance timer stress test 2 to report how many timer reset
collisions occur, i.e., how many times rte_timer_reset() fails
due to a timer being in the CONFIG state.

Signed-off-by: Robert Sanford 

---
 app/test/test_timer.c|   25 ++---
 lib/librte_timer/rte_timer.c |7 +++
 2 files changed, 25 insertions(+), 7 deletions(-)

diff --git a/app/test/test_timer.c b/app/test/test_timer.c
index 4b4800b..2f27f84 100644
--- a/app/test/test_timer.c
+++ b/app/test/test_timer.c
@@ -247,12 +247,15 @@ static int
 timer_stress2_main_loop(__attribute__((unused)) void *arg)
 {
static struct rte_timer *timers;
-   int i;
+   int i, ret;
static volatile int ready = 0;
uint64_t delay = rte_get_timer_hz() / 4;
unsigned lcore_id = rte_lcore_id();
+   int32_t my_collisions = 0;
+   static rte_atomic32_t collisions = RTE_ATOMIC32_INIT(0);

if (lcore_id == rte_get_master_lcore()) {
+   cb_count = 0;
timers = rte_malloc(NULL, sizeof(*timers) * NB_STRESS2_TIMERS, 
0);
if (timers == NULL) {
printf("Test Failed\n");
@@ -268,15 +271,24 @@ timer_stress2_main_loop(__attribute__((unused)) void *arg)
}

/* have all cores schedule all timers on master lcore */
-   for (i = 0; i < NB_STRESS2_TIMERS; i++)
-   rte_timer_reset(&timers[i], delay, SINGLE, 
rte_get_master_lcore(),
+   for (i = 0; i < NB_STRESS2_TIMERS; i++) {
+   ret = rte_timer_reset(&timers[i], delay, SINGLE, 
rte_get_master_lcore(),
timer_stress2_cb, NULL);
+   /* there will be collisions when multiple cores simultaneously
+* configure the same timers */
+   if (ret != 0)
+   my_collisions++;
+   }
+   if (my_collisions != 0)
+   rte_atomic32_add(&collisions, my_collisions);

ready = 0;
rte_delay_ms(500);

/* now check that we get the right number of callbacks */
if (lcore_id == rte_get_master_lcore()) {
+   if ((my_collisions = rte_atomic32_read(&collisions)) != 0)
+   printf("- %d timer reset collisions (OK)\n", 
my_collisions);
rte_timer_manage();
if (cb_count != NB_STRESS2_TIMERS) {
printf("Test Failed\n");
@@ -311,6 +323,13 @@ timer_stress2_main_loop(__attribute__((unused)) void *arg)
/* now check that we get the right number of callbacks */
if (lcore_id == rte_get_master_lcore()) {
rte_timer_manage();
+
+   /* clean up statics, in case we run again */
+   rte_free(timers);
+   timers = 0;
+   ready = 0;
+   rte_atomic32_set(&collisions, 0);
+
if (cb_count != NB_STRESS2_TIMERS) {
printf("Test Failed\n");
printf("- Stress test 2, part 2 failed\n");
diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c
index 269a992..d18abf5 100644
--- a/lib/librte_timer/rte_timer.c
+++ b/lib/librte_timer/rte_timer.c
@@ -424,10 +424,8 @@ rte_timer_reset(struct rte_timer *tim, uint64_t ticks,
else
period = 0;

-   __rte_timer_reset(tim,  cur_time + ticks, period, tim_lcore,
+   return __rte_timer_reset(tim,  cur_time + ticks, period, tim_lcore,
  fct, arg, 0);
-
-   return 0;
 }

 /* loop until rte_timer_reset() succeed */
@@ -437,7 +435,8 @@ rte_timer_reset_sync(struct rte_timer *tim, uint64_t ticks,
 rte_timer_cb_t fct, void *arg)
 {
while (rte_timer_reset(tim, ticks, type, tim_lcore,
-  fct, arg) != 0);
+  fct, arg) != 0)
+   rte_pause();
 }

 /* Stop the timer associated with the timer handle tim */
-- 
1.7.1



[dpdk-dev] deadline for 2.0 features proposal

2015-02-03 Thread Bruce Richardson
On Tue, Feb 03, 2015 at 09:24:28PM +0100, Thomas Monjalon wrote:
> 2015-02-03 12:04, Bruce Richardson:
> > On Tue, Feb 03, 2015 at 11:40:41AM +0100, Thomas Monjalon wrote:
> > > Hi,
> > > 
> > > These features were planned in the 2.0 roadmap but not submitted:
> > >   - cuckoo hash
> > >   - packet distributor (phase 2)
> > >   - bifurcated driver, assuming availability in Linux kernel
> > >   - Broadcom driver
> > >   - Hyper-V driver
> > >   - i40e QoS
> > >   - i40e IEEE1588
> > >   - i40e DCB
> > >   - i40e SR-IOV switching
> > >   - i40e port mirroring
> > > 
> > > They are now in the 2.1 roadmap.
> > > If you are working on one of these features or plan to do so,
> > > please confirm their status and the estimated integration window.
> > > 
> > > Thanks
> > 
> > There are no enhancements to the packet distributor lib which will be ready
> > for inclusion inside the 2.0 timeframe.
> 
> But there will be something new for 2.1?
> What does "phase 2" mean?
> 
> Thanks
> -- 
> Thomas

I'm currently doing some investigation work in this area, to see what ways we
can improve performance or use other distribution schemes. Depending on what
comes out of the investigations, there may or may not be improvements to push
in the 2.1 timeframe. It's too early to know at this point.

Regards,
/Bruce


[dpdk-dev] [PATCH v2 0/5] Interrupt mode for PMD

2015-02-03 Thread Stephen Hemminger
On Tue,  3 Feb 2015 16:18:26 +0800
Zhou Danny  wrote:

> 2) UIO only supports a single interrupt vector which has to be shared by
> LSC interrupt and interrupts assigned to dedicated rx queues.

UIO uses MSI-X, and there is no fundamental reason it could not use one IRQ for
LSC and one IRQ per queue. It might require some more work in the base kernel,
but that is not hard.


[dpdk-dev] [PATCH v2 1/5] ethdev: add rx interrupt enable/disable functions

2015-02-03 Thread Stephen Hemminger
On Tue,  3 Feb 2015 16:18:27 +0800
Zhou Danny  wrote:

> +
> +int
> +rte_eth_dev_rx_queue_intr_enable(uint8_t port_id,
> + uint16_t queue_id)
> +{
> + struct rte_eth_dev *dev;
> +
> + if (port_id >= nb_ports) {
> + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> + return (-ENODEV);
> + }
> +
> + dev = &rte_eth_devices[port_id];
> + if (dev == NULL) {
> + PMD_DEBUG_TRACE("Invalid port device\n");
> + return (-ENODEV);
> + }
> +
> + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_intr_enable, -ENOTSUP);
> + (*dev->dev_ops->rx_queue_intr_enable)(dev, queue_id);
> + return 0;

The interrupt setup might fail for device-specific reasons.
You should give the device-specific function a chance to
return an error as well.
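
A minimal sketch of that suggestion, assuming the driver callback is changed to
return int (0 on success, negative errno on failure). The demo_* types and
functions are hypothetical, simplified stand-ins for the ethdev structures, not
the actual DPDK definitions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the ethdev structures. */
struct demo_dev;

struct demo_dev_ops {
	/* Assumed signature: 0 on success, negative errno on failure. */
	int (*rx_queue_intr_enable)(struct demo_dev *dev, uint16_t queue_id);
};

struct demo_dev {
	const struct demo_dev_ops *dev_ops;
};

/* Enable the RX interrupt and hand the driver's result back to the caller. */
static int
demo_rx_queue_intr_enable(struct demo_dev *dev, uint16_t queue_id)
{
	if (dev == NULL || dev->dev_ops == NULL)
		return -ENODEV;
	if (dev->dev_ops->rx_queue_intr_enable == NULL)
		return -ENOTSUP;

	/* Propagate the device-specific return value instead of returning 0. */
	return dev->dev_ops->rx_queue_intr_enable(dev, queue_id);
}

/* Example driver callback that rejects queues it cannot service. */
static int
demo_driver_enable(struct demo_dev *dev, uint16_t queue_id)
{
	(void)dev;
	return queue_id < 4 ? 0 : -EINVAL;
}
```

With this shape, a driver that cannot set up the interrupt for a given queue
reports the failure to the application rather than having it masked by an
unconditional "return 0".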


[dpdk-dev] [PATCH] maintainers: claim responsibility for VMXNET3 PMD

2015-02-03 Thread Yong Wang
Signed-off-by: Yong Wang 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9a63714..377aa8a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -206,6 +206,7 @@ F: examples/vhost/
 F: doc/guides/sample_app_ug/vhost.rst

 VMware vmxnet3
+M: Yong Wang 
 F: lib/librte_pmd_vmxnet3/
 F: doc/guides/prog_guide/poll_mode_drv_paravirtual_vmxnets_nic.rst

-- 
1.9.1