[dpdk-dev] Fortville Firmware

2015-02-23 Thread Igor Ryzhov
Hello.

We are testing new X710/XL710 cards and are seeing a problem: the cards drop
a significant number of packets without indicating the reason in any of their
counters. Our cards' firmware version is 4.22, but some guides suggest
4.2.4. We would like to try the cards with a different firmware version,
but we can't find it. Can anybody help us get firmware version 4.2.4?
Thank you.

Regards,
Igor Ryzhov


[dpdk-dev] Testpmd returns error.

2015-02-23 Thread Tetsuya Mukawa
On 2015/02/23 5:46, Bruce Richardson wrote:
> On Sun, Feb 22, 2015 at 02:30:02PM +0900, Tetsuya Mukawa wrote:
>> Hi,
>>
>> In my environment, testpmd in the latest master branch returns an error like below.
>>
>> $ sudo ./tools/dpdk_nic_bind.py -b igb_uio :02:00.0
>> $ sudo ./x86_64-native-linuxapp-gcc/app/testpmd -c f -n 1 -- -i
>> EAL: Detected lcore 0 as core 0 on socket 0
>> EAL: Detected lcore 1 as core 1 on socket 0
>> EAL: Detected lcore 2 as core 2 on socket 0
>> EAL: Detected lcore 3 as core 3 on socket 0
>> EAL: Detected lcore 4 as core 4 on socket 0
>> EAL: Detected lcore 5 as core 5 on socket 0
>> EAL: Detected lcore 6 as core 6 on socket 0
>> EAL: Detected lcore 7 as core 7 on socket 0
>> EAL: Support maximum 128 logical core(s) by configuration.
>> EAL: Detected 8 lcore(s)
>> EAL: VFIO modules not all loaded, skip VFIO support...
>> EAL: Setting up memory...
>> EAL: Ask a virtual area of 0x28000 bytes
>> EAL: Virtual area found at 0x7ffd4000 (size = 0x28000)
>> EAL: Requesting 10 pages of size 1024MB from socket 0
>> EAL: TSC frequency is ~3991450 KHz
>> EAL: Master core 0 is ready (tid=f7fd6840)
>> PMD: ENICPMD trace: rte_enic_pmd_init
>> EAL: Core 3 is ready (tid=f58e0700)
>> EAL: Core 2 is ready (tid=f60e1700)
>> EAL: Core 1 is ready (tid=f68e2700)
>> EAL: PCI device :02:00.0 on NUMA socket -1
>> EAL:   probe driver: 8086:10b9 rte_em_pmd
>> EAL:   PCI memory mapped at 0x7fffc000
>> EAL: pci_map_resource(): cannot mmap(23, 0x7fffc002, 0x2,
>> 0x1000): Invalid argument (0x)
>> EAL: Error - exiting with code: 1
>>   Cause: Requested device :02:00.0 cannot be used
>>
>>
>> I've run git-bisect, and it seems the following commit causes this error.
>>
>> commit 4a499c64959074ba6fa6a5a2b3a2a6aa10627fa1
>> Author: Danny Zhou 
>> Date:   Fri Feb 20 16:59:15 2015 +
>>
>> eal/linux: enable uio_pci_generic support
>>
>> Someone, could you please check it?
>>
>> Thanks,
>> Tetsuya
>>
> Hi Tetsuya,
>
> trying to reproduce the problem here, with no success so far with a mix of 1G
> and 10G ports. Is there anything special about your environment that might 
> especially trigger this issue? Is it a VM or running on the host machine etc.?

Hi Bruce,

I appreciate your testing.

I tried it on another system, and I couldn't reproduce it there.
Please see the details of both systems below.

- The system on which I can reproduce the issue
OS: Ubuntu14.04
Kernel: Linux eris 3.13.0-30-generic
CPU: AMD FX(tm)-8350 Eight-Core Processor
NIC: Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet
Controller (Copper) (rev 06)

- The system on which I cannot reproduce the issue
OS: ubuntu14.04
Kernel: Linux ubuntu-igel 3.13.0-30-generic
CPU: Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
NIC: Ethernet controller: Intel Corporation I350 Gigabit Network
Connection (rev 01)

I will look into it further in the next testing phase.
Until then, I will use the system on which I cannot reproduce the issue.

Regards,
Tetsuya

> Regards,
> /Bruce
>




[dpdk-dev] [PATCH v2 00/11] qemu vhost-user support

2015-02-23 Thread Tetsuya Mukawa
On 2015/02/12 14:07, Huawei Xie wrote:
> vhost-user supports passing vring information to a separate vhost enabled
> user space process, normally a user space vSwitch, through unix domain socket.
>
> In previous DPDK version, we implement a user space character device driver
> vhost-cuse in user space DPDK process. vring information is passed to the
> cuse driver through ioctl call, including eventfds for interrupt injection and
> host notification. A kernel module is developed to copy these fds from
> qemu process into our process. We also need some trick to map guest memory.
> (TODO: kickfd/callfd is reversed which causes confusion)
>
> known issue in vhost-user implementation in QEMU, reported by haifeng.lin at 
> huawei.com
> * QEMU doesn't send correct memory region information with multiple numa node 
> configuration
> http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg01454.html
>
> Thanks Tetsuya for reporting the issue that "FD_ISSET would crash when 
> receive -1
> as fd on Ubuntu 14.04".
>
> Huawei Xie (11):
>  enable VIRTIO_NET_F_CTRL_RX
>  create vhost_cuse directory and move vhost-net-cdev.c into vhost_cuse
>  rename vhost-net-cdev.h to vhost-net.h
>  move fd copying(from qemu process into vhost process) to eventfd_copy.c
>  copy host_memory_map from virtio-net.c to a new file virtio-net-cdev.c
>  make host_memory_map a more generic function.
>  implement cuse_set_memory_table in virtio-net-cdev.c
>  add select based event driven processing
>  vhost user support
>  support dev->ifname
>  support calling rte_vhost_driver_register after 
> rte_vhost_driver_session_start
>
>  lib/librte_vhost/Makefile |   8 +-
>  lib/librte_vhost/rte_virtio_net.h |   5 +-
>  lib/librte_vhost/vhost-net-cdev.c | 389 
>  lib/librte_vhost/vhost-net-cdev.h | 113 --
>  lib/librte_vhost/vhost-net.h  | 118 +++
>  lib/librte_vhost/vhost_cuse/eventfd_copy.c|  88 +
>  lib/librte_vhost/vhost_cuse/eventfd_copy.h|  39 ++
>  lib/librte_vhost/vhost_cuse/vhost-net-cdev.c  | 417 ++
>  lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 423 ++
>  lib/librte_vhost/vhost_cuse/virtio-net-cdev.h |  48 +++
>  lib/librte_vhost/vhost_rxtx.c |   2 +-
>  lib/librte_vhost/vhost_user/fd_man.c  | 258 ++
>  lib/librte_vhost/vhost_user/fd_man.h  |  67 
>  lib/librte_vhost/vhost_user/vhost-net-user.c  | 472 +
>  lib/librte_vhost/vhost_user/vhost-net-user.h  | 106 ++
>  lib/librte_vhost/vhost_user/virtio-net-user.c | 314 
>  lib/librte_vhost/vhost_user/virtio-net-user.h |  49 +++
>  lib/librte_vhost/virtio-net.c | 491 
> ++
>  lib/librte_vhost/virtio-net.h |  43 +++
>  19 files changed, 2491 insertions(+), 959 deletions(-)
>  delete mode 100644 lib/librte_vhost/vhost-net-cdev.c
>  delete mode 100644 lib/librte_vhost/vhost-net-cdev.h
>  create mode 100644 lib/librte_vhost/vhost-net.h
>  create mode 100644 lib/librte_vhost/vhost_cuse/eventfd_copy.c
>  create mode 100644 lib/librte_vhost/vhost_cuse/eventfd_copy.h
>  create mode 100644 lib/librte_vhost/vhost_cuse/vhost-net-cdev.c
>  create mode 100644 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
>  create mode 100644 lib/librte_vhost/vhost_cuse/virtio-net-cdev.h
>  create mode 100644 lib/librte_vhost/vhost_user/fd_man.c
>  create mode 100644 lib/librte_vhost/vhost_user/fd_man.h
>  create mode 100644 lib/librte_vhost/vhost_user/vhost-net-user.c
>  create mode 100644 lib/librte_vhost/vhost_user/vhost-net-user.h
>  create mode 100644 lib/librte_vhost/vhost_user/virtio-net-user.c
>  create mode 100644 lib/librte_vhost/vhost_user/virtio-net-user.h
>  create mode 100644 lib/librte_vhost/virtio-net.h
>
Acked-by: Tetsuya Mukawa 


[dpdk-dev] [PATCH v11 00/13] Port Hotplug Framework

2015-02-23 Thread Tetsuya Mukawa
This patch series adds a dynamic port hotplug framework to DPDK.
With the patches, DPDK apps can attach or detach ports at runtime.

The basic concept of the port hotplug is as follows.
- DPDK apps are responsible for managing ports.
  Only DPDK apps know which ports are attached or detached at the moment.
  The port hotplug framework is implemented to allow DPDK apps to manage ports.
  For example, when a DPDK app calls the port attach function, the attached
  port number is returned. A DPDK app can also detach a port by port number.
- Kernel support is needed for attaching or detaching physical device ports.
  To attach a new physical device port, the device must first be recognized
  by a userspace direct I/O framework in the kernel. Then DPDK apps can
  call the port hotplug functions to attach ports.
  For detaching, the steps are reversed.
- Before detaching ports, the ports must be stopped and closed.
  A DPDK application must call rte_eth_dev_stop() and rte_eth_dev_close() before
  detaching ports. These functions call the finalization code of PMDs.
  But so far, no PMD frees all resources allocated by initialization.
  This means PMDs need to be fixed to support port hotplug.
  'RTE_PCI_DRV_DETACHABLE' is a new flag indicating a PMD supports detaching.
  Without this flag, detaching will fail.
- Legacy DPDK apps must not be affected.
  No DPDK EAL behavior is changed if the port hotplug functions aren't called.
  So all legacy DPDK apps can still work without modifications.

There are also a few limitations.
- The port hotplug functions are not thread safe.
  DPDK apps should handle it.
- Only Linux and igb_uio are supported so far.
  BSD and VFIO are not supported. I will send VFIO patches at least, but I
  have no plan to submit a BSD patch so far.


Here are the port hotplug APIs.
---
/**
 * Attach a new device.
 *
 * @param devargs
 *   A pointer to a string describing the new device
 *   to be attached. The string should be a PCI address like
 *   ':01:00.0' or a virtual device name like 'eth_pcap0'.
 * @param port_id
 *  A pointer to a port identifier actually attached.
 * @return
 *  0 on success and port_id is filled, negative on error
 */
int rte_eal_dev_attach(const char *devargs, uint8_t *port_id);

/**
 * Detach a device.
 *
 * @param port_id
 *   The port identifier of the device to detach.
 * @param devname
 *  A pointer to a buffer that receives the name of the device actually detached.
 * @return
 *  0 on success and devname is filled, negative on error
 */
int rte_eal_dev_detach(uint8_t port_id, char *devname);
---

This patch series is for DPDK EAL. To use the port hotplug functions from DPDK
apps, each PMD must be fixed to support the 'RTE_PCI_DRV_DETACHABLE' flag.
Please check the patch for the pcap PMD as an example.

Also, please check the testpmd patch. It shows how to modify legacy
applications to support the port hotplug feature.
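
As a rough illustration of the calling order described above (attach, then
stop and close before detach), here is a minimal sketch. It does not link
against DPDK: all stub_* functions and the app_remove_port() helper are
hypothetical stand-ins. Only the attach/detach parameter lists mirror the
prototypes quoted above, and the rule that detach refuses a port that has not
been closed is an assumption of this sketch, not confirmed framework behavior.

```c
#include <stdint.h>
#include <stdio.h>

#define STUB_MAX_PORTS 4
#define STUB_NAME_LEN 32

/* Per-port state tracked by the stubs (hypothetical, not DPDK state). */
enum stub_state { STUB_FREE = 0, STUB_STARTED, STUB_STOPPED, STUB_CLOSED };

static enum stub_state stub_ports[STUB_MAX_PORTS];
static char stub_names[STUB_MAX_PORTS][STUB_NAME_LEN];

/* Stand-in for rte_eal_dev_attach(): claim the first free slot. */
static int stub_dev_attach(const char *devargs, uint8_t *port_id)
{
	uint8_t i;

	if (devargs == NULL || port_id == NULL)
		return -1;
	for (i = 0; i < STUB_MAX_PORTS; i++) {
		if (stub_ports[i] == STUB_FREE) {
			snprintf(stub_names[i], STUB_NAME_LEN, "%s", devargs);
			stub_ports[i] = STUB_STARTED; /* pretend the app started it */
			*port_id = i;
			return 0;
		}
	}
	return -1;
}

/* Stand-in for rte_eth_dev_stop(). */
static void stub_dev_stop(uint8_t port_id)
{
	if (port_id < STUB_MAX_PORTS && stub_ports[port_id] == STUB_STARTED)
		stub_ports[port_id] = STUB_STOPPED;
}

/* Stand-in for rte_eth_dev_close(). */
static void stub_dev_close(uint8_t port_id)
{
	if (port_id < STUB_MAX_PORTS && stub_ports[port_id] == STUB_STOPPED)
		stub_ports[port_id] = STUB_CLOSED;
}

/* Stand-in for rte_eal_dev_detach(): refuses ports that are not closed. */
static int stub_dev_detach(uint8_t port_id, char *devname)
{
	if (devname == NULL || port_id >= STUB_MAX_PORTS ||
	    stub_ports[port_id] != STUB_CLOSED)
		return -1;
	snprintf(devname, STUB_NAME_LEN, "%s", stub_names[port_id]);
	stub_ports[port_id] = STUB_FREE;
	return 0;
}

/* Application-side helper: stop, close, then detach, in that order. */
static int app_remove_port(uint8_t port_id, char *devname)
{
	stub_dev_stop(port_id);
	stub_dev_close(port_id);
	return stub_dev_detach(port_id, devname);
}
```

Since the hotplug functions are not thread safe, a real application would
also need to serialize calls like app_remove_port() itself, for example
from a single control thread.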

PATCH v11 changes
 - Remove needless devargs handling codes.
 - Replace get_vdev_name() by rte_eal_parse_devargs_str().
 - Replace rte_eal_vdev_find_and_init by rte_eal_vdev_init()
 - Replace rte_eal_vdev_find_and_uninit by rte_eal_vdev_uninit()
 - Fix rte_eal_dev_init() to use rte_eal_vdev_init().
 - Remove needless patch.
   (Thanks to Maxime Leroy)

PATCH v10 changes
 - Add comments.
 - Change order of version.map.
 - Fix comment of "rte_ethdev.h".
   (Thanks to Thomas Monjalon)
 - Add size parameter to rte_eth_dev_create_unique_device_name().
   (Thanks to Iremonger, Bernard)

PATCH v9 changes
 - Fix commit title.
 - Fix commit log.
 - Fix comments.
 - Define CONFIG_RTE_LIBRTE_EAL_HOTPLUG at the top of this patch series.
 - DEV_INVALID/VALID are removed.
 - DEV_DISCONNECTED is replaced by DEV_DETACHED.
 - DEV_CONNECTED is replaced by DEV_ATTACHED.
 - rte_eth_dev_allocate_new_port() is renamed to
   rte_eth_dev_find_free_port().
 - rte_eth_dev_validate_port() is renamed to rte_eth_dev_is_valid_port().
 - rte_eth_dev_is_valid_port() is changed not to handle log toggle.
 - eal_compare_pci_addr() is replaced by rte_eal_compare_pci_addr().
 - rte_eth_dev_free() is replaced by rte_eth_dev_release_port().
 - Add a function to create a unique device name.
 - Change parameter of pci_devuninit_t and rte_eth_dev_uninit.
 - Remove code that initializes the ethdev callbacks from
   rte_eth_dev_uninit().
 - Remove pci_unmap_device(). It will be implemented in later patch.
 - rte_eth_dev_check_detachable() is replaced by
   rte_eth_dev_is_detachable().
 - strncpy() is replaced by strcpy().
 - Implement pci_unmap_device() in this patch.
 - Remove "rte_dev_hotplug.h".
 - Remove needless "#ifdef".
 - Remove RTE_EAL_INVOKE_TYPE_PROBE/CLOSE.
 - RTE_ETH_DEV_PHYSICAL is replaced by RTE_ETH_DEV_PCI.
 - Use strcmp() instead of strncmp().
   (Thanks to Thomas Monjalon)
 - Change definition of rte_dev_uninit_t.
   (Thanks to Thomas Monjalon and Maxim

[dpdk-dev] [PATCH v11 01/13] eal: Enable port Hotplug framework in Linux

2015-02-23 Thread Tetsuya Mukawa
The patch adds CONFIG_RTE_LIBRTE_EAL_HOTPLUG to the Linux and BSD
configurations. So far, the hotplug functions are supported only on Linux.

v9:
- Move this patch to the top of this patch series.
  (Thanks to Thomas Monjalon)

Signed-off-by: Tetsuya Mukawa 
---
 config/common_bsdapp   | 6 ++
 config/common_linuxapp | 5 +
 2 files changed, 11 insertions(+)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index 4c0cfc0..c24f687 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -116,6 +116,12 @@ CONFIG_RTE_LIBRTE_EAL_BSDAPP=y
 CONFIG_RTE_LIBRTE_EAL_LINUXAPP=n

 #
+# Compile Environment Abstraction Layer to support hotplug
+# So far, Hotplug functions only support linux
+#
+CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n
+
+#
 # Compile Environment Abstraction Layer to support Vmware TSC map
 #
 CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 0234236..d66b008 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -114,6 +114,11 @@ CONFIG_RTE_PCI_MAX_READ_REQUEST_SIZE=0
 CONFIG_RTE_LIBRTE_EAL_LINUXAPP=y

 #
+# Compile Environment Abstraction Layer to support hotplug
+#
+CONFIG_RTE_LIBRTE_EAL_HOTPLUG=y
+
+#
 # Compile Environment Abstraction Layer to support Vmware TSC map
 #
 CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=y
-- 
1.9.1



[dpdk-dev] [PATCH v11 02/13] eal_pci: Add flag to hold kernel driver type

2015-02-23 Thread Tetsuya Mukawa
From: Michael Qiu 

Currently, DPDK has no way to know which type of kernel driver
(vfio-pci/igb_uio/uio_pci_generic) a device uses. It can only
check statically whether VFIO is enabled.

It is really useful to have the flag, because the different types need to
be handled differently at runtime, for example in PCI memory mapping,
port hotplug, and so on.

This patch adds a flag field to the PCI device structure to solve the above
issue.
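
To illustrate the idea, here is a small standalone sketch of turning the
target of a device's "driver" symlink into a driver type. The enum values
follow the rte_pt_driver enum in the diff below; the names pt_driver and
classify_driver_link() and the example link targets are made up for this
sketch. Unlike the patch, it takes an already-read link target instead of
calling readlink(), and it accepts a bare driver name without a '/'.

```c
#include <stddef.h>
#include <string.h>

/* Same numbering as enum rte_pt_driver in the patch. */
enum pt_driver {
	PT_UNKNOWN     = 0,
	PT_IGB_UIO     = 1,
	PT_VFIO        = 2,
	PT_UIO_GENERIC = 3,
};

/*
 * link_target is the NUL-terminated result of reading the "driver"
 * symlink of a PCI device; NULL means no driver is bound.
 */
static enum pt_driver classify_driver_link(const char *link_target)
{
	const char *name;

	if (link_target == NULL)
		return PT_UNKNOWN;

	/* The driver name is the last path component of the link target. */
	name = strrchr(link_target, '/');
	name = (name != NULL) ? name + 1 : link_target;

	if (strcmp(name, "vfio-pci") == 0)
		return PT_VFIO;
	if (strcmp(name, "igb_uio") == 0)
		return PT_IGB_UIO;
	if (strcmp(name, "uio_pci_generic") == 0)
		return PT_UIO_GENERIC;
	return PT_UNKNOWN;
}
```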

Signed-off-by: Michael Qiu 
Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/include/rte_pci.h |  8 +
 lib/librte_eal/linuxapp/eal/eal_pci.c   | 53 +++--
 2 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 4301c16..5e0ba00 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -142,6 +142,13 @@ struct rte_pci_addr {

 struct rte_devargs;

+enum rte_pt_driver {
+   RTE_PT_UNKNOWN  = 0,
+   RTE_PT_IGB_UIO  = 1,
+   RTE_PT_VFIO = 2,
+   RTE_PT_UIO_GENERIC  = 3,
+};
+
 /**
  * A structure describing a PCI device.
  */
@@ -155,6 +162,7 @@ struct rte_pci_device {
uint16_t max_vfs;   /**< sriov enable if not zero */
int numa_node;  /**< NUMA node connection */
struct rte_devargs *devargs;/**< Device user arguments */
+   enum rte_pt_driver pt_driver;   /**< Driver of passthrough */
 };

 /** Any PCI device identifier (vendor, device, ...) */
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 63bcbce..9fe2851 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -97,6 +97,35 @@ error:
return -1;
 }

+static int
+pci_get_kernel_driver_by_path(const char *filename, char *dri_name)
+{
+   int count;
+   char path[PATH_MAX];
+   char *name;
+
+   if (!filename || !dri_name)
+   return -1;
+
+   count = readlink(filename, path, PATH_MAX);
+   if (count >= PATH_MAX)
+   return -1;
+
+   /* For device does not have a driver */
+   if (count < 0)
+   return 1;
+
+   path[count] = '\0';
+
+   name = strrchr(path, '/');
+   if (name) {
+   strncpy(dri_name, name + 1, strlen(name + 1) + 1);
+   return 0;
+   }
+
+   return -1;
+}
+
 void *
 pci_find_max_end_va(void)
 {
@@ -220,11 +249,12 @@ pci_scan_one(const char *dirname, uint16_t domain, uint8_t bus,
char filename[PATH_MAX];
unsigned long tmp;
struct rte_pci_device *dev;
+   char driver[PATH_MAX];
+   int ret;

dev = malloc(sizeof(*dev));
-   if (dev == NULL) {
+   if (dev == NULL)
return -1;
-   }

memset(dev, 0, sizeof(*dev));
dev->addr.domain = domain;
@@ -303,6 +333,25 @@ pci_scan_one(const char *dirname, uint16_t domain, uint8_t bus,
return -1;
}

+   /* parse driver */
+   snprintf(filename, sizeof(filename), "%s/driver", dirname);
+   ret = pci_get_kernel_driver_by_path(filename, driver);
+   if (!ret) {
+   if (!strcmp(driver, "vfio-pci"))
+   dev->pt_driver = RTE_PT_VFIO;
+   else if (!strcmp(driver, "igb_uio"))
+   dev->pt_driver = RTE_PT_IGB_UIO;
+   else if (!strcmp(driver, "uio_pci_generic"))
+   dev->pt_driver = RTE_PT_UIO_GENERIC;
+   else
+   dev->pt_driver = RTE_PT_UNKNOWN;
+   } else if (ret < 0) {
+   RTE_LOG(ERR, EAL, "Fail to get kernel driver\n");
+   free(dev);
+   return -1;
+   } else
+   dev->pt_driver = RTE_PT_UNKNOWN;
+
/* device is valid, add in list (sorted) */
if (TAILQ_EMPTY(&pci_device_list)) {
TAILQ_INSERT_TAIL(&pci_device_list, dev, next);
-- 
1.9.1



[dpdk-dev] [PATCH v11 03/13] eal_pci: pci memory map work with driver type

2015-02-23 Thread Tetsuya Mukawa
From: Michael Qiu 

With the driver type flag in struct rte_pci_device, we no longer need
to always try the VFIO-related mapping function on uio devices when
VFIO is enabled.
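
The resulting dispatch can be sketched on its own as follows. This is a
simplified illustration, not the patch itself: fake_vfio_map() and
fake_uio_map() are hypothetical stand-ins for pci_vfio_map_resource() and
pci_uio_map_resource(), and the vfio_enabled argument stands in for
pci_vfio_is_enabled(). The return convention (0 on success, negative on
error, 1 when the device is skipped) follows the diff below.

```c
enum pt_driver {
	PT_UNKNOWN     = 0,
	PT_IGB_UIO     = 1,
	PT_VFIO        = 2,
	PT_UIO_GENERIC = 3,
};

/* Call counters so the dispatch can be observed from outside. */
static int vfio_map_calls;
static int uio_map_calls;

/* Hypothetical stand-in for the VFIO mapping function. */
static int fake_vfio_map(void)
{
	vfio_map_calls++;
	return 0;
}

/* Hypothetical stand-in for the uio mapping function. */
static int fake_uio_map(void)
{
	uio_map_calls++;
	return 0;
}

/* Pick the mapping path from the kernel driver the device is bound to. */
static int map_device(enum pt_driver drv, int vfio_enabled)
{
	int ret = -1;

	switch (drv) {
	case PT_VFIO:
		/* stays -1 if the device is bound to VFIO but VFIO is unusable */
		if (vfio_enabled)
			ret = fake_vfio_map();
		break;
	case PT_IGB_UIO:
	case PT_UIO_GENERIC:
		ret = fake_uio_map();
		break;
	default:
		/* not managed by a known passthrough driver: skip the device */
		ret = 1;
		break;
	}
	return ret;
}
```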

Signed-off-by: Michael Qiu 
Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/linuxapp/eal/eal_pci.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 9fe2851..c04f897 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -554,25 +554,29 @@ pci_config_space_set(struct rte_pci_device *dev)
 static int
 pci_map_device(struct rte_pci_device *dev)
 {
-   int ret, mapped = 0;
+   int ret = -1;

/* try mapping the NIC resources using VFIO if it exists */
+   switch (dev->pt_driver) {
+   case RTE_PT_VFIO:
 #ifdef VFIO_PRESENT
-   if (pci_vfio_is_enabled()) {
-   ret = pci_vfio_map_resource(dev);
-   if (ret == 0)
-   mapped = 1;
-   else if (ret < 0)
-   return ret;
-   }
+   if (pci_vfio_is_enabled())
+   ret = pci_vfio_map_resource(dev);
 #endif
-   /* map resources for devices that use uio_pci_generic or igb_uio */
-   if (!mapped) {
+   break;
+   case RTE_PT_IGB_UIO:
+   case RTE_PT_UIO_GENERIC:
+   /* map resources for devices that use uio */
ret = pci_uio_map_resource(dev);
-   if (ret != 0)
-   return ret;
+   break;
+   default:
+   RTE_LOG(DEBUG, EAL, "  Not managed by known pt driver,"
+   " skipped\n");
+   ret = 1;
+   break;
}
-   return 0;
+
+   return ret;
 }

 /*
-- 
1.9.1



[dpdk-dev] [PATCH v11 04/13] eal/pci, ethdev: Remove assumption that port will not be detached

2015-02-23 Thread Tetsuya Mukawa
This patch removes the assumption that a port will never be detached.

It adds "RTE_PCI_DRV_DETACHABLE" to drv_flags of the rte_pci_driver
structure. The flag indicates that the driver can detach devices at runtime.

To remove the assumption:
- Add an 'attached' member to the rte_eth_dev structure.
  This member indicates whether the port is attached or not.
  DEV_ATTACHED indicates a port is attached.
  DEV_DETACHED indicates a port is detached.
- Add rte_eth_dev_find_free_port().
  This function is used for finding a free slot for a new port.
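
The following standalone sketch shows why a per-slot 'attached' flag is
needed once ports can be released. It is a simplified model, not the ethdev
code: MAX_PORTS, struct port_slot and the port_*() helpers are hypothetical
names, and only DEV_DETACHED/DEV_ATTACHED are taken from the patch.

```c
#include <stdint.h>
#include <stdio.h>

#define MAX_PORTS 4
#define NAME_LEN 32

enum { DEV_DETACHED = 0, DEV_ATTACHED };

struct port_slot {
	int attached;        /* DEV_ATTACHED or DEV_DETACHED */
	char name[NAME_LEN];
};

static struct port_slot ports[MAX_PORTS];
static unsigned int nb_ports; /* number of attached ports, not the next index */

/* Return the first detached slot, or MAX_PORTS if the table is full. */
static uint8_t find_free_port(void)
{
	uint8_t i;

	for (i = 0; i < MAX_PORTS; i++) {
		if (ports[i].attached == DEV_DETACHED)
			return i;
	}
	return MAX_PORTS;
}

/* Attach a port by name; returns the port id, or -1 on error. */
static int port_allocate(const char *name)
{
	uint8_t id = find_free_port();

	if (name == NULL || id == MAX_PORTS)
		return -1;
	snprintf(ports[id].name, NAME_LEN, "%s", name);
	ports[id].attached = DEV_ATTACHED;
	nb_ports++;
	return id;
}

/* A port id is valid only if it is in range and its slot is attached. */
static int port_is_valid(uint8_t id)
{
	return id < MAX_PORTS && ports[id].attached == DEV_ATTACHED;
}

/* Detach a port; its slot becomes reusable by the next allocation. */
static int port_release(uint8_t id)
{
	if (!port_is_valid(id))
		return -1;
	ports[id].attached = DEV_DETACHED;
	nb_ports--;
	return 0;
}
```

After allocating three ports and releasing port 1, nb_ports is 2 while
port 2 is still attached, so the old "port_id >= nb_ports" check would
wrongly reject port 2. The next allocation reuses slot 1.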

v9:
- DEV_INVALID/VALID are removed.
- DEV_DISCONNECTED is replaced by DEV_DETACHED.
- DEV_CONNECTED is replaced by DEV_ATTACHED.
- rte_eth_dev_allocate_new_port() is renamed to
  rte_eth_dev_find_free_port().
- rte_eth_dev_validate_port() is renamed to rte_eth_dev_is_valid_port().
- rte_eth_dev_is_valid_port() is changed not to handle log toggle.
- Fix commit log to describe DEV_ATTACHED and DEV_DETACHED.
  (Thanks to Thomas Monjalon)
v8:
- NONE_TRACE is changed to NO_TRACE.
  (Thanks to Iremonger, Bernard)
v5:
- Change parameters of rte_eth_dev_validate_port() to cleanup code.
v4:
- Use braces with 'for' loop.
- Fix indent of 'if' statement.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/include/rte_pci.h |   2 +
 lib/librte_ether/rte_ethdev.c   | 248 
 lib/librte_ether/rte_ethdev.h   |   5 +
 3 files changed, 164 insertions(+), 91 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 5e0ba00..ffd13d9 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -210,6 +210,8 @@ struct rte_pci_driver {
 #define RTE_PCI_DRV_FORCE_UNBIND 0x0004
 /** Device driver supports link state interrupt */
 #define RTE_PCI_DRV_INTR_LSC   0x0008
+/** Device driver supports detaching capability */
+#define RTE_PCI_DRV_DETACHABLE 0x0010

 /**< Internal use only - Macro used by pci addr parsing functions **/
 #define GET_PCIADDR_FIELD(in, fd, lim, dlm)   \
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 27bbb0b..0e1e5c9 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -175,6 +175,11 @@ enum {
STAT_QMAP_RX
 };

+enum {
+   DEV_DETACHED = 0,
+   DEV_ATTACHED
+};
+
 static inline void
 rte_eth_dev_data_alloc(void)
 {
@@ -201,19 +206,34 @@ rte_eth_dev_allocated(const char *name)
 {
unsigned i;

-   for (i = 0; i < nb_ports; i++) {
-   if (strcmp(rte_eth_devices[i].data->name, name) == 0)
+   for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
+   if ((rte_eth_devices[i].attached == DEV_ATTACHED) &&
+   strcmp(rte_eth_devices[i].data->name, name) == 0)
return &rte_eth_devices[i];
}
return NULL;
 }

+static uint8_t
+rte_eth_dev_find_free_port(void)
+{
+   unsigned i;
+
+   for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
+   if (rte_eth_devices[i].attached == DEV_DETACHED)
+   return i;
+   }
+   return RTE_MAX_ETHPORTS;
+}
+
 struct rte_eth_dev *
 rte_eth_dev_allocate(const char *name)
 {
+   uint8_t port_id;
struct rte_eth_dev *eth_dev;

-   if (nb_ports == RTE_MAX_ETHPORTS) {
+   port_id = rte_eth_dev_find_free_port();
+   if (port_id == RTE_MAX_ETHPORTS) {
PMD_DEBUG_TRACE("Reached maximum number of Ethernet ports\n");
return NULL;
}
@@ -226,10 +246,12 @@ rte_eth_dev_allocate(const char *name)
return NULL;
}

-   eth_dev = &rte_eth_devices[nb_ports];
-   eth_dev->data = &rte_eth_dev_data[nb_ports];
+   eth_dev = &rte_eth_devices[port_id];
+   eth_dev->data = &rte_eth_dev_data[port_id];
snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
-   eth_dev->data->port_id = nb_ports++;
+   eth_dev->data->port_id = port_id;
+   eth_dev->attached = DEV_ATTACHED;
+   nb_ports++;
return eth_dev;
 }

@@ -283,6 +305,7 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
(unsigned) pci_dev->id.device_id);
if (rte_eal_process_type() == RTE_PROC_PRIMARY)
rte_free(eth_dev->data->dev_private);
+   eth_dev->attached = DEV_DETACHED;
nb_ports--;
return diag;
 }
@@ -308,10 +331,20 @@ rte_eth_driver_register(struct eth_driver *eth_drv)
    rte_eal_pci_register(&eth_drv->pci_drv);
 }

+static int
+rte_eth_dev_is_valid_port(uint8_t port_id)
+{
+   if (port_id >= RTE_MAX_ETHPORTS ||
+   rte_eth_devices[port_id].attached != DEV_ATTACHED)
+   return 0;
+   else
+   return 1;
+}
+
 int
 rte_eth_dev_socket_id(uint8_t port_id)
 {
-   if (port_id >= nb_ports)
+   if (!rte_eth_dev_is_valid_port(port_id))
return -1;
return rte_eth_device

[dpdk-dev] [PATCH v11 05/13] eal/pci: Consolidate pci address comparison APIs

2015-02-23 Thread Tetsuya Mukawa
This patch replaces pci_addr_comparison() and memcmp() of PCI addresses with
rte_eal_compare_pci_addr().

rte_eal_compare_pci_addr() doesn't use memcmp() to compare PCI addresses.
This is because sizeof(struct rte_pci_addr) returns 6, but the members of
the structure occupy only 5 bytes, as shown below.

struct rte_pci_addr {
uint16_t domain;/**< Device domain */
uint8_t bus;/**< Device bus */
uint8_t devid;  /**< Device ID */
uint8_t function;   /**< Device function. */
};

If the structure is dynamically allocated in a function without bzero, the
last byte (padding) may hold an arbitrary value. As a result, memcmp may
report a difference between two equal addresses.
To avoid such a case, rte_eal_compare_pci_addr() compares the following
value, computed for each address.

dev_addr = (addr->domain << 24) | (addr->bus << 16) |
(addr->devid << 8) | addr->function;
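
A standalone sketch of this field-wise comparison is shown below. The names
pci_addr, pci_addr_key() and compare_pci_addr() are made up for
illustration; the casts to uint64_t before shifting and the -1 return for
NULL arguments are choices of this sketch, not necessarily what the patch
does.

```c
#include <stddef.h>
#include <stdint.h>

/* Same members as struct rte_pci_addr: 5 bytes of data, typically 6 with padding. */
struct pci_addr {
	uint16_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

/* Pack the four fields into one integer; padding bytes are never read. */
static uint64_t pci_addr_key(const struct pci_addr *a)
{
	return ((uint64_t)a->domain << 24) | ((uint64_t)a->bus << 16) |
	       ((uint64_t)a->devid << 8) | (uint64_t)a->function;
}

/*
 * Returns 0 if both addresses are equal, 1 if a sorts after b,
 * and -1 if a sorts before b (or if either pointer is NULL).
 */
static int compare_pci_addr(const struct pci_addr *a, const struct pci_addr *b)
{
	uint64_t ka, kb;

	if (a == NULL || b == NULL)
		return -1;

	ka = pci_addr_key(a);
	kb = pci_addr_key(b);
	if (ka > kb)
		return 1;
	if (ka < kb)
		return -1;
	return 0;
}
```

Two structures filled with different garbage before their fields are set
still compare as equal here, which is the case memcmp() can get wrong.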

v9:
- eal_compare_pci_addr() is replaced by rte_eal_compare_pci_addr().
- Fix commit log.
  (Thanks to Thomas Monjalon)
v8:
- Fix pci_scan_one() to update sysfs values.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v5:
- Fix pci_scan_one to handle pt_driver correctly.
v4:
- Fix calculation method of eal_compare_pci_addr().
- Add parameter checking.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/bsdapp/eal/eal_pci.c   | 29 --
 lib/librte_eal/common/eal_common_pci.c|  2 +-
 lib/librte_eal/common/include/rte_pci.h   | 34 +++
 lib/librte_eal/linuxapp/eal/eal_pci.c | 30 +--
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c |  2 +-
 5 files changed, 63 insertions(+), 34 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 74ecce7..9193f80 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -270,20 +270,6 @@ pci_uio_map_resource(struct rte_pci_device *dev)
return (0);
 }

-/* Compare two PCI device addresses. */
-static int
-pci_addr_comparison(struct rte_pci_addr *addr, struct rte_pci_addr *addr2)
-{
-   uint64_t dev_addr = (addr->domain << 24) + (addr->bus << 16) + (addr->devid << 8) + addr->function;
-   uint64_t dev_addr2 = (addr2->domain << 24) + (addr2->bus << 16) + (addr2->devid << 8) + addr2->function;
-
-   if (dev_addr > dev_addr2)
-   return 1;
-   else
-   return 0;
-}
-
-
 /* Scan one pci sysfs entry, and fill the devices list from it. */
 static int
 pci_scan_one(int dev_pci_fd, struct pci_conf *conf)
@@ -356,13 +342,24 @@ pci_scan_one(int dev_pci_fd, struct pci_conf *conf)
}
else {
struct rte_pci_device *dev2 = NULL;
+   int ret;

TAILQ_FOREACH(dev2, &pci_device_list, next) {
-   if (pci_addr_comparison(&dev->addr, &dev2->addr))
+   ret = rte_eal_compare_pci_addr(&dev->addr, &dev2->addr);
+   if (ret > 0)
continue;
-   else {
+   else if (ret < 0) {
TAILQ_INSERT_BEFORE(dev2, dev, next);
return 0;
+   } else { /* already registered */
+   /* update pt_driver */
+   dev2->pt_driver = dev->pt_driver;
+   dev2->max_vfs = dev->max_vfs;
+   memmove(dev2->mem_resource,
+   dev->mem_resource,
+   sizeof(dev->mem_resource));
+   free(dev);
+   return 0;
}
}
TAILQ_INSERT_TAIL(&pci_device_list, dev, next);
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index f3c7f71..bf2793f 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -93,7 +93,7 @@ static struct rte_devargs *pci_devargs_lookup(struct rte_pci_device *dev)
if (devargs->type != RTE_DEVTYPE_BLACKLISTED_PCI &&
devargs->type != RTE_DEVTYPE_WHITELISTED_PCI)
continue;
-   if (!memcmp(&dev->addr, &devargs->pci.addr, sizeof(dev->addr)))
+   if (!rte_eal_compare_pci_addr(&dev->addr, &devargs->pci.addr))
return devargs;
}
return NULL;
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index ffd13d9..6814e91 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -272,6 +272,40 @@ eal_parse_pci_DomBDF(const char *input, struct rte_pci_addr *dev_addr)
 }
 #undef GET_PCIADDR_FIELD

+/* Compare two PCI device addresses. */
+/**
+ * Utility function to compare two PCI device addresses.
+ *
+ 

[dpdk-dev] [PATCH v11 06/13] ethdev: Add rte_eth_dev_release_port to release specified port

2015-02-23 Thread Tetsuya Mukawa
This patch adds rte_eth_dev_release_port(). The function changes the
attached status of the specified device to detached.

v9:
- rte_eth_dev_free() is replaced by rte_eth_dev_release_port().
  (Thanks to Thomas Monjalon)
v6:
- Use rte_eth_dev structure as the parameter of rte_eth_dev_free().
v4:
- Add parameter checking.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_ether/rte_ethdev.c | 11 +++
 lib/librte_ether/rte_ethdev.h | 12 
 2 files changed, 23 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 0e1e5c9..8d271ae 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -255,6 +255,17 @@ rte_eth_dev_allocate(const char *name)
return eth_dev;
 }

+int
+rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
+{
+   if (eth_dev == NULL)
+   return -EINVAL;
+
+   eth_dev->attached = 0;
+   nb_ports--;
+   return 0;
+}
+
 static int
 rte_eth_dev_init(struct rte_pci_driver *pci_drv,
 struct rte_pci_device *pci_dev)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index ef31bda..8016a51 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1487,6 +1487,18 @@ extern uint8_t rte_eth_dev_count(void);
  */
 struct rte_eth_dev *rte_eth_dev_allocate(const char *name);

+/**
+ * Function for internal use by dummy drivers primarily, e.g. ring-based
+ * driver.
+ * Release the specified ethdev port.
+ *
+ * @param eth_dev
+ * The *eth_dev* pointer is the address of the *rte_eth_dev* structure.
+ * @return
+ *   - 0 on success, negative on error
+ */
+int rte_eth_dev_release_port(struct rte_eth_dev *eth_dev);
+
 struct eth_driver;
 /**
  * @internal
-- 
1.9.1



[dpdk-dev] [PATCH v11 07/13] eal, ethdev: Add a function and function pointers to close ether device

2015-02-23 Thread Tetsuya Mukawa
The patch adds function pointers to the rte_pci_driver and eth_driver
structures. These function pointers are used when ports are detached.
The patch also adds rte_eth_dev_uninit(). So far, nothing calls it,
but it will be called once the port hotplug function is
implemented.
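
The pattern of an optional uninit callback guarded by a capability flag can
be sketched as follows. This is a simplified model rather than the EAL code:
struct driver, struct device, device_detach() and sample_uninit() are
hypothetical names, DRV_DETACHABLE reuses the bit value of
RTE_PCI_DRV_DETACHABLE from patch 04/13, and returning -ENOTSUP for a
non-detachable driver is an assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DRV_DETACHABLE 0x0010u /* same bit value as RTE_PCI_DRV_DETACHABLE */

struct device;

/* Uninit callback type, analogous to pci_devuninit_t in the patch. */
typedef int (dev_uninit_t)(struct device *);

struct driver {
	const char *name;
	dev_uninit_t *uninit; /* may be NULL if the driver has nothing to undo */
	uint32_t flags;
};

struct device {
	struct driver *drv;
	int attached;
};

static int uninit_calls;

/* Example callback a hot-pluggable driver might register. */
static int sample_uninit(struct device *dev)
{
	(void)dev;
	uninit_calls++;
	return 0;
}

/* Detach a device: check the flag, run the callback, then release it. */
static int device_detach(struct device *dev)
{
	int ret;

	if (dev == NULL || dev->drv == NULL)
		return -EINVAL;
	if (!(dev->drv->flags & DRV_DETACHABLE))
		return -ENOTSUP; /* driver did not opt in to detaching */

	if (dev->drv->uninit != NULL) {
		ret = dev->drv->uninit(dev);
		if (ret != 0)
			return ret; /* leave the device attached on failure */
	}

	dev->attached = 0;
	dev->drv = NULL;
	return 0;
}
```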

v10:
- Add size parameter to rte_eth_dev_create_unique_device_name().
  (Thanks to Iremonger, Bernard)
v9:
- Change parameter of pci_devuninit_t and rte_eth_dev_uninit.
- Remove code that initializes the ethdev callbacks from
  rte_eth_dev_uninit().
- Add a function to create a unique device name.
  (Thanks to Thomas Monjalon)
v6:
- Fix rte_eth_dev_uninit() to handle a return value of uninit
  function of PMD.
v4:
- Add parameter checking.
- Change function names.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/include/rte_pci.h |  6 
 lib/librte_ether/rte_ethdev.c   | 64 +++--
 lib/librte_ether/rte_ethdev.h   | 24 +
 3 files changed, 92 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 6814e91..4ea57cb 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -192,12 +192,18 @@ struct rte_pci_driver;
 typedef int (pci_devinit_t)(struct rte_pci_driver *, struct rte_pci_device *);

 /**
+ * Uninitialisation function for the driver called during hotplugging.
+ */
+typedef int (pci_devuninit_t)(struct rte_pci_device *);
+
+/**
  * A structure describing a PCI driver.
  */
 struct rte_pci_driver {
TAILQ_ENTRY(rte_pci_driver) next;   /**< Next in list. */
const char *name;   /**< Driver name. */
pci_devinit_t *devinit; /**< Device init. function. */
+   pci_devuninit_t *devuninit; /**< Device uninit function. */
    struct rte_pci_id *id_table;/**< ID table, NULL terminated. */
    uint32_t drv_flags; /**< Flags contolling handling of device. */
 };
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 8d271ae..3d148e2 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -266,6 +266,24 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
return 0;
 }

+static inline int
+rte_eth_dev_create_unique_device_name(char *name, size_t size,
+   struct rte_pci_device *pci_dev)
+{
+   int ret;
+
+   if ((name == NULL) || (pci_dev == NULL))
+   return -EINVAL;
+
+   ret = snprintf(name, size, "%d:%d.%d",
+   pci_dev->addr.bus, pci_dev->addr.devid,
+   pci_dev->addr.function);
+   if (ret < 0)
+   return ret;
+
+   return 0;
+}
+
 static int
 rte_eth_dev_init(struct rte_pci_driver *pci_drv,
 struct rte_pci_device *pci_dev)
@@ -279,8 +297,8 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
eth_drv = (struct eth_driver *)pci_drv;

/* Create unique Ethernet device name using PCI address */
-   snprintf(ethdev_name, RTE_ETH_NAME_MAX_LEN, "%d:%d.%d",
-   pci_dev->addr.bus, pci_dev->addr.devid, 
pci_dev->addr.function);
+   rte_eth_dev_create_unique_device_name(ethdev_name,
+   sizeof(ethdev_name), pci_dev);

eth_dev = rte_eth_dev_allocate(ethdev_name);
if (eth_dev == NULL)
@@ -321,6 +339,47 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
return diag;
 }

+static int
+rte_eth_dev_uninit(struct rte_pci_device *pci_dev)
+{
+   const struct eth_driver *eth_drv;
+   struct rte_eth_dev *eth_dev;
+   char ethdev_name[RTE_ETH_NAME_MAX_LEN];
+   int ret;
+
+   if (pci_dev == NULL)
+   return -EINVAL;
+
+   /* Create unique Ethernet device name using PCI address */
+   rte_eth_dev_create_unique_device_name(ethdev_name,
+   sizeof(ethdev_name), pci_dev);
+
+   eth_dev = rte_eth_dev_allocated(ethdev_name);
+   if (eth_dev == NULL)
+   return -ENODEV;
+
+   eth_drv = (const struct eth_driver *)pci_dev->driver;
+
+   /* Invoke PMD device uninit function */
+   if (*eth_drv->eth_dev_uninit) {
+   ret = (*eth_drv->eth_dev_uninit)(eth_drv, eth_dev);
+   if (ret)
+   return ret;
+   }
+
+   /* free ether device */
+   rte_eth_dev_release_port(eth_dev);
+
+   if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+   rte_free(eth_dev->data->dev_private);
+
+   eth_dev->pci_dev = NULL;
+   eth_dev->driver = NULL;
+   eth_dev->data = NULL;
+
+   return 0;
+}
+
 /**
  * Register an Ethernet [Poll Mode] driver.
  *
@@ -339,6 +398,7 @@ void
 rte_eth_driver_register(struct eth_driver *eth_drv)
 {
eth_drv->pci_drv.devinit = rte_eth_dev_init;
+   eth_drv->pci_drv.devuninit = rte_eth_dev_uninit;
rte_eal_pci_regis

[dpdk-dev] [PATCH v11 08/13] ethdev: Add functions that will be used by port hotplug functions

2015-02-23 Thread Tetsuya Mukawa
The patch adds following functions.

- rte_eth_dev_save()
  The function is used for saving current rte_eth_dev structures.
- rte_eth_dev_get_changed_port()
  The function receives saved rte_eth_dev structures, then compares
  them with the current values to find which port was actually
  attached or detached.
- rte_eth_dev_get_addr_by_port()
  The function returns a pci address of an ethdev specified by port
  identifier.
- rte_eth_dev_get_port_by_addr()
  The function returns a port identifier of an ethdev specified by
  pci address.
- rte_eth_dev_get_name_by_port()
  The function returns a unique identifier name of an ethdev
  specified by port identifier.
- rte_eth_dev_is_detachable()
  The function returns whether a PMD supports detach function.

Also, the patch changes the scope of rte_eth_dev_allocated() to global,
because virtual PMDs will call this function to support port hotplug.
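
The save-and-compare idea behind rte_eth_dev_save() and
rte_eth_dev_get_changed_port() can be sketched outside DPDK as below. This is
only an illustration under stated assumptions: MAX_PORTS, struct port_entry,
ports, ports_save() and ports_get_changed() are hypothetical stand-ins for
RTE_MAX_ETHPORTS, struct rte_eth_dev, rte_eth_devices and the two functions
above, and the return codes are simplified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PORTS 32	/* hypothetical stand-in for RTE_MAX_ETHPORTS */

/* Hypothetical stand-in for struct rte_eth_dev: only the attached flag. */
struct port_entry {
	uint8_t attached;
};

/* Hypothetical stand-in for the global rte_eth_devices table. */
static struct port_entry ports[MAX_PORTS];

/* Copy the current table into a caller-supplied snapshot buffer. */
static int
ports_save(struct port_entry *snap, size_t size)
{
	if (snap == NULL || size != sizeof(ports))
		return -1;
	memcpy(snap, ports, size);
	return 0;
}

/* Report the first slot whose attached state differs from the snapshot. */
static int
ports_get_changed(const struct port_entry *snap, uint8_t *port_id)
{
	uint8_t i;

	if (snap == NULL || port_id == NULL)
		return -1;
	for (i = 0; i < MAX_PORTS; i++) {
		if ((ports[i].attached != 0) != (snap[i].attached != 0)) {
			*port_id = i;
			return 0;
		}
	}
	return -2;	/* nothing changed since the snapshot */
}
```

A caller would take the snapshot before attaching or detaching, perform the
operation, and then ask which slot changed.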

v10:
- Change order of version.map.
  (Thanks to Thomas Monjalon)
v9:
- rte_eth_dev_check_detachable() is replaced by
  rte_eth_dev_is_detachable().
- strncpy() is replaced by strcpy().
  (Thanks to Thomas Monjalon)
- Add missing symbol in version map.
  (Thanks to Neil Horman)
v8:
- Add size parameter to rte_eth_dev_save().
- Add missing symbol in version map.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v7:
- Add pt_driver checking to rte_eth_dev_check_detachable().
  (Thanks to Qiu, Michael)
v5:
- Fix return value of below functions.
  rte_eth_dev_get_changed_port().
  rte_eth_dev_get_port_by_addr().
v4:
- Add parameter checking.
v3:
- Fix if-condition bug while comparing pci addresses.
- Add error checking codes.
Reported-by: Mark Enright 

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_ether/rte_ethdev.c  | 103 -
 lib/librte_ether/rte_ethdev.h  |  83 ++
 lib/librte_ether/rte_ether_version.map |   7 +++
 3 files changed, 192 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 3d148e2..7067620 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -201,7 +201,7 @@ rte_eth_dev_data_alloc(void)
RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data));
 }

-static struct rte_eth_dev *
+struct rte_eth_dev *
 rte_eth_dev_allocated(const char *name)
 {
unsigned i;
@@ -426,6 +426,107 @@ rte_eth_dev_count(void)
return (nb_ports);
 }

+int
+rte_eth_dev_save(struct rte_eth_dev *devs, size_t size)
+{
+   if ((devs == NULL) ||
+   (size != sizeof(struct rte_eth_dev) * RTE_MAX_ETHPORTS))
+   return -EINVAL;
+
+   /* save current rte_eth_devices */
+   memcpy(devs, rte_eth_devices, size);
+   return 0;
+}
+
+int
+rte_eth_dev_get_changed_port(struct rte_eth_dev *devs, uint8_t *port_id)
+{
+   if ((devs == NULL) || (port_id == NULL))
+   return -EINVAL;
+
+   /* check which port was attached or detached */
+   for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++, devs++) {
+   if (rte_eth_devices[*port_id].attached ^ devs->attached)
+   return 0;
+   }
+   return -ENODEV;
+}
+
+int
+rte_eth_dev_get_addr_by_port(uint8_t port_id, struct rte_pci_addr *addr)
+{
+   if (!rte_eth_dev_is_valid_port(port_id)) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -EINVAL;
+   }
+
+   if (addr == NULL) {
+   PMD_DEBUG_TRACE("Null pointer is specified\n");
+   return -EINVAL;
+   }
+
+   *addr = rte_eth_devices[port_id].pci_dev->addr;
+   return 0;
+}
+
+int
+rte_eth_dev_get_port_by_addr(struct rte_pci_addr *addr, uint8_t *port_id)
+{
+   struct rte_pci_addr *tmp;
+
+   if ((addr == NULL) || (port_id == NULL)) {
+   PMD_DEBUG_TRACE("Null pointer is specified\n");
+   return -EINVAL;
+   }
+
+   for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++) {
+   if (!rte_eth_devices[*port_id].attached)
+   continue;
+   if (!rte_eth_devices[*port_id].pci_dev)
+   continue;
+   tmp = &rte_eth_devices[*port_id].pci_dev->addr;
+   if (rte_eal_compare_pci_addr(tmp, addr) == 0)
+   return 0;
+   }
+   return -ENODEV;
+}
+
+int
+rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
+{
+   char *tmp;
+
+   if (!rte_eth_dev_is_valid_port(port_id)) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -EINVAL;
+   }
+
+   if (name == NULL) {
+   PMD_DEBUG_TRACE("Null pointer is specified\n");
+   return -EINVAL;
+   }
+
+   /* shouldn't check 'rte_eth_devices[i].data',
+* because it might be overwritten by VDEV PMD */
+   tmp = rte_eth_dev_data[port_id].name;
+   strcpy(name,

[dpdk-dev] [PATCH v11 09/13] eal/linux/pci: Add functions for unmapping igb_uio resources

2015-02-23 Thread Tetsuya Mukawa
The patch adds functions for unmapping igb_uio resources. The patch applies
only to Linux with igb_uio. VFIO and BSD are not supported.
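
As a rough illustration of the map/unmap pairing this patch completes, the
sketch below maps a file-backed region and releases it again with munmap().
It is not the EAL code: map_region() and unmap_region() are hypothetical
helpers, and /dev/zero is assumed here only as a convenient file to map in
place of a UIO resource file.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map 'size' bytes of the file at 'path'; returns NULL on failure. */
static void *
map_region(const char *path, size_t size)
{
	void *addr;
	int fd;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return NULL;
	addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	close(fd);	/* the mapping stays valid after the fd is closed */
	return (addr == MAP_FAILED) ? NULL : addr;
}

/* Unmap a region; a NULL address is treated as "nothing to do". */
static int
unmap_region(void *addr, size_t size)
{
	if (addr == NULL)
		return 0;
	if (munmap(addr, size) != 0) {
		fprintf(stderr, "cannot munmap(%p, 0x%lx): %s\n",
			addr, (unsigned long)size, strerror(errno));
		return -1;
	}
	return 0;
}
```

Unlike pci_unmap_resource() in the patch, this helper returns a status so a
caller can react to a failed munmap(); that is a design choice of the sketch,
not of the patch.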

v9:
- Remove "rte_dev_hotplug.h".
- Remove needless "#ifdef".
  (Thanks to Thomas Monjalon and Neil Horman)
- Remove pci_unmap_device(). It will be implemented in later patch.
v8:
- Fix typo.
  (Thanks to Iremonger, Bernard)
v5:
- Fix pci_unmap_device() to check pt_driver.
v4:
- Add parameter checking.
- Add header file to determine if hotplug can be enabled.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/linuxapp/eal/eal_pci.c  | 17 
 lib/librte_eal/linuxapp/eal/eal_pci_init.h |  7 
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c  | 65 ++
 3 files changed, 89 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
b/lib/librte_eal/linuxapp/eal/eal_pci.c
index e6cead1..17f32c0 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -167,6 +167,23 @@ pci_map_resource(void *requested_addr, int fd, off_t 
offset, size_t size)
return mapaddr;
 }

+/* unmap a particular resource */
+void
+pci_unmap_resource(void *requested_addr, size_t size)
+{
+   if (requested_addr == NULL)
+   return;
+
+   /* Unmap the PCI memory resource of device */
+   if (munmap(requested_addr, size)) {
+   RTE_LOG(ERR, EAL, "%s(): cannot munmap(%p, 0x%lx): %s\n",
+   __func__, requested_addr, (unsigned long)size,
+   strerror(errno));
+   } else
+   RTE_LOG(DEBUG, EAL, "  PCI memory unmapped at %p\n",
+   requested_addr);
+}
+
 /* parse the "resource" sysfs file */
 static int
 pci_parse_sysfs_resource(const char *filename, struct rte_pci_device *dev)
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_init.h 
b/lib/librte_eal/linuxapp/eal/eal_pci_init.h
index 1070eb8..e2dd8a5 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_init.h
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_init.h
@@ -71,6 +71,13 @@ void *pci_map_resource(void *requested_addr, int fd, off_t 
offset,
 /* map IGB_UIO resource prototype */
 int pci_uio_map_resource(struct rte_pci_device *dev);

+void pci_unmap_resource(void *requested_addr, size_t size);
+
+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+/* unmap IGB_UIO resource prototype */
+void pci_uio_unmap_resource(struct rte_pci_device *dev);
+#endif /* RTE_LIBRTE_EAL_HOTPLUG */
+
 #ifdef VFIO_PRESENT

 #define VFIO_MAX_GROUPS 64
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c 
b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index f7acc55..ff4d0e8 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -433,3 +433,68 @@ pci_uio_map_resource(struct rte_pci_device *dev)

return 0;
 }
+
+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+static void
+pci_uio_unmap(struct mapped_pci_resource *uio_res)
+{
+   int i;
+
+   if (uio_res == NULL)
+   return;
+
+   for (i = 0; i != uio_res->nb_maps; i++)
+   pci_unmap_resource(uio_res->maps[i].addr,
+   (size_t)uio_res->maps[i].size);
+}
+
+static struct mapped_pci_resource *
+pci_uio_find_resource(struct rte_pci_device *dev)
+{
+   struct mapped_pci_resource *uio_res;
+
+   if (dev == NULL)
+   return NULL;
+
+   TAILQ_FOREACH(uio_res, pci_res_list, next) {
+
+   /* skip this element if it doesn't match our PCI address */
+   if (!rte_eal_compare_pci_addr(&uio_res->pci_addr, &dev->addr))
+   return uio_res;
+   }
+   return NULL;
+}
+
+/* unmap the PCI resource of a PCI device in virtual memory */
+void
+pci_uio_unmap_resource(struct rte_pci_device *dev)
+{
+   struct mapped_pci_resource *uio_res;
+
+   if (dev == NULL)
+   return;
+
+   /* find an entry for the device */
+   uio_res = pci_uio_find_resource(dev);
+   if (uio_res == NULL)
+   return;
+
+   /* secondary processes - just free maps */
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return pci_uio_unmap(uio_res);
+
+   TAILQ_REMOVE(pci_res_list, uio_res, next);
+
+   /* unmap all resources */
+   pci_uio_unmap(uio_res);
+
+   /* free uio resource */
+   rte_free(uio_res);
+
+   /* close fd if in primary process */
+   close(dev->intr_handle.fd);
+
+   dev->intr_handle.fd = -1;
+   dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
+}
+#endif /* RTE_LIBRTE_EAL_HOTPLUG */
-- 
1.9.1



[dpdk-dev] [PATCH v11 10/13] eal/pci: Add probe and close functions of pci driver

2015-02-23 Thread Tetsuya Mukawa
- Add pci_close_all_drivers()
  The function tries to find a driver for the specified device, and
  then close the driver.
- Add rte_eal_pci_probe_one() and rte_eal_pci_close_one()
  The functions are used to probe and close a device.
  Each function first tries to find the device that has the specified
  PCI address, then probes or closes it.
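
The return-value convention used when walking the driver list (negative on
error, zero once a driver has handled the device, positive when no driver
matches) can be sketched as follows. All names here are hypothetical
stand-ins rather than EAL symbols, and matching is reduced to a plain
vendor/device ID comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for rte_pci_device and rte_pci_driver. */
struct device {
	unsigned vendor_id;
	unsigned device_id;
};

struct driver {
	unsigned vendor_id;
	unsigned device_id;
	int (*uninit)(struct device *dev);
};

/* Example uninit callbacks used to exercise both outcomes. */
static int uninit_ok(struct device *dev)   { (void)dev; return 0; }
static int uninit_fail(struct device *dev) { (void)dev; return -1; }

/* Returns <0 on error, 0 on success, >0 if the driver does not match. */
static int
close_one_driver(const struct driver *dr, struct device *dev)
{
	if (dr->vendor_id != dev->vendor_id || dr->device_id != dev->device_id)
		return 1;
	if (dr->uninit == NULL || dr->uninit(dev) < 0)
		return -1;
	return 0;
}

/* Returns -1 on error, 0 once a driver closed the device, 1 if none matched. */
static int
close_all_drivers(const struct driver *drivers, size_t n, struct device *dev)
{
	size_t i;
	int rc;

	if (drivers == NULL || dev == NULL)
		return -1;
	for (i = 0; i < n; i++) {
		rc = close_one_driver(&drivers[i], dev);
		if (rc < 0)
			return -1;	/* negative value is an error */
		if (rc > 0)
			continue;	/* positive value means no match */
		return 0;
	}
	return 1;
}
```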

v9:
- Fix commit title.
- Remove RTE_EAL_INVOKE_TYPE_PROBE/CLOSE.
  (Thanks to Thomas Monjalon)
- Implement pci_unmap_device() in this patch.
v5:
- Remove RTE_EAL_INVOKE_TYPE_UNKNOWN, because it's unused.
v4:
- Fix parameter checking.
- Fix indent of 'if' statement.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/eal_common_pci.c  | 98 -
 lib/librte_eal/common/eal_private.h | 15 +
 lib/librte_eal/common/include/rte_pci.h | 32 +++
 lib/librte_eal/linuxapp/eal/eal_pci.c   | 94 +++
 4 files changed, 238 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_pci.c 
b/lib/librte_eal/common/eal_common_pci.c
index bf2793f..5b6b55d 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -108,7 +108,10 @@ static int
 pci_probe_all_drivers(struct rte_pci_device *dev)
 {
struct rte_pci_driver *dr = NULL;
-   int rc;
+   int rc = 0;
+
+   if (dev == NULL)
+   return -1;

TAILQ_FOREACH(dr, &pci_driver_list, next) {
rc = rte_eal_pci_probe_one_driver(dr, dev);
@@ -123,6 +126,99 @@ pci_probe_all_drivers(struct rte_pci_device *dev)
return 1;
 }

+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+/*
+ * If vendor/device ID match, call the devuninit() function of all
+ * registered driver for the given device. Return -1 if initialization
+ * failed, return 1 if no driver is found for this device.
+ */
+static int
+pci_close_all_drivers(struct rte_pci_device *dev)
+{
+   struct rte_pci_driver *dr = NULL;
+   int rc = 0;
+
+   if (dev == NULL)
+   return -1;
+
+   TAILQ_FOREACH(dr, &pci_driver_list, next) {
+   rc = rte_eal_pci_close_one_driver(dr, dev);
+   if (rc < 0)
+   /* negative value is an error */
+   return -1;
+   if (rc > 0)
+   /* positive value means driver not found */
+   continue;
+   return 0;
+   }
+   return 1;
+}
+
+/*
+ * Find the pci device specified by pci address, then invoke probe function of
+ * the driver of the device.
+ */
+int
+rte_eal_pci_probe_one(struct rte_pci_addr *addr)
+{
+   struct rte_pci_device *dev = NULL;
+   int ret = 0;
+
+   if (addr == NULL)
+   return -1;
+
+   TAILQ_FOREACH(dev, &pci_device_list, next) {
+   if (rte_eal_compare_pci_addr(&dev->addr, addr))
+   continue;
+
+   ret = pci_probe_all_drivers(dev);
+   if (ret < 0)
+   goto err_return;
+   return 0;
+   }
+   return -1;
+
+err_return:
+   RTE_LOG(WARNING, EAL, "Requested device " PCI_PRI_FMT
+   " cannot be used\n", dev->addr.domain, dev->addr.bus,
+   dev->addr.devid, dev->addr.function);
+   return -1;
+}
+
+/*
+ * Find the pci device specified by pci address, then invoke close function of
+ * the driver of the device.
+ */
+int
+rte_eal_pci_close_one(struct rte_pci_addr *addr)
+{
+   struct rte_pci_device *dev = NULL;
+   int ret = 0;
+
+   if (addr == NULL)
+   return -1;
+
+   TAILQ_FOREACH(dev, &pci_device_list, next) {
+   if (rte_eal_compare_pci_addr(&dev->addr, addr))
+   continue;
+
+   ret = pci_close_all_drivers(dev);
+   if (ret < 0)
+   goto err_return;
+
+   TAILQ_REMOVE(&pci_device_list, dev, next);
+   return 0;
+   }
+   return -1;
+
+err_return:
+   RTE_LOG(WARNING, EAL, "Requested device " PCI_PRI_FMT
+   " cannot be used\n", dev->addr.domain, dev->addr.bus,
+   dev->addr.devid, dev->addr.function);
+   return -1;
+}
+#endif /* RTE_LIBRTE_EAL_HOTPLUG */
+
 /*
  * Scan the content of the PCI bus, and call the devinit() function for
  * all registered drivers that have a matching entry in its id_table
diff --git a/lib/librte_eal/common/eal_private.h 
b/lib/librte_eal/common/eal_private.h
index 159cd66..4acf5a0 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -165,6 +165,21 @@ int rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr,
struct rte_pci_device *dev);

 /**
+ * Munmap memory for single PCI device
+ *
+ * This function is private to EAL.
+ *
+ * @param  dr
+ *  The pointer to the pci driver structure
+ * @param  dev
+ *  The pointer to the pci device structure
+ * @return
+ 

[dpdk-dev] [PATCH v11 11/13] ethdev: Add one dev_type parameter to rte_eth_dev_allocate

2015-02-23 Thread Tetsuya Mukawa
The new parameter records the device type, such as PCI or virtual,
because the port detach sequence differs between PCI devices and
virtual devices.
RTE_ETH_DEV_PCI indicates that the device type is PCI.
RTE_ETH_DEV_VIRTUAL indicates that the device is virtual.
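
A minimal sketch of why the type is stored at allocation time: the detach
path can later branch on it. The names below (enum dev_type, struct port,
port_allocate(), port_release(), port_detach()) are hypothetical and only
mirror the shape of the change; the real detach work is replaced by comments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of enum rte_eth_dev_type. */
enum dev_type {
	DEV_UNKNOWN,
	DEV_PCI,
	DEV_VIRTUAL
};

/* Hypothetical stand-in for struct rte_eth_dev. */
struct port {
	int attached;
	enum dev_type type;
};

/* Record the device type when the port is allocated. */
static void
port_allocate(struct port *p, enum dev_type type)
{
	p->attached = 1;
	p->type = type;
}

/* Clear the type again when the port is released. */
static void
port_release(struct port *p)
{
	p->attached = 0;
	p->type = DEV_UNKNOWN;
}

/* Returns the type that was detached, or DEV_UNKNOWN if nothing was done. */
static enum dev_type
port_detach(struct port *p)
{
	enum dev_type type;

	if (p == NULL || !p->attached)
		return DEV_UNKNOWN;
	type = p->type;
	switch (type) {
	case DEV_PCI:		/* would close the PCI driver and unmap BARs */
	case DEV_VIRTUAL:	/* would call the virtual driver's uninit */
		port_release(p);
		return type;
	default:
		return DEV_UNKNOWN;
	}
}
```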

v10:
- Change order of version.map.
  (Thanks to Thomas Monjalon)
- Fix comment of "rte_ethdev.h".
  (Thanks to Thomas Monjalon)
v9:
- Fix commit log.
- RTE_ETH_DEV_PHYSICAL is replaced by RTE_ETH_DEV_PCI.
  (Thanks to Thomas Monjalon)
v8:
- NONE_TRACE is replaced by NO_TRACE.
- Add missing symbol in version map.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v4:
- Fix comments of rte_eth_dev_type.

Signed-off-by: Tetsuya Mukawa 
---
 app/test/virtual_pmd.c   |  2 +-
 lib/librte_ether/rte_ethdev.c| 25 +++--
 lib/librte_ether/rte_ethdev.h| 25 -
 lib/librte_pmd_af_packet/rte_eth_af_packet.c |  2 +-
 lib/librte_pmd_bond/rte_eth_bond_api.c   |  2 +-
 lib/librte_pmd_pcap/rte_eth_pcap.c   |  2 +-
 lib/librte_pmd_ring/rte_eth_ring.c   |  2 +-
 lib/librte_pmd_xenvirt/rte_eth_xenvirt.c |  2 +-
 8 files changed, 53 insertions(+), 9 deletions(-)

diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c
index cd9faf3..01a3913 100644
--- a/app/test/virtual_pmd.c
+++ b/app/test/virtual_pmd.c
@@ -580,7 +580,7 @@ virtual_ethdev_create(const char *name, struct ether_addr 
*mac_addr,
goto err;

/* reserve an ethdev entry */
-   eth_dev = rte_eth_dev_allocate(name);
+   eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_PCI);
if (eth_dev == NULL)
goto err;

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 7067620..f176f1e 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -227,7 +227,7 @@ rte_eth_dev_find_free_port(void)
 }

 struct rte_eth_dev *
-rte_eth_dev_allocate(const char *name)
+rte_eth_dev_allocate(const char *name, enum rte_eth_dev_type type)
 {
uint8_t port_id;
struct rte_eth_dev *eth_dev;
@@ -251,6 +251,7 @@ rte_eth_dev_allocate(const char *name)
snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
eth_dev->data->port_id = port_id;
eth_dev->attached = DEV_ATTACHED;
+   eth_dev->dev_type = type;
nb_ports++;
return eth_dev;
 }
@@ -262,6 +263,7 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
return -EINVAL;

eth_dev->attached = 0;
+   eth_dev->dev_type = RTE_ETH_DEV_UNKNOWN;
nb_ports--;
return 0;
 }
@@ -300,7 +302,7 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
rte_eth_dev_create_unique_device_name(ethdev_name,
sizeof(ethdev_name), pci_dev);

-   eth_dev = rte_eth_dev_allocate(ethdev_name);
+   eth_dev = rte_eth_dev_allocate(ethdev_name, RTE_ETH_DEV_PCI);
if (eth_dev == NULL)
return -ENOMEM;

@@ -426,6 +428,14 @@ rte_eth_dev_count(void)
return (nb_ports);
 }

+enum rte_eth_dev_type
+rte_eth_dev_get_device_type(uint8_t port_id)
+{
+   if (!rte_eth_dev_is_valid_port(port_id))
+   return -1;
+   return rte_eth_devices[port_id].dev_type;
+}
+
 int
 rte_eth_dev_save(struct rte_eth_dev *devs, size_t size)
 {
@@ -523,6 +533,17 @@ rte_eth_dev_is_detachable(uint8_t port_id)
return -EINVAL;
}

+   if (rte_eth_devices[port_id].dev_type == RTE_ETH_DEV_PCI) {
+   switch (rte_eth_devices[port_id].pci_dev->pt_driver) {
+   case RTE_PT_IGB_UIO:
+   case RTE_PT_UIO_GENERIC:
+   break;
+   case RTE_PT_VFIO:
+   default:
+   return -ENOTSUP;
+   }
+   }
+
drv_flags = rte_eth_devices[port_id].driver->pci_drv.drv_flags;
return !(drv_flags & RTE_PCI_DRV_DETACHABLE);
 }
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index d4cfafb..1a978ed 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1382,6 +1382,17 @@ struct eth_dev_ops {
 };

 /**
+ * The eth device type
+ */
+enum rte_eth_dev_type {
+   RTE_ETH_DEV_UNKNOWN,/**< unknown device type */
+   RTE_ETH_DEV_PCI,
+   /**< Physical function and Virtual function of PCI devices */
+   RTE_ETH_DEV_VIRTUAL,/**< non hardware device */
+   RTE_ETH_DEV_MAX /**< max value of this enum */
+};
+
+/**
  * @internal
  * The generic data structure associated with each ethernet device.
  *
@@ -1400,6 +1411,7 @@ struct rte_eth_dev {
struct rte_pci_device *pci_dev; /**< PCI info. supplied by probing */
struct rte_eth_dev_cb_list callbacks; /**< User application callbacks */
uint8_t attached; /**< Flag indicating the port is attached */
+   enum rte_eth_dev_type dev_type; /**< Flag indicating the device type */
 };


[dpdk-dev] [PATCH v11 12/13] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-23 Thread Tetsuya Mukawa
These functions are used for attaching or detaching a port.
When rte_eal_dev_attach() is called, the function tries to parse the
device name as a PCI address. If parsing succeeds,
rte_eal_dev_attach() attaches a physical device port. If not, it
attaches a virtual device port.
When rte_eal_dev_detach() is called, the function gets the device type
of the port to determine whether the port comes from a physical or a
virtual device, and then calls the matching detach function.
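
The "try to parse as a PCI address, otherwise treat as a virtual device"
decision can be sketched as below. This is not the EAL parser: struct
pci_addr, parse_pci_addr() and classify_ident() are hypothetical, and the
sketch assumes the full domain:bus:devid.function form in hexadecimal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct rte_pci_addr. */
struct pci_addr {
	unsigned domain;
	unsigned bus;
	unsigned devid;
	unsigned function;
};

enum ident_kind {
	IDENT_PCI,
	IDENT_VIRTUAL
};

/* Parse "DDDD:BB:DD.F" (hex). Returns 0 on success, -1 otherwise. */
static int
parse_pci_addr(const char *s, struct pci_addr *out)
{
	unsigned domain, bus, devid, function;
	int consumed = 0;

	if (s == NULL || out == NULL)
		return -1;
	if (sscanf(s, "%4x:%2x:%2x.%1x%n",
		   &domain, &bus, &devid, &function, &consumed) != 4)
		return -1;
	/* reject trailing characters and out-of-range fields */
	if (s[consumed] != '\0' || devid > 0x1f || function > 7)
		return -1;
	out->domain = domain;
	out->bus = bus;
	out->devid = devid;
	out->function = function;
	return 0;
}

/* Anything that is not a well-formed PCI address is treated as a vdev name. */
static enum ident_kind
classify_ident(const char *ident, struct pci_addr *addr)
{
	return (parse_pci_addr(ident, addr) == 0) ? IDENT_PCI : IDENT_VIRTUAL;
}
```

An identifier such as "eth_pcap0,iface=eth0" fails the parse and therefore
takes the virtual device path.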

v11:
- Remove needless devargs handling codes.
- Replace get_vdev_name() by rte_eal_parse_devargs_str().
- Replace rte_eal_vdev_find_and_init by rte_eal_vdev_init()
- Replace rte_eal_vdev_find_and_uninit by rte_eal_vdev_uninit()
- Fix rte_eal_dev_init() to use rte_eal_vdev_init().
  (Thanks to Maxime Leroy)
v10:
- Add comments.
- Change order of version.map.
  (Thanks to Thomas Monjalon)
v9:
- Fix comments.
- Use strcmp() instead of strncmp().
- Remove RTE_EAL_INVOKE_TYPE_PROBE/CLOSE.
- Change definition of rte_dev_uninit_t.
  (Thanks to Thomas Monjalon and Maxime Leroy)
v8:
- Add missing symbol in version map.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v7:
- Fix typo of warning messages.
  (Thanks to Qiu, Michael)
v5:
- Change function names like below.
  rte_eal_dev_find_and_invoke() to rte_eal_vdev_find_and_invoke().
  rte_eal_dev_invoke() to rte_eal_vdev_invoke().
- Add code to handle a return value of rte_eal_devargs_remove().
- Fix pci address format in rte_eal_dev_detach().
v4:
- Fix comment.
- Add error checking.
- Fix indent of 'if' statement.
- Change function name.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/eal_common_dev.c  | 285 ++--
 lib/librte_eal/common/eal_common_devargs.c  |  46 ++--
 lib/librte_eal/common/eal_private.h |  11 +
 lib/librte_eal/common/include/rte_dev.h |  33 +++
 lib/librte_eal/common/include/rte_devargs.h |  28 +++
 lib/librte_eal/linuxapp/eal/Makefile|   1 +
 lib/librte_eal/linuxapp/eal/eal_pci.c   |   6 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |   2 +
 8 files changed, 378 insertions(+), 34 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_dev.c 
b/lib/librte_eal/common/eal_common_dev.c
index eae5656..7d4dce6 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -32,10 +32,13 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */

+#include 
+#include 
 #include 
 #include 
 #include 

+#include 
 #include 
 #include 
 #include 
@@ -61,6 +64,37 @@ rte_eal_driver_unregister(struct rte_driver *driver)
TAILQ_REMOVE(&dev_driver_list, driver, next);
 }

+static int
+rte_eal_vdev_init(const char *name, const char *args)
+{
+   struct rte_driver *driver;
+
+   if (name == NULL)
+   return -EINVAL;
+
+   TAILQ_FOREACH(driver, &dev_driver_list, next) {
+   if (driver->type != PMD_VDEV)
+   continue;
+
+   /*
+* search a driver prefix in virtual device name.
+* For example, if the driver is pcap PMD, driver->name
+* will be "eth_pcap", but "name" will be "eth_pcapN".
+* So use strncmp to compare.
+*/
+   if (!strncmp(driver->name, name, strlen(driver->name))) {
+   driver->init(name, args);
+   break;
+   }
+   }
+
+   if (driver == NULL) {
+   RTE_LOG(WARNING, EAL, "no driver found for %s\n", name);
+   return -EINVAL;
+   }
+   return 0;
+}
+
 int
 rte_eal_dev_init(void)
 {
@@ -79,23 +113,10 @@ rte_eal_dev_init(void)
if (devargs->type != RTE_DEVTYPE_VIRTUAL)
continue;

-   TAILQ_FOREACH(driver, &dev_driver_list, next) {
-   if (driver->type != PMD_VDEV)
-   continue;
-
-   /* search a driver prefix in virtual device name */
-   if (!strncmp(driver->name, devargs->virtual.drv_name,
-   strlen(driver->name))) {
-   driver->init(devargs->virtual.drv_name,
-   devargs->args);
-   break;
-   }
-   }
-
-   if (driver == NULL) {
+   if (rte_eal_vdev_init(devargs->virtual.drv_name,
+   devargs->args))
rte_panic("no driver found for %s\n",
  devargs->virtual.drv_name);
-   }
}

/* Once the vdevs are initalized, start calling all the pdev drivers */
@@ -107,3 +128,237 @@ rte_eal_dev_init(void)
}
return 0;
 }
+
+/* So far, DPDK hotplug function only supports linux */
+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+static int
+rte_eal_vdev

[dpdk-dev] [PATCH v11 13/13] doc: Add port hotplug framework section to programmers guide

2015-02-23 Thread Tetsuya Mukawa
This patch adds a new section for describing port hotplug framework.

Signed-off-by: Tetsuya Mukawa 
---
 doc/guides/prog_guide/index.rst  |   1 +
 doc/guides/prog_guide/port_hotplug_framework.rst | 110 +++
 2 files changed, 111 insertions(+)
 create mode 100644 doc/guides/prog_guide/port_hotplug_framework.rst

diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index de69682..60a6ac5 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -71,6 +71,7 @@ Programmer's Guide
 packet_classif_access_ctrl
 packet_framework
 vhost_lib
+port_hotplug_framework
 source_org
 dev_kit_build_system
 dev_kit_root_make_help
diff --git a/doc/guides/prog_guide/port_hotplug_framework.rst 
b/doc/guides/prog_guide/port_hotplug_framework.rst
new file mode 100644
index 000..355ae28
--- /dev/null
+++ b/doc/guides/prog_guide/port_hotplug_framework.rst
@@ -0,0 +1,110 @@
+..  BSD LICENSE
+Copyright(c) 2015 IGEL Co.,Ltd. All rights reserved.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of IGEL Co.,Ltd. nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Port Hotplug Framework
+======================
+
+The Port Hotplug Framework provides DPDK applications with the ability to
+attach and detach ports at runtime. Because the framework depends on PMD
+implementation, the ports that PMDs cannot handle are out of scope of this
+framework. Furthermore, after detaching a port from a DPDK application, the
+framework doesn't provide a way for removing the devices from the system.
+For the ports backed by a physical NIC, the kernel will need to support PCI
+Hotplug feature.
+
+Overview
+--------
+
+The basic requirements of the Port Hotplug Framework are:
+
+*   DPDK applications that use the Port Hotplug Framework must manage their
+own ports.
+
+The Port Hotplug Framework is implemented to allow DPDK applications to
+manage ports. For example, when DPDK applications call the port attach
+function, the attached port number is returned. DPDK applications can
+also detach the port by port number.
+
+*   Kernel support is needed for attaching or detaching physical device
+ports.
+
+To attach new physical device ports, the device will be recognized by
+userspace driver I/O framework in kernel at first. Then DPDK
+applications can call the Port Hotplug functions to attach the ports.
+For detaching, steps are vice versa.
+
+*   Before detaching, ports must be stopped and closed.
+
+DPDK applications must call "rte_eth_dev_stop()" and
+"rte_eth_dev_close()" APIs before detaching ports. These functions will
+start finalization sequence of the PMDs.
+
+*   The framework doesn't affect legacy DPDK applications behavior.
+
+If the Port Hotplug functions aren't called, all legacy DPDK apps can
+still work without modifications.
+
+Port Hotplug API overview
+-------------------------
+
+*   Attaching a port
+
+"rte_eal_dev_attach()" API attaches a port to DPDK application, and
+returns the attached port number. Before calling the API, the device
+should be recognized by a userspace driver I/O framework. The API
+receives a pci address like ":01:00.0" or a virtual device name
+like "eth_pcap0,iface=eth0". In the case of virtual device name, the
+format is the same as the general "--vdev" option of DPDK.
+
+*   Detac

[dpdk-dev] [PATCH v11] librte_pmd_pcap: Add port hotplug support

2015-02-23 Thread Tetsuya Mukawa
This patch adds finalization code to free resources allocated by the
PMD.

v6:
 - Fix a parameter of rte_eth_dev_free().
v4:
 - Change function name.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_pmd_pcap/rte_eth_pcap.c | 40 ++
 1 file changed, 40 insertions(+)

diff --git a/lib/librte_pmd_pcap/rte_eth_pcap.c 
b/lib/librte_pmd_pcap/rte_eth_pcap.c
index af7fae8..5e94930 100644
--- a/lib/librte_pmd_pcap/rte_eth_pcap.c
+++ b/lib/librte_pmd_pcap/rte_eth_pcap.c
@@ -498,6 +498,13 @@ static struct eth_dev_ops ops = {
.stats_reset = eth_stats_reset,
 };

+static struct eth_driver rte_pcap_pmd = {
+   .pci_drv = {
+   .name = "rte_pcap_pmd",
+   .drv_flags = RTE_PCI_DRV_DETACHABLE,
+   },
+};
+
 /*
  * Function handler that opens the pcap file for reading a stores a
  * reference of it for use it later on.
@@ -713,6 +720,10 @@ rte_pmd_init_internals(const char *name, const unsigned 
nb_rx_queues,
if (*eth_dev == NULL)
goto error;

+   /* check length of device name */
+   if ((strlen((*eth_dev)->data->name) + 1) > sizeof(data->name))
+   goto error;
+
/* now put it all together
 * - store queue data in internals,
 * - store numa_node info in pci_driver
@@ -739,10 +750,13 @@ rte_pmd_init_internals(const char *name, const unsigned 
nb_rx_queues,
data->nb_tx_queues = (uint16_t)nb_tx_queues;
data->dev_link = pmd_link;
data->mac_addrs = &eth_addr;
+   strncpy(data->name,
+   (*eth_dev)->data->name, strlen((*eth_dev)->data->name));

(*eth_dev)->data = data;
(*eth_dev)->dev_ops = &ops;
(*eth_dev)->pci_dev = pci_dev;
+   (*eth_dev)->driver = &rte_pcap_pmd;

return 0;

@@ -927,10 +941,36 @@ rte_pmd_pcap_devinit(const char *name, const char *params)

 }

+static int
+rte_pmd_pcap_devuninit(const char *name)
+{
+   struct rte_eth_dev *eth_dev = NULL;
+
+   RTE_LOG(INFO, PMD, "Closing pcap ethdev on numa socket %u\n",
+   rte_socket_id());
+
+   if (name == NULL)
+   return -1;
+
+   /* reserve an ethdev entry */
+   eth_dev = rte_eth_dev_allocated(name);
+   if (eth_dev == NULL)
+   return -1;
+
+   rte_free(eth_dev->data->dev_private);
+   rte_free(eth_dev->data);
+   rte_free(eth_dev->pci_dev);
+
+   rte_eth_dev_release_port(eth_dev);
+
+   return 0;
+}
+
 static struct rte_driver pmd_pcap_drv = {
.name = "eth_pcap",
.type = PMD_VDEV,
.init = rte_pmd_pcap_devinit,
+   .uninit = rte_pmd_pcap_devuninit,
 };

 PMD_REGISTER_DRIVER(pmd_pcap_drv);
-- 
1.9.1



[dpdk-dev] [PATCH v11] testpmd: Add port hotplug support

2015-02-23 Thread Tetsuya Mukawa
The patch introduces following commands.
- port attach [ident]
- port detach [port_id]
 - attach: attaching a port
 - detach: detaching a port
 - ident: pci address of physical device.
  Or device name and parameters of virtual device.
 (ex. :02:00.0, eth_pcap0,iface=eth0)
 - port_id: port identifier

v7:
- Fix doc.
  (Thanks to Iremonger, Bernard)
- Fix port checking implementation of start_port().
  (Thanks to Qiu, Michael)
v5:
- Add testpmd documentation.
  (Thanks to Iremonger, Bernard)
v4:
 - Fix strings of command help.

Signed-off-by: Tetsuya Mukawa 
---
 app/test-pmd/cmdline.c  | 137 +++
 app/test-pmd/config.c   | 102 --
 app/test-pmd/parameters.c   |  22 ++-
 app/test-pmd/testpmd.c  | 199 +---
 app/test-pmd/testpmd.h  |  18 ++-
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  57 
 6 files changed, 409 insertions(+), 126 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index c6a1627..b78c659 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -513,6 +513,12 @@ static void cmd_help_long_parsed(void *parsed_result,
"port close (port_id|all)\n"
"Close all ports or port_id.\n\n"

+   "port attach (ident)\n"
+   "Attach physical or virtual dev by pci address or virtual device name\n\n"
+
+   "port detach (port_id)\n"
+   "Detach physical or virtual dev by port_id\n\n"
+
"port config (port_id|all)"
" speed (10|100|1000|1|4|auto)"
" duplex (half|full|auto)\n"
@@ -793,6 +799,89 @@ cmdline_parse_inst_t cmd_operate_specific_port = {
},
 };

+/* *** attach a specified port *** */
+struct cmd_operate_attach_port_result {
+   cmdline_fixed_string_t port;
+   cmdline_fixed_string_t keyword;
+   cmdline_fixed_string_t identifier;
+};
+
+static void cmd_operate_attach_port_parsed(void *parsed_result,
+   __attribute__((unused)) struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   struct cmd_operate_attach_port_result *res = parsed_result;
+
+   if (!strcmp(res->keyword, "attach"))
+   attach_port(res->identifier);
+   else
+   printf("Unknown parameter\n");
+}
+
+cmdline_parse_token_string_t cmd_operate_attach_port_port =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
+   port, "port");
+cmdline_parse_token_string_t cmd_operate_attach_port_keyword =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
+   keyword, "attach");
+cmdline_parse_token_string_t cmd_operate_attach_port_identifier =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
+   identifier, NULL);
+
+cmdline_parse_inst_t cmd_operate_attach_port = {
+   .f = cmd_operate_attach_port_parsed,
+   .data = NULL,
+   .help_str = "port attach identifier, "
+   "identifier: pci address or virtual dev name",
+   .tokens = {
+   (void *)&cmd_operate_attach_port_port,
+   (void *)&cmd_operate_attach_port_keyword,
+   (void *)&cmd_operate_attach_port_identifier,
+   NULL,
+   },
+};
+
+/* *** detach a specified port *** */
+struct cmd_operate_detach_port_result {
+   cmdline_fixed_string_t port;
+   cmdline_fixed_string_t keyword;
+   uint8_t port_id;
+};
+
+static void cmd_operate_detach_port_parsed(void *parsed_result,
+   __attribute__((unused)) struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   struct cmd_operate_detach_port_result *res = parsed_result;
+
+   if (!strcmp(res->keyword, "detach"))
+   detach_port(res->port_id);
+   else
+   printf("Unknown parameter\n");
+}
+
+cmdline_parse_token_string_t cmd_operate_detach_port_port =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_detach_port_result,
+   port, "port");
+cmdline_parse_token_string_t cmd_operate_detach_port_keyword =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_detach_port_result,
+   keyword, "detach");
+cmdline_parse_token_num_t cmd_operate_detach_port_port_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_operate_detach_port_result,
+   port_id, UINT8);
+
+cmdline_parse_inst_t cmd_operate_detach_port = {
+   .f = cmd_operate_detach_port_parsed,
+   .data = NULL,
+   .help_str = "port detach port_id",
+   .tokens = {
+   (void *)&cmd_operate_detach_port_port,
+   (void *)&cmd_operate_detach_port_keyword,
+ 
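
The commit message says the "port attach" identifier is either a PCI address or a virtual device name with parameters. As a rough sketch of how such a string could be told apart (this is not testpmd's actual parsing, and ident_is_pci_addr() is a hypothetical helper), one can try to parse the domain:bus:device.function form and fall back to treating the string as a virtual device otherwise:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Return 1 if ident looks like a PCI address in
 * domain:bus:device.function form, 0 otherwise.
 */
static int
ident_is_pci_addr(const char *ident)
{
	unsigned int domain, bus, dev, func;
	int consumed = 0;

	assert(ident != NULL);

	if (sscanf(ident, "%x:%x:%x.%x%n",
			&domain, &bus, &dev, &func, &consumed) != 4)
		return 0;
	/* Reject trailing characters such as ",iface=eth0". */
	if (ident[consumed] != '\0')
		return 0;
	/* Range limits of the PCI addressing scheme. */
	return domain <= 0xffff && bus <= 0xff && dev <= 0x1f && func <= 0x7;
}
```

With this helper, a string such as "eth_pcap0,iface=eth0" is rejected as a PCI address and would be handed to the virtual device path instead.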

[dpdk-dev] [PATCH v11 1/2] librte_pmd_null: Add Null PMD

2015-02-23 Thread Tetsuya Mukawa
Null PMD is a driver for a virtual device, designed in particular for
measuring the performance of DPDK PMDs. When an application calls rx, Null
PMD just allocates mbufs and returns them. On tx, the PMD just frees the
mbufs.

The PMD has the following options.
- size: specify the packet size allocated by RX. Default packet size is 64.
- copy: specify 1 or 0 to enable or disable copying during RX and TX.
Default value is 0 (disabled).
This option is used for emulating more realistic data transfer.
Copy size is equal to packet size.
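
The behaviour described above can be sketched roughly as follows. This is an illustration only and not the PMD's code: it uses malloc()/free() in place of mbuf pool operations, and struct pkt, struct null_queue, null_rx_burst() and null_tx_burst() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf. */
struct pkt {
	uint16_t len;
	uint8_t data[2048];
};

/* Per-queue settings mirroring the "size" and "copy" options. */
struct null_queue {
	uint16_t packet_size;	/* "size" option */
	int packet_copy;	/* "copy" option */
	uint8_t dummy[2048];	/* scratch buffer used when copy is enabled */
};

/* RX: allocate up to nb packets of the configured size and return them. */
static uint16_t
null_rx_burst(struct null_queue *q, struct pkt **bufs, uint16_t nb)
{
	uint16_t i;

	assert(q != NULL && bufs != NULL);
	assert(q->packet_size <= sizeof(q->dummy));

	for (i = 0; i < nb; i++) {
		bufs[i] = malloc(sizeof(*bufs[i]));
		if (bufs[i] == NULL)
			break;
		bufs[i]->len = q->packet_size;
		if (q->packet_copy)
			memcpy(bufs[i]->data, q->dummy, q->packet_size);
	}
	return i;
}

/* TX: optionally copy the payload out, then free every packet. */
static uint16_t
null_tx_burst(struct null_queue *q, struct pkt **bufs, uint16_t nb)
{
	uint16_t i;

	assert(q != NULL && bufs != NULL);

	for (i = 0; i < nb; i++) {
		if (q->packet_copy)
			memcpy(q->dummy, bufs[i]->data, bufs[i]->len);
		free(bufs[i]);
	}
	return i;
}
```

Because neither path touches hardware, the cost measured through such a device is dominated by buffer allocation, the optional copy, and whatever the application does per packet.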

To use the PMD, enable CONFIG_RTE_BUILD_SHARED_LIB in the config file, then
compile the PMD as a shared library. The library can be loaded with the '-d'
option when the application is invoked.

Here is an example.
$ sudo ./testpmd -c f -n 4 -d librte_pmd_null.so \
--vdev 'eth_null0' --vdev 'eth_null1' -- -i --no-flush-rx

If testpmd is compiled with CONFIG_RTE_BUILD_SHARED_LIB, it may be necessary
to specify more libraries using the '-d' option.

v11:
- Fix warning of checkpatch.pl.
v10:
- Fix driver initialization code not to return error when param is NULL.
v8:
 - Fix Makefile and add version map file.
   (Thanks to Qiu, Michael and Iremonger, Bernard)
v7:
 - Add parameter checkings.
   (Thanks to Iremonger, Bernard)
 - Remove needless "__rte_unused".
v4:
 - Fix memory leak.
   (Thanks to Iremonger, Bernard)

Signed-off-by: Tetsuya Mukawa 
---
 config/common_bsdapp |   5 +
 config/common_linuxapp   |   5 +
 lib/Makefile |   1 +
 lib/librte_pmd_null/Makefile |  62 +++
 lib/librte_pmd_null/rte_eth_null.c   | 545 +++
 lib/librte_pmd_null/rte_pmd_null_version.map |   4 +
 6 files changed, 622 insertions(+)
 create mode 100644 lib/librte_pmd_null/Makefile
 create mode 100644 lib/librte_pmd_null/rte_eth_null.c
 create mode 100644 lib/librte_pmd_null/rte_pmd_null_version.map

diff --git a/config/common_bsdapp b/config/common_bsdapp
index c24f687..50189d6 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -243,6 +243,11 @@ CONFIG_RTE_LIBRTE_BOND_DEBUG_ALB=n
 CONFIG_RTE_LIBRTE_BOND_DEBUG_ALB_L1=n

 #
+# Compile null PMD
+#
+CONFIG_RTE_LIBRTE_PMD_NULL=y
+
+#
 # Do prefetch of packet data within PMD driver receive function
 #
 CONFIG_RTE_PMD_PACKET_PREFETCH=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index d66b008..9ddd056 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -250,6 +250,11 @@ CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y
 CONFIG_RTE_LIBRTE_PMD_XENVIRT=n

 #
+# Compile null PMD
+#
+CONFIG_RTE_LIBRTE_PMD_NULL=y
+
+#
 # Do prefetch of packet data within PMD driver receive function
 #
 CONFIG_RTE_PMD_PACKET_PREFETCH=y
diff --git a/lib/Makefile b/lib/Makefile
index 6575a4e..5fcbb3c 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -54,6 +54,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio
 DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt
 DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += librte_pmd_null
 DIRS-$(CONFIG_RTE_LIBRTE_HASH) += librte_hash
 DIRS-$(CONFIG_RTE_LIBRTE_LPM) += librte_lpm
 DIRS-$(CONFIG_RTE_LIBRTE_ACL) += librte_acl
diff --git a/lib/librte_pmd_null/Makefile b/lib/librte_pmd_null/Makefile
new file mode 100644
index 000..6472015
--- /dev/null
+++ b/lib/librte_pmd_null/Makefile
@@ -0,0 +1,62 @@
+#   BSD LICENSE
+#
+#   Copyright (C) IGEL Co.,Ltd.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of IGEL Co.,Ltd. nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHE

[dpdk-dev] [PATCH v11 2/2] librte_pmd_null: Support port hotplug function

2015-02-23 Thread Tetsuya Mukawa
This patch adds port hotplug support to Null PMD.

v9:
 - Use rte_eth_dev_release_port() instead of rte_eth_dev_free().
v7:
 - Add parameter checkings.
   (Thanks to Iremonger, Bernard)
v6:
 - Fix a parameter of rte_eth_dev_free().
v4:
 - Fix commit title.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_pmd_null/rte_eth_null.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/lib/librte_pmd_null/rte_eth_null.c b/lib/librte_pmd_null/rte_eth_null.c
index aaa0839..bb10276 100644
--- a/lib/librte_pmd_null/rte_eth_null.c
+++ b/lib/librte_pmd_null/rte_eth_null.c
@@ -336,6 +336,13 @@ eth_stats_reset(struct rte_eth_dev *dev)
}
 }

+static struct eth_driver rte_null_pmd = {
+   .pci_drv = {
+   .name = "rte_null_pmd",
+   .drv_flags = RTE_PCI_DRV_DETACHABLE,
+   },
+};
+
 static void
 eth_queue_release(void *q)
 {
@@ -429,10 +436,12 @@ eth_dev_null_create(const char *name,
data->nb_tx_queues = (uint16_t)nb_tx_queues;
data->dev_link = pmd_link;
data->mac_addrs = &eth_addr;
+   strncpy(data->name, eth_dev->data->name, strlen(eth_dev->data->name));

eth_dev->data = data;
eth_dev->dev_ops = &ops;
eth_dev->pci_dev = pci_dev;
+   eth_dev->driver = &rte_null_pmd;

/* finally assign rx and tx ops */
if (packet_copy) {
@@ -536,10 +545,36 @@ rte_pmd_null_devinit(const char *name, const char *params)
return eth_dev_null_create(name, numa_node, packet_size, packet_copy);
 }

+static int
+rte_pmd_null_devuninit(const char *name)
+{
+   struct rte_eth_dev *eth_dev = NULL;
+
+   if (name == NULL)
+   return -EINVAL;
+
+   RTE_LOG(INFO, PMD, "Closing null ethdev on numa socket %u\n",
+   rte_socket_id());
+
+   /* find the ethdev entry */
+   eth_dev = rte_eth_dev_allocated(name);
+   if (eth_dev == NULL)
+   return -1;
+
+   rte_free(eth_dev->data->dev_private);
+   rte_free(eth_dev->data);
+   rte_free(eth_dev->pci_dev);
+
+   rte_eth_dev_release_port(eth_dev);
+
+   return 0;
+}
+
 static struct rte_driver pmd_null_drv = {
.name = "eth_null",
.type = PMD_VDEV,
.init = rte_pmd_null_devinit,
+   .uninit = rte_pmd_null_devuninit,
 };

 PMD_REGISTER_DRIVER(pmd_null_drv);
-- 
1.9.1
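
The uninit path in the patch above follows a validate, look up by name, free, release sequence. A self-contained sketch of that sequence is shown below, using a mock port table instead of the ethdev API. struct port, port_find() and port_uninit() are hypothetical. Note that the sketch returns distinct errno values for "bad argument" and "no such device", which is a choice made here; the patch returns -1 when the device is not found.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PORTS 4	/* hypothetical; stands in for RTE_MAX_ETHPORTS */

struct port {
	int attached;
	char name[32];
	void *priv;	/* stands in for data->dev_private */
};

static struct port ports[MAX_PORTS];

/* Find an attached port by name, or return NULL if there is none. */
static struct port *
port_find(const char *name)
{
	int i;

	assert(name != NULL);

	for (i = 0; i < MAX_PORTS; i++)
		if (ports[i].attached && strcmp(ports[i].name, name) == 0)
			return &ports[i];
	return NULL;
}

/* Tear down a port: validate, look up, free owned memory, release slot. */
static int
port_uninit(const char *name)
{
	struct port *p;

	if (name == NULL)
		return -EINVAL;
	p = port_find(name);
	if (p == NULL)
		return -ENODEV;
	free(p->priv);
	p->priv = NULL;
	p->attached = 0;	/* the slot can now be reused */
	return 0;
}
```

Releasing the slot last means a second uninit of the same name fails cleanly with "no such device" rather than freeing memory twice.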



[dpdk-dev] [PATCH] virtio: Add default_txconf

2015-02-23 Thread Takuya ASADA
When I tried to launch test-pmd on a Fedora 21 KVM guest, I got the following error:

Configuring Port 0 (socket 0)
Fail to configure port 0 tx queues
EAL: Error - exiting with code: 1
  Cause: Start ports failed

I found that the error is raised here, and the actual error message was
"TX checksum offload not supported":
http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_virtio/virtio_rxtx.c#n425

This patch adds default_txconf to the virtio PMD to avoid the error.

Signed-off-by: Takuya ASADA 
---
 lib/librte_pmd_virtio/virtio_ethdev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index b3b5bb6..9c183bb 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -1188,6 +1188,9 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
dev_info->min_rx_bufsize = VIRTIO_MIN_RX_BUFSIZE;
dev_info->max_rx_pktlen = VIRTIO_MAX_RX_PKTLEN;
dev_info->max_mac_addrs = VIRTIO_MAX_MAC_ADDRS;
+   dev_info->default_txconf = (struct rte_eth_txconf) {
+   .txq_flags = ETH_TXQ_FLAGS_NOOFFLOADS
+   };
 }

 /*
-- 
2.1.0
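
To illustrate why advertising a default TX configuration can avoid the failure reported above, here is a sketch with mock types. It assumes the queue setup code falls back to the driver's default when the caller supplies no configuration of its own; that flow is an assumption made for the example, not something shown in the patch. struct txconf, struct dev_info, drv_info_get(), effective_txq_flags() and the flag value are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value; the real definition lives in rte_ethdev.h. */
#define TXQ_FLAG_NOOFFLOADS 0x0f00u

struct txconf {
	uint32_t txq_flags;
	uint16_t tx_free_thresh;
};

struct dev_info {
	struct txconf default_txconf;
};

/* Driver side: advertise a default that disables TX offloads. */
static void
drv_info_get(struct dev_info *info)
{
	assert(info != NULL);

	/* Compound literal: members that are not named become zero. */
	info->default_txconf = (struct txconf) {
		.txq_flags = TXQ_FLAG_NOOFFLOADS
	};
}

/* Setup side: use the caller's config if given, else the driver default. */
static uint32_t
effective_txq_flags(const struct dev_info *info, const struct txconf *conf)
{
	assert(info != NULL);

	if (conf == NULL)
		conf = &info->default_txconf;
	return conf->txq_flags;
}
```

Without a driver-provided default, the fallback configuration would be all zeroes, which in this model requests offloads that the device then refuses.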



[dpdk-dev] [PATCH] Make -Werror optional

2015-02-23 Thread Panu Matilainen
On 02/21/2015 09:33 PM, Neil Horman wrote:
> On Fri, Feb 20, 2015 at 05:55:21PM -0800, Stephen Hemminger wrote:
>> On Thu, 12 Feb 2015 16:54:44 +0200
>> Panu Matilainen  wrote:
>>
>>> On 02/12/2015 04:38 PM, Stephen Hemminger wrote:
>>>> On Thu, 12 Feb 2015 13:13:22 +0200
>>>> Panu Matilainen  wrote:
>>>>
>>>>> This adds new CONFIG_RTE_ERROR_ON_WARNING config option to enable
>>>>> fail-on-warning compile behavior, defaulting to off.
>>>>>
>>>>> Failing build on warnings is a useful developer tool but it's bad
>>>>> for release tarballs which can and do get built with newer
>>>>> compilers than what was used/available during development. Compilers
>>>>> routinely add new warnings so code which built silently with cc X
>>>>> might no longer do so with X+1. This doesn't make the existing code
>>>>> any buggier, and failing the build in this case does not help
>>>>> improve code quality of an already released version either.
>>
>> Hopefully distro's like RHEL will build with -Werror enabled
>> and not allow build to go through with errors.
>>
> Thats usually what we do, yes.

Um, nope. All Fedora and RHEL builds use a common base set of flags
set centrally from the rpm configuration. That set includes, among
other things, -Wall but not -Werror, although since F21
-Werror=format-security is included because it produces relatively few
false positives.

The thing is, compiler warnings are just that: warnings, often with a
hefty dose of false positives. A good package maintainer will look at
the build logs of his/her packages, investigate warnings and send
patches upstream to address them in upcoming versions where actually
relevant, but generally a package maintainer in a distro is not
responsible for achieving a zero-warning build, nor should they be.

Take for example the set-but-not-used warning introduced in gcc 4.6.
Many of the warnings unearthed by it do indicate real potential issues
in the software (a typical example being ignoring an allegedly unlikely
error code from a syscall, sometimes dozens or even hundreds of them
for larger projects) but it's also typically code that's been in use
for years and years without anybody noticing. That code does not
suddenly become unusable just because it's now being compiled by a
compiler that detects this condition, and -Werror on distro level
misplaces the burden of addressing years of upstream laziness on a
packager who often has little to do with upstream.
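
For readers unfamiliar with the set-but-not-used case mentioned above, a small sketch of the pattern and one way to address it upstream follows. close_checked() is a hypothetical helper; the flagged form is shown only in the comment so that the example itself builds without warnings.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * The flagged pattern looks like this:
 *
 *     int ret;
 *     ret = close(fd);    (ret is assigned but never read)
 *
 * One fix is to actually use the value and report the error.
 */
static int
close_checked(int fd)
{
	int ret = close(fd);

	if (ret != 0)
		return -errno;	/* propagate the reason to the caller */
	return 0;
}
```

The old code worked for years because the error is rare; the warning only makes the ignored case visible.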

- Panu -


[dpdk-dev] [PATCH 0/8] Improve build process

2015-02-23 Thread Gonzalez Monroy, Sergio
On 22/02/2015 23:37, Neil Horman wrote:
> On Fri, Feb 20, 2015 at 02:31:36PM +, Gonzalez Monroy, Sergio wrote:
>> On 13/02/2015 12:51, Neil Horman wrote:
>>> On Fri, Feb 13, 2015 at 11:08:02AM +, Gonzalez Monroy, Sergio wrote:
 On 13/02/2015 10:14, Panu Matilainen wrote:
> On 02/12/2015 05:52 PM, Neil Horman wrote:
>> On Thu, Feb 12, 2015 at 04:07:50PM +0200, Panu Matilainen wrote:
>>> On 02/12/2015 02:23 PM, Neil Horman wrote:
> [...snip...]
>>> So I just realized that I was not taking into account a possible
>>> scenario, where
>>> we have an app built with static dpdk libs then loading a dso
>>> with -d
>>> option.
>>>
>>> In such case, because the pmd would have DT_NEEDED entries,
>>> dlopen will
>>> fail.
>>> So to enable such scenario we would need to build PMDs without
>>> DT_NEEDED
>>> entries.
>> Hmm, for that to be a problem you'd need to have the PMD built
>> against
>> shared dpdk libs and while the application is built against
>> static dpdk
>> libs. I dont think that's a supportable scenario in any case.
>>
>> Or is there some other scenario that I'm not seeing?
>>
>> - Panu -
>>
> I agree with you. I suppose it comes down to, do we want to
> support such
> scenario?
>
>  From what I can see, it seems that we do currently support such
> scenario by
> building dpdk apps against all static dpdk libs using
> --whole-archive (all
> libs and not only PMDs).
> http://dpdk.org/browse/dpdk/commit/?id=20afd76a504155e947c770783ef5023e87136ad8
>
>
> Am I misunderstanding this?
>
 Shoot, you're right, I missed the static build aspect to this.  Yes,
 if we do the following:

 1) Build the DPDK as a static library
 2) Link an application against (1)
 3) Use the dlopen mechanism to load a PMD built as a DSO

 Then the DT_NEEDED entries in the DSO will go unsatisfied, because
 the shared
 objects on which it (the PMD) depends will not exist in the file
 system.
>>> I think its even more twisty:
>>>
>>> 1) Build the DPDK as a static library
>>> 2) Link an application against (1)
>>> 3) Do another build of DPDK as a shared library
>>> 4) In app 2), use the dlopen mechanism to load a PMD built as a part
>>> of or
>>> against 3)
>>>
>>> Somehow I doubt this would work very well.
>>>
>> Ideally it should, presuming the ABI is preserved between (1) and (3),
>> though I
>> agree, up until recently, that was an assumption that was unreliable.
> Versioning is a big and important step towards reliability but there are
> more issues to solve. This of course getting pretty far from the original
> topic, but at least one such issue is that there are some cases where a
> config value affects what are apparently public structs (rte_mbuf wrt
> RTE_MBUF_REFCNT for example), which really is a no-go.
>
 Agree, the RTE_MBUF_REFCNT is something that needs to be dealt with asap.
 I'll look into it.

 I think the problem is a little bit orthogonal to the libdpdk_core
 problem you
 were initially addressing.  That is to say, this problem of
 dlopen-ed PMD's
 exists regardless of whether you build the DPDK as part of a static
 or dynamic
 library.  The problems just happen to intersect in their
 manipulation of the
 DT_NEEDED entries.

 Ok, so, given the above, I would say your approach is likely
 correct, just
 prevent DT_NEEDED entries from getting added to PMD's. Doing so will
 sidestep
 loading issue for libraries that may not exist in the filesystem,
 but thats ok,
 because by all rights, the symbols codified in those needed
 libraries should
 already be present in the running application (either made available
 by the
 application having statically linked them, or having the linker load
 them from
 the proper libraries at run time).
>>> My 5c is that I'd much rather see the common case (all static or all
>>> shared)
>>> be simple and reliable, which in case of DSOs includes no lying
>>> (whether by
>>> omission or otherwise) about DT_NEEDED, ever. That way the issue is
>>> dealt
>>> once where it belongs. If somebody wants to go down the rabbit hole of
>>> mixed
>>> shared + static linkage, let them dig the hole by themselves :)
>>>
>> This is a fair point.  Can DT_NEEDED sections be stripped via tools like
>> objcopy
>> after the build is complete?  If so, end users can hack this corner case
>> to wo

[dpdk-dev] [PATCH v11 08/13] ethdev: Add functions that will be used by port hotplug functions

2015-02-23 Thread Iremonger, Bernard
> -Original Message-
> From: Tetsuya Mukawa [mailto:mukawa at igel.co.jp]
> Sent: Monday, February 23, 2015 5:09 AM
> To: dev at dpdk.org
> Cc: Qiu, Michael; Iremonger, Bernard; maxime.leroy at 6wind.com; Tetsuya 
> Mukawa
> Subject: [PATCH v11 08/13] ethdev: Add functions that will be used by port 
> hotplug functions
> 
> The patch adds following functions.
> 
> - rte_eth_dev_save()
>   The function is used for saving current rte_eth_dev structures.
> - rte_eth_dev_get_changed_port()
>   The function receives the rte_eth_dev structures, then compare
>   these with current values to know which port is actually
>   attached or detached.
> - rte_eth_dev_get_addr_by_port()
>   The function returns a pci address of an ethdev specified by port
>   identifier.
> - rte_eth_dev_get_port_by_addr()
>   The function returns a port identifier of an ethdev specified by
>   pci address.
> - rte_eth_dev_get_name_by_port()
>   The function returns a unique identifier name of an ethdev
>   specified by port identifier.
> - Add rte_eth_dev_is_detachable()
>   The function returns whether a PMD supports detach function.
> 
> Also, the patch changes scope of rte_eth_dev_allocated() to global.
> This function will be called by virtual PMDs to support port hotplug.
> So change scope of the function to global.
> 
> v10:
> - Change order of version.map.
>   (Thanks to Thomas Monjalon)
> v9:
> - rte_eth_dev_check_detachable() is replaced by
>   rte_eth_dev_is_detachable().
> - strncpy() is replaced by strcpy().
>   (Thanks to Thomas Monjalon)
> - Add missing symbol in version map.
>   (Thanks to Neil Horman)
> v8:
> - Add size parameter to rte_eth_dev_save().
> - Add missing symbol in version map.
>   (Thanks to Qiu, Michael and Iremonger, Bernard)
> v7:
> - Add pt_driver checking to rte_eth_dev_check_detachable().
>   (Thanks to Qiu, Michael)
> v5:
> - Fix return value of below functions.
>   rte_eth_dev_get_changed_port().
>   rte_eth_dev_get_port_by_addr().
> v4:
> - Add parameter checking.
> v3:
> - Fix if-condition bug while comparing pci addresses.
> - Add error checking codes.
> Reported-by: Mark Enright 
> 
> Signed-off-by: Tetsuya Mukawa 
> ---
>  lib/librte_ether/rte_ethdev.c  | 103 
> -
>  lib/librte_ether/rte_ethdev.h  |  83 ++
>  lib/librte_ether/rte_ether_version.map |   7 +++
>  3 files changed, 192 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c 
> index 3d148e2..7067620
> 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -201,7 +201,7 @@ rte_eth_dev_data_alloc(void)
>   RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data));  
> }
> 
> -static struct rte_eth_dev *
> +struct rte_eth_dev *
>  rte_eth_dev_allocated(const char *name)  {
>   unsigned i;
> @@ -426,6 +426,107 @@ rte_eth_dev_count(void)
>   return (nb_ports);
>  }
> 
> +int
> +rte_eth_dev_save(struct rte_eth_dev *devs, size_t size) {
> + if ((devs == NULL) ||
> + (size != sizeof(struct rte_eth_dev) * RTE_MAX_ETHPORTS))
> + return -EINVAL;
> +
> + /* save current rte_eth_devices */
> + memcpy(devs, rte_eth_devices, size);
> + return 0;
> +}
> +
> +int
> +rte_eth_dev_get_changed_port(struct rte_eth_dev *devs, uint8_t
> +*port_id) {
> + if ((devs == NULL) || (port_id == NULL))
> + return -EINVAL;
> +
> + /* check which port was attached or detached */
> + for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++, devs++) {
> + if (rte_eth_devices[*port_id].attached ^ devs->attached)
> + return 0;
> + }
> + return -ENODEV;
> +}
> +
> +int
> +rte_eth_dev_get_addr_by_port(uint8_t port_id, struct rte_pci_addr
> +*addr) {
> + if (!rte_eth_dev_is_valid_port(port_id)) {
> + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> + return -EINVAL;
> + }
> +
> + if (addr == NULL) {
> + PMD_DEBUG_TRACE("Null pointer is specified\n");
> + return -EINVAL;
> + }
> +
> + *addr = rte_eth_devices[port_id].pci_dev->addr;
> + return 0;
> +}
> +
> +int
> +rte_eth_dev_get_port_by_addr(struct rte_pci_addr *addr, uint8_t
> +*port_id) {
> + struct rte_pci_addr *tmp;
> +
> + if ((addr == NULL) || (port_id == NULL)) {
> + PMD_DEBUG_TRACE("Null pointer is specified\n");
> + return -EINVAL;
> + }
> +
> + for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++) {
> + if (!rte_eth_devices[*port_id].attached)
> + continue;
> + if (!rte_eth_devices[*port_id].pci_dev)
> + continue;
> + tmp = &rte_eth_devices[*port_id].pci_dev->addr;
> + if (rte_eal_compare_pci_addr(tmp, addr) == 0)
> + return 0;
> + }
> + return -ENODEV;
> +}
> +
> +int
> +rte_eth_dev_get_name_
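
The save/compare approach used by rte_eth_dev_save() and rte_eth_dev_get_changed_port() in the quoted patch can be sketched in isolation as follows. This uses a mock device table rather than rte_eth_devices; struct dev, dev_save() and dev_get_changed_port() are hypothetical names. Unlike the patch, the sketch writes the output port only when a change is found.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MAX_PORTS 8	/* hypothetical; stands in for RTE_MAX_ETHPORTS */

struct dev {
	uint8_t attached;
};

static struct dev devices[MAX_PORTS];	/* stands in for rte_eth_devices */

/* Take a snapshot of the device table before an attach or detach. */
static int
dev_save(struct dev *snap, size_t size)
{
	if (snap == NULL || size != sizeof(devices))
		return -EINVAL;
	memcpy(snap, devices, size);
	return 0;
}

/* Report the first port whose attached state differs from the snapshot. */
static int
dev_get_changed_port(const struct dev *snap, uint8_t *port_id)
{
	uint8_t i;

	if (snap == NULL || port_id == NULL)
		return -EINVAL;

	for (i = 0; i < MAX_PORTS; i++) {
		if (devices[i].attached != snap[i].attached) {
			*port_id = i;
			return 0;
		}
	}
	return -ENODEV;
}
```

The intended use is: snapshot, run the driver's init or uninit, then compare to learn which port identifier the operation affected.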

[dpdk-dev] Testpmd returns error.

2015-02-23 Thread Bruce Richardson
On Mon, Feb 23, 2015 at 11:33:45AM +0900, Tetsuya Mukawa wrote:
> On 2015/02/23 5:46, Bruce Richardson wrote:
> > On Sun, Feb 22, 2015 at 02:30:02PM +0900, Tetsuya Mukawa wrote:
> >> Hi,
> >>
> >> In my environment, testpmd in latest master branch returns error like 
> >> below.
> >>
> >> $ sudo ./tools/dpdk_nic_bind.py -b igb_uio :02:00.0
> >> $ sudo ./x86_64-native-linuxapp-gcc/app/testpmd -c f -n 1 -- -i
> >> EAL: Detected lcore 0 as core 0 on socket 0
> >> EAL: Detected lcore 1 as core 1 on socket 0
> >> EAL: Detected lcore 2 as core 2 on socket 0
> >> EAL: Detected lcore 3 as core 3 on socket 0
> >> EAL: Detected lcore 4 as core 4 on socket 0
> >> EAL: Detected lcore 5 as core 5 on socket 0
> >> EAL: Detected lcore 6 as core 6 on socket 0
> >> EAL: Detected lcore 7 as core 7 on socket 0
> >> EAL: Support maximum 128 logical core(s) by configuration.
> >> EAL: Detected 8 lcore(s)
> >> EAL: VFIO modules not all loaded, skip VFIO support...
> >> EAL: Setting up memory...
> >> EAL: Ask a virtual area of 0x28000 bytes
> >> EAL: Virtual area found at 0x7ffd4000 (size = 0x28000)
> >> EAL: Requesting 10 pages of size 1024MB from socket 0
> >> EAL: TSC frequency is ~3991450 KHz
> >> EAL: Master core 0 is ready (tid=f7fd6840)
> >> PMD: ENICPMD trace: rte_enic_pmd_init
> >> EAL: Core 3 is ready (tid=f58e0700)
> >> EAL: Core 2 is ready (tid=f60e1700)
> >> EAL: Core 1 is ready (tid=f68e2700)
> >> EAL: PCI device :02:00.0 on NUMA socket -1
> >> EAL:   probe driver: 8086:10b9 rte_em_pmd
> >> EAL:   PCI memory mapped at 0x7fffc000
> >> EAL: pci_map_resource(): cannot mmap(23, 0x7fffc002, 0x2,
> >> 0x1000): Invalid argument (0x)
> >> EAL: Error - exiting with code: 1
> >>   Cause: Requested device :02:00.0 cannot be used
> >>
> >>
> >> I've run git-bisect, and it seems following commit cause this error.
> >>
> >> commit 4a499c64959074ba6fa6a5a2b3a2a6aa10627fa1
> >> Author: Danny Zhou 
> >> Date:   Fri Feb 20 16:59:15 2015 +
> >>
> >> eal/linux: enable uio_pci_generic support
> >>
> >> Someone, could you please check it?
> >>
> >> Thanks,
> >> Tetsuya
> >>
> > Hi Tetsuya,
> >
> > trying to reproduce the problem here, with no success so far with a mix of
> > 1G and 10G ports. Is there anything special about your environment that
> > might especially trigger this issue? Is it a VM or running on the host
> > machine etc.?
> 
> Hi Bruce,
> 
> I appreciate for your testing.
> 
> I've tried it on an another system, and I couldn't reproduce it.
> Could you please see below?
> 
> - The system I can reproduce the issue
> OS: Ubuntu14.04
> Kernel: Linux eris 3.13.0-30-generic
> CPU: AMD FX(tm)-8350 Eight-Core Processor
> NIC: Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet
> Controller (Copper) (rev 06)
> 
> - The system I cannot reproduce the issue
> OS: ubuntu14.04
> Kernel: Linux ubuntu-igel 3.13.0-30-generic
> Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
> NIC: Ethernet controller: Intel Corporation I350 Gigabit Network
> Connection (rev 01)
> 
> I will check it more in next testing phase.
> Until then, I will use the system I cannot reproduce the issue.
> 
> Regards,
> Tetsuya
> 
> > Regards,
> > /Bruce
> >
>
Thanks Tetsuya,
Declan has managed to find a board here that can reproduce the issue so we 
are now investigating possible solutions.
/Bruce


[dpdk-dev] [PATCH v4 2/5] ixgbe: enable rx queue interrupts for both PF and VF

2015-02-23 Thread Thomas Monjalon
2015-02-19 21:48, Zhou Danny:
> + /* set max interrupt vfio request */
> + pci_dev->intr_handle.max_intr = hw->mac.max_rx_queues +
> + IXGBE_MAX_OTHER_INTR;
> +

Compilation is broken here.


[dpdk-dev] [PATCH v4 2/5] ixgbe: enable rx queue interrupts for both PF and VF

2015-02-23 Thread Zhou, Danny
I noticed that the v4 patch conflicts with the latest code on the main branch
because a lot of code has been merged since, as Jun Xiao mentioned in a
previous email. I will rebase the patch and send out a v5.

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, February 23, 2015 7:20 PM
> To: Zhou, Danny
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 2/5] ixgbe: enable rx queue interrupts for 
> both PF and VF
> 
> 2015-02-19 21:48, Zhou Danny:
> > +   /* set max interrupt vfio request */
> > +   pci_dev->intr_handle.max_intr = hw->mac.max_rx_queues +
> > +   IXGBE_MAX_OTHER_INTR;
> > +
> 
> Compilation is broken here.


[dpdk-dev] [PATCH v4 4/5] eal: add per rx queue interrupt handling based on VFIO

2015-02-23 Thread Zhou, Danny

> -Original Message-
> From: Jun Xiao [mailto:jun.xiao at cloudnetengine.com]
> Sent: Saturday, February 21, 2015 10:57 AM
> To: Zhou, Danny
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 4/5] eal: add per rx queue interrupt 
> handling based on VFIO
> 
> On 02/19, Zhou Danny wrote:
> > v4 changes:
> > - Adjust position of new-added structure fields
> >
> > v3 changes:
> > - Fix review comments
> >
> > v2 changes:
> > - Fix compilation issue for a missed header file
> > - Bug fix: free unreleased resources on the exception path before return
> > - Consolidate coding style related review comments
> >
> > This patch does below:
> > - Create multiple VFIO eventfd for rx queues.
> > - Handle per rx queue interrupt.
> > - Eliminate unnecessary suspended DPDK polling thread wakeup mechanism
> > for rx interrupt by allowing polling thread epoll_wait rx queue
> > interrupt notification.
> >
> > Signed-off-by: Danny Zhou 
> > Tested-by: Yong Liu 
> > ---
> >  lib/librte_eal/common/include/rte_eal.h|  12 ++
> >  lib/librte_eal/linuxapp/eal/Makefile   |   1 +
> >  lib/librte_eal/linuxapp/eal/eal_interrupts.c   | 190 
> > -
> >  lib/librte_eal/linuxapp/eal/eal_pci_vfio.c |  12 +-
> >  .../linuxapp/eal/include/exec-env/rte_interrupts.h |   4 +
> 
> Hi Danny,
> 
> Could you rebase the patch to your commit 4a499c64959, otherwise
> rte_interrupts.h cannot be applied.
> 

Ok, will do it.

> Thanks,
> Jun
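
The quoted patch description mentions a polling thread waiting in epoll_wait() on per-queue eventfds. A Linux-only sketch of that mechanism with plain eventfd/epoll calls follows. It is not the EAL code: add_queue_event() and wait_rx_event() are hypothetical helpers, and the eventfd has to be signalled by hand here, whereas with VFIO the kernel would signal it on an interrupt.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Register one eventfd (standing in for one RX queue's interrupt). */
static int
add_queue_event(int epfd, int efd)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.fd = efd };

	return epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev);
}

/*
 * Wait until an eventfd is signalled or timeout_ms elapses. Returns the
 * accumulated event count, 0 on timeout, or -1 on error.
 */
static int64_t
wait_rx_event(int epfd, int timeout_ms)
{
	struct epoll_event ev;
	uint64_t count;
	int n;

	n = epoll_wait(epfd, &ev, 1, timeout_ms);
	if (n <= 0)
		return n;
	/* Reading an eventfd returns its counter and resets it to zero. */
	if (read(ev.data.fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return -1;
	return (int64_t)count;
}
```

A polling thread built this way can sleep while its queue is idle and wake only when that queue's descriptor fires, without a separate thread to relay the wakeup.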


[dpdk-dev] [PATCH v11 08/13] ethdev: Add functions that will be used by port hotplug functions

2015-02-23 Thread Tetsuya Mukawa
On 2015/02/23 20:01, Iremonger, Bernard wrote:
>> -Original Message-
>> From: Tetsuya Mukawa [mailto:mukawa at igel.co.jp]
>> Sent: Monday, February 23, 2015 5:09 AM
>> To: dev at dpdk.org
>> Cc: Qiu, Michael; Iremonger, Bernard; maxime.leroy at 6wind.com; Tetsuya 
>> Mukawa
>> Subject: [PATCH v11 08/13] ethdev: Add functions that will be used by port 
>> hotplug functions
>>
>> The patch adds following functions.
>>
>> - rte_eth_dev_save()
>>   The function is used for saving current rte_eth_dev structures.
>> - rte_eth_dev_get_changed_port()
>>   The function receives the rte_eth_dev structures, then compare
>>   these with current values to know which port is actually
>>   attached or detached.
>> - rte_eth_dev_get_addr_by_port()
>>   The function returns a pci address of an ethdev specified by port
>>   identifier.
>> - rte_eth_dev_get_port_by_addr()
>>   The function returns a port identifier of an ethdev specified by
>>   pci address.
>> - rte_eth_dev_get_name_by_port()
>>   The function returns a unique identifier name of an ethdev
>>   specified by port identifier.
>> - Add rte_eth_dev_is_detachable()
>>   The function returns whether a PMD supports detach function.
>>
>> Also, the patch changes scope of rte_eth_dev_allocated() to global.
>> This function will be called by virtual PMDs to support port hotplug.
>> So change scope of the function to global.
>>
>> v10:
>> - Change order of version.map.
>>   (Thanks to Thomas Monjalon)
>> v9:
>> - rte_eth_dev_check_detachable() is replaced by
>>   rte_eth_dev_is_detachable().
>> - strncpy() is replaced by strcpy().
>>   (Thanks to Thomas Monjalon)
>> - Add missing symbol in version map.
>>   (Thanks to Neil Horman)
>> v8:
>> - Add size parameter to rte_eth_dev_save().
>> - Add missing symbol in version map.
>>   (Thanks to Qiu, Michael and Iremonger, Bernard)
>> v7:
>> - Add pt_driver checking to rte_eth_dev_check_detachable().
>>   (Thanks to Qiu, Michael)
>> v5:
>> - Fix return value of below functions.
>>   rte_eth_dev_get_changed_port().
>>   rte_eth_dev_get_port_by_addr().
>> v4:
>> - Add parameter checking.
>> v3:
>> - Fix if-condition bug while comparing pci addresses.
>> - Add error checking codes.
>> Reported-by: Mark Enright 
>>
>> Signed-off-by: Tetsuya Mukawa 
>> ---
>>  lib/librte_ether/rte_ethdev.c  | 103 
>> -
>>  lib/librte_ether/rte_ethdev.h  |  83 ++
>>  lib/librte_ether/rte_ether_version.map |   7 +++
>>  3 files changed, 192 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c 
>> index 3d148e2..7067620
>> 100644
>> --- a/lib/librte_ether/rte_ethdev.c
>> +++ b/lib/librte_ether/rte_ethdev.c
>> @@ -201,7 +201,7 @@ rte_eth_dev_data_alloc(void)
>>  RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data));  
>> }
>>
>> -static struct rte_eth_dev *
>> +struct rte_eth_dev *
>>  rte_eth_dev_allocated(const char *name)  {
>>  unsigned i;
>> @@ -426,6 +426,107 @@ rte_eth_dev_count(void)
>>  return (nb_ports);
>>  }
>>
>> +int
>> +rte_eth_dev_save(struct rte_eth_dev *devs, size_t size) {
>> +if ((devs == NULL) ||
>> +(size != sizeof(struct rte_eth_dev) * RTE_MAX_ETHPORTS))
>> +return -EINVAL;
>> +
>> +/* save current rte_eth_devices */
>> +memcpy(devs, rte_eth_devices, size);
>> +return 0;
>> +}
>> +
>> +int
>> +rte_eth_dev_get_changed_port(struct rte_eth_dev *devs, uint8_t
>> +*port_id) {
>> +if ((devs == NULL) || (port_id == NULL))
>> +return -EINVAL;
>> +
>> +/* check which port was attached or detached */
>> +for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++, devs++) {
>> +if (rte_eth_devices[*port_id].attached ^ devs->attached)
>> +return 0;
>> +}
>> +return -ENODEV;
>> +}
>> +
>> +int
>> +rte_eth_dev_get_addr_by_port(uint8_t port_id, struct rte_pci_addr
>> +*addr) {
>> +if (!rte_eth_dev_is_valid_port(port_id)) {
>> +PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
>> +return -EINVAL;
>> +}
>> +
>> +if (addr == NULL) {
>> +PMD_DEBUG_TRACE("Null pointer is specified\n");
>> +return -EINVAL;
>> +}
>> +
>> +*addr = rte_eth_devices[port_id].pci_dev->addr;
>> +return 0;
>> +}
>> +
>> +int
>> +rte_eth_dev_get_port_by_addr(struct rte_pci_addr *addr, uint8_t
>> +*port_id) {
>> +struct rte_pci_addr *tmp;
>> +
>> +if ((addr == NULL) || (port_id == NULL)) {
>> +PMD_DEBUG_TRACE("Null pointer is specified\n");
>> +return -EINVAL;
>> +}
>> +
>> +for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++) {
>> +if (!rte_eth_devices[*port_id].attached)
>> +continue;
>> +if (!rte_eth_devices[*port_id].pci_dev)
>> +continue;
>> +tmp = &rte_eth_devices[*port_id].pci_dev->addr;
>> +if (rte_e

[dpdk-dev] Testpmd returns error.

2015-02-23 Thread Tetsuya Mukawa
On 2015/02/23 20:12, Bruce Richardson wrote:
> On Mon, Feb 23, 2015 at 11:33:45AM +0900, Tetsuya Mukawa wrote:
>> On 2015/02/23 5:46, Bruce Richardson wrote:
>>> On Sun, Feb 22, 2015 at 02:30:02PM +0900, Tetsuya Mukawa wrote:
 Hi,

 In my environment, testpmd in latest master branch returns error like 
 below.

 $ sudo ./tools/dpdk_nic_bind.py -b igb_uio :02:00.0
 $ sudo ./x86_64-native-linuxapp-gcc/app/testpmd -c f -n 1 -- -i
 EAL: Detected lcore 0 as core 0 on socket 0
 EAL: Detected lcore 1 as core 1 on socket 0
 EAL: Detected lcore 2 as core 2 on socket 0
 EAL: Detected lcore 3 as core 3 on socket 0
 EAL: Detected lcore 4 as core 4 on socket 0
 EAL: Detected lcore 5 as core 5 on socket 0
 EAL: Detected lcore 6 as core 6 on socket 0
 EAL: Detected lcore 7 as core 7 on socket 0
 EAL: Support maximum 128 logical core(s) by configuration.
 EAL: Detected 8 lcore(s)
 EAL: VFIO modules not all loaded, skip VFIO support...
 EAL: Setting up memory...
 EAL: Ask a virtual area of 0x28000 bytes
 EAL: Virtual area found at 0x7ffd4000 (size = 0x28000)
 EAL: Requesting 10 pages of size 1024MB from socket 0
 EAL: TSC frequency is ~3991450 KHz
 EAL: Master core 0 is ready (tid=f7fd6840)
 PMD: ENICPMD trace: rte_enic_pmd_init
 EAL: Core 3 is ready (tid=f58e0700)
 EAL: Core 2 is ready (tid=f60e1700)
 EAL: Core 1 is ready (tid=f68e2700)
 EAL: PCI device :02:00.0 on NUMA socket -1
 EAL:   probe driver: 8086:10b9 rte_em_pmd
 EAL:   PCI memory mapped at 0x7fffc000
 EAL: pci_map_resource(): cannot mmap(23, 0x7fffc002, 0x2,
 0x1000): Invalid argument (0x)
 EAL: Error - exiting with code: 1
   Cause: Requested device :02:00.0 cannot be used


 I've run git-bisect, and it seems following commit cause this error.

 commit 4a499c64959074ba6fa6a5a2b3a2a6aa10627fa1
 Author: Danny Zhou 
 Date:   Fri Feb 20 16:59:15 2015 +

 eal/linux: enable uio_pci_generic support

 Someone, could you please check it?

 Thanks,
 Tetsuya

>>> Hi Tetsuya,
>>>
>>> trying to reproduce the problem here, with no success so far with a mix of 
>>> 1G
>>> and 10G ports. Is there anything special about your environment that might 
>>> especially trigger this issue? Is it a VM or running on the host machine 
>>> etc.?
>> Hi Bruce,
>>
>> I appreciate your testing.
>>
>> I've tried it on another system, and I couldn't reproduce it.
>> Could you please see below?
>>
>> - The system I can reproduce the issue
>> OS: Ubuntu14.04
>> Kernel: Linux eris 3.13.0-30-generic
>> CPU: AMD FX(tm)-8350 Eight-Core Processor
>> NIC: Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet
>> Controller (Copper) (rev 06)
>>
>> - The system I cannot reproduce the issue
>> OS: ubuntu14.04
>> Kernel: Linux ubuntu-igel 3.13.0-30-generic
>> Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
>> NIC: Ethernet controller: Intel Corporation I350 Gigabit Network
>> Connection (rev 01)
>>
>> I will check it more in next testing phase.
>> Until then, I will use the system I cannot reproduce the issue.
>>
>> Regards,
>> Tetsuya
>>
>>> Regards,
>>> /Bruce
>>>
> Thanks Tetsuya,
> Declan has managed to find a board here that can reproduce the issue so we 
> are now investigating possible solutions.
> /Bruce

Hi Bruce,

Thank you so much!

Tetsuya




[dpdk-dev] [PATCH v11 08/13] ethdev: Add functions that will be used by port hotplug functions

2015-02-23 Thread Iremonger, Bernard
> >> the pointer diff --git a/lib/librte_ether/rte_ether_version.map
> >> b/lib/librte_ether/rte_ether_version.map
> >> index f66fd2d..099c769 100644
> >> --- a/lib/librte_ether/rte_ether_version.map
> >> +++ b/lib/librte_ether/rte_ether_version.map
> >> @@ -6,6 +6,7 @@ DPDK_2.0 {
> >>rte_eth_allmulticast_enable;
> >>rte_eth_allmulticast_get;
> >>rte_eth_dev_allocate;
> >> +  rte_eth_dev_allocated;
> >>rte_eth_dev_bypass_event_show;
> >>rte_eth_dev_bypass_event_store;
> >>rte_eth_dev_bypass_init;
> >> @@ -32,9 +33,14 @@ DPDK_2.0 {
> >>rte_eth_dev_filter_supported;
> >>rte_eth_dev_flow_ctrl_get;
> >>rte_eth_dev_flow_ctrl_set;
> >> +  rte_eth_dev_get_addr_by_port;
> >> +  rte_eth_dev_get_changed_port;
> > Hi Tetsuya,
> >
> > rte_eth_dev_get_device_type;
> >
> > needs to be added to rte_ether_version map to solve linking issue.
> 
> Hi Bernard,
> 
> Thanks.
> Could you please let me know how can I check this linking issue on my 
> environment?
> I will add it on my test.
> 
> Thanks,
> Tetsuya
> 

Hi Tetsuya,

With "CONFIG_RTE_BUILD_SHARED_LIB=y" set in config/common_linuxapp, the
following linking error occurs:

  LD test
/root/dpdk_sforge_2/x86_64-native-linuxapp-gcc/lib/librte_eal.so: undefined 
reference to `rte_eth_dev_get_device_type'
collect2: error: ld returned 1 exit status
make[5]: *** [test] Error 1
make[4]: *** [test] Error 2
make[3]: *** [app] Error 2
make[2]: *** [all] Error 2
make[1]: *** [x86_64-native-linuxapp-gcc_install] Error 2
make: *** [install] Error 2

Regards,

Bernard.
> >
> > Regards,
> >
> > Bernard.
> >
> >
> >>rte_eth_dev_get_mtu;
> >> +  rte_eth_dev_get_name_by_port;
> >> +  rte_eth_dev_get_port_by_addr;
> >>rte_eth_dev_get_vlan_offload;
> >>rte_eth_dev_info_get;
> >> +  rte_eth_dev_is_detachable;
> >>rte_eth_dev_mac_addr_add;
> >>rte_eth_dev_mac_addr_remove;
> >>rte_eth_dev_priority_flow_ctrl_set;
> >> @@ -44,6 +50,7 @@ DPDK_2.0 {
> >>rte_eth_dev_rss_reta_update;
> >>rte_eth_dev_rx_queue_start;
> >>rte_eth_dev_rx_queue_stop;
> >> +  rte_eth_dev_save;
> >>rte_eth_dev_set_link_down;
> >>rte_eth_dev_set_link_up;
> >>rte_eth_dev_set_mtu;
> >> --
> >> 1.9.1



[dpdk-dev] [PATCH v11 08/13] ethdev: Add functions that will be used by port hotplug functions

2015-02-23 Thread Tetsuya Mukawa
On 2015/02/23 20:39, Iremonger, Bernard wrote:
 the pointer diff --git a/lib/librte_ether/rte_ether_version.map
 b/lib/librte_ether/rte_ether_version.map
 index f66fd2d..099c769 100644
 --- a/lib/librte_ether/rte_ether_version.map
 +++ b/lib/librte_ether/rte_ether_version.map
 @@ -6,6 +6,7 @@ DPDK_2.0 {
rte_eth_allmulticast_enable;
rte_eth_allmulticast_get;
rte_eth_dev_allocate;
 +  rte_eth_dev_allocated;
rte_eth_dev_bypass_event_show;
rte_eth_dev_bypass_event_store;
rte_eth_dev_bypass_init;
 @@ -32,9 +33,14 @@ DPDK_2.0 {
rte_eth_dev_filter_supported;
rte_eth_dev_flow_ctrl_get;
rte_eth_dev_flow_ctrl_set;
 +  rte_eth_dev_get_addr_by_port;
 +  rte_eth_dev_get_changed_port;
>>> Hi Tetsuya,
>>>
>>> rte_eth_dev_get_device_type;
>>>
>>> needs to be added to rte_ether_version map to solve linking issue.
>> Hi Bernard,
>>
>> Thanks.
>> Could you please let me know how can I check this linking issue on my 
>> environment?
>> I will add it on my test.
>>
>> Thanks,
>> Tetsuya
>>
> Hi Tetsuya,
>
> In config/common_linuxapp
> With "CONFIG_RTE_BUILD_SHARED_LIB=y"  the follow linking error is occurring
>
>   LD test
> /root/dpdk_sforge_2/x86_64-native-linuxapp-gcc/lib/librte_eal.so: undefined 
> reference to `rte_eth_dev_get_device_type'
> collect2: error: ld returned 1 exit status
> make[5]: *** [test] Error 1
> make[4]: *** [test] Error 2
> make[3]: *** [app] Error 2
> make[2]: *** [all] Error 2
> make[1]: *** [x86_64-native-linuxapp-gcc_install] Error 2
> make: *** [install] Error 2

Okay, I could reproduce it.

Thanks,
Tetsuya

> Regards,
>
> Bernard.
>>> Regards,
>>>
>>> Bernard.
>>>
>>>
rte_eth_dev_get_mtu;
 +  rte_eth_dev_get_name_by_port;
 +  rte_eth_dev_get_port_by_addr;
rte_eth_dev_get_vlan_offload;
rte_eth_dev_info_get;
 +  rte_eth_dev_is_detachable;
rte_eth_dev_mac_addr_add;
rte_eth_dev_mac_addr_remove;
rte_eth_dev_priority_flow_ctrl_set;
 @@ -44,6 +50,7 @@ DPDK_2.0 {
rte_eth_dev_rss_reta_update;
rte_eth_dev_rx_queue_start;
rte_eth_dev_rx_queue_stop;
 +  rte_eth_dev_save;
rte_eth_dev_set_link_down;
rte_eth_dev_set_link_up;
rte_eth_dev_set_mtu;
 --
 1.9.1



[dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application

2015-02-23 Thread Thomas Monjalon
2015-02-20 15:46, Jastrzebski, MichalX K:
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > On Thu, Feb 19, 2015 at 01:18:41PM +0100, Pawel Wodkowski wrote:
> > > Hi community,
> > > I would like to introduce a library for measuring the load of some
> > > arbitrary jobs. It can be used to profile every kind of job set on any
> > > arbitrary execution unit or tasking library.
> > >
> > > In the provided l2fwd-headroom example I demonstrate how to use this
> > > library to select the optimal rx burst poll time. Jobs are selected by
> > > using existing rte_timer library calls. This example does not limit the
> > > possible schemes in which this library can be used.
> > >
> > > Pawel Wodkowski (3):
> > >   librte_headroom: New library for checking core/system/app load
> > >   examples: introduce new l2fwd-headroom example
> > >   MAINTAINERS: claim responsibility for headroom library and example app
> >
> > I'm sorry but I still fail to see how this is a particularly useful
> > library.  It clearly works fine, but it composes an application event loop
> > in its own terms, and measures stats based on that.  While that's ok, any
> > application is already going to have to write its own event loop, and can
> > make the same measurements synchronously within that loop, using a lot less
> > code to optimize its polling time.
> > 
> > In other words, I think this is one of those cases where this library is
> > probably somewhat useful for anyone who just wants to write an application
> > in terms of the semantics exposed by this library, but not at all useful
> > for anyone else.  I'd personally rather not have the extra code to maintain
> > here.
> > 
> > Stephen just gave a presentation at netdev about some of the performance
> > optimization measurements Brocade did with DPDK and how they fine tuned
> > their environment.  One of the big takeaways for me was that making time
> > based measurements (especially if using the tsc) created cpu stalls that
> > skewed the measurements, and so the best optimizations they made avoided
> > time measurements, opting instead for packet count metrics.
> > 
> > Neil
> 
> Hi Neil,
> 
> I think this library offers something quite useful, probably not for everyone,
> but for many people who use DPDK, and it measures quite accurately
> how many spare cycles a CPU has after executing any serial tasks (as you
> will know).
> If you look at two places in the example application, the main_loop()
> and l2fwd_fwd() functions, you will see two possible approaches there, but
> this is not limited to that. You can even nest headroom objects and measure
> the processing time of particular packet types.
> Of course, this will add an overhead due to the measurements,
> but that time is also measured, so any user can know the relative
> time "wasted" on measuring all this.
> If time delays are measured in bigger timestamps and are handled reliably,
> the cost of measuring will be low.
> I find this quite similar to the power library case. I would say that library
> is not useful for every application, but there are several cases where it can
> be (as demonstrated with the l3fwd-power app).
> 
> About your last bit, I am not sure I understood it right, but in the case of
> the included sample app, the main measurement to see if we are overusing a CPU
> is the packet count in a queue (in this case the RX queue), and I believe this
> should be used for other apps, especially those that use a pipeline model,
> where queues and rings are the key part.
> 
> As a final point, last week (12th of February), there was a request for a
> tool/library like this from a user on the mailing list (Ilan Borenshtein),
> which indicates that this would be useful (probably not just for him, but for
> others). It could probably be achieved by the user by adding their own code,
> but I believe this library would be good to have, in case a user is looking
> for an easy way to calculate the above.
> Let us give the users an example of this method and we will expand it with a
> more advanced application that may show the capabilities of dynamic load
> scaling based on headroom library measurement.

I wonder how this library is related to DPDK.
I'm not against its integration, though the question must be asked.
DPDK is a set of libraries. What kind of library fits DPDK's goals and
deserves to be integrated?

I don't know whether it's related, but nobody has acknowledged this patchset.

I also feel that the name of this library is a bit too vague. Some people
were asking at first what "headroom" means. It is actually for CPU headroom
monitoring.
What about "cpuheadroom", "cpuheadroomstat", or "jobstat"?

Last comment, less important: like many of your colleagues, you don't pay
attention to the copyright dates. I'm pretty sure this code was not written
in 2010. So why claim it?



[dpdk-dev] [PATCH v4 4/5] eal: add per rx queue interrupt handling based on VFIO

2015-02-23 Thread Zhou, Danny

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Saturday, February 21, 2015 6:44 AM
> To: Zhou, Danny; Gonzalez Monroy, Sergio
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 4/5] eal: add per rx queue interrupt 
> handling based on VFIO
> 
> 2015-02-19 21:48, Zhou Danny:
> > v4 changes:
> > - Adjust position of new-added structure fields
> >
> > v3 changes:
> > - Fix review comments
> >
> > v2 changes:
> > - Fix compilation issue for a missed header file
> > - Bug fix: free unreleased resources on the exception path before return
> > - Consolidate coding style related review comments
> >
> > This patch does below:
> > - Create multiple VFIO eventfd for rx queues.
> > - Handle per rx queue interrupt.
> > - Eliminate unnecessary suspended DPDK polling thread wakeup mechanism
> > for rx interrupt by allowing polling thread epoll_wait rx queue
> > interrupt notification.
> >
> > Signed-off-by: Danny Zhou 
> > Tested-by: Yong Liu 
> [...]
> > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > @@ -43,6 +43,7 @@ CFLAGS += -I$(SRCDIR)/include
> >  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
> >  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
> >  CFLAGS += -I$(RTE_SDK)/lib/librte_ring
> > +CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf
> >  CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
> >  CFLAGS += -I$(RTE_SDK)/lib/librte_malloc
> >  CFLAGS += -I$(RTE_SDK)/lib/librte_ether
> 
> Why do we need mbuf in EAL?

The file eal_interrupts.c includes rte_ethdev.h, which declares the
rte_eth_devices array that EAL needs in order to get the per-port intr_handle.
rte_ethdev.h in turn includes rte_mbuf.h, so the Makefile is updated here.


[dpdk-dev] [PATCH v2 4/7] rte_sched: don't clear statistics when read

2015-02-23 Thread Dumitrescu, Cristian


> -Original Message-
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Saturday, February 21, 2015 1:53 AM
> To: Dumitrescu, Cristian
> Cc: Thomas Monjalon; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 4/7] rte_sched: don't clear statistics when
> read
> 
> On Fri, 20 Feb 2015 21:28:55 +
> "Dumitrescu, Cristian"  wrote:
> 
> > Agree.
> > Stephen, how about a run-time solution (I agree it would be much better,
> why did I not consider this in the first place?) of adding a new bool 
> parameter
> in struct rte_sched_port_params: clear_stats_on_reset?
> > Both stats read function get the port handle (struct rte_sched_port *) as
> parameter, so there should be no ripple effect to propagate this flag.
> 
> Why not read_and_clear function if absolutely necessary.

I am not sure I understand what you mean. Are you suggesting a different set of
stats read functions, one that reads and clears, while the other reads
without clearing?

Personally, I think we should avoid proliferating the number of stats
functions. I would keep a single set of stats read functions, which clear
the stats or not depending on the behaviour configured per rte_sched object at
creation time. Basically, based on the value of the configuration parameter
struct rte_sched_port_params::clear_stats_on_reset, the stats read functions
either clear the counters or leave them. In my opinion, this allows a clean
init-time selection of the required behaviour, and it also provides backward
compatibility. Any issues with this approach?
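
A minimal sketch of the single-read-function idea described above. The
types and function names here (sketch_port, sketch_port_init,
sketch_port_stats_read) are hypothetical stand-ins and are not the real
rte_sched API; only the flag name clear_stats_on_reset comes from the
proposal. It assumes the flag is copied into the object once at creation
time and consulted on every stats read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; these are not the real rte_sched types. */
struct sketch_stats {
	uint64_t n_pkts;
	uint64_t n_bytes;
};

struct sketch_port_params {
	bool clear_stats_on_reset; /* flag name taken from the proposal */
};

struct sketch_port {
	struct sketch_port_params params; /* fixed at creation time */
	struct sketch_stats stats;
};

/* Creation: remember the configured behaviour and start from zero. */
void
sketch_port_init(struct sketch_port *port,
		 const struct sketch_port_params *params)
{
	assert(port != NULL && params != NULL);
	port->params = *params;
	memset(&port->stats, 0, sizeof(port->stats));
}

/* Copy the counters out; clear them only if the port was configured to. */
int
sketch_port_stats_read(struct sketch_port *port, struct sketch_stats *out)
{
	if (port == NULL || out == NULL)
		return -1;

	*out = port->stats;
	if (port->params.clear_stats_on_reset)
		memset(&port->stats, 0, sizeof(port->stats));
	return 0;
}
```

With this shape, the read function keeps one signature for both
behaviours, and the choice between them is made once when the object is
created.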




[dpdk-dev] [PATCH v12 00/13] Port Hotplug Framework

2015-02-23 Thread Tetsuya Mukawa
This patch series adds a dynamic port hotplug framework to DPDK.
With the patches, DPDK apps can attach or detach ports at runtime.

The basic concepts of port hotplug are as follows.
- DPDK apps are responsible for managing ports.
  DPDK apps only know which ports are attached or detached at the moment.
  The port hotplug framework is implemented to allow DPDK apps to manage ports.
  For example, when a DPDK app calls the port attach function, the attached
  port number is returned. A DPDK app can also detach a port by port number.
- Kernel support is needed for attaching or detaching physical device ports.
  To attach a new physical device port, the device must first be recognized
  by a userspace direct I/O framework in the kernel. Then DPDK apps can
  call the port hotplug functions to attach ports.
  For detaching, the steps are reversed.
- Before detaching ports, the ports must be stopped and closed.
  A DPDK application must call rte_eth_dev_stop() and rte_eth_dev_close() before
  detaching ports. These functions call the finalization code of PMDs.
  But so far, no PMD frees all resources allocated by initialization.
  This means PMDs need to be fixed to support port hotplug.
  'RTE_PCI_DRV_DETACHABLE' is a new flag indicating that a PMD supports
  detaching. Without this flag, detaching will fail.
- Legacy DPDK apps must not be affected.
  No DPDK EAL behavior is changed if the port hotplug functions aren't called.
  So all legacy DPDK apps can still work without modifications.

And a few limitations.
- The port hotplug functions are not thread safe.
  DPDK apps should handle it.
- Only Linux and igb_uio are supported so far.
  BSD and VFIO are not supported. I will send VFIO patches at least, but I don't
  have a plan to submit a BSD patch so far.


Here are the port hotplug APIs.
---
/**
 * Attach a new device.
 *
 * @param devargs
 *   A pointer to a string describing the new device
 *   to be attached. The strings should be a pci address like
 *   ':01:00.0' or virtual device name like 'eth_pcap0'.
 * @param port_id
 *  A pointer to a port identifier actually attached.
 * @return
 *  0 on success and port_id is filled, negative on error
 */
int rte_eal_dev_attach(const char *devargs, uint8_t *port_id);

/**
 * Detach a device.
 *
 * @param port_id
 *   The port identifier of the device to detach.
 * @param devname
 *  A pointer to a device name actually detached.
 * @return
 *  0 on success and devname is filled, negative on error
 */
int rte_eal_dev_detach(uint8_t port_id, char *devname);
---

This patch series is for DPDK EAL. For DPDK apps to use the port hotplug
functions, each PMD must be fixed to support the 'RTE_PCI_DRV_DETACHABLE' flag.
Please check the patch for the pcap PMD.

Also, please check the testpmd patch. It shows how to fix your legacy
applications to support the port hotplug feature.
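
The attach/detach flow described in this cover letter can be modelled
roughly as below. This is a self-contained toy, not DPDK code: every
sketch_* name is hypothetical, the port table is a plain array, and the
"closed" and "detachable" fields only stand in for the
rte_eth_dev_stop()/rte_eth_dev_close() requirement and the
RTE_PCI_DRV_DETACHABLE flag. Unlike rte_eal_dev_detach() above, the
sketch passes an explicit buffer length for the device name; that
parameter is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_PORTS 4
#define SKETCH_NAME_LEN  32

enum sketch_state { SKETCH_DETACHED = 0, SKETCH_ATTACHED };

struct sketch_port {
	enum sketch_state state;
	int closed;     /* set once the app has stopped and closed the port */
	int detachable; /* models a driver that advertises detach support */
	char name[SKETCH_NAME_LEN];
};

static struct sketch_port sketch_ports[SKETCH_MAX_PORTS];

/* Attach: take the first detached slot and report its index as port id. */
int
sketch_dev_attach(const char *devargs, uint8_t *port_id)
{
	unsigned i;

	if (devargs == NULL || port_id == NULL)
		return -1;

	for (i = 0; i < SKETCH_MAX_PORTS; i++) {
		struct sketch_port *p = &sketch_ports[i];

		if (p->state != SKETCH_DETACHED)
			continue;
		snprintf(p->name, sizeof(p->name), "%s", devargs);
		p->state = SKETCH_ATTACHED;
		p->closed = 0;
		p->detachable = 1;
		*port_id = (uint8_t)i;
		return 0;
	}
	return -1; /* no free slot */
}

/* Stand-in for the stop + close sequence required before detaching. */
int
sketch_dev_stop_and_close(uint8_t port_id)
{
	if (port_id >= SKETCH_MAX_PORTS ||
	    sketch_ports[port_id].state != SKETCH_ATTACHED)
		return -1;
	sketch_ports[port_id].closed = 1;
	return 0;
}

/* Detach: refuse unless the port is attached, closed and detachable. */
int
sketch_dev_detach(uint8_t port_id, char *devname, size_t len)
{
	struct sketch_port *p;

	if (port_id >= SKETCH_MAX_PORTS || devname == NULL || len == 0)
		return -1;
	p = &sketch_ports[port_id];
	if (p->state != SKETCH_ATTACHED || !p->closed || !p->detachable)
		return -1;

	snprintf(devname, len, "%s", p->name);
	memset(p, 0, sizeof(*p)); /* the slot can now be reused */
	assert(p->state == SKETCH_DETACHED);
	return 0;
}
```

In this toy, a detached slot is handed out again by the next attach, so
an application cannot assume that port numbers only ever grow.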

PATCH v12 changes
 - Add missing symbol in version map.
   (Thanks to Iremonger, Bernard)

PATCH v11 changes
 - Remove needless devargs handling codes.
 - Replace get_vdev_name() by rte_eal_parse_devargs_str().
 - Replace rte_eal_vdev_find_and_init by rte_eal_vdev_init()
 - Replace rte_eal_vdev_find_and_uninit by rte_eal_vdev_uninit()
 - Fix rte_eal_dev_init() to use rte_eal_vdev_init().
 - Remove needless patch.
   (Thanks to Maxime Leroy)

PATCH v10 changes
 - Add comments.
 - Change order of version.map.
 - Fix comment of "rte_ethdev.h".
   (Thanks to Thomas Monjalon)
 - Add size parameter to rte_eth_dev_create_unique_device_name().
   (Thanks to Iremonger, Bernard)

PATCH v9 changes
 - Fix commit title.
 - Fix commit log.
 - Fix comments.
 - Define CONFIG_RTE_LIBRTE_EAL_HOTPLUG at the top of this patch series.
 - DEV_INVALID/VALID are removed.
 - DEV_DISCONNECTED is replaced by DEV_DETACHED.
 - DEV_CONNECTED is replaced by DEV_ATTACHED.
 - rte_eth_dev_allocate_new_port() is renamed to
   rte_eth_dev_find_free_port().
 - rte_eth_dev_validate_port() is renamed to rte_eth_dev_is_valid_port().
 - rte_eth_dev_is_valid_port() is changed not to handle log toggle.
 - eal_compare_pci_addr() is replaced by rte_eal_compare_pci_addr().
 - rte_eth_dev_free() is replaced by rte_eth_dev_release_port().
 - Add a function to create a unique device name.
 - Change parameter of pci_devuninit_t and rte_eth_dev_uninit.
 - Remove code that initializes callback of ethdev from
   rte_eth_dev_uninit().
 - Remove pci_unmap_device(). It will be implemented in later patch.
 - rte_eth_dev_check_detachable() is replaced by
   rte_eth_dev_is_detachable().
 - strncpy() is replaced by strcpy().
 - Implement pci_unmap_device() in this patch.
 - Remove "rte_dev_hotplug.h".
 - Remove needless "#ifdef".
 - Remove RTE_EAL_INVOKE_TYPE_PROBE/CLOSE.
 - RTE_ETH_DEV_PHYSICAL is replaced by RTE_ETH_DEV_PCI.
 - Use strcmp() instead of strncmp().
 - Remove RTE_EAL_INVOKE_TYPE_PROBE/CLOSE.
   (Thanks to Thomas 

[dpdk-dev] [PATCH v12 01/13] eal: Enable port Hotplug framework in Linux

2015-02-23 Thread Tetsuya Mukawa
The patch adds CONFIG_RTE_LIBRTE_EAL_HOTPLUG to the Linux and BSD
configurations. So far, the hotplug functions only support Linux.

v9:
- Move this patch at the top of this patch series.
  (Thanks to Thomas Monjalon)

Signed-off-by: Tetsuya Mukawa 
---
 config/common_bsdapp   | 6 ++
 config/common_linuxapp | 5 +
 2 files changed, 11 insertions(+)

diff --git a/config/common_bsdapp b/config/common_bsdapp
index 4c0cfc0..c24f687 100644
--- a/config/common_bsdapp
+++ b/config/common_bsdapp
@@ -116,6 +116,12 @@ CONFIG_RTE_LIBRTE_EAL_BSDAPP=y
 CONFIG_RTE_LIBRTE_EAL_LINUXAPP=n

 #
+# Compile Environment Abstraction Layer to support hotplug
+# So far, Hotplug functions only support linux
+#
+CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n
+
+#
 # Compile Environment Abstraction Layer to support Vmware TSC map
 #
 CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=y
diff --git a/config/common_linuxapp b/config/common_linuxapp
index 0234236..d66b008 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -114,6 +114,11 @@ CONFIG_RTE_PCI_MAX_READ_REQUEST_SIZE=0
 CONFIG_RTE_LIBRTE_EAL_LINUXAPP=y

 #
+# Compile Environment Abstraction Layer to support hotplug
+#
+CONFIG_RTE_LIBRTE_EAL_HOTPLUG=y
+
+#
 # Compile Environment Abstraction Layer to support Vmware TSC map
 #
 CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=y
-- 
1.9.1



[dpdk-dev] [PATCH v12 02/13] eal_pci: Add flag to hold kernel driver type

2015-02-23 Thread Tetsuya Mukawa
From: Michael Qiu 

Currently, DPDK has no way to know which type of driver
(vfio-pci/igb_uio/uio_pci_generic) a device uses. It can only
check statically whether VFIO is enabled or not.

It is really useful to have the flag, because different driver types
need to be handled differently at runtime, for example for PCI memory
mapping, port hotplug, and so on.

This patch adds a flag field to the PCI device to solve the above issue.

Signed-off-by: Michael Qiu 
Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/include/rte_pci.h |  8 +
 lib/librte_eal/linuxapp/eal/eal_pci.c   | 53 +++--
 2 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index 4301c16..5e0ba00 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -142,6 +142,13 @@ struct rte_pci_addr {

 struct rte_devargs;

+enum rte_pt_driver {
+   RTE_PT_UNKNOWN  = 0,
+   RTE_PT_IGB_UIO  = 1,
+   RTE_PT_VFIO = 2,
+   RTE_PT_UIO_GENERIC  = 3,
+};
+
 /**
  * A structure describing a PCI device.
  */
@@ -155,6 +162,7 @@ struct rte_pci_device {
uint16_t max_vfs;   /**< sriov enable if not zero */
int numa_node;  /**< NUMA node connection */
struct rte_devargs *devargs;/**< Device user arguments */
+   enum rte_pt_driver pt_driver;   /**< Driver of passthrough */
 };

 /** Any PCI device identifier (vendor, device, ...) */
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 63bcbce..9fe2851 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -97,6 +97,35 @@ error:
return -1;
 }

+static int
+pci_get_kernel_driver_by_path(const char *filename, char *dri_name)
+{
+   int count;
+   char path[PATH_MAX];
+   char *name;
+
+   if (!filename || !dri_name)
+   return -1;
+
+   count = readlink(filename, path, PATH_MAX);
+   if (count >= PATH_MAX)
+   return -1;
+
+   /* For device does not have a driver */
+   if (count < 0)
+   return 1;
+
+   path[count] = '\0';
+
+   name = strrchr(path, '/');
+   if (name) {
+   strncpy(dri_name, name + 1, strlen(name + 1) + 1);
+   return 0;
+   }
+
+   return -1;
+}
+
 void *
 pci_find_max_end_va(void)
 {
@@ -220,11 +249,12 @@ pci_scan_one(const char *dirname, uint16_t domain, 
uint8_t bus,
char filename[PATH_MAX];
unsigned long tmp;
struct rte_pci_device *dev;
+   char driver[PATH_MAX];
+   int ret;

dev = malloc(sizeof(*dev));
-   if (dev == NULL) {
+   if (dev == NULL)
return -1;
-   }

memset(dev, 0, sizeof(*dev));
dev->addr.domain = domain;
@@ -303,6 +333,25 @@ pci_scan_one(const char *dirname, uint16_t domain, uint8_t 
bus,
return -1;
}

+   /* parse driver */
+   snprintf(filename, sizeof(filename), "%s/driver", dirname);
+   ret = pci_get_kernel_driver_by_path(filename, driver);
+   if (!ret) {
+   if (!strcmp(driver, "vfio-pci"))
+   dev->pt_driver = RTE_PT_VFIO;
+   else if (!strcmp(driver, "igb_uio"))
+   dev->pt_driver = RTE_PT_IGB_UIO;
+   else if (!strcmp(driver, "uio_pci_generic"))
+   dev->pt_driver = RTE_PT_UIO_GENERIC;
+   else
+   dev->pt_driver = RTE_PT_UNKNOWN;
+   } else if (ret < 0) {
+   RTE_LOG(ERR, EAL, "Fail to get kernel driver\n");
+   free(dev);
+   return -1;
+   } else
+   dev->pt_driver = RTE_PT_UNKNOWN;
+
/* device is valid, add in list (sorted) */
if (TAILQ_EMPTY(&pci_device_list)) {
TAILQ_INSERT_TAIL(&pci_device_list, dev, next);
-- 
1.9.1
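
For readers following the driver-type detection in the patch above, here
is a minimal standalone sketch of the classification step. The enum is a
local copy of the one the patch adds to rte_pci.h; the helper
sketch_classify_driver() is hypothetical and takes the symlink target as
a string instead of calling readlink(). As a simplification that differs
from the patch, a target without any '/' is treated as the driver name
itself rather than as an error.

```c
#include <assert.h>
#include <string.h>

/* Local copy of the enum introduced by the patch. */
enum rte_pt_driver {
	RTE_PT_UNKNOWN     = 0,
	RTE_PT_IGB_UIO     = 1,
	RTE_PT_VFIO        = 2,
	RTE_PT_UIO_GENERIC = 3,
};

/*
 * Hypothetical helper: classify the target of a "driver" symlink,
 * e.g. "../../../bus/pci/drivers/igb_uio", by its last path component.
 */
enum rte_pt_driver
sketch_classify_driver(const char *link_target)
{
	const char *name;

	assert(link_target != NULL);

	/* Keep only the part after the last '/', if there is one. */
	name = strrchr(link_target, '/');
	name = (name != NULL) ? name + 1 : link_target;

	if (strcmp(name, "vfio-pci") == 0)
		return RTE_PT_VFIO;
	if (strcmp(name, "igb_uio") == 0)
		return RTE_PT_IGB_UIO;
	if (strcmp(name, "uio_pci_generic") == 0)
		return RTE_PT_UIO_GENERIC;
	return RTE_PT_UNKNOWN;
}
```

Any driver name other than the three passthrough drivers falls through
to RTE_PT_UNKNOWN, which matches the final else branch in pci_scan_one().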



[dpdk-dev] [PATCH v12 03/13] eal_pci: pci memory map work with driver type

2015-02-23 Thread Tetsuya Mukawa
From: Michael Qiu 

With the driver type flag in struct rte_pci_device, we no longer need
to always map uio devices with the VFIO-related function when
VFIO is enabled.

Signed-off-by: Michael Qiu 
Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/linuxapp/eal/eal_pci.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 9fe2851..c04f897 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -554,25 +554,29 @@ pci_config_space_set(struct rte_pci_device *dev)
 static int
 pci_map_device(struct rte_pci_device *dev)
 {
-   int ret, mapped = 0;
+   int ret = -1;

/* try mapping the NIC resources using VFIO if it exists */
+   switch (dev->pt_driver) {
+   case RTE_PT_VFIO:
 #ifdef VFIO_PRESENT
-   if (pci_vfio_is_enabled()) {
-   ret = pci_vfio_map_resource(dev);
-   if (ret == 0)
-   mapped = 1;
-   else if (ret < 0)
-   return ret;
-   }
+   if (pci_vfio_is_enabled())
+   ret = pci_vfio_map_resource(dev);
 #endif
-   /* map resources for devices that use uio_pci_generic or igb_uio */
-   if (!mapped) {
+   break;
+   case RTE_PT_IGB_UIO:
+   case RTE_PT_UIO_GENERIC:
+   /* map resources for devices that use uio */
ret = pci_uio_map_resource(dev);
-   if (ret != 0)
-   return ret;
+   break;
+   default:
+   RTE_LOG(DEBUG, EAL, "  Not managed by known pt driver,"
+   " skipped\n");
+   ret = 1;
+   break;
}
-   return 0;
+
+   return ret;
 }

 /*
-- 
1.9.1
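
The return-value convention that the reworked pci_map_device() follows
(0 when mapped, negative on error, 1 when the device is skipped) can be
sketched in isolation as below. All sketch_* names are hypothetical and
the two mapper functions are stubs that only record which path ran. The
sketch also leaves out the VFIO_PRESENT / pci_vfio_is_enabled() checks
from the patch, where a VFIO-bound device is left at -1 if VFIO is not
available.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_pt_driver {
	SKETCH_PT_UNKNOWN = 0,
	SKETCH_PT_IGB_UIO,
	SKETCH_PT_VFIO,
	SKETCH_PT_UIO_GENERIC
};

enum sketch_mapper { SKETCH_MAP_NONE = 0, SKETCH_MAP_VFIO, SKETCH_MAP_UIO };

struct sketch_pci_device {
	enum sketch_pt_driver pt_driver; /* filled in at scan time */
	enum sketch_mapper mapped_by;    /* records which stub ran */
};

/* Stub standing in for the VFIO mapping path. */
static int
sketch_vfio_map(struct sketch_pci_device *dev)
{
	dev->mapped_by = SKETCH_MAP_VFIO;
	return 0;
}

/* Stub standing in for the UIO mapping path. */
static int
sketch_uio_map(struct sketch_pci_device *dev)
{
	dev->mapped_by = SKETCH_MAP_UIO;
	return 0;
}

/* Dispatch on the driver type: 0 = mapped, <0 = error, 1 = skipped. */
int
sketch_map_device(struct sketch_pci_device *dev)
{
	assert(dev != NULL);

	switch (dev->pt_driver) {
	case SKETCH_PT_VFIO:
		return sketch_vfio_map(dev);
	case SKETCH_PT_IGB_UIO:
	case SKETCH_PT_UIO_GENERIC:
		return sketch_uio_map(dev);
	default:
		/* not bound to a known passthrough driver */
		return 1;
	}
}
```

Returning a positive value for the skipped case lets a caller tell
"nothing to do for this device" apart from a real mapping failure.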



[dpdk-dev] [PATCH v12 04/13] eal/pci, ethdev: Remove assumption that port will not be detached

2015-02-23 Thread Tetsuya Mukawa
To remove the assumption, do the following.

This patch adds "RTE_PCI_DRV_DETACHABLE" to drv_flags of the rte_pci_driver
structure. The flag indicates that the driver can detach devices at runtime.
It also removes the assumption that a port will not be detached.

To remove the assumption:
- Add 'attached' member to rte_eth_dev structure.
  This member is used for indicating the port is attached, or not.
  DEV_ATTACHED indicates a port is attached.
  DEV_DETACHED indicates a port is detached.
- Add rte_eth_dev_allocate_new_port().
  This function is used for allocating new port.

v9:
- DEV_INVALID/VALID are removed.
- DEV_DISCONNECTED is replaced by DEV_DETACHED.
- DEV_CONNECTED is replaced by DEV_ATTACHED.
- rte_eth_dev_allocate_new_port() is renamed to
  rte_eth_dev_find_free_port().
- rte_eth_dev_validate_port() is renamed to rte_eth_dev_is_valid_port().
- rte_eth_dev_is_valid_port() is changed not to handle log toggle.
- Fix commit log to describe DEV_ATTACHED and DEV_DETACHED.
  (Thanks to Thomas Monjalon)
v8:
- NONE_TRACE is changed to NO_TRACE.
  (Thanks to Iremonger, Bernard)
v5:
- Change parameters of rte_eth_dev_validate_port() to cleanup code.
v4:
- Use braces with 'for' loop.
- Fix indent of 'if' statement.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/include/rte_pci.h |   2 +
 lib/librte_ether/rte_ethdev.c   | 248 
 lib/librte_ether/rte_ethdev.h   |   5 +
 3 files changed, 164 insertions(+), 91 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index 5e0ba00..ffd13d9 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -210,6 +210,8 @@ struct rte_pci_driver {
 #define RTE_PCI_DRV_FORCE_UNBIND 0x0004
 /** Device driver supports link state interrupt */
 #define RTE_PCI_DRV_INTR_LSC   0x0008
+/** Device driver supports detaching capability */
+#define RTE_PCI_DRV_DETACHABLE 0x0010

 /**< Internal use only - Macro used by pci addr parsing functions **/
 #define GET_PCIADDR_FIELD(in, fd, lim, dlm)   \
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 27bbb0b..0e1e5c9 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -175,6 +175,11 @@ enum {
STAT_QMAP_RX
 };

+enum {
+   DEV_DETACHED = 0,
+   DEV_ATTACHED
+};
+
 static inline void
 rte_eth_dev_data_alloc(void)
 {
@@ -201,19 +206,34 @@ rte_eth_dev_allocated(const char *name)
 {
unsigned i;

-   for (i = 0; i < nb_ports; i++) {
-   if (strcmp(rte_eth_devices[i].data->name, name) == 0)
+   for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
+   if ((rte_eth_devices[i].attached == DEV_ATTACHED) &&
+   strcmp(rte_eth_devices[i].data->name, name) == 0)
return &rte_eth_devices[i];
}
return NULL;
 }

+static uint8_t
+rte_eth_dev_find_free_port(void)
+{
+   unsigned i;
+
+   for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
+   if (rte_eth_devices[i].attached == DEV_DETACHED)
+   return i;
+   }
+   return RTE_MAX_ETHPORTS;
+}
+
 struct rte_eth_dev *
 rte_eth_dev_allocate(const char *name)
 {
+   uint8_t port_id;
struct rte_eth_dev *eth_dev;

-   if (nb_ports == RTE_MAX_ETHPORTS) {
+   port_id = rte_eth_dev_find_free_port();
+   if (port_id == RTE_MAX_ETHPORTS) {
PMD_DEBUG_TRACE("Reached maximum number of Ethernet ports\n");
return NULL;
}
@@ -226,10 +246,12 @@ rte_eth_dev_allocate(const char *name)
return NULL;
}

-   eth_dev = &rte_eth_devices[nb_ports];
-   eth_dev->data = &rte_eth_dev_data[nb_ports];
+   eth_dev = &rte_eth_devices[port_id];
+   eth_dev->data = &rte_eth_dev_data[port_id];
snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
-   eth_dev->data->port_id = nb_ports++;
+   eth_dev->data->port_id = port_id;
+   eth_dev->attached = DEV_ATTACHED;
+   nb_ports++;
return eth_dev;
 }

@@ -283,6 +305,7 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
(unsigned) pci_dev->id.device_id);
if (rte_eal_process_type() == RTE_PROC_PRIMARY)
rte_free(eth_dev->data->dev_private);
+   eth_dev->attached = DEV_DETACHED;
nb_ports--;
return diag;
 }
@@ -308,10 +331,20 @@ rte_eth_driver_register(struct eth_driver *eth_drv)
rte_eal_pci_register(&eth_drv->pci_drv);
 }

+static int
+rte_eth_dev_is_valid_port(uint8_t port_id)
+{
+   if (port_id >= RTE_MAX_ETHPORTS ||
+   rte_eth_devices[port_id].attached != DEV_ATTACHED)
+   return 0;
+   else
+   return 1;
+}
+
 int
 rte_eth_dev_socket_id(uint8_t port_id)
 {
-   if (port_id >= nb_ports)
+   if (!rte_eth_dev_is_valid_port(port_id))
return -1;
return rte_eth_device

[dpdk-dev] [PATCH v12 05/13] eal/pci: Consolidate pci address comparison APIs

2015-02-23 Thread Tetsuya Mukawa
This patch replaces pci_addr_comparison() and memcmp() of pci addresses by
rte_eal_compare_pci_addr().

To compare PCI addresses, rte_eal_compare_pci_addr() doesn't use memcmp().
This is because sizeof(struct rte_pci_addr) returns 6, but the members of
the structure, shown below, occupy only 5 bytes.

struct rte_pci_addr {
uint16_t domain;/**< Device domain */
uint8_t bus;/**< Device bus */
uint8_t devid;  /**< Device ID */
uint8_t function;   /**< Device function. */
};

If the structure is allocated in a function without being zeroed (e.g. by
bzero), the trailing padding byte may hold an arbitrary value. As a result,
memcmp() may report a difference between two equal addresses.
To avoid such a case, rte_eal_compare_pci_addr() compares the following value.

dev_addr = (addr->domain << 24) | (addr->bus << 16) |
(addr->devid << 8) | addr->function;
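
A minimal sketch of such a field-wise comparison is shown below. The names
pci_addr and compare_pci_addr() are hypothetical; the real function lives in
rte_pci.h and also checks its parameters. Widening each field to uint64_t
before shifting is an assumption of this sketch, made so that a large domain
value cannot overflow a signed int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pci_addr {	/* hypothetical copy of the rte_pci_addr layout */
	uint16_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

/*
 * Compare two addresses member by member, so that padding bytes are
 * never read. Returns 0 if equal, a positive value if a > b, and a
 * negative value if a < b.
 */
static int
compare_pci_addr(const struct pci_addr *a, const struct pci_addr *b)
{
	uint64_t x, y;

	assert(a != NULL && b != NULL);

	x = ((uint64_t)a->domain << 24) | ((uint64_t)a->bus << 16) |
	    ((uint64_t)a->devid << 8) | a->function;
	y = ((uint64_t)b->domain << 24) | ((uint64_t)b->bus << 16) |
	    ((uint64_t)b->devid << 8) | b->function;

	if (x > y)
		return 1;
	if (x < y)
		return -1;
	return 0;
}
```

A three-way result also lets the caller keep a device list sorted and detect
an entry that is already registered, which a plain equality test cannot do.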

v9:
- eal_compare_pci_addr() is replaced by rte_eal_compare_pci_addr().
- Fix commit log.
  (Thanks to Thomas Monjalon)
v8:
- Fix pci_scan_one() to update sysfs values.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v5:
- Fix pci_scan_one to handle pt_driver correctly.
v4:
- Fix calculation method of eal_compare_pci_addr().
- Add parameter checking.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/bsdapp/eal/eal_pci.c   | 29 --
 lib/librte_eal/common/eal_common_pci.c|  2 +-
 lib/librte_eal/common/include/rte_pci.h   | 34 +++
 lib/librte_eal/linuxapp/eal/eal_pci.c | 30 +--
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c |  2 +-
 5 files changed, 63 insertions(+), 34 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c 
b/lib/librte_eal/bsdapp/eal/eal_pci.c
index 74ecce7..9193f80 100644
--- a/lib/librte_eal/bsdapp/eal/eal_pci.c
+++ b/lib/librte_eal/bsdapp/eal/eal_pci.c
@@ -270,20 +270,6 @@ pci_uio_map_resource(struct rte_pci_device *dev)
return (0);
 }

-/* Compare two PCI device addresses. */
-static int
-pci_addr_comparison(struct rte_pci_addr *addr, struct rte_pci_addr *addr2)
-{
-   uint64_t dev_addr = (addr->domain << 24) + (addr->bus << 16) + 
(addr->devid << 8) + addr->function;
-   uint64_t dev_addr2 = (addr2->domain << 24) + (addr2->bus << 16) + 
(addr2->devid << 8) + addr2->function;
-
-   if (dev_addr > dev_addr2)
-   return 1;
-   else
-   return 0;
-}
-
-
 /* Scan one pci sysfs entry, and fill the devices list from it. */
 static int
 pci_scan_one(int dev_pci_fd, struct pci_conf *conf)
@@ -356,13 +342,24 @@ pci_scan_one(int dev_pci_fd, struct pci_conf *conf)
}
else {
struct rte_pci_device *dev2 = NULL;
+   int ret;

TAILQ_FOREACH(dev2, &pci_device_list, next) {
-   if (pci_addr_comparison(&dev->addr, &dev2->addr))
+   ret = rte_eal_compare_pci_addr(&dev->addr, &dev2->addr);
+   if (ret > 0)
continue;
-   else {
+   else if (ret < 0) {
TAILQ_INSERT_BEFORE(dev2, dev, next);
return 0;
+   } else { /* already registered */
+   /* update pt_driver */
+   dev2->pt_driver = dev->pt_driver;
+   dev2->max_vfs = dev->max_vfs;
+   memmove(dev2->mem_resource,
+   dev->mem_resource,
+   sizeof(dev->mem_resource));
+   free(dev);
+   return 0;
}
}
TAILQ_INSERT_TAIL(&pci_device_list, dev, next);
diff --git a/lib/librte_eal/common/eal_common_pci.c 
b/lib/librte_eal/common/eal_common_pci.c
index f3c7f71..bf2793f 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -93,7 +93,7 @@ static struct rte_devargs *pci_devargs_lookup(struct 
rte_pci_device *dev)
if (devargs->type != RTE_DEVTYPE_BLACKLISTED_PCI &&
devargs->type != RTE_DEVTYPE_WHITELISTED_PCI)
continue;
-   if (!memcmp(&dev->addr, &devargs->pci.addr, sizeof(dev->addr)))
+   if (!rte_eal_compare_pci_addr(&dev->addr, &devargs->pci.addr))
return devargs;
}
return NULL;
diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index ffd13d9..6814e91 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -272,6 +272,40 @@ eal_parse_pci_DomBDF(const char *input, struct 
rte_pci_addr *dev_addr)
 }
 #undef GET_PCIADDR_FIELD

+/* Compare two PCI device addresses. */
+/**
+ * Utility function to compare two PCI device addresses.
+ *
+ 

[dpdk-dev] [PATCH v12 06/13] ethdev: Add rte_eth_dev_release_port to release specified port

2015-02-23 Thread Tetsuya Mukawa
This patch adds rte_eth_dev_release_port(). The function is used to
change the attached status of the specified device.

v9:
- rte_eth_dev_free() is replaced by rte_eth_dev_release_port().
  (Thanks to Thomas Monjalon)
v6:
- Use rte_eth_dev structure as the parameter of rte_eth_dev_free().
v4:
- Add parameter checking.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_ether/rte_ethdev.c | 11 +++
 lib/librte_ether/rte_ethdev.h | 12 
 2 files changed, 23 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 0e1e5c9..8d271ae 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -255,6 +255,17 @@ rte_eth_dev_allocate(const char *name)
return eth_dev;
 }

+int
+rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
+{
+   if (eth_dev == NULL)
+   return -EINVAL;
+
+   eth_dev->attached = 0;
+   nb_ports--;
+   return 0;
+}
+
 static int
 rte_eth_dev_init(struct rte_pci_driver *pci_drv,
 struct rte_pci_device *pci_dev)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index ef31bda..8016a51 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1487,6 +1487,18 @@ extern uint8_t rte_eth_dev_count(void);
  */
 struct rte_eth_dev *rte_eth_dev_allocate(const char *name);

+/**
+ * Function for internal use by dummy drivers primarily, e.g. ring-based
+ * driver.
+ * Release the specified ethdev port.
+ *
+ * @param eth_dev
+ * The *eth_dev* pointer is the address of the *rte_eth_dev* structure.
+ * @return
+ *   - 0 on success, negative on error
+ */
+int rte_eth_dev_release_port(struct rte_eth_dev *eth_dev);
+
 struct eth_driver;
 /**
  * @internal
-- 
1.9.1



[dpdk-dev] [PATCH v12 07/13] eal, ethdev: Add a function and function pointers to close ether device

2015-02-23 Thread Tetsuya Mukawa
The patch adds function pointers to the rte_pci_driver and eth_driver
structures. These function pointers are used when ports are detached.
The patch also adds rte_eth_dev_uninit(). So far it is not called from
anywhere, but it will be called once the port hotplug function is
implemented.
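
The idea of an optional uninit callback can be sketched as below. All names
here (struct driver, struct device, demo_uninit(), device_close()) are
hypothetical and only mirror the shape of pci_devuninit_t; this is not the
code of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct device;

/* Hypothetical callback type, modelled on pci_devuninit_t. */
typedef int (dev_uninit_t)(struct device *);

struct driver {
	const char *name;
	dev_uninit_t *uninit;	/* NULL if the driver has nothing to undo */
};

struct device {
	const struct driver *driver;
	int busy;
};

/* Example callback: refuse to uninitialise a device that is still busy. */
static int
demo_uninit(struct device *dev)
{
	assert(dev != NULL);
	return dev->busy ? -EBUSY : 0;
}

static const struct driver demo_driver = { "demo", demo_uninit };

/* Call the driver's uninit hook, then unbind the device from the driver. */
static int
device_close(struct device *dev)
{
	int ret;

	if (dev == NULL || dev->driver == NULL)
		return -EINVAL;

	if (dev->driver->uninit != NULL) {
		ret = dev->driver->uninit(dev);
		if (ret != 0)
			return ret;	/* leave the device bound on failure */
	}
	dev->driver = NULL;
	return 0;
}
```

Returning the callback's error before touching any state keeps the device
usable when the driver declines to release it.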

v10:
- Add size parameter to rte_eth_dev_create_unique_device_name().
  (Thanks to Iremonger, Bernard)
v9:
- Change parameter of pci_devuninit_t and rte_eth_dev_uninit.
- Remove code that initializes callback of ethdev from
  rte_eth_dev_uninit().
- Add a function to create a unique device name.
  (Thanks to Thomas Monjalon)
v6:
- Fix rte_eth_dev_uninit() to handle a return value of uninit
  function of PMD.
v4:
- Add parameter checking.
- Change function names.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/include/rte_pci.h |  6 
 lib/librte_ether/rte_ethdev.c   | 64 +++--
 lib/librte_ether/rte_ethdev.h   | 24 +
 3 files changed, 92 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h 
b/lib/librte_eal/common/include/rte_pci.h
index 6814e91..4ea57cb 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -192,12 +192,18 @@ struct rte_pci_driver;
 typedef int (pci_devinit_t)(struct rte_pci_driver *, struct rte_pci_device *);

 /**
+ * Uninitialisation function for the driver called during hotplugging.
+ */
+typedef int (pci_devuninit_t)(struct rte_pci_device *);
+
+/**
  * A structure describing a PCI driver.
  */
 struct rte_pci_driver {
TAILQ_ENTRY(rte_pci_driver) next;   /**< Next in list. */
const char *name;   /**< Driver name. */
pci_devinit_t *devinit; /**< Device init. function. */
+   pci_devuninit_t *devuninit; /**< Device uninit function. */
struct rte_pci_id *id_table;/**< ID table, NULL terminated. 
*/
uint32_t drv_flags; /**< Flags contolling handling 
of device. */
 };
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 8d271ae..3d148e2 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -266,6 +266,24 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
return 0;
 }

+static inline int
+rte_eth_dev_create_unique_device_name(char *name, size_t size,
+   struct rte_pci_device *pci_dev)
+{
+   int ret;
+
+   if ((name == NULL) || (pci_dev == NULL))
+   return -EINVAL;
+
+   ret = snprintf(name, size, "%d:%d.%d",
+   pci_dev->addr.bus, pci_dev->addr.devid,
+   pci_dev->addr.function);
+   if (ret < 0)
+   return ret;
+
+   return 0;
+}
+
 static int
 rte_eth_dev_init(struct rte_pci_driver *pci_drv,
 struct rte_pci_device *pci_dev)
@@ -279,8 +297,8 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
eth_drv = (struct eth_driver *)pci_drv;

/* Create unique Ethernet device name using PCI address */
-   snprintf(ethdev_name, RTE_ETH_NAME_MAX_LEN, "%d:%d.%d",
-   pci_dev->addr.bus, pci_dev->addr.devid, 
pci_dev->addr.function);
+   rte_eth_dev_create_unique_device_name(ethdev_name,
+   sizeof(ethdev_name), pci_dev);

eth_dev = rte_eth_dev_allocate(ethdev_name);
if (eth_dev == NULL)
@@ -321,6 +339,47 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
return diag;
 }

+static int
+rte_eth_dev_uninit(struct rte_pci_device *pci_dev)
+{
+   const struct eth_driver *eth_drv;
+   struct rte_eth_dev *eth_dev;
+   char ethdev_name[RTE_ETH_NAME_MAX_LEN];
+   int ret;
+
+   if (pci_dev == NULL)
+   return -EINVAL;
+
+   /* Create unique Ethernet device name using PCI address */
+   rte_eth_dev_create_unique_device_name(ethdev_name,
+   sizeof(ethdev_name), pci_dev);
+
+   eth_dev = rte_eth_dev_allocated(ethdev_name);
+   if (eth_dev == NULL)
+   return -ENODEV;
+
+   eth_drv = (const struct eth_driver *)pci_dev->driver;
+
+   /* Invoke PMD device uninit function */
+   if (*eth_drv->eth_dev_uninit) {
+   ret = (*eth_drv->eth_dev_uninit)(eth_drv, eth_dev);
+   if (ret)
+   return ret;
+   }
+
+   /* free ether device */
+   rte_eth_dev_release_port(eth_dev);
+
+   if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+   rte_free(eth_dev->data->dev_private);
+
+   eth_dev->pci_dev = NULL;
+   eth_dev->driver = NULL;
+   eth_dev->data = NULL;
+
+   return 0;
+}
+
 /**
  * Register an Ethernet [Poll Mode] driver.
  *
@@ -339,6 +398,7 @@ void
 rte_eth_driver_register(struct eth_driver *eth_drv)
 {
eth_drv->pci_drv.devinit = rte_eth_dev_init;
+   eth_drv->pci_drv.devuninit = rte_eth_dev_uninit;
rte_eal_pci_regis

[dpdk-dev] [PATCH v12 08/13] ethdev: Add functions that will be used by port hotplug functions

2015-02-23 Thread Tetsuya Mukawa
The patch adds the following functions.

- rte_eth_dev_save()
  The function saves the current rte_eth_dev structures.
- rte_eth_dev_get_changed_port()
  The function receives saved rte_eth_dev structures and compares
  them with the current values to find which port was actually
  attached or detached.
- rte_eth_dev_get_addr_by_port()
  The function returns the PCI address of the ethdev specified by
  port identifier.
- rte_eth_dev_get_port_by_addr()
  The function returns the port identifier of the ethdev specified by
  PCI address.
- rte_eth_dev_get_name_by_port()
  The function returns the unique name of the ethdev
  specified by port identifier.
- rte_eth_dev_is_detachable()
  The function returns whether a PMD supports the detach function.
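
The save-then-compare approach of the first two functions can be sketched as
follows, assuming a much smaller table than the real one. struct port,
MAX_PORTS and the ports_*() helpers are hypothetical names used only for this
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PORTS 4	/* hypothetical stand-in for RTE_MAX_ETHPORTS */

struct port {
	uint8_t attached;
};

static struct port ports[MAX_PORTS];

/* Helper for the sketch: change the attached state of one slot. */
static void
port_set_attached(unsigned id, uint8_t state)
{
	assert(id < MAX_PORTS);
	ports[id].attached = state;
}

/* Copy the whole table into a caller-provided snapshot buffer. */
static int
ports_save(struct port *snap, size_t size)
{
	if (snap == NULL || size != sizeof(ports))
		return -EINVAL;
	memcpy(snap, ports, size);
	return 0;
}

/* Report the first slot whose attached state differs from the snapshot. */
static int
ports_get_changed(const struct port *snap, uint8_t *port_id)
{
	unsigned i;

	if (snap == NULL || port_id == NULL)
		return -EINVAL;

	for (i = 0; i < MAX_PORTS; i++) {
		if (ports[i].attached != snap[i].attached) {
			*port_id = (uint8_t)i;
			return 0;
		}
	}
	return -ENODEV;
}
```

A caller would take the snapshot, run the attach or detach operation, and
then ask which slot changed; only the first difference is reported, so this
fits one port change per snapshot.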

Also, the patch changes the scope of rte_eth_dev_allocated() to global,
because virtual PMDs will call this function to support port hotplug.

v10:
- Change order of version.map.
  (Thanks to Thomas Monjalon)
v9:
- rte_eth_dev_check_detachable() is replaced by
  rte_eth_dev_is_detachable().
- strncpy() is replaced by strcpy().
  (Thanks to Thomas Monjalon)
- Add missing symbol in version map.
  (Thanks to Neil Horman)
v8:
- Add size parameter to rte_eth_dev_save().
- Add missing symbol in version map.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v7:
- Add pt_driver checking to rte_eth_dev_check_detachable().
  (Thanks to Qiu, Michael)
v5:
- Fix return value of below functions.
  rte_eth_dev_get_changed_port().
  rte_eth_dev_get_port_by_addr().
v4:
- Add parameter checking.
v3:
- Fix if-condition bug while comparing pci addresses.
- Add error checking codes.
Reported-by: Mark Enright 

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_ether/rte_ethdev.c  | 103 -
 lib/librte_ether/rte_ethdev.h  |  83 ++
 lib/librte_ether/rte_ether_version.map |   7 +++
 3 files changed, 192 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 3d148e2..7067620 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -201,7 +201,7 @@ rte_eth_dev_data_alloc(void)
RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data));
 }

-static struct rte_eth_dev *
+struct rte_eth_dev *
 rte_eth_dev_allocated(const char *name)
 {
unsigned i;
@@ -426,6 +426,107 @@ rte_eth_dev_count(void)
return (nb_ports);
 }

+int
+rte_eth_dev_save(struct rte_eth_dev *devs, size_t size)
+{
+   if ((devs == NULL) ||
+   (size != sizeof(struct rte_eth_dev) * RTE_MAX_ETHPORTS))
+   return -EINVAL;
+
+   /* save current rte_eth_devices */
+   memcpy(devs, rte_eth_devices, size);
+   return 0;
+}
+
+int
+rte_eth_dev_get_changed_port(struct rte_eth_dev *devs, uint8_t *port_id)
+{
+   if ((devs == NULL) || (port_id == NULL))
+   return -EINVAL;
+
+   /* check which port was attached or detached */
+   for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++, devs++) {
+   if (rte_eth_devices[*port_id].attached ^ devs->attached)
+   return 0;
+   }
+   return -ENODEV;
+}
+
+int
+rte_eth_dev_get_addr_by_port(uint8_t port_id, struct rte_pci_addr *addr)
+{
+   if (!rte_eth_dev_is_valid_port(port_id)) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -EINVAL;
+   }
+
+   if (addr == NULL) {
+   PMD_DEBUG_TRACE("Null pointer is specified\n");
+   return -EINVAL;
+   }
+
+   *addr = rte_eth_devices[port_id].pci_dev->addr;
+   return 0;
+}
+
+int
+rte_eth_dev_get_port_by_addr(struct rte_pci_addr *addr, uint8_t *port_id)
+{
+   struct rte_pci_addr *tmp;
+
+   if ((addr == NULL) || (port_id == NULL)) {
+   PMD_DEBUG_TRACE("Null pointer is specified\n");
+   return -EINVAL;
+   }
+
+   for (*port_id = 0; *port_id < RTE_MAX_ETHPORTS; (*port_id)++) {
+   if (!rte_eth_devices[*port_id].attached)
+   continue;
+   if (!rte_eth_devices[*port_id].pci_dev)
+   continue;
+   tmp = &rte_eth_devices[*port_id].pci_dev->addr;
+   if (rte_eal_compare_pci_addr(tmp, addr) == 0)
+   return 0;
+   }
+   return -ENODEV;
+}
+
+int
+rte_eth_dev_get_name_by_port(uint8_t port_id, char *name)
+{
+   char *tmp;
+
+   if (!rte_eth_dev_is_valid_port(port_id)) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -EINVAL;
+   }
+
+   if (name == NULL) {
+   PMD_DEBUG_TRACE("Null pointer is specified\n");
+   return -EINVAL;
+   }
+
+   /* shouldn't check 'rte_eth_devices[i].data',
+* because it might be overwritten by VDEV PMD */
+   tmp = rte_eth_dev_data[port_id].name;
+   strcpy(name,

[dpdk-dev] [PATCH v12 09/13] eal/linux/pci: Add functions for unmapping igb_uio resources

2015-02-23 Thread Tetsuya Mukawa
The patch adds functions for unmapping igb_uio resources. The patch applies
only to Linux with igb_uio. VFIO and BSD are not supported.

v9:
- Remove "rte_dev_hotplug.h".
- Remove needless "#ifdef".
  (Thanks to Thomas Monjalon and Neil Horman)
- Remove pci_unmap_device(). It will be implemented in later patch.
v8:
- Fix typo.
  (Thanks to Iremonger, Bernard)
v5:
- Fix pci_unmap_device() to check pt_driver.
v4:
- Add parameter checking.
- Add header file to determine if hotplug can be enabled.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/linuxapp/eal/eal_pci.c  | 17 
 lib/librte_eal/linuxapp/eal/eal_pci_init.h |  7 
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c  | 65 ++
 3 files changed, 89 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
b/lib/librte_eal/linuxapp/eal/eal_pci.c
index e6cead1..17f32c0 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -167,6 +167,23 @@ pci_map_resource(void *requested_addr, int fd, off_t 
offset, size_t size)
return mapaddr;
 }

+/* unmap a particular resource */
+void
+pci_unmap_resource(void *requested_addr, size_t size)
+{
+   if (requested_addr == NULL)
+   return;
+
+   /* Unmap the PCI memory resource of device */
+   if (munmap(requested_addr, size)) {
+   RTE_LOG(ERR, EAL, "%s(): cannot munmap(%p, 0x%lx): %s\n",
+   __func__, requested_addr, (unsigned long)size,
+   strerror(errno));
+   } else
+   RTE_LOG(DEBUG, EAL, "  PCI memory unmapped at %p\n",
+   requested_addr);
+}
+
 /* parse the "resource" sysfs file */
 static int
 pci_parse_sysfs_resource(const char *filename, struct rte_pci_device *dev)
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_init.h 
b/lib/librte_eal/linuxapp/eal/eal_pci_init.h
index 1070eb8..e2dd8a5 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_init.h
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_init.h
@@ -71,6 +71,13 @@ void *pci_map_resource(void *requested_addr, int fd, off_t 
offset,
 /* map IGB_UIO resource prototype */
 int pci_uio_map_resource(struct rte_pci_device *dev);

+void pci_unmap_resource(void *requested_addr, size_t size);
+
+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+/* unmap IGB_UIO resource prototype */
+void pci_uio_unmap_resource(struct rte_pci_device *dev);
+#endif /* RTE_LIBRTE_EAL_HOTPLUG */
+
 #ifdef VFIO_PRESENT

 #define VFIO_MAX_GROUPS 64
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c 
b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index f7acc55..ff4d0e8 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -433,3 +433,68 @@ pci_uio_map_resource(struct rte_pci_device *dev)

return 0;
 }
+
+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+static void
+pci_uio_unmap(struct mapped_pci_resource *uio_res)
+{
+   int i;
+
+   if (uio_res == NULL)
+   return;
+
+   for (i = 0; i != uio_res->nb_maps; i++)
+   pci_unmap_resource(uio_res->maps[i].addr,
+   (size_t)uio_res->maps[i].size);
+}
+
+static struct mapped_pci_resource *
+pci_uio_find_resource(struct rte_pci_device *dev)
+{
+   struct mapped_pci_resource *uio_res;
+
+   if (dev == NULL)
+   return NULL;
+
+   TAILQ_FOREACH(uio_res, pci_res_list, next) {
+
+   /* skip this element if it doesn't match our PCI address */
+   if (!rte_eal_compare_pci_addr(&uio_res->pci_addr, &dev->addr))
+   return uio_res;
+   }
+   return NULL;
+}
+
+/* unmap the PCI resource of a PCI device in virtual memory */
+void
+pci_uio_unmap_resource(struct rte_pci_device *dev)
+{
+   struct mapped_pci_resource *uio_res;
+
+   if (dev == NULL)
+   return;
+
+   /* find an entry for the device */
+   uio_res = pci_uio_find_resource(dev);
+   if (uio_res == NULL)
+   return;
+
+   /* secondary processes - just free maps */
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return pci_uio_unmap(uio_res);
+
+   TAILQ_REMOVE(pci_res_list, uio_res, next);
+
+   /* unmap all resources */
+   pci_uio_unmap(uio_res);
+
+   /* free uio resource */
+   rte_free(uio_res);
+
+   /* close fd if in primary process */
+   close(dev->intr_handle.fd);
+
+   dev->intr_handle.fd = -1;
+   dev->intr_handle.type = RTE_INTR_HANDLE_UNKNOWN;
+}
+#endif /* RTE_LIBRTE_EAL_HOTPLUG */
-- 
1.9.1



[dpdk-dev] [PATCH v12 10/13] eal/pci: Add probe and close functions of pci driver

2015-02-23 Thread Tetsuya Mukawa
- Add pci_close_all_drivers()
  The function tries to find a driver for the specified device, and
  then calls the close (devuninit) function of that driver.
- Add rte_eal_pci_probe_one() and rte_eal_pci_close_one()
  The functions are used to probe and close a device.
  Each function first tries to find a device that has the specified
  PCI address, then probes or closes the device.

v9:
- Fix commit title.
- Remove RTE_EAL_INVOKE_TYPE_PROBE/CLOSE.
  (Thanks to Thomas Monjalon)
- Implement pci_unmap_device() in this patch.
v5:
- Remove RTE_EAL_INVOKE_TYPE_UNKNOWN, because it's unused.
v4:
- Fix parameter checking.
- Fix indent of 'if' statement.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/eal_common_pci.c  | 98 -
 lib/librte_eal/common/eal_private.h | 15 +
 lib/librte_eal/common/include/rte_pci.h | 32 +++
 lib/librte_eal/linuxapp/eal/eal_pci.c   | 94 +++
 4 files changed, 238 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_pci.c 
b/lib/librte_eal/common/eal_common_pci.c
index bf2793f..5b6b55d 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -108,7 +108,10 @@ static int
 pci_probe_all_drivers(struct rte_pci_device *dev)
 {
struct rte_pci_driver *dr = NULL;
-   int rc;
+   int rc = 0;
+
+   if (dev == NULL)
+   return -1;

TAILQ_FOREACH(dr, &pci_driver_list, next) {
rc = rte_eal_pci_probe_one_driver(dr, dev);
@@ -123,6 +126,99 @@ pci_probe_all_drivers(struct rte_pci_device *dev)
return 1;
 }

+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+/*
+ * If vendor/device ID match, call the devuninit() function of all
+ * registered driver for the given device. Return -1 if initialization
+ * failed, return 1 if no driver is found for this device.
+ */
+static int
+pci_close_all_drivers(struct rte_pci_device *dev)
+{
+   struct rte_pci_driver *dr = NULL;
+   int rc = 0;
+
+   if (dev == NULL)
+   return -1;
+
+   TAILQ_FOREACH(dr, &pci_driver_list, next) {
+   rc = rte_eal_pci_close_one_driver(dr, dev);
+   if (rc < 0)
+   /* negative value is an error */
+   return -1;
+   if (rc > 0)
+   /* positive value means driver not found */
+   continue;
+   return 0;
+   }
+   return 1;
+}
+
+/*
+ * Find the pci device specified by pci address, then invoke probe function of
+ * the driver of the device.
+ */
+int
+rte_eal_pci_probe_one(struct rte_pci_addr *addr)
+{
+   struct rte_pci_device *dev = NULL;
+   int ret = 0;
+
+   if (addr == NULL)
+   return -1;
+
+   TAILQ_FOREACH(dev, &pci_device_list, next) {
+   if (rte_eal_compare_pci_addr(&dev->addr, addr))
+   continue;
+
+   ret = pci_probe_all_drivers(dev);
+   if (ret < 0)
+   goto err_return;
+   return 0;
+   }
+   return -1;
+
+err_return:
+   RTE_LOG(WARNING, EAL, "Requested device " PCI_PRI_FMT
+   " cannot be used\n", dev->addr.domain, dev->addr.bus,
+   dev->addr.devid, dev->addr.function);
+   return -1;
+}
+
+/*
+ * Find the pci device specified by pci address, then invoke close function of
+ * the driver of the device.
+ */
+int
+rte_eal_pci_close_one(struct rte_pci_addr *addr)
+{
+   struct rte_pci_device *dev = NULL;
+   int ret = 0;
+
+   if (addr == NULL)
+   return -1;
+
+   TAILQ_FOREACH(dev, &pci_device_list, next) {
+   if (rte_eal_compare_pci_addr(&dev->addr, addr))
+   continue;
+
+   ret = pci_close_all_drivers(dev);
+   if (ret < 0)
+   goto err_return;
+
+   TAILQ_REMOVE(&pci_device_list, dev, next);
+   return 0;
+   }
+   return -1;
+
+err_return:
+   RTE_LOG(WARNING, EAL, "Requested device " PCI_PRI_FMT
+   " cannot be used\n", dev->addr.domain, dev->addr.bus,
+   dev->addr.devid, dev->addr.function);
+   return -1;
+}
+#endif /* RTE_LIBRTE_EAL_HOTPLUG */
+
 /*
  * Scan the content of the PCI bus, and call the devinit() function for
  * all registered drivers that have a matching entry in its id_table
diff --git a/lib/librte_eal/common/eal_private.h 
b/lib/librte_eal/common/eal_private.h
index 159cd66..4acf5a0 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -165,6 +165,21 @@ int rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr,
struct rte_pci_device *dev);

 /**
+ * Munmap memory for single PCI device
+ *
+ * This function is private to EAL.
+ *
+ * @param  dr
+ *  The pointer to the pci driver structure
+ * @param  dev
+ *  The pointer to the pci device structure
+ * @return
+ 

[dpdk-dev] [PATCH v12 11/13] ethdev: Add one dev_type parameter to rte_eth_dev_allocate

2015-02-23 Thread Tetsuya Mukawa
This new parameter is needed to keep the device type, such as PCI or virtual.
The port detaching process differs between PCI devices and virtual
devices.
RTE_ETH_DEV_PCI indicates that the device type is PCI. RTE_ETH_DEV_VIRTUAL
indicates that the device is virtual.

v12:
- Add missing symbol in version map.
  (Thanks to Iremonger, Bernard)
v10:
- Change order of version.map.
  (Thanks to Thomas Monjalon)
- Fix comment of "rte_ethdev.h".
  (Thanks to Thomas Monjalon)
v9:
- Fix commit log.
- RTE_ETH_DEV_PHYSICAL is replaced by RTE_ETH_DEV_PCI.
  (Thanks to Thomas Monjalon)
v8:
- NONE_TRACE is replaced by NO_TRACE.
- Add missing symbol in version map.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v4:
- Fix comments of rte_eth_dev_type.

Signed-off-by: Tetsuya Mukawa 
---
 app/test/virtual_pmd.c   |  2 +-
 lib/librte_ether/rte_ethdev.c| 25 +++--
 lib/librte_ether/rte_ethdev.h| 25 -
 lib/librte_ether/rte_ether_version.map   |  1 +
 lib/librte_pmd_af_packet/rte_eth_af_packet.c |  2 +-
 lib/librte_pmd_bond/rte_eth_bond_api.c   |  2 +-
 lib/librte_pmd_pcap/rte_eth_pcap.c   |  2 +-
 lib/librte_pmd_ring/rte_eth_ring.c   |  2 +-
 lib/librte_pmd_xenvirt/rte_eth_xenvirt.c |  2 +-
 9 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c
index cd9faf3..01a3913 100644
--- a/app/test/virtual_pmd.c
+++ b/app/test/virtual_pmd.c
@@ -580,7 +580,7 @@ virtual_ethdev_create(const char *name, struct ether_addr 
*mac_addr,
goto err;

/* reserve an ethdev entry */
-   eth_dev = rte_eth_dev_allocate(name);
+   eth_dev = rte_eth_dev_allocate(name, RTE_ETH_DEV_PCI);
if (eth_dev == NULL)
goto err;

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 7067620..f176f1e 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -227,7 +227,7 @@ rte_eth_dev_find_free_port(void)
 }

 struct rte_eth_dev *
-rte_eth_dev_allocate(const char *name)
+rte_eth_dev_allocate(const char *name, enum rte_eth_dev_type type)
 {
uint8_t port_id;
struct rte_eth_dev *eth_dev;
@@ -251,6 +251,7 @@ rte_eth_dev_allocate(const char *name)
snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
eth_dev->data->port_id = port_id;
eth_dev->attached = DEV_ATTACHED;
+   eth_dev->dev_type = type;
nb_ports++;
return eth_dev;
 }
@@ -262,6 +263,7 @@ rte_eth_dev_release_port(struct rte_eth_dev *eth_dev)
return -EINVAL;

eth_dev->attached = 0;
+   eth_dev->dev_type = RTE_ETH_DEV_UNKNOWN;
nb_ports--;
return 0;
 }
@@ -300,7 +302,7 @@ rte_eth_dev_init(struct rte_pci_driver *pci_drv,
rte_eth_dev_create_unique_device_name(ethdev_name,
sizeof(ethdev_name), pci_dev);

-   eth_dev = rte_eth_dev_allocate(ethdev_name);
+   eth_dev = rte_eth_dev_allocate(ethdev_name, RTE_ETH_DEV_PCI);
if (eth_dev == NULL)
return -ENOMEM;

@@ -426,6 +428,14 @@ rte_eth_dev_count(void)
return (nb_ports);
 }

+enum rte_eth_dev_type
+rte_eth_dev_get_device_type(uint8_t port_id)
+{
+   if (!rte_eth_dev_is_valid_port(port_id))
+   return -1;
+   return rte_eth_devices[port_id].dev_type;
+}
+
 int
 rte_eth_dev_save(struct rte_eth_dev *devs, size_t size)
 {
@@ -523,6 +533,17 @@ rte_eth_dev_is_detachable(uint8_t port_id)
return -EINVAL;
}

+   if (rte_eth_devices[port_id].dev_type == RTE_ETH_DEV_PCI) {
+   switch (rte_eth_devices[port_id].pci_dev->pt_driver) {
+   case RTE_PT_IGB_UIO:
+   case RTE_PT_UIO_GENERIC:
+   break;
+   case RTE_PT_VFIO:
+   default:
+   return -ENOTSUP;
+   }
+   }
+
drv_flags = rte_eth_devices[port_id].driver->pci_drv.drv_flags;
return !(drv_flags & RTE_PCI_DRV_DETACHABLE);
 }
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index d4cfafb..1a978ed 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1382,6 +1382,17 @@ struct eth_dev_ops {
 };

 /**
+ * The eth device type
+ */
+enum rte_eth_dev_type {
+   RTE_ETH_DEV_UNKNOWN,/**< unknown device type */
+   RTE_ETH_DEV_PCI,
+   /**< Physical function and Virtual function of PCI devices */
+   RTE_ETH_DEV_VIRTUAL,/**< non hardware device */
+   RTE_ETH_DEV_MAX /**< max value of this enum */
+};
+
+/**
  * @internal
  * The generic data structure associated with each ethernet device.
  *
@@ -1400,6 +1411,7 @@ struct rte_eth_dev {
struct rte_pci_device *pci_dev; /**< PCI info. supplied by probing */
struct rte_eth_dev_cb_list callbacks; /**< User application callbacks */
uint8_t attached; /*

[dpdk-dev] [PATCH v12 12/13] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-23 Thread Tetsuya Mukawa
These functions are used for attaching or detaching a port.
When rte_eal_dev_attach() is called, the function tries to parse the
device name as a PCI address. If this succeeds, rte_eal_dev_attach()
attaches a physical device port. If not, it attaches a virtual device
port.
When rte_eal_dev_detach() is called, the function gets the device type
of the port to know whether the port comes from a physical or a virtual
device, and then calls the corresponding detach function.
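
The decision between the two attach paths can be sketched as below. This is
an assumption-laden illustration: parse_pci_addr() and classify() are
hypothetical helpers built on sscanf(), whereas the patch uses the existing
EAL parsing helpers for this step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum dev_type { DEV_TYPE_PCI, DEV_TYPE_VIRTUAL };	/* hypothetical */

struct pci_addr {	/* hypothetical, fields widened for sscanf() */
	unsigned domain, bus, devid, function;
};

/*
 * Return 0 and fill *addr if name has the form "DDDD:BB:DD.F" (hex).
 * The trailing %c only matches when extra characters follow the
 * address, so exactly 4 conversions mean a clean PCI address.
 */
static int
parse_pci_addr(const char *name, struct pci_addr *addr)
{
	char tail;
	int n;

	n = sscanf(name, "%4x:%2x:%2x.%1x%c", &addr->domain, &addr->bus,
		   &addr->devid, &addr->function, &tail);
	return (n == 4) ? 0 : -1;
}

/* A name that parses as a PCI address is physical, anything else virtual. */
static enum dev_type
classify(const char *name, struct pci_addr *addr)
{
	assert(name != NULL && addr != NULL);
	return (parse_pci_addr(name, addr) == 0) ?
		DEV_TYPE_PCI : DEV_TYPE_VIRTUAL;
}
```

For example, "0000:02:00.0" would take the PCI path, while a vdev string such
as "eth_pcap0,iface=eth0" would fall through to the virtual device path.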

v11:
- Remove needless devargs handling codes.
- Replace get_vdev_name() by rte_eal_parse_devargs_str().
- Replace rte_eal_vdev_find_and_init by rte_eal_vdev_init()
- Replace rte_eal_vdev_find_and_uninit by rte_eal_vdev_uninit()
- Fix rte_eal_dev_init() to use rte_eal_vdev_init().
  (Thanks to Maxime Leroy)
v10:
- Add comments.
- Change order of version.map.
  (Thanks to Thomas Monjalon)
v9:
- Fix comments.
- Use strcmp() instead of strncmp().
- Remove RTE_EAL_INVOKE_TYPE_PROBE/CLOSE.
- Change definition of rte_dev_uninit_t.
  (Thanks to Thomas Monjalon and Maxime Leroy)
v8:
- Add missing symbol in version map.
  (Thanks to Qiu, Michael and Iremonger, Bernard)
v7:
- Fix typo of warning messages.
  (Thanks to Qiu, Michael)
v5:
- Change function names like below.
  rte_eal_dev_find_and_invoke() to rte_eal_vdev_find_and_invoke().
  rte_eal_dev_invoke() to rte_eal_vdev_invoke().
- Add code to handle a return value of rte_eal_devargs_remove().
- Fix pci address format in rte_eal_dev_detach().
v4:
- Fix comment.
- Add error checking.
- Fix indent of 'if' statement.
- Change function name.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_eal/common/eal_common_dev.c  | 285 ++--
 lib/librte_eal/common/eal_common_devargs.c  |  46 ++--
 lib/librte_eal/common/eal_private.h |  11 +
 lib/librte_eal/common/include/rte_dev.h |  33 +++
 lib/librte_eal/common/include/rte_devargs.h |  28 +++
 lib/librte_eal/linuxapp/eal/Makefile|   1 +
 lib/librte_eal/linuxapp/eal/eal_pci.c   |   6 +-
 lib/librte_eal/linuxapp/eal/rte_eal_version.map |   2 +
 8 files changed, 378 insertions(+), 34 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_dev.c 
b/lib/librte_eal/common/eal_common_dev.c
index eae5656..7d4dce6 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -32,10 +32,13 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */

+#include 
+#include 
 #include 
 #include 
 #include 

+#include 
 #include 
 #include 
 #include 
@@ -61,6 +64,37 @@ rte_eal_driver_unregister(struct rte_driver *driver)
TAILQ_REMOVE(&dev_driver_list, driver, next);
 }

+static int
+rte_eal_vdev_init(const char *name, const char *args)
+{
+   struct rte_driver *driver;
+
+   if (name == NULL)
+   return -EINVAL;
+
+   TAILQ_FOREACH(driver, &dev_driver_list, next) {
+   if (driver->type != PMD_VDEV)
+   continue;
+
+   /*
+* search a driver prefix in virtual device name.
+* For example, if the driver is pcap PMD, driver->name
+* will be "eth_pcap", but "name" will be "eth_pcapN".
+* So use strncmp to compare.
+*/
+   if (!strncmp(driver->name, name, strlen(driver->name))) {
+   driver->init(name, args);
+   break;
+   }
+   }
+
+   if (driver == NULL) {
+   RTE_LOG(WARNING, EAL, "no driver found for %s\n", name);
+   return -EINVAL;
+   }
+   return 0;
+}
+
 int
 rte_eal_dev_init(void)
 {
@@ -79,23 +113,10 @@ rte_eal_dev_init(void)
if (devargs->type != RTE_DEVTYPE_VIRTUAL)
continue;

-   TAILQ_FOREACH(driver, &dev_driver_list, next) {
-   if (driver->type != PMD_VDEV)
-   continue;
-
-   /* search a driver prefix in virtual device name */
-   if (!strncmp(driver->name, devargs->virtual.drv_name,
-   strlen(driver->name))) {
-   driver->init(devargs->virtual.drv_name,
-   devargs->args);
-   break;
-   }
-   }
-
-   if (driver == NULL) {
+   if (rte_eal_vdev_init(devargs->virtual.drv_name,
+   devargs->args))
rte_panic("no driver found for %s\n",
  devargs->virtual.drv_name);
-   }
}

/* Once the vdevs are initalized, start calling all the pdev drivers */
@@ -107,3 +128,237 @@ rte_eal_dev_init(void)
}
return 0;
 }
+
+/* So far, DPDK hotplug function only supports linux */
+#ifdef RTE_LIBRTE_EAL_HOTPLUG
+static int
+rte_eal_vdev

[dpdk-dev] [PATCH v12 13/13] doc: Add port hotplug framework section to programmers guide

2015-02-23 Thread Tetsuya Mukawa
This patch adds a new section describing the port hotplug framework.

Signed-off-by: Tetsuya Mukawa 
---
 doc/guides/prog_guide/index.rst  |   1 +
 doc/guides/prog_guide/port_hotplug_framework.rst | 110 +++
 2 files changed, 111 insertions(+)
 create mode 100644 doc/guides/prog_guide/port_hotplug_framework.rst

diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index de69682..60a6ac5 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -71,6 +71,7 @@ Programmer's Guide
 packet_classif_access_ctrl
 packet_framework
 vhost_lib
+port_hotplug_framework
 source_org
 dev_kit_build_system
 dev_kit_root_make_help
diff --git a/doc/guides/prog_guide/port_hotplug_framework.rst 
b/doc/guides/prog_guide/port_hotplug_framework.rst
new file mode 100644
index 000..355ae28
--- /dev/null
+++ b/doc/guides/prog_guide/port_hotplug_framework.rst
@@ -0,0 +1,110 @@
+..  BSD LICENSE
+Copyright(c) 2015 IGEL Co.,Ltd. All rights reserved.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of IGEL Co.,Ltd. nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Port Hotplug Framework
+======================
+
+The Port Hotplug Framework provides DPDK applications with the ability to
+attach and detach ports at runtime. Because the framework depends on PMD
+implementation, the ports that PMDs cannot handle are out of scope of this
+framework. Furthermore, after detaching a port from a DPDK application, the
+framework doesn't provide a way to remove the device from the system.
+For ports backed by a physical NIC, the kernel needs to support the PCI
+Hotplug feature.
+
+Overview
+--------
+
+The basic requirements of the Port Hotplug Framework are:
+
+*   DPDK applications that use the Port Hotplug Framework must manage their
+own ports.
+
+The Port Hotplug Framework is implemented to allow DPDK applications to
+manage ports. For example, when DPDK applications call the port attach
+function, the attached port number is returned. DPDK applications can
+also detach the port by port number.
+
+*   Kernel support is needed for attaching or detaching physical device
+ports.
+
+To attach new physical device ports, the device must first be recognized by
+a userspace driver I/O framework in the kernel. Then DPDK
+applications can call the Port Hotplug functions to attach the ports.
+For detaching, the steps are performed in reverse order.
+
+*   Before detaching, ports must be stopped and closed.
+
+DPDK applications must call "rte_eth_dev_stop()" and
+"rte_eth_dev_close()" APIs before detaching ports. These functions
+start the finalization sequence of the PMDs.
+
+*   The framework doesn't affect legacy DPDK applications behavior.
+
+If the Port Hotplug functions aren't called, all legacy DPDK apps can
+still work without modifications.
+
+Port Hotplug API overview
+-------------------------
+
+*   Attaching a port
+
+"rte_eal_dev_attach()" API attaches a port to a DPDK application, and
+returns the attached port number. Before calling the API, the device
+should be recognized by a userspace driver I/O framework. The API
+receives a pci address like ":01:00.0" or a virtual device name
+like "eth_pcap0,iface=eth0". In the case of virtual device name, the
+format is the same as the general "--vdev" option of DPDK.
+
+*   Detac

[dpdk-dev] [PATCH v12] librte_pmd_pcap: Add port hotplug support

2015-02-23 Thread Tetsuya Mukawa
This patch adds finalization code to free resources allocated by the
PMD.
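
A rough sketch of the finalization flow, assuming a small fixed table of
named port slots. struct port_slot, slot_init() and slot_uninit() are
hypothetical illustrations and not the PMD's code: uninit looks the port up
by name, frees what init allocated, and marks the slot as reusable.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_PORTS 4
#define SKETCH_NAME_LEN 32

/* Hypothetical port slot; not the real struct rte_eth_dev. */
struct port_slot {
	int attached;
	char name[SKETCH_NAME_LEN];
	void *dev_private;
};

static struct port_slot slots[SKETCH_MAX_PORTS];

/* Find the attached slot with the given name, or NULL. */
static struct port_slot *
slot_lookup(const char *name)
{
	int i;

	for (i = 0; i < SKETCH_MAX_PORTS; i++)
		if (slots[i].attached && strcmp(slots[i].name, name) == 0)
			return &slots[i];
	return NULL;
}

/* Claim a free slot and allocate its private data. */
static struct port_slot *
slot_init(const char *name, size_t priv_size)
{
	int i;

	if (name == NULL || strlen(name) >= SKETCH_NAME_LEN)
		return NULL;
	for (i = 0; i < SKETCH_MAX_PORTS; i++) {
		if (slots[i].attached)
			continue;
		slots[i].dev_private = calloc(1, priv_size);
		if (slots[i].dev_private == NULL)
			return NULL;
		strcpy(slots[i].name, name); /* length checked above */
		slots[i].attached = 1;
		return &slots[i];
	}
	return NULL;
}

/* Free everything slot_init() allocated and release the slot. */
static int
slot_uninit(const char *name)
{
	struct port_slot *slot;

	if (name == NULL)
		return -1;
	slot = slot_lookup(name);
	if (slot == NULL)
		return -1;
	free(slot->dev_private);
	slot->dev_private = NULL;
	slot->attached = 0;
	return 0;
}
```

A second slot_uninit() call on the same name fails because the slot is no
longer marked as attached, which is the behaviour one would probably want
from a detach path as well.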

v6:
 - Fix a parameter of rte_eth_dev_free().
v4:
 - Change function name.

Signed-off-by: Tetsuya Mukawa 
---
 lib/librte_pmd_pcap/rte_eth_pcap.c | 40 ++
 1 file changed, 40 insertions(+)

diff --git a/lib/librte_pmd_pcap/rte_eth_pcap.c 
b/lib/librte_pmd_pcap/rte_eth_pcap.c
index af7fae8..5e94930 100644
--- a/lib/librte_pmd_pcap/rte_eth_pcap.c
+++ b/lib/librte_pmd_pcap/rte_eth_pcap.c
@@ -498,6 +498,13 @@ static struct eth_dev_ops ops = {
.stats_reset = eth_stats_reset,
 };

+static struct eth_driver rte_pcap_pmd = {
+   .pci_drv = {
+   .name = "rte_pcap_pmd",
+   .drv_flags = RTE_PCI_DRV_DETACHABLE,
+   },
+};
+
 /*
  * Function handler that opens the pcap file for reading a stores a
  * reference of it for use it later on.
@@ -713,6 +720,10 @@ rte_pmd_init_internals(const char *name, const unsigned 
nb_rx_queues,
if (*eth_dev == NULL)
goto error;

+   /* check length of device name */
+   if ((strlen((*eth_dev)->data->name) + 1) > sizeof(data->name))
+   goto error;
+
/* now put it all together
 * - store queue data in internals,
 * - store numa_node info in pci_driver
@@ -739,10 +750,13 @@ rte_pmd_init_internals(const char *name, const unsigned 
nb_rx_queues,
data->nb_tx_queues = (uint16_t)nb_tx_queues;
data->dev_link = pmd_link;
data->mac_addrs = &eth_addr;
+   strncpy(data->name,
+   (*eth_dev)->data->name, strlen((*eth_dev)->data->name));

(*eth_dev)->data = data;
(*eth_dev)->dev_ops = &ops;
(*eth_dev)->pci_dev = pci_dev;
+   (*eth_dev)->driver = &rte_pcap_pmd;

return 0;

@@ -927,10 +941,36 @@ rte_pmd_pcap_devinit(const char *name, const char *params)

 }

+static int
+rte_pmd_pcap_devuninit(const char *name)
+{
+   struct rte_eth_dev *eth_dev = NULL;
+
+   RTE_LOG(INFO, PMD, "Closing pcap ethdev on numa socket %u\n",
+   rte_socket_id());
+
+   if (name == NULL)
+   return -1;
+
+   /* find the ethdev entry allocated for this name */
+   eth_dev = rte_eth_dev_allocated(name);
+   if (eth_dev == NULL)
+   return -1;
+
+   rte_free(eth_dev->data->dev_private);
+   rte_free(eth_dev->data);
+   rte_free(eth_dev->pci_dev);
+
+   rte_eth_dev_release_port(eth_dev);
+
+   return 0;
+}
+
 static struct rte_driver pmd_pcap_drv = {
.name = "eth_pcap",
.type = PMD_VDEV,
.init = rte_pmd_pcap_devinit,
+   .uninit = rte_pmd_pcap_devuninit,
 };

 PMD_REGISTER_DRIVER(pmd_pcap_drv);
-- 
1.9.1
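
A note on the device name copy in this patch: strncpy() is bounded by
strlen() of the source, so the characters are copied but the terminating
NUL is not, and the result is only a valid string if the destination was
already zeroed. A minimal alternative sketch, assuming a helper named
copy_dev_name() (hypothetical, not part of the patch), that combines the
length check and a terminated copy:

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy src into dst (capacity dst_size), always NUL-terminating.
 * Returns 0 on success, -1 if an argument is invalid or src does not fit.
 */
static int
copy_dev_name(char *dst, size_t dst_size, const char *src)
{
	int n;

	if (dst == NULL || src == NULL || dst_size == 0)
		return -1;
	n = snprintf(dst, dst_size, "%s", src);
	if (n < 0 || (size_t)n >= dst_size) {
		dst[0] = '\0'; /* do not leave a truncated name behind */
		return -1;
	}
	return 0;
}
```

The caller can then treat a non-zero return the same way the patch treats
a too-long name, by jumping to its error path.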



[dpdk-dev] [PATCH v12] testpmd: Add port hotplug support

2015-02-23 Thread Tetsuya Mukawa
The patch introduces the following commands.
- port attach [ident]
- port detach [port_id]
 - attach: attaching a port
 - detach: detaching a port
 - ident: pci address of physical device.
  Or device name and parameters of virtual device.
 (ex. :02:00.0, eth_pcap0,iface=eth0)
 - port_id: port identifier
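
To illustrate the two command forms, here is a standalone sketch that
parses them with plain libc. It is only an assumption-laden illustration:
testpmd itself uses librte_cmdline tokens, and parse_port_cmd() and
struct parsed_cmd are hypothetical names. The port_id range check assumes
an 8-bit port identifier.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum port_cmd { CMD_INVALID, CMD_ATTACH, CMD_DETACH };

struct parsed_cmd {
	enum port_cmd cmd;
	char ident[128];      /* PCI address or vdev name plus parameters */
	unsigned int port_id; /* only valid for CMD_DETACH */
};

/* Parse "port attach <ident>" or "port detach <port_id>". */
static int
parse_port_cmd(const char *line, struct parsed_cmd *out)
{
	char keyword[16];
	char arg[128];
	char *end;
	unsigned long v;

	if (line == NULL || out == NULL)
		return -1;
	out->cmd = CMD_INVALID;
	if (sscanf(line, "port %15s %127s", keyword, arg) != 2)
		return -1;

	if (strcmp(keyword, "attach") == 0) {
		out->cmd = CMD_ATTACH;
		strcpy(out->ident, arg); /* arg is at most 127 chars */
		return 0;
	}
	if (strcmp(keyword, "detach") == 0) {
		errno = 0;
		v = strtoul(arg, &end, 10);
		/* reject non-numeric input and values above 255 */
		if (errno != 0 || end == arg || *end != '\0' || v > 255)
			return -1;
		out->cmd = CMD_DETACH;
		out->port_id = (unsigned int)v;
		return 0;
	}
	return -1;
}
```

For example, "port attach eth_pcap0,iface=eth0" yields CMD_ATTACH with the
whole "eth_pcap0,iface=eth0" string as the identifier, and "port detach 3"
yields CMD_DETACH with port_id 3.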

v7:
- Fix doc.
  (Thanks to Iremonger, Bernard)
- Fix port checking implementation of start_port();
  (Thanks to Qiu, Michael)
v5:
- Add testpmd documentation.
  (Thanks to Iremonger, Bernard)
v4:
 - Fix strings of command help.

Signed-off-by: Tetsuya Mukawa 
---
 app/test-pmd/cmdline.c  | 137 +++
 app/test-pmd/config.c   | 102 --
 app/test-pmd/parameters.c   |  22 ++-
 app/test-pmd/testpmd.c  | 199 +---
 app/test-pmd/testpmd.h  |  18 ++-
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  57 
 6 files changed, 409 insertions(+), 126 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index c6a1627..b78c659 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -513,6 +513,12 @@ static void cmd_help_long_parsed(void *parsed_result,
"port close (port_id|all)\n"
"Close all ports or port_id.\n\n"

+   "port attach (ident)\n"
+   "Attach physical or virtual dev by pci address or 
virtual device name\n\n"
+
+   "port detach (port_id)\n"
+   "Detach physical or virtual dev by port_id\n\n"
+
"port config (port_id|all)"
" speed (10|100|1000|1|4|auto)"
" duplex (half|full|auto)\n"
@@ -793,6 +799,89 @@ cmdline_parse_inst_t cmd_operate_specific_port = {
},
 };

+/* *** attach a specified port *** */
+struct cmd_operate_attach_port_result {
+   cmdline_fixed_string_t port;
+   cmdline_fixed_string_t keyword;
+   cmdline_fixed_string_t identifier;
+};
+
+static void cmd_operate_attach_port_parsed(void *parsed_result,
+   __attribute__((unused)) struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   struct cmd_operate_attach_port_result *res = parsed_result;
+
+   if (!strcmp(res->keyword, "attach"))
+   attach_port(res->identifier);
+   else
+   printf("Unknown parameter\n");
+}
+
+cmdline_parse_token_string_t cmd_operate_attach_port_port =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
+   port, "port");
+cmdline_parse_token_string_t cmd_operate_attach_port_keyword =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
+   keyword, "attach");
+cmdline_parse_token_string_t cmd_operate_attach_port_identifier =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_attach_port_result,
+   identifier, NULL);
+
+cmdline_parse_inst_t cmd_operate_attach_port = {
+   .f = cmd_operate_attach_port_parsed,
+   .data = NULL,
+   .help_str = "port attach identifier, "
+   "identifier: pci address or virtual dev name",
+   .tokens = {
+   (void *)&cmd_operate_attach_port_port,
+   (void *)&cmd_operate_attach_port_keyword,
+   (void *)&cmd_operate_attach_port_identifier,
+   NULL,
+   },
+};
+
+/* *** detach a specified port *** */
+struct cmd_operate_detach_port_result {
+   cmdline_fixed_string_t port;
+   cmdline_fixed_string_t keyword;
+   uint8_t port_id;
+};
+
+static void cmd_operate_detach_port_parsed(void *parsed_result,
+   __attribute__((unused)) struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   struct cmd_operate_detach_port_result *res = parsed_result;
+
+   if (!strcmp(res->keyword, "detach"))
+   detach_port(res->port_id);
+   else
+   printf("Unknown parameter\n");
+}
+
+cmdline_parse_token_string_t cmd_operate_detach_port_port =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_detach_port_result,
+   port, "port");
+cmdline_parse_token_string_t cmd_operate_detach_port_keyword =
+   TOKEN_STRING_INITIALIZER(struct cmd_operate_detach_port_result,
+   keyword, "detach");
+cmdline_parse_token_num_t cmd_operate_detach_port_port_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_operate_detach_port_result,
+   port_id, UINT8);
+
+cmdline_parse_inst_t cmd_operate_detach_port = {
+   .f = cmd_operate_detach_port_parsed,
+   .data = NULL,
+   .help_str = "port detach port_id",
+   .tokens = {
+   (void *)&cmd_operate_detach_port_port,
+   (void *)&cmd_operate_detach_port_keyword,
+ 

[dpdk-dev] [PATCH] ring: fix minor memory leak of kvlist in dev init

2015-02-23 Thread John McNamara
Fix for Klocwork identified issue.

Signed-off-by: John McNamara 
---
 lib/librte_pmd_ring/rte_eth_ring.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/lib/librte_pmd_ring/rte_eth_ring.c 
b/lib/librte_pmd_ring/rte_eth_ring.c
index a23e933..88a1382 100644
--- a/lib/librte_pmd_ring/rte_eth_ring.c
+++ b/lib/librte_pmd_ring/rte_eth_ring.c
@@ -527,7 +527,7 @@ out:
 static int
 rte_pmd_ring_devinit(const char *name, const char *params)
 {
-   struct rte_kvargs *kvlist;
+   struct rte_kvargs *kvlist = NULL;
int ret = 0;
struct node_action_list *info = NULL;

@@ -569,6 +569,7 @@ rte_pmd_ring_devinit(const char *name, const char *params)
 out_free:
rte_free(info);
 out:
+   rte_kvargs_free(kvlist);
return ret;
 }

-- 
1.7.4.1



[dpdk-dev] [PATCH v4 2/5] ixgbe: enable rx queue interrupts for both PF and VF

2015-02-23 Thread Thomas Monjalon
2015-02-23 11:23, Zhou, Danny:
> I noticed the V4 patch conflicts with the latest code on the main branch due 
> to lots of code merged, and it 
> is mentioned by Jun Xu in previous email, and I will have to rebase the patch 
> and send out V5 version.

Maybe you misunderstood my comment.
I'm quoting a usage in patch 2 of max_intr, which is defined in patch 4.
You have to split and order your patches better, so that each one compiles
and git bisect is not broken.

Thanks

> > -Original Message-
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > Sent: Monday, February 23, 2015 7:20 PM
> > To: Zhou, Danny
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v4 2/5] ixgbe: enable rx queue interrupts 
> > for both PF and VF
> > 
> > 2015-02-19 21:48, Zhou Danny:
> > > + /* set max interrupt vfio request */
> > > + pci_dev->intr_handle.max_intr = hw->mac.max_rx_queues +
> > > + IXGBE_MAX_OTHER_INTR;
> > > +
> > 
> > Compilation is broken here.




[dpdk-dev] [PATCH] ring: fix minor memory leak of kvlist in dev init

2015-02-23 Thread Wodkowski, PawelX


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of John McNamara
> Sent: Monday, February 23, 2015 2:17 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] ring: fix minor memory leak of kvlist in dev init
> 
> Fix for Klockwork identified issue.
> 
> Signed-off-by: John McNamara 
> ---
>  lib/librte_pmd_ring/rte_eth_ring.c |3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/lib/librte_pmd_ring/rte_eth_ring.c
> b/lib/librte_pmd_ring/rte_eth_ring.c
> index a23e933..88a1382 100644
> --- a/lib/librte_pmd_ring/rte_eth_ring.c
> +++ b/lib/librte_pmd_ring/rte_eth_ring.c
> @@ -527,7 +527,7 @@ out:
>  static int
>  rte_pmd_ring_devinit(const char *name, const char *params)
>  {
> - struct rte_kvargs *kvlist;
> + struct rte_kvargs *kvlist = NULL;
>   int ret = 0;
>   struct node_action_list *info = NULL;
> 
> @@ -569,6 +569,7 @@ rte_pmd_ring_devinit(const char *name, const char
> *params)
>  out_free:
>   rte_free(info);
>  out:
> + rte_kvargs_free(kvlist);
>   return ret;
>  }
> 
> --
> 1.7.4.1

This is wrong/incomplete, as rte_kvargs_free() is unable to handle a NULL argument.
I have a patch under review that fixes this issue along with rte_kvargs_free().
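
To make the point concrete, a minimal sketch of the pattern under
discussion, using hypothetical stand-ins (struct kvlist_sketch,
kvlist_parse(), kvlist_free(), devinit_sketch()) rather than the real
rte_kvargs API: freeing on a common exit label is only safe when the free
routine tolerates NULL, because some paths reach the label before the list
has been allocated.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical key/value list; not the real struct rte_kvargs. */
struct kvlist_sketch {
	char *str;
};

static struct kvlist_sketch *
kvlist_parse(const char *params)
{
	struct kvlist_sketch *l;

	if (params == NULL)
		return NULL;
	l = malloc(sizeof(*l));
	if (l == NULL)
		return NULL;
	l->str = malloc(strlen(params) + 1);
	if (l->str == NULL) {
		free(l);
		return NULL;
	}
	strcpy(l->str, params);
	return l;
}

/* NULL-safe free, so callers can use it on every exit path. */
static void
kvlist_free(struct kvlist_sketch *l)
{
	if (l == NULL)
		return;
	free(l->str);
	free(l);
}

static int
devinit_sketch(const char *name, const char *params)
{
	struct kvlist_sketch *kvlist = NULL;
	int ret = -1;

	if (name == NULL)
		goto out; /* kvlist is still NULL here */
	if (params != NULL) {
		kvlist = kvlist_parse(params);
		if (kvlist == NULL)
			goto out;
	}
	ret = 0;
out:
	kvlist_free(kvlist);
	return ret;
}
```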

Pawel




[dpdk-dev] [PATCH v4 4/5] eal: add per rx queue interrupt handling based on VFIO

2015-02-23 Thread Thomas Monjalon
2015-02-23 11:47, Zhou, Danny:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > 2015-02-19 21:48, Zhou Danny:
> > > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > > @@ -43,6 +43,7 @@ CFLAGS += -I$(SRCDIR)/include
> > >  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
> > >  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
> > >  CFLAGS += -I$(RTE_SDK)/lib/librte_ring
> > > +CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf
> > >  CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
> > >  CFLAGS += -I$(RTE_SDK)/lib/librte_malloc
> > >  CFLAGS += -I$(RTE_SDK)/lib/librte_ether
> > 
> > Why do we need mbuf in EAL?
> 
> The file eal_interrupts.c includes rte_ethdev.h which defines structure 
> rte_eth_devices that 
> eal needs to use in order to get per-port intr_handle. The rte_ethdev.h 
> includes the rte_mbuf.h
> so the Makefile is updated here.

I see. You are breaking layer isolation by introducing ethdev in EAL.
The cause seems to be:

+   struct rte_intr_handle intr_handle =
+   rte_eth_devices[port_id].pci_dev->intr_handle;

Maybe that pci_dev should be a parameter of the function.
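
Something along these lines, as a sketch only. All type and function names
below are hypothetical stand-ins, not the real EAL or ethdev definitions:
the lower layer takes the interrupt handle as a parameter, and only the
upper layer knows how to map a port_id to a device.

```c
#include <stddef.h>

/* Lower layer (EAL-like): knows nothing about ethdev port tables. */
struct intr_handle_sketch {
	unsigned int max_intr;     /* number of usable vectors (<= 32 here) */
	unsigned int enabled_mask; /* bit i set: vector i is enabled */
};

static int
intr_rx_enable_sketch(struct intr_handle_sketch *handle,
		      unsigned int queue_id)
{
	if (handle == NULL || queue_id >= handle->max_intr || queue_id >= 32)
		return -1;
	handle->enabled_mask |= 1u << queue_id;
	return 0;
}

/* Upper layer (ethdev-like): resolves port_id, then calls down. */
struct pci_device_sketch {
	struct intr_handle_sketch intr_handle;
};

struct eth_dev_sketch {
	struct pci_device_sketch *pci_dev;
};

#define SKETCH_MAX_ETHPORTS 2
static struct eth_dev_sketch eth_devices_sketch[SKETCH_MAX_ETHPORTS];

static int
eth_rx_intr_enable_sketch(unsigned int port_id, unsigned int queue_id)
{
	struct pci_device_sketch *pci_dev;

	if (port_id >= SKETCH_MAX_ETHPORTS)
		return -1;
	pci_dev = eth_devices_sketch[port_id].pci_dev;
	if (pci_dev == NULL)
		return -1;
	/* pass a pointer so the lower layer updates the real handle */
	return intr_rx_enable_sketch(&pci_dev->intr_handle, queue_id);
}
```

With the dependency pointing only downwards, the lower layer would not
need the ethdev or mbuf headers at all.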


[dpdk-dev] [PATCH v11 12/13] eal/pci: Add rte_eal_dev_attach/detach() functions

2015-02-23 Thread Maxime Leroy
Hi Tetsuya,

On Mon, Feb 23, 2015 at 6:09 AM, Tetsuya Mukawa  wrote:
> These functions are used for attaching or detaching a port.
[...]
>
> +static int
> +rte_eal_vdev_init(const char *name, const char *args)
> +{
> +   struct rte_driver *driver;
> +
> +   if (name == NULL)
> +   return -EINVAL;
> +
> +   TAILQ_FOREACH(driver, &dev_driver_list, next) {
> +   if (driver->type != PMD_VDEV)
> +   continue;
> +
> +   /*
> +* search a driver prefix in virtual device name.
> +* For example, if the driver is pcap PMD, driver->name
> +* will be "eth_pcap", but "name" will be "eth_pcapN".
> +* So use strncmp to compare.
> +*/
> +   if (!strncmp(driver->name, name, strlen(driver->name))) {
> +   driver->init(name, args);
> +   break;

Please return the value given by init: return driver->init(name, args); .

> +   }
> +   }
> +
> +   if (driver == NULL) {
> +   RTE_LOG(WARNING, EAL, "no driver found for %s\n", name);
--> should be : RTE_LOG(ERR .


> +   return -EINVAL;
> +   }
> +   return 0;
> +}
> +
>  int
>  rte_eal_dev_init(void)
>  {
> @@ -79,23 +113,10 @@ rte_eal_dev_init(void)
> if (devargs->type != RTE_DEVTYPE_VIRTUAL)
> continue;
>
> -   TAILQ_FOREACH(driver, &dev_driver_list, next) {
> -   if (driver->type != PMD_VDEV)
> -   continue;
> -
> -   /* search a driver prefix in virtual device name */
> -   if (!strncmp(driver->name, devargs->virtual.drv_name,
> -   strlen(driver->name))) {
> -   driver->init(devargs->virtual.drv_name,
> -   devargs->args);
> -   break;
> -   }
> -   }
> -
> -   if (driver == NULL) {
> +   if (rte_eal_vdev_init(devargs->virtual.drv_name,
> +   devargs->args))
> rte_panic("no driver found for %s\n",
>   devargs->virtual.drv_name);
instead of that:

if (rte_eal_vdev_init(devargs->virtual.drv_name, devargs->args)) {
  RTE_LOG(ERR, EAL, "failed to initialize %s device\n",
devargs->virtual.drv_name);
  return -1;
}

> -   }
> }
>
> /* Once the vdevs are initalized, start calling all the pdev drivers 
> */
> @@ -107,3 +128,237 @@ rte_eal_dev_init(void)
> }
> return 0;
>  }
> +
> +/* So far, DPDK hotplug function only supports linux */
> +#ifdef RTE_LIBRTE_EAL_HOTPLUG
> +static int
> +rte_eal_vdev_uninit(const char *name)
> +{
> +   struct rte_driver *driver;
> +
> +   if (name == NULL)
> +   return -EINVAL;
> +
> +   TAILQ_FOREACH(driver, &dev_driver_list, next) {
> +   if (driver->type != PMD_VDEV)
> +   continue;
> +
> +   /*
> +* search a driver prefix in virtual device name.
> +* For example, if the driver is pcap PMD, driver->name
> +* will be "eth_pcap", but "name" will be "eth_pcapN".
> +* So use strncmp to compare.
> +*/
> +   if (!strncmp(driver->name, name, strlen(driver->name))) {
> +   driver->uninit(name);

Please return the value given by uninit: return driver->uninit(name);

> +   break;
> +   }
> +   }
> +
> +   if (driver == NULL) {
> +   RTE_LOG(WARNING, EAL, "no driver found for %s\n", name);
> +   return 1;

As it's an error, the function should return a negative value ( i.e. -EINVAL).
Please set the log level to ERR.
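
To summarize the return value handling suggested in the comments above, a
standalone sketch with a plain array in place of the driver TAILQ. The
names vdrv_sketch, drivers_sketch and vdev_init_sketch() are hypothetical:
the driver's own result is propagated, and a missing driver is reported
with a negative errno value.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

typedef int (*vdev_init_sketch_t)(const char *name, const char *args);

struct vdrv_sketch {
	const char *name; /* driver prefix, e.g. "eth_pcap" */
	vdev_init_sketch_t init;
};

/* Toy init callback: fails when no arguments are given. */
static int
pcap_init_sketch(const char *name, const char *args)
{
	(void)name;
	return args == NULL ? -EIO : 0;
}

static const struct vdrv_sketch drivers_sketch[] = {
	{ "eth_pcap", pcap_init_sketch },
};

static int
vdev_init_sketch(const char *name, const char *args)
{
	size_t i;

	if (name == NULL)
		return -EINVAL;

	for (i = 0; i < sizeof(drivers_sketch) / sizeof(drivers_sketch[0]); i++) {
		const struct vdrv_sketch *d = &drivers_sketch[i];

		/* the device name is "<driver prefix>N", so compare the prefix */
		if (strncmp(d->name, name, strlen(d->name)) == 0)
			return d->init(name, args); /* propagate driver result */
	}
	return -EINVAL; /* no driver matched the name */
}
```

The caller can then tell "no such driver" and "driver failed" apart from
success with a single check for a negative return value.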

> +   }
> +   return 0;
> +}
> +
[...]
> +}
> +
> +/* attach the new virtual device, then store port_id of the device */
> +static int
> +rte_eal_dev_attach_vdev(const char *vdevargs, uint8_t *port_id)
> +{
> +   char *name = NULL, *args = NULL;
> +   uint8_t new_port_id;
> +   struct rte_eth_dev devs[RTE_MAX_ETHPORTS];
> +   int ret = -1;
> +
> +   if ((vdevargs == NULL) || (port_id == NULL))
> +   goto end;
> +
> +   /* parse vdevargs, then retrieve device name and args */
> +   if (rte_eal_parse_devargs_str(vdevargs, &name, &args))
> +   goto end;
> +
> +   /* save current port status */
> +   if (rte_eth_dev_save(devs, sizeof(devs)))
> +   goto end;
> +   /* walk around dev_driver_list to find the driver of the device,
> +* then invoke probe function o the driver.
> +* TODO:
> +* rte_eal_vdev_init() should return port_id,
> +* And rte_eth_dev_save() and rte_eth_dev_get_changed

[dpdk-dev] [PATCH] af_packet: fix minor memory leak of kvlist in dev init

2015-02-23 Thread John McNamara
Fix for Klocwork identified issue.

Signed-off-by: John McNamara 
---
 lib/librte_pmd_af_packet/rte_eth_af_packet.c |   11 ---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/lib/librte_pmd_af_packet/rte_eth_af_packet.c 
b/lib/librte_pmd_af_packet/rte_eth_af_packet.c
index 1ffe1cd..7ee3b31 100644
--- a/lib/librte_pmd_af_packet/rte_eth_af_packet.c
+++ b/lib/librte_pmd_af_packet/rte_eth_af_packet.c
@@ -803,7 +803,7 @@ rte_pmd_af_packet_devinit(const char *name, const char 
*params)
 {
unsigned numa_node;
int ret;
-   struct rte_kvargs *kvlist;
+   struct rte_kvargs *kvlist = NULL;
int sockfd = -1;

RTE_LOG(INFO, PMD, "Initializing pmd_af_packet for %s\n", name);
@@ -823,16 +823,21 @@ rte_pmd_af_packet_devinit(const char *name, const char 
*params)
ret = rte_kvargs_process(kvlist, ETH_AF_PACKET_IFACE_ARG,
 &open_packet_iface, &sockfd);
if (ret < 0)
-   return -1;
+   goto out;
}

ret = rte_eth_from_packet(name, &sockfd, numa_node, kvlist);
close(sockfd); /* no longer needed */

if (ret < 0)
-   return -1;
+   goto out;

+   rte_kvargs_free(kvlist);
return 0;
+
+out:
+   rte_kvargs_free(kvlist);
+   return -1;
 }

 static struct rte_driver pmd_af_packet_drv = {
-- 
1.7.4.1



[dpdk-dev] [PATCH 0/8] Improve build process

2015-02-23 Thread Neil Horman
On Mon, Feb 23, 2015 at 10:25:01AM +, Gonzalez Monroy, Sergio wrote:
> On 22/02/2015 23:37, Neil Horman wrote:
> >On Fri, Feb 20, 2015 at 02:31:36PM +, Gonzalez Monroy, Sergio wrote:
> >>On 13/02/2015 12:51, Neil Horman wrote:
> >>>On Fri, Feb 13, 2015 at 11:08:02AM +, Gonzalez Monroy, Sergio wrote:
> On 13/02/2015 10:14, Panu Matilainen wrote:
> >On 02/12/2015 05:52 PM, Neil Horman wrote:
> >>On Thu, Feb 12, 2015 at 04:07:50PM +0200, Panu Matilainen wrote:
> >>>On 02/12/2015 02:23 PM, Neil Horman wrote:
> >[...snip...]
> >>>So I just realized that I was not having into account a possible
> >>>scenario, where
> >>>we have an app built with static dpdk libs then loading a dso
> >>>with -d
> >>>option.
> >>>
> >>>In such case, because the pmd would have DT_NEEDED entries,
> >>>dlopen will
> >>>fail.
> >>>So to enable such scenario we would need to build PMDs without
> >>>DT_NEEDED
> >>>entries.
> >>Hmm, for that to be a problem you'd need to have the PMD built
> >>against
> >>shared dpdk libs and while the application is built against
> >>static dpdk
> >>libs. I dont think that's a supportable scenario in any case.
> >>
> >>Or is there some other scenario that I'm not seeing?
> >>
> >>- Panu -
> >>
> >I agree with you. I suppose it comes down to, do we want to
> >support such
> >scenario?
> >
> > From what I can see, it seems that we do currently support such
> >scenario by
> >building dpdk apps against all static dpdk libs using
> >--whole-archive (all
> >libs and not only PMDs).
> >http://dpdk.org/browse/dpdk/commit/?id=20afd76a504155e947c770783ef5023e87136ad8
> >
> >
> >Am I misunderstanding this?
> >
> Shoot, you're right, I missed the static build aspect to this.  Yes,
> if we do the following:
> 
> 1) Build the DPDK as a static library
> 2) Link an application against (1)
> 3) Use the dlopen mechanism to load a PMD built as a DSO
> 
> Then the DT_NEEDED entries in the DSO will go unsatisfied, because
> the shared
> objects on which it (the PMD) depends will not exist in the file
> system.
> >>>I think its even more twisty:
> >>>
> >>>1) Build the DPDK as a static library
> >>>2) Link an application against (1)
> >>>3) Do another build of DPDK as a shared library
> >>>4) In app 2), use the dlopen mechanism to load a PMD built as a part
> >>>of or
> >>>against 3)
> >>>
> >>>Somehow I doubt this would work very well.
> >>>
> >>Ideally it should, presuming the ABI is preserved between (1) and (3),
> >>though I
> >>agree, up until recently, that was an assumption that was unreliable.
> >Versioning is a big and important step towards reliability but there are
> >more issues to solve. This of course getting pretty far from the original
> >topic, but at least one such issue is that there are some cases where a
> >config value affects what are apparently public structs (rte_mbuf wrt
> >RTE_MBUF_REFCNT for example), which really is a no-go.
> >
> Agree, the RTE_MBUF_REFCNT is something that needs to be dealt with asap.
> I'll look into it.
> 
> I think the problem is a little bit orthogonal to the libdpdk_core
> problem you
> were initially addressing.  That is to say, this problem of
> dlopen-ed PMD's
> exists regardless of weather you build the DPDK as part of a static
> or dynamic
> library.  The problems just happen to intersect in their
> manipulation of the
> DT_NEEDED entries.
> 
> Ok, so, given the above, I would say your approach is likely
> correct, just
> prevent DT_NEEDED entries from getting added to PMD's. Doing so will
> sidestep
> loading issue for libraries that may not exist in the filesystem,
> but thats ok,
> because by all rights, the symbols codified in those needed
> libraries should
> already be present in the running application (either made available
> by the
> application having statically linked them, or having the linker load
> them from
> the proper libraries at run time).
> >>>My 5c is that I'd much rather see the common case (all static or all
> >>>shared)
> >>>be simple and reliable, which in case of DSOs includes no lying
> >>>(whether by
> >>>omission or otherwise) about DT_NEEDED, ever. That way the issue is
> >>>dealt
> >>>once where it belongs. If somebody wants to go down the rabbit hole of
> >>>mixed
> >>>shared + static linkage, let them dig the hole by themselv

[dpdk-dev] [PATCH v2 00/11] qemu vhost-user support

2015-02-23 Thread Czesnowicz, Przemyslaw
> I tried to locally applied the patches, waiting comments are closed.
> But I stopped after patch 04/11 which makes compilation failing.
> I'm so sorry that we still don't have a vhost-user support integrated in DPDK.
> I feel it won't be ready in next days to be able to enter in 2.0 version.

Hi Thomas,

You are seeing this compile failure because Huawei was working
on an older tree and did a rebase as patch #10.
If you apply all the patches from the series they compile and work just fine. 

Unfortunately Huawei is not available at the moment.
We could squash all the patches into one and resend it to the ML.
Is that ok?

Regards
Przemek


[dpdk-dev] [PATCH] Make -Werror optional

2015-02-23 Thread Neil Horman
On Mon, Feb 23, 2015 at 10:19:23AM +0200, Panu Matilainen wrote:
> On 02/21/2015 09:33 PM, Neil Horman wrote:
> >On Fri, Feb 20, 2015 at 05:55:21PM -0800, Stephen Hemminger wrote:
> >>On Thu, 12 Feb 2015 16:54:44 +0200
> >>Panu Matilainen  wrote:
> >>
> >>>On 02/12/2015 04:38 PM, Stephen Hemminger wrote:
> On Thu, 12 Feb 2015 13:13:22 +0200
> Panu Matilainen  wrote:
> 
> >This adds new CONFIG_RTE_ERROR_ON_WARNING config option to enable
> >fail-on-warning compile behavior, defaulting to off.
> >
> >Failing build on warnings is a useful developer tool but its bad
> >for release tarballs which can and do get built with newer
> >compilers than what was used/available during development. Compilers
> >routinely add new warnings so code which built silently with cc X
> >might no longer do so with X+1. This doesn't make the existing code
> >any more buggier and failing the build in this case does not help
> >not help improve code quality of an already released version either.
> >>
> >>Hopefully distro's like RHEL will build with -Werror enabled
> >>and not allow build to go through with errors.
> >>
> >Thats usually what we do, yes.
> 
> Um, nope. All Fedora and RHEL builds are done using a common base set of
> flags set centrally from rpm configuration, and that includes among other
> things -Wall but not -Werror, although since F21 -Werror=format-security is
> included since that there are relatively few false positives for that.
> 
> The thing is, compiler warnings from compilers are just that: warnings, and
> often including hefty dose of false positives. A good package maintainer
> will look at the build logs of his/her packages, investigate warnings and
> send patches upstream to address them in oncoming versions where actually
> relevant, but generally a package maintainer in a distro is not responsible
> for achieving zero-warning build, nor should they.
> 
Um, I don't know what you've been doing, but most of my packages typically have
zero warnings.  It's true package maintainers have the option to disable
warnings, and many do for pragmatic reasons as you note, but when it's feasible,
there's no reason not to make sure no warnings get raised when you expect
there to be none.

Neil



[dpdk-dev] [PATCH v2 00/11] qemu vhost-user support

2015-02-23 Thread Thomas Monjalon
2015-02-23 13:53, Czesnowicz, Przemyslaw:
> > I tried to locally applied the patches, waiting comments are closed.
> > But I stopped after patch 04/11 which makes compilation failing.
> > I'm so sorry that we still don't have a vhost-user support integrated in 
> > DPDK.
> > I feel it won't be ready in next days to be able to enter in 2.0 version.
> 
> Hi Thomas,
> 
> You are seeing this compile failure because Huawei was working
> on an older tree and did a rebase as patch #10.
> If you apply all the patches from the series they compile and work just fine. 
> 
> Unfortunately Huawei is not available at the moment.
> We could squash all the patches into one and resend it to the ML.
> Is that ok?

Are you joking?
No, we need the patches to be well split, and each of them must compile.
Is someone able to fix the patchset properly and resubmit it?
I don't want to spend time fixing it myself.


[dpdk-dev] [PATCH v2 00/11] qemu vhost-user support

2015-02-23 Thread Czesnowicz, Przemyslaw
> 2015-02-23 13:53, Czesnowicz, Przemyslaw:
> > > I tried to locally applied the patches, waiting comments are closed.
> > > But I stopped after patch 04/11 which makes compilation failing.
> > > I'm so sorry that we still don't have a vhost-user support integrated in 
> > > DPDK.
> > > I feel it won't be ready in next days to be able to enter in 2.0 version.
> >
> > Hi Thomas,
> >
> > You are seeing this compile failure because Huawei was working on an
> > older tree and did a rebase as patch #10.
> > If you apply all the patches from the series they compile and work just 
> > fine.
> >
> > Unfortunately Huawei is not available at the moment.
> > We could squash all the patches into one and resend it to the ML.
> > Is that ok?
> 
> Are you joking?
I was expecting this answer, sorry for that.
> No we need to have patches well split and have them compiling.
> Is there someone able to fix correctly the patchset and resubmit them?
> I don't want to lose time to fix it myself.

I'll fix the patchset and resubmit.
Przemek


[dpdk-dev] [PATCH 0/5] Fix issues reported by static analysis tool

2015-02-23 Thread Pawel Wodkowski
Klocwork reports some issues against the current DPDK version. Most of them need
only cosmetic code changes (changing the type of a variable or adding an explicit
cast).

One patch, for the ring PMD, fixes a real memory leak.


Pawel Wodkowski (5):
  rte_timer: fix invalid declaration of rte_timer_cb_t
  librte_kvargs: make rte_kvargs_free() be consistent with other
"free()" functions
  pmd ring: fix possible memory leak during devinit
  cmdline: make parse_set_list() use size_t instead of int for low/high 
   parameter
  Fix usage of fgets in various places

 lib/librte_cfgfile/rte_cfgfile.c|  2 +-
 lib/librte_cmdline/cmdline_parse_portlist.c |  4 ++--
 lib/librte_eal/bsdapp/eal/eal.c |  2 +-
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c |  4 ++--
 lib/librte_eal/linuxapp/eal/eal_memory.c|  2 +-
 lib/librte_eal/linuxapp/eal/eal_pci.c   |  2 +-
 lib/librte_eal/linuxapp/eal/eal_timer.c |  2 +-
 lib/librte_kvargs/rte_kvargs.c  |  4 
 lib/librte_kvargs/rte_kvargs.h  |  3 ++-
 lib/librte_pmd_ring/rte_eth_ring.c  |  6 +++---
 lib/librte_pmd_virtio/virtio_ethdev.c   |  2 +-
 lib/librte_power/rte_power_acpi_cpufreq.c   | 10 +-
 lib/librte_timer/rte_timer.h|  4 ++--
 13 files changed, 26 insertions(+), 21 deletions(-)

-- 
1.9.1



[dpdk-dev] [PATCH] Make -Werror optional

2015-02-23 Thread Panu Matilainen
On 02/23/2015 03:55 PM, Neil Horman wrote:
> On Mon, Feb 23, 2015 at 10:19:23AM +0200, Panu Matilainen wrote:
>> On 02/21/2015 09:33 PM, Neil Horman wrote:
>>> On Fri, Feb 20, 2015 at 05:55:21PM -0800, Stephen Hemminger wrote:
>>>> On Thu, 12 Feb 2015 16:54:44 +0200
>>>> Panu Matilainen  wrote:
>>>>
>>>>> On 02/12/2015 04:38 PM, Stephen Hemminger wrote:
>>>>>> On Thu, 12 Feb 2015 13:13:22 +0200
>>>>>> Panu Matilainen  wrote:
>>>>>>
>>>>>>> This adds new CONFIG_RTE_ERROR_ON_WARNING config option to enable
>>>>>>> fail-on-warning compile behavior, defaulting to off.
>>>>>>>
>>>>>>> Failing build on warnings is a useful developer tool but its bad
>>>>>>> for release tarballs which can and do get built with newer
>>>>>>> compilers than what was used/available during development. Compilers
>>>>>>> routinely add new warnings so code which built silently with cc X
>>>>>>> might no longer do so with X+1. This doesn't make the existing code
>>>>>>> any more buggier and failing the build in this case does not help
>>>>>>> not help improve code quality of an already released version either.
>>>>
>>>> Hopefully distro's like RHEL will build with -Werror enabled
>>>> and not allow build to go through with errors.
>>>>
>>> Thats usually what we do, yes.
>>
>> Um, nope. All Fedora and RHEL builds are done using a common base set of
>> flags set centrally from rpm configuration, and that includes among other
>> things -Wall but not -Werror, although since F21 -Werror=format-security is
>> included since that there are relatively few false positives for that.
>>
>> The thing is, compiler warnings from compilers are just that: warnings, and
>> often including hefty dose of false positives. A good package maintainer
>> will look at the build logs of his/her packages, investigate warnings and
>> send patches upstream to address them in oncoming versions where actually
>> relevant, but generally a package maintainer in a distro is not responsible
>> for achieving zero-warning build, nor should they.
>>
> Um, I don't know what you've been doing, but most of my packages typically have
> zero warnings.  Its true package maintainers have the option to disable
> warnings, and many do for pragmatic reasons as you note, but when its feasible,
> theres no reason not to make sure warning doesn't get raised when you expect
> there to be none.

The question wasn't about you or me or any other individual maintainer 
or package, it was whether distros build with -Werror, and the answer to 
that is generally no.

Individual maintainers are free to do so of course, but for example with
the ubiquitous autoconf-based packages you can't just stick -Werror into
CFLAGS because it breaks a whole pile of the autoconf tests.

But this is getting wildly off-topic for dpdk dev, I'll shut up now :)

- Panu -


[dpdk-dev] [PATCH 1/5] rte_timer: fix invalid declaration of rte_timer_cb_t

2015-02-23 Thread Pawel Wodkowski
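A function pointer type should be declared as
typedef ret_type (*type_name)(args...)
not
typedef ret_type (type_name)(args...)

The second form declares a function type rather than a pointer type, which
is why the struct member had to be spelled "rte_timer_cb_t *f". Compilers
accept both forms, but static analysis tools like Klocwork complain about
the second one.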

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_timer/rte_timer.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_timer/rte_timer.h b/lib/librte_timer/rte_timer.h
index 4907cf5..327fe4b 100644
--- a/lib/librte_timer/rte_timer.h
+++ b/lib/librte_timer/rte_timer.h
@@ -115,7 +115,7 @@ struct rte_timer;
 /**
  * Callback function type for timer expiry.
  */
-typedef void (rte_timer_cb_t)(struct rte_timer *, void *);
+typedef void (*rte_timer_cb_t)(struct rte_timer *, void *);

 #define MAX_SKIPLIST_DEPTH 10

@@ -128,7 +128,7 @@ struct rte_timer
struct rte_timer *sl_next[MAX_SKIPLIST_DEPTH];
volatile union rte_timer_status status; /**< Status of timer. */
uint64_t period;   /**< Period of timer (0 if not periodic). */
-   rte_timer_cb_t *f; /**< Callback function. */
+   rte_timer_cb_t f;  /**< Callback function. */
void *arg; /**< Argument to callback function. */
 };

-- 
1.9.1
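
A minimal sketch of the two typedef forms discussed in this patch, using a
stand-in struct demo_timer rather than the real struct rte_timer. All demo_*
names are hypothetical and only illustrate the declaration syntax:

```c
#include <assert.h>
#include <stddef.h>

struct demo_timer;  /* hypothetical stand-in for struct rte_timer */

/* Old form: names a function type, so members need an explicit '*'. */
typedef void (demo_fn_t)(struct demo_timer *, void *);
/* New form: names a pointer-to-function type directly. */
typedef void (*demo_fn_ptr_t)(struct demo_timer *, void *);

struct demo_timer {
    demo_fn_t *f_old;     /* pointer built from the function type */
    demo_fn_ptr_t f_new;  /* same pointer type via the new typedef */
    int fired;            /* number of times a callback ran */
};

static void demo_expire(struct demo_timer *t, void *arg)
{
    (void)arg;
    t->fired++;
}

/* Call the callback through both members; returns the fire count. */
int demo_fire_both(void)
{
    struct demo_timer t = { demo_expire, demo_expire, 0 };

    assert(t.f_old == t.f_new);  /* both members hold the same pointer */
    t.f_old(&t, NULL);
    t.f_new(&t, NULL);
    return t.fired;
}
```

Both members end up with the same pointer type; only the spelling of the
typedef and of the member declaration differs.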



[dpdk-dev] [PATCH 2/5] librte_kvargs: make rte_kvargs_free() be consistent with other "free()" functions

2015-02-23 Thread Pawel Wodkowski
All *_free() functions should mimic the behaviour of the libc free()
function, which does nothing if the given parameter is NULL. This patch
adds this behaviour to rte_kvargs_free().

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_kvargs/rte_kvargs.c | 4 
 lib/librte_kvargs/rte_kvargs.h | 3 ++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/librte_kvargs/rte_kvargs.c b/lib/librte_kvargs/rte_kvargs.c
index 8bc1e46..c2dd051 100644
--- a/lib/librte_kvargs/rte_kvargs.c
+++ b/lib/librte_kvargs/rte_kvargs.c
@@ -174,8 +174,12 @@ rte_kvargs_process(const struct rte_kvargs *kvlist,
 void
 rte_kvargs_free(struct rte_kvargs *kvlist)
 {
+   if (!kvlist)
+   return;
+
if (kvlist->str != NULL)
free(kvlist->str);
+
free(kvlist);
 }

diff --git a/lib/librte_kvargs/rte_kvargs.h b/lib/librte_kvargs/rte_kvargs.h
index ef4efab..ae9ae79 100644
--- a/lib/librte_kvargs/rte_kvargs.h
+++ b/lib/librte_kvargs/rte_kvargs.h
@@ -115,7 +115,8 @@ void rte_kvargs_free(struct rte_kvargs *kvlist);
  *
  * For each key/value association that matches the given key, calls the
  * handler function with the for a given arg_name passing the value on the
- * dictionary for that key and a given extra argument.
+ * dictionary for that key and a given extra argument. If *kvlist* is NULL
+ * function does nothing.
  *
  * @param kvlist
  *   The rte_kvargs structure
-- 
1.9.1
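
The free()-like convention described above can be sketched as follows. This
is an illustration only: struct demo_kvlist and the demo_* functions are
hypothetical stand-ins, not the rte_kvargs API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_kvargs: owns one heap string. */
struct demo_kvlist {
    char *str;
};

/* Allocate a list holding a private copy of s; returns NULL on failure. */
struct demo_kvlist *demo_kvlist_new(const char *s)
{
    struct demo_kvlist *kv;

    assert(s != NULL);
    kv = malloc(sizeof(*kv));
    if (kv == NULL)
        return NULL;
    kv->str = malloc(strlen(s) + 1);
    if (kv->str == NULL) {
        free(kv);
        return NULL;
    }
    strcpy(kv->str, s);
    return kv;
}

/* Like free(): passing NULL is a no-op instead of a crash. */
void demo_kvlist_free(struct demo_kvlist *kv)
{
    if (kv == NULL)
        return;
    free(kv->str);  /* free(NULL) is itself a no-op */
    free(kv);
}
```

With this convention, callers can release a pointer unconditionally on every
exit path without first checking whether the allocation ever happened.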



[dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application

2015-02-23 Thread Jastrzebski, MichalX K
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, February 23, 2015 12:46 PM
> To: Wodkowski, PawelX
> Cc: dev at dpdk.org; Jastrzebski, MichalX K; Neil Horman
> Subject: Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and
> example application
> 
> 2015-02-20 15:46, Jastrzebski, MichalX K:
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > > On Thu, Feb 19, 2015 at 01:18:41PM +0100, Pawel Wodkowski wrote:
> > > > Hi community,
> > > > I would like to introduce library for measuring load of some arbitrary
> jobs.
> > > > It can be used to profile every kind of job sets on any arbitrary 
> > > > execution
> unit
> > > > or tasking library.
> > > >
> > > > In provided l2fwd-headroom example I demonstrate how to use this
> library to
> > > > select optimal rx burst poll time. Jobs are selected by using existing
> rte_timer
> > > > library calls. This example does no limit possible schemes on which this
> > > > library can be used.
> > > >
> > > > Pawel Wodkowski (3):
> > > >   librte_headroom: New library for checking core/system/app load
> > > >   examples: introduce new l2fwd-headroom example
> > > >   MAINTAINERS: claim responsibility for headroom library and example
> app
> > >
> > > I'm sorry but I still fail to see how this is a particularly useful 
> > > library.  It
> > > clearly works fine, but it composes an application event loop in its own
> > > terms,
> > > and measures stats based on that.  While thats ok, any application is
> already
> > > going to have to write its own event loop, and can makethe same
> > > measurements
> > > synchnously within that loop, using alot less code to optimize its polling
> time.
> > >
> > > In other words, I think this is one of those cases where this library is
> > > probably somewhat useful for anyone who just wants to write an
> application
> > > in
> > > terms the semantics exposed by this library, but not at all useful for
> anyone
> > > else.  I'd personally rather not have the extra code to maintain here.
> > >
> > > Stephen just gave a presentation at netdev about some of the
> performance
> > > optimization measurements Brocade did with DPDK and how they fine
> tuned
> > > their
> > > environment.  One of the big take aways for me was that making time
> based
> > > measurements (especially if it was using the tsc), created cpu stalls that
> > > skewed the measurements, and so the best optimizations they made
> avoided
> > > time
> > > measurements, opting instead for packet count metrics.
> > >
> > > Neil
> >
> > Hi Neil,
> >
> > I think this library offers something quite useful probably not for 
> > everyone,
> > but for many people that use DPDK, and it is measuring quite accurately,
> > how many spare cycles a CPU have after executing any serial tasks (as you
> will know).
> > If you look at two places in example application: main_loop()
> > and l2fwd_fwd() functions, you will see two possible approach there, but
> > this is not limited to that. You can even nest headroom objects and
> measure
> > process time of particular packets type.
> > Of course, this will add an overhead due to the measurements,
> > but that time is also measured, so any user can know what is the relative
> > time "wasted" for measuring all this.
> > If time delays are measured in bigger timestamps, are handled reliably,
> > the cost of measuring will be low.
> > I find this quite similar to the power library case. I would say that 
> > library is
> not useful
> > for every application, but there are several cases where it can be
> > (as demonstrated with l3fwd-power app).
> >
> > About your last bit, not sure if I understood it right, but in case of the
> included sample app,
> > the main measurement to see if we are overusing a CPU is the packet count
> > in a queue (in this case RX queue), and I believe this should be used for
> other apps,
> > especially in those that use a pipeline model, where queues and rings are
> the key part.
> >
> > As a final point, last week (12th of February), there was a request for a
> tool/library like this
> > from a user in the mailing list (Ilan Borenshtein), which indicates that 
> > this
> would be useful
> > (probably not just for him, but for others). It probably could be achieved 
> > by
> the user
> > by adding their own code, but I believe this library would be a good-to-
> have,
> > in case a user is looking for an easy way to calculate the exposed above.
> > Let us give the users an example of this method and we will expand it with
> more
> > advanced application that may show capabilities of dynamic load scaling
> based on headroom library measurement.
> 
Hi Thomas,

> I wonder how this library is related to DPDK.
> I'm not against its integration, though the question must be asked.
> DPDK is a set of libraries. What kind of library fit with DPDK goals and
> deserve to be integrated?
> 
I think this library fits into dpdk goals, because it is 

[dpdk-dev] [PATCH 3/5] pmd ring: fix possible memory leak during devinit

2015-02-23 Thread Pawel Wodkowski
Free kvlist on function exit to avoid memory leak.

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_pmd_ring/rte_eth_ring.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_pmd_ring/rte_eth_ring.c 
b/lib/librte_pmd_ring/rte_eth_ring.c
index a23e933..582a621 100644
--- a/lib/librte_pmd_ring/rte_eth_ring.c
+++ b/lib/librte_pmd_ring/rte_eth_ring.c
@@ -527,7 +527,7 @@ out:
 static int
 rte_pmd_ring_devinit(const char *name, const char *params)
 {
-   struct rte_kvargs *kvlist;
+   struct rte_kvargs *kvlist = NULL;
int ret = 0;
struct node_action_list *info = NULL;

@@ -548,7 +548,7 @@ rte_pmd_ring_devinit(const char *name, const char *params)
info = rte_zmalloc("struct node_action_list", 
sizeof(struct node_action_list) +
   (sizeof(struct node_action_pair) * 
ret), 0);
if (!info)
-   goto out;
+   goto out_free;

info->total = ret;
info->list = (struct node_action_pair*)(info + 1);
@@ -567,8 +567,8 @@ rte_pmd_ring_devinit(const char *name, const char *params)
}

 out_free:
+   rte_kvargs_free(kvlist);
rte_free(info);
-out:
return ret;
 }

-- 
1.9.1
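
A sketch of the single-cleanup-label pattern this patch moves to, assuming
NULL-tolerant free functions. The demo_* helpers are hypothetical and count
live allocations only so that the absence of a leak can be checked:

```c
#include <assert.h>
#include <stdlib.h>

/* Number of allocations not yet freed; lets the demo check for leaks. */
static int demo_live;

static void *demo_alloc(size_t n)
{
    void *p = malloc(n);

    if (p != NULL)
        demo_live++;
    return p;
}

/* NULL-tolerant free, so cleanup code needs no conditionals. */
static void demo_free(void *p)
{
    if (p == NULL)
        return;
    assert(demo_live > 0);
    demo_live--;
    free(p);
}

/* Returns 0 on success, -1 on failure; frees everything either way. */
int demo_devinit(int fail_second)
{
    void *kvlist = NULL;  /* initialised so cleanup is always safe */
    void *info = NULL;
    int ret = -1;

    kvlist = demo_alloc(16);
    if (kvlist == NULL)
        goto out_free;

    info = fail_second ? NULL : demo_alloc(32);
    if (info == NULL)
        goto out_free;

    ret = 0;

out_free:
    demo_free(kvlist);
    demo_free(info);
    return ret;
}

/* Accessor for the leak counter. */
int demo_live_allocations(void)
{
    return demo_live;
}
```

Initialising every pointer to NULL and funnelling all exits through one label
means a newly added early exit cannot skip a free.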



[dpdk-dev] [PATCH 4/5] cmdline: make parse_set_list() use size_t instead of int for low/high parameter

2015-02-23 Thread Pawel Wodkowski
Fix a warning reported by Klocwork about a size_t to int cast when passing
parameters to parse_set_list().

This patch also fixes code formatting issues that checkpatch.pl reports as
errors on the generated patch.

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_cmdline/cmdline_parse_portlist.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_cmdline/cmdline_parse_portlist.c 
b/lib/librte_cmdline/cmdline_parse_portlist.c
index fc6c14e..9c1fe3e 100644
--- a/lib/librte_cmdline/cmdline_parse_portlist.c
+++ b/lib/librte_cmdline/cmdline_parse_portlist.c
@@ -78,7 +78,7 @@ struct cmdline_token_ops cmdline_token_portlist_ops = {
 };

 static void
-parse_set_list(cmdline_portlist_t * pl, int low, int high)
+parse_set_list(cmdline_portlist_t *pl, size_t low, size_t high)
 {
do {
pl->map |= (1 << low++);
@@ -86,7 +86,7 @@ parse_set_list(cmdline_portlist_t * pl, int low, int high)
 }

 static int
-parse_ports(cmdline_portlist_t * pl, const char * str)
+parse_ports(cmdline_portlist_t *pl, const char *str)
 {
size_t ps, pe;
const char *first, *last;
-- 
1.9.1
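
A hedged sketch of setting a range of bits with size_t bounds, assuming the
port map is a 32-bit bitmap. demo_set_range and DEMO_MAX_PORTS are
hypothetical names; the range check and the unsigned shift are additions for
illustration and are not part of the patch above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_PORTS 32  /* assumption: the map is a 32-bit bitmap */

/* Set bits low..high (inclusive) in *map; returns -1 on a bad range. */
int demo_set_range(uint32_t *map, size_t low, size_t high)
{
    size_t i;

    assert(map != NULL);
    if (low > high || high >= DEMO_MAX_PORTS)
        return -1;
    for (i = low; i <= high; i++)
        *map |= UINT32_C(1) << i;  /* unsigned shift, i is below 32 */
    return 0;
}
```

Shifting an unsigned 32-bit constant keeps bit 31 well defined, which a plain
int constant does not guarantee.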



[dpdk-dev] [PATCH 5/5] Fix usage of fgets in various places

2015-02-23 Thread Pawel Wodkowski
The declaration of fgets() is
char *fgets(char *str, int size, FILE *stream);

Klocwork complains about passing "sizeof()" as the size parameter, since the
implicit conversion from size_t to int might cause loss of precision. Make
the conversion explicit with a cast.

Signed-off-by: Pawel Wodkowski 
---
 lib/librte_cfgfile/rte_cfgfile.c|  2 +-
 lib/librte_eal/bsdapp/eal/eal.c |  2 +-
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c |  4 ++--
 lib/librte_eal/linuxapp/eal/eal_memory.c|  2 +-
 lib/librte_eal/linuxapp/eal/eal_pci.c   |  2 +-
 lib/librte_eal/linuxapp/eal/eal_timer.c |  2 +-
 lib/librte_pmd_virtio/virtio_ethdev.c   |  2 +-
 lib/librte_power/rte_power_acpi_cpufreq.c   | 10 +-
 8 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/lib/librte_cfgfile/rte_cfgfile.c b/lib/librte_cfgfile/rte_cfgfile.c
index b81c273..15ef447 100644
--- a/lib/librte_cfgfile/rte_cfgfile.c
+++ b/lib/librte_cfgfile/rte_cfgfile.c
@@ -107,7 +107,7 @@ rte_cfgfile_load(const char *filename, int flags)

memset(cfg->sections, 0, sizeof(cfg->sections[0]) * allocated_sections);

-   while (fgets(buffer, sizeof(buffer), f) != NULL) {
+   while (fgets(buffer, (int)sizeof(buffer), f) != NULL) {
char *pos = NULL;
size_t len = strnlen(buffer, sizeof(buffer));
lineno++;
diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index 69f3c03..ca51868 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -134,7 +134,7 @@ eal_parse_sysfs_value(const char *filename, unsigned long 
*val)
return -1;
}

-   if (fgets(buf, sizeof(buf), f) == NULL) {
+   if (fgets(buf, (int)sizeof(buf), f) == NULL) {
RTE_LOG(ERR, EAL, "%s(): cannot read sysfs value %s\n",
__func__, filename);
fclose(f);
diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c 
b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 590cb56..551472c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -115,7 +115,7 @@ get_default_hp_size(void)
FILE *fd = fopen(proc_meminfo, "r");
if (fd == NULL)
rte_panic("Cannot open %s\n", proc_meminfo);
-   while(fgets(buffer, sizeof(buffer), fd)){
+   while (fgets(buffer, (int)sizeof(buffer), fd)) {
if (strncmp(buffer, str_hugepagesz, hugepagesz_len) == 0){
size = rte_str_to_size(&buffer[hugepagesz_len]);
break;
@@ -155,7 +155,7 @@ get_hugepage_dir(uint64_t hugepage_sz)
if (default_size == 0)
default_size = get_default_hp_size();

-   while (fgets(buf, sizeof(buf), fd)){
+   while (fgets(buf, (int)sizeof(buf), fd)) {
if (rte_strsplit(buf, sizeof(buf), splitstr, _FIELDNAME_MAX,
split_tok) != _FIELDNAME_MAX) {
RTE_LOG(ERR, EAL, "Error parsing %s\n", proc_mounts);
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index a67a1b0..0c7f8ce 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -614,7 +614,7 @@ find_numasocket(struct hugepage_file *hugepg_tbl, struct 
hugepage_info *hpi)
"%s/%s", hpi->hugedir, internal_config.hugefile_prefix);

/* parse numa map */
-   while (fgets(buf, sizeof(buf), f) != NULL) {
+   while (fgets(buf, (int)sizeof(buf), f) != NULL) {

/* ignore non huge page */
if (strstr(buf, " huge ") == NULL &&
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c 
b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 63bcbce..ee4e1d8 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -162,7 +162,7 @@ pci_parse_sysfs_resource(const char *filename, struct 
rte_pci_device *dev)

for (i = 0; i
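
A small sketch of the fgets() size issue described in this patch.
demo_read_line is a hypothetical wrapper; the INT_MAX guard is an extra
precaution for illustration, whereas the patch itself only adds the cast:

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Read one line into buf (capacity cap); returns its length or -1. */
long demo_read_line(FILE *f, char *buf, size_t cap)
{
    assert(f != NULL && buf != NULL);
    if (cap == 0 || cap > (size_t)INT_MAX)  /* cast must not truncate */
        return -1;
    if (fgets(buf, (int)cap, f) == NULL)    /* explicit size_t -> int */
        return -1;
    return (long)strlen(buf);
}
```

For the small stack buffers used in these files the cast cannot truncate, so
it mainly documents the conversion for the analysis tool.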

[dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application

2015-02-23 Thread Thomas Monjalon
2015-02-23 14:36, Jastrzebski, MichalX K:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > 2015-02-20 15:46, Jastrzebski, MichalX K:
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > > > On Thu, Feb 19, 2015 at 01:18:41PM +0100, Pawel Wodkowski wrote:
> > > > > Hi community,
> > > > > I would like to introduce library for measuring load of some arbitrary
> > jobs.
> > > > > It can be used to profile every kind of job sets on any arbitrary 
> > > > > execution
> > unit
> > > > > or tasking library.
> > > > >
> > > > > In provided l2fwd-headroom example I demonstrate how to use this
> > library to
> > > > > select optimal rx burst poll time. Jobs are selected by using existing
> > rte_timer
> > > > > library calls. This example does no limit possible schemes on which 
> > > > > this
> > > > > library can be used.
> > > > >
> > > > > Pawel Wodkowski (3):
> > > > >   librte_headroom: New library for checking core/system/app load
> > > > >   examples: introduce new l2fwd-headroom example
> > > > >   MAINTAINERS: claim responsibility for headroom library and example
> > app
> > > >
> > > > I'm sorry but I still fail to see how this is a particularly useful 
> > > > library.  It
> > > > clearly works fine, but it composes an application event loop in its own
> > > > terms,
> > > > and measures stats based on that.  While thats ok, any application is
> > already
> > > > going to have to write its own event loop, and can makethe same
> > > > measurements
> > > > synchnously within that loop, using alot less code to optimize its 
> > > > polling
> > time.
> > > >
> > > > In other words, I think this is one of those cases where this library is
> > > > probably somewhat useful for anyone who just wants to write an
> > application
> > > > in
> > > > terms the semantics exposed by this library, but not at all useful for
> > anyone
> > > > else.  I'd personally rather not have the extra code to maintain here.
> > > >
> > > > Stephen just gave a presentation at netdev about some of the
> > performance
> > > > optimization measurements Brocade did with DPDK and how they fine
> > tuned
> > > > their
> > > > environment.  One of the big take aways for me was that making time
> > based
> > > > measurements (especially if it was using the tsc), created cpu stalls 
> > > > that
> > > > skewed the measurements, and so the best optimizations they made
> > avoided
> > > > time
> > > > measurements, opting instead for packet count metrics.
> > > >
> > > > Neil
> > >
> > > Hi Neil,
> > >
> > > I think this library offers something quite useful probably not for 
> > > everyone,
> > > but for many people that use DPDK, and it is measuring quite accurately,
> > > how many spare cycles a CPU have after executing any serial tasks (as you
> > will know).
> > > If you look at two places in example application: main_loop()
> > > and l2fwd_fwd() functions, you will see two possible approach there, but
> > > this is not limited to that. You can even nest headroom objects and
> > measure
> > > process time of particular packets type.
> > > Of course, this will add an overhead due to the measurements,
> > > but that time is also measured, so any user can know what is the relative
> > > time "wasted" for measuring all this.
> > > If time delays are measured in bigger timestamps, are handled reliably,
> > > the cost of measuring will be low.
> > > I find this quite similar to the power library case. I would say that 
> > > library is
> > not useful
> > > for every application, but there are several cases where it can be
> > > (as demonstrated with l3fwd-power app).
> > >
> > > About your last bit, not sure if I understood it right, but in case of the
> > included sample app,
> > > the main measurement to see if we are overusing a CPU is the packet count
> > > in a queue (in this case RX queue), and I believe this should be used for
> > other apps,
> > > especially in those that use a pipeline model, where queues and rings are
> > the key part.
> > >
> > > As a final point, last week (12th of February), there was a request for a
> > tool/library like this
> > > from a user in the mailing list (Ilan Borenshtein), which indicates that 
> > > this
> > would be useful
> > > (probably not just for him, but for others). It probably could be 
> > > achieved by
> > the user
> > > by adding their own code, but I believe this library would be a good-to-
> > have,
> > > in case a user is looking for an easy way to calculate the exposed above.
> > > Let us give the users an example of this method and we will expand it with
> > more
> > > advanced application that may show capabilities of dynamic load scaling
> > based on headroom library measurement.
> > 
> Hi Thomas,
> 
> > I wonder how this library is related to DPDK.
> > I'm not against its integration, though the question must be asked.
> > DPDK is a set of libraries. What kind of library fit with DPDK goals and
> > deserve to be integrated?
> > 
> I think this

[dpdk-dev] Appropriate DPDK data structures for TCP sockets

2015-02-23 Thread Matt Laswell
Hey Matthew,

I've mostly worked on stackless systems over the last few years, but I have
done a fair bit of work on high performance, highly scalable connection
tracking data structures.  In that spirit, here are a few counterintuitive
insights I've gained over the years.  Perhaps they'll be useful to you.
Apologies in advance for likely being a bit long-winded.

First, you really need to take cache performance into account when you're
choosing a data structure.  Something like a balanced tree can seem awfully
appealing at first blush, either on its own or as a chaining mechanism for
a hash table.  But the problem with trees is that there really isn't much
locality of reference in your memory use - every single step in your
descent ends up being a cache miss.  This hurts you twice: once that you
end up stalled waiting for the next node in the tree to load from main
memory, and again when you have to reload whatever you pushed out of cache
to get it.

It's often better if, instead of a tree, you do linear search across arrays
of hash values.  It's easy to size the array so that it is exactly one
cache line long, and you can generally do linear search of the whole thing
in less time than it takes to do a single cache line fill.   If you find a
match, you can do full verification against the full tuple as needed.
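
As a rough sketch of that idea, assuming 4-byte hash values and a 64-byte
cache line (all demo_* names and the tuple layout are made up for
illustration, and 0 is reserved as the "empty slot" marker):

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SLOTS 16  /* 16 x 4-byte hashes = 64 bytes */

struct demo_tuple {    /* hypothetical connection 5-tuple */
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t proto;
};

struct demo_bucket {
    uint32_t hash[DEMO_SLOTS];          /* scanned linearly; 0 = empty */
    struct demo_tuple key[DEMO_SLOTS];  /* touched only on a hash match */
};

static int demo_tuple_eq(const struct demo_tuple *a,
                         const struct demo_tuple *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->proto == b->proto;
}

/* Returns the slot index of the match, or -1 if the tuple is absent. */
int demo_bucket_find(const struct demo_bucket *b, uint32_t h,
                     const struct demo_tuple *t)
{
    int i;

    for (i = 0; i < DEMO_SLOTS; i++)
        if (b->hash[i] == h && demo_tuple_eq(&b->key[i], t))
            return i;
    return -1;
}

/* Stores the tuple in the first free slot; returns its index or -1. */
int demo_bucket_insert(struct demo_bucket *b, uint32_t h,
                       const struct demo_tuple *t)
{
    int i;

    assert(h != 0);  /* 0 is reserved as the empty marker */
    for (i = 0; i < DEMO_SLOTS; i++) {
        if (b->hash[i] == 0) {
            b->hash[i] = h;
            b->key[i] = *t;
            return i;
        }
    }
    return -1;
}
```

The hash array is the only memory read during a miss; the full tuples are
compared only for slots whose hash already matched.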

Second, rather than synchronizing (perhaps with locks, perhaps with
lockless data structures), it's often beneficial to create multiple
threads, each of which holds a fraction of your connection tracking data.
Every connection belongs to a single one of these threads, selected perhaps
by hash or RSS value, and all packets from the connection go through that
single thread.  This approach has a couple of advantages.  First,
obviously, no slowdowns for synchronization.  But, second, I've found that
when you are spreading packets from a single connection across many compute
elements, you're inevitably going to start putting packets out of order.
In many applications, this ultimately leads to some additional processing
to put things back in order, which gives away the performance gains you
achieved.  Of course, this approach brings its own set of complexities, and
challenges for your application, and doesn't always spread the work as
efficiently across all of your cores.  But it might be worth considering.
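
One way to picture the ownership rule, as a sketch only: the hash below is a
toy, direction-independent mix invented for this example (a real system
might use the NIC's RSS value instead), and the demo_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Toy hash: XOR makes (a,b) and (b,a) hash identically on purpose. */
uint32_t demo_flow_hash(uint32_t ip_a, uint16_t port_a,
                        uint32_t ip_b, uint16_t port_b)
{
    uint32_t h = (ip_a ^ ip_b) * UINT32_C(2654435761);  /* simple mix */

    return h ^ (uint32_t)(port_a ^ port_b);
}

/* Pick the one worker that owns this flow's tracking state. */
unsigned demo_owner(uint32_t flow_hash, unsigned n_workers)
{
    assert(n_workers > 0);
    return flow_hash % n_workers;
}
```

Because both directions of a connection hash to the same value, they land on
the same worker and that worker's table needs no locking.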

Third, it's very worthwhile to have a cache for the most recently accessed
connection.  First, because network traffic is bursty, and you'll
frequently see multiple packets from the same connection in succession.
Second, because it can make life easier for your application code.  If you
have multiple places that need to access connection data, you don't have to
worry so much about the cost of repeated searches.  Again, this may or may
not matter for your particular application.  But for ones I've worked on,
it's been a win.
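
A minimal sketch of such a one-entry cache, with a linear scan standing in
for whatever the real lookup structure is. The demo_* types and the
slow_lookups counter are hypothetical and exist only to make the effect
visible:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_conn {
    uint32_t id;        /* hypothetical connection key */
    uint64_t packets;
};

struct demo_table {
    struct demo_conn *conns;
    size_t n;
    struct demo_conn *last;      /* most recently accessed connection */
    unsigned long slow_lookups;  /* how often the full search ran */
};

/* Full search; a linear scan stands in for the real data structure. */
static struct demo_conn *demo_slow_find(struct demo_table *t, uint32_t id)
{
    size_t i;

    t->slow_lookups++;
    for (i = 0; i < t->n; i++)
        if (t->conns[i].id == id)
            return &t->conns[i];
    return NULL;
}

/* Check the one-entry cache first, then fall back to the full search. */
struct demo_conn *demo_find(struct demo_table *t, uint32_t id)
{
    struct demo_conn *c;

    assert(t != NULL);
    if (t->last != NULL && t->last->id == id)
        return t->last;
    c = demo_slow_find(t, id);
    if (c != NULL)
        t->last = c;
    return c;
}
```

Repeated lookups for the same connection, whether from a packet burst or
from several call sites handling one packet, skip the full search.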

Anyway, as predicted, this post has gone far too long for a Monday
morning.  Regardless, I hope you found it useful.  Let me know if you have
questions or comments.

--
Matt Laswell
infinite io, inc.
laswell at infiniteio.com

On Sun, Feb 22, 2015 at 10:50 PM, Matthew Hall 
wrote:

>
> On Feb 22, 2015, at 4:02 PM, Stephen Hemminger 
> wrote:
> > Use userspace RCU? or BSD RB_TREE
>
> Thanks Stephen,
>
> I think the RB_TREE stuff is single threaded mostly.
>
> But user-space RCU looks quite good indeed, I didn't know somebody ported
> it out of the kernel. I'll check it out.
>
> Matthew.


[dpdk-dev] [PATCH v2] mk: Rework gcc version detection to permit versions newer than 4.x

2015-02-23 Thread Panu Matilainen
Separately comparing major and minor versions becomes seriously clumsy
when the major version changes. Convert the entire version string into
a numeric value (i.e. 4.6.0 becomes 460 and 5.0.0 becomes 500) and use
that for comparisons, eliminating unnecessary negations while at it.
This makes the comparisons simpler and more obvious, and makes gcc 5.0
naturally recognized as at least as capable as the newest 4.x.

This three-digit scheme would run into trouble if gcc ever went to
two-digit version segments, but that hasn't happened in the last 10+
years so it seems like a safe assumption.

Signed-off-by: Panu Matilainen 
---
 lib/librte_pmd_fm10k/Makefile|  2 +-
 lib/librte_pmd_i40e/Makefile |  2 +-
 lib/librte_pmd_ixgbe/Makefile|  6 +++---
 lib/librte_pmd_vmxnet3/Makefile  |  2 +-
 mk/toolchain/gcc/rte.toolchain-compat.mk | 22 ++
 5 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/lib/librte_pmd_fm10k/Makefile b/lib/librte_pmd_fm10k/Makefile
index 986f4ef..2902d64 100644
--- a/lib/librte_pmd_fm10k/Makefile
+++ b/lib/librte_pmd_fm10k/Makefile
@@ -62,7 +62,7 @@ else
 #
 # CFLAGS for gcc
 #
-ifneq ($(shell test $(GCC_MAJOR_VERSION) -le 4 -a $(GCC_MINOR_VERSION) -le 3 
&& echo 1), 1)
+ifeq ($(shell test $(GCC_VERSION) -ge 440 && echo 1), 1)
 CFLAGS += -Wno-deprecated
 endif
 CFLAGS_BASE_DRIVER = -Wno-unused-parameter -Wno-unused-value
diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile
index 9a0eec8..484379a 100644
--- a/lib/librte_pmd_i40e/Makefile
+++ b/lib/librte_pmd_i40e/Makefile
@@ -69,7 +69,7 @@ CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast
 CFLAGS_BASE_DRIVER += -Wno-format-nonliteral
 CFLAGS_BASE_DRIVER += -Wno-format-security

-ifeq ($(shell test $(GCC_MAJOR_VERSION) -ge 4 -a $(GCC_MINOR_VERSION) -ge 4 && 
echo 1), 1)
+ifeq ($(shell test $(GCC_VERSION) -ge 440 && echo 1), 1)
 CFLAGS_BASE_DRIVER += -Wno-unused-but-set-variable
 endif

diff --git a/lib/librte_pmd_ixgbe/Makefile b/lib/librte_pmd_ixgbe/Makefile
index 0279f8c..ab56cbf 100644
--- a/lib/librte_pmd_ixgbe/Makefile
+++ b/lib/librte_pmd_ixgbe/Makefile
@@ -60,18 +60,18 @@ else
 #
 # CFLAGS for gcc
 #
-ifneq ($(shell test $(GCC_MAJOR_VERSION) -le 4 -a $(GCC_MINOR_VERSION) -le 3 
&& echo 1), 1)
+ifeq ($(shell test $(GCC_VERSION) -ge 440 && echo 1), 1)
 CFLAGS += -Wno-deprecated
 endif
 CFLAGS_BASE_DRIVER = -Wno-unused-parameter -Wno-unused-value
 CFLAGS_BASE_DRIVER += -Wno-strict-aliasing -Wno-format-extra-args

-ifeq ($(shell test $(GCC_MAJOR_VERSION) -ge 4 -a $(GCC_MINOR_VERSION) -ge 6 && 
echo 1), 1)
+ifeq ($(shell test $(GCC_VERSION) -ge 460 && echo 1), 1)
 CFLAGS_ixgbe_common.o += -Wno-unused-but-set-variable
 CFLAGS_ixgbe_x550.o += -Wno-unused-but-set-variable -Wno-maybe-uninitialized
 endif

-ifeq ($(shell test $(GCC_MAJOR_VERSION) -le 4 -a $(GCC_MINOR_VERSION) -le 6 && 
echo 1), 1)
+ifeq ($(shell test $(GCC_VERSION) -le 460 && echo 1), 1)
 CFLAGS_ixgbe_x550.o += -Wno-uninitialized
 CFLAGS_ixgbe_phy.o += -Wno-uninitialized
 endif
diff --git a/lib/librte_pmd_vmxnet3/Makefile b/lib/librte_pmd_vmxnet3/Makefile
index 93e5580..9dda0a7 100644
--- a/lib/librte_pmd_vmxnet3/Makefile
+++ b/lib/librte_pmd_vmxnet3/Makefile
@@ -56,7 +56,7 @@ else
 #
 # CFLAGS for gcc
 #
-ifneq ($(shell test $(GCC_MAJOR_VERSION) -le 4 -a $(GCC_MINOR_VERSION) -le 3 && echo 1), 1)
+ifeq ($(shell test $(GCC_VERSION) -ge 440 && echo 1), 1)
 CFLAGS += -Wno-deprecated
 endif
 CFLAGS_BASE_DRIVER = -Wno-unused-parameter -Wno-unused-value
diff --git a/mk/toolchain/gcc/rte.toolchain-compat.mk b/mk/toolchain/gcc/rte.toolchain-compat.mk
index e40e103..a867559 100644
--- a/mk/toolchain/gcc/rte.toolchain-compat.mk
+++ b/mk/toolchain/gcc/rte.toolchain-compat.mk
@@ -38,17 +38,15 @@

 #find out GCC version

-GCC_MAJOR_VERSION = $(shell $(CC) -dumpversion | cut -f1 -d.)
+GCC_VERSION = $(subst .,,$(shell $(CC) -dumpversion))

-# if GCC is not 4.x
-ifneq ($(GCC_MAJOR_VERSION),4)
+# if GCC is older than 4.x
+ifeq ($(shell test $(GCC_VERSION) -lt 400 && echo 1), 1)
MACHINE_CFLAGS =
-$(warning You are not using GCC 4.x. This is neither supported, nor tested.)
+$(warning You are using GCC < 4.x. This is neither supported, nor tested.)


 else
-   GCC_MINOR_VERSION = $(shell $(CC) -dumpversion | cut -f2 -d.)
-
 # GCC graceful degradation
 # GCC 4.2.x - added support for generic target
 # GCC 4.3.x - added support for core2, ssse3, sse4.1, sse4.2
@@ -57,18 +55,18 @@ else
 # GCC 4.6.x - added support for corei7, corei7-avx
 # GCC 4.7.x - added support for fsgsbase, rdrnd, f16c, core-avx-i, core-avx2

-   ifeq ($(shell test $(GCC_MINOR_VERSION) -le 7 && echo 1), 1)
+   ifeq ($(shell test $(GCC_VERSION) -le 470 && echo 1), 1)
MACHINE_CFLAGS := $(patsubst -march=core-avx-i,-march=corei7-avx,$(MACHINE_CFLAGS))
MACHINE_CFLAGS := $(patsubst -march=core-avx2,-march=core-avx2,$(MACHINE_CFLAGS))
endif
-   ifeq ($(shell test $(GCC_MINOR_VERSION)
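
The GCC_VERSION change in this patch collapses the dotted string printed by
$(CC) -dumpversion into one integer by deleting the dots, so that a single
numeric test can replace the separate major/minor checks. A minimal C sketch of
that idea follows. The function name is hypothetical and not part of DPDK, and
the sketch assumes single-digit version components such as 4.4.7.

```c
#include <ctype.h>

/*
 * Hypothetical helper (not part of DPDK). It mimics
 *   GCC_VERSION = $(subst .,,$(shell $(CC) -dumpversion))
 * by dropping the dots from a dotted version string and reading the
 * remaining digits as one decimal number. Parsing stops at the first
 * character that is neither a digit nor a dot.
 */
int
version_strip_dots(const char *s)
{
	int v = 0;

	for (; *s != '\0'; s++) {
		if (*s == '.')
			continue;	/* same effect as $(subst .,,...) */
		if (!isdigit((unsigned char)*s))
			break;
		v = v * 10 + (*s - '0');
	}
	return v;
}
```

Under this scheme "4.4.7" becomes 447 and satisfies a "-ge 440" test, while
"3.4.6" becomes 346 and falls below 400. A two-component string such as "4.4"
would become 44, so the comparison constants only make sense when the compiler
prints three single-digit components.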

[dpdk-dev] [PATCH] eal: mmap uio resources using resourceX files

2015-02-23 Thread Bruce Richardson
Instead of distinguishing the BAR mappings via offset within a single
file, originally /dev/uioX, switch to mapping each individual bar via
the appropriately numbered resourceX file.

Signed-off-by: Bruce Richardson 
---
 lib/librte_eal/common/include/rte_pci.h|  2 +-
 lib/librte_eal/linuxapp/eal/eal_pci_init.h |  1 +
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c  | 34 --
 lib/librte_eal/linuxapp/eal/eal_pci_vfio.c |  1 +
 4 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 4301c16..e34b139 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -117,7 +117,7 @@ struct rte_pci_resource {
 };

 /** Maximum number of PCI resources. */
-#define PCI_MAX_RESOURCE 7
+#define PCI_MAX_RESOURCE 6

 /**
  * A structure describing an ID for a PCI driver. Each driver provides a
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_init.h b/lib/librte_eal/linuxapp/eal/eal_pci_init.h
index 1070eb8..2125d7b 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_init.h
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_init.h
@@ -38,6 +38,7 @@

 struct pci_map {
void *addr;
+   char *path;
uint64_t offset;
uint64_t size;
uint64_t phaddr;
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index 2b16fcb..ecf385a 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -137,10 +137,10 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
/*
 * open devname, to mmap it
 */
-   fd = open(uio_res->path, O_RDWR);
+   fd = open(uio_res->maps[i].path, O_RDWR);
if (fd < 0) {
RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
-   uio_res->path, strerror(errno));
+   uio_res->maps[i].path, strerror(errno));
return -1;
}

@@ -149,7 +149,8 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
 (size_t)uio_res->maps[i].size)
!= uio_res->maps[i].addr) {
RTE_LOG(ERR, EAL,
-   "Cannot mmap device resource\n");
+   "Cannot mmap device resource file: %s\n",
+   uio_res->maps[i].path);
close(fd);
return -1;
}
@@ -294,8 +295,6 @@ pci_uio_map_resource(struct rte_pci_device *dev)
void *mapaddr;
int uio_num;
uint64_t phaddr;
-   uint64_t offset;
-   uint64_t pagesz;
int nb_maps;
struct rte_pci_addr *loc = &dev->addr;
struct mapped_pci_resource *uio_res;
@@ -336,11 +335,6 @@ pci_uio_map_resource(struct rte_pci_device *dev)
return -1;
}

-   /* update devname for mmap  */
-   snprintf(devname, sizeof(devname),
-   SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/resource%d",
-   loc->domain, loc->bus, loc->devid, loc->function, 0);
-
/* set bus master that is not done by uio_pci_generic */
if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
if (pci_uio_set_bus_master(dev->intr_handle.uio_cfg_fd)) {
@@ -370,8 +364,6 @@ pci_uio_map_resource(struct rte_pci_device *dev)
uio_res->nb_maps = nb_maps;

/* Map all BARs */
-   pagesz = sysconf(_SC_PAGESIZE);
-
maps = uio_res->maps;
for (i = 0; i != PCI_MAX_RESOURCE; i++) {
int fd;
@@ -389,10 +381,15 @@ pci_uio_map_resource(struct rte_pci_device *dev)
/* if matching map is found, then use it */
if (j != nb_maps) {
int fail = 0;
-   offset = j * pagesz;
+
+   /* update devname for mmap  */
+   snprintf(devname, sizeof(devname),
+   SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/resource%d",
+   loc->domain, loc->bus, loc->devid, loc->function,
+   i);

/*
-* open devname, to mmap it
+* open resource file, to mmap it
 */
fd = open(devname, O_RDWR);
if (fd < 0) {
@@ -408,7 +405,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
if (pci_map_addr == NULL)
pci_map_addr = pci_find_max_end_va();

-   mapaddr = pci_map_resource(pci_map_addr, fd, 
(off

[dpdk-dev] [PATCH 0/8] Improve build process

2015-02-23 Thread Gonzalez Monroy, Sergio
On 23/02/2015 13:52, Neil Horman wrote:
> On Mon, Feb 23, 2015 at 10:25:01AM +, Gonzalez Monroy, Sergio wrote:
>> On 22/02/2015 23:37, Neil Horman wrote:
>>> On Fri, Feb 20, 2015 at 02:31:36PM +, Gonzalez Monroy, Sergio wrote:
 On 13/02/2015 12:51, Neil Horman wrote:
> On Fri, Feb 13, 2015 at 11:08:02AM +, Gonzalez Monroy, Sergio wrote:
>> On 13/02/2015 10:14, Panu Matilainen wrote:
>>> On 02/12/2015 05:52 PM, Neil Horman wrote:
 On Thu, Feb 12, 2015 at 04:07:50PM +0200, Panu Matilainen wrote:
> On 02/12/2015 02:23 PM, Neil Horman wrote:
>>> [...snip...]
> So I just realized that I was not taking into account a possible
> scenario, where we have an app built with static dpdk libs that then
> loads a DSO with the -d option.
>
> In such case, because the pmd would have DT_NEEDED entries,
> dlopen will
> fail.
> So to enable such scenario we would need to build PMDs without
> DT_NEEDED
> entries.
 Hmm, for that to be a problem you'd need to have the PMD built
 against
 shared dpdk libs and while the application is built against
 static dpdk
 libs. I dont think that's a supportable scenario in any case.

 Or is there some other scenario that I'm not seeing?

 - Panu -

>>> I agree with you. I suppose it comes down to, do we want to
>>> support such
>>> scenario?
>>>
>>>  From what I can see, it seems that we do currently support such
>>> scenario by
>>> building dpdk apps against all static dpdk libs using
>>> --whole-archive (all
>>> libs and not only PMDs).
>>> http://dpdk.org/browse/dpdk/commit/?id=20afd76a504155e947c770783ef5023e87136ad8
>>>
>>>
>>> Am I misunderstanding this?
>>>
>> Shoot, you're right, I missed the static build aspect to this.  Yes,
>> if we do the following:
>>
>> 1) Build the DPDK as a static library
>> 2) Link an application against (1)
>> 3) Use the dlopen mechanism to load a PMD built as a DSO
>>
>> Then the DT_NEEDED entries in the DSO will go unsatisfied, because
>> the shared
>> objects on which it (the PMD) depends will not exist in the file
>> system.
> I think it's even more twisty:
>
> 1) Build the DPDK as a static library
> 2) Link an application against (1)
> 3) Do another build of DPDK as a shared library
> 4) In app 2), use the dlopen mechanism to load a PMD built as a part
> of or
> against 3)
>
> Somehow I doubt this would work very well.
>
 Ideally it should, presuming the ABI is preserved between (1) and (3),
 though I
 agree, up until recently, that was an assumption that was unreliable.
>>> Versioning is a big and important step towards reliability but there are
>>> more issues to solve. This of course getting pretty far from the 
>>> original
>>> topic, but at least one such issue is that there are some cases where a
>>> config value affects what are apparently public structs (rte_mbuf wrt
>>> RTE_MBUF_REFCNT for example), which really is a no-go.
>>>
>> Agree, the RTE_MBUF_REFCNT is something that needs to be dealt with asap.
>> I'll look into it.
>>
>> I think the problem is a little bit orthogonal to the libdpdk_core
>> problem you were initially addressing.  That is to say, this problem of
>> dlopen-ed PMD's exists regardless of whether you build the DPDK as part
>> of a static or dynamic library.  The problems just happen to intersect
>> in their manipulation of the DT_NEEDED entries.
>>
>> Ok, so, given the above, I would say your approach is likely correct,
>> just prevent DT_NEEDED entries from getting added to PMD's. Doing so
>> will sidestep the loading issue for libraries that may not exist in the
>> filesystem, but that's ok, because by all rights, the symbols codified
>> in those needed libraries should already be present in the running
>> application (either made available by the application having statically
>> linked them, or having the linker load them from the proper libraries at
>> run time).
> My 5c is that I'd much rather see the common case (all static or all
> shared)
> be simple and reliable, which in case of DSOs includes no lying
> (whether by
> omission or otherwise) about DT_NEEDED, ever. That way the issue is
> dealt
> once wher

[dpdk-dev] [PATCH] eal: mmap uio resources using resourceX files

2015-02-23 Thread Bruce Richardson
On Mon, Feb 23, 2015 at 02:57:24PM +, Bruce Richardson wrote:
> Instead of distinguishing the BAR mappings via offset within a single
> file, originally /dev/uioX, switch to mapping each individual bar via
> the appropriately numbered resourceX file.
> 
> Signed-off-by: Bruce Richardson 
> ---
Hi Tetsuya,

in our tests here, this patch seems to fix the immediate problem you were
experiencing on your system. Can you perhaps verify?

Thanks,
/Bruce
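
As a rough illustration of the approach described in the commit message, the
sketch below builds the path of a per-BAR sysfs resourceN file and maps that
file from offset 0, instead of selecting the BAR through an offset into one
shared file. All names here are hypothetical stand-ins rather than the EAL's own
helpers, and the sysfs location and the offset of 0 are assumptions.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumed sysfs location of PCI devices on Linux. */
#define SKETCH_SYSFS_PCI_DEVICES "/sys/bus/pci/devices"

/*
 * Build ".../DDDD:BB:DD.F/resourceN" for BAR number 'bar'.
 * Returns the value of snprintf().
 */
int
sketch_resource_path(char *buf, size_t len, unsigned domain, unsigned bus,
		unsigned devid, unsigned function, int bar)
{
	return snprintf(buf, len,
			SKETCH_SYSFS_PCI_DEVICES "/%04x:%02x:%02x.%x/resource%d",
			domain, bus, devid, function, bar);
}

/*
 * Map one BAR through its own resource file. Because each BAR has its
 * own file, the mmap offset is always 0 in this sketch.
 * Returns MAP_FAILED on any error.
 */
void *
sketch_map_bar(const char *path, size_t size)
{
	void *addr;
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return MAP_FAILED;
	addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* an established mapping survives close() */
	return addr;
}
```

A caller would format the path once per BAR index and pass it, together with
the BAR size read from sysfs, to the mapping function.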



[dpdk-dev] [PATCH v4 4/5] eal: add per rx queue interrupt handling based on VFIO

2015-02-23 Thread Zhou, Danny

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, February 23, 2015 9:28 PM
> To: Zhou, Danny
> Cc: Gonzalez Monroy, Sergio; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 4/5] eal: add per rx queue interrupt 
> handling based on VFIO
> 
> 2015-02-23 11:47, Zhou, Danny:
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > 2015-02-19 21:48, Zhou Danny:
> > > > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > > > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > > > @@ -43,6 +43,7 @@ CFLAGS += -I$(SRCDIR)/include
> > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
> > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
> > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_ring
> > > > +CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf
> > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
> > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_malloc
> > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_ether
> > >
> > > Why do we need mbuf in EAL?
> >
> > The file eal_interrupts.c includes rte_ethdev.h which defines structure 
> > rte_eth_devices that
> > eal needs to use in order to get per-port intr_handle. The rte_ethdev.h 
> > includes the rte_mbuf.h
> > so the Makefile is updated here.
> 
> I see. You are breaking layer isolation by introducing ethdev in EAL.
> The cause seems to be:
> 
> +   struct rte_intr_handle intr_handle =
> +   rte_eth_devices[port_id].pci_dev->intr_handle;
> 
> Maybe that pci_dev should be a parameter of the function.

Adding pci_dev as a parameter has a similar problem, because eal does not
include rte_pci.h, which defines struct rte_pci_device. The newly added function
rte_eal_wait_rx_intr(uint8_t port_id, uint8_t queue_id);
does not really belong in rte_eal.h, so I will rename it to
rte_intr_wait_rx(struct rte_intr_handle *intr_handle, uint8_t queue_id);

and move its declaration from rte_eal.h to rte_interrupts.h. That way the
layering violation is avoided, and there is no need to include rte_ethdev.h or
to change the Makefile.


[dpdk-dev] [PATCH v4 2/5] ixgbe: enable rx queue interrupts for both PF and VF

2015-02-23 Thread Zhou, Danny

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, February 23, 2015 9:20 PM
> To: Zhou, Danny
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 2/5] ixgbe: enable rx queue interrupts for 
> both PF and VF
> 
> 2015-02-23 11:23, Zhou, Danny:
> > I noticed the V4 patch conflicts with the latest code on the main branch 
> > due to lots of code merged, and it
> > is mentioned by Jun Xu in previous email, and I will have to rebase the 
> > patch and send out V5 version.
> 
> Maybe you misunderstood my comment.
> I'm quoting an usage in patch 2 of max_intr which is defined in patch 4.
> You have to better split and order your patches to make them compilable
> and not break git bisect.
> 
> Thanks
> 

Will split off a new patch for rte_interrupts.h, which both patch 2 and
patch 4 depend on.

> > > -Original Message-
> > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > Sent: Monday, February 23, 2015 7:20 PM
> > > To: Zhou, Danny
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v4 2/5] ixgbe: enable rx queue interrupts 
> > > for both PF and VF
> > >
> > > 2015-02-19 21:48, Zhou Danny:
> > > > +   /* set max interrupt vfio request */
> > > > +   pci_dev->intr_handle.max_intr = hw->mac.max_rx_queues +
> > > > +   IXGBE_MAX_OTHER_INTR;
> > > > +
> > >
> > > Compilation is broken here.
> 



[dpdk-dev] [PATCH v5 2/3] ethdev: add optional rxtx callback support

2015-02-23 Thread Thomas Monjalon
Hi John,

2015-02-20 17:03, John McNamara:
> From: Richardson, Bruce 
> 
> Add optional support for inline processing of packets inside the RX
> or TX call. For an RX callback, what happens is that we get a set of
> packets from the NIC and then pass them to a callback function, if
> configured, to allow additional processing to be done on them, e.g.
> filling in more mbuf fields, before passing back to the application.
> On TX, the packets are similarly post-processed before being handed
> to the NIC for transmission.
> 
> Signed-off-by: Bruce Richardson 
> Signed-off-by: John McNamara 
[...]
> +#ifdef RTE_ETHDEV_RXTX_CALLBACKS
> +void *
> +rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
> + rte_rxtx_callback_fn fn, void *user_param)
> +{
> + /* check input parameters */
> + if (port_id >= nb_ports || fn == NULL ||
> + queue_id >= rte_eth_devices[port_id].data->nb_rx_queues) {
> + rte_errno = EINVAL;
> + return NULL;
> + }

Why not put the #ifdef only here and return the error ENOTSUP?

> + struct rte_eth_rxtx_callback *cb = rte_zmalloc(NULL, sizeof(*cb), 0);
> +
> + if (cb == NULL) {
> + rte_errno = ENOMEM;
> + return NULL;
> + }
> +
> + cb->fn = fn;
> + cb->param = user_param;
> + cb->next = rte_eth_devices[port_id].post_rx_burst_cbs[queue_id];
> + rte_eth_devices[port_id].post_rx_burst_cbs[queue_id] = cb;
> + return cb;
> +}
[...]
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1522,6 +1522,49 @@ struct eth_dev_ops {
>   eth_filter_ctrl_t  filter_ctrl;  /**< common filter 
> control*/
>  };
>  
> +#ifdef RTE_ETHDEV_RXTX_CALLBACKS
> +/**
> + * Function type used for callbacks for processing packets on RX and TX
> + *
> + * If configured for RX, it is called with a burst of packets that have just
> + * been received on the given port and queue. On TX, it is called with a 
> burst
> + * of packets immediately before those packets are put onto the hardware 
> queue
> + * for transmission.
> + *
> + * @param port
> + *   The ethernet port on which rx or tx is being performed
> + * @param queue
> + *   The queue on the ethernet port which is being used to receive or 
> transmit
> + *   the packets.
> + * @param pkts
> + *   The burst of packets on which processing is to be done. On RX, these
> + *   packets have just been received. On TX, they are about to be 
> transmitted.
> + * @param nb_pkts
> + *   The number of packets in the burst pointed to by "pkts"
> + * @param user_param
> + *   The arbitrary user parameter passed in by the application when the 
> callback
> + *   was originally configured.
> + * @return
> + *   The number of packets remaining in pkts after processing.
> + *   * On RX, this will be returned to the user as the return value from
> + * rte_eth_rx_burst.
> + *   * On TX, this will be the number of packets actually written to the NIC.
> + */
> +typedef uint16_t (*rte_rxtx_callback_fn)(uint8_t port, uint16_t queue,
> + struct rte_mbuf *pkts[], uint16_t nb_pkts, void *user_param);
> +
> +/**
> + * @internal
> + * Structure used to hold information about the callbacks to be called for a
> + * queue on RX and TX.
> + */
> +struct rte_eth_rxtx_callback {
> + struct rte_eth_rxtx_callback *next;
> + rte_rxtx_callback_fn fn;
> + void *param;
> +};
> +#endif
> +
>  /**
>   * @internal
>   * The generic data structure associated with each ethernet device.
> @@ -1539,7 +1582,22 @@ struct rte_eth_dev {
>   const struct eth_driver *driver;/**< Driver for this device */
>   struct eth_dev_ops *dev_ops;/**< Functions exported by PMD */
>   struct rte_pci_device *pci_dev; /**< PCI info. supplied by probing */
> - struct rte_eth_dev_cb_list link_intr_cbs; /**< User application 
> callbacks on interrupt*/
> + /** User application callbacks for NIC interrupts */
> + struct rte_eth_dev_cb_list link_intr_cbs;
> +
> +#ifdef RTE_ETHDEV_RXTX_CALLBACKS
> + /**
> +  * User-supplied functions called from rx_burst to post-process
> +  * received packets before passing them to the user
> +  */
> + struct rte_eth_rxtx_callback **post_rx_burst_cbs;
> +
> + /**
> +  * User-supplied functions called from tx_burst to pre-process
> +  * received packets before passing them to the driver for transmission.
> +  */
> + struct rte_eth_rxtx_callback **pre_tx_burst_cbs;
> +#endif
>  };

Generally, I think it's a bad idea to put #ifdef in API (structs or functions).
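
To make the two review comments above concrete, here is a minimal sketch in
which the function is always declared and only its body depends on the build
option, reporting ENOTSUP when the feature is compiled out. Every name is a
hypothetical stand-in (SKETCH_RXTX_CALLBACKS for RTE_ETHDEV_RXTX_CALLBACKS, an
opaque struct for rte_mbuf, plain errno for rte_errno), and the loop that runs
the callbacks is an assumption, since the quoted patch is cut off before that
code.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in build switch; remove this line to compile the feature out. */
#define SKETCH_RXTX_CALLBACKS 1

struct sketch_pkt;	/* opaque stand-in for struct rte_mbuf */

typedef uint16_t (*sketch_cb_fn)(uint8_t port, uint16_t queue,
		struct sketch_pkt *pkts[], uint16_t nb_pkts, void *user_param);

struct sketch_cb {
	struct sketch_cb *next;
	sketch_cb_fn fn;
	void *param;
};

/* Always part of the API; the #ifdef lives inside the body only. */
struct sketch_cb *
sketch_add_cb(struct sketch_cb **head, sketch_cb_fn fn, void *user_param)
{
#ifndef SKETCH_RXTX_CALLBACKS
	(void)head; (void)fn; (void)user_param;
	errno = ENOTSUP;
	return NULL;
#else
	struct sketch_cb *cb;

	if (head == NULL || fn == NULL) {
		errno = EINVAL;
		return NULL;
	}
	cb = calloc(1, sizeof(*cb));
	if (cb == NULL) {
		errno = ENOMEM;
		return NULL;
	}
	cb->fn = fn;
	cb->param = user_param;
	cb->next = *head;	/* prepend, as the quoted patch does */
	*head = cb;
	return cb;
#endif
}

/* Assumed dispatch loop: each callback may change the packet count. */
uint16_t
sketch_run_cbs(const struct sketch_cb *head, uint8_t port, uint16_t queue,
		struct sketch_pkt *pkts[], uint16_t nb_pkts)
{
	const struct sketch_cb *cb;

	for (cb = head; cb != NULL; cb = cb->next)
		nb_pkts = cb->fn(port, queue, pkts, nb_pkts, cb->param);
	return nb_pkts;
}

/* Example callback: reports one packet fewer than it was given. */
uint16_t
sketch_drop_last(uint8_t port, uint16_t queue, struct sketch_pkt *pkts[],
		uint16_t nb_pkts, void *user_param)
{
	(void)port; (void)queue; (void)pkts; (void)user_param;
	return nb_pkts > 0 ? nb_pkts - 1 : 0;
}
```

With this shape, applications compile against the same prototypes and struct
layout whichever way the option is set, and find out at run time whether
callbacks are available.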

>  struct rte_eth_dev_sriov {
> @@ -2393,7 +2451,23 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
>   struct rte_eth_dev *dev;
>  
>   dev = &rte_eth_devices[port_id];
> +
> +#ifndef RTE_ETHDEV_RXTX_CALLBACKS
>   return (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id], rx_pkts, 
> nb_pkts);
> +#else
> + nb_pkts = (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id], 

[dpdk-dev] [PATCH v4 4/5] eal: add per rx queue interrupt handling based on VFIO

2015-02-23 Thread Thomas Monjalon
2015-02-23 15:02, Zhou, Danny:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > 2015-02-23 11:47, Zhou, Danny:
> > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > > 2015-02-19 21:48, Zhou Danny:
> > > > > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > > > > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > > > > @@ -43,6 +43,7 @@ CFLAGS += -I$(SRCDIR)/include
> > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
> > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
> > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_ring
> > > > > +CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf
> > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
> > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_malloc
> > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_ether
> > > >
> > > > Why do we need mbuf in EAL?
> > >
> > > The file eal_interrupts.c includes rte_ethdev.h which defines structure 
> > > rte_eth_devices that
> > > eal needs to use in order to get per-port intr_handle. The rte_ethdev.h 
> > > includes the rte_mbuf.h
> > > so the Makefile is updated here.
> > 
> > I see. You are breaking layer isolation by introducing ethdev in EAL.
> > The cause seems to be:
> > 
> > +   struct rte_intr_handle intr_handle =
> > +   
> > rte_eth_devices[port_id].pci_dev->intr_handle;
> > 
> > Maybe that pci_dev should be a parameter of the function.
> 
> Adding pci_dev as a parameter has similar problem due to eal does not include 
> rte_pci.h which

I don't understand your assertion. rte_pci.h is part of EAL.

> defines struct rte_pci_device. It looks the new-added function 
> rte_eal_wait_rx_intr(uint8_t port_id, uint8_t queue_id);
> is not proper to be declared in rte_eal.h, will rename it to 
>   rte_intr_wait_rx (struct rte_intr_handle *intr_handle, uint8_t 
> queue_id);
> 
> and then move declaration from rte_eal.h to rte_interrupts.h. So isolation 
> can be avoided and no need to includes rte_ethdev.h
> and change Makefile.

Yes could be better.
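
The layering being agreed on here can be sketched as follows: the lower layer
takes only an interrupt handle, and the layer that owns the port table resolves
port_id to that handle before calling down. All types and names below are
hypothetical stand-ins rather than the real rte_intr_handle or rte_eth_devices
definitions, and the actual blocking wait is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rte_intr_handle; the fields are illustrative only. */
struct sketch_intr_handle {
	int fd;
	uint32_t nb_rx_vectors;	/* hypothetical: number of per-queue vectors */
};

/* Lower layer (EAL-like): knows nothing about ports or ethdev. */
int
sketch_intr_wait_rx(struct sketch_intr_handle *h, uint8_t queue_id)
{
	if (h == NULL || queue_id >= h->nb_rx_vectors)
		return -1;
	/* A real implementation would block here until the interrupt fires. */
	return 0;
}

/* Upper layer (ethdev-like): owns the port table. */
struct sketch_port {
	struct sketch_intr_handle intr_handle;
};

#define SKETCH_MAX_PORTS 4
struct sketch_port sketch_ports[SKETCH_MAX_PORTS];

/* Resolve the port to its handle, then call down into the lower layer. */
int
sketch_port_wait_rx(uint8_t port_id, uint8_t queue_id)
{
	if (port_id >= SKETCH_MAX_PORTS)
		return -1;
	return sketch_intr_wait_rx(&sketch_ports[port_id].intr_handle, queue_id);
}
```

The dependency then points one way only: the upper layer includes the lower
layer's header, and the lower layer never needs the port table.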


[dpdk-dev] [PATCH v4 4/5] eal: add per rx queue interrupt handling based on VFIO

2015-02-23 Thread Zhou, Danny

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, February 23, 2015 11:19 PM
> To: Zhou, Danny
> Cc: Gonzalez Monroy, Sergio; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v4 4/5] eal: add per rx queue interrupt 
> handling based on VFIO
> 
> 2015-02-23 15:02, Zhou, Danny:
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > 2015-02-23 11:47, Zhou, Danny:
> > > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > > > 2015-02-19 21:48, Zhou Danny:
> > > > > > --- a/lib/librte_eal/linuxapp/eal/Makefile
> > > > > > +++ b/lib/librte_eal/linuxapp/eal/Makefile
> > > > > > @@ -43,6 +43,7 @@ CFLAGS += -I$(SRCDIR)/include
> > > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common
> > > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common/include
> > > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_ring
> > > > > > +CFLAGS += -I$(RTE_SDK)/lib/librte_mbuf
> > > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_mempool
> > > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_malloc
> > > > > >  CFLAGS += -I$(RTE_SDK)/lib/librte_ether
> > > > >
> > > > > Why do we need mbuf in EAL?
> > > >
> > > > The file eal_interrupts.c includes rte_ethdev.h which defines structure 
> > > > rte_eth_devices that
> > > > eal needs to use in order to get per-port intr_handle. The rte_ethdev.h 
> > > > includes the rte_mbuf.h
> > > > so the Makefile is updated here.
> > >
> > > I see. You are breaking layer isolation by introducing ethdev in EAL.
> > > The cause seems to be:
> > >
> > > +   struct rte_intr_handle intr_handle =
> > > +   
> > > rte_eth_devices[port_id].pci_dev->intr_handle;
> > >
> > > Maybe that pci_dev should be a parameter of the function.
> >
> > Adding pci_dev as a parameter has similar problem due to eal does not 
> > include rte_pci.h which
> 
> I don't understand your assertion. rte_pci.h is part of EAL.
> 

rte_eal.h does not include any DPDK header file; adding pci_dev would force it
to include rte_pci.h.
With the solution below, that kind of mess can be avoided completely.

> > defines struct rte_pci_device. It looks the new-added function 
> > rte_eal_wait_rx_intr(uint8_t port_id, uint8_t queue_id);
> > is not proper to be declared in rte_eal.h, will rename it to
> > rte_intr_wait_rx (struct rte_intr_handle *intr_handle, uint8_t 
> > queue_id);
> >
> > and then move declaration from rte_eal.h to rte_interrupts.h. So isolation 
> > can be avoided and no need to includes
> rte_ethdev.h
> > and change Makefile.
> 
> Yes could be better.


[dpdk-dev] [PATCH v3 1/3] eal: enable uio_pci_generic support

2015-02-23 Thread David Marchand
Hello,

OK, this is coming too late, but here are my comments anyway.


On Fri, Feb 20, 2015 at 5:59 PM, Bruce Richardson <
bruce.richardson at intel.com> wrote:

[snip]

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> index 54cce08..2b16fcb 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
> @@ -36,6 +36,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -47,71 +48,73 @@
>  #include "eal_filesystem.h"
>  #include "eal_pci_init.h"
>
> -static int pci_parse_sysfs_value(const char *filename, uint64_t *val);
> -
>  void *pci_map_addr = NULL;
>
>
>  #define OFF_MAX  ((uint64_t)(off_t)-1)
>  static int
> -pci_uio_get_mappings(const char *devname, struct pci_map maps[], int nb_maps)
> +pci_uio_get_mappings(struct rte_pci_device *dev,
> +   struct pci_map maps[], int nb_maps)
>  {
> -   int i;
> -   char dirname[PATH_MAX];
> +   struct rte_pci_addr *loc = &dev->addr;
> +   int i = 0;
> char filename[PATH_MAX];
> -   uint64_t offset, size;
> +   unsigned long long start_addr, end_addr, flags;
> +   FILE *f;
>
> -   for (i = 0; i != nb_maps; i++) {
> +   snprintf(filename, sizeof(filename),
> +   SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/resource",
> +   loc->domain, loc->bus, loc->devid, loc->function);
>
> -   /* check if map directory exists */
> -   snprintf(dirname, sizeof(dirname),
> -   "%s/maps/map%u", devname, i);
> -
> -   if (access(dirname, F_OK) != 0)
> -   break;
> +   f = fopen(filename, "r");
> +   if (f == NULL) {
> +   RTE_LOG(ERR, EAL,
> +   "%s(): cannot open sysfs %s\n",
> +   __func__, filename);
> +   return -1;
> +   }
>
> -   /* get mapping offset */
> -   snprintf(filename, sizeof(filename),
> -   "%s/offset", dirname);
> -   if (pci_parse_sysfs_value(filename, &offset) < 0) {
> -   RTE_LOG(ERR, EAL,
> -   "%s(): cannot parse offset of %s\n",
> -   __func__, dirname);
> -   return -1;
> +   while (fscanf(f, "%llx %llx %llx", &start_addr,
> +   &end_addr, &flags) == 3 && i < nb_maps) {
> +   if (flags & IORESOURCE_MEM) {
> +   maps[i].offset = 0x0;
> +   maps[i].size = end_addr - start_addr + 1;
> +   maps[i].phaddr = start_addr;
> +   i++;
> }
> +   }
> +   fclose(f);
>
> -   /* get mapping size */
> -   snprintf(filename, sizeof(filename),
> -   "%s/size", dirname);
> -   if (pci_parse_sysfs_value(filename, &size) < 0) {
> -   RTE_LOG(ERR, EAL,
> -   "%s(): cannot parse size of %s\n",
> -   __func__, dirname);
> -   return -1;
> -   }
> +   return i;
> +}
>
> -   /* get mapping physical address */
> -   snprintf(filename, sizeof(filename),
> -   "%s/addr", dirname);
> -   if (pci_parse_sysfs_value(filename, &maps[i].phaddr) < 0) {
> -   RTE_LOG(ERR, EAL,
> -   "%s(): cannot parse addr of %s\n",
> -   __func__, dirname);
> -   return -1;
> -   }
>

This function ends up reinventing the wheel from eal_pci.c, and it adds a new
array of mappings that is not indexed the same way as in eal_pci.c; see the
comments at the end of this mail.
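
For reference, the parsing loop in the quoted hunk can be sketched on its own
as below. This version reads the "start end flags" triplets from a text buffer
rather than from the sysfs file, and it records the line number of each memory
resource next to its address and size, which is one possible way to keep the
BAR index available. The struct, the function name, and the choice to store
the index are hypothetical; the flag value 0x200 is assumed to match the
kernel's IORESOURCE_MEM.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed to match the kernel's IORESOURCE_MEM flag. */
#define SKETCH_IORESOURCE_MEM 0x00000200ULL

/* Hypothetical record for one memory BAR. */
struct bar_sketch {
	int bar;		/* line number in the resource file */
	uint64_t phaddr;	/* start address */
	uint64_t size;		/* end - start + 1 */
};

/*
 * Parse lines of the form "<start> <end> <flags>" (hexadecimal) and
 * store the memory resources in 'out'. Returns the number stored.
 */
int
parse_resource_text(const char *text, struct bar_sketch *out, int max_out)
{
	unsigned long long start, end, flags;
	int consumed = 0, line = 0, found = 0;

	while (found < max_out &&
			sscanf(text, "%llx %llx %llx%n",
				&start, &end, &flags, &consumed) == 3) {
		if (flags & SKETCH_IORESOURCE_MEM) {
			out[found].bar = line;
			out[found].phaddr = start;
			out[found].size = end - start + 1;
			found++;
		}
		text += consumed;	/* advance past the parsed triplet */
		line++;
	}
	return found;
}
```

An unused BAR shows up as a line of zeros and an I/O port BAR has the memory
flag clear, so both are skipped while the line counter still advances.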


> +static int
> +pci_uio_set_bus_master(int dev_fd)
> +{
> +   uint16_t reg;
> +   int ret;
>
> -   if ((offset > OFF_MAX) || (size > SIZE_MAX)) {
> -   RTE_LOG(ERR, EAL,
> -   "%s(): offset/size exceed system max value\n",
> -   __func__);
> -   return -1;
> -   }
> +   ret = pread(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
> +   if (ret != sizeof(reg)) {
> +   RTE_LOG(ERR, EAL,
> +   "Cannot read command from PCI config space!\n");
> +   return -1;
> +   }
> +
> +   /* return if bus mastering is already on */
> +   if (reg & PCI_COMMAND_MASTER)
> +   return 0;
> +
> +   reg |= PCI_COMMAND_MASTER;
>
> -   maps[i].offset = offset;
> -   maps[i].size = size;
> +   ret = pwrite(dev_fd, &reg, sizeof(reg), PCI_COMMAND);
> +   if (ret != sizeof(reg)) {
> +   RTE_LOG(ERR, EAL,
> +   "Cannot write command to PCI config space!

[dpdk-dev] [PATCH] eal: mmap uio resources using resourceX files

2015-02-23 Thread Iremonger, Bernard


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> Sent: Monday, February 23, 2015 3:00 PM
> To: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] eal: mmap uio resources using resourceX files
> 
> On Mon, Feb 23, 2015 at 02:57:24PM +, Bruce Richardson wrote:
> > Instead of distinguishing the BAR mappings via offset within a single
> > file, originally /dev/uioX, switch to mapping each individual bar via
> > the appropriately numbered resourceX file.
> >
> > Signed-off-by: Bruce Richardson 
> > ---
> Hi Tetsuya,
> 
> in our tests here, this patch seems to fix the immediate problem you were 
> experiencing on your
> system. Can you perhaps verify?
> 
> Thanks,
> /Bruce

Hi Bruce,

I was seeing a similar problem on my system when attaching a virtual function
port.

testpmd> port attach :06:10.0 
Attaching a new port...
EAL: PCI device :06:10.0 on NUMA socket -1
EAL:   probe driver: 8086:10ed rte_ixgbevf_pmd
EAL:   PCI memory mapped at 0x1
EAL: pci_map_resource(): cannot mmap(27, 0x14000, 0x4000, 0x1000): Invalid 
argument (0x)
EAL: Requested device :06:10.0 cannot be used
EAL: Driver, cannot attach the device

This patch seems to solve the problem.

Regards,

Bernard.





[dpdk-dev] [PATCH v3 0/4] Support NVGRE on i40e

2015-02-23 Thread Thomas Monjalon
2015-02-20 17:01, Declan Doherty:
> The patch set supports NVGRE on i40e.
> 
> It includes:
>  - Support RX filters for NVGRE packet. It uses MAC and VLAN to point
>to a queue. The filter types supported are listed below:
> 
>1. Inner MAC and Inner VLAN ID
> 
>2. Inner MAC address, inner VLAN ID and tenant ID.
> 
>3. Inner MAC and tenant ID
> 
>4. Inner MAC address
> 
>5. Outer MAC address, tenant ID and inner MAC
> 
>  - Support TX checksum offload for NVGRE packet, which include outer L3(IP), 
> inner L3(IP) and inner L4(UDP, TCP and SCTP)
> 
> V2 changes:
>   Do some rework based on Olivier's patch set [PATCH v2 00/20] enhance tx 
> checksum offload API; the changes are listed below,
>   1. remove nvgre_hdr definition from rte_ether.h file. It is not used in 
> csumonly.c file.
>   2. remove filter type iip that is not supported well in current 
> firmware.
>   3. remove GRE packet flag from mbuf.
> 
> V3 changes:
>   - Addresses Olivier's comment's for V2 of patchset
>   - Re-based against HEAD 
> 
> Jijiang Liu (4):
>   librte_ether:add an ETHER_TYPE_TEB macro
>   i40e:support RX tunnel filter for NVGRE packet
>   app/testpmd:test RX tunnel filter for NVGRE packet
>   app/testpmd:test NVGRE Tx checksum offload

Applied, thanks


[dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application

2015-02-23 Thread Jastrzebski, MichalX K
> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, February 23, 2015 3:47 PM
> To: Jastrzebski, MichalX K
> Cc: Wodkowski, PawelX; dev at dpdk.org; Neil Horman
> Subject: Re: [dpdk-dev] [PATCH v5 0/3] new headroom stats library and
> example application
> 
> 2015-02-23 14:36, Jastrzebski, MichalX K:
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > 2015-02-20 15:46, Jastrzebski, MichalX K:
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > > > > On Thu, Feb 19, 2015 at 01:18:41PM +0100, Pawel Wodkowski wrote:
> > > > > > Hi community,
> > > > > > > I would like to introduce a library for measuring the load of
> > > > > > > arbitrary jobs. It can be used to profile every kind of job set
> > > > > > > on any arbitrary execution unit or tasking library.
> > > > > > >
> > > > > > > In the provided l2fwd-headroom example I demonstrate how to use
> > > > > > > this library to select the optimal rx burst poll time. Jobs are
> > > > > > > selected by using existing rte_timer library calls. This example
> > > > > > > does not limit the possible schemes in which this library can be
> > > > > > > used.
> > > > > >
> > > > > > Pawel Wodkowski (3):
> > > > > >   librte_headroom: New library for checking core/system/app load
> > > > > >   examples: introduce new l2fwd-headroom example
> > > > > > >   MAINTAINERS: claim responsibility for headroom library and
> > > > > > >     example app
> > > > >
> > > > > > I'm sorry but I still fail to see how this is a particularly useful
> > > > > > library.  It clearly works fine, but it composes an application
> > > > > > event loop in its own terms, and measures stats based on that.
> > > > > > While that's ok, any application is already going to have to write
> > > > > > its own event loop, and can make the same measurements
> > > > > > synchronously within that loop, using a lot less code to optimize
> > > > > > its polling time.
> > > > >
> > > > > In other words, I think this is one of those cases where this library 
> > > > > is
> > > > > probably somewhat useful for anyone who just wants to write an
> > > application
> > > > > in
> > > > > terms the semantics exposed by this library, but not at all useful for
> > > anyone
> > > > > else.  I'd personally rather not have the extra code to maintain here.
> > > > >
> > > > > Stephen just gave a presentation at netdev about some of the
> > > performance
> > > > > optimization measurements Brocade did with DPDK and how they fine
> > > tuned
> > > > > their
> > > > > environment.  One of the big take aways for me was that making time
> > > based
> > > > > measurements (especially if it was using the tsc), created cpu stalls
> that
> > > > > skewed the measurements, and so the best optimizations they made
> > > avoided
> > > > > time
> > > > > measurements, opting instead for packet count metrics.
> > > > >
> > > > > Neil
> > > >
> > > > > Hi Neil,
> > > > >
> > > > > I think this library offers something quite useful, probably not for
> > > > > everyone, but for many people who use DPDK: it measures quite
> > > > > accurately how many spare cycles a CPU has after executing any
> > > > > serial tasks (as you will know).
> > > > > If you look at two places in the example application, the
> > > > > main_loop() and l2fwd_fwd() functions, you will see two possible
> > > > > approaches, but usage is not limited to those. You can even nest
> > > > > headroom objects and measure the processing time of particular
> > > > > packet types.
> > > > > Of course, this will add overhead due to the measurements, but
> > > > > that time is also measured, so any user can know the relative
> > > > > time "wasted" on measuring all this.
> > > > > If time delays are measured over larger intervals and handled
> > > > > reliably, the cost of measuring will be low.
> > > > > I find this quite similar to the power library case. I would say
> > > > > that library is not useful for every application, but there are
> > > > > several cases where it can be (as demonstrated with the l3fwd-power
> > > > > app).
> > > > >
> > > > > About your last bit, I'm not sure I understood it right, but in the
> > > > > case of the included sample app, the main measurement to see if we
> > > > > are overusing a CPU is the packet count in a queue (in this case
> > > > > the RX queue), and I believe this should be used for other apps,
> > > > > especially those that use a pipeline model, where queues and rings
> > > > > are the key part.
> > > > >
> > > > > As a final point, last week (12th of February), there was a request
> > > > > for a tool/library like this from a user on the mailing list (Ilan
> > > > > Borenshtein), which indicates that this would be useful (probably
> > > > > not just for him, but for others). It probably could be achieved by
> > > > > the user by adding their own code, but I believe this library would
> > > > > be a good-to-have,
> > > > i

[dpdk-dev] [PATCH 5/5] Fix usage of fgets in various places

2015-02-23 Thread Stephen Hemminger
On Mon, 23 Feb 2015 15:10:00 +0100
Pawel Wodkowski  wrote:

> Declaration of fgets() is
> char *fgets(char *str, int size, FILE *stream);
> 
> Klocwork complains about passing "sizeof()" as the size parameter, since
> the implicit conversion from size_t to int might cause loss of precision.
> 
> Signed-off-by: Pawel Wodkowski 

NAK, this is shooting at unicorns.
The tool is the problem, not the code.
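
For readers skimming the archive, the call pattern under review can be
sketched as follows. This is a minimal illustration, not code from the
patch: read_line() is a hypothetical helper, and the explicit clamp to
INT_MAX is only one way to make the size_t-to-int narrowing visible.
With a small fixed-size buffer, sizeof(buf) converts to int without
loss, which appears to be the point of the objection above.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of DPDK): read one line into a
 * caller-supplied buffer and strip the trailing newline.
 * Returns 0 on success, -1 on EOF or read error.
 */
static int
read_line(FILE *fp, char *buf, size_t len)
{
	int n;

	assert(fp != NULL && buf != NULL && len > 0);

	/* fgets() takes an int, so clamp explicitly instead of truncating. */
	n = (len > (size_t)INT_MAX) ? INT_MAX : (int)len;

	if (fgets(buf, n, fp) == NULL)
		return -1;

	/* Cut the string at the first newline, if one was read. */
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

A caller with "char buf[32];" would simply write
read_line(fp, buf, sizeof(buf)); the value 32 fits in an int on any
platform, so no precision is lost there.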


[dpdk-dev] [PATCH v5 0/3] new headroom stats library and example application

2015-02-23 Thread Thomas Monjalon
2015-02-23 15:55, Jastrzebski, MichalX K:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > 2015-02-23 14:36, Jastrzebski, MichalX K:
> > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > > 2015-02-20 15:46, Jastrzebski, MichalX K:
> > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > > > > > On Thu, Feb 19, 2015 at 01:18:41PM +0100, Pawel Wodkowski wrote:
> > > > > > > > Hi community,
> > > > > > > > I would like to introduce a library for measuring the load of
> > > > > > > > arbitrary jobs. It can be used to profile any kind of job set
> > > > > > > > on any arbitrary execution unit or tasking library.
> > > > > > > >
> > > > > > > > In the provided l2fwd-headroom example I demonstrate how to
> > > > > > > > use this library to select the optimal rx burst poll time.
> > > > > > > > Jobs are selected by using existing rte_timer library calls.
> > > > > > > > This example does not limit the schemes with which this
> > > > > > > > library can be used.
> > > > > > > >
> > > > > > > > Pawel Wodkowski (3):
> > > > > > > >   librte_headroom: New library for checking core/system/app load
> > > > > > > >   examples: introduce new l2fwd-headroom example
> > > > > > > >   MAINTAINERS: claim responsibility for headroom library and example app
> > > > > >
> > > > > > > I'm sorry but I still fail to see how this is a particularly
> > > > > > > useful library.  It clearly works fine, but it composes an
> > > > > > > application event loop in its own terms, and measures stats
> > > > > > > based on that.  While that's OK, any application is already
> > > > > > > going to have to write its own event loop, and can make the
> > > > > > > same measurements synchronously within that loop, using a lot
> > > > > > > less code to optimize its polling time.
> > > > > > >
> > > > > > > In other words, I think this is one of those cases where this
> > > > > > > library is probably somewhat useful for anyone who just wants
> > > > > > > to write an application in terms of the semantics exposed by
> > > > > > > this library, but not at all useful for anyone else.  I'd
> > > > > > > personally rather not have the extra code to maintain here.
> > > > > > >
> > > > > > > Stephen just gave a presentation at netdev about some of the
> > > > > > > performance optimization measurements Brocade did with DPDK and
> > > > > > > how they fine-tuned their environment.  One of the big
> > > > > > > takeaways for me was that making time-based measurements
> > > > > > > (especially using the TSC) created CPU stalls that skewed the
> > > > > > > measurements, and so the best optimizations they made avoided
> > > > > > > time measurements, opting instead for packet count metrics.
> > > > > > >
> > > > > > > Neil
> > > > >
> > > > > > Hi Neil,
> > > > > >
> > > > > > I think this library offers something quite useful, probably not
> > > > > > for everyone, but for many people who use DPDK: it measures quite
> > > > > > accurately how many spare cycles a CPU has after executing any
> > > > > > serial tasks (as you will know).
> > > > > > If you look at two places in the example application, the
> > > > > > main_loop() and l2fwd_fwd() functions, you will see two possible
> > > > > > approaches, but usage is not limited to those. You can even nest
> > > > > > headroom objects and measure the processing time of particular
> > > > > > packet types.
> > > > > > Of course, this will add overhead due to the measurements, but
> > > > > > that time is also measured, so any user can know the relative
> > > > > > time "wasted" on measuring all this.
> > > > > > If time delays are measured over larger intervals and handled
> > > > > > reliably, the cost of measuring will be low.
> > > > > > I find this quite similar to the power library case. I would say
> > > > > > that library is not useful for every application, but there are
> > > > > > several cases where it can be (as demonstrated with the
> > > > > > l3fwd-power app).
> > > > > >
> > > > > > About your last bit, I'm not sure I understood it right, but in
> > > > > > the case of the included sample app, the main measurement to see
> > > > > > if we are overusing a CPU is the packet count in a queue (in this
> > > > > > case the RX queue), and I believe this should be used for other
> > > > > > apps, especially those that use a pipeline model, where queues
> > > > > > and rings are the key part.
> > > > > >
> > > > > > As a final point, last week (12th of February), there was a
> > > > > > request for a tool/library like this from a user on the mailing
> > > > > > list (Ilan Borenshtein), which indicates that this would be
> > > > > > useful (probably not just for him, but for others). It probably
> > > > > > could be achieved by
> >

[dpdk-dev] [PATCH 1/2] enic: replace use of printf with log

2015-02-23 Thread David Marchand
On Sat, Feb 14, 2015 at 5:28 PM, Neil Horman  wrote:

> On Sat, Feb 14, 2015 at 10:32:58AM -0500, Stephen Hemminger wrote:
> > Device drivers should log via the DPDK log, not via printf, which is
> > sent to /dev/null in a daemon application.
> >
> > Signed-off-by: Stephen Hemminger 
> > ---
> >  lib/librte_pmd_enic/enic_compat.h | 11 +++++++----
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/librte_pmd_enic/enic_compat.h b/lib/librte_pmd_enic/enic_compat.h
> > index b1af838..c3ab76e 100644
> > --- a/lib/librte_pmd_enic/enic_compat.h
> > +++ b/lib/librte_pmd_enic/enic_compat.h
> > @@ -75,10 +75,13 @@
> >  #define kzalloc(size, flags) calloc(1, size)
> >  #define kfree(x) free(x)
> >
> > -#define dev_err(x, args...) printf("rte_enic_pmd : Error - " args)
> > -#define dev_info(x, args...) printf("rte_enic_pmd: Info - " args)
> > -#define dev_warning(x, args...) printf("rte_enic_pmd: Warning - " args)
> > -#define dev_trace(x, args...) printf("rte_enic_pmd: Trace - " args)
> > +#define dev_printk(level, fmt, args...)  \
> > + RTE_LOG(level, PMD, "rte_enic_pmd:" fmt, ## args)
> > +
> > +#define dev_err(x, args...) dev_printk(ERR, args)
> > +#define dev_info(x, args...) dev_printk(INFO,  args)
> > +#define dev_warning(x, args...) dev_printk(WARNING, args)
> > +#define dev_debug(x, args...) dev_printk(DEBUG, args)
> >
> >  #define __le16 u16
> >  #define __le32 u32
> > --
> > 2.1.4
> >
> >
> Series
> Acked-by: Neil Horman 
>

Use of rte_log would be better for init messages, but since the driver
does not distinguish them, this looks good enough to me.
Thanks Stephen.

Acked-by: David Marchand 

-- 
David Marchand


[dpdk-dev] [PATCH v5 1/6] ethdev: add rx interrupt enable/disable functions

2015-02-23 Thread Stephen Hemminger
On Tue, 24 Feb 2015 00:55:37 +0800
Zhou Danny  wrote:

> +int
> +rte_eth_dev_rx_queue_intr_enable(uint8_t port_id,
> + uint16_t queue_id)
> +{
> + struct rte_eth_dev *dev;
> +
> + if (port_id >= nb_ports) {
> + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> + return (-ENODEV);

Please don't use the BSD style of extra, useless parentheses around
return values.


[dpdk-dev] [PATCH v2 0/2] fix and improve uio_pci_generic support

2015-02-23 Thread Bruce Richardson
This patch set cleans up the uio mapping code to:
a) fix the issue with mmap of PCI BARs reported by Tetsuya and confirmed
by others;
b) eliminate redundant code and reduce scans of /sys.


Bruce Richardson (2):
  eal: mmap uio resources using resourceX files
  eal: populate uio_maps from pci mem_resources array

 lib/librte_eal/common/include/rte_pci.h|   2 +-
 lib/librte_eal/linuxapp/eal/eal_pci_init.h |   1 +
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c  | 173 +++--
 lib/librte_eal/linuxapp/eal/eal_pci_vfio.c |   1 +
 4 files changed, 66 insertions(+), 111 deletions(-)

-- 
2.1.0



[dpdk-dev] [PATCH v2 1/2] eal: mmap uio resources using resourceX files

2015-02-23 Thread Bruce Richardson
Instead of distinguishing the BAR mappings via offset within a single
file, originally /dev/uioX, switch to mapping each individual bar via
the appropriately numbered resourceX file.

Signed-off-by: Bruce Richardson 
---
 lib/librte_eal/common/include/rte_pci.h|  2 +-
 lib/librte_eal/linuxapp/eal/eal_pci_init.h |  1 +
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c  | 34 --
 lib/librte_eal/linuxapp/eal/eal_pci_vfio.c |  1 +
 4 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 4301c16..e34b139 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -117,7 +117,7 @@ struct rte_pci_resource {
 };

 /** Maximum number of PCI resources. */
-#define PCI_MAX_RESOURCE 7
+#define PCI_MAX_RESOURCE 6

 /**
  * A structure describing an ID for a PCI driver. Each driver provides a
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_init.h b/lib/librte_eal/linuxapp/eal/eal_pci_init.h
index 1070eb8..2125d7b 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_init.h
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_init.h
@@ -38,6 +38,7 @@

 struct pci_map {
void *addr;
+   char *path;
uint64_t offset;
uint64_t size;
uint64_t phaddr;
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index 2b16fcb..ecf385a 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -137,10 +137,10 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
/*
 * open devname, to mmap it
 */
-   fd = open(uio_res->path, O_RDWR);
+   fd = open(uio_res->maps[i].path, O_RDWR);
if (fd < 0) {
RTE_LOG(ERR, EAL, "Cannot open %s: %s\n",
-   uio_res->path, strerror(errno));
+   uio_res->maps[i].path, strerror(errno));
return -1;
}

@@ -149,7 +149,8 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
 (size_t)uio_res->maps[i].size)
!= uio_res->maps[i].addr) {
RTE_LOG(ERR, EAL,
-   "Cannot mmap device resource\n");
+   "Cannot mmap device resource file: %s\n",
+   uio_res->maps[i].path);
close(fd);
return -1;
}
@@ -294,8 +295,6 @@ pci_uio_map_resource(struct rte_pci_device *dev)
void *mapaddr;
int uio_num;
uint64_t phaddr;
-   uint64_t offset;
-   uint64_t pagesz;
int nb_maps;
struct rte_pci_addr *loc = &dev->addr;
struct mapped_pci_resource *uio_res;
@@ -336,11 +335,6 @@ pci_uio_map_resource(struct rte_pci_device *dev)
return -1;
}

-   /* update devname for mmap  */
-   snprintf(devname, sizeof(devname),
-   SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/resource%d",
-   loc->domain, loc->bus, loc->devid, loc->function, 0);
-
/* set bus master that is not done by uio_pci_generic */
if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
if (pci_uio_set_bus_master(dev->intr_handle.uio_cfg_fd)) {
@@ -370,8 +364,6 @@ pci_uio_map_resource(struct rte_pci_device *dev)
uio_res->nb_maps = nb_maps;

/* Map all BARs */
-   pagesz = sysconf(_SC_PAGESIZE);
-
maps = uio_res->maps;
for (i = 0; i != PCI_MAX_RESOURCE; i++) {
int fd;
@@ -389,10 +381,15 @@ pci_uio_map_resource(struct rte_pci_device *dev)
/* if matching map is found, then use it */
if (j != nb_maps) {
int fail = 0;
-   offset = j * pagesz;
+
+   /* update devname for mmap  */
+   snprintf(devname, sizeof(devname),
+   SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/resource%d",
+   loc->domain, loc->bus, loc->devid, loc->function,
+   i);

/*
-* open devname, to mmap it
+* open resource file, to mmap it
 */
fd = open(devname, O_RDWR);
if (fd < 0) {
@@ -408,7 +405,7 @@ pci_uio_map_resource(struct rte_pci_device *dev)
if (pci_map_addr == NULL)
pci_map_addr = pci_find_max_end_va();

-   mapaddr = pci_map_resource(pci_map_addr, fd, 
(off

[dpdk-dev] [PATCH v2 2/2] eal: populate uio_maps from pci mem_resources array

2015-02-23 Thread Bruce Richardson
Rather than scanning the resource file in sysfs a second time, we
can pull the information on physical addresses of BARs from the
pci resource information already present in the dev structure.

Signed-off-by: Bruce Richardson 
---
 lib/librte_eal/linuxapp/eal/eal_pci_uio.c | 163 +++---
 1 file changed, 57 insertions(+), 106 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
index ecf385a..6aa2599 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci_uio.c
@@ -52,41 +52,6 @@ void *pci_map_addr = NULL;


 #define OFF_MAX  ((uint64_t)(off_t)-1)
-static int
-pci_uio_get_mappings(struct rte_pci_device *dev,
-   struct pci_map maps[], int nb_maps)
-{
-   struct rte_pci_addr *loc = &dev->addr;
-   int i = 0;
-   char filename[PATH_MAX];
-   unsigned long long start_addr, end_addr, flags;
-   FILE *f;
-
-   snprintf(filename, sizeof(filename),
-   SYSFS_PCI_DEVICES "/" PCI_PRI_FMT "/resource",
-   loc->domain, loc->bus, loc->devid, loc->function);
-
-   f = fopen(filename, "r");
-   if (f == NULL) {
-   RTE_LOG(ERR, EAL,
-   "%s(): cannot open sysfs %s\n",
-   __func__, filename);
-   return -1;
-   }
-
-   while (fscanf(f, "%llx %llx %llx", &start_addr,
-   &end_addr, &flags) == 3 && i < nb_maps) {
-   if (flags & IORESOURCE_MEM) {
-   maps[i].offset = 0x0;
-   maps[i].size = end_addr - start_addr + 1;
-   maps[i].phaddr = start_addr;
-   i++;
-   }
-   }
-   fclose(f);
-
-   return i;
-}

 static int
 pci_uio_set_bus_master(int dev_fd)
@@ -130,10 +95,6 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
continue;

for (i = 0; i != uio_res->nb_maps; i++) {
-   /* ignore mappings unused in primary process */
-   if (uio_res->maps[i].addr == NULL)
-   continue;
-
/*
 * open devname, to mmap it
 */
@@ -144,13 +105,21 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
return -1;
}

-   if (pci_map_resource(uio_res->maps[i].addr, fd,
-(off_t)uio_res->maps[i].offset,
-(size_t)uio_res->maps[i].size)
-   != uio_res->maps[i].addr) {
-   RTE_LOG(ERR, EAL,
-   "Cannot mmap device resource file: %s\n",
-   uio_res->maps[i].path);
+   void *mapaddr = pci_map_resource(uio_res->maps[i].addr,
+   fd, (off_t)uio_res->maps[i].offset,
+   (size_t)uio_res->maps[i].size);
+   if (mapaddr != uio_res->maps[i].addr) {
+   if (mapaddr == MAP_FAILED)
+   RTE_LOG(ERR, EAL,
+   "Cannot mmap device resource file %s: %s\n",
+   uio_res->maps[i].path,
+   strerror(errno));
+   else
+   RTE_LOG(ERR, EAL,
+   "Cannot mmap device resource file %s to address: %p\n",
+   uio_res->maps[i].path,
+   uio_res->maps[i].addr);
+
close(fd);
return -1;
}
@@ -288,14 +257,13 @@ pci_get_uio_dev(struct rte_pci_device *dev, char *dstbuf,
 int
 pci_uio_map_resource(struct rte_pci_device *dev)
 {
-   int i, j;
+   int i, map_idx;
char dirname[PATH_MAX];
char cfgname[PATH_MAX];
char devname[PATH_MAX]; /* contains the /dev/uioX */
void *mapaddr;
int uio_num;
uint64_t phaddr;
-   int nb_maps;
struct rte_pci_addr *loc = &dev->addr;
struct mapped_pci_resource *uio_res;
struct pci_map *maps;
@@ -336,11 +304,9 @@ pci_uio_map_resource(struct rte_pci_device *dev)
}

/* set bus master that is not done by uio_pci_generic */
-   if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-   if (pci_uio_set_bus_master(dev->intr_handle.uio_cfg_fd)) {
-   RTE_LOG(ERR, EAL, "Cannot set up bus mastering!\n");
-   return -1;
-   }

[dpdk-dev] [PATCH v5 1/6] ethdev: add rx interrupt enable/disable functions

2015-02-23 Thread Zhou, Danny

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Tuesday, February 24, 2015 12:59 AM
> To: Zhou, Danny
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v5 1/6] ethdev: add rx interrupt 
> enable/disable functions
> 
> On Tue, 24 Feb 2015 00:55:37 +0800
> Zhou Danny  wrote:
> 
> > +int
> > +rte_eth_dev_rx_queue_intr_enable(uint8_t port_id,
> > +   uint16_t queue_id)
> > +{
> > +   struct rte_eth_dev *dev;
> > +
> > +   if (port_id >= nb_ports) {
> > +   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > +   return (-ENODEV);
> 
> Please don't use the BSD style of extra, useless parentheses around
> return values.

This sanity-check snippet is copied from other functions defined in the
same file, lib/librte_ether/rte_ethdev.c, and there is plenty of legacy
code in this file doing the same BSD-style thing. I'd suggest that
somebody clean up all such BSD-style code in DPDK in a separate patch set.


[dpdk-dev] [PATCH v5 2/3] ethdev: add optional rxtx callback support

2015-02-23 Thread Mcnamara, John
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, February 23, 2015 3:12 PM
> To: Mcnamara, John
> Cc: dev at dpdk.org; Richardson, Bruce; nhorman at tuxdriver.com;
> stephen at networkplumber.org; Doherty, Declan
> Subject: Re: [PATCH v5 2/3] ethdev: add optional rxtx callback support
> 
> > +#ifdef RTE_ETHDEV_RXTX_CALLBACKS
> > +void *
> > +rte_eth_add_rx_callback(uint8_t port_id, uint16_t queue_id,
> > +   rte_rxtx_callback_fn fn, void *user_param) {
> > +   /* check input parameters */
> > +   if (port_id >= nb_ports || fn == NULL ||
> > +   queue_id >= rte_eth_devices[port_id].data->nb_rx_queues) {
> > +   rte_errno = EINVAL;
> > +   return NULL;
> > +   }
> 
> Why not put the #ifdef only here and return an ENOTSUP error?


Hi Thomas,

That would probably be cleaner/clearer. I'll rework this patch with your 
suggestions.

John
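
The suggestion above, keeping the function always present and moving
the #ifdef inside its body, could look roughly like this. It is a
sketch under assumptions, not the reworked patch: add_rx_callback(),
callback_fn, struct cb_entry and the EXAMPLE_RXTX_CALLBACKS option are
hypothetical names, and plain errno stands in for rte_errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef void (*callback_fn)(void *user_param);	/* hypothetical type */

struct cb_entry {				/* hypothetical record */
	callback_fn fn;
	void *param;
};

/*
 * The symbol exists in every build; only the body depends on the
 * build-time option. Returns NULL and sets errno on failure.
 */
static void *
add_rx_callback(callback_fn fn, void *user_param)
{
#ifdef EXAMPLE_RXTX_CALLBACKS
	static struct cb_entry registered;

	if (fn == NULL) {
		errno = EINVAL;
		return NULL;
	}
	registered.fn = fn;
	registered.param = user_param;
	return &registered;
#else
	/* Feature compiled out: report it instead of hiding the symbol. */
	(void)fn;
	(void)user_param;
	errno = ENOTSUP;
	return NULL;
#endif
}
```

With this layout, callers compile and link regardless of the option and
can tell "not supported" apart from "invalid argument" at run time.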
-- 



[dpdk-dev] [PATCH v3 00/11] qemu vhost-user support

2015-02-23 Thread Przemyslaw Czesnowicz
v3 changes:
  * move things around to make all patches compile


Xie, Huawei (11):
  lib/librte_vhost: enable VIRTIO_NET_F_CTRL_RX VIRTIO_NET_F_CTRL_RX is
dependant on VIRTIO_NET_F_CTRL_VQ. Observed that virtio-net driver
in guest would crash with only CTRL_RX enabled.
  lib/librte_vhost: create vhost_cuse directory and move
vhost-net-cdev.c into vhost_cuse
  lib/librte_vhost: rename vhost-net-cdev.h to vhost-net.h
  lib/librte_vhost: move fd copying(from qemu process into vhost
process) to eventfd_copy.c
  lib/librte_vhost: copy host_memory_map from virtio-net.c to a new file
virtio-net-cdev.c
  lib/librte_vhost: make host_memory_map a more generic function.
  lib/librte_vhost: implement cuse_set_memory_table
  lib/librte_vhost: add select based event driven processing
  lib/librte_vhost: vhost user support
  lib/librte_vhost: support dev->ifname for vhost-user
  lib/librte_vhost: support dynamically registering vhost server

 lib/librte_vhost/Makefile |   8 +-
 lib/librte_vhost/rte_virtio_net.h |   5 +-
 lib/librte_vhost/vhost-net-cdev.c | 389 
 lib/librte_vhost/vhost-net-cdev.h | 113 --
 lib/librte_vhost/vhost-net.h  | 118 +++
 lib/librte_vhost/vhost_cuse/eventfd_copy.c|  88 +
 lib/librte_vhost/vhost_cuse/eventfd_copy.h|  39 ++
 lib/librte_vhost/vhost_cuse/vhost-net-cdev.c  | 417 ++
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 423 ++
 lib/librte_vhost/vhost_cuse/virtio-net-cdev.h |  48 +++
 lib/librte_vhost/vhost_rxtx.c |   2 +-
 lib/librte_vhost/vhost_user/fd_man.c  | 258 ++
 lib/librte_vhost/vhost_user/fd_man.h  |  67 
 lib/librte_vhost/vhost_user/vhost-net-user.c  | 472 +
 lib/librte_vhost/vhost_user/vhost-net-user.h  | 106 ++
 lib/librte_vhost/vhost_user/virtio-net-user.c | 314 
 lib/librte_vhost/vhost_user/virtio-net-user.h |  49 +++
 lib/librte_vhost/virtio-net.c | 491 ++
 lib/librte_vhost/virtio-net.h |  43 +++
 19 files changed, 2491 insertions(+), 959 deletions(-)
 delete mode 100644 lib/librte_vhost/vhost-net-cdev.c
 delete mode 100644 lib/librte_vhost/vhost-net-cdev.h
 create mode 100644 lib/librte_vhost/vhost-net.h
 create mode 100644 lib/librte_vhost/vhost_cuse/eventfd_copy.c
 create mode 100644 lib/librte_vhost/vhost_cuse/eventfd_copy.h
 create mode 100644 lib/librte_vhost/vhost_cuse/vhost-net-cdev.c
 create mode 100644 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c
 create mode 100644 lib/librte_vhost/vhost_cuse/virtio-net-cdev.h
 create mode 100644 lib/librte_vhost/vhost_user/fd_man.c
 create mode 100644 lib/librte_vhost/vhost_user/fd_man.h
 create mode 100644 lib/librte_vhost/vhost_user/vhost-net-user.c
 create mode 100644 lib/librte_vhost/vhost_user/vhost-net-user.h
 create mode 100644 lib/librte_vhost/vhost_user/virtio-net-user.c
 create mode 100644 lib/librte_vhost/vhost_user/virtio-net-user.h
 create mode 100644 lib/librte_vhost/virtio-net.h

-- 
1.9.3



[dpdk-dev] [PATCH v3 01/11] lib/librte_vhost: enable VIRTIO_NET_F_CTRL_RX VIRTIO_NET_F_CTRL_RX is dependant on VIRTIO_NET_F_CTRL_VQ. Observed that virtio-net driver in guest would crash with only CTRL

2015-02-23 Thread Przemyslaw Czesnowicz
From: "Xie, Huawei" 

In virtnet_send_command:

/* Caller should know better */
BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ) ||
(out + in > VIRTNET_SEND_COMMAND_SG_MAX));

Signed-off-by: Huawei Xie 
---
 lib/librte_vhost/virtio-net.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index b041849..52b4957 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -73,7 +73,8 @@ static struct virtio_net_config_ll *ll_root;

 /* Features supported by this lib. */
 #define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
- (1ULL << VIRTIO_NET_F_CTRL_RX))
+   (1ULL << VIRTIO_NET_F_CTRL_VQ) | \
+   (1ULL << VIRTIO_NET_F_CTRL_RX))
 static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;

 /* Line size for reading maps file. */
-- 
1.9.3



[dpdk-dev] [PATCH v3 02/11] lib/librte_vhost: create vhost_cuse directory and move vhost-net-cdev.c into vhost_cuse

2015-02-23 Thread Przemyslaw Czesnowicz
From: "Xie, Huawei" 

The vhost-cuse driver will be divided into two parts: cuse-driver-specific
message handling (in the vhost_cuse directory) and common message handling
(in virtio-net.c).

A vhost ioctl message is pre-processed in cuse and then sent to virtio-net
if it is not terminated there.

virtio-net.c provides common message handling for both vhost-cuse and
vhost-user.

Signed-off-by: Huawei Xie 
---
 lib/librte_vhost/Makefile|   4 +-
 lib/librte_vhost/vhost-net-cdev.c| 389 ---
 lib/librte_vhost/vhost_cuse/vhost-net-cdev.c | 389 +++
 3 files changed, 391 insertions(+), 391 deletions(-)
 delete mode 100644 lib/librte_vhost/vhost-net-cdev.c
 create mode 100644 lib/librte_vhost/vhost_cuse/vhost-net-cdev.c

diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 369c25a..49ae7ae 100644
--- a/lib/librte_vhost/Makefile
+++ b/lib/librte_vhost/Makefile
@@ -38,10 +38,10 @@ EXPORT_MAP := rte_vhost_version.map

 LIBABIVER := 1

-CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -D_FILE_OFFSET_BITS=64 -lfuse
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -I vhost_cuse -O3 -D_FILE_OFFSET_BITS=64 -lfuse
 LDFLAGS += -lfuse
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := vhost-net-cdev.c virtio-net.c vhost_rxtx.c
+SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := vhost_cuse/vhost-net-cdev.c virtio-net.c vhost_rxtx.c

 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_virtio_net.h
diff --git a/lib/librte_vhost/vhost-net-cdev.c b/lib/librte_vhost/vhost-net-cdev.c
deleted file mode 100644
index 57c76cb..000
--- a/lib/librte_vhost/vhost-net-cdev.c
+++ /dev/null
@@ -1,389 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- * * Redistributions of source code must retain the above copyright
- *   notice, this list of conditions and the following disclaimer.
- * * Redistributions in binary form must reproduce the above copyright
- *   notice, this list of conditions and the following disclaimer in
- *   the documentation and/or other materials provided with the
- *   distribution.
- * * Neither the name of Intel Corporation nor the names of its
- *   contributors may be used to endorse or promote products derived
- *   from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
-#include 
-#include 
-#include 
-
-#include "vhost-net-cdev.h"
-
-#define FUSE_OPT_DUMMY "\0\0"
-#define FUSE_OPT_FORE  "-f\0\0"
-#define FUSE_OPT_NOMULTI "-s\0\0"
-
-static const uint32_t default_major = 231;
-static const uint32_t default_minor = 1;
-static const char cuse_device_name[] = "/dev/cuse";
-static const char default_cdev[] = "vhost-net";
-
-static struct fuse_session *session;
-static struct vhost_net_device_ops const *ops;
-
-/*
- * Returns vhost_device_ctx from given fuse_req_t. The index is populated later
- * when the device is added to the device linked list.
- */
-static struct vhost_device_ctx
-fuse_req_to_vhost_ctx(fuse_req_t req, struct fuse_file_info *fi)
-{
-   struct vhost_device_ctx ctx;
-   struct fuse_ctx const *const req_ctx = fuse_req_ctx(req);
-
-   ctx.pid = req_ctx->pid;
-   ctx.fh = fi->fh;
-
-   return ctx;
-}
-
-/*
- * When the device is created in QEMU it gets initialised here and
- * added to the device linked list.
- */
-static void
-vhost_net_open(fuse_req_t req, struct fuse_file_info *fi)
-{
-   struct vhost_device_ctx ctx = fuse_req_to_vhost_ctx(req, fi);
-   int err = 0;
-
-   err = ops->new_device(ctx);
-   if (err == -1) {
-   fuse_reply_err(req, EPERM);
-   return;
-   }
-
-   fi->fh = err;
-
-   RTE_LOG(INFO, VHOST_CONFIG,
-   "(%"PRIu64") Device configuration started\n", fi->fh);
-   fuse_reply_open(req, fi);
-}
-
-/*
- * When QEMU is sh

[dpdk-dev] [PATCH v3 03/11] lib/librte_vhost: rename vhost-net-cdev.h to vhost-net.h

2015-02-23 Thread Przemyslaw Czesnowicz
From: "Xie, Huawei" 

This file defines common operations provided by virtio-net(.c).

Signed-off-by: Huawei Xie 
---
 lib/librte_vhost/vhost-net-cdev.h| 113 ---
 lib/librte_vhost/vhost-net.h | 113 +++
 lib/librte_vhost/vhost_cuse/vhost-net-cdev.c |   2 +-
 lib/librte_vhost/vhost_rxtx.c|   2 +-
 lib/librte_vhost/virtio-net.c|   2 +-
 5 files changed, 116 insertions(+), 116 deletions(-)
 delete mode 100644 lib/librte_vhost/vhost-net-cdev.h
 create mode 100644 lib/librte_vhost/vhost-net.h

diff --git a/lib/librte_vhost/vhost-net-cdev.h b/lib/librte_vhost/vhost-net-cdev.h
deleted file mode 100644
index 03a5c57..000
--- a/lib/librte_vhost/vhost-net-cdev.h
+++ /dev/null
@@ -1,113 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- * * Redistributions of source code must retain the above copyright
- *   notice, this list of conditions and the following disclaimer.
- * * Redistributions in binary form must reproduce the above copyright
- *   notice, this list of conditions and the following disclaimer in
- *   the documentation and/or other materials provided with the
- *   distribution.
- * * Neither the name of Intel Corporation nor the names of its
- *   contributors may be used to endorse or promote products derived
- *   from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef _VHOST_NET_CDEV_H_
-#define _VHOST_NET_CDEV_H_
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
-
-/* Macros for printing using RTE_LOG */
-#define RTE_LOGTYPE_VHOST_CONFIG RTE_LOGTYPE_USER1
-#define RTE_LOGTYPE_VHOST_DATA   RTE_LOGTYPE_USER1
-
-#ifdef RTE_LIBRTE_VHOST_DEBUG
-#define VHOST_MAX_PRINT_BUFF 6072
-#define LOG_LEVEL RTE_LOG_DEBUG
-#define LOG_DEBUG(log_type, fmt, args...) RTE_LOG(DEBUG, log_type, fmt, ##args)
-#define PRINT_PACKET(device, addr, size, header) do { \
-   char *pkt_addr = (char *)(addr); \
-   unsigned int index; \
-   char packet[VHOST_MAX_PRINT_BUFF]; \
-   \
-   if ((header)) \
-       snprintf(packet, VHOST_MAX_PRINT_BUFF, "(%"PRIu64") Header size %d: ", (device->device_fh), (size)); \
-   else \
-       snprintf(packet, VHOST_MAX_PRINT_BUFF, "(%"PRIu64") Packet size %d: ", (device->device_fh), (size)); \
-   for (index = 0; index < (size); index++) { \
-       snprintf(packet + strnlen(packet, VHOST_MAX_PRINT_BUFF), VHOST_MAX_PRINT_BUFF - strnlen(packet, VHOST_MAX_PRINT_BUFF), \
-           "%02hhx ", pkt_addr[index]); \
-   } \
-   snprintf(packet + strnlen(packet, VHOST_MAX_PRINT_BUFF), VHOST_MAX_PRINT_BUFF - strnlen(packet, VHOST_MAX_PRINT_BUFF), "\n"); \
-   \
-   LOG_DEBUG(VHOST_DATA, "%s", packet); \
-} while (0)
-#else
-#define LOG_LEVEL RTE_LOG_INFO
-#define LOG_DEBUG(log_type, fmt, args...) do {} while (0)
-#define PRINT_PACKET(device, addr, size, header) do {} while (0)
-#endif
-
-
-/*
- * Structure used to identify device context.
- */
-struct vhost_device_ctx {
-   pid_t    pid; /* PID of process calling the IOCTL. */
-   uint64_t fh;  /* Populated with fi->fh to track the device index. */
-};
-
-/*
- * Structure contains function pointers to be defined in virtio-net.c. These
- * functions are called in CUSE context and are used to configure devices.
- */
-struct vhost_net_device_ops {
-   int (*new_device)(struct vhost_device_ctx);
-   void (*destroy_device)(struct vhost_device_ctx);
-
-   int (*get_features)(struct vhost_device_ctx, uint64_t *);
-   int (*set_features)(struct vhost_device_ctx, uint64_t *);
-
-   int (*set_mem_table)(struct vhost_device_ctx, const void *, uint32_t);
-
-   int (*set_vring_num)(struct vhost_device_ctx, struct vhost_vring_state *);
-   int (*set_vring_addr)(struct vhost_device_ctx, s

[dpdk-dev] [PATCH v3 04/11] lib/librte_vhost: move fd copying(from qemu process into vhost process) to eventfd_copy.c

2015-02-23 Thread Przemyslaw Czesnowicz
From: "Xie, Huawei" 

vhost-user doesn't need eventfd kernel module to copy fds between processes.

Signed-off-by: Huawei Xie 
Signed-off-by: Przemyslaw Czesnowicz 
---
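
Reader's note (not part of the patch): the commit message rests on the fact
that a vhost-user backend receives the guest's eventfds directly over its
Unix domain socket, as SCM_RIGHTS ancillary data, so the eventfd_link kernel
module is only needed on the CUSE path. The sketch below shows that mechanism
in isolation. It assumes an already connected AF_UNIX socket; the helper names
send_fd and recv_fd are hypothetical and do not come from this series.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_cmsg {
	struct cmsghdr align;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send one descriptor over a connected AF_UNIX socket. */
static int
send_fd(int sock, int fd)
{
	char dummy = 0;	/* at least one byte of real data must accompany the fd */
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&msg, 0, sizeof(msg));
	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor; returns it, or -1 on error. */
static int
recv_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&msg, 0, sizeof(msg));
	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) <= 0)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	/* The kernel has already installed a new descriptor in this process. */
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

The descriptor returned by recv_fd refers to the same open file as the one
the sender passed, which is the property eventfd_copy() obtains through the
EVENTFD_COPY ioctl in the CUSE case.
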
 lib/librte_vhost/Makefile                    |  2 +-
 lib/librte_vhost/vhost_cuse/eventfd_copy.c   | 88
 lib/librte_vhost/vhost_cuse/eventfd_copy.h   | 39
 lib/librte_vhost/vhost_cuse/vhost-net-cdev.c | 41 +
 lib/librte_vhost/virtio-net.c                | 75
 5 files changed, 170 insertions(+), 75 deletions(-)
 create mode 100644 lib/librte_vhost/vhost_cuse/eventfd_copy.c
 create mode 100644 lib/librte_vhost/vhost_cuse/eventfd_copy.h

diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
index 49ae7ae..88d1295 100644
--- a/lib/librte_vhost/Makefile
+++ b/lib/librte_vhost/Makefile
@@ -41,7 +41,7 @@ LIBABIVER := 1
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -I vhost_cuse -O3 -D_FILE_OFFSET_BITS=64 -lfuse
 LDFLAGS += -lfuse
 # all source are stored in SRCS-y
-SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := vhost_cuse/vhost-net-cdev.c virtio-net.c vhost_rxtx.c
+SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := vhost_cuse/vhost-net-cdev.c vhost_cuse/eventfd_copy.c virtio-net.c vhost_rxtx.c

 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_virtio_net.h
diff --git a/lib/librte_vhost/vhost_cuse/eventfd_copy.c b/lib/librte_vhost/vhost_cuse/eventfd_copy.c
new file mode 100644
index 000..4d697a2
--- /dev/null
+++ b/lib/librte_vhost/vhost_cuse/eventfd_copy.c
@@ -0,0 +1,88 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "eventfd_link/eventfd_link.h"
+#include "eventfd_copy.h"
+#include "vhost-net.h"
+
+static const char eventfd_cdev[] = "/dev/eventfd-link";
+
+/*
+ * This function uses the eventfd_link kernel module to copy an eventfd file
+ * descriptor provided by QEMU in to our process space.
+ */
+int
+eventfd_copy(int target_fd, int target_pid)
+{
+   int eventfd_link, ret;
+   struct eventfd_copy eventfd_copy;
+   int fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+
+   if (fd == -1)
+       return -1;
+
+   /* Open the character device to the kernel module. */
+   /* TODO: check this earlier rather than fail until VM boots! */
+   eventfd_link = open(eventfd_cdev, O_RDWR);
+   if (eventfd_link < 0) {
+       RTE_LOG(ERR, VHOST_CONFIG,
+           "eventfd_link module is not loaded\n");
+       close(fd);
+       return -1;
+   }
+
+   eventfd_copy.source_fd = fd;
+   eventfd_copy.target_fd = target_fd;
+   eventfd_copy.target_pid = target_pid;
+   /* Call the IOCTL to copy the eventfd. */
+   ret = ioctl(eventfd_link, EVENTFD_COPY, &eventfd_copy);
+   close(eventfd_link);
+
+   if (ret < 0) {
+       RTE_LOG(ERR, VHOST_CONFIG,
+           "EVENTFD_COPY ioctl failed\n");
+       close(fd);
+       return -1;
+   }
+
+   return fd;
+}
diff --git a/lib/librte_vhost/vhost_cuse/eventfd_copy.h b/lib/librte_vhost/vhost_cuse/eventfd_copy.h
new file mode 100644
index 000..19ae30d
--- /dev/null
+++ b/lib/librte_vhost/vhost_cuse/eventfd_copy.h
@@ -0,0 +1,39 @@
+/*-
+ *   BSD LIC
