[dpdk-dev] [DPDK-memory] how qemu waste such long time under dpdk huge page envriment?
Hi all, I'm running `QEMU_CMD ...` to create a vm under dpdk huge page envriment (which set huge page 1G). And I enable all events in qemu. For qemu and ovs-dpdk(ovs-2.4.9 with dpdk-2.2.0) environment, detail log is: > 30012@1497443246.678304:object_dynamic_cast_assert qemu:memory-region->qemu:memory-region (/home/hu > anghuai/cloud/contrib/qemu-2.6.0/memory.c:1076:memory_region_initfn) > 30012@1497443256.274866:object_dynamic_cast_assert qio-channel-socket->qio-channel-socket (io/chann > el-socket.c:389:qio_channel_socket_init) I don't know why qemu doing 'memory_region_initfn' function in this 10 second, does anyone know this? static void memory_region_initfn(Object *obj) > { > MemoryRegion *mr = MEMORY_REGION(obj); > ObjectProperty *op; > mr->ops = &unassigned_mem_ops; > mr->enabled = true; > mr->romd_mode = true; > mr->global_locking = true; > mr->destructor = memory_region_destructor_none; > QTAILQ_INIT(&mr->subregions); > QTAILQ_INIT(&mr->coalesced); > op = object_property_add(OBJECT(mr), "container", > "link<" TYPE_MEMORY_REGION ">", > memory_region_get_container, > NULL, /* memory_region_set_container */ > NULL, NULL, &error_abort); > op->resolve = memory_region_resolve_container; > object_property_add(OBJECT(mr), "addr", "uint64", > memory_region_get_addr, > NULL, /* memory_region_set_addr */ > NULL, NULL, &error_abort); > object_property_add(OBJECT(mr), "priority", "uint32", > memory_region_get_priority, > NULL, /* memory_region_set_priority */ > NULL, NULL, &error_abort); > object_property_add_bool(OBJECT(mr), "may-overlap", > memory_region_get_may_overlap, > NULL, /* memory_region_set_may_overlap */ > &error_abort); > object_property_add(OBJECT(mr), "size", "uint64", > memory_region_get_size, > NULL, /* memory_region_set_size, */ > NULL, NULL, &error_abort); > } Thank you~
Re: [dpdk-dev] [DPDK-memory] how qemu waste such long time under dpdk huge page envriment?
BTW, while running ovs-dpdk, this log is also take long time, does that mean dpdk request large memory take long time? EAL: Setting up physically contiguous memory... 2017-06-16 16:11 GMT+08:00 Sam : > Hi all, > > I'm running `QEMU_CMD ...` to create a vm under dpdk huge page envriment > (which set huge page 1G). And I enable all events in qemu. > > For qemu and ovs-dpdk(ovs-2.4.9 with dpdk-2.2.0) environment, detail log > is: > > > 30012@1497443246.678304:object_dynamic_cast_assert > qemu:memory-region->qemu:memory-region (/home/hu > > anghuai/cloud/contrib/qemu-2.6.0/memory.c:1076:memory_region_initfn) > > 30012@1497443256.274866:object_dynamic_cast_assert > qio-channel-socket->qio-channel-socket (io/chann > > el-socket.c:389:qio_channel_socket_init) > > > I don't know why qemu doing 'memory_region_initfn' function in this 10 > second, does anyone know this? > > static void memory_region_initfn(Object *obj) >> { >> MemoryRegion *mr = MEMORY_REGION(obj); >> ObjectProperty *op; >> mr->ops = &unassigned_mem_ops; >> mr->enabled = true; >> mr->romd_mode = true; >> mr->global_locking = true; >> mr->destructor = memory_region_destructor_none; >> QTAILQ_INIT(&mr->subregions); >> QTAILQ_INIT(&mr->coalesced); >> op = object_property_add(OBJECT(mr), "container", >> "link<" TYPE_MEMORY_REGION ">", >> memory_region_get_container, >> NULL, /* memory_region_set_container */ >> NULL, NULL, &error_abort); >> op->resolve = memory_region_resolve_container; >> object_property_add(OBJECT(mr), "addr", "uint64", >> memory_region_get_addr, >> NULL, /* memory_region_set_addr */ >> NULL, NULL, &error_abort); >> object_property_add(OBJECT(mr), "priority", "uint32", >> memory_region_get_priority, >> NULL, /* memory_region_set_priority */ >> NULL, NULL, &error_abort); >> object_property_add_bool(OBJECT(mr), "may-overlap", >> memory_region_get_may_overlap, >> NULL, /* memory_region_set_may_overlap */ >> &error_abort); >> object_property_add(OBJECT(mr), "size", "uint64", >> memory_region_get_size, >> NULL, /* memory_region_set_size, */ >> NULL, NULL, &error_abort); >> } > > > Thank you~ >
Re: [dpdk-dev] How to get "--base-virtaddr" when using DPDK
On Thu, Jun 15, 2017 at 12:00:16PM -0600, Junguk Cho wrote: > Hi, > > I have the same situation which is explained here ( > https://github.com/collectd/collectd/issues/2284). > > So, I would like to try "--base-virtaddr" option. > I tried to use "struct rte_memseg *m = rte_eal_get_physmem_layout()" option > based on (http://dpdk.org/doc/api/structrte__memseg.html). > However, it always returns different number. > > How do I get virtual address value to use it as an input of "base-viraddr"? > Hi, there is no one way to get a virtual address to pass in here - if you are aware of some suitable values just pass them directly. What I have done in the past in a multi-process system, is to run the primary normaly, and then when running the failing secondary turn up the log level so I can see the mapped addresses before it fails. Then I record the starting address that the secondary uses, and re-run the primary passing that in as the base-virtaddr hint. Regards, /Bruce
Re: [dpdk-dev] [PATCH 01/38] eal: add support for 24 40 and 48 bit operations
On Fri, Jun 16, 2017 at 11:10:31AM +0530, Shreyansh Jain wrote: > From: Hemant Agrawal > > Bit Swap and LE<=>BE conversions for 23, 40 and 48 bit width > > Signed-off-by: Hemant Agrawal > --- > .../common/include/generic/rte_byteorder.h | 78 > ++ > 1 file changed, 78 insertions(+) > Are these really common enough for inclusion in an generic EAL file? Would they be better being driver specific, so that we don't end up with lots of extra byte-swap routines for each possible size used by a driver. /Bruce
Re: [dpdk-dev] [DPDK-memory] how qemu waste such long time under dpdk huge page envriment?
On Fri, Jun 16, 2017 at 04:26:40PM +0800, Sam wrote: > BTW, while running ovs-dpdk, this log is also take long time, does that > mean dpdk request large memory take long time? > > EAL: Setting up physically contiguous memory... > When running with 1G pages, I found that the mmap system call takes a considerable amount of time to execute. I think this is due to the kernel zero-ing out the 1G pages. IIRC on one system I measured it as taking about 0.4 seconds per 1G page. /Bruce > > 2017-06-16 16:11 GMT+08:00 Sam : > > > Hi all, > > > > I'm running `QEMU_CMD ...` to create a vm under dpdk huge page envriment > > (which set huge page 1G). And I enable all events in qemu. > > > > For qemu and ovs-dpdk(ovs-2.4.9 with dpdk-2.2.0) environment, detail log > > is: > > > > > 30012@1497443246.678304:object_dynamic_cast_assert > > qemu:memory-region->qemu:memory-region (/home/hu > > > anghuai/cloud/contrib/qemu-2.6.0/memory.c:1076:memory_region_initfn) > > > 30012@1497443256.274866:object_dynamic_cast_assert > > qio-channel-socket->qio-channel-socket (io/chann > > > el-socket.c:389:qio_channel_socket_init) > > > > > > I don't know why qemu doing 'memory_region_initfn' function in this 10 > > second, does anyone know this? > > > > static void memory_region_initfn(Object *obj) > >> { > >> MemoryRegion *mr = MEMORY_REGION(obj); > >> ObjectProperty *op; > >> mr->ops = &unassigned_mem_ops; > >> mr->enabled = true; > >> mr->romd_mode = true; > >> mr->global_locking = true; > >> mr->destructor = memory_region_destructor_none; > >> QTAILQ_INIT(&mr->subregions); > >> QTAILQ_INIT(&mr->coalesced); > >> op = object_property_add(OBJECT(mr), "container", > >> "link<" TYPE_MEMORY_REGION ">", > >> memory_region_get_container, > >> NULL, /* memory_region_set_container */ > >> NULL, NULL, &error_abort); > >> op->resolve = memory_region_resolve_container; > >> object_property_add(OBJECT(mr), "addr", "uint64", > >> memory_region_get_addr, > >> NULL, /* memory_region_set_addr */ > >> NULL, NULL, &error_abort); > >> object_property_add(OBJECT(mr), "priority", "uint32", > >> memory_region_get_priority, > >> NULL, /* memory_region_set_priority */ > >> NULL, NULL, &error_abort); > >> object_property_add_bool(OBJECT(mr), "may-overlap", > >> memory_region_get_may_overlap, > >> NULL, /* memory_region_set_may_overlap */ > >> &error_abort); > >> object_property_add(OBJECT(mr), "size", "uint64", > >> memory_region_get_size, > >> NULL, /* memory_region_set_size, */ > >> NULL, NULL, &error_abort); > >> } > > > > > > Thank you~ > >
Re: [dpdk-dev] [PATCH 01/38] eal: add support for 24 40 and 48 bit operations
Hi Bruce, > -Original Message- > From: Bruce Richardson [mailto:bruce.richard...@intel.com] > Sent: Friday, June 16, 2017 2:27 PM > To: Shreyansh Jain > Cc: dev@dpdk.org; ferruh.yi...@intel.com; Hemant Agrawal > > Subject: Re: [dpdk-dev] [PATCH 01/38] eal: add support for 24 40 and 48 bit > operations > > On Fri, Jun 16, 2017 at 11:10:31AM +0530, Shreyansh Jain wrote: > > From: Hemant Agrawal > > > > Bit Swap and LE<=>BE conversions for 23, 40 and 48 bit width > > > > Signed-off-by: Hemant Agrawal > > --- > > .../common/include/generic/rte_byteorder.h | 78 > ++ > > 1 file changed, 78 insertions(+) > > > Are these really common enough for inclusion in an generic EAL file? > Would they be better being driver specific, so that we don't end up with > lots of extra byte-swap routines for each possible size used by a > driver. Reasoning was to keep all bit/byte swap at a single place and if it is useful for others. >From DPAA perspective, these macro can be anywhere. In case someone else too has use of this (now or in near-future), probably then we can consider this in EAL. Else, if I don't get much responses in a few days, I will shift them to DPAA driver in next version of this series. - Shreyansh
[dpdk-dev] [PATCH v3 3/3] doc: add new ddp get info command
This patch adds description of new command for dynamic device personalization profiles: ddp get info (profile_path) Signed-off-by: Andrey Chilikin --- doc/guides/testpmd_app_ug/testpmd_funcs.rst | 7 +++ 1 file changed, 7 insertions(+) diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst index 0e50c1071..c982cd895 100644 --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst @@ -380,6 +380,13 @@ For example:: testpmd> read txd 0 0 4 0x0001 - 0x24C3C440 / 0x000F - 0x2330003C +ddp get info + + +Display information about dynamic device personalization profile:: + + testpmd> ddp get info (profile_patch) + show vf stats ~ -- 2.13.0
[dpdk-dev] [PATCH v3 0/3] net/i40e: get information about ddp profile
This patch adds ability to request information about dynamic device personalization profile v3: Add command description to test-pmd guide v2: Add rte_pmd_i40e_get_ddp_info() function to i40e version map Add CL to test-pmd for getting information about profile Andrey Chilikin (3): net/i40e: get information about ddp profile app/testpmd: enable ddp get info feature doc: add new ddp get info command app/test-pmd/cmdline.c | 131 ++ doc/guides/testpmd_app_ug/testpmd_funcs.rst | 7 ++ drivers/net/i40e/rte_pmd_i40e.c | 163 +++- drivers/net/i40e/rte_pmd_i40e.h | 46 drivers/net/i40e/rte_pmd_i40e_version.map | 7 ++ 5 files changed, 350 insertions(+), 4 deletions(-) -- 2.13.0 Acked-by: Beilei Xing
[dpdk-dev] [PATCH v3 1/3] net/i40e: get information about ddp profile
This patch adds ability to request information about dynamic device personalization profile Signed-off-by: Andrey Chilikin --- drivers/net/i40e/rte_pmd_i40e.c | 163 +- drivers/net/i40e/rte_pmd_i40e.h | 46 + drivers/net/i40e/rte_pmd_i40e_version.map | 7 ++ 3 files changed, 212 insertions(+), 4 deletions(-) diff --git a/drivers/net/i40e/rte_pmd_i40e.c b/drivers/net/i40e/rte_pmd_i40e.c index f7ce62bb5..45cdcfaa3 100644 --- a/drivers/net/i40e/rte_pmd_i40e.c +++ b/drivers/net/i40e/rte_pmd_i40e.c @@ -1468,7 +1468,7 @@ rte_pmd_i40e_set_tc_strict_prio(uint8_t port, uint8_t tc_map) return ret; } -#define I40E_PROFILE_INFO_SIZE 48 +#define I40E_PROFILE_INFO_SIZE sizeof(struct rte_pmd_i40e_profile_info) #define I40E_MAX_PROFILE_NUM 16 static void @@ -1520,9 +1520,6 @@ i40e_add_rm_profile_info(struct i40e_hw *hw, uint8_t *profile_info_sec) return status; } -#define I40E_PROFILE_INFO_SIZE 48 -#define I40E_MAX_PROFILE_NUM 16 - /* Check if the profile info exists */ static int i40e_check_profile_info(uint8_t port, uint8_t *profile_info_sec) @@ -1682,6 +1679,164 @@ rte_pmd_i40e_process_ddp_package(uint8_t port, uint8_t *buff, return status; } +int rte_pmd_i40e_get_ddp_info(uint8_t *pkg_buff, uint32_t pkg_size, + uint8_t *info_buff, uint32_t info_size, + enum rte_pmd_i40e_package_info type) +{ + uint32_t ret_size; + struct i40e_package_header *pkg_hdr; + struct i40e_generic_seg_header *i40e_seg_hdr; + struct i40e_generic_seg_header *note_seg_hdr; + struct i40e_generic_seg_header *metadata_seg_hdr; + + if (!info_buff) { + PMD_DRV_LOG(ERR, "Output info buff is invalid."); + return -EINVAL; + } + + if (!pkg_buff || pkg_size < (sizeof(struct i40e_package_header) + + sizeof(struct i40e_metadata_segment) + + sizeof(uint32_t) * 2)) { + PMD_DRV_LOG(ERR, "Package buff is invalid."); + return -EINVAL; + } + + pkg_hdr = (struct i40e_package_header *)pkg_buff; + if (pkg_hdr->segment_count < 2) { + PMD_DRV_LOG(ERR, "Segment_count should be 2 at least."); + return -EINVAL; + } + + /* Find metadata segment */ + metadata_seg_hdr = i40e_find_segment_in_package(SEGMENT_TYPE_METADATA, + pkg_hdr); + + /* Find global notes segment */ + note_seg_hdr = i40e_find_segment_in_package(SEGMENT_TYPE_NOTES, + pkg_hdr); + + /* Find i40e profile segment */ + i40e_seg_hdr = i40e_find_segment_in_package(SEGMENT_TYPE_I40E, pkg_hdr); + + /* get global header info */ + if (type == RTE_PMD_I40E_PKG_INFO_GLOBAL_HEADER) { + struct rte_pmd_i40e_profile_info *info = + (struct rte_pmd_i40e_profile_info *)info_buff; + + if (info_size < sizeof(struct rte_pmd_i40e_profile_info)) { + PMD_DRV_LOG(ERR, "Output info buff size is invalid."); + return -EINVAL; + } + + if (!metadata_seg_hdr) { + PMD_DRV_LOG(ERR, "Failed to find metadata segment header"); + return -EINVAL; + } + + memset(info, 0, sizeof(struct rte_pmd_i40e_profile_info)); + info->owner = RTE_PMD_I40E_DDP_OWNER_UNKNOWN; + info->track_id = + ((struct i40e_metadata_segment *)metadata_seg_hdr)->track_id; + + memcpy(info->name, + ((struct i40e_metadata_segment *)metadata_seg_hdr)->name, + I40E_DDP_NAME_SIZE); + memcpy(&info->version, + &((struct i40e_metadata_segment *)metadata_seg_hdr)->version, + sizeof(struct i40e_ddp_version)); + return I40E_SUCCESS; + } + + /* get global note size */ + if (type == RTE_PMD_I40E_PKG_INFO_GLOBAL_NOTES_SIZE) { + if (info_size < sizeof(uint32_t)) { + PMD_DRV_LOG(ERR, "Invalid information buffer size"); + return -EINVAL; + } + if (note_seg_hdr == NULL) + ret_size = 0; + else + ret_size = note_seg_hdr->size; + *(uint32_t *)info_buff = ret_size; + return I40E_SUCCESS; + } + + /* get global note */ + if (type == RTE_PMD_I40E_PKG_INFO_GLOBAL_NOTES) { + if (note_seg_hdr == NULL) + return -ENOTSUP; + if (info_size < note_seg_hdr->size) { + PMD_DRV_LOG(ERR, "Information buffer size is too small"); + return -EINVAL; + } + memcpy(info_buff, ¬e_seg_hdr[1], note_seg_hdr->size); + return I40E_SUCCESS; + }
[dpdk-dev] [PATCH v3 2/3] app/testpmd: enable ddp get info feature
This patch demostrates how to get information about dynamic device personalization profile. Command 'ddp get info (path_to_profile)' extracts and prints information about the given profile. Signed-off-by: Andrey Chilikin --- app/test-pmd/cmdline.c | 131 + 1 file changed, 131 insertions(+) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 0afac68c7..9e7858901 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -218,6 +218,9 @@ static void cmd_help_long_parsed(void *parsed_result, "ddp get list (port_id)\n" "Get ddp profile info list\n\n" + "ddp get info (profile_path)\n" + "Get ddp profile information.\n\n" + "show vf stats (port_id) (vf_id)\n" "Display a VF's statistics.\n\n" @@ -12904,6 +12907,133 @@ cmdline_parse_inst_t cmd_ddp_add = { }, }; +/* Get dynamic device personalization profile info */ +struct cmd_ddp_info_result { + cmdline_fixed_string_t ddp; + cmdline_fixed_string_t get; + cmdline_fixed_string_t info; + char filepath[]; +}; + +cmdline_parse_token_string_t cmd_ddp_info_ddp = + TOKEN_STRING_INITIALIZER(struct cmd_ddp_info_result, ddp, "ddp"); +cmdline_parse_token_string_t cmd_ddp_info_get = + TOKEN_STRING_INITIALIZER(struct cmd_ddp_info_result, get, "get"); +cmdline_parse_token_string_t cmd_ddp_info_info = + TOKEN_STRING_INITIALIZER(struct cmd_ddp_info_result, info, "info"); +cmdline_parse_token_string_t cmd_ddp_info_filepath = + TOKEN_STRING_INITIALIZER(struct cmd_ddp_info_result, filepath, NULL); + +static void +cmd_ddp_info_parsed( + void *parsed_result, + __attribute__((unused)) struct cmdline *cl, + __attribute__((unused)) void *data) +{ + struct cmd_ddp_info_result *res = parsed_result; + uint8_t *pkg; + uint32_t pkg_size; + int ret = -ENOTSUP; +#ifdef RTE_LIBRTE_I40E_PMD + uint32_t i; + uint8_t *buff; + uint32_t buff_size; + struct rte_pmd_i40e_profile_info info; + uint32_t dev_num; + struct rte_pmd_i40e_ddp_device_id *devs; +#endif + + pkg = open_ddp_package_file(res->filepath, &pkg_size); + if (!pkg) + return; + +#ifdef RTE_LIBRTE_I40E_PMD + ret = rte_pmd_i40e_get_ddp_info(pkg, pkg_size, + (uint8_t *)&info, sizeof(info), + RTE_PMD_I40E_PKG_INFO_GLOBAL_HEADER); + if (!ret) { + printf("Global Track id: 0x%x\n", info.track_id); + printf("Global Version:%d.%d.%d.%d\n", + info.version.major, + info.version.minor, + info.version.update, + info.version.draft); + printf("Global Package name: %s\n\n", info.name); + } + + ret = rte_pmd_i40e_get_ddp_info(pkg, pkg_size, + (uint8_t *)&info, sizeof(info), + RTE_PMD_I40E_PKG_INFO_HEADER); + if (!ret) { + printf("i40e Profile Track id: 0x%x\n", info.track_id); + printf("i40e Profile Version: %d.%d.%d.%d\n", + info.version.major, + info.version.minor, + info.version.update, + info.version.draft); + printf("i40e Profile name: %s\n\n", info.name); + } + + ret = rte_pmd_i40e_get_ddp_info(pkg, pkg_size, + (uint8_t *)&buff_size, sizeof(buff_size), + RTE_PMD_I40E_PKG_INFO_GLOBAL_NOTES_SIZE); + if (!ret && buff_size) { + buff = (uint8_t *)malloc(buff_size); + if (buff) { + ret = rte_pmd_i40e_get_ddp_info(pkg, pkg_size, + buff, buff_size, + RTE_PMD_I40E_PKG_INFO_GLOBAL_NOTES); + if (!ret) + printf("Package Notes:\n%s\n\n", buff); + free(buff); + } + } + + ret = rte_pmd_i40e_get_ddp_info(pkg, pkg_size, + (uint8_t *)&dev_num, sizeof(dev_num), + RTE_PMD_I40E_PKG_INFO_DEVID_NUM); + if (!ret && dev_num) { + devs = (struct rte_pmd_i40e_ddp_device_id *)malloc(dev_num * + sizeof(struct rte_pmd_i40e_ddp_device_id)); + if (devs) { + ret = rte_pmd_i40e_get_ddp_info(pkg, pkg_size, + (uint8_t *)devs, dev_num * + sizeof(struct rte_pmd_i40e_ddp_device_id), + RTE_PMD_
Re: [dpdk-dev] [PATCH 01/38] eal: add support for 24 40 and 48 bit operations
16/06/2017 11:21, Shreyansh Jain: > Hi Bruce, > > From: Bruce Richardson [mailto:bruce.richard...@intel.com] > > On Fri, Jun 16, 2017 at 11:10:31AM +0530, Shreyansh Jain wrote: > > > From: Hemant Agrawal > > > > > > Bit Swap and LE<=>BE conversions for 23, 40 and 48 bit width > > > > > > Signed-off-by: Hemant Agrawal > > > --- > > > .../common/include/generic/rte_byteorder.h | 78 > > ++ > > > 1 file changed, 78 insertions(+) > > > > > Are these really common enough for inclusion in an generic EAL file? > > Would they be better being driver specific, so that we don't end up with > > lots of extra byte-swap routines for each possible size used by a > > driver. > > Reasoning was to keep all bit/byte swap at a single place and if it is > useful for others. > > From DPAA perspective, these macro can be anywhere. In case someone else too > has use of this (now or in near-future), probably then we can consider this > in EAL. > Else, if I don't get much responses in a few days, I will shift them to > DPAA driver in next version of this series. I prefer keeping common functions in a common place. So I like this patch.
Re: [dpdk-dev] [PATCH 01/38] eal: add support for 24 40 and 48 bit operations
Hi Shreyansh, On Fri, Jun 16, 2017 at 09:21:35AM +, Shreyansh Jain wrote: > Hi Bruce, > > > -Original Message- > > From: Bruce Richardson [mailto:bruce.richard...@intel.com] > > Sent: Friday, June 16, 2017 2:27 PM > > To: Shreyansh Jain > > Cc: dev@dpdk.org; ferruh.yi...@intel.com; Hemant Agrawal > > > > Subject: Re: [dpdk-dev] [PATCH 01/38] eal: add support for 24 40 and 48 bit > > operations > > > > On Fri, Jun 16, 2017 at 11:10:31AM +0530, Shreyansh Jain wrote: > > > From: Hemant Agrawal > > > > > > Bit Swap and LE<=>BE conversions for 23, 40 and 48 bit width > > > > > > Signed-off-by: Hemant Agrawal > > > --- > > > .../common/include/generic/rte_byteorder.h | 78 > > ++ > > > 1 file changed, 78 insertions(+) > > > > > Are these really common enough for inclusion in an generic EAL file? > > Would they be better being driver specific, so that we don't end up with > > lots of extra byte-swap routines for each possible size used by a > > driver. > > Reasoning was to keep all bit/byte swap at a single place and if it is > useful for others. > > From DPAA perspective, these macro can be anywhere. In case someone else too > has use of this (now or in near-future), probably then we can consider this > in EAL. > Else, if I don't get much responses in a few days, I will shift them to > DPAA driver in next version of this series. While I'm not against exposing exotic byte swapping functions, they are not completely safe and I'm not sure they should be part of public header files on that basis. Problem is their storage size is larger than the number of bytes they deal with, which raises the question: are filler bytes prepended or appended to the converted value? How about input values in non-native order? Answering that is not so easy as it depends on the use case. We actually had a similar issue when defining VXLAN's VNI field for rte_flow, which is 24-bit in network order. Take rte_constant_bswap48() for instance, assuming input value is little-endian, output is supposed to be big-endian. While the shifts are correct, filler bytes are not in the right place for a big-endian system, and the resulting value stored on uint64_t cannot be used as-is. Again, that depends on the use case, it could be correct if the resulting value was to be used as is on a little-endian system. I think the only safe way to deal with that is by defining specific types of the proper size, e.g.: typedef uint8_t uint48_t[6]; These are cumbersome and cannot be used like normal integers though. With such types, byte-swapping functions become meaningless. Since these are supposed to be rather simple functions, I'm not sure handling/documenting all this complexity in rte_byteorder.h makes sense. -- Adrien Mazarguil 6WIND
[dpdk-dev] [PATCH] vhost: added error log in vhost_user_set_features
Since vhost_user_set_features failure is not handled in any way, a single error log has been added to at least to let the user know that something has gone wrong. Signed-off-by: Dariusz Stojaczyk --- lib/librte_vhost/vhost_user.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c index 5c8058b..b3aff90 100644 --- a/lib/librte_vhost/vhost_user.c +++ b/lib/librte_vhost/vhost_user.c @@ -164,8 +164,12 @@ vhost_user_set_features(struct virtio_net *dev, uint64_t features) uint64_t vhost_features = 0; rte_vhost_driver_get_features(dev->ifname, &vhost_features); - if (features & ~vhost_features) + if (features & ~vhost_features) { + RTE_LOG(ERR, VHOST_CONFIG, + "(%d) received invalid negotiated features.\n", + dev->vid); return -1; + } if ((dev->flags & VIRTIO_DEV_RUNNING) && dev->features != features) { if (dev->notify_ops->features_changed) -- 2.7.4
[dpdk-dev] [PATCH] net/mlx4: fix assertion failure on link update
The interrupt handler can sometimes be triggered for reasons other than a link status event. An assertion failure happen when such events occur while an asynchronous link status update is already scheduled. Address this issue using the same approach as its mlx5 counterpart, commit a9f2fbc42f0c ("net/mlx5: fix inconsistent link status") Fixes: c4da6caa426d ("mlx4: handle link status interrupts") Cc: sta...@dpdk.org Signed-off-by: Adrien Mazarguil Acked-by: Nelio Laranjeiro --- drivers/net/mlx4/mlx4.c | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c index 050d646..57b52ac 100644 --- a/drivers/net/mlx4/mlx4.c +++ b/drivers/net/mlx4/mlx4.c @@ -5343,6 +5343,7 @@ priv_dev_status_handler(struct priv *priv, struct rte_eth_dev *dev, { struct ibv_async_event event; int port_change = 0; + struct rte_eth_link *link = &dev->data->dev_link; int ret = 0; *events = 0; @@ -5364,22 +5365,20 @@ priv_dev_status_handler(struct priv *priv, struct rte_eth_dev *dev, event.event_type, event.element.port_num); ibv_ack_async_event(&event); } - - if (port_change ^ priv->pending_alarm) { - struct rte_eth_link *link = &dev->data->dev_link; - - priv->pending_alarm = 0; - mlx4_link_update(dev, 0); - if (((link->link_speed == 0) && link->link_status) || - ((link->link_speed != 0) && !link->link_status)) { + if (!port_change) + return ret; + mlx4_link_update(dev, 0); + if (((link->link_speed == 0) && link->link_status) || + ((link->link_speed != 0) && !link->link_status)) { + if (!priv->pending_alarm) { /* Inconsistent status, check again later. */ priv->pending_alarm = 1; rte_eal_alarm_set(MLX4_ALARM_TIMEOUT_US, mlx4_dev_link_status_handler, dev); - } else { - *events |= (1 << RTE_ETH_EVENT_INTR_LSC); } + } else { + *events |= (1 << RTE_ETH_EVENT_INTR_LSC); } return ret; } @@ -5400,6 +5399,7 @@ mlx4_dev_link_status_handler(void *arg) priv_lock(priv); assert(priv->pending_alarm == 1); + priv->pending_alarm = 0; ret = priv_dev_status_handler(priv, dev, &events); priv_unlock(priv); if (ret > 0 && events & (1 << RTE_ETH_EVENT_INTR_LSC)) -- 2.1.4
Re: [dpdk-dev] [PATCH] vhost: added error log in vhost_user_set_features
On Fri, Jun 16, 2017 at 04:32:05PM +0200, Dariusz Stojaczyk wrote: > Since vhost_user_set_features failure is not handled in any way, a > single error log has been added to at least to let the user know that > something has gone wrong. Indeed. And applied to dpdk-next-virtio. Thanks. --yliu
Re: [dpdk-dev] [PATCH] doc: add testpmd commands for DDP
On 6/16/2017 3:43 AM, Beilei Xing wrote: > Add testpmd commands for loading DDP package and getting > loaded DDP info list. > > Signed-off-by: Beilei Xing Applied to dpdk-next-net/master, thanks. (Abbreviation DDP extended in document)
[dpdk-dev] [git pull] virtio changes for 17.08-rc1
Hi Thomas, Please consider pulling following virtio changes for 17.08-rc1 at git://dpdk.org/next/dpdk-next-virtiomaster Thanks. --yliu --- Daniel Verkamp (1): vhost: access VhostUsrMsg via packed struct Dariusz Stojaczyk (3): vhost: fix malloc size too small vhost: fix memory leak vhost: added error log for badly negotiated features Jens Freimann (1): vhost: check malloc return value when allocating guest pages Tiwei Bie (1): net/virtio: zero the whole memory zone Yuanhan Liu (1): vhost: fix crash on NUMA Zhihong Wang (1): vhost: support rx_queue_count --- drivers/net/vhost/rte_eth_vhost.c | 13 drivers/net/virtio/virtio_ethdev.c | 2 +- lib/librte_vhost/rte_vhost.h | 12 +++ lib/librte_vhost/rte_vhost_version.map | 7 + lib/librte_vhost/vhost.c | 28 - lib/librte_vhost/vhost_user.c | 81 +--- 6 files changed, 109 insertions(+), 34 deletions(-)
Re: [dpdk-dev] mlx debug build error with clang
Hi Ferruh, On Tue, Jun 13, 2017 at 04:32:03PM +0100, Ferruh Yigit wrote: > Hi Adrien, Nelio, > > I am getting following build error [1] with clang [2] when debug enabled > for mlx4 and mlx5. > > This started after I update my box, not sure what triggered this. > Can you please help fixing this? > > Thanks, > ferruh > > > [1] > > .../drivers/net/mlx4/mlx4_flow.c:731:3: error: use of GNU statement > expression extension [-Werror,-Wgnu-statement-expression] > claim_zero(ibv_destroy_qp(fdq->qp)); > ^ > .../drivers/net/mlx4/mlx4.h:185:25: note: expanded from macro 'claim_zero' > #define claim_zero(...) assert((__VA_ARGS__) == 0) > ^ > /usr/include/assert.h:95:6: note: expanded from macro 'assert' > ({ \ > ^ > > > > .../drivers/net/mlx5/mlx5_fdir.c:278:2: error: use of GNU statement > expression extension [-Werror,-Wgnu-statement-expression] > assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec_offset); > ^ > /usr/include/assert.h:95:6: note: expanded from macro 'assert' > ({ \ > ^ > > [Many of similar ...] > > > [2] > target: x86_64-native-linuxapp-clang > > clang version 4.0.0 (tags/RELEASE_400/final) Recent Glibc versions now apparently handle assert() through a nonstandard ({ }) construct, which is not pedantic-safe due to a missing __extension__ keyword. assert.h still provides a standard assert() definition that shouldn't cause compilation to fail when the following condition is met: #if !defined __GNUC__ || defined __STRICT_ANSI__ However __GNUC__ is (always?) defined by clang for maximum compatibility with GCC while __STRICT_ANSI__ is not due to the -std=gnu99 parameter, assert.h thinks it's OK to use a ({ }) construct but is then caught by clang's -pedantic parameter. There are two ways to address this issue while keeping our beloved -pedantic parameter in debug mode: 1. Replacing -std=gnu99 with -std=c99 (which is even stricter) to bring back __STRICT_ANSI__. 2. Replacing assert() statements with RTE_ASSERT(). The former should be doable now that DPDK includes have been cleaned up, and we're thinking about doing the latter at some point. Since I don't have a recent Glibc handy, can you try replacing -std=gnu99 with -std=c99 in both Makefiles (mlx4 and mlx5), and report how GCC and clang fare? (GCC at least seems to have no problem with that on my side) -- Adrien Mazarguil 6WIND
Re: [dpdk-dev] [PATCH v3 0/3] net/i40e: get information about ddp profile
On 6/16/2017 10:25 AM, Andrey Chilikin wrote: > This patch adds ability to request information about dynamic device > personalization profile > > v3: > Add command description to test-pmd guide > > v2: > Add rte_pmd_i40e_get_ddp_info() function to i40e version map > Add CL to test-pmd for getting information about profile > > Andrey Chilikin (3): > net/i40e: get information about ddp profile > app/testpmd: enable ddp get info feature > doc: add new ddp get info command Series applied to dpdk-next-net/master, thanks. - Kept ACK from Beilei - Merged 3/3 into 2/3
Re: [dpdk-dev] mlx debug build error with clang
On 6/16/2017 1:19 PM, Adrien Mazarguil wrote: > Hi Ferruh, > > On Tue, Jun 13, 2017 at 04:32:03PM +0100, Ferruh Yigit wrote: >> Hi Adrien, Nelio, >> >> I am getting following build error [1] with clang [2] when debug enabled >> for mlx4 and mlx5. >> >> This started after I update my box, not sure what triggered this. >> Can you please help fixing this? >> >> Thanks, >> ferruh >> >> >> [1] >> >> .../drivers/net/mlx4/mlx4_flow.c:731:3: error: use of GNU statement >> expression extension [-Werror,-Wgnu-statement-expression] >> claim_zero(ibv_destroy_qp(fdq->qp)); >> ^ >> .../drivers/net/mlx4/mlx4.h:185:25: note: expanded from macro 'claim_zero' >> #define claim_zero(...) assert((__VA_ARGS__) == 0) >> ^ >> /usr/include/assert.h:95:6: note: expanded from macro 'assert' >> ({ \ >> ^ >> >> >> >> .../drivers/net/mlx5/mlx5_fdir.c:278:2: error: use of GNU statement >> expression extension [-Werror,-Wgnu-statement-expression] >> assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec_offset); >> ^ >> /usr/include/assert.h:95:6: note: expanded from macro 'assert' >> ({ \ >> ^ >> >> [Many of similar ...] >> >> >> [2] >> target: x86_64-native-linuxapp-clang >> >> clang version 4.0.0 (tags/RELEASE_400/final) > > Recent Glibc versions now apparently handle assert() through a nonstandard > ({ }) construct, which is not pedantic-safe due to a missing __extension__ > keyword. > > assert.h still provides a standard assert() definition that shouldn't cause > compilation to fail when the following condition is met: > > #if !defined __GNUC__ || defined __STRICT_ANSI__ > > However __GNUC__ is (always?) defined by clang for maximum compatibility > with GCC while __STRICT_ANSI__ is not due to the -std=gnu99 parameter, > assert.h thinks it's OK to use a ({ }) construct but is then caught by > clang's -pedantic parameter. > > There are two ways to address this issue while keeping our beloved -pedantic > parameter in debug mode: > > 1. Replacing -std=gnu99 with -std=c99 (which is even stricter) to bring back >__STRICT_ANSI__. > 2. Replacing assert() statements with RTE_ASSERT(). > > The former should be doable now that DPDK includes have been cleaned up, and > we're thinking about doing the latter at some point. > > Since I don't have a recent Glibc handy, can you try replacing -std=gnu99 > with -std=c99 in both Makefiles (mlx4 and mlx5), and report how GCC and > clang fare? (GCC at least seems to have no problem with that on my side) Sure, I will try -std=c99 >
Re: [dpdk-dev] [PATCH 2/7] net/mlx4: fix Rx interrupts with multiple ports
On 6/14/2017 12:49 PM, Adrien Mazarguil wrote: > Several Ethernet device structures are allocated on top of a common PCI > device for mlx4 adapters with multiple ports. These inherit a common > interrupt handle from their parent PCI device, which prevents Rx interrupts > from working properly on all ports as their configuration is overwritten. > > Use a local interrupt handle to address this issue. Hi Adrien, I am not clear why local copy required, and main concern from my point of view is if this is a common problem for all PMDs and should be addressed in higher level? The variable is already per eth_dev, but this patch moves it the private data. What overwrites it within eth_dev? Thanks, ferruh > > Fixes: 9f05a4b81809 ("net/mlx4: support user space Rx interrupt event") > > Signed-off-by: Adrien Mazarguil > Acked-by: Moti Haimovsky > --- > drivers/net/mlx4/mlx4.c | 9 + > drivers/net/mlx4/mlx4.h | 1 + > 2 files changed, 10 insertions(+) > > diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c > index 178562e..2b4722f 100644 > --- a/drivers/net/mlx4/mlx4.c > +++ b/drivers/net/mlx4/mlx4.c > @@ -6207,6 +6207,15 @@ mlx4_pci_probe(struct rte_pci_driver *pci_drv, struct > rte_pci_device *pci_dev) > > eth_dev->device->driver = &mlx4_driver.driver; > > + /* > + * Copy and override interrupt handle to prevent it from > + * being shared between all ethdev instances of a given PCI > + * device. This is required to properly handle Rx interrupts > + * on all ports. > + */ > + priv->intr_handle_dev = *eth_dev->intr_handle; > + eth_dev->intr_handle = &priv->intr_handle_dev; > + > priv->dev = eth_dev; > eth_dev->dev_ops = &mlx4_dev_ops; > > diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h > index c46fc23..b74fbf8 100644 > --- a/drivers/net/mlx4/mlx4.h > +++ b/drivers/net/mlx4/mlx4.h > @@ -345,6 +345,7 @@ struct priv { > unsigned int txqs_n; /* TX queues array size. */ > struct rxq *(*rxqs)[]; /* RX queues. */ > struct txq *(*txqs)[]; /* TX queues. */ > + struct rte_intr_handle intr_handle_dev; /* Device interrupt handler. */ > struct rte_intr_handle intr_handle; /* Interrupt handler. */ > struct rte_flow_drop *flow_drop_queue; /* Flow drop queue. */ > LIST_HEAD(mlx4_flows, rte_flow) flows; >
Re: [dpdk-dev] [PATCH 2/7] net/mlx4: fix Rx interrupts with multiple ports
On Fri, Jun 16, 2017 at 02:07:54PM +0100, Ferruh Yigit wrote: > On 6/14/2017 12:49 PM, Adrien Mazarguil wrote: > > Several Ethernet device structures are allocated on top of a common PCI > > device for mlx4 adapters with multiple ports. These inherit a common > > interrupt handle from their parent PCI device, which prevents Rx interrupts > > from working properly on all ports as their configuration is overwritten. > > > > Use a local interrupt handle to address this issue. > > Hi Adrien, > > I am not clear why local copy required, and main concern from my point > of view is if this is a common problem for all PMDs and should be > addressed in higher level? This issue only affects PMDs that handle multiple Ethernet ports through a single PCI device. Such PMDs (like mlx4) identify themselves as PCI drivers that manually have to register multiple rte_eth_dev instances through rte_eth_dev_allocate(), which they then have to initialize. > The variable is already per eth_dev, but this patch moves it the private > data. What overwrites it within eth_dev? Calling rte_eth_copy_pci_info() makes the rte_eth_dev structure inherit the default interrupt handle of the underlying PCI device. By "inherit", I mean eth_dev->intr_handle points to it, in that sense it's not per eth_dev but per PCI device. mlx4 Rx interrupts are associated with a given Verbs context, and each port has its own Verbs context, so they cannot be shared, while other PMDs using other methods for catching interrupts may be perfectly fine with a single vector associated with the PCI device. It depends on the PMD, for instance there is no such problem with mlx5 as exactly one PCI device is associated with a given port. This patch merely allocates a specific interrupt handle associated with the eth_dev itself and makes the eth_dev handle point to that instead of the default PCI handle. This "local" handle is initialized using the PCI handle as a template before modifying the pointer. It's completely safe. > > Fixes: 9f05a4b81809 ("net/mlx4: support user space Rx interrupt event") > > > > Signed-off-by: Adrien Mazarguil > > Acked-by: Moti Haimovsky > > --- > > drivers/net/mlx4/mlx4.c | 9 + > > drivers/net/mlx4/mlx4.h | 1 + > > 2 files changed, 10 insertions(+) > > > > diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c > > index 178562e..2b4722f 100644 > > --- a/drivers/net/mlx4/mlx4.c > > +++ b/drivers/net/mlx4/mlx4.c > > @@ -6207,6 +6207,15 @@ mlx4_pci_probe(struct rte_pci_driver *pci_drv, > > struct rte_pci_device *pci_dev) > > > > eth_dev->device->driver = &mlx4_driver.driver; > > > > + /* > > +* Copy and override interrupt handle to prevent it from > > +* being shared between all ethdev instances of a given PCI > > +* device. This is required to properly handle Rx interrupts > > +* on all ports. > > +*/ > > + priv->intr_handle_dev = *eth_dev->intr_handle; > > + eth_dev->intr_handle = &priv->intr_handle_dev; > > + > > priv->dev = eth_dev; > > eth_dev->dev_ops = &mlx4_dev_ops; > > > > diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h > > index c46fc23..b74fbf8 100644 > > --- a/drivers/net/mlx4/mlx4.h > > +++ b/drivers/net/mlx4/mlx4.h > > @@ -345,6 +345,7 @@ struct priv { > > unsigned int txqs_n; /* TX queues array size. */ > > struct rxq *(*rxqs)[]; /* RX queues. */ > > struct txq *(*txqs)[]; /* TX queues. */ > > + struct rte_intr_handle intr_handle_dev; /* Device interrupt handler. */ > > struct rte_intr_handle intr_handle; /* Interrupt handler. */ > > struct rte_flow_drop *flow_drop_queue; /* Flow drop queue. */ > > LIST_HEAD(mlx4_flows, rte_flow) flows; > > > -- Adrien Mazarguil 6WIND
Re: [dpdk-dev] [PATCH 2/7] net/mlx4: fix Rx interrupts with multiple ports
On 6/16/2017 2:39 PM, Adrien Mazarguil wrote: > On Fri, Jun 16, 2017 at 02:07:54PM +0100, Ferruh Yigit wrote: >> On 6/14/2017 12:49 PM, Adrien Mazarguil wrote: >>> Several Ethernet device structures are allocated on top of a common PCI >>> device for mlx4 adapters with multiple ports. These inherit a common >>> interrupt handle from their parent PCI device, which prevents Rx interrupts >>> from working properly on all ports as their configuration is overwritten. >>> >>> Use a local interrupt handle to address this issue. >> >> Hi Adrien, >> >> I am not clear why local copy required, and main concern from my point >> of view is if this is a common problem for all PMDs and should be >> addressed in higher level? > > This issue only affects PMDs that handle multiple Ethernet ports through a > single PCI device. Such PMDs (like mlx4) identify themselves as PCI drivers > that manually have to register multiple rte_eth_dev instances through > rte_eth_dev_allocate(), which they then have to initialize. > >> The variable is already per eth_dev, but this patch moves it the private >> data. What overwrites it within eth_dev? > > Calling rte_eth_copy_pci_info() makes the rte_eth_dev structure inherit the > default interrupt handle of the underlying PCI device. By "inherit", I mean > eth_dev->intr_handle points to it, in that sense it's not per eth_dev but > per PCI device. > > mlx4 Rx interrupts are associated with a given Verbs context, and each port > has its own Verbs context, so they cannot be shared, while other PMDs using > other methods for catching interrupts may be perfectly fine with a single > vector associated with the PCI device. It depends on the PMD, for instance > there is no such problem with mlx5 as exactly one PCI device is associated > with a given port. > > This patch merely allocates a specific interrupt handle associated with the > eth_dev itself and makes the eth_dev handle point to that instead of the > default PCI handle. This "local" handle is initialized using the PCI handle > as a template before modifying the pointer. It's completely safe. Thanks for clarification. > >>> Fixes: 9f05a4b81809 ("net/mlx4: support user space Rx interrupt event") >>> >>> Signed-off-by: Adrien Mazarguil >>> Acked-by: Moti Haimovsky >>> --- >>> drivers/net/mlx4/mlx4.c | 9 + >>> drivers/net/mlx4/mlx4.h | 1 + >>> 2 files changed, 10 insertions(+) >>> >>> diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c >>> index 178562e..2b4722f 100644 >>> --- a/drivers/net/mlx4/mlx4.c >>> +++ b/drivers/net/mlx4/mlx4.c >>> @@ -6207,6 +6207,15 @@ mlx4_pci_probe(struct rte_pci_driver *pci_drv, >>> struct rte_pci_device *pci_dev) >>> >>> eth_dev->device->driver = &mlx4_driver.driver; >>> >>> + /* >>> +* Copy and override interrupt handle to prevent it from >>> +* being shared between all ethdev instances of a given PCI >>> +* device. This is required to properly handle Rx interrupts >>> +* on all ports. >>> +*/ >>> + priv->intr_handle_dev = *eth_dev->intr_handle; >>> + eth_dev->intr_handle = &priv->intr_handle_dev; >>> + >>> priv->dev = eth_dev; >>> eth_dev->dev_ops = &mlx4_dev_ops; >>> >>> diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h >>> index c46fc23..b74fbf8 100644 >>> --- a/drivers/net/mlx4/mlx4.h >>> +++ b/drivers/net/mlx4/mlx4.h >>> @@ -345,6 +345,7 @@ struct priv { >>> unsigned int txqs_n; /* TX queues array size. */ >>> struct rxq *(*rxqs)[]; /* RX queues. */ >>> struct txq *(*txqs)[]; /* TX queues. */ >>> + struct rte_intr_handle intr_handle_dev; /* Device interrupt handler. */ >>> struct rte_intr_handle intr_handle; /* Interrupt handler. */ >>> struct rte_flow_drop *flow_drop_queue; /* Flow drop queue. */ >>> LIST_HEAD(mlx4_flows, rte_flow) flows; >>> >> >
Re: [dpdk-dev] [PATCH 0/7] Fix Rx interrupt support in mlx4 and mlx5
On 6/14/2017 12:49 PM, Adrien Mazarguil wrote: > Several issues mainly related to error handling were found in both > implementations as they share most of their Rx interrupts handling code. > > Another problem with the mlx4 implementation is that it does not work > properly with multiple ports adapters that share a common PCI device. > > Adrien Mazarguil (7): > net/mlx4: fix typos from prior commit > net/mlx4: fix Rx interrupts with multiple ports > net/mlx4: fix Rx interrupts management > net/mlx5: fix misplaced Rx interrupts functions > net/mlx5: fix Rx interrupts support checks > net/mlx5: fix return value in Rx interrupts code > net/mlx5: fix Rx interrupts management > > drivers/net/mlx4/mlx4.c | 179 --- > drivers/net/mlx4/mlx4.h | 1 + > drivers/net/mlx5/mlx5.c | 2 + > drivers/net/mlx5/mlx5_rxq.c | 142 ++- > drivers/net/mlx5/mlx5_rxtx.c| 73 -- > drivers/net/mlx5/mlx5_rxtx.h| 12 +-- > drivers/net/mlx5/mlx5_trigger.c | 16 ++-- > 7 files changed, 194 insertions(+), 231 deletions(-) Series applied to dpdk-next-net/master, thanks.
Re: [dpdk-dev] [PATCH] app/testpmd: fix build without ixgbe
16/06/2017 03:29, Lu, Wenzhuo: > From: Nicolau, Radu > > > > Looks ok to me, but why would one enable IXGBE_BYPASS without enabling > > IXGBE? > > Although the scenario doesn't make sense, I think it's better to add more > check to avoid the users making any mistake. Applied
Re: [dpdk-dev] [PATCH v2 1/3] eal: introduce big and little endian types
Series applied, thanks
Re: [dpdk-dev] [PATCH] eal: remove vdev probe by dev args
Please Jan, could you comment? 09/06/2017 11:21, Ferruh Yigit: > On 6/8/2017 9:45 PM, Thomas Monjalon wrote: > > 10/05/2017 13:01, Ferruh Yigit: > >> Virtual device/driver probing done via name. > >> > >> A new alternative method introduced to probe the device with providing > >> driver name in devargs as "driver=". > >> > >> This patch removes alternative method and fixes virtual device usages > >> with proper device names. > >> > >> Fixes: 87c3bf29c642 ("test: do not short-circuit null device creation") > >> Fixes: d39670086a63 ("eal: parse driver argument before probing drivers") > >> > >> Signed-off-by: Ferruh Yigit > >> --- > > [...] > >> static int > >> vdev_probe_all_drivers(struct rte_vdev_device *dev) > >> { > >>const char *name; > >> - char *drv_name; > >>struct rte_vdev_driver *driver; > >>int ret = 1; > >> > >> - drv_name = parse_driver_arg(rte_vdev_device_args(dev)); > >> - name = drv_name ? drv_name : rte_vdev_device_name(dev); > >> + name = rte_vdev_device_name(dev); > > > > It seems you are reverting the commit d39670086a63: > > eal: parse driver argument before probing drivers > > Mostly yes. > > > > > In some cases the virtual device name should be totally different than > > the driver being used for the device. Therefore lets parse the devargs > > for > > the "driver" argument before probing drivers in > > vdev_probe_all_drivers(). > > > > Is this "driver" option useless? > > There is already a generic way to probe, why need another method, I > didn't get the motivation but this looks like a hack, only place I found > used is in bonding unit test, which can be switched to generic way > without this support. > > And this is a hidden / an undocumented feature. > > Thanks, > ferruh >
Re: [dpdk-dev] [PATCH] mem: support page locking on FreeBSD
On Thu, Jun 15, 2017 at 07:50:55PM +0200, Thomas Monjalon wrote: > The function rte_mem_lock_page() was added for Linux only. > The file eal_common_memory.c is a better place to make it > available in FreeBSD also. > > The issue is seen when trying to compile bnxt on FreeBSD: > bnxt_hwrm.c: undefined reference to `rte_mem_lock_page' > > Fixes: 3097de6e6bfb ("mem: get physical address of any pointer") > > Reported-by: Fangfang Wei > Signed-off-by: Thomas Monjalon > --- This fixes gcc and clang errors on FreeBSD so Acked-by: Bruce Richardson
Re: [dpdk-dev] [PATCH] net/bonding: support bifurcated driver in eal cli using --vdev
14/06/2017 12:49, Gowrishankar: > At present, creating bonding devices using --vdev is broken for PMD like > mlx5 as it is neither UIO nor VFIO based and hence PMD driver is unknown > to find_port_id_by_pci_addr(), as below. > > testpmd --vdev 'net_bonding0,mode=1,slave=,socket_id=0' > > PMD: bond_ethdev_parse_slave_port_kvarg(150) - Invalid slave port value > () specified > EAL: Failed to parse slave ports for bonded device net_bonding0 Thanks for reporting. > This patch adds RTE_KDRV_BIFURCATED in rte_kernel_driver for the PMD > driver like mlx5 to be a known one. The current kdrv value should be RTE_KDRV_UNKNOWN for Mellanox devices. I do not see the value of creating a new type. I think the issue is in the bonding code (find_port_id_by_pci_addr): * TODO: Once the PCI bus has arrived we should have a better * way to test for being a PCI device or not. */ if (rte_eth_devices[i].data->kdrv == RTE_KDRV_UNKNOWN || rte_eth_devices[i].data->kdrv == RTE_KDRV_NONE) continue;
Re: [dpdk-dev] [PATCH v2] test/test_mbuf: remove mempool global var
On Thu, 8 Jun 2017 14:28:31 +, Santosh Shukla wrote: > Let test_mbuf alloc and free mempool. > > Cc: sta...@dpdk.org > Signed-off-by: Santosh Shukla Acked-by: Olivier Matz
Re: [dpdk-dev] mlx debug build error with clang
On 6/16/2017 1:58 PM, Ferruh Yigit wrote: > On 6/16/2017 1:19 PM, Adrien Mazarguil wrote: >> Hi Ferruh, >> >> On Tue, Jun 13, 2017 at 04:32:03PM +0100, Ferruh Yigit wrote: >>> Hi Adrien, Nelio, >>> >>> I am getting following build error [1] with clang [2] when debug enabled >>> for mlx4 and mlx5. >>> >>> This started after I update my box, not sure what triggered this. >>> Can you please help fixing this? >>> >>> Thanks, >>> ferruh >>> >>> >>> [1] >>> >>> .../drivers/net/mlx4/mlx4_flow.c:731:3: error: use of GNU statement >>> expression extension [-Werror,-Wgnu-statement-expression] >>> claim_zero(ibv_destroy_qp(fdq->qp)); >>> ^ >>> .../drivers/net/mlx4/mlx4.h:185:25: note: expanded from macro 'claim_zero' >>> #define claim_zero(...) assert((__VA_ARGS__) == 0) >>> ^ >>> /usr/include/assert.h:95:6: note: expanded from macro 'assert' >>> ({ \ >>> ^ >>> >>> >>> >>> .../drivers/net/mlx5/mlx5_fdir.c:278:2: error: use of GNU statement >>> expression extension [-Werror,-Wgnu-statement-expression] >>> assert(((uint8_t *)attr + sizeof(*attr)) == (uint8_t *)spec_offset); >>> ^ >>> /usr/include/assert.h:95:6: note: expanded from macro 'assert' >>> ({ \ >>> ^ >>> >>> [Many of similar ...] >>> >>> >>> [2] >>> target: x86_64-native-linuxapp-clang >>> >>> clang version 4.0.0 (tags/RELEASE_400/final) >> >> Recent Glibc versions now apparently handle assert() through a nonstandard >> ({ }) construct, which is not pedantic-safe due to a missing __extension__ >> keyword. >> >> assert.h still provides a standard assert() definition that shouldn't cause >> compilation to fail when the following condition is met: >> >> #if !defined __GNUC__ || defined __STRICT_ANSI__ >> >> However __GNUC__ is (always?) defined by clang for maximum compatibility >> with GCC while __STRICT_ANSI__ is not due to the -std=gnu99 parameter, >> assert.h thinks it's OK to use a ({ }) construct but is then caught by >> clang's -pedantic parameter. >> >> There are two ways to address this issue while keeping our beloved -pedantic >> parameter in debug mode: >> >> 1. Replacing -std=gnu99 with -std=c99 (which is even stricter) to bring back >>__STRICT_ANSI__. >> 2. Replacing assert() statements with RTE_ASSERT(). >> >> The former should be doable now that DPDK includes have been cleaned up, and >> we're thinking about doing the latter at some point. >> >> Since I don't have a recent Glibc handy, can you try replacing -std=gnu99 >> with -std=c99 in both Makefiles (mlx4 and mlx5), and report how GCC and >> clang fare? (GCC at least seems to have no problem with that on my side) > > Sure, I will try -std=c99 This fixes the build error. > >> >
Re: [dpdk-dev] [PATCH] mem: support page locking on FreeBSD
16/06/2017 16:31, Bruce Richardson: > On Thu, Jun 15, 2017 at 07:50:55PM +0200, Thomas Monjalon wrote: > > The function rte_mem_lock_page() was added for Linux only. > > The file eal_common_memory.c is a better place to make it > > available in FreeBSD also. > > > > The issue is seen when trying to compile bnxt on FreeBSD: > > bnxt_hwrm.c: undefined reference to `rte_mem_lock_page' > > > > Fixes: 3097de6e6bfb ("mem: get physical address of any pointer") > > > > Reported-by: Fangfang Wei > > Signed-off-by: Thomas Monjalon > > --- > This fixes gcc and clang errors on FreeBSD so > > Acked-by: Bruce Richardson Applied
Re: [dpdk-dev] [RFC] Kernel Control Path (KCP)
Hi Alex, On 6/15/2017 1:07 PM, Alex Rosenbaum wrote: > please excuse me if I missed out of the previous conversation and > asking these questions again... > > Why create a new driver instead of improving the existing KNI driver? For control path, KNI uses Linux kernel driver within KNI kernel module. This method works, but may not be best option, and technically not extendable for some drivers. KNI control path currently supports only two drivers, proposed KCP works for all PMDs by default. Overall, KCP is outcome of the effort of improving KNI control path. Initial proposal was (a year ago I guess) introducing two new modules, one for control path and one for data path, and replace KNI completely. But current target is have KCP to have better control path support. Also, KNI handles both data and control path. But both are different functionalities and not need to be in some module. For example an application may not need exception data path to kernel, but may be interested in controlling DPDK interfaces via common Linux tools. > Can you share a table of the differences between the two driver / > approaches [KNI vs KCP]? KCP differences against KNI: - KCP is only for control path - Linux virtual interfaces created automatically, without DPDK application modification. - To create/destroy interfaces KCP uses rtnl, KNI uses ioctl. So technically it is possible to use "ip" tool to create / destroy interfaces supported by KCP. - KCP kernel module and userspace counterpart communicates via netlink, KNI uses ioctl. - KCP works for all PMDs without update on PMDs. > > Why do you want to remove features like data path that is provided by KNI > today? There is not intention to remove exception data path, the focus is to improve KNI. > > thanks, > Alex >
Re: [dpdk-dev] [RFC] Kernel Control Path (KCP)
On 5/28/2017 5:55 PM, Wiles, Keith wrote: > >> On May 26, 2017, at 11:52 AM, Ferruh Yigit wrote: >> >> We are looking for re-sending [1] the Kernel Control Path (KCP) >> with some updates [2]. >> >> Mainly this is an usability improvement for DPDK. >> >> And a quick reminder about what KCP is: >> >> "KCP is Linux virtual network interface that can control DPDK ports". >> >> So DPDK interfaces, somehow will be visible and it will be possible to >> use common Linux tools on DPDK interfaces. >> >> This work can be done in multiple steps: >> >> - At first step virtual interfaces can be read-only, and can be used >> to get stats / information from DPDK ports. >> >> - Second step can be controlling the DPDK interfaces in a common way >> like Linux interfaces. >> >> It is good to remind that KCP is only for control path, and no data >> traffic will be available on those interfaces, meaning not able to use >> tcpdump or similar tools on those interfaces. >> >> I would like to hear about comments, requirements and objection about >> the idea? >> >> Also the name "Kernel Control Path" can be too broad, I am open to a >> name change, any comments on naming is welcome. > > Using kernel in the name is not very useful, but netlink is the real part > that makes sense. > > How about one of these: > - DNI = DPDK Netlink Interface > - DNC = DPDK Netlink Control > - NCI = Netlink Control Interface > > Being able to control DPDK interfaces via Netlink is one of the customer > needs I have heard of late. My concern is this name my create a miss understanding that DPDK is providing a netlink interface for other applications that they can use to control DPDK application / interfaces. Here although netlink sockets used to communicate between kernel and userspace, DPDK application connects to the netlink socket provided by kernel module, and DPDK interfaces controlled using virtual Linux network interfaces, independent from what kind of communication method used between kernel and userspace. > >> >> >> [1] >> http://dpdk.org/ml/archives/dev/2016-March/035139.html >> >> [2] >> Updates planned to the latest version sent: >> - Create control interfaces without requiring an API call from user >> application, this will let DPDK applications have this support >> without any modification. >> - Default enabled interfaces will be read-only. >> - Possible rename. >> >> >> Thanks, >> ferruh >> >> >> Ferruh Yigit (4): >> ethtool: move from sample folder to lib folder >> kcp: add kernel control path kernel module >> rte_ctrl_if: add control interface library >> ethdev: add control interface support >> >> -- >> 2.9.3 >> > > Regards, > Keith >
Re: [dpdk-dev] [PATCH v2 1/2] net/e1000: fix flex type filter
On 6/16/2017 6:04 AM, Wei Zhao wrote: > Fix bug in parse the flex info of wrong use local > variable index which will cause filter fail, and some > error in calculation mask. > > Fixes: 7cd77faf7129 ("net/igb: parse flow API flex filter") > > Signed-off-by: Wei Zhao Series applied to dpdk-next-net/master, thanks.
Re: [dpdk-dev] how to poll a 40G/100G NIC with multiqueues without RSS
With XL710 NICs you can use flexible payload for RSS for ETH_RSS_L2_PAYLOAD. By default XL710 PMD will extract first 16 bytes of the payload. You can include these 16 bytes for RSS with rte_eth_dev_filter_ctrl(...,RTE_ETH_FILTER_HASH, RTE_ETH_FILTER_SET, ...) /Andrey -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Chillance Zen Sent: Friday, June 9, 2017 7:48 AM To: dev@dpdk.org Subject: [dpdk-dev] how to poll a 40G/100G NIC with multiqueues without RSS Hi, everyone, since I need to realize my own protocol with which RSS can not be configured. but we need massive packets to be processed with multiple cpus(even that one cpu is always polling is not enough) does anyone know how to distribute traffic to multiple queues thanks®ards -- Linc @bjtu
Re: [dpdk-dev] [RFC] Kernel Control Path (KCP)
On Fri, 16 Jun 2017 16:27:47 +0100 Ferruh Yigit wrote: > Hi Alex, > > On 6/15/2017 1:07 PM, Alex Rosenbaum wrote: > > please excuse me if I missed out of the previous conversation and > > asking these questions again... > > > > Why create a new driver instead of improving the existing KNI driver? > > For control path, KNI uses Linux kernel driver within KNI kernel module. > This method works, but may not be best option, and technically not > extendable for some drivers. KNI control path currently supports only > two drivers, proposed KCP works for all PMDs by default. > > Overall, KCP is outcome of the effort of improving KNI control path. > > Initial proposal was (a year ago I guess) introducing two new modules, > one for control path and one for data path, and replace KNI completely. > But current target is have KCP to have better control path support. > > Also, KNI handles both data and control path. But both are different > functionalities and not need to be in some module. For example an > application may not need exception data path to kernel, but may be > interested in controlling DPDK interfaces via common Linux tools. > > > Can you share a table of the differences between the two driver / > > approaches [KNI vs KCP]? > > KCP differences against KNI: > > - KCP is only for control path > - Linux virtual interfaces created automatically, without DPDK > application modification. > - To create/destroy interfaces KCP uses rtnl, KNI uses ioctl. So > technically it is possible to use "ip" tool to create / destroy > interfaces supported by KCP. > - KCP kernel module and userspace counterpart communicates via netlink, > KNI uses ioctl. > - KCP works for all PMDs without update on PMDs. > > > > > Why do you want to remove features like data path that is provided by KNI > > today? > > There is not intention to remove exception data path, the focus is to > improve KNI. > > > > > thanks, > > Alex > > > Hopefully KCP can be submitted for upstream kernel, and therefore be supportable over the long term. KNI in its current form is not acceptable upstream for a number of reasons: style, use of ioctl, races with control operations, etc.