On 04/11/2019 09:54, Thomas Monjalon wrote:
> 04/11/2019 10:49, Ray Kinsella:
>> On 03/11/2019 22:41, Thomas Monjalon wrote:
>>> 03/11/2019 21:35, Ray Kinsella:
>>>> On 29/10/2019 14:27, Ferruh Yigit wrote:
>>>>> On 10/26/2019 5:23 PM, Thomas Monjalon wrote:
>>>>>> 26/10/2019 11:23, Wang, Haiyue:
>>>>>>> From: Thomas Monjalon [mailto:tho...@monjalon.net]
>>>>>>>> 26/10/2019 06:40, Wang, Haiyue:
>>>>>>>>> From: Thomas Monjalon [mailto:tho...@monjalon.net]
>>>>>>>>>> 25/10/2019 18:02, Jerin Jacob:
>>>>>>>>>>> On Fri, Oct 25, 2019 at 9:15 PM Thomas Monjalon
>>>>>>>>>>> <tho...@monjalon.net> wrote:
>>>>>>>>>>>> 25/10/2019 16:08, Ferruh Yigit:
>>>>>>>>>>>>> On 10/25/2019 10:36 AM, Thomas Monjalon wrote:
>>>>>>>>>>>>>> 15/10/2019 09:51, Haiyue Wang:
>>>>>>>>>>>>>>> Some PMDs have more than one Rx/Tx burst path. Add an ethdev
>>>>>>>>>>>>>>> API that allows an application to retrieve mode information
>>>>>>>>>>>>>>> about the Rx/Tx packet burst, such as Scalar or Vector, and
>>>>>>>>>>>>>>> the vector technology used, like AVX2.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I missed this patch. Andrew and I, the maintainers of ethdev,
>>>>>>>>>>>>>> were not CC'ed.
>>>>>>>>>>>>>> Ferruh, I would expect to be CC'ed and/or get a notification
>>>>>>>>>>>>>> before merging.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It was discussed on the mailing list and went through multiple
>>>>>>>>>>>>> rounds of discussion; the patch has been out since August. +1 to
>>>>>>>>>>>>> CC'ing all maintainers, I missed that part, but when a patch has
>>>>>>>>>>>>> been reviewed and there is no objection, why block the merge?
>>>>>>>>>>>>
>>>>>>>>>>>> I'm not saying we should block the merge.
>>>>>>>>>>>> My fault is that I missed the patch, and I am asking for help
>>>>>>>>>>>> with a notification in such cases. Same for Andrew, I guess.
>>>>>>>>>>>> Note: it is merged in master, and I am looking to improve this
>>>>>>>>>>>> feature.
>>>>>>>>>>>
>>>>>>>>>>>>>>> +/**
>>>>>>>>>>>>>>> + * Ethernet device RX/TX queue packet burst mode information
>>>>>>>>>>>>>>> structure.
>>>>>>>>>>>>>>> + * Used to retrieve information about packet burst mode
>>>>>>>>>>>>>>> setting.
>>>>>>>>>>>>>>> + */
>>>>>>>>>>>>>>> +struct rte_eth_burst_mode {
>>>>>>>>>>>>>>> + uint64_t options;
>>>>>>>>>>>>>>> +};
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Why a struct for an integer?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Again, this was at my request, so that we do not need to break
>>>>>>>>>>>>> the API if we have to add more things in the future.
>>>>>>>>>>>>
>>>>>>>>>>>> I would replace it with a string. This is the most flexible API.
>>>>>>>>>>>
>>>>>>>>>>> IMO, probably the best of both worlds makes a good option here:
>>>>>>>>>>> as Haiyue suggested, add a dev_specific[1] field to the structure,
>>>>>>>>>>> and when passing it to the application, let common code build the
>>>>>>>>>>> final string as
>>>>>>>>>>> (option flags converted to string + dev_specific).
>>>>>>>>>>>
>>>>>>>>>>> The options flag can be zero if a PMD has no generic flags or is
>>>>>>>>>>> not interested in such a scheme.
>>>>>>>>>>> Generic flags will at least help to have some common code.
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>> struct rte_eth_burst_mode {
>>>>>>>>>>>     uint64_t options;
>>>>>>>>>>>     char dev_specific[128]; /* PMD-specific burst mode information */
>>>>>>>>>>> };
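A minimal sketch of the scheme proposed above, where common code combines the generic option flags with the PMD-specific string into one final string for the application (the flag names, struct, and helper below are hypothetical stand-ins, not the actual ethdev definitions):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for generic RTE_ETH_BURST_* option bits. */
#define BURST_SCALAR (UINT64_C(1) << 0)
#define BURST_VECTOR (UINT64_C(1) << 1)
#define BURST_AVX2   (UINT64_C(1) << 2)

struct burst_mode {
	uint64_t options;
	char dev_specific[128]; /* PMD-specific burst mode information */
};

/* Common code: render the generic flags plus the PMD-specific text
 * into one string handed back to the application. */
static void
burst_mode_to_str(const struct burst_mode *mode, char *buf, size_t len)
{
	buf[0] = '\0';
	if (mode->options & BURST_SCALAR)
		strncat(buf, "Scalar ", len - strlen(buf) - 1);
	if (mode->options & BURST_VECTOR)
		strncat(buf, "Vector ", len - strlen(buf) - 1);
	if (mode->options & BURST_AVX2)
		strncat(buf, "AVX2 ", len - strlen(buf) - 1);
	strncat(buf, mode->dev_specific, len - strlen(buf) - 1);
}
```

With this split, a PMD with no generic flags can leave options at zero and still report something useful via dev_specific.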
>>>>>>>>>>
>>>>>>>>>> I really don't see how we can have generic flags.
>>>>>>>>>> The flags which are proposed are just matching
>>>>>>>>>> the functions implemented in Intel PMDs.
>>>>>>>>>> And this is a complicated solution.
>>>>>>>>>> Why not just returning a name for the selected Rx/Tx mode?
>>>>>>>>>
>>>>>>>>> Intel PMDs use *generic* methods like x86 SSE, AVX2, ARM NEON, PPC
>>>>>>>>> ALTIVEC, 'dev->data->scattered_rx' etc., in line with the stated
>>>>>>>>> goal: "DPDK is the Data Plane Development Kit that consists of
>>>>>>>>> libraries to accelerate packet processing workloads running on a
>>>>>>>>> wide variety of CPU architectures."
>>>>>>>>
>>>>>>>> How are RTE_ETH_BURST_SCATTERED and RTE_ETH_BURST_BULK_ALLOC generic?
>>>>>>>> They just match some features of the Intel PMDs.
>>>>>>>> Why not exposing other optimizations of the Rx/Tx implementations?
>>>>>>>> You totally missed the point of generic burst mode description.
>>>>>>>>
>>>>>>>>> If you look at these new experimental APIs from the above, then bit
>>>>>>>>> options are the best fit, and we didn't invent new words to describe
>>>>>>>>> them; they come straight from the CPU and other *generic* technology.
>>>>>>>>> And the application can check which kind of burst is running with a
>>>>>>>>> simple bit test.
>>>>>>>>>
>>>>>>>>> If PMDs are missing these, they can add them in future roadmaps to
>>>>>>>>> enhance their PMDs; for example, MLX5 supports ARM NEON and x86 SSE.
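To illustrate the "simple bit test" usage described above, an application-side check might look like the sketch below. The flag names are hypothetical stand-ins; in the real API the options field would be filled in by rte_eth_rx_burst_mode_get():

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for RTE_ETH_BURST_* option bits. */
#define BURST_VECTOR (UINT64_C(1) << 1)
#define BURST_NEON   (UINT64_C(1) << 5)
#define BURST_AVX2   (UINT64_C(1) << 7)

/* Application-side decision driven by a simple bit test: for example,
 * size Rx bursts differently when the driver runs a vectorized path. */
static unsigned int
pick_burst_size(uint64_t options)
{
	if (options & BURST_VECTOR)
		return 32; /* vector paths often favor larger bursts */
	return 4;
}
```

The point being argued here is that such a test is trivial for C code, whereas matching substrings of a free-text name would be fragile.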
>>>>>>>>
>>>>>>>> I have no words!
>>>>>>>> You really think other PMDs should learn from Intel how to "enhance"
>>>>>>>> their PMDs?
>>>>>>>> You talk about mlx5, did you look at its code? Did you see the burst
>>>>>>>> modes
>>>>>>>> depending on which specific hardware path is used (MPRQ, EMPW, inline)?
>>>>>>>> Or depending on which offloads are handled?
>>>>>>>>
>>>>>>>> Again, the instruction set used by the function is a small part
>>>>>>>> of the burst mode optimization.
>>>>>>>>
>>>>>>>> So you did not reply to my question:
>>>>>>>> Why not just returning a name for the selected Rx/Tx mode?
>>>>>>>
>>>>>>> In fact, RFC v1/v2 returned the *name*, but the *name* is hard for an
>>>>>>> application to process further (strcmp? strstr?). Not so nice for C
>>>>>>> code, and not very standard, so I switched it to bit definitions.
>>>>>>
>>>>>> Again, please answer my question: why do you need it?
>>>>>> I think it is just informative; that's why a string should be enough.
>>>>>> I am clearly against the bitmap because it is far too restrictive.
>>>>>> I disagree that knowing it is using AVX2 or AVX512 is so interesting.
>>>>>> What you would like to know is whether it is processing packets 4 by 4,
>>>>>> for instance, or which offloads are handled, or what hardware trick is
>>>>>> used in the datapath design.
>>>>>> There are so many options in a datapath design that it cannot be
>>>>>> represented with a bitmap. And it makes no sense to treat some design
>>>>>> criteria as more important than others.
>>>>>> I CC an Intel architect (Edwin) who can explain to you how much more
>>>>>> complicated a datapath design is than just using AVX instructions.
>>>>>
>>>>> As I understand it, this is to let applications make an informed
>>>>> decision based on what vectorization is used in the driver; currently
>>>>> this is not known by the application.
>>>>>
>>>>> And as previously replied, the main target of the API is to describe the
>>>>> vector path, not all optimizations, so the number of options is limited.
>>>
>>> No!
>>> The name of this API is "burst mode information",
>>> not "vector instructions used".
>>> I think the main error is that in Intel PMDs,
>>> each Rx/Tx function uses different vector instructions.
>>> So you generalize that knowing the vector instructions
>>> will give you good information about the performance.
>>> But this is generally wrong!
>>> The right level of info is much more complex.
>>
>> I don't think anyone was suggesting limiting it to purely describing PMD
>> optimization
>> with vector instructions. If there are other commonalities, let's describe
>> those also.
>>
>> Vectorization was thought to be a good starting point - IMHO it is.
>>
>>>
>>>>> There are many optimizations in the data path; I agree we may not
>>>>> represent all of them, and I agree that the existing enum having
>>>>> "RTE_ETH_BURST_BULK_ALLOC" and similar values is causing this confusion;
>>>>> perhaps we can remove them.
>>>>>
>>>>> And if the requirement from the application is just informative, I would
>>>>> agree that a free-text string would be better. Right now
>>>>> 'rte_eth_rx/tx_burst_mode_get()' is the main API to provide the
>>>>> information, and 'rte_eth_burst_mode_option_name()' is a helper for the
>>>>> application/driver to log this information.
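As a rough sketch of how an application could log that information, the code below iterates over the set option bits and converts each one to a printable name. The lookup function and flag values are hypothetical stand-ins for rte_eth_burst_mode_option_name() and the RTE_ETH_BURST_* bits:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for rte_eth_burst_mode_option_name(). */
static const char *
burst_option_name(uint64_t option)
{
	switch (option) {
	case UINT64_C(1) << 0: return "Scalar";
	case UINT64_C(1) << 1: return "Vector";
	case UINT64_C(1) << 2: return "AVX2";
	default: return "";
	}
}

/* Render every set (and named) option bit as a space-separated list,
 * suitable for logging by the application or driver. */
static void
log_burst_options(uint64_t options, char *buf, size_t len)
{
	unsigned int bit;

	buf[0] = '\0';
	for (bit = 0; bit < 64; bit++) {
		uint64_t opt = UINT64_C(1) << bit;
		const char *name = burst_option_name(opt);

		if ((options & opt) == 0 || name[0] == '\0')
			continue;
		strncat(buf, name, len - strlen(buf) - 1);
		strncat(buf, " ", len - strlen(buf) - 1);
	}
}
```

This shows why the helper matters: the bitmap alone is only machine-readable, and the name lookup is what makes it human-readable in logs.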
>>>>>
>>>>
>>>> Well, look, we have a general deficit of information about what is
>>>> happening under the covers in DPDK. The end user may get wildly different
>>>> performance characteristics based on the DPDK configuration. A simple
>>>> example: using flow director causes the i40e PMD to switch to a scalar
>>>> code path, and performance may drop by as much as half.
>>>>
>>>> This can cause no end of head-scratching in consuming products. I have
>>>> done some of that head-scratching myself; it is a usability nightmare.
>>>>
>>>> FD.io VPP tries to work around this by mining the call stack, to give
>>>> the user _some_ kind of information about what is happening. These kinds
>>>> of heroics should not be necessary.
>>>>
>>>> For exactly the same reasons as telemetry, we should be trying to give
>>>> users as much information as possible, in as standard a format as
>>>> possible. Otherwise DPDK becomes arcane, leaving the user running gdb to
>>>> understand what is going on, as I frequently do.
>>>
>>> I agree we must provide a clue to understand the performance result.
>>> As Stephen commented at the very beginning, a log is enough for such debug.
>>> But his comment was ignored.
>>
>> Do we expect applications built on DPDK to have to grep its log to make
>> such discoveries?
>> It's a very brittle and arcane way to provide information, if nothing else.
>>
>>> You wanted an API, fine.
>>> I am OK to have an API to request infos which are also in logs.
>>
>> I would point out that an API to query metadata is common practice
>> elsewhere.
>> GStreamer's GstCaps and Linux sysfs are the closest examples I can think of.
>>
>>>
>>>> Finally, again for the same reasons as telemetry, I would say that
>>>> machine-readable is the ideal here.
>>>
>>> I disagree here. There is no need to make this info machine readable.
>>> We want a clue about the optimizations which are all about creativity.
>>> And we cannot make creativity of developers "machine readable".
>>
>> I am more concerned about the creativity in how developers describe
>> optimizations.
>> If there is no standardization of strings (or bits), the API will be
>> challenging to use.
>
> No, it won't be challenging, because it will be just a string to print.
Well, the challenge is getting everyone to use the same set of strings,
such that what is returned by the API has a common meaning.
I am fine with strings,
so long as we have a method of encouraging folks to use a standard set where
possible.
> The challenge is trying to fix the design characteristics in an API.
I thought Haiyue's patch, with a fair degree of input from Ferruh and others,
was a pretty solid start.
Let's describe the commonalities that _do_ exist today - it may not be
enough, but it's better than what we had.