> -----Original Message----- > From: Ray Kinsella <m...@ashroe.eu> > Sent: Monday, November 4, 2019 18:03 > To: Thomas Monjalon <tho...@monjalon.net> > Cc: dev@dpdk.org; Yigit, Ferruh <ferruh.yi...@intel.com>; 'Damjan Marion' > <dmar...@me.com>; Wang, > Haiyue <haiyue.w...@intel.com>; Jerin Jacob Kollanukkaran > <jer...@marvell.com>; > viachesl...@mellanox.com; step...@networkplumber.org; > arybche...@solarflare.com > Subject: Re: [dpdk-dev] [PATCH v4 1/4] ethdev: add the API for getting burst > mode information > > > > On 04/11/2019 09:54, Thomas Monjalon wrote: > > 04/11/2019 10:49, Ray Kinsella: > >> On 03/11/2019 22:41, Thomas Monjalon wrote: > >>> 03/11/2019 21:35, Ray Kinsella: > >>>> On 29/10/2019 14:27, Ferruh Yigit wrote: > >>>>> On 10/26/2019 5:23 PM, Thomas Monjalon wrote: > >>>>>> 26/10/2019 11:23, Wang, Haiyue: > >>>>>>> From: Thomas Monjalon [mailto:tho...@monjalon.net] > >>>>>>>> 26/10/2019 06:40, Wang, Haiyue: > >>>>>>>>> From: Thomas Monjalon [mailto:tho...@monjalon.net] > >>>>>>>>>> 25/10/2019 18:02, Jerin Jacob: > >>>>>>>>>>> On Fri, Oct 25, 2019 at 9:15 PM Thomas Monjalon > >>>>>>>>>>> <tho...@monjalon.net> wrote: > >>>>>>>>>>>> 25/10/2019 16:08, Ferruh Yigit: > >>>>>>>>>>>>> On 10/25/2019 10:36 AM, Thomas Monjalon wrote: > >>>>>>>>>>>>>> 15/10/2019 09:51, Haiyue Wang: > >>>>>>>>>>>>>>> Some PMDs have more than one RX/TX burst paths, add the > >>>>>>>>>>>>>>> ethdev API > >>>>>>>>>>>>>>> that allows an application to retrieve the mode information > >>>>>>>>>>>>>>> about > >>>>>>>>>>>>>>> Rx/Tx packet burst such as Scalar or Vector, and Vector > >>>>>>>>>>>>>>> technology > >>>>>>>>>>>>>>> like AVX2. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I missed this patch. I and Andrew, maintainers of ethdev, were > >>>>>>>>>>>>>> not CC'ed. > >>>>>>>>>>>>>> Ferruh, I would expect to be Cc'ed and/or get a notification > >>>>>>>>>>>>>> before merging. > >>>>>>>>>>>>> > >>>>>>>>>>>>> It has been discussed in the mail list and went through > >>>>>>>>>>>>> multiple discussions, > >>>>>>>>>>>>> patch is out since the August, +1 to cc all maintainers I > >>>>>>>>>>>>> missed that part, > >>>>>>>>>>>>> but when the patch is reviewed and there is no objection, why > >>>>>>>>>>>>> block the merge? > >>>>>>>>>>>> > >>>>>>>>>>>> I'm not saying blocking the merge. > >>>>>>>>>>>> My bad is that I missed the patch and I am asking for help with > >>>>>>>>>>>> a notification > >>>>>>>>>>>> in this case. Same for Andrew I guess. > >>>>>>>>>>>> Note: it is merged in master and I am looking to improve this > >>>>>>>>>>>> feature. > >>>>>>>>>>> > >>>>>>>>>>>>>>> +/** > >>>>>>>>>>>>>>> + * Ethernet device RX/TX queue packet burst mode information > >>>>>>>>>>>>>>> structure. > >>>>>>>>>>>>>>> + * Used to retrieve information about packet burst mode > >>>>>>>>>>>>>>> setting. > >>>>>>>>>>>>>>> + */ > >>>>>>>>>>>>>>> +struct rte_eth_burst_mode { > >>>>>>>>>>>>>>> + uint64_t options; > >>>>>>>>>>>>>>> +}; > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Why a struct for an integer? > >>>>>>>>>>>>> > >>>>>>>>>>>>> Again by a request from me, to not need to break the API if we > >>>>>>>>>>>>> need to add more > >>>>>>>>>>>>> thing in the future. > >>>>>>>>>>>> > >>>>>>>>>>>> I would replace it with a string. This is the most flexible API. > >>>>>>>>>>> > >>>>>>>>>>> IMO, Probably, best of both worlds make a good option here, > >>>>>>>>>>> as Haiyue suggested if we have an additional dev_specific[1] in > >>>>>>>>>>> structure. > >>>>>>>>>>> and when a pass to the application, let common code make final > >>>>>>>>>>> string as > >>>>>>>>>>> (options flags to string + dev_specific) > >>>>>>>>>>> > >>>>>>>>>>> options flag can be zero if PMD does not have any generic flags > >>>>>>>>>>> nor > >>>>>>>>>>> interested in such a scheme. > >>>>>>>>>>> Generic flags will help at least to have some common code. > >>>>>>>>>>> > >>>>>>>>>>> [1] > >>>>>>>>>>> struct rte_eth_burst_mode { > >>>>>>>>>>> uint64_t options; > >>>>>>>>>>> char dev_specific[128]; /* PMD has specific burst mode > >>>>>>>>>>> information */ > >>>>>>>>>>> }; > >>>>>>>>>> > >>>>>>>>>> I really don't see how we can have generic flags. > >>>>>>>>>> The flags which are proposed are just matching > >>>>>>>>>> the functions implemented in Intel PMDs. > >>>>>>>>>> And this is a complicate solution. > >>>>>>>>>> Why not just returning a name for the selected Rx/Tx mode? > >>>>>>>>> > >>>>>>>>> Intel PMDs use the *generic* methods like x86 SSE, AVX2, ARM NEON, > >>>>>>>>> PPC ALTIVEC, > >>>>>>>>> 'dev->data->scattered_rx' etc for the target : "DPDK is the Data > >>>>>>>>> Plane Development Kit > >>>>>>>>> that consists of libraries to accelerate packet processing > >>>>>>>>> workloads running on a wide > >>>>>>>>> variety of CPU architectures." > >>>>>>>> > >>>>>>>> How RTE_ETH_BURST_SCATTERED and RTE_ETH_BURST_BULK_ALLOC are generic? > >>>>>>>> They just match some features of the Intel PMDs. > >>>>>>>> Why not exposing other optimizations of the Rx/Tx implementations? > >>>>>>>> You totally missed the point of generic burst mode description. > >>>>>>>> > >>>>>>>>> If understand these new experimental APIs from above, then bit > >>>>>>>>> options is the best, > >>>>>>>>> and we didn't invent new words to describe them, just from the CPU > >>>>>>>>> & other *generic* > >>>>>>>>> technology. And the application can loop to check which kind of > >>>>>>>>> burst is running by > >>>>>>>>> just simple bit test. > >>>>>>>>> > >>>>>>>>> If PMDs missed these, they can update them in future roadmaps to > >>>>>>>>> enhance their PMDs, > >>>>>>>>> like MLX5 supports ARM NEON, x86 SSE. > >>>>>>>> > >>>>>>>> I have no word! > >>>>>>>> You really think other PMDs should learn from Intel how to "enhance" > >>>>>>>> their PMD? > >>>>>>>> You talk about mlx5, did you look at its code? Did you see the burst > >>>>>>>> modes > >>>>>>>> depending on which specific hardware path is used (MPRQ, EMPW, > >>>>>>>> inline)? > >>>>>>>> Or depending on which offloads are handled? > >>>>>>>> > >>>>>>>> Again, the instruction set used by the function is a small part > >>>>>>>> of the burst mode optimization. > >>>>>>>> > >>>>>>>> So you did not reply to my question: > >>>>>>>> Why not just returning a name for the selected Rx/Tx mode? > >>>>>>> > >>>>>>> In fact, RFC v1/v2 returns the *name*, but the *name* is hard for > >>>>>>> application to do further processing, strcmp, strstr ? Not so nice > >>>>>>> for C code, and it is not so standard, So switch it to bit definition. > >>>>>> > >>>>>> Again, please answer my question: why do you need it? > >>>>>> I think it is just informative, that's why a string should be enough. > >>>>>> I am clearly against the bitmap because it is way too much restrictive. > >>>>>> I disagree that knowing it is using AVX2 or AVX512 is so interesting. > >>>>>> What you would like to know is whether it is processing packets 4 by 4, > >>>>>> for instance, or to know which offload is supported, or what hardware > >>>>>> trick > >>>>>> is used in the datapath design. > >>>>>> There are so many options in a datapath design that it cannot be > >>>>>> represented with a bitmap. And it makes no sense to have some design > >>>>>> criterias more important than others. > >>>>>> I Cc an Intel architect (Edwin) who could explain you how much > >>>>>> a datapath design is more complicate than just using AVX instructions. > >>>>> > >>>>> As I understand this is to let applications to give informed decision > >>>>> based on > >>>>> what vectorization is used in the driver, currently this is not know by > >>>>> the > >>>>> application. > >>>>> > >>>>> And as previously replied, the main target of the API is to define the > >>>>> vector > >>>>> path, not all optimizations, so the number is limited. > >>> > >>> No! > >>> The name of this API is "burst mode information", > >>> not "vector instructions used". > >>> I think the main error is that in Intel PMDs, > >>> each Rx/Tx function use different vector instructions. > >>> So you generalize that knowing the vectors instructions > >>> will give you a good information about the performance. > >>> But this is generally wrong! > >>> The right level of infos is much more complex. > >> > >> I don't think anyone was suggesting limiting it to purely describing PMD > >> optimization > >> with vector instructions. If there are other commonalities let's describe > >> those also. > >> > >> Vectorization was thought to be a good starting point - IMHO it is. > >> > >>> > >>>>> There are many optimization in the data path, I agree we may not > >>>>> represent all > >>>>> of them, and agreed existing enum having "RTE_ETH_BURST_BULK_ALLOC" and > >>>>> similar > >>>>> causing this confusion, perhaps we can remove them. > >>>>> > >>>>> And if the requirement from the application is just informative, I > >>>>> would agree > >>>>> that free text string will be better, right now > >>>>> 'rte_eth_rx/tx_burst_mode_get()' > >>>>> is the main API to provide the information and > >>>>> 'rte_eth_burst_mode_option_name()' is a helper for application/driver > >>>>> to log > >>>>> this information. > >>>>> > >>>> > >>>> Well look we have a general deficit of information about what is > >>>> happening under > >>>> the covers in DPDK. The end user may get wildly different performance > >>>> characteristics > >>>> based on the DPDK configuration. Simple example is using flow director > >>>> causes the i40e > >>>> PMD to switch to using a scalar code path, and performance may as much > >>>> as half. > >>>> > >>>> This can cause no end of head-scratching in consuming products, I have > >>>> done some > >>>> of that head scratching myself, it is a usability nightmare. > >>>> > >>>> FD.io VPP tries to work around this by mining the call stack, to give > >>>> the user _some_ > >>>> kind of information about what is happening. These kind of heroics > >>>> should not be necessary. > >>>> > >>>> For exactly the same reasons as telemetry, we should be trying to give > >>>> the users as much > >>>> information as possible, in as standard as format as possible. Otherwise > >>>> DPDK > >>>> becomes arcane leaving the user running gdb to understand what is going > >>>> on, as I > >>>> frequently do. > >>> > >>> I agree we must provide a clue to understand the performance result. > >>> As Stephen commented at the very beginning, a log is enough for such > >>> debug. > >>> But his comment was ignored. > >> > >> Do we expect applications built on DPDK to have to grep it's log to make > >> such discoveries? > >> It's very brittle and arcane way to provide information, if nothing else. > >> > >>> You wanted an API, fine. > >>> I am OK to have an API to request infos which are also in logs. > >> > >> I would point out that an API to query meta-data is common practice else > >> where. > >> GStreamer GstCaps and Linux Sysfs are the closest example I can think of. > >> > >>> > >>>> Finally, again for the same reasons as telemetry, I would say that > >>>> machine readable is the > >>>> ideal here. > >>> > >>> I disagree here. There is no need to make this info machine readable. > >>> We want a clue about the optimizations which are all about creativity. > >>> And we cannot make creativity of developers "machine readable". > >> > >> I am more concerned about the creativity in how developers describe > >> optimizations. > >> If there is no standardization of strings (or bits), the API will be > >> challenging to use. > > > > No it won't be challenging because it will be just a string to print. > > Well the challenge is getting everyone to use the same set of strings, > such that what is returned by the API has common meaning. > > I am fine with strings. > So long as we have a method of encouraging folks to use a standard set were > possible. > > > The challenge is trying to fix the design characteristics in an API. > > I thought Haiyue's patch with a fair degree of input from Ferruh and others > is a pretty solid start. > Let's describe those commonalities that _do_ exist today - it may not be > enough, but it's better than > we had.
Then, this one : http://patchwork.dpdk.org/patch/62368/ ?