On Wed, Sep 27, 2023 at 1:55 PM Ferruh Yigit <ferruh.yi...@amd.com> wrote:
>
> On 9/21/2023 3:49 PM, Stanisław Kardach wrote:
> > On Thu, Sep 21, 2023, 15:18 Tummala, Sivaprasad
> > <sivaprasad.tumm...@amd.com> wrote:
> >
> > [AMD Official Use Only - General]
> >
> > > -----Original Message-----
> > > From: David Marchand <david.march...@redhat.com>
> > > Sent: Wednesday, September 20, 2023 1:05 PM
> > > To: Stanisław Kardach <k...@semihalf.com>; Tummala, Sivaprasad
> > > <sivaprasad.tumm...@amd.com>
> > > Cc: Ruifeng Wang <ruifeng.w...@arm.com>; Min Zhou <zhou...@loongson.cn>;
> > > David Christensen <d...@linux.vnet.ibm.com>; Bruce Richardson
> > > <bruce.richard...@intel.com>; Konstantin Ananyev
> > > <konstantin.v.anan...@yandex.ru>; dev <dev@dpdk.org>; Yigit, Ferruh
> > > <ferruh.yi...@amd.com>; Thomas Monjalon <tho...@monjalon.net>
> > > Subject: Re: [PATCH v2 2/2] eal: remove NUMFLAGS enumeration
> > >
> > > Caution: This message originated from an External Source. Use proper caution
> > > when opening attachments, clicking links, or responding.
> > >
> > >
> > > On Wed, Sep 20, 2023 at 8:01 AM Stanisław Kardach <k...@semihalf.com> wrote:
> > > >
> > > > On Tue, Sep 19, 2023 at 4:47 PM David Marchand
> > > > <david.march...@redhat.com> wrote:
> > > > <snip>
> > > > > > Also I see you're still removing the RTE_CPUFLAG_NUMFLAGS (what I call a
> > > > > > last element canary). Why?
> > > > > > If you're concerned with ABI, then we're talking about
> > > > > > an application linking dynamically with DPDK or talking via some RPC channel with
> > > > > > another DPDK application. So clashing with this definition does not come into
> > > > > > question. One should rather use rte_cpu_get_flag_enabled().
> > > > > >
> > > > > > Also if you want to introduce new features, one would add them to the
> > > > > > rte_cpuflags headers, unless you'd like to not add those and keep an
> > > > > > undocumented list "above" the last defined element.
> > > > > >
> > > > > > Could you explain a bit more your use-case?
> > > > >
> > > > > Hey Stanislaw,
> > > > >
> > > > > Talking generically, one problem with such pattern (having a LAST,
> > > > > or MAX enum) is when an array sized with such a symbol is exposed.
> > > > > As I mentioned in the past, this can have unwanted effects:
> > > > > https://patchwork.dpdk.org/project/dpdk/patch/20230919140430.3251493-1-david.march...@redhat.com/
> > >
> > > Argh... who broke copy/paste in my browser ?!
> > > Wrt MAX and arrays, I wanted to point at:
> > > http://inbox.dpdk.org/dev/CAJFAV8xs5CVdE2xwRtaxk5vE_PiQMV5LY5tKStk3R1gOuRt...@mail.gmail.com/
> > >
> > > > I agree, though I'd argue "LAST" and "MAX" semantics are a bit different. "LAST"
> > > > delimits the known enumeration territory while "MAX" is more of a `constexpr`
> > > > value type.
> > > >
> > > > > Another issue is when an existing enum meaning changes: from the
> > > > > application pov, the (old) MAX value is incorrect, but for the
> > > > > library pov, a new meaning has been associated.
> > > > > This may trigger bugs in the application when calling a function
> > > > > that returns such an enum which never returned this MAX value in the past.
> > > > >
> > > > > For at least those two reasons, removing those canary elements is
> > > > > being done in DPDK.
> > > > >
> > > > > This specific removal has been announced:
> > > > > https://patchwork.dpdk.org/project/dpdk/patch/20230919140430.3251493-1-david.march...@redhat.com/
> > > > Thanks for pointing this out but did you mean to link to the patch again here?
> > >
> > > Sorry, same here, bad copy/paste :-(.
> > >
> > > The intended link is: https://git.dpdk.org/dpdk/commit/?id=5da7c13521
> > > The deprecation notice was badly formulated and this patch here is consistent with
> > > it.
> > >
> > > > >
> > > > > Now, practically, when I look at the cpuflags API, I don't see us
> > > > > exposed to those two issues wrt rte_cpu_flag_t, so maybe this change
> > > > > is unneeded.
> > > > > But on the other hand, is it really an issue for an application to
> > > > > lose this (internal) information?
> > > > I doubt it, maybe it could be used as a sanity check for choosing proper functors
> > > > in the application. Though the initial description of the reason behind this patch was
> > > > to not break the ABI and I don't think it does that. What it does is force users to
> > > > use explicit cpu flag values, which is a good thing. Though if so, then it should be
> > > > stated in the commit description.
> > >
> > > I agree.
> > > Siva, can you work on a new revision?
> > >
> > David, Stanislaw,
> >
> > The original motivation of this patch was to avoid ABI breakage with
> > the introduction of new CPU flag "RTE_CPUFLAG_MONITORX"
> > (http://mails.dpdk.org/archives/test-report/2023-April/382489.html).
> >
> > Because of ABI breakage, the feature was postponed to this release.
> > https://patchwork.dpdk.org/project/dpdk/patch/20230413115334.43172-3-sivaprasad.tumm...@amd.com/
> >
> > This test is flawed, reason being that the NUMFLAGS should not be
> > treated as a flag value and instead as a canary but this test is not
> > taking that into account.
> >
>
> Hi Stanislaw,
>
> Why is the test flawed?
>
> The enum is in the public header, and so is the 'RTE_CPUFLAG_NUMFLAGS' enum
> item, and there are APIs using the enum, so the enum is exchanged between
> the shared library and the application.

In a similar way, lots of Linux uapi headers contain bits that should not be
used directly, even though they are defined there. The reason for that is
the C language syntax, not necessarily the intent of a developer. Since
NUMFLAGS was a canary to make the flag handling code easier, it should not
be treated as a "real" value, hence my suggestion that the test is flawed.
That said, NUMFLAGS does not bring enough value to justify keeping it. :)

>
> A similar thing was discussed before: when an enum is exchanged between
> application and shared library, there is an ABI breakage risk when the enum
> is extended, and the general tendency is to eliminate the MAX value to reduce
> the risk.

Agreed, though as I have mentioned before, "MAX" has different semantics
than "NUM". Then again, since we have rte_cpu_feature_table, we can use
RTE_DIM on it to check the user input.
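
A rough sketch of that RTE_DIM idea, for illustration only. All demo_* names
are hypothetical stand-ins (the real table is rte_cpu_feature_table and the
real macro is RTE_DIM), and the flag list is made up. The point is that the
bound comes from the library's own table rather than from a NUMFLAGS enum
item compiled into the application:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag enum with no trailing NUMFLAGS canary. */
enum demo_cpu_flag {
	DEMO_CPUFLAG_SSE = 0,
	DEMO_CPUFLAG_AVX,
	DEMO_CPUFLAG_MONITORX, /* new flags are only ever appended */
};

/* Same idea as DPDK's RTE_DIM: element count of a true array. */
#define DEMO_DIM(a) (sizeof(a) / sizeof((a)[0]))

/* Library-private table, standing in for rte_cpu_feature_table. */
static const char *const demo_feature_table[] = {
	[DEMO_CPUFLAG_SSE] = "SSE",
	[DEMO_CPUFLAG_AVX] = "AVX",
	[DEMO_CPUFLAG_MONITORX] = "MONITORX",
};

/*
 * Return the flag name, or NULL when the value is outside the table.
 * The argument is a plain int so that out-of-range input from an
 * application built against a different header is still well defined.
 */
static const char *
demo_get_flag_name(int feature)
{
	if (feature < 0 || (size_t)feature >= DEMO_DIM(demo_feature_table))
		return NULL;
	return demo_feature_table[feature];
}
```

With this shape, appending a flag only grows the table inside the library;
an application passing a value the library does not know gets NULL back
instead of relying on a count that was fixed at its own build time.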
>
>
> When an enum value is sent from library to application, it is more clear that
> this can cause an ABI breakage, because the application can receive a value
> that it is not aware of at build time, which can cause unexpected behavior.
> Simply think about a case where the application allocated an array of
> 'RTE_CPUFLAG_NUMFLAGS' size and directly accesses the array index based
> on the returned enum item value; if the enum is extended in the new version of
> the shared library, this can cause invalid memory access in the application.

Using the NUM enum element (which serves as a last item canary) to size an
array is not a good idea unless it's returned from a runtime call. Otherwise
one hits the issues that you've described.

>
> When an enum value is sent from application to library, I am not quite sure
> how problematic it is, to be honest, as is the case for the
> 'rte_cpu_get_flag_enabled()' & 'rte_cpu_get_flag_name()' in question.
> Only when the application sends 'RTE_CPUFLAG_NUMFLAGS' to
> 'rte_cpu_get_flag_name()' does it expect a NULL returned, but this won't
> happen in the new version of the shared library; not sure if this can cause
> any problem for the application.
> But as I mentioned, general guidance is to eliminate this kind of MAX
> enum value usage.
>
>
> And for this specific issue, although it is not clear whether usage of the enum in
> the 'rte_cpu_get_flag_enabled()' & 'rte_cpu_get_flag_name()' APIs
> causes ABI breakage,
> the enum being embedded into the 'struct rte_bbdev_driver_info' struct
> doesn't leave a question, since this struct is returned from library to
> the application and a change in the enum causes an ABI breakage.

Enum size does not change irrespective of changing its values. So size-wise
it's not an ABI breakage. Re-ordering values is an ABI breakage.

>
>
> Briefly, I think even appending to the end of 'enum rte_cpu_flag_t'
> causes ABI breakage and removing 'RTE_CPUFLAG_NUMFLAGS' helps to extend
> this enum in the future.
> And an outstanding deprecation notice already exists for this:
> https://git.dpdk.org/dpdk/tree/doc/guides/rel_notes/deprecation.rst?h=v23.07#n63
>
>
> > Your change did not break the ABI because you have properly added the
> > new flag at the end.
> > So I would ask to change the commit description to mention that NUMFLAGS
> > is removed to:
> > 1. Prevent users from treating it as a usable value or an array size.
> > 2. Prevent false-positive failures in the ABI test.
> >
> > Also it would be good to link to the aforementioned ABI test failure to
> > give readers some context when inspecting the git tree.
> >
> >
> > Can you please add what exactly needs to be reworked in the new version.
> >
> > Thanks.
> >
> > > --
> > > David Marchand
> > >

--
Best Regards,
Stanisław Kardach