On Thu, Sep 12, 2024 at 01:59:34PM +0200, Mattias Rönnblom wrote:
> On 2024-09-12 13:17, Varghese, Vipin wrote:
> > [AMD Official Use Only - AMD Internal Distribution Only]
> >
> > <snipped>
> >
> > > > > > Thank you Mattias for the information; as shared in the reply
> > > > > > with Anatoly, we want to expose a new API `rte_get_next_lcore_ex`
> > > > > > which takes an extra argument `u32 flags`.
> > > > > > The flags can be RTE_GET_LCORE_L1 (SMT), RTE_GET_LCORE_L2,
> > > > > > RTE_GET_LCORE_L3, RTE_GET_LCORE_BOOST_ENABLED,
> > > > > > RTE_GET_LCORE_BOOST_DISABLED.
> > > > >
> > > > > Wouldn't that API be pretty awkward to use?
> > > >
> > > > The current API available in DPDK is `rte_get_next_lcore`, which is
> > > > used within DPDK examples and in customer solutions.
> > > > Based on the comments from others, we responded to the idea of changing
> > > > the new API from `rte_get_next_lcore_llc` to `rte_get_next_lcore_exntd`.
> > > >
> > > > Can you please help us understand what is `awkward`?
> > >
> > > The awkwardness starts when you are trying to provide hwloc-type
> > > information over an API that was designed for iterating over lcores.
> >
> > I disagree with this point; the current implementation of the lcore
> > library is only focused on iterating through the list of enabled cores,
> > the core-mask, and the lcore-map.
> > With ever increasing core counts, memory, I/O and accelerators on SoCs,
> > sub-NUMA partitioning is common in various vendors' SoCs. Enhancing or
> > augmenting the lcore API to extract or provision NUMA and cache topology
> > is not awkward.
>
> DPDK providing an API for this information makes sense to me, as I've
> mentioned before. What I questioned was the way it was done (i.e., the API
> design) in your RFC, and the limited scope (which in part you have
> addressed).
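
[For readers following along, a rough sketch only of how such an iterator
might be used: rte_get_next_lcore_ex() and the RTE_GET_LCORE_* flags are
the RFC proposal discussed above and are not part of current DPDK, and the
parameter list is assumed here to mirror the existing
rte_get_next_lcore(i, skip_main, wrap) plus the extra flags word.]

#include <rte_lcore.h>
#include <rte_launch.h>

/* Launch fn on every enabled worker lcore that shares an L3 (last-level
 * cache) with the calling lcore, per one plausible reading of the
 * proposed RTE_GET_LCORE_L3 flag. Hypothetical API - not in DPDK today.
 */
static void
launch_on_llc_siblings(lcore_function_t *fn, void *arg)
{
        unsigned int i;

        for (i = rte_get_next_lcore_ex(-1, 1, 0, RTE_GET_LCORE_L3);
             i < RTE_MAX_LCORE;
             i = rte_get_next_lcore_ex(i, 1, 0, RTE_GET_LCORE_L3))
                rte_eal_remote_launch(fn, arg, i);
}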
Actually, I'd like to touch on this first item a little bit. What is the
main benefit of providing this information in EAL? To me, it seems like
something that is there for apps to try and be super-smart and select
particular cores out of a set of cores to run on. However, is that not
taking on work that should really be the job of the person deploying the
app? The deployer - if I can use that term - has already selected a set of
cores and NICs for a DPDK application to use. Should they not also be the
one selecting - via app argument, via the --lcores flag to map one core id
to another, or otherwise - which part of an application should run on which
particular piece of hardware?

In summary, what is the final real-world intended use case for this work?
DPDK already tries to be smart about cores and NUMA, and in some cases we
have hit issues where users have - for their own valid reasons - wanted to
run DPDK in a sub-optimal way, and they end up having to fight DPDK's
smarts in order to do so! Ref: [1]

/Bruce

[1] https://git.dpdk.org/dpdk/commit/?id=ed34d87d9cfbae8b908159f60df2008e45e4c39f
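
[For comparison, a minimal sketch of the status-quo flow described above,
using only existing DPDK APIs (rte_eal_init(), RTE_LCORE_FOREACH_WORKER,
rte_eal_remote_launch(), rte_eal_mp_wait_lcore()); the command line in the
comment is just an illustrative example of the deployer doing the core
selection and remapping.]

/* The deployer picks and remaps the cores at launch time, e.g.:
 *   ./app --lcores '0@16,1@17,2@18,3@19'
 * which runs lcore ids 0-3 on physical CPUs 16-19. The application
 * itself stays topology-agnostic and simply launches work on every
 * worker lcore it was given.
 */
#include <rte_eal.h>
#include <rte_launch.h>
#include <rte_lcore.h>

static int
worker_main(void *arg)
{
        (void)arg;
        /* per-lcore processing loop would go here */
        return 0;
}

int
main(int argc, char **argv)
{
        unsigned int lcore_id;

        if (rte_eal_init(argc, argv) < 0)
                return -1;

        RTE_LCORE_FOREACH_WORKER(lcore_id)
                rte_eal_remote_launch(worker_main, NULL, lcore_id);

        rte_eal_mp_wait_lcore();
        return 0;
}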