On Thu, Sep 12, 2024 at 01:59:34PM +0200, Mattias Rönnblom wrote:
> On 2024-09-12 13:17, Varghese, Vipin wrote:
> > [AMD Official Use Only - AMD Internal Distribution Only]
> > 
> > <snipped>
> > > >>>> Thank you Mattias for the information. As shared in the reply
> > > >>>> to Anatoly, we want to expose a new API `rte_get_next_lcore_ex`
> > > >> which takes an extra argument `u32 flags`.
> > > >>>> The flags can be RTE_GET_LCORE_L1 (SMT), RTE_GET_LCORE_L2,
> > > >> RTE_GET_LCORE_L3, RTE_GET_LCORE_BOOST_ENABLED,
> > > >> RTE_GET_LCORE_BOOST_DISABLED.
> > > >>>
> > > >>> Wouldn't using that API be pretty awkward to use?
> > > > The current API available in DPDK is `rte_get_next_lcore`, which is used
> > > within DPDK examples and in customer solutions.
> > > > Based on the comments from others, we changed the proposed API name
> > > from `rte_get_next_lcore_llc` to `rte_get_next_lcore_exntd`.
> > > >
> > > > Can you please help us understand what is `awkward`.
> > > >
> > > 
> > > The awkwardness starts when you try to provide hwloc-type
> > > information through an API that was designed for iterating over lcores.
> > I disagree with this point. The current lcore library implementation
> > focuses only on iterating through the list of enabled cores, the
> > core mask, and the lcore map.
> > With ever-increasing core counts, memory, I/O, and accelerators on
> > SoCs, sub-NUMA partitioning is common across vendor SoCs. Enhancing or
> > augmenting the lcore API to extract or provision NUMA and cache
> > topology is not awkward.
> 
> DPDK providing an API for this information makes sense to me, as I've
> mentioned before. What I questioned was the way it was done (i.e., the API
> design) in your RFC, and the limited scope (which in part you have
> addressed).
> 

Actually, I'd like to touch on this first item a little bit. What is the
main benefit of providing this information in EAL? To me, it seems like
something that is for apps to try and be super-smart and select particular
cores out of a set of cores to run on. However, is that not taking work
that should really be the job of the person deploying the app? The deployer
- if I can use that term - has already selected a set of cores and NICs for
a DPDK application to use. Should they not also be the one selecting - via
app argument, via --lcores flag to map one core id to another, or otherwise
- which part of an application should run on what particular piece of
hardware?
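For reference, the `--lcores` remapping mentioned above already lets a deployer decide which physical CPUs an application's logical cores land on at launch time. The application name below is a placeholder:

```shell
# Pin lcore 0 to physical CPU 8 and lcore 1 to CPU 9, and let
# lcores 2-3 float over CPUs 10-11 -- the deployer, not the app,
# chooses the hardware each logical core runs on.
./dpdk-app --lcores '0@8,1@9,(2-3)@(10-11)' -- <app args>
```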

In summary, what is the final real-world intended use case for this work?
DPDK already tries to be smart about cores and NUMA, and in some cases we
have hit issues where users have - for their own valid reasons - wanted to
run DPDK in a sub-optimal way, and they end up having to fight DPDK's
smarts in order to do so! Ref: [1]

/Bruce

[1] 
https://git.dpdk.org/dpdk/commit/?id=ed34d87d9cfbae8b908159f60df2008e45e4c39f
