On Mon, Apr 07, 2025 at 03:18:36PM +0200, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:bruce.richard...@intel.com]
> > Sent: Monday, 7 April 2025 14.42
> > 
> > On Mon, Apr 07, 2025 at 02:25:46PM +0200, Morten Brørup wrote:
> > > > From: Bruce Richardson [mailto:bruce.richard...@intel.com]
> > > > Sent: Monday, 7 April 2025 13.56
> > > >
> > > > On Mon, Apr 07, 2025 at 01:32:59PM +0200, Morten Brørup wrote:
> > > > > > From: Bruce Richardson [mailto:bruce.richard...@intel.com]
> > > > > > Sent: Monday, 7 April 2025 12.41
> > > > > >
> > > > > > On Mon, Apr 07, 2025 at 12:15:13PM +0200, Morten Brørup wrote:
> > > > > > > > From: Bruce Richardson [mailto:bruce.richard...@intel.com]
> > > > Sent:
> > > > > > > > Monday, 7 April 2025 11.49
> > > > > > > >
> > > > > > > > On Mon, Apr 07, 2025 at 09:04:05AM +0200, David Marchand
> > wrote:
> > > > > > > > > Hello Bruce,
> > > > > > > > >
> > > > > > > > > On Tue, Apr 1, 2025 at 4:08 PM Bruce Richardson
> > > > > > > > > <bruce.richard...@intel.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, Mar 24, 2025 at 05:30:26PM +0000, Bruce
> > Richardson
> > > > > > wrote:
> > > > > > > > > > > Traditionally, DPDK has had a direct mapping of
> > internal
> > > > > > lcore-
> > > > > > > > ids, to
> > > > > > > > > > > the actual core numbers in use. With higher core
> > count
> > > > > > servers
> > > > > > > > becoming
> > > > > > > > > > > more prevalent the issue becomes one of increasing
> > memory
> > > > > > > > footprint when
> > > > > > > > > > > using such a scheme, due to the need to have all
> > arrays
> > > > > > > > dimensioned for
> > > > > > > > > > > all cores on the system, whether or not those cores
> > are
> > > > in
> > > > > > use by
> > > > > > > > the
> > > > > > > > > > > app.
> > > > > > > > > > >
> > > > > > > > > > > Therefore, the decision was made in the past to not
> > > > expand
> > > > > > the
> > > > > > > > > > > build-time RTE_MAX_LCORE value beyond 128. Instead,
> > it
> > > > was
> > > > > > > > recommended
> > > > > > > > > > > that users use the "--lcores" EAL parameter to take
> > the
> > > > high-
> > > > > > > > numbered
> > > > > > > > > > > cores they wish to use and map them to lcore-ids
> > within
> > > > the 0
> > > > > > -
> > > > > > > > 128
> > > > > > > > > > > range. While this works, this is a little clunky as
> > it
> > > > means
> > > > > > that
> > > > > > > > > > > instead of just passing, for example, "-l 130-139",
> > the
> > > > user
> > > > > > must
> > > > > > > > > > > instead pass "--lcores 0@130,1@131,2@132,3@133,...."
> > > > > > > > > > >
> > > > > > > > > > > This patchset attempts to simplify the situation by
> > > > adding a
> > > > > > new
> > > > > > > > flag to
> > > > > > > > > > > do this mapping automatically. To use cores 130-139
> > and
> > > > map
> > > > > > them
> > > > > > > > to ids
> > > > > > > > > > > 0-9 internally, the EAL args now become: "-l 130-139
> > --
> > > > map-
> > > > > > lcore-
> > > > > > > > ids",
> > > > > > > > > > > or using the shorter "-M" version of the flag: "-Ml
> > 130-
> > > > 139".
> > > > > > > > > > >
> > > > > > > > > > > Adding this new parameter required some rework of the
> > > > > > existing
> > > > > > > > arg
> > > > > > > > > > > parsing code, because in current DPDK the args are
> > parsed
> > > > and
> > > > > > > > checked in
> > > > > > > > > > > the order they appear on the commandline. This means
> > that
> > > > > > using
> > > > > > > > the
> > > > > > > > > > > example above, the core parameter 130-139 will be
> > > > rejected
> > > > > > > > immediately
> > > > > > > > > > > before the "map-lcore-ids" parameter is seen. To work
> > > > around
> > > > > > > > this, the
> > > > > > > > > > > core (and service core) parameters are not parsed
> > when
> > > > seen,
> > > > > > > > instead
> > > > > > > > > > > they are only saved off and parsed after all
> > arguments
> > > > are
> > > > > > > > parsed. The
> > > > > > > > > > > "-l" and "-c" parameters are converted into "--
> > lcores"
> > > > > > arguments,
> > > > > > > > so all
> > > > > > > > > > > assigning of lcore ids is done there in all cases.
> > > > > > > > > > >
> > > > > > > > > > > RFC->v2: * converted printf to DEBUG log * added "-M"
> > as
> > > > > > shorter
> > > > > > > > > > > version of flag * added documentation * renamed
> > internal
> > > > API
> > > > > > that
> > > > > > > > > > > was changed to avoid any potential
> > > > > > > > hidden
> > > > > > > > > > >   runtime issues.
> > > > > > > > > > >
> > > > > > > > > > > Bruce Richardson (3): eal: centralize core parameter
> > > > parsing
> > > > > > eal:
> > > > > > > > > > > convert core masks and lists to core sets eal: allow
> > > > > > automatic
> > > > > > > > > > > mapping of high lcore ids
> > > > > > > > > > >
> > > > > > > > > > Ping for review.
> > > > > > > > > >
> > > > > > > > > > At a high level, does this feature seem useful to
> > users?
> > > > > > > > >
> > > > > > > > > This seems useful, though I am not I would touch the
> > existing
> > > > > > > > options.
> > > > > > > > > I would have gone with a simple -L option (taking the
> > same
> > > > kind
> > > > > > of
> > > > > > > > > input than -l but with new behavior), and not combine a
> > flag
> > > > with
> > > > > > > > > existing options.
> > > > > > > > >
> > > > > > > >
> > > > > > > > That would be an easier patchset to do up. However, it
> > would
> > > > then
> > > > > > mean
> > > > > > > > that we have no less than 4 different ways to specify the
> > cores
> > > > to
> > > > > > use:
> > > > > > > > "- c", "-l", "-L", "--lcores" - and therefore need 4
> > different
> > > > sets
> > > > > > of
> > > > > > > > parsing options for them, and more checks to ensure we have
> > > > only
> > > > > > one of
> > > > > > > > the 4 specified in any run. That's why I chose the modifier
> > > > option,
> > > > > > and
> > > > > > > > to try and consolidate the core setup a bit.
> > > > > > > >
> > > > > > > > However, if having a completely new option is preferred, I
> > am
> > > > happy
> > > > > > > > enough to do up a different patchset for that.
> > > > > > > >
> > > > > > > > > I scanned through the series, not much to say.  Maybe add
> > a
> > > > unit
> > > > > > test
> > > > > > > > > for new cmdline option.
> > > > > > > > >
> > > > > > > > Sure. Once it's decided what approach (if any) to take,
> > I'll do
> > > > up
> > > > > > a
> > > > > > > > new patchset, taking into account any relevant feedback on
> > this
> > > > > > set.
> > > > > > > >
> > > > > > > > /Bruce
> > > > > > >
> > > > > > > Changing the EAL parameter parser to a "two pass parser"
> > makes
> > > > sense.
> > > > > > I
> > > > > > > think checking for existence of more than one lcore
> > specification
> > > > > > options
> > > > > > > should suffice; we don't need to accept multiple lcore
> > > > specification
> > > > > > > options and check for conflicts.
> > > > > > >
> > > > > > > When remapping, do we need to support gaps in the "lcore"
> > > > (logical
> > > > > > cores)
> > > > > > > array, e.g. for secondary processes, or can it be continuous
> > from
> > > > 0
> > > > > > to
> > > > > > > the number of specified lcores?
> > > > > > >
> > > > > > > And are new EAL parameters for this really beneficial?
> > Doesn't
> > > > e.g.
> > > > > > "-l
> > > > > > > 0-9@130-139,100@140" suffice?
> > > > > > >
> > > > > > Actually, I believe "0-9@130-139"[1]  is not the same as
> > > > > > "0@130,1@131,2@132,...". The latter affinities one thread to
> > one
> > > > core
> > > > > > ten
> > > > > > times over, while the former affinitizes 10 threads to 10 cores
> > -
> > > > > > leaving
> > > > > > those threads free to move about within the 10 cores specified.
> > > > >
> > > > > Interesting. The documentation [GSG] isn't clear to me about
> > this; a
> > > > example there could help clarify.
> > > > >
> > > > > [GSG]:
> > > >
> > https://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html#lcore-
> > > > related-options
> > > > >
> > > >
> > > > Yep, agreed.
> > > >
> > > > > If users are manually passing lcore parameters to the EAL, then I
> > see
> > > > why some sort of remapping shorthand is useful.
> > > > > IMO, if the mappings are relatively exotic, it should be
> > acceptable
> > > > requiring an external script to build a long list of mapping
> > parameters
> > > > and then invoke the application with those script-generated EAL
> > > > parameters.
> > > > > This would reduce the scope to support relatively simple, common
> > > > mappings.
> > > > >
> > > > > Could we expand the --lcores syntax to support common mappings?
> > > > >
> > > > > E.g. "0-9@130+" to do what I thought.
> > > > > The lack of "()" treats the entries individually (not as a
> > group),
> > > > and the "+" indicates auto-increment.
> > > > >
> > > > > A more advanced example:
> > > > > "0-9@(130-131)+", meaning lcore 0 gets cpus 130-131, lcore 1 gets
> > > > cpus 132-133, etc.
> > > > >
> > > >
> > > > My issues with the above syntax idea is:
> > > > * I think it's overly complicating the lcores parameter adding in
> > the
> > > >   support for the "+" symbol - especially in the advanced case
> > example
> > > > you
> > > >   provide. I worry about code maintainability here.
> > > > * More significantly for me, I think it's also getting things
> > backwards
> > > > in
> > > >   that it is focusing more on the lcore ids visible to the app,
> > rather
> > > > than
> > > >   the physical cores to be used. For the example above of 0-9@130+,
> > > > what I
> > > >   would expect is that the user is mainly thinking about the cores
> > he
> > > > wants
> > > >   to use 130-139, which are then to be mapped to lower ids. If we
> > had
> > > > the
> > > >   syntax reversed where the physical cores were first, I'd say it
> > would
> > > >   make more sense, e.g. 130-139@0+
> > >
> > > I 100 % agree on the syntax being backwards.
> > > A good reason for introducing a new parameter, rather than expanding
> > on "--lcores".
> > >
> > 
> > Yes and no.
> > I agree with not expanding on --lcores, but I also don't think any new
> > parameter added should attempt to reproduce all of what lcores does. I
> > would leave --lcores as-is, as the power-tool for lcore config e.g.
> > what
> > you talk about below for mapping multiple lcores to the same physical
> > cpu.
> > 
> > > We should consider deprecating the old (backwards) syntax, so users
> > don’t get confused about one EAL parameter being "backwards" of the
> > other.
> > >
> > 
> > I disagree with this. I would be ok with deprecating the old "-c"
> > coremask
> > syntax - I think the time is long past when we should be dealing with
> > masks. However, removing "-l" and "--lcores" flag is, to me anyway, too
> > big
> > and jarring a change for end users for us to consider.
> > 
> > > > * finally, as I found out last month, there are systems on which
> > the
> > > > cores
> > > >   are spread across numa-nodes on odd/even boundaries, so to have
> > an
> > > > app
> > > >   running on socket 0, you need to give the core ids as a list i.e.
> > > >   0,2,4,6, and cannot use ranges. [This also reenforces the point
> > above
> > > >   too, where again it's the internal ids we need to generalize, not
> > the
> > > >   physical cpus]
> > >
> > > For this, we could use "/2" (like in subnets), or "+2" as increment
> > parameter.
> > >
> > > >
> > > > My thinking on the matter, and I'm happy to be corrected if I'm
> > wrong
> > > > here,
> > > > is that end-users are unlikely to significantly care what the
> > actual
> > > > lcore
> > > > ids are internally in the program, so long as they work and are
> > unique.
> > >
> > > Generally yes.
> > > Note that we should allow the same physical CPU core to be assigned
> > to multiple lcores...
> > > If a CPU core is shared between multiple worker lcores, then it could
> > be problematic, but with a mix of lcore roles, this might be handy for
> > some applications. E.g. a virtual machine with only one CPU core could
> > use that single CPU core as both main and worker lcore, or as main and
> > service lcore.
> > > Sharing an lcore between threads requires the developer to take
> > special care of e.g. spinlocks, but the EAL parameter parser should not
> > prohibit it. It might log a notice, though.
> > >
> > 
> > This is already taken care of via --lcores, so I don't see the need to
> > reimplement it using a new flag also. Any new flags we add should be
> > kept
> > deliberately simple.
> > 
> > > > What does matter is the physical cpus on which the code is to run.
> > > > Therefore, my approach to this is to find the simplest possible
> > > > solution
> > > > whereby the user can just provide a list of cores and tell EAL to
> > just
> > > > map
> > > > them to reasonable values. For the "reasonable" values, I would
> > imagine
> > > > that for the vast majority of cases starting "0" is what is wanted.
> > For
> > > > anything beyond that, we already have the existing --lcores syntax
> > to
> > > > be
> > > > used.
> > >
> > > Agree with all of the above. :-)
> > >
> > 
> > Great. That still leaves us with the problem of what exactly to do as
> > the
> > best solution. Here are some alternatives that I see:
> > 
> > 1. Add a modifier flag for -l and -c parameters to auto-remap the lcore
> > ids
> >    to zero, so user is just specifying physical CPU id's.
> > 2. Add a new flag for specifying physical cpu ids, which auto-remaps
> > the
> >    cores specified to zero.
> >    2a. To simplify our code and user experience we could at the same
> > time:
> >        * deprecate "-c" flag for coremasks
> >        * make "-l" and "--lcores" the same flag just in long and short
> >          versions. This should not break anything as "-l" parameter is
> >          just as subset of what "--lcores" provides.
> >        * that would leave us with effectively two core flag paths:
> >           - -l/--lcores, behaviour as now, full explicit control
> >           - -L/--lcores-remapped, takes simplified core list (only ","
> > and
> >             "-" supported as with "-l" now), and maps them to zero-
> > based.
> 
> +1 for 2a.
> 
> > 
> > Third options? Any other feedback?
> 
> If the internal lcore ids are effectively hidden by passing physical CPU ids, 
> any related options should take physical CPU ids too. I.e. the options for 
> choosing main lcore and service lcores also need variants with other names.

Yes, I suppose so. That may be tricky, but we'll have to see what we can
do.

> The service cores are passed as a core mask; this should be deprecated (like 
> "-c"). And superseded by a parameter taking a core list.
> 

Yes to deprecate -s. I believe -S already exists to take a service core
list, so that can be marked off the list.

/Bruce

Reply via email to