On Mon, Apr 07, 2025 at 03:18:36PM +0200, Morten Brørup wrote: > > From: Bruce Richardson [mailto:bruce.richard...@intel.com] > > Sent: Monday, 7 April 2025 14.42 > > > > On Mon, Apr 07, 2025 at 02:25:46PM +0200, Morten Brørup wrote: > > > > From: Bruce Richardson [mailto:bruce.richard...@intel.com] > > > > Sent: Monday, 7 April 2025 13.56 > > > > > > > > On Mon, Apr 07, 2025 at 01:32:59PM +0200, Morten Brørup wrote: > > > > > > From: Bruce Richardson [mailto:bruce.richard...@intel.com] > > > > > > Sent: Monday, 7 April 2025 12.41 > > > > > > > > > > > > On Mon, Apr 07, 2025 at 12:15:13PM +0200, Morten Brørup wrote: > > > > > > > > From: Bruce Richardson [mailto:bruce.richard...@intel.com] > > > > Sent: > > > > > > > > Monday, 7 April 2025 11.49 > > > > > > > > > > > > > > > > On Mon, Apr 07, 2025 at 09:04:05AM +0200, David Marchand > > wrote: > > > > > > > > > Hello Bruce, > > > > > > > > > > > > > > > > > > On Tue, Apr 1, 2025 at 4:08 PM Bruce Richardson > > > > > > > > > <bruce.richard...@intel.com> wrote: > > > > > > > > > > > > > > > > > > > > On Mon, Mar 24, 2025 at 05:30:26PM +0000, Bruce > > Richardson > > > > > > wrote: > > > > > > > > > > > Traditionally, DPDK has had a direct mapping of > > internal > > > > > > lcore- > > > > > > > > ids, to > > > > > > > > > > > the actual core numbers in use. With higher core > > count > > > > > > servers > > > > > > > > becoming > > > > > > > > > > > more prevalent the issue becomes one of increasing > > memory > > > > > > > > footprint when > > > > > > > > > > > using such a scheme, due to the need to have all > > arrays > > > > > > > > dimensioned for > > > > > > > > > > > all cores on the system, whether or not those cores > > are > > > > in > > > > > > use by > > > > > > > > the > > > > > > > > > > > app. > > > > > > > > > > > > > > > > > > > > > > Therefore, the decision was made in the past to not > > > > expand > > > > > > the > > > > > > > > > > > build-time RTE_MAX_LCORE value beyond 128. Instead, > > it > > > > was > > > > > > > > recommended > > > > > > > > > > > that users use the "--lcores" EAL parameter to take > > the > > > > high- > > > > > > > > numbered > > > > > > > > > > > cores they wish to use and map them to lcore-ids > > within > > > > the 0 > > > > > > - > > > > > > > > 128 > > > > > > > > > > > range. While this works, this is a little clunky as > > it > > > > means > > > > > > that > > > > > > > > > > > instead of just passing, for example, "-l 130-139", > > the > > > > user > > > > > > must > > > > > > > > > > > instead pass "--lcores 0@130,1@131,2@132,3@133,...." > > > > > > > > > > > > > > > > > > > > > > This patchset attempts to simplify the situation by > > > > adding a > > > > > > new > > > > > > > > flag to > > > > > > > > > > > do this mapping automatically. To use cores 130-139 > > and > > > > map > > > > > > them > > > > > > > > to ids > > > > > > > > > > > 0-9 internally, the EAL args now become: "-l 130-139 > > -- > > > > map- > > > > > > lcore- > > > > > > > > ids", > > > > > > > > > > > or using the shorter "-M" version of the flag: "-Ml > > 130- > > > > 139". > > > > > > > > > > > > > > > > > > > > > > Adding this new parameter required some rework of the > > > > > > existing > > > > > > > > arg > > > > > > > > > > > parsing code, because in current DPDK the args are > > parsed > > > > and > > > > > > > > checked in > > > > > > > > > > > the order they appear on the commandline. This means > > that > > > > > > using > > > > > > > > the > > > > > > > > > > > example above, the core parameter 130-139 will be > > > > rejected > > > > > > > > immediately > > > > > > > > > > > before the "map-lcore-ids" parameter is seen. To work > > > > around > > > > > > > > this, the > > > > > > > > > > > core (and service core) parameters are not parsed > > when > > > > seen, > > > > > > > > instead > > > > > > > > > > > they are only saved off and parsed after all > > arguments > > > > are > > > > > > > > parsed. The > > > > > > > > > > > "-l" and "-c" parameters are converted into "-- > > lcores" > > > > > > arguments, > > > > > > > > so all > > > > > > > > > > > assigning of lcore ids is done there in all cases. > > > > > > > > > > > > > > > > > > > > > > RFC->v2: * converted printf to DEBUG log * added "-M" > > as > > > > > > shorter > > > > > > > > > > > version of flag * added documentation * renamed > > internal > > > > API > > > > > > that > > > > > > > > > > > was changed to avoid any potential > > > > > > > > hidden > > > > > > > > > > > runtime issues. > > > > > > > > > > > > > > > > > > > > > > Bruce Richardson (3): eal: centralize core parameter > > > > parsing > > > > > > eal: > > > > > > > > > > > convert core masks and lists to core sets eal: allow > > > > > > automatic > > > > > > > > > > > mapping of high lcore ids > > > > > > > > > > > > > > > > > > > > > Ping for review. > > > > > > > > > > > > > > > > > > > > At a high level, does this feature seem useful to > > users? > > > > > > > > > > > > > > > > > > This seems useful, though I am not I would touch the > > existing > > > > > > > > options. > > > > > > > > > I would have gone with a simple -L option (taking the > > same > > > > kind > > > > > > of > > > > > > > > > input than -l but with new behavior), and not combine a > > flag > > > > with > > > > > > > > > existing options. > > > > > > > > > > > > > > > > > > > > > > > > > That would be an easier patchset to do up. However, it > > would > > > > then > > > > > > mean > > > > > > > > that we have no less than 4 different ways to specify the > > cores > > > > to > > > > > > use: > > > > > > > > "- c", "-l", "-L", "--lcores" - and therefore need 4 > > different > > > > sets > > > > > > of > > > > > > > > parsing options for them, and more checks to ensure we have > > > > only > > > > > > one of > > > > > > > > the 4 specified in any run. That's why I chose the modifier > > > > option, > > > > > > and > > > > > > > > to try and consolidate the core setup a bit. > > > > > > > > > > > > > > > > However, if having a completely new option is preferred, I > > am > > > > happy > > > > > > > > enough to do up a different patchset for that. > > > > > > > > > > > > > > > > > I scanned through the series, not much to say. Maybe add > > a > > > > unit > > > > > > test > > > > > > > > > for new cmdline option. > > > > > > > > > > > > > > > > > Sure. Once it's decided what approach (if any) to take, > > I'll do > > > > up > > > > > > a > > > > > > > > new patchset, taking into account any relevant feedback on > > this > > > > > > set. > > > > > > > > > > > > > > > > /Bruce > > > > > > > > > > > > > > Changing the EAL parameter parser to a "two pass parser" > > makes > > > > sense. > > > > > > I > > > > > > > think checking for existence of more than one lcore > > specification > > > > > > options > > > > > > > should suffice; we don't need to accept multiple lcore > > > > specification > > > > > > > options and check for conflicts. > > > > > > > > > > > > > > When remapping, do we need to support gaps in the "lcore" > > > > (logical > > > > > > cores) > > > > > > > array, e.g. for secondary processes, or can it be continuous > > from > > > > 0 > > > > > > to > > > > > > > the number of specified lcores? > > > > > > > > > > > > > > And are new EAL parameters for this really beneficial? > > Doesn't > > > > e.g. > > > > > > "-l > > > > > > > 0-9@130-139,100@140" suffice? > > > > > > > > > > > > > Actually, I believe "0-9@130-139"[1] is not the same as > > > > > > "0@130,1@131,2@132,...". The latter affinities one thread to > > one > > > > core > > > > > > ten > > > > > > times over, while the former affinitizes 10 threads to 10 cores > > - > > > > > > leaving > > > > > > those threads free to move about within the 10 cores specified. > > > > > > > > > > Interesting. The documentation [GSG] isn't clear to me about > > this; a > > > > example there could help clarify. > > > > > > > > > > [GSG]: > > > > > > https://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html#lcore- > > > > related-options > > > > > > > > > > > > > Yep, agreed. > > > > > > > > > If users are manually passing lcore parameters to the EAL, then I > > see > > > > why some sort of remapping shorthand is useful. > > > > > IMO, if the mappings are relatively exotic, it should be > > acceptable > > > > requiring an external script to build a long list of mapping > > parameters > > > > and then invoke the application with those script-generated EAL > > > > parameters. > > > > > This would reduce the scope to support relatively simple, common > > > > mappings. > > > > > > > > > > Could we expand the --lcores syntax to support common mappings? > > > > > > > > > > E.g. "0-9@130+" to do what I thought. > > > > > The lack of "()" treats the entries individually (not as a > > group), > > > > and the "+" indicates auto-increment. > > > > > > > > > > A more advanced example: > > > > > "0-9@(130-131)+", meaning lcore 0 gets cpus 130-131, lcore 1 gets > > > > cpus 132-133, etc. > > > > > > > > > > > > > My issues with the above syntax idea is: > > > > * I think it's overly complicating the lcores parameter adding in > > the > > > > support for the "+" symbol - especially in the advanced case > > example > > > > you > > > > provide. I worry about code maintainability here. > > > > * More significantly for me, I think it's also getting things > > backwards > > > > in > > > > that it is focusing more on the lcore ids visible to the app, > > rather > > > > than > > > > the physical cores to be used. For the example above of 0-9@130+, > > > > what I > > > > would expect is that the user is mainly thinking about the cores > > he > > > > wants > > > > to use 130-139, which are then to be mapped to lower ids. If we > > had > > > > the > > > > syntax reversed where the physical cores were first, I'd say it > > would > > > > make more sense, e.g. 130-139@0+ > > > > > > I 100 % agree on the syntax being backwards. > > > A good reason for introducing a new parameter, rather than expanding > > on "--lcores". > > > > > > > Yes and no. > > I agree with not expanding on --lcores, but I also don't think any new > > parameter added should attempt to reproduce all of what lcores does. I > > would leave --lcores as-is, as the power-tool for lcore config e.g. > > what > > you talk about below for mapping multiple lcores to the same physical > > cpu. > > > > > We should consider deprecating the old (backwards) syntax, so users > > don’t get confused about one EAL parameter being "backwards" of the > > other. > > > > > > > I disagree with this. I would be ok with deprecating the old "-c" > > coremask > > syntax - I think the time is long past when we should be dealing with > > masks. However, removing "-l" and "--lcores" flag is, to me anyway, too > > big > > and jarring a change for end users for us to consider. > > > > > > * finally, as I found out last month, there are systems on which > > the > > > > cores > > > > are spread across numa-nodes on odd/even boundaries, so to have > > an > > > > app > > > > running on socket 0, you need to give the core ids as a list i.e. > > > > 0,2,4,6, and cannot use ranges. [This also reenforces the point > > above > > > > too, where again it's the internal ids we need to generalize, not > > the > > > > physical cpus] > > > > > > For this, we could use "/2" (like in subnets), or "+2" as increment > > parameter. > > > > > > > > > > > My thinking on the matter, and I'm happy to be corrected if I'm > > wrong > > > > here, > > > > is that end-users are unlikely to significantly care what the > > actual > > > > lcore > > > > ids are internally in the program, so long as they work and are > > unique. > > > > > > Generally yes. > > > Note that we should allow the same physical CPU core to be assigned > > to multiple lcores... > > > If a CPU core is shared between multiple worker lcores, then it could > > be problematic, but with a mix of lcore roles, this might be handy for > > some applications. E.g. a virtual machine with only one CPU core could > > use that single CPU core as both main and worker lcore, or as main and > > service lcore. > > > Sharing an lcore between threads requires the developer to take > > special care of e.g. spinlocks, but the EAL parameter parser should not > > prohibit it. It might log a notice, though. > > > > > > > This is already taken care of via --lcores, so I don't see the need to > > reimplement it using a new flag also. Any new flags we add should be > > kept > > deliberately simple. > > > > > > What does matter is the physical cpus on which the code is to run. > > > > Therefore, my approach to this is to find the simplest possible > > > > solution > > > > whereby the user can just provide a list of cores and tell EAL to > > just > > > > map > > > > them to reasonable values. For the "reasonable" values, I would > > imagine > > > > that for the vast majority of cases starting "0" is what is wanted. > > For > > > > anything beyond that, we already have the existing --lcores syntax > > to > > > > be > > > > used. > > > > > > Agree with all of the above. :-) > > > > > > > Great. That still leaves us with the problem of what exactly to do as > > the > > best solution. Here are some alternatives that I see: > > > > 1. Add a modifier flag for -l and -c parameters to auto-remap the lcore > > ids > > to zero, so user is just specifying physical CPU id's. > > 2. Add a new flag for specifying physical cpu ids, which auto-remaps > > the > > cores specified to zero. > > 2a. To simplify our code and user experience we could at the same > > time: > > * deprecate "-c" flag for coremasks > > * make "-l" and "--lcores" the same flag just in long and short > > versions. This should not break anything as "-l" parameter is > > just as subset of what "--lcores" provides. > > * that would leave us with effectively two core flag paths: > > - -l/--lcores, behaviour as now, full explicit control > > - -L/--lcores-remapped, takes simplified core list (only "," > > and > > "-" supported as with "-l" now), and maps them to zero- > > based. > > +1 for 2a. > > > > > Third options? Any other feedback? > > If the internal lcore ids are effectively hidden by passing physical CPU ids, > any related options should take physical CPU ids too. I.e. the options for > choosing main lcore and service lcores also need variants with other names.
Yes, I suppose so. That may be tricky, but we'll have to see what we can do. > The service cores are passed as a core mask; this should be deprecated (like > "-c"). And superseded by a parameter taking a core list. > Yes to deprecate -s. I believe -S already exists to take a service core list, so that can be marked off the list. /Bruce