Hi Konstantin, Morten,

> -----Original Message-----
> From: Konstantin Ananyev <konstantin.anan...@huawei.com>
> Sent: Tuesday, November 7, 2023 8:03 PM
> To: Morten Brørup <m...@smartsharesystems.com>; Thomas Monjalon
> <tho...@monjalon.net>; Kevin Traynor <ktray...@redhat.com>; Tummala,
> Sivaprasad <sivaprasad.tumm...@amd.com>; David Marchand
> <david.march...@redhat.com>; Yigit, Ferruh <ferruh.yi...@amd.com>;
> bruce.richard...@intel.com; konstantin.v.anan...@yandex.ru;
> maxime.coque...@redhat.com; Aaron Conole <acon...@redhat.com>
> Cc: dev@dpdk.org
> Subject: RE: [PATCH] config/x86: config support for AMD EPYC processors
>
> > > > > > > >> From: Tummala, Sivaprasad <sivaprasad.tumm...@amd.com>
> > > > > > > >>> From: David Marchand <david.march...@redhat.com> On Mon,
> > > > > > > >>> Sep 25, 2023 at 5:11 PM Sivaprasad Tummala
> > > > > > > >>>> From: Sivaprasad Tummala <sivaprasad.tumm...@amd.com>
> > > > > > > >>>>
> > > > > > > >>>> By default, max lcores are limited to 128 for x86 platforms.
> > > > > > > >>>> On AMD EPYC processors, this limit needs to be increased to
> > > > > > > >>>> leverage all the cores.
> > > > > > > >>>>
> > > > > > > >>>> The patch adjusts the limit specifically for native compilation on
> > > > > > > >>>> AMD EPYC CPUs.
> > > > > > > >>>>
> > > > > > > >>>> Signed-off-by: Sivaprasad Tummala <sivaprasad.tumm...@amd.com>
> > > > > > > >>>
> > > > > > > >>> This patch is a revamp of
> > > > > > > >>> http://inbox.dpdk.org/dev/BY5PR12MB3681C3FC6676BC03F0B42CCC96789@BY5PR12MB3681.namprd12.prod.outlook.com/
> > > > > > > >>> for which a discussion at techboard is supposed to have taken place.
> > > > > > > >>> But I didn't find a trace of it.
> > > > > > > >>>
> > > > > > > >>> One option that had been discussed in the previous thread was to
> > > > > > > >>> increase the max number of cores for x86.
> > > > > > > >>> I am unclear if this option has been properly evaluated/debated.
> > > > > >
> > > > > > Here are the minutes from the previous techboard discussions:
> > > > > > [1]: http://inbox.dpdk.org/dev/YZ43U36bFWHYClAi@platinum/
> > > > > > [2]: http://inbox.dpdk.org/dev/20211202112506.68acaa1a@hermes.local/
> > > > > >
> > > > > > AFAIK, there has been no progress with dynamic max_lcores, so I guess
> > > > > > the techboard's conclusion still stands:
> > > > > >
> > > > > > There is no identified use-case where a single application requires
> > > > > > more than 128 lcores. If a use-case exists for a single application
> > > > > > that uses more than 128 lcores, the TB is ok to update the default
> > > > > > config value.
> > > > > >
> > > > > > > >>>
> > > > > > > >>> Can the topic be brought again at techboard?
> > > > > > > >>
> > > > > > > >> Hi David,
> > > > > > > >>
> > > > > > > >> The patch is intended to detect AMD platforms and enable all CPU
> > > > > > > >> cores by default on native builds.
> > > > > >
> > > > > > This is done on native ARM builds, so why not on native X86 builds too?
> > > > > >
> > > > > > > >>
> > > > > > > >> As an optimization for memory footprint, users can override this by
> > > > > > > >> specifying the "-Dmax_lcores" option based on the DPDK lcores
> > > > > > > >> required for their use cases.
> > > > > > > >>
> > > > > > > >> Sure, will request to add this topic for discussion at techboard.
> > > > >
> > > > > This is the summary of the techboard meeting:
> > > > > (see https://mails.dpdk.org/archives/dev/2023-October/279672.html)
> > > > >
> > > > > - There are some asks for more than 128 worker cores
> > > > > - Discussion about generally increasing the default max core count and
> > > > >   trade-offs with memory consumption, but this is a longer-term issue
> > > >
> > > > The distros are currently satisfied with the 128 cores default, so the
> > > > decision here was: Leave the 128 cores default as is, for now.
> > > >
> > > > Any long term improvements regarding memory consumption of many-core
> > > > systems are not relevant for this patch.
> > > >
> > > > > - Acceptance for the direction of this patch in the short term
> > > >
> > > > With the twist that it must work for cross compile. It is the
> > > > properties of the target CPU that matter, not the properties of the
> > > > host CPU. (Although the build may be "native", i.e. the target CPU is
> > > > the same as the host CPU, it is still the target CPU that matters.)
> > > >
> > > > > - Details of whether it should be for EPYC only or x86 to be figured
> > > > >   out on mailing list
> > > >
> > > > I think this is obvious...
> > > >
> > > > ARM already provides ARM CPU specific optimizations.
> > > > AMD should be allowed to provide AMD CPU specific optimizations too.
> > > > Intel can also provide Intel CPU specific optimizations.
> > >
> > > I suppose no one is stopping AMD/Intel/ARM from providing their CPU
> > > specific optimizations.
> > > Though as end-user, my preference would be to have one generic build
> > > (machine=default) that would work ok on all cpus for a given
> > > architecture (let's say x86) instead of maintaining/testing dozens of
> > > different flavors.
> >
> > Agree. Machine specific builds should be explicitly specified. I consider
> > "native" a variant of explicitly specifying the target machine.
> >
> > > I suppose for 23.11 we have not much choice but to accept that patch
> > > as it is.
> >
> > No. They agreed in the techboard meeting to rework it for cross compile.
>
> Ah, yes, cross-builds, nearly forgot about them.
> I suppose yes, you are right, it needs to be supported for completeness.
>

Yes, the patch currently supports max lcores selection only for native
builds; cross-compilation keeps working as it does today. Once this patch is
merged in this release, we plan to extend it to cross builds in the coming
releases.

> >
> > > Though I think in future (24.11?) it would be ideal to make
> > > RTE_MAX_LCORE a runtime parameter and remove it from public API
> > > structs.
> >
> > It is probably more difficult than it sounds to fully remove RTE_MAX_LCORE.
>
> I am not saying it is an easy one; otherwise it would probably already be
> done.
>
> > However, if we could make it mostly a runtime parameter, and only keep
> > RTE_MAX_LCORE in arrays of small structures, we could increase it to
> > some very large (but still sane) value, like 4096 or whatever number of
> > CPU cores can be found in the largest system out there.
> >
> > > >
> > > > And if some of these optimizations are rooted in the same criteria,
> > > > they should be shared across the relevant CPU architectures. We
> > > > follow this principle in the source code files, and the principle
> > > > also applies to the build files.
> > > >
> > > > >
> > > > > So now let's figure out the details please.
> > > > > Suggestions?
> > > >
> > > > Suggestions provided inline above. :-)
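
For reference, the memory-footprint trade-off mentioned in the thread (128 vs. a much larger compile-time maximum such as 4096) can be estimated with a small model. This is only a rough sketch, not DPDK code: the function names, the 64-byte cache line, the assumption that each per-lcore element is padded to a cache line, and the two example structure sizes are all hypothetical and chosen for illustration.

```python
# Illustrative model only; not DPDK code.
# All names, the cache line size, and the structure sizes are assumptions.

CACHE_LINE = 64  # assumed cache line size in bytes


def padded_size(struct_bytes: int, align: int = CACHE_LINE) -> int:
    """Round a per-lcore element up to the next multiple of `align`."""
    return -(-struct_bytes // align) * align


def static_array_bytes(max_lcores: int, struct_bytes: int) -> int:
    """Footprint of one static array holding `max_lcores` padded elements."""
    return max_lcores * padded_size(struct_bytes)


if __name__ == "__main__":
    # 128 is the current x86 default; 4096 is the "very large" value
    # floated in the thread.
    for max_lcores in (128, 4096):
        small = static_array_bytes(max_lcores, 8)      # hypothetical 8-byte counter
        large = static_array_bytes(max_lcores, 16384)  # hypothetical 16 KiB per-lcore buffer
        print(f"max_lcores={max_lcores:5d}  small={small:8d} B  large={large:10d} B")
```

Under these assumptions, an array of small elements grows from 8 KiB to 256 KiB when the maximum goes from 128 to 4096, while an array of 16 KiB elements grows from 2 MiB to 64 MiB. That is consistent with the suggestion above to keep RTE_MAX_LCORE only in arrays of small structures if the limit is raised.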