On 26.11.2018 17:21, Burakov, Anatoly wrote:
> On 26-Nov-18 2:10 PM, Ilya Maximets wrote:
>> On 26.11.2018 16:42, Burakov, Anatoly wrote:
>>> On 26-Nov-18 1:20 PM, Ilya Maximets wrote:
>>>> On 26.11.2018 16:16, Ilya Maximets wrote:
>>>>> On 26.11.2018 15:50, Burakov, Anatoly wrote:
>>>>>> On 26-Nov-18 11:43 AM, Burakov, Anatoly wrote:
>>>>>>> On 26-Nov-18 11:33 AM, Asaf Sinai wrote:
>>>>>>>> Hi Anatoly,
>>>>>>>>
>>>>>>>> We did not check it with "testpmd", only with our application.
>>>>>>>> From the beginning, we did not enable this configuration (look at
>>>>>>>> attached files), and everything works fine.
>>>>>>>> Of course we rebuild DPDK, when we change configuration.
>>>>>>>> Please note that we use DPDK 17.11.3, maybe this is why it works fine?
>>>>>>>
>>>>>>> Just tested with DPDK 17.11, and yes, it does work the way you are
>>>>>>> describing. This is not intended behavior. I will look into it.
>>>>>>>
>>>>>>
>>>>>> +CC author of commit introducing CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES.
>>>>>>
>>>>>> Looking at the code, i think this config option needs to be reworked and
>>>>>> we should clarify what we mean by this option. It appears that i've
>>>>>> misunderstood what this option actually intended to do, and i also think
>>>>>> it's naming could be improved because it's confusing and misleading.
>>>>>>
>>>>>> In 17.11, this option does *not* prevent EAL from using NUMA - it merely
>>>>>> disables using libnuma to perform memory allocation. This looks like
>>>>>> intended (if counter-intuitive) behavior - disabling this option will
>>>>>> simply revert DPDK to working as it did before this option was
>>>>>> introduced (i.e. best-effort allocation). This is why your code still
>>>>>> works - because EAL still does allocate memory on socket 1, and *knows*
>>>>>> that it's socket 1 memory. It still supports NUMA.
>>>>>>
>>>>>> The commit message for these changes states that the actual purpose of
>>>>>> this option is to enable "balanced" hugepage allocation.
>>>>>> In case of cgroups limitations, previously, DPDK would've exhausted
>>>>>> all hugepages on master core's socket before attempting to allocate
>>>>>> from other sockets, but by the time we've reached cgroups limits on
>>>>>> numbers of hugepages, we might not have reached socket 1 and thus
>>>>>> missed out on the pages we could've allocated, but didn't. Using
>>>>>> libnuma solves this issue, because now we can allocate pages on
>>>>>> sockets we want, instead of hoping we won't run out of hugepages
>>>>>> before we get the memory we need.
>>>>>>
>>>>>> In 18.05 onwards, this option works differently (and arguably wrong).
>>>>>> More specifically, it disallows allocations on sockets other than 0, and
>>>>>> it also makes it so that EAL does not check which socket the memory
>>>>>> *actually* came from. So, not only allocating memory from socket 1 is
>>>>>> disabled, but allocating from socket 0 may even get you memory from
>>>>>> socket 1!
>>>>>
>>>>> I'd consider this as a bug.
>>>>>
>>>>>>
>>>>>> +CC Thomas
>>>>>>
>>>>>> The CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES option is a misnomer, because it
>>>>>> makes it seem like this option disables NUMA support, which is not the
>>>>>> case.
>>>>>>
>>>>>> I would also argue that it is not relevant to 18.05+ memory subsystem,
>>>>>> and should only work in legacy mode, because it is *impossible* to make
>>>>>> it work right in the new memory subsystem, and here's why:
>>>>>>
>>>>>> Without libnuma, we have no way of "asking" the kernel to allocate a
>>>>>> hugepage on a specific socket - instead, any allocation will most likely
>>>>>> happen on socket from which the allocation came from. For example, if
>>>>>> user program's lcore is on socket 1, allocation on socket 0 will
>>>>>> actually allocate a page on socket 1.
>>>>>>
>>>>>> If we don't check for page's NUMA node affinity (which is what currently
>>>>>> happens) - we get performance degradation because we may unintentionally
>>>>>> allocate memory on wrong NUMA node. If we do check for this - then
>>>>>> allocation of memory on socket 1 from lcore on socket 0 will almost
>>>>>> never succeed, because kernel will always give us pages on socket 0.
>>>>>>
>>>>>> Put it simply, there is no sane way to make this option work for the new
>>>>>> memory subsystem - IMO it should be dropped, and libnuma should be made
>>>>>> a hard dependency on Linux.
>>>>>
>>>>> I agree that new memory model could not work without libnuma, i.e. will
>>>>> lead to unpredictable memory allocations with no any respect to requested
>>>>> socket_id's. I also agree that CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES is only
>>>>> sane for a legacy memory model.
>>>>> It looks like we have no other choice than just drop the option and make
>>>>> the code unconditional, i.e. have hard dependency on libnuma.
>>>>>
>>>>
>>>> We, probably, could compile this code and have hard dependency only for
>>>> platforms with 'RTE_MAX_NUMA_NODES > 1'.
>>>
>>> Well, as long as legacy mode stays supported, we have to keep the option.
>>> The "drop" part was referring to supporting it under the new memory system,
>>> not a literal drop from config files.
>>
>> The option was introduced because we didn't want to introduce the
>> new hard dependency. Since we'll have it anyway, I'm not sure if
>> keeping the option for legacy mode makes any sense.
>
> Oh yes, you're right. Drop it is!
>
>>
>>>
>>> As for using RTE_MAX_NUMA_NODES, i don't think it's merited. Distributions
>>> cannot deliver different DPDK versions based on the number of sockets on a
>>> particular machine - so it would have to be a hard dependency for
>>> distributions anyway (does any distribution ship DPDK without libnuma?).
>>
>> At least ARMv7 builds commonly does not ship libnuma package.
>
> Do you mean libnuma builds for ARMv7 are not available? Or do you mean the
> libnuma package is not installed by default?
>
> If it's the latter, then i believe it's not installed by default anywhere,
> but if using distribution version of DPDK, libnuma will be taken care of via
> package manager. Presumably building from source can be taken care of with
> pkg-config/meson.
>
> Or do you mean ARMv7 does not have libnuma for their arch at all, in any
> distro?

libnuma builds for ARMv7 are not available in most of the distros.
I didn't check all of them, but here are the results for Ubuntu:
https://packages.ubuntu.com/search?suite=bionic&arch=armhf&searchon=names&keywords=libnuma

You may see that Ubuntu 18.04 (bionic) has no libnuma package for
the 'armhf' and 'powerpc' platforms.

>
>>
>>>
>>> For those compiling from source - are there any supported distributions
>>> which don't package libnuma? I don't see much sense in keeping libnuma
>>> optional, IMO. This is of course up to the tech board to decide, but IMO
>>> the "without libnuma it's basically broken" argument is very strong in my
>>> opinion :)
>>>
>>
>
>
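
For reference, the trade-off described in the quoted text (check page
affinity or not, with or without libnuma) can be illustrated with a toy
model. This is only a sketch under stated assumptions: `allocate_page` and
its parameters are hypothetical names, not DPDK or libnuma API, and the rule
that the kernel always places the page on the caller's node is a
simplification of the "most likely" behaviour described in the thread.

```python
from typing import Optional


def allocate_page(requested_socket: int,
                  caller_socket: int,
                  have_libnuma: bool,
                  check_affinity: bool) -> Optional[int]:
    """Toy model: return the socket the page really lands on,
    or None if the allocation is rejected.

    Hypothetical helper for illustration only, not real EAL code.
    """
    # Assumption: with libnuma we can ask for a specific node; without it
    # the page lands on the node of the lcore doing the allocation.
    actual_socket = requested_socket if have_libnuma else caller_socket

    # If the allocator verifies where the page came from, a mismatch
    # is treated as a failed allocation.
    if check_affinity and actual_socket != requested_socket:
        return None

    # Otherwise the page is accepted, even if it is on the wrong node.
    return actual_socket


if __name__ == "__main__":
    # No libnuma, no check: asked for socket 0 from socket 1, got socket 1.
    print(allocate_page(0, 1, have_libnuma=False, check_affinity=False))
    # No libnuma, with check: remote allocation is rejected.
    print(allocate_page(1, 0, have_libnuma=False, check_affinity=True))
    # With libnuma: the requested socket is honoured.
    print(allocate_page(1, 0, have_libnuma=True, check_affinity=True))
```

In this model the only combination that both honours the requested socket
and succeeds for remote sockets is the one with libnuma, which matches the
argument for making it a hard dependency.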