On 26.11.2018 17:21, Burakov, Anatoly wrote:
> On 26-Nov-18 2:10 PM, Ilya Maximets wrote:
>> On 26.11.2018 16:42, Burakov, Anatoly wrote:
>>> On 26-Nov-18 1:20 PM, Ilya Maximets wrote:
>>>> On 26.11.2018 16:16, Ilya Maximets wrote:
>>>>> On 26.11.2018 15:50, Burakov, Anatoly wrote:
>>>>>> On 26-Nov-18 11:43 AM, Burakov, Anatoly wrote:
>>>>>>> On 26-Nov-18 11:33 AM, Asaf Sinai wrote:
>>>>>>>> Hi Anatoly,
>>>>>>>>
>>>>>>>> We did not check it with "testpmd", only with our application.
>>>>>>>> From the beginning, we did not enable this configuration (look at 
>>>>>>>> the attached files), and everything works fine.
>>>>>>>> Of course, we rebuild DPDK whenever we change the configuration.
>>>>>>>> Please note that we use DPDK 17.11.3, maybe this is why it works fine?
>>>>>>>
>>>>>>> Just tested with DPDK 17.11, and yes, it does work the way you are 
>>>>>>> describing. This is not intended behavior. I will look into it.
>>>>>>>
>>>>>>
>>>>>> +CC author of commit introducing CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES.
>>>>>>
>>>>>> Looking at the code, i think this config option needs to be reworked and 
>>>>>> we should clarify what we mean by this option. It appears that i've 
>>>>>> misunderstood what this option was actually intended to do, and i also think 
>>>>>> its naming could be improved because it's confusing and misleading.
>>>>>>
>>>>>> In 17.11, this option does *not* prevent EAL from using NUMA - it merely 
>>>>>> disables using libnuma to perform memory allocation. This looks like 
>>>>>> intended (if counter-intuitive) behavior - disabling this option will 
>>>>>> simply revert DPDK to working as it did before this option was 
>>>>>> introduced (i.e. best-effort allocation). This is why your code still 
>>>>>> works - because EAL still does allocate memory on socket 1, and *knows* 
>>>>>> that it's socket 1 memory. It still supports NUMA.
>>>>>>
>>>>>> The commit message for these changes states that the actual purpose of 
>>>>>> this option is to enable "balanced" hugepage allocation. In case of 
>>>>>> cgroups limitations, previously, DPDK would've exhausted all hugepages 
>>>>>> on master core's socket before attempting to allocate from other 
>>>>>> sockets, but by the time we've reached cgroups limits on numbers of 
>>>>>> hugepages, we might not have reached socket 1 and thus missed out on the 
>>>>>> pages we could've allocated, but didn't. Using libnuma solves this 
>>>>>> issue, because now we can allocate pages on sockets we want, instead of 
>>>>>> hoping we won't run out of hugepages before we get the memory we need.
>>>>>>
>>>>>> In 18.05 onwards, this option works differently (and arguably wrong). 
>>>>>> More specifically, it disallows allocations on sockets other than 0, and 
>>>>>> it also makes it so that EAL does not check which socket the memory 
>>>>>> *actually* came from. So, not only is allocating memory from socket 1 
>>>>>> disabled, but allocating from socket 0 may even get you memory from 
>>>>>> socket 1!
>>>>>
>>>>> I'd consider this as a bug.
>>>>>
>>>>>>
>>>>>> +CC Thomas
>>>>>>
>>>>>> The CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES option is a misnomer, because it 
>>>>>> makes it seem like this option disables NUMA support, which is not the 
>>>>>> case.
>>>>>>
>>>>>> I would also argue that it is not relevant to 18.05+ memory subsystem, 
>>>>>> and should only work in legacy mode, because it is *impossible* to make 
>>>>>> it work right in the new memory subsystem, and here's why:
>>>>>>
>>>>>> Without libnuma, we have no way of "asking" the kernel to allocate a 
>>>>>> hugepage on a specific socket - instead, any allocation will most likely 
>>>>>> happen on the socket from which the allocation request came. For example, 
>>>>>> if the user program's lcore is on socket 1, an allocation for socket 0 
>>>>>> will actually allocate a page on socket 1.
>>>>>>
>>>>>> If we don't check for page's NUMA node affinity (which is what currently 
>>>>>> happens) - we get performance degradation because we may unintentionally 
>>>>>> allocate memory on wrong NUMA node. If we do check for this - then 
>>>>>> allocation of memory on socket 1 from lcore on socket 0 will almost 
>>>>>> never succeed, because kernel will always give us pages on socket 0.
>>>>>>
>>>>>> Put simply, there is no sane way to make this option work for the new 
>>>>>> memory subsystem - IMO it should be dropped, and libnuma should be made 
>>>>>> a hard dependency on Linux.
>>>>>
>>>>> I agree that the new memory model cannot work without libnuma, i.e. it will
>>>>> lead to unpredictable memory allocations with no respect for the requested
>>>>> socket_ids. I also agree that CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES is only
>>>>> sane for the legacy memory model.
>>>>> It looks like we have no other choice than to drop the option and make
>>>>> the code unconditional, i.e. have a hard dependency on libnuma.
>>>>>
>>>>
>>>> We could probably compile this code and have a hard dependency only for
>>>> platforms with 'RTE_MAX_NUMA_NODES > 1'.
>>>
>>> Well, as long as legacy mode stays supported, we have to keep the option. 
>>> The "drop" part was referring to supporting it under the new memory system, 
>>> not a literal drop from config files.
>>
>> The option was introduced because we didn't want to introduce a
>> new hard dependency. Since we'll have it anyway, I'm not sure that
>> keeping the option for legacy mode makes any sense.
> 
> Oh yes, you're right. Drop it is!
> 
>>
>>>
>>> As for using RTE_MAX_NUMA_NODES, i don't think it's merited. Distributions 
>>> cannot deliver different DPDK versions based on the number of sockets on a 
>>> particular machine - so it would have to be a hard dependency for 
>>> distributions anyway (does any distribution ship DPDK without libnuma?).
>>
>> At least ARMv7 builds commonly do not ship the libnuma package.
> 
> Do you mean libnuma builds for ARMv7 are not available? Or do you mean the 
> libnuma package is not installed by default?
> 
> If it's the latter, then i believe it's not installed by default anywhere, 
> but if using distribution version of DPDK, libnuma will be taken care of via 
> package manager. Presumably building from source can be taken care of with 
> pkg-config/meson.
> 
> Or do you mean ARMv7 does not have libnuma for their arch at all, in any 
> distro?

libnuma builds for ARMv7 are not available in most of the distros. I didn't
check all of them, but here are the results for Ubuntu:
    
https://packages.ubuntu.com/search?suite=bionic&arch=armhf&searchon=names&keywords=libnuma

As you can see, Ubuntu 18.04 (bionic) has no libnuma package for the 'armhf'
and 'powerpc' platforms.
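
To make the allocation problem from the quoted discussion concrete, here is a toy model of "best-effort" versus "balanced" hugepage allocation under a cgroup limit. It is only a sketch under stated assumptions: `struct sim`, `alloc_best_effort()` and `alloc_balanced()` are hypothetical names made up for illustration, not DPDK or libnuma APIs, and the model ignores everything the real EAL does apart from counting pages per socket.

```c
#include <assert.h>

#define NUM_SOCKETS 2

/* Toy model only: none of these names exist in DPDK or libnuma. */
struct sim {
    int free_pages[NUM_SOCKETS]; /* hugepages currently free on each socket */
    int cgroup_limit;            /* max pages the process may map in total */
};

/*
 * Best-effort (no libnuma): pages come from the master core's socket
 * until it is exhausted, then from the next socket. Allocation stops
 * as soon as the cgroup limit is hit, wherever we happen to be.
 * 's' is passed by value, so the caller's state is not modified.
 */
static void alloc_best_effort(struct sim s, int master, int got[NUM_SOCKETS])
{
    int mapped = 0;

    assert(master >= 0 && master < NUM_SOCKETS);
    for (int i = 0; i < NUM_SOCKETS; i++)
        got[i] = 0;

    for (int i = 0; i < NUM_SOCKETS; i++) {
        int sock = (master + i) % NUM_SOCKETS;

        while (s.free_pages[sock] > 0 && mapped < s.cgroup_limit) {
            s.free_pages[sock]--;
            got[sock]++;
            mapped++;
        }
    }
}

/*
 * Balanced (with libnuma): each socket is asked for exactly the number
 * of pages wanted there, so the cgroup budget is spread as requested.
 */
static void alloc_balanced(struct sim s, const int want[NUM_SOCKETS],
                           int got[NUM_SOCKETS])
{
    int mapped = 0;

    for (int sock = 0; sock < NUM_SOCKETS; sock++) {
        got[sock] = 0;
        while (got[sock] < want[sock] && s.free_pages[sock] > 0 &&
               mapped < s.cgroup_limit) {
            s.free_pages[sock]--;
            got[sock]++;
            mapped++;
        }
    }
}
```

With 8 free pages per socket, a cgroup limit of 4 and a request for 2 pages on each socket, the best-effort path ends up with 4 pages on socket 0 and none on socket 1, while the balanced path gets 2 pages on each socket.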

> 
>>
>>>
>>> For those compiling from source - are there any supported distributions 
>>> which don't package libnuma? I don't see much sense in keeping libnuma 
>>> optional, IMO. This is of course up to the tech board to decide, but 
>>> the "without libnuma it's basically broken" argument is very strong in my 
>>> opinion :)
>>>
>>
> 
> 
