On 12/19/22 10:55, David Hildenbrand wrote:
> On 16.12.22 14:47, Michal Prívozník wrote:
>> On 12/16/22 14:41, David Hildenbrand wrote:
>>> On 15.12.22 10:55, Michal Privoznik wrote:
>>>> If a memory-backend is configured with mode
>>>> HOST_MEM_POLICY_PREFERRED then
>>>> host_memory_backend_memory_complete() calls mbind() as:
>>>>
>>>>    mbind(..., MPOL_PREFERRED, nodemask, ...);
>>>>
>>>> Here, 'nodemask' is a bitmap of host NUMA nodes and corresponds
>>>> to the .host-nodes attribute. Therefore, there can be multiple
>>>> nodes specified. However, the documentation to MPOL_PREFERRED
>>>> says:
>>>>
>>>>    MPOL_PREFERRED
>>>>      This mode sets the preferred node for allocation. ...
>>>>      If nodemask specifies more than one node ID, the first node
>>>>      in the mask will be selected as the preferred node.
>>>>
>>>> Therefore, only the first node is honored and the rest is
>>>> silently ignored. Well, with recent changes to the kernel and
>>>> numactl we can do better.
>>>>
>>>> The Linux kernel added in v5.15 via commit cfcaa66f8032
>>>> ("mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY")
>>>> support for MPOL_PREFERRED_MANY, which accepts multiple preferred
>>>> NUMA nodes instead.
>>>>
>>>> Then, numa_has_preferred_many() API was introduced to numactl
>>>> (v2.0.15~26) allowing applications to query kernel support.
>>>>
>>>> Wiring this all together, we can pass MPOL_PREFERRED_MANY to the
>>>> mbind() call instead and stop ignoring multiple nodes, silently.
>>>>
>>>> Signed-off-by: Michal Privoznik <mpriv...@redhat.com>
>>>> ---
>>>
>>> [...]
>>>
>>>> +#ifdef HAVE_NUMA_SET_PREFERRED_MANY
>
> That should be HAVE_NUMA_HAS_PREFERRED_MANY, right?
>

Oops, yes. Do you want me to send v3?

Michal
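
For reference, the mode selection described in the quoted commit message could look roughly like the sketch below. This is only an illustration under stated assumptions, not the code from the patch: pick_preferred_mode() and the SKETCH_* constants are hypothetical names, the constants are defined locally (mirroring the values in linux/mempolicy.h) so the snippet builds without libnuma headers, and the result of the runtime probe (numa_has_preferred_many() in real code) is passed in as a plain boolean. The guard uses the corrected macro name, HAVE_NUMA_HAS_PREFERRED_MANY, which a build system would normally define.

```c
#include <stdbool.h>

/* Normally set by the build system when libnuma provides
 * numa_has_preferred_many(); defined here so the sketch is self-contained. */
#define HAVE_NUMA_HAS_PREFERRED_MANY 1

/* Hypothetical local copies of the kernel policy values
 * (assumed to match linux/mempolicy.h). */
enum {
    SKETCH_MPOL_PREFERRED      = 1,
    SKETCH_MPOL_PREFERRED_MANY = 5,
};

/*
 * Return the policy to hand to mbind() for a "preferred" backend.
 * kernel_has_preferred_many stands in for the runtime probe
 * (numa_has_preferred_many() > 0 in real code).
 */
static int pick_preferred_mode(bool kernel_has_preferred_many)
{
#ifdef HAVE_NUMA_HAS_PREFERRED_MANY
    if (kernel_has_preferred_many) {
        /* All nodes in the mask are honored instead of only the first. */
        return SKETCH_MPOL_PREFERRED_MANY;
    }
#else
    (void)kernel_has_preferred_many;
#endif
    /* Fallback: the kernel picks the first node in the mask. */
    return SKETCH_MPOL_PREFERRED;
}
```

With a misspelled guard such as HAVE_NUMA_SET_PREFERRED_MANY, the #ifdef branch would simply never be compiled in, so the function would always fall back to the single-node policy.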