----- Original Message -----
> On 08/20/2013 09:41 PM, Andrew Jones wrote:
> >> +
> >> +        /* This is a workaround for a long standing bug in Linux'
> >> +         * mbind implementation, which cuts off the last specified
> >> +         * node. To stay compatible should this bug be fixed, we
> >> +         * specify one more node and zero this one out.
> >> +         */
> >> +        clear_bit(numa_num_configured_nodes() + 1, numa_info[i].host_mem);
> >> +        if (mbind(ram_ptr + ram_offset, len, bind_mode,
> >> +            numa_info[i].host_mem, numa_num_configured_nodes() + 1, 0)) {
> >> +            perror("mbind");
> >> +            return -1;
> >> +        }
> >
> > From my quick read of this patch series, I think these two calls of
> > numa_num_configured_nodes() are the only places that libnuma is used.
> > Is it really worth the new dependency? Actually libnuma will only calculate
> > what it returns from numa_num_configured_nodes() once, because it simply
> > counts bits in a bitmask that it only initializes at library load time. So
> > it would be more robust wrt to node onlining/offlining to avoid libnuma and
> > to just fetch information from sysfs as needed anyway. In this particular
> > code though, I think replacing numa_num_configured_nodes() with a maxnode,
> > where
> >
> > unsigned long maxnode = find_last_bit(numa_info[i].host_mem,
> >                                       MAX_CPUMASK_BITS)
>
> Sorry I can't understand this since numa_numa_configured_nodes() is for host,
> but why could we find the last bit of guest setting to replace it?
>
You're not using numa_num_configured_nodes() to index _the_ host's nodemask;
you're using it to find the highest possible bit set in _a_ nodemask,
numa_info[i].host_mem. mbind doesn't need its 'maxnode' param to be the
highest possible host node bit, but rather just the highest possible bit set
in the nodemask passed to it. find_last_bit will find that bit. You still
need to add 1 to it, as you do with numa_num_configured_nodes(), because the
kernel erroneously decrements it by one, as you've pointed out in your
comment.

drew
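
To illustrate the point about find_last_bit, here is a minimal sketch of how
the highest set bit depends only on the nodemask being passed, not on how
many nodes the host has. The function last_set_bit() is a hypothetical
stand-in written for this example, not the find_last_bit() from the tree; it
assumes the same convention of returning the bit index, or the mask size when
no bit is set. The extra increment for the mbind workaround is deliberately
left out.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical stand-in for find_last_bit(): scan the bitmap from the top
 * and return the zero-based index of the highest set bit. If no bit is set,
 * return nbits (assumed convention, mirroring the helper named in the mail).
 */
static unsigned long last_set_bit(const unsigned long *mask,
                                  unsigned long nbits)
{
    assert(mask != NULL);

    for (unsigned long i = nbits; i-- > 0; ) {
        if (mask[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG))) {
            return i;
        }
    }
    return nbits; /* empty mask */
}
```

For a mask with only nodes 0 and 2 set, last_set_bit() returns 2 whether the
mask is sized for 3 nodes or for several hundred, which is why the host's
node count is not needed to derive the value. Note that the result is a
zero-based index, not a count of bits.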