From: Oleg Drokin <gr...@linuxhacker.ru> I just got a report today from Tyson Whitehead <twhiteh...@gmail.com> that Lustre crashes when CPUMASK_OFFSTACK is enabled.
A little investigation revealed that this code: cpumask_t mask; ... cpumask_copy(&mask, topology_thread_cpumask(0)); weight = cpus_weight(mask); that was supposed to calculate number of cpu siblings/partitions returns a crazy high number over 3000 which is impossible as I only have 8 cpu cores on my system. So after a bit of digging I found out that: cpumask_copy only copies up to nr_cpumask_bits (actual number of cpus I have in the system), where as cpumask_weight actully tries to count bits up to NR_CPUS. Not only calculating up to NR_CPUS is wasteful in this case, and since we know how many cpus we have in the system - it only makes sense to calculate only this much anyway, it's wrong because the copy only copied 8 bits to our variable and the rest of it is some random stack garbage. So I propose two patches here, the first one I am more certain about - operations that operate on current cpuset like cpus_weight, but also cpus_empty, cpus_$LOGICALop cpus_$BINARYop are converted from NR_CPUS to nr_cpumask_bits (this is ok when CONFIG_CPUMASK_OFFSTACK is not set as it's then defined to NR_CPUS anyway). I am leaving __cpus_setall __cpus_clear out of it as these two look like they deal with entire set and it would be useful for them to operate on all NR_CPUS bits for the case if more cpus are added later and such. The second patch that I am not sure if we wnat, but it seems to be useful until struct cpumask is fully dynamic is to convert what looks like whole-set operations e.g. copies, namely: cpumask_setall, cpumask_clear, cpumask_copy to always operate on NR_CPUS bits to ensure there's no stale garbage left in the mask should the cpu count increases later. I checked the code and allocating cpumasks on stack is not all that uncommon in the code, so this should be a worthwhile fix. Please consider. Oleg Drokin (2): cpumask: Properly calculate cpumask values cpumask: make whole cpumask operations like copy to work with NR_CPUS bits include/linux/cpumask.h | 37 +++++++++++++++++++++---------------- 1 file changed, 21 insertions(+), 16 deletions(-) -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/