> Here's more up-to-date version: https://lkml.org/lkml/2012/8/20/337
These don't seem to give us a noticeable performance change either:

With THP:

real    22m34.279s
user    10797m35.984s
sys     39m18.188s

Without THP:

real    4m48.957s
user    2118m23.208s
sys     113m12.740s

Looks like the with-THP case got a few minutes faster, but that could just be a fluke result, and it's still significantly slower; we're still floating at about a 5x performance degradation.

I talked with one of our performance/benchmarking experts last week and he's done a bit more research into the actual problem here, so I've got a bit more information:

The real performance hit, based on our testing, seems to be coming from the increased latency that comes into play on large NUMA systems when a process has to go off-node to read from or write to memory.

To give an extreme example, say we have a 16 node system with 8 cores per node. If we have a job that shares a 2MB data structure between 128 threads, with THP on, the first thread to touch the structure will allocate all 2MB of space for that structure in a single 2MB page, local to its socket. Only the 8 threads on that node get local access; all the memory accesses for the other 120 threads will be remote accesses. With THP off, each thread could locally allocate a number of 4K pages sufficient to hold the chunk of the structure on which it needs to work, significantly reducing the number of remote accesses that each thread will need to perform.

So, with that in mind, do we agree that a per-process tunable (or something similar) to control THP seems like a reasonable method to handle this issue?

Just want to confirm that everyone likes this approach before moving forward with another revision of the patch. I'm currently in favor of moving this to a per-mm tunable, since that seems to make more sense when it comes to threaded jobs. Also, a decent chunk of the code I've already written can be reused with this approach, and prctl will still be an appropriate place from which to control the behavior.
Andrew Morton suggested possibly controlling this through the ELF header, but I'm going to lean towards the per-mm route unless anyone has a major objection to it.

- Alex