On 04/01/2022 15:55, Jakub Jelinek wrote:
> The usual libgomp way of doing this wouldn't be to use #ifdef __linux__, but instead add libgomp/config/linux/allocator.c that includes some headers, defines some macros and then includes the generic allocator.c.
OK, good point, I can do that.
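Something like the sketch below is how I read that suggestion; the macro name and include path here are only illustrative guesses, not the actual file contents:

  /* libgomp/config/linux/allocator.c (rough outline only).  */
  #define _GNU_SOURCE
  #include <sys/mman.h>   /* mlock, munlock */

  /* Hypothetical feature switch tested by the generic code.  */
  #define LIBGOMP_LINUX_PINNED 1

  /* Reuse the generic implementation.  */
  #include "../../allocator.c"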
> I think perror is the wrong thing to do; omp_alloc etc. have a well-defined interface for what to do in such cases - the allocation should simply fail (nothing is allocated) and, depending on the user's choice, that can be fatal, or return NULL, or chain to some other allocator with different properties, etc.
I did it this way because pinning feels more like an optimization, and falling back to memory that "just works" seemed like what users would want to happen. The perror was added because it turns out the default mlock ulimit is tiny, and I wanted to hint at the solution.
I guess you're right that the consistent behaviour would be to silently switch to the fallback allocator, but it still feels like users will be left in the dark about why it failed.
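For reference, the interface you mean lets the user pick the failure behaviour explicitly via allocator traits, so falling back to unpinned memory would be the user's decision rather than ours. A minimal example using the standard OpenMP trait names (nothing GCC-specific):

  #include <omp.h>
  #include <stdio.h>

  int
  main (void)
  {
    /* Ask for pinned memory, but fall back to the default allocator
       (ordinary, unpinned memory) if pinning is not possible.  */
    omp_alloctrait_t traits[] = {
      { omp_atk_pinned,   omp_atv_true },
      { omp_atk_fallback, omp_atv_default_mem_fb }
    };
    omp_allocator_handle_t pinned
      = omp_init_allocator (omp_default_mem_space, 2, traits);

    double *buf = omp_alloc (1024 * sizeof (double), pinned);
    if (buf == NULL)
      {
	fprintf (stderr, "allocation failed\n");
	return 1;
      }

    omp_free (buf, pinned);
    omp_destroy_allocator (pinned);
    return 0;
  }

With omp_atv_null_fb instead of omp_atv_default_mem_fb, a failed pinned allocation would simply return NULL.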
> Other issues in the patch are that it doesn't munlock on deallocation, and that to do such a deallocation we need to figure out what to do about page boundaries. As documented, mlock can be passed an address and/or address + size that aren't at page boundaries, and pinning happens even for partially touched pages. But munlock unpins even the partially overlapping pages, and we don't know at that point whether other pinned allocations still occupy those pages.
Right, it doesn't munlock because of these issues. I don't know of any way to solve this that wouldn't involve building tables of locked ranges (and knowing what the page size is).
I considered using mmap with the lock flag instead, but the failure mode looked unhelpful. I guess we could mmap with the regular flags and then mlock afterwards. That should bypass the regular heap and ensure each allocation has its own pages. I'm not sure what the unintended side-effects of that might be.
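To make that concrete, here is a rough sketch of the mmap-then-mlock idea (not what the posted patch does); the allocator would still need to remember the mapping size somewhere, e.g. in its usual header, to be able to munmap later:

  #include <stddef.h>
  #include <sys/mman.h>

  static void *
  pinned_alloc (size_t size)
  {
    /* A private anonymous mapping gives us whole pages that no other
       allocation can share.  */
    void *p = mmap (NULL, size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
      return NULL;
    if (mlock (p, size))
      {
	/* Pinning failed (e.g. RLIMIT_MEMLOCK too low); let the caller
	   fall back as the allocator traits dictate.  */
	munmap (p, size);
	return NULL;
      }
    return p;
  }

  static void
  pinned_free (void *p, size_t size)
  {
    /* munmap releases the pages, and the mlock goes away with them, so
       there is no partial-page problem on deallocation.  */
    munmap (p, size);
  }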
> Some bad options: only pin the pages wholly contained within the allocation and don't pin the partial pages around it; force at least page alignment and size so that everything can be pinned; somehow ensure that we never put more than one pinned allocation in such partial pages (while still allowing non-pinned allocations there); or e.g. use some internal data structure to track how many pinned allocations are on the partial pages (say a hash map from page start address to a count of the pinned allocations there - if it drops to 0, munlock even that page, otherwise munlock just the wholly contained pages); or perhaps use page-size-aligned allocation and size and just remember in some data structure that the partial pages could be used for other (small) pinned allocations.
Bad options indeed. If any part of the memory block is not pinned, I expect no performance gains whatsoever. And all this other business adds complexity and runtime overhead.
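For the record, the refcounting variant would look something like this; the page_refcount_* helpers are hypothetical and would need a hash map plus locking behind them (with matching increments on the allocation path), which is roughly the complexity I'd rather avoid:

  #include <stdint.h>
  #include <sys/mman.h>
  #include <unistd.h>

  /* Hypothetical helpers backed by a hash map keyed on page address.  */
  extern void page_refcount_inc (uintptr_t page);
  extern int page_refcount_dec (uintptr_t page);  /* Returns the new count.  */

  static void
  unpin_range (void *addr, size_t size)
  {
    uintptr_t psize = (uintptr_t) sysconf (_SC_PAGESIZE);
    uintptr_t first = (uintptr_t) addr & ~(psize - 1);
    uintptr_t last = ((uintptr_t) addr + size - 1) & ~(psize - 1);

    /* The partial pages at either end may hold other pinned allocations,
       so only unlock them when their count drops to zero.  */
    if (page_refcount_dec (first) == 0)
      munlock ((void *) first, psize);
    if (last != first && page_refcount_dec (last) == 0)
      munlock ((void *) last, psize);

    /* Pages wholly inside the allocation belong to it alone.  */
    if (last > first + psize)
      munlock ((void *) (first + psize), last - first - psize);
  }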
For version 1.0 it feels reasonable to omit the unlock step and hope that a) pinned data will be long-lived, or b) short-lived pinned data will be replaced with more data that -- most likely -- occupies the same pages.
Similarly, it seems likely that serious HPC applications will run on devices with lots of RAM, and if not, any page swapping will destroy the performance gains of using OpenMP.
For now I'll just fix the architectural issues.

Andrew