Hi Andrew, Side remark:
-#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \ - calloc (1, (((void)(MEMSPACE), (SIZE))))
This fits a bit more to previous patch, but I wonder whether that should use (MEMSPACE, NMEMB, SIZE) instead - to fit to the actual calloc arguments. I think the main/only difference between SIZE and NMEMB and SIZE is that "If the multiplication of nmemb and size would result in integer overflow, then calloc() returns an error." (Linux manpage) However, while this wording seems to be neither in POSIX nor in the OpenMP spec. There was some alignment discussion at https://gcc.gnu.org/PR112364 regarding whether C (since C23) has a different alignment for calloc(1, n) vs. calloc(n,1) but Joseph believes it doen't. Thus, this is more bikesheding than making a real difference. * * * [somehow my email program caused some odd formatting issues when I hit some odd key combo. I am not sure whether I fully fixed it or not; sorry if some parts look odd.] On 23.08.23 16:14, Andrew Stubbs wrote:
Implement the OpenMP pinned memory trait on Linux hosts using the mlock syscall. Pinned allocations are performed using mmap, not malloc, to ensure that they can be unpinned safely when freed. This implementation will work OK for page-scale allocations, and finer-grained allocations will be implemented in a future patch.
Can you also update libgomp.texi, i.e. https://gcc.gnu.org/onlinedocs/libgomp/Memory-allocation.html to document that and how pinning works on Linux? I think I proposed in the low-latency patch to add a @ref to https://gcc.gnu.org/onlinedocs/libgomp/Offload-Target-Specifics.html and add there the nvptx and gcn specific memory-allocation handling. * * * I think the following is not ideal in the pinning-is-not-supported case:
@@ -434,10 +435,6 @@ omp_init_allocator (omp_memspace_handle_t memspace, int ntraits, } #endif - /* No support for this so far. */ - if (data.pinned) - return omp_null_allocator; - ret = gomp_malloc (sizeof (struct omp_allocator_data)); *ret = data; #ifndef HAVE_SYNC_BUILTINS
which continues as: gomp_mutex_init (&ret->lock); #endif return (omp_allocator_handle_t) ret; } Therefore: This code will always return a handle, even if pinning is not supported. I had expected that the following happens: "Otherwise if an allocator based on the requirements cannot be created then the special omp_null_allocator handle is returned." Using this allocator on a system where libgomp does not support pinning will always fail with the fallback, which could be either of: default_mem_fb (= omp_atv_default), null_fb, abort_fb, allocator_fb (+ fb_data). Thus, the current code kind of works if the fallback is (explicitly or implicitly) omp_atv_default or (explicitly) default_mem_fb – but otherwise, allocations will always fail, most prominently with "abort_fb". * * *
The following definitions (ab)use comma operators to avoid unused variable errors. */ #ifndef MEMSPACE_ALLOC -#define MEMSPACE_ALLOC(MEMSPACE, SIZE) \ - malloc (((void)(MEMSPACE), (SIZE))) +#define MEMSPACE_ALLOC(MEMSPACE, SIZE, PIN) \ + (PIN ? NULL : malloc (((void)(MEMSPACE), (SIZE))))
I wonder whether the comment should note something like: All of the following will return NULL or are a no-op when pinning is enabled (unless overridden). And the following looks odd:
+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE, PIN) \ + (PIN ? NULL : free (((void)(MEMSPACE), (void)(SIZE), (ADDR)))) #endif
Contrary to the other functions that return a value (pointer), 'free' "returns" 'void'. And, indeed, the compiler might complain: test.c:5:14: warning: ISO C forbids conditional expr with only one void side [-Wpedantic] 5 | pin ? NULL : free(p); | ^ While (void)NULL works, I think the simplest is to just use an 'if (pin) free(...)'. * * *
+linux_memspace_alloc (omp_memspace_handle_t memspace, size_t size, int pin) +{ + (void)memspace; + + if (pin) + { + void *addr = mmap (NULL, size, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
Maybe add a comment noting that mmap returns nullified memory – as required for the calloc call. (The linux man page states for MAP_ANONYMOUS: "The mapping ...; its contents are initialized to zero." while POSIX has: "The system shall always zero-fill any partial page at the end of an object.", which should be all in case of addr = NULL.)
+ if (mlock (addr, size)) + { + gomp_debug (0, "libgomp: failed to pin memory (ulimit too low?)\n");
I wonder whether the size should be included in the output - it might help to debug to know whether "just" 1 kiB or 20 GB were tried to be pinned. For the comment, I wonder whether it should mention RLIMIT_MEMLOCK or 'lockable memory' instead or in addition to ulimit to be clearer. (csh uses 'limit' instead of ulimit, but POSIX has both as function and as shell (sh, bash) ulimit, i.e. using 'ulimit' is fine. Albeit 'ulimit()' has been deprecated in favour of {s,g}etrlimit().) * * *
+linux_memspace_realloc (omp_memspace_handle_t memspace, void *addr, + size_t oldsize, size_t size, int oldpin, int pin)
(I was wondering about some corner cases, but I think the caller, i.e. omp_realloc, already takes care of those.) * * * Regarding the testcases: I think those are fine for __linux__ with only very minor assumptions (like that procfs is mounted). However, for !defined(__linux__) it seems as if there are issues: * * * Regarding alloc-pinned-3.c:
omp_allocator_handle_t allocator1 = omp_init_allocator (omp_default_mem_space, 2, traits1);
The code assumes that the return value cannot be omp_null_allocator. But if no pinning is supported, it will be NULL. (Requested change from above.)
/* Ensure that the limit is smaller than the allocation. */ set_pin_limit (SIZE / 2); ... // Should fail void *p = omp_alloc (SIZE, allocator1); if (p) abort ();
The 'set_pin_limit ()' call is a no-op except on Linux. Thus, the allocation may fail or not, depending whether the SIZE of ~39 MiB will is accepted or not. Given that my Linux defaults to 4,075,856 kbytes and, for non-Linux, SIZE = 40,000 kiB, it looks as if it might well work on larger systems - returning a non-null pointer.
// Should fall back p = omp_alloc (SIZE, allocator2); if (!p) abort ();
How about calling 'omp_free' to avoid leaking + to check that the right 'free' is called? (multiple times) * * * Regarding alloc-pinned-1.c:
#ifdef __linux__ ... #else #define PAGE_SIZE 1 /* unknown */ ... #endif ... /* Allocate at least a page each time, allowing space for overhead, but stay within the ulimit. */ const int SIZE = PAGE_SIZE - 128;
Are you sure that a SIZE of (1 - 128) = -127 is a good size? For Linux:
#define CHECK_SIZE(SIZE) { \ struct rlimit limit; \ if (getrlimit (RLIMIT_MEMLOCK, &limit) \ || limit.rlim_cur <= SIZE) \ fprintf (stderr, "unsufficient lockable memory; please increase ulimit\n"); \ }
I wonder whether you shouldn't add an 'abort()' here. Otherwise, it will simply proceed - init_allocator will likely work, omp_alloc as well due to the fallback and then: // Sanity check if (get_pinned_mem () != 0) abort (); will fail. That works as well, but is somewhat odd. Otherwise, as remarked above, you need to handle omp_init_allocator returning omp_null_allocator (possibly aborting when __linux__ as we know that it should be supported there.)
int amount = get_pinned_mem (); if (amount == 0) abort ();
This one will always fail - except for __linux__, independent whether pinning worked or not. (Three times) Otherwise, it looks good to me. Tobias ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955