Hi Andrew,

Side remark:

-#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \ - calloc (1,
(((void)(MEMSPACE), (SIZE))))

This fits a bit more to previous patch, but I wonder whether that should
use (MEMSPACE, NMEMB, SIZE) instead - to fit to the actual calloc arguments.

I think the main/only difference between SIZE and NMEMB and SIZE is that
"If the multiplication of nmemb and size would result in integer overflow,
then calloc() returns an error." (Linux manpage)

However, while this wording seems to be neither in POSIX nor in the OpenMP
spec. There was some alignment discussion at https://gcc.gnu.org/PR112364
regarding whether C (since C23) has a different alignment for
calloc(1, n) vs. calloc(n,1) but Joseph believes it doen't.

Thus, this is more bikesheding than making a real difference.

* * *

[somehow my email program caused some odd formatting issues when I
hit some odd key combo. I am not sure whether I fully fixed it or not;
sorry if some parts look odd.]


On 23.08.23 16:14, Andrew Stubbs wrote:
Implement the OpenMP pinned memory trait on Linux hosts using the
mlock syscall.  Pinned allocations are performed using mmap, not
malloc, to ensure that they can be unpinned safely when freed.

This implementation will work OK for page-scale allocations, and
finer-grained allocations will be implemented in a future patch.

Can you also update libgomp.texi, i.e. 
https://gcc.gnu.org/onlinedocs/libgomp/Memory-allocation.html
to document that and how pinning works on Linux?

I think I proposed in the low-latency patch to add a @ref to
https://gcc.gnu.org/onlinedocs/libgomp/Offload-Target-Specifics.html and
add there the nvptx and gcn specific memory-allocation handling.

* * *

I think the following is not ideal in the pinning-is-not-supported case:

@@ -434,10 +435,6 @@ omp_init_allocator (omp_memspace_handle_t
memspace, int ntraits, } #endif

-  /* No support for this so far.  */
-  if (data.pinned)
-    return omp_null_allocator;
- ret = gomp_malloc (sizeof (struct omp_allocator_data));
*ret = data;
#ifndef HAVE_SYNC_BUILTINS

which continues as:
  gomp_mutex_init (&ret->lock);
#endif
  return (omp_allocator_handle_t) ret;
}


Therefore:

This code will always return a handle, even if pinning is not supported.
I had expected that the following happens:

"Otherwise if an allocator based on the requirements cannot be created
then the special omp_null_allocator handle is returned."

Using this allocator on a system where libgomp does not support pinning
will always fail with the fallback, which could be either of:

default_mem_fb (= omp_atv_default), null_fb, abort_fb, allocator_fb (+ fb_data).

Thus, the current code kind of works if the fallback is (explicitly or 
implicitly)
omp_atv_default or (explicitly) default_mem_fb – but otherwise, allocations will
always fail, most prominently with "abort_fb".

* * *
The following definitions (ab)use comma operators to avoid unused
variable errors. */ #ifndef MEMSPACE_ALLOC -#define
MEMSPACE_ALLOC(MEMSPACE, SIZE) \ - malloc (((void)(MEMSPACE), (SIZE)))
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE, PIN) \ + (PIN ? NULL : malloc
(((void)(MEMSPACE), (SIZE))))

I wonder whether the comment should note something like: All of the
following will return NULL or are a no-op when pinning is enabled
(unless overridden).

And the following looks odd:

+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE, PIN) \
+  (PIN ? NULL :  free (((void)(MEMSPACE), (void)(SIZE), (ADDR))))
#endif

Contrary to the other functions that return a value (pointer), 'free' "returns" 
'void'.

And, indeed, the compiler might complain:

test.c:5:14: warning: ISO C forbids conditional expr with only one void side 
[-Wpedantic]
    5 |   pin ? NULL : free(p);
      |              ^

While (void)NULL works, I think the simplest is to just use an 'if (pin) 
free(...)'.

* * *
+linux_memspace_alloc (omp_memspace_handle_t memspace, size_t size,
int pin) +{ + (void)memspace; + + if (pin) + { + void *addr = mmap
(NULL, size, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS,
-1, 0);


Maybe add a comment noting that mmap returns nullified memory – as required for 
the calloc call.

(The linux man page states for MAP_ANONYMOUS: "The mapping ...; its contents are 
initialized to zero."
while POSIX has: "The system shall always zero-fill any partial page at the end of 
an object.", which
should be all in case of addr = NULL.)

+      if (mlock (addr, size))
+     { +       gomp_debug (0, "libgomp: failed to pin memory (ulimit too 
low?)\n");

I wonder whether the size should be included in the output - it might help to 
debug to know
whether "just" 1 kiB or 20 GB were tried to be pinned.

For the comment, I wonder whether it should mention RLIMIT_MEMLOCK or 'lockable 
memory' instead
or in addition to ulimit to be clearer.
(csh uses 'limit' instead of ulimit, but POSIX has both as function and as 
shell (sh, bash) ulimit,
i.e. using 'ulimit' is fine. Albeit 'ulimit()' has been deprecated in favour of 
{s,g}etrlimit().)

* * *

+linux_memspace_realloc (omp_memspace_handle_t memspace, void *addr,
 +                    size_t oldsize, size_t size, int oldpin, int pin)

(I was wondering about some corner cases, but I think the caller, i.e. 
omp_realloc,
already takes care of those.)

* * *

Regarding the testcases:

I think those are fine for __linux__ with only very minor assumptions (like 
that procfs is mounted).

However, for !defined(__linux__) it seems as if there are issues:

* * *

Regarding alloc-pinned-3.c:

omp_allocator_handle_t allocator1 = omp_init_allocator
(omp_default_mem_space, 2, traits1);

The code assumes that the return value cannot be omp_null_allocator.
But if no pinning is supported, it will be NULL. (Requested change from above.)

/* Ensure that the limit is smaller than the allocation. */
set_pin_limit (SIZE / 2); ... // Should fail void *p = omp_alloc
(SIZE, allocator1); if (p) abort ();


The 'set_pin_limit ()' call is a no-op except on Linux. Thus, the allocation 
may fail or not,
depending whether the SIZE of ~39 MiB will is accepted or not.

Given that my Linux defaults to 4,075,856 kbytes and, for non-Linux, SIZE = 
40,000 kiB,
it looks as if it might well work on larger systems - returning a non-null 
pointer.

// Should fall back p = omp_alloc (SIZE, allocator2); if (!p) abort ();


How about calling 'omp_free' to avoid leaking + to check that the right 'free'
is called?

(multiple times)

* * *

Regarding alloc-pinned-1.c:

#ifdef __linux__ ... #else #define PAGE_SIZE 1 /* unknown */ ...
#endif ... /* Allocate at least a page each time, allowing space for
overhead, but stay within the ulimit. */ const int SIZE = PAGE_SIZE -
128;

Are you sure that a SIZE of (1 - 128) = -127 is a good size?


For Linux:

#define CHECK_SIZE(SIZE) { \ struct rlimit limit; \ if (getrlimit
(RLIMIT_MEMLOCK, &limit) \ || limit.rlim_cur <= SIZE) \ fprintf
(stderr, "unsufficient lockable memory; please increase ulimit\n"); \ }

I wonder whether you shouldn't add an 'abort()' here.

Otherwise, it will simply proceed - init_allocator will likely work, omp_alloc 
as
well due to the fallback and then:

  // Sanity check
  if (get_pinned_mem () != 0)
    abort ();

will fail. That works as well, but is somewhat odd.


Otherwise, as remarked above, you need to handle omp_init_allocator returning
omp_null_allocator (possibly aborting when __linux__ as we know that it should 
be
supported there.)


int amount = get_pinned_mem (); if (amount == 0) abort ();

This one will always fail - except for __linux__, independent whether pinning 
worked or not.

(Three times)



Otherwise, it looks good to me.

Tobias

-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955

Reply via email to