On Fri, 8 May 2020 16:18:34 +0200 Thomas Schwinge <tho...@codesourcery.com> wrote:
> >> Can you please explain (textually?) how this checking (design per > >> your textual description below) is working in context of mixed > >> OpenACC structured ("S") and dynamic ("D") reference counts? For > >> example: > >> > >> // S: 0, D: 0 > >> > >> #pragma acc enter data copyin ([data]) // copyin; S: 0, D: 1 > >> > >> acc_copyin ([data]) // no-op; S: 0, D: 2 > > > > Unfortunately it's not quite that simple. > > Does "not quite that simple" apply to (a) your reference count > consistency checking specifically, or to (b) libgomp implementation > peculiarities, or to (c) OpenACC reference counting semantics? The OpenACC semantics (c) are indeed simple. I was referring to (b). Having (a), i.e. machine-checkable invariants, is nice but I suppose not vital. We do need to know exactly what the reference counting model is, though! > The latter (c) certainly are meant to be that simple (see OpenACC 3.0, > 2.6.7. "Reference Counters", etc.), and these are what my example > illustrated. Aha, right. > Remember the conceptually simple implementation that we had before > your commit 378da98fcc907d05002bcd3d6ff7951f0cf485e5 "OpenACC > reference count overhaul", which to the best of my knowledge would be > explained as follows: > > - The OpenACC structured reference count corresponds to libgomp > 'key->refcount - key->dynamic_refcount'. > - The OpenACC dynamic reference count corresponds to libgomp > 'key->dynamic_refcount'. > - Thus, libgomp 'key->refcount' corresponds to the sum of OpenACC > structured and dynamic reference counts. > > ..., and this seemed to have worked fine? (... aside from a few > specific bugs that we fixed.) Certain things weren't right though, but that only showed up "in anger" during the development of the manual deep-copy support. IIRC in particular, for dynamic data mappings, "group mappings" would only try to track the reference count for the first mapping in the group (e.g. those comprising GOMP_MAP_TO/FROM, GOMP_MAP_TO_PSET and then GOMP_MAP_POINTER would only try to refcount the GOMP_MAP_TO/FROM). I'm not at all sure that the old implementation got the "subtle case" of the target_mem_desc returned from gomp_map_vars "owning" the block of device memory -- or not -- correct, at all. Hence writing the verification code, to figure out what the invariants actually were supposed to be. Maybe the questions to ask are: 1. With the old scheme, how do you calculate how much to decrement the "structured" key->refcount by when you see a finalize operation? Is that always correct? Do you need to know which target_mem_descs' variable lists refer back to this key? 2. What do you do with the target_mem_desc returned from gomp_map_vars{_internal} in the dynamic data-mapping case? Is it then always freed at the right point? (We used to record it in an off-side linked list, but that scheme had its own problems.) > Doing it like this meant that 'libgomp/target.c' didn't have to care > about the OpenACC-specific 'key->dynamic_refcount' at all. Of > course, we could instead have implemented it as follows: > > - The OpenACC structured reference count corresponds to libgomp > 'key->refcount'. > - The OpenACC dynamic reference count corresponds to libgomp > 'key->dynamic_refcount'. > > ..., which would've make some things simpler in 'libgomp/oacc-mem.c', > but 'libgomp/target.c' then would've had to care about the > OpenACC-specific 'key->dynamic_refcount'. Yeah, I think that implies quite heavy surgery. In terms of the key->refcount field, the overhauled OpenACC implementation is more-or-less compatible with the way OpenMP dynamic data mapping works with that counter, FWIW, and switching to a pure split between key->refcount for structured and key->dynamic_refcount for dynamic mappings would lose that. Unless we switched OpenMP over to the new scheme too, of course. > Now, explicitly asking the other way round: with your "overhaul", > have we now made the libgomp implementation different (and more > complicated) just to make it amenable to your reference count > consistency checking (a), or are there any actual "functional" > reasons (b) that we'd not yet considered in the old scheme? The overhaul wasn't just done for arbitrary reasons -- trying to rationalise the reference counting was vital in getting the manual deep-copy support working properly. Indeed some bugs have been fixed since then, but I'm not sure if that's all that was missing. Are you advocating sticking with/switching back to the "old" scheme? If so, we'd have to try to make that work with the manual deep-copy implementation. There are still some awkward corners (particularly wrt. unhandled cases of mixing structured & dynamic reference counts), but I don't think those would be any easier with the "old" scheme -- and indeed may be harder to track down without the ability to do automated RC checking. Thanks, Julian