https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101061
--- Comment #11 from rguenther at suse dot de <rguenther at suse dot de> --- On Wed, 16 Jun 2021, alexander.gr...@tu-dresden.de wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101061 > > --- Comment #9 from Alexander Grund <alexander.gr...@tu-dresden.de> --- > > Note that when the union type is visible in the access path then GCC allows > > punning without further restrictions. Thus the accesses as written above > > are OK. > > Now I have to ask again for clarification because this seemingly contradicts > your previous statement. So let me try to state the previous question again: > > Given a pointer `slot_type* slot` (where slot_type is the union as defined > before) is it legal to access `slot->value.first`, > `slot->mutable_value.first`, > `slot->key` interchangeably? > > In all 3 accesses the union is visible in the access path (if I understood > that > correctly) so the type punning should be ok, shouldn't it? Yes - but it routinely happens that in C++ you end up returning a reference to the union member as abstraction and an access to that reference does _not_ have the union visible but only the member. > > the circumstances have been so there's incentive to do an invalid > > transform... > > So is this indeed a GCC bug which may be fixed or gone latent? I don't think so, but we cannot rule this out at this point (but see above). > Otherwise, maybe the construction is the problem? What Abseil basically > does is allocating a (suitably aligned) buffer of chars and in-place > constructing those slot_type unions/pairs there as needed (i.e. similar > to std::vector) After abstractions what basically happens is: > > // alloc buffer done, then: > slot_type* slot_buffer = reinterpret_cast<slot_type*>(buffer); > > // on emplace: > size_t x = ...; > slot_type* slot = slot_buffer + x; > new(slot) slot_type; > new(&slot->mutable_value) pair<const K, V>(k, v); > slot_type* slot2 = slot_buffer + x; // same as before, actually done through > the iterator > idx_vec[i] = slot2->value.second; so if slot_type is the union type then new(&slot->mutable_value) pair<const K, V>(k, v) looks problematic since that calls operator new with a pointer to the union member and the actual construction receives &slot->mutable_value as address of type decltype (slot->mutable_value) * which falls foul of the union punning rule. I'm not sure how one can solve this issue with using placement new but are unions not sufficiently restricted so that copy assignment should work (and activate the appropriate union member)? Thus slot->mutable_value = pair<const K, V>(k, v); ? The compiler should hopefully be able to elide the copy. > My suspicion is, that GCC loads the value of slot2 before constructing the > object, at least sometimes. Printing the values in the vector often shows a > run > of zeroes before some other (potentially correct) values. I'd guess the > correct > values happen, when no insertion took place, i.e. the construction was done > already in a prior loop iteration. Yes, GCC would simply "skip" the offending placement new and if it finds a suitable definition before it would replace the load with said definition. To expand on the placement new issue if you for example have struct Y { Y(int); }; union X { int i; float f; }; void foo (void *p) { X *q = reinterpret_cast <X *> (p); new (&q->i) Y (1); } we end up with (early unoptimized IL): void __GIMPLE (void * p) { void * D_6558; void * D_6557; union X * q; q = p; D_6558 = &q->i; D_6557 = operator new (1ul, D_6558); try { Y::Y (D_6571, 1); } catch { operator delete (D_6557, D_6558); } } where 'operator new' is the usual noop that just returns the passed pointer here. But what you can see is that the resulting pointer is used for the initialization and not the placement address as literally written in source. And even then, the CTOR that is involved of course sees only 'this' which is a plain pointer to its class type and all accesses in the CTOR will not have the union visible. That might be unexpected but I think it's a quite natural consequence of C++ abstraction :/