On 11/26/20 10:06 AM, Richard Sandiford wrote:
> Martin Sebor <mse...@gmail.com> writes:
>> I do have one concern: the tendency to prioritize efficiency
>> over safety (this can be said about most GCC code). Specifically
>> in this class, the address bit twiddling makes me uneasy. I don't
>> think the object model in either language (certainly not C's, and
>> I don't have the impression C++'s does either) makes it
>> unequivocally valid. On the contrary, I'd say many of us
>> interpret the current rules as leaving it undefined. There are
>> efforts to sanction this sort of thing under some conditions
>> (e.g., the C object model proposal) but they have not been
>> adopted yet. I think we should try to avoid exploiting these
>> dark corners in new code.
> I'd tried to stick to operations that I thought were well-defined.
> The primitives being used are really:
>
> (1) convert a T1* or T2* to char*
> (2) increment an unincremented char*
> (3) decrement an incremented char*
> (4) convert a char* back to T1* or T2*
> (5) convert a char* to an intptr_t in order to test its low bit
>
> All those are valid as long as the pointer points into the same
> object, both before and after.
> I thought (1) and (4) were allowed. At least, [basic.compound] says
> that void* must be able to hold any object pointer and that it must
> have the same representation as char*, so I thought the conversion
> in (1) was guaranteed to be representable. And (4) only ever undoes
> (1): it only converts the result of (1) back to the original
> pointer type.
>
> For (2) and (3), the incremented pointer will still be within the
> containing object, so I thought it would be well-defined. Here too,
> (3) only ever undoes (2): it only decrements a pointer that had
> previously been incremented.
>
> One thing I'd deliberately tried to avoid was converting integers
> “back” to pointers, because that seemed like a more dangerous thing.
> That's why:
>
> +template<typename T1, typename T2>
> +inline T2 *
> +pointer_mux<T1, T2>::second_or_null () const
> +{
> + // Micro optimization that's effective as of GCC 11: compute the value
> + // of the second pointer as an integer and test that, so that the integer
> + // result can be reused as the pointer and so that all computation can
> + // happen before a branch on null. This reduces the number of branches
> + // needed for loops.
> + return uintptr_t (m_ptr - 1) & 1 ? nullptr : known_second ();
This is only valid if m_ptr points to the second byte of an object.
If it points to the first byte of an A, the pointer subtraction in
m_ptr - 1 is invalid. The following would make the test valid,
with the result strictly unspecified (though in practice I'd expect
it to do what you expect):

  return (uintptr_t (m_ptr) - 1) & 1 ? nullptr : known_second ();
> +}
>
> is written in a somewhat indirect way.
> Are your concerns with the primitives above, or is the problem with
> something else?
My initial impression was that the code stored information in
the least significant bits of the pointer. Now that I've looked
at it again I still think it does that, except not directly but
indirectly, by storing a pointer to either the first byte of one
object (A) or to the second byte of another (B). Correct? (If
so, I would recommend expanding the documentation to make this
clearer.)
It's clever (a little too clever for my taste) but, other than
the m_ptr - 1 issue above, I can't think of anything undefined
about it. My main concern wasn't so much with the bit twiddling
as such as with hiding the identity of the objects by manipulating
the representation of the pointers via integer operations. Since
(if) the code doesn't really do that, it may be less of a problem
than I thought.
Martin