https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83201

--- Comment #24 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Sean from comment #23)
> Sorry for digging up the past, but recently ran into a related
> implementation issue in OpenBSD's qsort implementation and came across this
> discussion.  While I understand the proposed patch was rejected, it looks
> like it also degraded the implementation to single byte swaps.
> 
> Admittedly untested, but a possible solution that I think would make LTO
> happy (avoids aliasing the pointer) and doesn't change the implementation
> profile would be to cast it differently since it's always dealing with a
> simple type alignment check on long/int/char.  Something like this:
> 
> #define SWAPINIT(TYPE, a, es) swaptype_ ## TYPE = (uintptr_t)a %
> sizeof(TYPE) || es % sizeof(TYPE) ? 2 : es == sizeof(TYPE) ? 0 : 1;
> 
> Maybe a more acceptable alternative compromise since it would have a nearly
> identical performance profile with it still swap-optimizing int and long.

Note the issue is not the way the type is computed but the memory
accesses done via

#define swap(a, b)                              \
        if (swaptype_long == 0) {               \
                long t = *(long *)(a);          \
                *(long *)(a) = *(long *)(b);    \
                *(long *)(b) = t;               \
        } else if (swaptype_int == 0) {         \
                int t = *(int *)(a);            \
                *(int *)(a) = *(int *)(b);      \
                *(int *)(b) = t;                \

There exist non-standard ways to preserve the access pattern, like

typedef int notbaa_int __attribute__((may_alias));
typedef long notbaa_long __attribute__((may_alias));
#define swap(a, b)                                          \
        if (swaptype_long == 0) {                           \
                long t = *(notbaa_long *)(a);               \
                *(notbaa_long *)(a) = *(notbaa_long *)(b);  \
                *(notbaa_long *)(b) = t;                    \
        } else if (swaptype_int == 0) {                     \
                int t = *(notbaa_int *)(a);                 \
                *(notbaa_int *)(a) = *(notbaa_int *)(b);    \
                *(notbaa_int *)(b) = t;                     \

or portable ones like doing

  long t;
  memcpy (&t, a, sizeof (long));
  memcpy (a, b, sizeof (long));
  memcpy (b, &t, sizeof (long));

and hoping the compiler optimizes this well (GCC does).
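
For illustration, the memcpy variant could be wrapped up roughly as
below.  This is only a sketch, not the code from any actual qsort
implementation: the helper names swap_long_sized and swap_elems are
made up here, and it assumes the two elements are either identical or
do not overlap.  Because every access to the element bytes goes through
memcpy or unsigned char, no lvalue of type long is ever formed from the
caller's pointers, so type-based alias analysis has nothing to get
wrong.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: swap two long-sized chunks via memcpy.
   a and b must not overlap. */
static void swap_long_sized(void *a, void *b)
{
    long t;
    memcpy(&t, a, sizeof(long));
    memcpy(a, b, sizeof(long));
    memcpy(b, &t, sizeof(long));
}

/* Hypothetical helper: swap two es-byte elements, using long-sized
   chunks where possible and single bytes for the remainder.  No
   alignment check is needed since memcpy has no alignment
   requirement. */
static void swap_elems(void *a, void *b, size_t es)
{
    unsigned char *pa = a;
    unsigned char *pb = b;

    if (pa == pb)       /* memcpy must not see overlapping objects */
        return;

    while (es >= sizeof(long)) {
        swap_long_sized(pa, pb);
        pa += sizeof(long);
        pb += sizeof(long);
        es -= sizeof(long);
    }
    while (es > 0) {    /* tail: plain byte swaps */
        unsigned char t = *pa;
        *pa++ = *pb;
        *pb++ = t;
        es--;
    }
}
```

Whether the fixed-size memcpy calls end up as single loads and stores
depends on the compiler and the optimization level, so it is worth
checking the generated code on the targets that matter.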

Note that at the time I proposed the change to the originating project
(FreeBSD?) and it was accepted there.
