include

Bruce Evans Tue, 28 Feb 2012 23:45:25 -0800

On Wed, 29 Feb 2012, Bruce Evans wrote:

I cleaned this up a bit according to ideas in my previous mails, and
added a comment about the confusing use of __bswap64_const() (now
named __bswap64_gen()) in __bswap64_var():


A minor problem with only having a macro version for __bswap64() turned
up:

% -#define      __bswap16_const(_x)     (__uint16_t)((_x) << 8 | (_x) >> 8)
% -
% -#define      __bswap16(_x)                   \
% -     (__builtin_constant_p(_x) ?     \
% -         __bswap16_const((__uint16_t)(_x)) : __bswap16_var(_x))
...
% +#define      ___bswap16(x)   (__uint16_t)((x) << 8 | (x) >> 8)
% +#define      __bswap16(x)    (___bswap16((__uint16_t)(x)))


When x a non-volatile variable, gcc and clang produce the good code
"rolw $8,x" for "x = __bswap16(x);" on short x.  But when x a a volatile
variable, gcc and clang produce fairly horrible code, with 2 loads of
x corresponding to the 2 accesses to x.  This is probably required by
volatile semantics, and is a problem for all unsafe macros, especially
when their name says that they are safe (oops).  When __bswap16 is
implemented as an inline function for the var case like it used to be,
it only loads x once and there are no problems with volatile variables.
Optimizing to "rolw $8,x" might still be possible iff x is not volatile,
but load-modify-store is probably better anyway.

So any macro version must use gcc features to be safe.  The following
seems to work:

#define __bswap16(x)    __extension__ ({ __uint16_t __x = x;
        (__uint16_t)(__x << 8 | __x >> 8); })

clang now produces "rolw $8,x" when x is volatile.  This seems to
violate volatile semantics.  gcc produces load-rolw-store then.  Both
produce "rolw $8,x" when x is not volatile.

Bruce
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r232266 - in head/sys: amd64/include i386/include pc98/include x86/include

Reply via email to