On 9/27/18 5:55 PM, Peter Maydell wrote: > If QEMU is compiled with clang-7 it results in the warning: > > hw/display/qxl.c:1884:19: error: misaligned or large atomic operation > may incur significant performance penalty [-Werror,-Watomic-alignment] > old_pending = atomic_fetch_or(&d->ram->int_pending, le_events); > ^ > > This is because the Spice headers forgot to define the QXLRam struct > with the '__aligned__(4)' attribute. clang 7 and newer will thus > warn that the access here to int_pending might not be 4-aligned > (because the QXLRam object d->ram points at might start at a > misaligned address). In fact we set up d->ram in init_qxl_ram() so > it always starts at a 4K boundary, so we know the atomic access here > is OK. > > Newer Spice versions (with Spice commit > beda5ec7a6848be20c0cac2a9a8ef2a41e8069c1) will fix the bug; > for older Spice versions, work around it by telling the compiler > explicitly that the alignment is OK using __builtin_assume_aligned(). > > Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>
Thanks! Reviewed-by: Philippe Mathieu-Daudé <f4...@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4...@amsat.org> > --- > include/qemu/compiler.h | 9 +++++++++ > hw/display/qxl.c | 26 +++++++++++++++++++++++++- > 2 files changed, 34 insertions(+), 1 deletion(-) > > diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h > index 5843812710c..bf47e7bee4e 100644 > --- a/include/qemu/compiler.h > +++ b/include/qemu/compiler.h > @@ -122,6 +122,15 @@ > #ifndef __has_feature > #define __has_feature(x) 0 /* compatibility with non-clang compilers */ > #endif > + > +#ifndef __has_builtin > +#define __has_builtin(x) 0 /* compatibility with non-clang compilers */ > +#endif > + > +#if __has_builtin(__builtin_assume_aligned) || QEMU_GNUC_PREREQ(4, 7) > +#define HAS_ASSUME_ALIGNED > +#endif > + > /* Implement C11 _Generic via GCC builtins. Example: > * > * QEMU_GENERIC(x, (float, sinf), (long double, sinl), sin) (x) > diff --git a/hw/display/qxl.c b/hw/display/qxl.c > index 8e9135d9c67..5702e8645d3 100644 > --- a/hw/display/qxl.c > +++ b/hw/display/qxl.c > @@ -1881,7 +1881,31 @@ static void qxl_send_events(PCIQXLDevice *d, uint32_t > events) > trace_qxl_send_events_vm_stopped(d->id, events); > return; > } > - old_pending = atomic_fetch_or(&d->ram->int_pending, le_events); > + /* > + * Older versions of Spice forgot to define the QXLRam struct > + * with the '__aligned__(4)' attribute. clang 7 and newer will > + * thus warn that atomic_fetch_or(&d->ram->int_pending, ...) > + * might be a misaligned atomic access, and will generate an > + * out-of-line call for it, which results in a link error since > + * we don't currently link against libatomic. > + * > + * In fact we set up d->ram in init_qxl_ram() so it always starts > + * at a 4K boundary, so we know that &d->ram->int_pending is > + * naturally aligned for a uint32_t. Newer Spice versions > + * (with Spice commit beda5ec7a6848be20c0cac2a9a8ef2a41e8069c1) > + * will fix the bug directly. To deal with older versions, > + * we tell the compiler to assume the address really is aligned. > + * Any compiler which cares about the misalignment will have > + * __builtin_assume_aligned. > + */ > +#ifdef HAS_ASSUME_ALIGNED > +#define ALIGNED_UINT32_PTR(P) ((uint32_t *)__builtin_assume_aligned(P, 4)) > +#else > +#define ALIGNED_UINT32_PTR(P) ((uint32_t *)P) > +#endif > + > + old_pending = atomic_fetch_or(ALIGNED_UINT32_PTR(&d->ram->int_pending), > + le_events); > if ((old_pending & le_events) == le_events) { > return; > } >