On Mon, 25 Feb 2019 at 10:24, Natanael Copa <nc...@alpinelinux.org> wrote: > > On Sat, 23 Feb 2019 16:18:15 +0000 > Peter Maydell <peter.mayd...@linaro.org> wrote: > > > On Sat, 23 Feb 2019 at 16:05, Natanael Copa <nc...@alpinelinux.org> wrote: > > > I was thinking of something in the lines of: > > > > > > typedef volatile uint16_t __attribute__((__may_alias__)) > > > volatile_uint16_t; > > > static inline int lduw_he_p(const void *ptr) > > > { > > > volatile_uint16_t r = *(volatile_uint16_t*)ptr; > > > return r; > > > } > > > > This won't correctly handle accesses with unaligned pointers, > > I'm afraid. We rely on these functions correctly working > > with pointers that are potentially unaligned. > > Well, current code does not even handle access with aligned pointers, > depending on FORTIFY_SOURCE implementation.
It correctly handles aligned and unaligned pointers for the API guarantees that the function in QEMU provides, which is to say "definitely works on any kind of aligned or unaligned pointer, not guaranteed to be atomic". Unfortunately some code in QEMU is implicitly assuming it is atomic, which this QEMU function does not guarantee -- it just happens to provide that most of the time. > My thinking here is that we depend on assumption that compiler will > remove the memcpy call, so maybe find other way to generate same > assembly, while still depend on compiler optimization. > I did some tests and compared the assembly output. The compiler will > generate same assembly if volatile is not used. The attribute > __may_alias__ does not seem to make any difference. So the following > function generates exactly same assembly: > > static inline int lduw_he_p(const void *ptr) > { > uint16_t r = *(uint16_t*)ptr; > return r; > } This still is not guaranteed to work on unaligned pointers. (For instance it probably won't work on SPARC hosts.) More generally there is no way to have a single function that is guaranteed to handle unaligned pointers and also guaranteed to produce an atomic access on all host architectures that we support, because not all host architectures allow you to do both with the same instruction sequence. So you can't just change this function to make it provide atomic access without breaking the other guarantee it is providing. The long term fix for this is that we need to separate out our APIs so we can have a family of functions that guarantee to work on unaligned pointers, and a family that guarantee to work atomically, and calling code can use the one that provides the semantics it requires. The short term fix is to fix your toolchain/compilation environment options so that it isn't trying to override the definition of memcpy(). thanks -- PMM