On Wed, Aug 12, 2020 at 3:42 PM Tijl Coosemans <t...@freebsd.org> wrote:
> On Wed, 12 Aug 2020 09:44:25 +0400 Gleb Popov <arr...@freebsd.org> wrote:
> > On Wed, Aug 12, 2020 at 9:21 AM Gleb Popov <arr...@freebsd.org> wrote:
> >> Indeed, this looks like a culprit! When compiling using first command line
> >> (the long one) I get following warnings:
> >>
> >> /wrkdirs/usr/ports/lang/ghc/work/ghc-8.10.1/libraries/ghc-prim/cbits/atomic.c:369:10:
> >> warning: misaligned atomic operation may incur significant performance
> >> penalty [-Watomic-alignment]
> >>   return __atomic_load_n((StgWord64 *) x, __ATOMIC_SEQ_CST);
> >>          ^
> >>
> >> /wrkdirs/usr/ports/lang/ghc/work/ghc-8.10.1/libraries/ghc-prim/cbits/atomic.c:417:3:
> >> warning: misaligned atomic operation may incur significant performance
> >> penalty [-Watomic-alignment]
> >>   __atomic_store_n((StgWord64 *) x, (StgWord64) val, __ATOMIC_SEQ_CST);
> >>   ^
> >> 2 warnings generated.
> >>
> >> I guess this basically means "I'm emitting a call there". So, what's the
> >> correct fix in this case?
> >
> > I just noticed that Clang emits these warnings (and the call instruction)
> > only for functions handling StgWord64 type. For the same code with
> > StgWord32, like
> >
> > StgWord
> > hs_atomicread32(StgWord x)
> > {
> > #if HAVE_C11_ATOMICS
> >   return __atomic_load_n((StgWord32 *) x, __ATOMIC_SEQ_CST);
> > #else
> >   return __sync_add_and_fetch((StgWord32 *) x, 0);
> > #endif
> > }
> >
> > no warning is emitted as well as no call.
> >
> > How does clang infer alignment in these cases? What's so special about
> > StgWord64?
>
> StgWord64 is uint64_t which is unsigned long long which is 4 byte
> aligned on i386. Clang wants 8 byte alignment to use the fildll
> instruction.
>
> You could change the definition of the StgWord64 type to look like:
>
> typedef uint64_t StgWord64 __attribute__((aligned(8)));
>
> But this only works if all calls to hs_atomicread64 pass a StgWord64
> as argument and not some other 64 bit value.
>
> Another solution I already mentioned in a previous message: replace
> HAVE_C11_ATOMICS with 0 in hs_atomicread64 so it uses
> __sync_add_and_fetch instead of __atomic_load_n. That uses the
> cmpxchg8b instruction which doesn't care about alignment. It's much
> slower but I guess 64 bit atomic loads are rare enough that this
> doesn't matter much.

Yep, your suggested workaround worked, many thanks. Still, I'm curious
where I can get __atomic_load_n in an i386 case, if I don't want to pull
in gcc?
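
For reference, here is a minimal sketch of the two approaches discussed in this thread, assuming a GCC- or Clang-compatible compiler. The names aligned_u64, load64_atomic and load64_sync are hypothetical and do not come from GHC's atomic.c. They only mirror the suggested StgWord64 typedef and the __sync_add_and_fetch fallback. I have not checked which instructions a given compiler version emits for either function on i386.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type mirroring the suggested StgWord64 definition:
 * a 64-bit unsigned integer whose declared alignment is 8 bytes,
 * even on targets where plain uint64_t is only 4-byte aligned. */
typedef uint64_t aligned_u64 __attribute__((aligned(8)));

/* The typedef carries the alignment, so this holds on every target. */
static_assert(_Alignof(aligned_u64) == 8, "aligned_u64 must be 8-byte aligned");

/* Sequentially consistent 64-bit load through the __atomic builtin.
 * The pointer type promises 8-byte alignment to the compiler, so the
 * caller must really pass an 8-byte aligned object. */
static uint64_t load64_atomic(aligned_u64 *p)
{
    return __atomic_load_n(p, __ATOMIC_SEQ_CST);
}

/* Fallback discussed in the thread: atomically adding 0 returns the
 * current value without relying on the declared alignment. */
static uint64_t load64_sync(uint64_t *p)
{
    return __sync_add_and_fetch(p, 0);
}
```

The caveat from the thread still applies: load64_atomic is only safe if every caller passes an object that is actually 8-byte aligned, while load64_sync accepts any uint64_t at the cost of a slower read-modify-write.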