https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96159
--- Comment #1 from Martin Uecker <uecker at eecs dot berkeley.edu> ---

On x86-64 the following struct has alignment 4 but gcc creates a single mov
instruction which according to my understanding may fail to be atomic when it
crosses a cache line boundary. Documentation seems to imply that __atomic_load
is supposed to work for all types. It also compiles without warning.

typedef struct { float re; float im; } foo_t;

int foo_align(void)
{
        foo_t x;
        return _Alignof(x);
}

foo_t foo_load(foo_t* x)
{
        foo_t r;
        __atomic_load(x, &r, __ATOMIC_SEQ_CST);
        return r;
}

assembler:

foo_align:
        mov     eax, 4
        ret
foo_load:
        movq    xmm0, QWORD PTR [rdi]
        ret