https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79319
--- Comment #4 from Erik Hofman <erik at ehofman dot com> ---
This was just the shortest snippet of code that showed the situation. The reason for 32-byte alignment is that I use it with AVX code and wanted the fastest possible assignment from a float vector.
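
For illustration only, the kind of use described above might look roughly like the sketch below. It is not the snippet from this report: the names `v8sf`, `vec8_t`, `vec8_assign`, and `is_aligned32` are hypothetical, and the GCC vector extension stands in for whatever AVX code is actually involved. The idea is that a 32-byte-aligned destination lets the compiler pick aligned 256-bit moves when AVX is enabled (e.g. with `-mavx`).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-float vector type (GCC vector extension, 32 bytes wide). */
typedef float v8sf __attribute__((vector_size(32)));

/* Hypothetical struct: the array is forced to 32-byte alignment so that,
 * with AVX enabled, the compiler may use aligned 256-bit loads/stores. */
typedef struct {
    float v[8] __attribute__((aligned(32)));
} vec8_t;

/* Return nonzero if p sits on a 32-byte boundary. */
static int is_aligned32(const void *p)
{
    return ((uintptr_t)p % 32) == 0;
}

/* Assign all eight floats from a vector to the aligned struct.
 * memcpy with a constant size is typically lowered to a single
 * vector move when the target supports it (not guaranteed). */
static void vec8_assign(vec8_t *dst, const v8sf *src)
{
    assert(is_aligned32(dst));
    memcpy(dst->v, src, sizeof(dst->v));
}
```

Passing the vector by pointer rather than by value keeps the sketch compilable without `-mavx`; with AVX enabled, passing `v8sf` by value would also be an option.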