https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308
--- Comment #55 from wilco at gcc dot gnu.org ---
(In reply to Bernd Edlinger from comment #39)
> Created attachment 39940 [details]
> proposed patch, v2
>
> last upload was accidentally truncated.
> uploaded the right patch.

Right, so looking at your patch, I think we should make the LDRD peephole
change in a separate patch.

I tried your foo example on all combinations of ARM, Thumb-2, VFP and NEON on
various CPUs with both settings of prefer_ldrd_strd. In all cases the current
GCC generates LDRD/STRD, even for zero offsets.

The one exception is that CPUs where prefer_ldrd_strd=false emit LDR/STR for
the shifts with -msoft-float or -mfpu=vfp (but not with -mfpu=neon). This is
clearly inconsistent given that LDRD/STRD is used in all other cases, and
prefer_ldrd_strd seems to control only whether LDRD/STRD is preferred in the
prolog/epilog and inlined memcpy.

So that means we should remove the odd checks for codesize and
current_tune->prefer_ldrd_strd from all the peepholes.
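
For anyone who wants to repeat this kind of comparison, here is a minimal
sketch of a test case. It is not the foo example from the attachment; the
function name and body are hypothetical stand-ins that simply combine a 64-bit
load, a shift and a 64-bit store, which is the pattern where the LDRD/STRD
versus LDR/STR choice shows up.

```c
#include <stdint.h>

/* Hypothetical stand-in, NOT the foo example from the thread.
   Rotates the 64-bit value at *p left by one bit, in place. */
uint64_t rotl1_in_place(uint64_t *p)
{
    uint64_t v = *p;            /* 64-bit load: one LDRD or two LDRs */
    v = (v << 1) | (v >> 63);   /* 64-bit shift done on a register pair */
    *p = v;                     /* 64-bit store: one STRD or two STRs */
    return v;
}
```

Compiling this to assembly with an ARM cross compiler at -O2, once with
-mfpu=neon and once with -mfpu=vfp, and then repeating for -marm and -mthumb
and for different -mcpu values, should show which instruction form each
combination picks. Which CPUs have prefer_ldrd_strd=false is not stated here,
so the choice of -mcpu values is left open.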