https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67366
--- Comment #7 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 27 Aug 2015, ramana at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67366
>
> --- Comment #6 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
> (In reply to rguent...@suse.de from comment #3)
> > On Thu, 27 Aug 2015, rearnsha at gcc dot gnu.org wrote:
> >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67366
> > >
> > > --- Comment #2 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
> > > (In reply to Richard Biener from comment #1)
> > > > I think this boils down to the fact that memcpy expansion is done
> > > > too late and that (with more recent GCC) the "inlining" done on the
> > > > GIMPLE level is restricted to !SLOW_UNALIGNED_ACCESS but arm defines
> > > > STRICT_ALIGNMENT to 1 unconditionally.
> > > >
> > >
> > > Yep, we have to define STRICT_ALIGNMENT to 1 because not all load
> > > instructions work with misaligned addresses (ldm, for example).  The
> > > only way to handle misaligned copies is through the movmisalign API.
> >
> > Are the movmisalign handled ones reasonably efficient?  That is, more
> > efficient than memcpy/memmove?  Then we should experiment with
>
> minor nit - missing include of optabs.h - fixing that and adding a
> movmisalignsi pattern in the backend that just generates either an
> unaligned / storesi insn generates the following for me for the above
> mentioned testcase.
>
> read32:
>         @ args = 0, pretend = 0, frame = 0
>         @ frame_needed = 0, uses_anonymous_args = 0
>         @ link register save eliminated.
>         ldr     r0, [r0]        @ unaligned
>         bx      lr
>
> I'm on holiday from this evening so don't really want to push something
> today ...

Sure ;)  When adding the GIMPLE folding I was just careful here as I
don't really have a STRICT_ALIGNMENT machine with movmisalign handling
available.  Thus full testing is appreciated (might turn up some
testcases that need adjustment).

There are more STRICT_ALIGNMENT-guarded cases below in the function;
eventually they can be modified as well (at which point splitting out
the alignment check into a separate function makes sense).
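
For readers without the bug report open: the testcase itself is not quoted in this message, so the following is only a sketch of the kind of function under discussion, assuming it is the usual memcpy-based unaligned load. The name read32 comes from the assembly quoted in comment #6; the parameter name and the body are assumptions.

```c
#include <stdint.h>
#include <string.h>

/* Assumed shape of the testcase: read a 32-bit value through a pointer
   that carries no alignment guarantee.  The fixed-size memcpy is the
   portable way to express this; the point of the discussion is whether
   the compiler folds it into a single (possibly misaligned) load rather
   than a library call or a byte-by-byte copy.  */
uint32_t
read32 (const void *ptr)
{
  uint32_t v;

  memcpy (&v, ptr, sizeof v);
  return v;
}
```

With the movmisalignsi pattern described in comment #6, this is the sort of function that would be expected to compile to the single "ldr r0, [r0] @ unaligned" shown there; the result on any given target and GCC version still has to be checked.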