This patch generates LDRD/STRD sequences for movmemqi and movmemdi when the source and destination buffers are aligned. It is disabled when optimizing for code size, because LDM/STM is preferable in such cases. When optimizing for performance, it is enabled for CPUs that set the "prefer_ldrd_strd" tune flag. This flag is currently set only for cortex-a15.
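To make the conditions above concrete, here is a minimal C sketch of the decision and of the copy shape it selects. It is a model only, not the code in the patch: struct copy_ctx, use_ldrd_strd and copy_doublewords are hypothetical names invented for illustration, and the alignment check is reduced to two booleans because the exact alignment requirement is not stated here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the compiler state the patch consults.  */
struct copy_ctx
{
  bool optimize_size;     /* Models optimizing for code size.  */
  bool prefer_ldrd_strd;  /* Models the "prefer_ldrd_strd" tune flag.  */
};

/* Return true when the LDRD/STRD expansion would be chosen under the
   conditions described above: both buffers aligned, not optimizing
   for size, and the tune flag set.  */
static bool
use_ldrd_strd (const struct copy_ctx *ctx, bool src_aligned, bool dst_aligned)
{
  return src_aligned && dst_aligned
         && !ctx->optimize_size && ctx->prefer_ldrd_strd;
}

/* Model of the copy shape: move 8 bytes per step (one LDRD plus one
   STRD in the real expansion), then copy any remaining tail bytes.
   Returns the number of 8-byte steps performed.  */
static size_t
copy_doublewords (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  size_t steps = 0;

  assert (dst != NULL && src != NULL);

  while (n >= 8)
    {
      uint64_t tmp;
      memcpy (&tmp, s, 8);  /* Stands in for LDRD.  */
      memcpy (d, &tmp, 8);  /* Stands in for STRD.  */
      s += 8;
      d += 8;
      n -= 8;
      steps++;
    }

  memcpy (d, s, n);  /* Tail of fewer than 8 bytes.  */
  return steps;
}
```

In this model a 20-byte copy takes two 8-byte steps followed by a 4-byte tail. How the real expansion handles the tail is not described here, so the byte-wise tail copy is an assumption of the sketch.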
The patch consists of the following changes:
* movmemqi/movmemdi call a new function that generates (whenever possible) a sequence of LDRD/STRD that is expected to achieve the best performance on Cortex-A15.
* Adjusted the existing test for gcc's internal memcpy to accept LDRD/STRD as well as LDM/STM.

gcc/ChangeLog

2011-10-28  Greta Yorsh  <greta.yo...@arm.com>

        * config/arm/arm.md (movmemdi,movmemqi): Generate LDRD/STRD under
        certain conditions.
        * config/arm/arm-protos.h (gen_movmem_ldrd_strd): New prototype.
        * config/arm/arm.c (gen_movmem_ldrd_strd): New function.

gcc/testsuite/ChangeLog

2011-10-28  Greta Yorsh  <greta.yo...@arm.com>

        * gcc.target/arm/unaligned-memcpy-4.c: Adjust the test by separating
        configurations that prefer LDRD/STRD from those that prefer LDM/STM.
        This file remains for the LDM/STM case.
        * gcc.target/arm/unaligned-memcpy-5.c: New file.
        * gcc.target/arm/unaligned-memcpy.inc: New file.

5-internal-memcpy.patch