On 27/04/12 11:49, Richard Guenther wrote:
> > It feels to me that the GCC46 version is better:
> > * no branch to subroutine memcpy;
> > * less stack usage (argument to enterl);
> > So, using our block copy (bc2) instruction is an optimisation, don't you
> > think?
>
> Yes, it inlines it. You may want to look at s390, which I believe has
> a similar block-copy operation.

I am not sure I understood your comment. GCC46 generates the bc2 call
due to my implementation of movmemqi. If I remove it, as you suggested,
GCC47 will always call memcpy and will be worse off.
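
As a minimal sketch (hypothetical names, not the test case from this thread), a fixed-size struct assignment is the kind of code that can go through a target's block-move expander such as movmemqi when one is provided, and otherwise typically ends up as a call to memcpy:

```c
#include <stddef.h>

/* Hypothetical aggregate; the 64-byte size is chosen only for illustration. */
struct packet {
    unsigned char payload[64];
};

/* A struct assignment is a fixed-size block copy. Depending on the target's
   block-move pattern (if any) and the size involved, the compiler may expand
   it inline or emit a call to memcpy. */
void copy_packet(struct packet *dst, const struct packet *src)
{
    *dst = *src;
}
```

Comparing the assembly for a function like this between the two compiler versions would show whether the inline block copy or the memcpy call is chosen.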
I will be looking at s390 for inspiration. Thanks for the suggestion.
--
PMatos