On 27/04/12 11:49, Richard Guenther wrote:
>> It feels to me that GCC46 version is better:
>> * no branch to subroutine memcpy;
>> * less stack usage (argument to enterl);
>>
>> So, using our block copy (bc2) instruction is an optimisation, don't you
>> think?
>
> Yes, it inlines it.  You may want to look at s390 which I believe has
> a similar block-copy operation.


I am not sure I understand your comment. GCC46 emits the bc2 instruction because of my implementation of the movmemqi pattern. If I remove it, as you suggested, GCC47 will always call memcpy, which is worse.
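
For reference, a minimal sketch of the kind of source this affects. The struct, its 64-byte size, and the function names are hypothetical and chosen only for illustration; they are not taken from the port under discussion. As far as I understand, a fixed-size copy like this is what GCC may expand inline through a target's block-move pattern when one is defined, and otherwise typically lowers to a call to memcpy.

```c
#include <string.h>

/* Hypothetical 64-byte aggregate; the size is only for illustration. */
struct block {
    unsigned char bytes[64];
};

/* Aggregate assignment: the compiler chooses how to copy the 64 bytes
   (inline expansion or a library call). */
void copy_by_assignment(struct block *dst, const struct block *src)
{
    *dst = *src;
}

/* Explicit memcpy with a compile-time constant size: the compiler may
   expand this inline as well instead of emitting a call. */
void copy_by_memcpy(struct block *dst, const struct block *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Returns 1 if both blocks hold identical bytes, 0 otherwise. */
int blocks_equal(const struct block *a, const struct block *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
```

Compiling something like this with -S under both compiler versions and comparing the output should show whether the copy is expanded inline or turned into a call.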

I will be looking at s390 for inspiration. Thanks for the suggestion.

--
PMatos
