Hi,
Consider the following code:
int foo(char *t, char *v, int w)
{
    int i;
    for (i = 1; i != w; ++i)
    {
        int x = i << 2;
        v[x + 4] = t[x + 4];
    }
    return 0;
}
Compile it for x86 (I tried both gcc 4.7.2 and gcc 4.8.1) with these options:
    gcc -O2 -m32 -S test.c
You will see a loop like this:
.L5:
    leal    0(,%eax,4), %edx
    addl    $1, %eax
    movzbl  4(%edi,%edx), %ecx
    cmpl    %ebx, %eax
    movb    %cl, 4(%esi,%edx)
    jne     .L5
But it can easily be simplified to something like this:
.L5:
    addl    $1, %eax
    movzbl  (%esi,%eax,4), %edx
    cmpl    %ecx, %eax
    movb    %dl, (%ebx,%eax,4)
    jne     .L5
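(i.e. the left shift can be folded into the scaled-index addressing mode,
because x + 4 == 4 * (i + 1) and the incremented counter is already in %eax).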
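To make the intended transformation concrete, here is a rough source-level
sketch of it. The names foo_scaled, same_result and BUF_SIZE are made up for
this illustration only, and I am not claiming that gcc emits the shorter loop
for the rewritten form; the sketch only checks that indexing by j = i + 1
with a scale of 4 touches the same bytes as the original loop.

```c
#include <assert.h>
#include <string.h>

enum { BUF_SIZE = 256 };   /* hypothetical buffer size for the check */

/* Original loop from above: the shift is computed as a separate value. */
int foo(char *t, char *v, int w)
{
    int i;
    for (i = 1; i != w; ++i)
    {
        int x = i << 2;
        v[x + 4] = t[x + 4];
    }
    return 0;
}

/* Hand-rewritten form (an assumption, not gcc output):
   with j = i + 1 we have x + 4 == 4 * j, so j can act as a scaled index. */
int foo_scaled(char *t, char *v, int w)
{
    int j;
    for (j = 2; j != w + 1; ++j)
        v[4 * j] = t[4 * j];
    return 0;
}

/* Returns 1 if both versions write identical bytes for the given w. */
int same_result(int w)
{
    char src[BUF_SIZE], a[BUF_SIZE], b[BUF_SIZE];
    int k;

    /* The loops touch indices 8 .. 4*w, so keep them inside the buffers. */
    assert(w >= 1 && 4 * w < BUF_SIZE);

    for (k = 0; k < BUF_SIZE; ++k)
        src[k] = (char)(k % 100 + 1);   /* non-zero test pattern */
    memset(a, 0, sizeof a);
    memset(b, 0, sizeof b);

    foo(src, a, w);
    foo_scaled(src, b, w);
    return memcmp(a, b, sizeof a) == 0;
}
```

Calling same_result(w) for a few values of w (with 1 <= w and 4 * w <
BUF_SIZE) should return 1 each time, which is the equivalence the shorter
assembly loop relies on.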
My first question is for the gcc-help mailing list: are there any options
that I have missed that would let me tell gcc to generate code like
this?
My second question is for the gcc developers' mailing list: I am working on
a private backend and want to add this optimization to it. What would you
advise me to do: write a custom GIMPLE pass, write an RTL pass, modify
an existing pass, or something else?
---
With best regards, Konstantin