On 3/20/23 08:01, Manolis Tsamis wrote:
On Fri, Mar 17, 2023 at 10:31 AM Richard Biener
<richard.guent...@gmail.com> wrote:

On Thu, Mar 16, 2023 at 4:27 PM Manolis Tsamis <manolis.tsa...@vrull.eu> wrote:

For this C testcase:

void g();
void f(unsigned int *a)
{
   if (++*a == 1)
     g();
}

GCC will currently emit a comparison with 1 by using the value
of *a after the increment. This can be improved by comparing
against 0 and using the value before the increment. As a result
there is a potentially shorter dependency chain (no need to wait
for the result of +1) and on targets with compare zero instructions
the generated code is one instruction shorter.

The downside is we now need two registers and their lifetime overlaps.

Your patch mixes changing / inverting a parameter (which seems unneeded
for the actual change) with preferring compares against zero.


Indeed. I thought that without that change the original names wouldn't properly
describe what the parameter actually does and that's why I've changed it.
I can undo that in the next revision.
Typically the thing to do is send that separately. If it has no functional change, then it can often go in immediately.



What's the reason to specifically prefer compares against zero?  On x86
we have add that sets flags, so ++*a == 0 would be preferred, but
for your sequence we'd need a test reg, reg; branch on zero, so we do
not save any instruction.


My reasoning is that zero is treated preferentially in most if not
all architectures. Some specifically have zero/non-zero comparisons so
we get one less instruction. X86 doesn't explicitly have that but I
think that test reg, reg may not always be needed, depending on the
rest of the code. From what Andrew mentions below there may even be
optimizations for zero at the microarchitecture level.
There are all kinds of low-level ways a test against zero is better than a test against any other value. I'm not aware of any architecture where the opposite is true.

Note that in this specific case rewriting does cause us to need two registers, so we'll want to think about the right time to make this transformation. It may be the case that doing it in gimple is too early.




Because this is still an arch-specific thing I initially tried to make
it arch-dependent by invoking the target's cost functions (e.g. if I
recall correctly aarch64 will return a lower cost for zero
comparisons). But the code turned out complicated and messy so I came
up with this alternative that just treats zero preferentially.

If you have a better way to do this in mind, I could try to
implement it.
And in general I think you approached this in the preferred way -- it's largely a target independent optimization, so let's tackle it in a generic way.

Anyway, we'll dive into it once gcc-14 development opens and try to figure out the best way forward.

jeff
