Hi Jeff,

On 20 May 2016 at 04:17, Jeff Law <l...@redhat.com> wrote:
> On 05/15/2016 06:45 PM, Kugan Vivekanandarajah wrote:
>>
>> Hi Richard,
>>
>> Now that stage1 is open, I would like to get the type promotion passes
>> reviewed again. I have tested the patches on aarch64, x86-64, and
>> ppc64le without any new execution failures. There some test-cases that
>> fails for patterns. I will address them after getting feedback on the
>> basic structure.
>
> I find myself wondering if this will eliminate some of the cases where Kai's
> type casting motion was useful.  And just to be clear, that would be a good
> thing.

Let me try with the above patch.
However:

aarch64 testsuite diff is:
# Comparing 11 common sum files
## /bin/sh ./gcc/contrib/compare_tests  /tmp/gxx-sum1.20074 /tmp/gxx-sum2.20074
Tests that now fail, but worked before:

gcc.dg/tree-ssa/pr54245.c scan-tree-dump-times slsr "Inserting initializer" 0
gcc.dg/tree-ssa/pr69270-3.c scan-tree-dump-times uncprop1 ", 1" 1

New tests that PASS:

gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion,  -O1  comparison
gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion,  -O2  comparison
gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion,  -O2 -flto
-fno-use-linker-plugin -flto-partition=none  comparison
gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion,  -O3
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer
-finline-functions  comparison
gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion,  -O3 -g  comparison
gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion,  -Os  comparison

## Differences found:
# 1 differences in 11 common sum files found


ppc64 had more.

I see improvements and some regressions too;

For example:

short unPack( unsigned char c )
{
    c = c & (unsigned char)0x0F ;

    if( c > 7 ) {
        return( ( short )( c - 8 ) ) ;
    }
    else
    {
        return( ( short )c ) ;
    }
}

asm differnce:
     and    w0, w0, 15
-    cmp    w0, 7
-    bhi    .L5
-    sxth    w0, w0
-    ret
-    .p2align 3
-.L5:
-    sub    w0, w0, #8
-    sxth    w0, w0
+    sub    w1, w0, #8
+    cmp    w0, 8
+    csel    w0, w1, w0, cs
     ret


short bar (short y);
int foo (short x)
{
  short y = bar (x) + 15;
  return y;
}

Base - optimized gimple
foo (short int x)
{
  short int y;
  short int _1;
  unsigned short _2;
  unsigned short _3;
  int _8;

  <bb 2>:
  _1 = bar (x_5(D));
  _2 = (unsigned short) _1;
  _3 = _2 + 15;
  y_7 = (short int) _3;
  _8 = (int) y_7;
  return _8;

}

With type promotion - optimized gimple
foo (short int x)
{
  unsigned int _2;
  unsigned int _3;
  signed int _7;
  int _8;
  short int _10;
  unsigned int _12;

  <bb 2>:
  _10 = bar (x_9(D));
  _2 = (unsigned int) _10;
  _3 = _2 + 15;
  _12 = _3 & 65535;
  _7 = (signed int) _12;
  _8 = (_7) sext (16);
  return _8;

}

ASM difference:

     stp    x29, x30, [sp, -16]!
     add    x29, sp, 0
     bl    bar
+    sxth    w0, w0
     add    w0, w0, 15
     ldp    x29, x30, [sp], 16
     sxth    w0, w0


Thanks,
Kugan


>
>>
>> 1. When we promote SSA as part of promote_ssa, we either promote the
>> definition. Or create a copy stmt that is inserted after the stmt that
>> define it. i.e, we want to promote the SSA and reflect the promotion
>> on all the uses (we promote in place). We do this because, we don’t
>> want to change all the uses.
>>
>> +/* Promote definition DEF to promoted type.  If the stmt that defines def
>> +   is def_stmt, make the type of def promoted type.  If the stmt is such
>> +   that, result of the def_stmt cannot be of promoted type, create a
>> new_def
>> +   of the original_type and make the def_stmt assign its value to newdef.
>> +   Then, create a NOP_EXPR to convert new_def to def of promoted type.
>> +
>> +   For example, for stmt with original_type char and promoted_type int:
>> +        char _1 = mem;
>> +    becomes:
>> +        char _2 = mem;
>> +        int _1 = (int)_2;
>
> When does this case happen, and how is this any better than PRE or other
> elimination/code motion algorithms in improving the generated code?
>
> I would hazard a guess that it could happen if you still needed the char
> sized used in a small number of cases, but generally wanted to promote most
> uses to int?
>
>> +
>>
>> However, if the defining stmt has to be the last stmt in the basic
>> block (eg, stmt that can throw), and if there is more than one normal
>> edges where we use this value, we cant insert the copy in all the
>> edges. Please note that the copy stmt copes the value to promoted SSA
>> with the same name.
>>
>> Therefore I had to return false in this case for promote_ssa and fixup
>> uses. I ran into this while testing ppc64le. I am sure it can happen
>> in other cases.
>
> Right.
>
> Jeff

Reply via email to