Hi Jeff, On 20 May 2016 at 04:17, Jeff Law <l...@redhat.com> wrote: > On 05/15/2016 06:45 PM, Kugan Vivekanandarajah wrote: >> >> Hi Richard, >> >> Now that stage1 is open, I would like to get the type promotion passes >> reviewed again. I have tested the patches on aarch64, x86-64, and >> ppc64le without any new execution failures. There some test-cases that >> fails for patterns. I will address them after getting feedback on the >> basic structure. > > I find myself wondering if this will eliminate some of the cases where Kai's > type casting motion was useful. And just to be clear, that would be a good > thing.
Let me try with the above patch. However: aarch64 testsuite diff is: # Comparing 11 common sum files ## /bin/sh ./gcc/contrib/compare_tests /tmp/gxx-sum1.20074 /tmp/gxx-sum2.20074 Tests that now fail, but worked before: gcc.dg/tree-ssa/pr54245.c scan-tree-dump-times slsr "Inserting initializer" 0 gcc.dg/tree-ssa/pr69270-3.c scan-tree-dump-times uncprop1 ", 1" 1 New tests that PASS: gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion, -O1 comparison gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion, -O2 comparison gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion, -O2 -flto -fno-use-linker-plugin -flto-partition=none comparison gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion, -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions comparison gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion, -O3 -g comparison gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion, -Os comparison ## Differences found: # 1 differences in 11 common sum files found ppc64 had more. I see improvements and some regressions too; For example: short unPack( unsigned char c ) { c = c & (unsigned char)0x0F ; if( c > 7 ) { return( ( short )( c - 8 ) ) ; } else { return( ( short )c ) ; } } asm differnce: and w0, w0, 15 - cmp w0, 7 - bhi .L5 - sxth w0, w0 - ret - .p2align 3 -.L5: - sub w0, w0, #8 - sxth w0, w0 + sub w1, w0, #8 + cmp w0, 8 + csel w0, w1, w0, cs ret short bar (short y); int foo (short x) { short y = bar (x) + 15; return y; } Base - optimized gimple foo (short int x) { short int y; short int _1; unsigned short _2; unsigned short _3; int _8; <bb 2>: _1 = bar (x_5(D)); _2 = (unsigned short) _1; _3 = _2 + 15; y_7 = (short int) _3; _8 = (int) y_7; return _8; } With type promotion - optimized gimple foo (short int x) { unsigned int _2; unsigned int _3; signed int _7; int _8; short int _10; unsigned int _12; <bb 2>: _10 = bar (x_9(D)); _2 = (unsigned int) _10; _3 = _2 + 15; _12 = _3 & 65535; _7 = (signed int) _12; _8 = (_7) sext (16); return _8; } ASM difference: stp x29, x30, [sp, -16]! add x29, sp, 0 bl bar + sxth w0, w0 add w0, w0, 15 ldp x29, x30, [sp], 16 sxth w0, w0 Thanks, Kugan > >> >> 1. When we promote SSA as part of promote_ssa, we either promote the >> definition. Or create a copy stmt that is inserted after the stmt that >> define it. i.e, we want to promote the SSA and reflect the promotion >> on all the uses (we promote in place). We do this because, we don’t >> want to change all the uses. >> >> +/* Promote definition DEF to promoted type. If the stmt that defines def >> + is def_stmt, make the type of def promoted type. If the stmt is such >> + that, result of the def_stmt cannot be of promoted type, create a >> new_def >> + of the original_type and make the def_stmt assign its value to newdef. >> + Then, create a NOP_EXPR to convert new_def to def of promoted type. >> + >> + For example, for stmt with original_type char and promoted_type int: >> + char _1 = mem; >> + becomes: >> + char _2 = mem; >> + int _1 = (int)_2; > > When does this case happen, and how is this any better than PRE or other > elimination/code motion algorithms in improving the generated code? > > I would hazard a guess that it could happen if you still needed the char > sized used in a small number of cases, but generally wanted to promote most > uses to int? > >> + >> >> However, if the defining stmt has to be the last stmt in the basic >> block (eg, stmt that can throw), and if there is more than one normal >> edges where we use this value, we cant insert the copy in all the >> edges. Please note that the copy stmt copes the value to promoted SSA >> with the same name. >> >> Therefore I had to return false in this case for promote_ssa and fixup >> uses. I ran into this while testing ppc64le. I am sure it can happen >> in other cases. > > Right. > > Jeff