On Fri, May 17, 2013 at 1:40 PM, Kugan <kugan.vivekanandara...@linaro.org> wrote: > On 13/05/13 17:47, Richard Biener wrote: >> >> On Mon, May 13, 2013 at 5:45 AM, Kugan >> <kugan.vivekanandara...@linaro.org> wrote: >>> >>> Hi, >>> >>> This patch removes some of the redundant sign/zero >>> extensions using value ranges informations generated by VRP. >>> >>> When GIMPLE_ASSIGN stmts with LHS type smaller than word is expanded to >>> rtl, if we can prove that RHS expression value can always fit in LHS >>> type and there is no sign conversion, truncation and extension to fit >>> the type is redundant. Subreg and Zero/sign extensions are therefore >>> redundant. >>> >>> For example, when an expression is evaluated and it's value is assigned >>> to variable of type short, the generated rtl would look something like >>> the following. >>> >>> (set (reg:SI 110) >>> (zero_extend:SI (subreg:HI (reg:SI 117) 0))) >>> >>> However, if during value range propagation, if we can say for certain >>> that the value of the expression which is present in register 117 is >>> within the limits of short and there is no sign conversion, we don’t >>> need to perform the subreg and zero_extend; instead we can generate the >>> following rtl. >>> >>> (set (reg:SI 110) >>> (reg:SI 117))) >>> >>> Attached patch adds value range information to gimple stmts during value >>> range propagation pass and expands rtl without subreg and zero/sign >>> extension if the value range is within the limits of it's type. >>> >>> This change improve the geomean of spec2k int benchmark with ref by about >>> ~3.5% on an arm chromebook. >>> >>> Tested on X86_64 and ARM. >>> >>> I would like review comments on this. >> >> >> A few comments on the way you preserve this information. >> > > Thanks Richard for reviewing it. > > >> --- a/gcc/gimple.h >> +++ b/gcc/gimple.h >> @@ -191,6 +191,11 @@ struct GTY((chain_next ("%h.next"))) >> gimple_statement_base { >> in there. */ >> unsigned int subcode : 16; >> >> + /* if an assignment gimple statement has RHS expression that can fit >> + LHS type, zero/sign extension to truncate is redundant. >> + Set this if we detect extension as redundant during VRP. */ >> + unsigned sign_zero_ext_redundant : 1; >> + >> >> this enlarges all gimple statements by 8 bytes, so it's out of the >> question. >> >> My original plan to preserve value-range information was to re-use >> SSA_NAME_PTR_INFO which is where we already store value-range >> information for pointers: >> >> struct GTY(()) ptr_info_def >> { >> /* The points-to solution. */ >> struct pt_solution pt; >> >> /* Alignment and misalignment of the pointer in bytes. Together >> align and misalign specify low known bits of the pointer. >> ptr & (align - 1) == misalign. */ >> >> /* When known, this is the power-of-two byte alignment of the object >> this >> pointer points into. This is usually DECL_ALIGN_UNIT for decls and >> MALLOC_ABI_ALIGNMENT for allocated storage. When the alignment is >> not >> known, it is zero. Do not access directly but use functions >> get_ptr_info_alignment, set_ptr_info_alignment, >> mark_ptr_info_alignment_unknown and similar. */ >> unsigned int align; >> >> /* When alignment is known, the byte offset this pointer differs from >> the >> above alignment. Access only through the same helper functions as >> align >> above. */ >> unsigned int misalign; >> }; >> >> what you'd do is to make the ptr_info_def reference from tree_ssa_name a >> reference to a union of the above (for pointer SSA names) and an alternate >> info like >> >> struct range_info_def { >> double_int min; >> double_int max; >> }; >> > > I have now changed the way I am preserving this information as you have > suggested. > > >> or a more compressed format (a reference to a mode or type it fits for >> example). >> >> The very specific case in question also points at the fact that we have >> two conversion tree codes, NOP_EXPR and CONVERT_EXPR, and we've >> tried to unify them (but didn't finish up that task)... >> > > We can also remove sign/zero extension in cases other than NOP_EXPR and > CONVERT_EXPR, if the value range of the RHS expressions fits LHS type > without sign change. > > For example, please look at the following gimple stmt and the resulting rtl: > > ;; c_3 = c_2(D) & 15; > > ;; Without the patch > (insn 6 5 7 (set (reg:SI 116) > (and:SI (reg/v:SI 115 [ c ]) > (const_int 15 [0xf]))) c5.c:4 -1 > (nil)) > > (insn 7 6 0 (set (reg/v:SI 111 [ c ]) > (zero_extend:SI (subreg:QI (reg:SI 116) 0))) c5.c:4 -1 > (nil)) > > ;; with the patch > (insn 6 5 7 (set (reg:SI 116) > (and:SI (reg/v:SI 115 [ c ]) > (const_int 15 [0xf]))) c5.c:4 -1 > (nil)) > > (insn 7 6 0 (set (reg/v:SI 111 [ c ]) > (reg:SI 116)) c5.c:4 -1 > (nil)) > > > >> +static void >> +process_stmt_for_ext_elimination () >> +{ >> >> we already do all the analysis when looking for opportunities to eliminate >> the intermediate cast in (T2)(T1)x in simplify_conversion_using_ranges, >> that's the place that should handle this case. >> > I have changed this. > > > I have attached patch with the above changes.
Quick comments below + /* value range information. */ + union vrp_info_type { + /* Pointer attributes used for alias analysis. */ + struct GTY ((tag ("TREE_SSA_PTR_INFO"))) ptr_info_def *ptr_info; + /* range attributes used for zero/sign extension elimination. */ + struct GTY ((tag ("TREE_SSA_RANGE_INFO"))) range_info_def *range_info; + } GTY ((desc ("tree_ssa_vrp_info_type (&%1)"))) vrp; I think desc ("POINTER_TYPE_P (TREE_TYPE (&%1))") and re-ordering of the union members would have worked, too? +struct GTY (()) range_info_def { + double_int min; + double_int max; + /* if an assignment gimple statement has RHS expression that can fit + LHS type, zero/sign extension to truncate is redundant. + Set this if we detect extension as redundant during VRP. */ + bool sign_zero_ext_redundant; +}; the sign_zero_ext_redundant flag is redundant I think. Also instead of post-processing the expand_expr_real_2 output in expand_gimple_stmt you should adjust the CASE_CONVERT case and/or the REDUCE_BITFIELD handling in expand_expr_real_2 to check whether the source operand value-range fits in the destination. Also note that this may be highly target specific and simply dropping SUBREGs and then emitting a move insn like you do looks odd. AFAIK you can at most remove truncations of > word_mode registers as SUBREGs behave differently (see also STRICT_LOW_PART). That said - you need to find another reviewer for the RTL expansion pieces. Can you split the patch into a piece introducing range_info_def and just populate the min/max members from VRP before we call substitute_and_fold? You probably want to add a flag for whether the min/max pair represents a range or an anti-range as well. Thanks, Richard. > > Thanks, > Kugan > >> Richard. >> >>> Thanks, >>> Kugan >>> >>> >>> 2013-05-09 Kugan Vivekanandarajah <kug...@linaro.org> >>> >>> * gcc/gimple.h (gimple_is_exp_fit_lhs, gimple_set_exp_fit_lhs): New >>> function. >>> * gcc/tree-vrp.c (process_stmt_for_ext_elimination): Likewise. >>> * (is_msb_set, range_fits_type): Likewise. >>> * (vrp_finalize): Call process_stmt_for_ext_elimination. >>> * gcc/dojump.c (do_compare_and_jump): generates rtl without >>> zero/sign extension >>> if process_stmt_for_ext_elimination tells so. >>> * gcc/cfgexpand.c (expand_gimple_stmt_1): Likewise. >>> >>> > >