On 20 July 2011 17:28, Richard Guenther <richard.guent...@gmail.com> wrote:
> On Tue, Jul 19, 2011 at 10:57 AM, Richard Guenther
> <richard.guent...@gmail.com> wrote:
>> On Tue, Jul 19, 2011 at 8:44 AM, Ira Rosen <ira.ro...@linaro.org> wrote:
>>> Hi,
>>>
>>> This patch tries to reduce over-promotion of vector operations that
>>> could be done with narrower elements, e.g., for
>>>
>>>   char a;
>>>   int b, c;
>>>   short d;
>>>
>>>   b = (int) a;
>>>   c = b << 2;
>>>   d = (short) c;
>>>
>>> we currently produce six vec_unpack_lo/hi_expr statements for
>>> char->int conversion and then two vec_pack_trunc_expr for int->short.
>>> While the shift can be performed on short, using only two
>>> vec_unpack_lo/hi_expr operations for char->short conversion in this
>>> example.
>>>
>>> With this patch we detect such over-promoted sequences that start with
>>> a type promotion operation and end with a type demotion operation. The
>>> statements in between are checked if they can be performed using
>>> smaller type (this patch only adds a support for shifts and bit
>>> operations with a constant). If a sequence is detected we create a
>>> sequence of scalar pattern statements to be vectorized instead of the
>>> original one.  Since there may be two pattern statements created for
>>> the same original statement - the operation itself (on an intermediate
>>> type) and a type promotion (from a smaller type to the intermediate
>>> type) for the non-constant operand - this patch adds a new field to
>>> struct _stmt_vec_info to keep that pattern def statement.
>>>
>>> Bootstrapped and tested on powerpc64-suse-linux.
>>> Comments are welcome.
>>
>> I wonder if we should do this optimization for scalars as well.  We still
>> do some sort of that in frontends shorten_* functions and I added
>> the capability to remove intermediate conversions to VRP recently.
>>
>> At least it looks like VRP could be a good place to re-write operations
>> in narrower types.  That is, for a truncation statement
>>
>>   d = (short) c;
>>
>> see if that truncation is value-preserving by looking at the value-range
>> of C, then look if all related defs of C can be rewritten to that truncated
>> type until you reach only stmts that need no further processing
>> (not sure if that might be too expensive - at least I could imagine
>> some artificial testcases that would exhibit quadratic behavior).
>>
>> You'd need to make VRP handle new SSA names during substitute_and_fold
>> gracefully.
>
> Are you working on this?  Otherwise I'll give it a try to also handle
>
>   D.2730_9 = (_Bool) x_2(D);
>   D.2731_10 = (_Bool) y_5(D);
>   D.2732_11 = D.2730_9 & D.2731_10;
>   D.2729_12 = (int) D.2732_11;
>
> transforming to
>
>   D.2732_11 = x_2(D) & y_5(D);
>   D.2729_12 = D.2732_11;
>
> that is, try to express a converted operation in the result type
> (in your case it was shortening, here it is widening) avoiding
> (or adjusting in your case) conversions of the operation operands.
>
> If we deal with single-use chains then we can even get away
> without introducing new SSA names I guess (and at least in that
> case it's reasonable to assess that the transformation is profitable).

I started looking into this, but haven't made much progress. So, please,
go ahead.

Thanks,
Ira

>
> Thanks,
> Richard.
>
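
For reference, here is a minimal scalar sketch of why the char -> int ->
shift -> short sequence quoted above can use short as the intermediate
type. It is not code from the patch. The function names (shift_wide,
shift_narrow, count_mismatches, check_narrowing) are made up for
illustration, and unsigned fixed-width types stand in for the vector
element types so that the left shift of a negative value stays well
defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Over-promoted form: sign-extend to 32 bits, shift, truncate to 16 bits.
   Unsigned arithmetic models the bit pattern without signed-shift UB. */
static uint16_t shift_wide(int8_t a)
{
    uint32_t b = (uint32_t)(int32_t)a;   /* b = (int) a   */
    uint32_t c = b << 2;                 /* c = b << 2    */
    return (uint16_t)c;                  /* d = (short) c */
}

/* Narrowed form: sign-extend to 16 bits only and shift there. */
static uint16_t shift_narrow(int8_t a)
{
    uint16_t b = (uint16_t)(int16_t)a;   /* char -> short promotion */
    return (uint16_t)(b << 2);           /* shift, keep the low 16 bits */
}

/* Exhaustively compare both forms over every 8-bit input value. */
static int count_mismatches(void)
{
    int bad = 0;
    for (int v = INT8_MIN; v <= INT8_MAX; v++)
        if (shift_wide((int8_t)v) != shift_narrow((int8_t)v))
            bad++;
    return bad;
}

/* Hypothetical helper: aborts if any input disagrees. */
static void check_narrowing(void)
{
    assert(count_mismatches() == 0);
}
```

Calling check_narrowing() from a small main() should run silently. The
two forms agree because the low 16 bits of a left shift depend only on
the low 16 bits of the operand, so the upper half of the int
intermediate is never observed after the truncation.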