On Tue, Jul 19, 2011 at 10:57 AM, Richard Guenther <richard.guent...@gmail.com> wrote:
> On Tue, Jul 19, 2011 at 8:44 AM, Ira Rosen <ira.ro...@linaro.org> wrote:
>> Hi,
>>
>> This patch tries to reduce over-promotion of vector operations that
>> could be done with narrower elements, e.g., for
>>
>> char a;
>> int b, c;
>> short d;
>>
>> b = (int) a;
>> c = b << 2;
>> d = (short) c;
>>
>> we currently produce six vec_unpack_lo/hi_expr statements for
>> char->int conversion and then two vec_pack_trunc_expr for short->int.
>> While the shift can be performed on short, using only two
>> vec_unpack_lo/hi_expr operations for char->short conversion in this
>> example.
>>
>> With this patch we detect such over-promoted sequences that start with
>> a type promotion operation and end with a type demotion operation. The
>> statements in between are checked if they can be performed using
>> smaller type (this patch only adds a support for shifts and bit
>> operations with a constant). If a sequence is detected we create a
>> sequence of scalar pattern statements to be vectorized instead the
>> original one. Since there may be two pattern statements created for
>> the same original statement - the operation itself (on an intermediate
>> type) and a type promotion (from a smaller type to the intermediate
>> type) for the non-constant operand - this patch adds a new field to
>> struct _stmt_vec_info to keep that pattern def statement.
>>
>> Bootstrapped and tested on powerpc64-suse-linux.
>> Comments are welcome.
>
> I wonder if we should do this optimization for scalars as well. We still
> do some sort of that in frontends shorten_* functions and I added
> the capability to remove intermediate conversions to VRP recently.
>
> At least it looks like VRP could be a good place to re-write operations
> in narrower types.
> That is, for a truncation statement
>
> d = (short) c;
>
> see if that truncation is value-preserving by looking at the value-range
> of C, then look if all related defs of C can be rewritten to that truncated
> type until you reach only stmts that need no further processing
> (not sure if that might be too expensive - at least I could imagine
> some artificial testcases that would exhibit quadratic behavior).
>
> You'd need to make VRP handle new SSA names during substitute_and_fold
> gracefully.
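
To make the narrowing above concrete, here is a small scalar sketch in C.
It is not GCC code. It assumes unsigned types (uint8_t, uint16_t, uint32_t
in place of char, short, int) so that the shift is well defined for every
input, and all function names are made up for illustration. It checks that
doing the shift in the 16-bit type gives the same final value as doing it
in the 32-bit type and truncating afterwards.

```c
#include <assert.h>
#include <stdint.h>

/* Wide form: promote to 32 bits, shift, then truncate to 16 bits. */
static uint16_t shift_wide(uint8_t a)
{
    uint32_t b = (uint32_t)a;
    uint32_t c = b << 2;
    return (uint16_t)c;
}

/* Narrow form: promote only to 16 bits and keep the result there.
   C's integer promotions still evaluate b << 2 in int, so the cast
   back to uint16_t models a true 16-bit shift. */
static uint16_t shift_narrow(uint8_t a)
{
    uint16_t b = (uint16_t)a;
    return (uint16_t)(b << 2);
}

/* Count the 8-bit inputs on which the two forms disagree. */
static int count_shift_mismatches(void)
{
    int mismatches = 0;
    for (int a = 0; a <= UINT8_MAX; a++)
        if (shift_wide((uint8_t)a) != shift_narrow((uint8_t)a))
            mismatches++;
    return mismatches;
}

/* Hypothetical self-check: aborts if any input disagrees. */
static void check_narrowing(void)
{
    assert(count_shift_mismatches() == 0);
}
```

The low 16 bits of a left shift depend only on the low 16 bits of the
operand, which is why the intermediate 32-bit type is not needed here.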

Are you working on this? Otherwise I'll give it a try to also handle

  D.2730_9 = (_Bool) x_2(D);
  D.2731_10 = (_Bool) y_5(D);
  D.2732_11 = D.2730_9 & D.2731_10;
  D.2729_12 = (int) D.2732_11;

transforming to

  D.2732_11 = x_2(D) & y_5(D);
  D.2729_12 = D.2732_11;

that is, try to express a converted operation in the result type
(in your case it was shortening, here it is widening) avoiding
(or adjusting in your case) conversions of the operation operands.

If we deal with single-use chains then we can even get away without
introducing new SSA names I guess (and at least in that case it's
reasonable to assess that the transformation is profitable).

Thanks,
Richard.
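
For reference, a scalar C sketch of the _Bool case. It is not GCC code,
and the function names are made up. It assumes x and y are already known
to be 0 or 1 (e.g. from value-range information), which the message does
not state explicitly; the last function shows an input outside that range
where the two forms differ under C semantics.

```c
#include <assert.h>

/* Original form: convert both operands to _Bool, AND them, widen to int. */
static int and_via_bool(int x, int y)
{
    _Bool bx = (_Bool)x;
    _Bool by = (_Bool)y;
    _Bool t = bx & by;
    return (int)t;
}

/* Transformed form: perform the AND directly in the result type. */
static int and_direct(int x, int y)
{
    return x & y;
}

/* Count disagreements over the assumed value range {0, 1} x {0, 1}. */
static int count_bool_mismatches_in_range(void)
{
    int mismatches = 0;
    for (int x = 0; x <= 1; x++)
        for (int y = 0; y <= 1; y++)
            if (and_via_bool(x, y) != and_direct(x, y))
                mismatches++;
    return mismatches;
}

/* Outside the assumed range the forms can differ: for x = 2, y = 1
   the _Bool form yields 1 while the direct form yields 0. */
static int forms_differ_outside_range(void)
{
    return and_via_bool(2, 1) != and_direct(2, 1);
}

/* Hypothetical self-check: aborts if the range assumption is violated. */
static void check_bool_widening(void)
{
    assert(count_bool_mismatches_in_range() == 0);
}
```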