On Thu, Mar 15, 2012 at 3:34 PM, Kai Tietz <ktiet...@googlemail.com> wrote:
> 2012/3/15 Richard Guenther <richard.guent...@gmail.com>:
>> On Thu, Mar 15, 2012 at 3:00 PM, Jakub Jelinek <ja...@redhat.com> wrote:
>>> On Thu, Mar 15, 2012 at 02:53:10PM +0100, Kai Tietz wrote:
>>>> > This looks like to match unbound pattern sizes and thus does not fit
>>>> > into the forwprop machinery.  Instead it was suggested elsewhere
>>>> > that promoting / demoting registers should be done in a separate pass
>>>> > where you can compute a lattice of used bits and apply a transform
>>>> > based on that lattice and target information (according to PROMOTE_MODE
>>>> > for example).
>>>>
>>>> Well, the integer truncation part might be something for a separate
>>>> pass.  It could then also take care that within single-use
>>>> gimple-statements the integral-constant is always on right-hand-side
>>>> of first statement of an +, -, |, ^, &, and mul.
>>>>
>>>> But the cast-hoisting code itself is not unbound AFAICS and has fixed
>>>> pattern size.
>>
>> Can you split that part out then please?
>
> I can do.  In fact just the part of calling
>
> Sure, it would be the removal of the function truncate_integers and its call.
>
>>> The type demotion is PR45397/PR47477 among other PRs.
>>> I'd just walk from the narrowing integer conversion stmts recursively
>>> through the def stmts, see if they can be narrowed, note it, and finally if
>>> everything or significant portion of the stmts can be demoted (if not all,
>>> with some narrowing integer conversion stmt inserted), do it all together.
>
> Jakub, this might be something good to have it in separate pass.
> Right now I need to avoid some type-hoisting in forward-propagation,
> as otherwise it would loop endless caused by type-sinking code in
> forward-propagation.

Well, that sounds like one of the two is not applying costs properly.  We
should have a total ordering cost-wise on both forms.  Which forms do we
iterate on?

> Only question would be where such pass would be
> best placed.  After or before forward-propagation pass?

Somewhere before loop optimizations (especially IVOPTs).  Generally it might
help SCEV analysis, so I'd do it before PRE, after the CCP/copy-prop passes.

>> For PROMOTE_MODE targets you'd promote but properly mask out
>> constants (to make them cheaper to generate, for example).  You'd
>> also take advantage of targets that can do zero/sign-extending loads
>> without extra cost (ISTR that's quite important for some SPEC 2k6
>> benchmark on x86_64).
>
> Hmm, as we are talking about truncation-casts here, what is the reason
> for PROMOTE_MODE here?
> You mean to generate for a PROMOTE_MODE target explicit something like:
> D1 = 0x80
> D2 = (int) D1
>
> instead of having D1 = 0xffffff80.  Isn't that a decision done on RTL level?

PROMOTE_MODE is about targets only having basic operations in the mode
PROMOTE_MODE promotes to.  For example (IIRC) powerpc can only do arithmetic
on full register width, not on arbitrary sub-regs as i386 can.  Which means
that if you need sub-reg precision values we have to insert truncations after
every operation on RTL.  Thus, if you artificially lower the precision of
computations on the tree level this will pessimize things on RTL.  So, you
should never narrow operations below what PROMOTE_MODE would do.  In fact,
you might as well expose the PROMOTE_MODE fact on the tree level.

> Another interesting type-hoisting part is to check if a type-sign
> change might be profitable.
>
> Eg:
> signed char D0, D1, D5;
> int D2,D3,D4
> ...
> D2 = (int) D0
> D3 = (int) D1
> D4 = D2 + D3
> D5 = (signed char) D4
> D6 = D5 == 0x8f
>
> to
>
> signed char D0, D1, D5;
> unsigned char D2,D3,D4
> ...
> D2 = (unsigned char) D0
> D3 = (unsigned char) D1
> D4 = D2 + D3
> D5 = D4 == 0x7fu

You need to watch for undefined overflow issues on the narrowed expressions
anyway (thus, have to resort to unsigned arithmetic in nearly all cases).

Richard.

>> Richard.
>>
>>> Jakub
>
> Regards,
> Kai
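
Illustrative note (not part of the original mail): the point about resorting
to unsigned arithmetic when narrowing can be checked with a small C sketch.
The function names add_wide, add_narrow and count_mismatches are hypothetical
and exist only for this example. The sketch assumes an 8-bit char and GCC's
documented modulo behaviour for out-of-range conversions to a signed type,
which ISO C leaves implementation-defined. It only compares the widened
addition with the narrowed unsigned addition; it does not model the
comparison against a constant from the quoted example.

```c
/* Wide form: widen both operands to int, add, then truncate.
   The conversion back to signed char is implementation-defined in ISO C
   when the value is out of range; modulo reduction (as GCC documents)
   is assumed here. */
signed char add_wide(signed char d0, signed char d1)
{
    int d2 = (int) d0;
    int d3 = (int) d1;
    int d4 = d2 + d3;   /* cannot overflow int: the sum is in [-256, 254] */
    return (signed char) d4;
}

/* Narrowed form: the addition is done in unsigned char, so wraparound is
   well defined instead of being signed overflow. */
signed char add_narrow(signed char d0, signed char d1)
{
    unsigned char d2 = (unsigned char) d0;
    unsigned char d3 = (unsigned char) d1;
    /* C promotes d2 + d3 to int; the cast models an 8-bit unsigned add. */
    unsigned char d4 = (unsigned char) (d2 + d3);
    return (signed char) d4;
}

/* Returns the number of operand pairs for which the two forms differ. */
int count_mismatches(void)
{
    int mismatches = 0;
    for (int a = -128; a <= 127; a++)
        for (int b = -128; b <= 127; b++)
            if (add_wide((signed char) a, (signed char) b)
                != add_narrow((signed char) a, (signed char) b))
                mismatches++;
    return mismatches;
}
```

Under those assumptions both forms should agree for every operand pair,
because both reduce the sum modulo 256.  Doing the narrowed addition directly
in signed char would not be safe in the same way, since signed overflow is
undefined at the GIMPLE level.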