On Mon, May 16, 2011 at 7:30 PM, William J. Schmidt <wschm...@linux.vnet.ibm.com> wrote: > Richi, thank you for the detailed review! > > I'll plan to move the power-series expansion into the existing IL walk > during pass_cse_sincos. As part of this, I'll move > tree_expand_builtin_powi and its subfunctions from builtins.c into > tree-ssa-math-opts.c. I'll submit this as a separate patch. > > I will also stop attempting to make code generation match completely at > -O0. If there are tests in the test suite that fail only at -O0 due to > these changes, I'll modify the tests to require -O1 or higher. > > I understand that you'd prefer that I leave the existing > canonicalization folds in place, and only un-canonicalize them during my > new pass (now, during cse_sincos). Actually, that was my first approach > to this issue. The problem that I ran into is that the various folds > are not performed just by the front end, but can pop up later, after my > pass is done. In particular, pass_fold_builtins will undo my changes, > turning expressions involving roots back into expressions involving > pow/powi. It wasn't clear to me whether the folds could kick in > elsewhere as well, so I took the approach of shutting them down. I see > now that this does lose some optimizations such as > pow(sqrt(cbrx(x)),6.0), as you pointed out.
Yeah, it's always a delicate balance between canonicalization and optimal form for further optimization. Did you really see sqrt(cbrt(x)) being transformed back to pow()? I would doubt that, as on gimple the foldings that require two function calls to match shouldn't trigger. Or do you see sqrt(x) turned into pow(x,0.5)? I see that the vectorizer for example handles both pow(x,0.5) and pow(x,2), so indeed that might happen. I think what we might want to do is limit what the generic gimple fold_stmt folding does to function calls, also to avoid building regular generic call statements again. But that might be a bigger project and certainly should be done separately. So I'd say don't worry about this issue for the initial patch but instead deal with it separately. We also repeatedly thought about whether canonicalizing everything to pow is a good idea or not, especially our canonicalizing of x * x to pow (x, 2) leads to interesting effects in some cases, as several passes do not handle function calls very well. So I also thought about introducing a POW_EXPR tree code that would be easier in this regard and would be a more IL friendly canonical form of the power-related functions. > Should I attempt to leave the folds in place, and screen out the > particular cases that are causing trouble in pass_fold_builtins? Or is > it too fragile to try to catch all places where folds occur? If there's > a flag that indicates parsing is complete, I suppose I could disable > individual folds once we're into the optimizer. I'd appreciate your > guidance. Indeed restricting canonicalization to earlier passes would be the way to go I think. I will think of the best way to achieve this. Richard. > Thanks, > Bill > > >