> Hi!
>
> I'm looking at improving inlining heuristics at the moment,
> especially by questioning estimate_num_insns. All uses
> of that function assume it to return a size cost, not a computation
> cost - is that correct? If so, why do we penalize, for instance,
> EXACT_DIV_EXPR compared to MULT_EXPR?

Well, not really. At least for inlining, the notion of cost is mixed: if
the function is either slow or big, inlining it is not a good idea. For
post-inlining in the CFG world, I plan to disambiguate the two, but in
the current implementation both quantities seemed so raw that doing
something more precise with them didn't seem to make much sense. I
believe I have a patch separating the size and speed computations for
tree-profiling around on my notebook, so I can pass it to you in case
you want to help with tuning this. We can do a pretty close estimation
there, since we can build an estimated profile: we then know that
functions with loops take longer, while tree-like functions can be fast
even if they are big.
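To make the size/time split concrete, here is a minimal sketch over a
toy expression tree. The node kinds, the weight tables, and the
estimate_costs() walker are hypothetical stand-ins, not the real
tree-inline.c code; the point is only that a division can carry the
same size weight as a multiply while being penalized in time.

#include <stdio.h>

enum toy_code { TOY_MODIFY, TOY_MULT, TOY_EXACT_DIV, TOY_CALL };

/* Separate per-node weights: 'size' approximates code size, 'time'
   approximates cycles.  An exact division is no bigger than a
   multiply, but considerably slower, so only its time weight is
   penalized.  All numbers here are made up for illustration.  */
static const int size_weight[] =
  { [TOY_MODIFY] = 1, [TOY_MULT] = 1, [TOY_EXACT_DIV] = 1, [TOY_CALL] = 3 };
static const int time_weight[] =
  { [TOY_MODIFY] = 1, [TOY_MULT] = 4, [TOY_EXACT_DIV] = 20, [TOY_CALL] = 10 };

struct toy_node { enum toy_code code; struct toy_node *op0, *op1; };
struct costs { int size; int time; };

/* Walk the tree once, accumulating both quantities side by side.  */
static void estimate_costs (const struct toy_node *n, struct costs *c)
{
  if (!n)
    return;
  c->size += size_weight[n->code];
  c->time += time_weight[n->code];
  estimate_costs (n->op0, c);
  estimate_costs (n->op1, c);
}

int main (void)
{
  /* x = a * b  versus  x = a / b: same size, different time.  */
  struct toy_node mult = { TOY_MULT, 0, 0 };
  struct toy_node div  = { TOY_EXACT_DIV, 0, 0 };
  struct toy_node s1 = { TOY_MODIFY, &mult, 0 };
  struct toy_node s2 = { TOY_MODIFY, &div, 0 };
  struct costs c1 = { 0, 0 }, c2 = { 0, 0 };

  estimate_costs (&s1, &c1);
  estimate_costs (&s2, &c2);
  printf ("mult: size=%d time=%d\n", c1.size, c1.time);
  printf ("div:  size=%d time=%d\n", c2.size, c2.time);
  return 0;
}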
> Also, for the simple function
>
>   double foo1(double x)
>   {
>     return x;
>   }
>
> we return 4 as a cost, because we have
>
>   double tmp = x;
>   return tmp;
>
> and count the move cost (MODIFY_EXPR) twice. We could fix this
> by not walking (i.e. ignoring) RETURN_EXPR.

That would work, yes. I was also thinking about ignoring MODIFY_EXPR
for var = var copies, as those likely get propagated later anyway.

> Also, INSNS_PER_CALL is rather high (10) - what is this choice
> based on? Wouldn't it be better to at least make it proportional
> to the argument chain length? Or, even more advanced, to the move
> cost of the arguments?

Probably. The choice of constant is completely arbitrary. It is not too
high cycle-wise (the Athlon, at least, spends over 10 cycles per call),
but I never experimented with different values of it. There are, I
believe, two copies of this constant, one in tree-inline and one in
cgraphunit, that need to be kept in sync; I have to clean that up. A
rough sketch of a per-argument call cost is at the end of this message.

> Finally, is there a set of testcases that can be used as a metric
> of whether improvements are improvements?

This is the major problem here. I use a combination of SPEC (for the C
benchmarks), Gerald's application, and tramp3d, but all of these have
very different behaviour and thus hardly cover the "common cases". If
someone can come up with a more reasonable testing method, I would be
very happy - so far I simply test on all of those, and when the results
seem to be a win in all three tests (or at least no loss), I apply the
changes.

Honza

> Thanks,
> Richard.
>
> --
> Richard Guenther <richard dot guenther at uni-tuebingen dot de>
> WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/
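Regarding the INSNS_PER_CALL point, here is the sketch mentioned above:
a per-argument call cost instead of the flat constant. The base cost,
the argument kinds, and the move_cost_of_arg() helper are all
hypothetical, made up for illustration rather than taken from GCC.

#include <stddef.h>
#include <stdio.h>

enum arg_kind { ARG_SCALAR, ARG_FLOAT, ARG_AGGREGATE };

/* Hypothetical per-argument move costs: passing an aggregate by value
   is assumed to cost more than moving a scalar into a register.  */
static int move_cost_of_arg (enum arg_kind k)
{
  switch (k)
    {
    case ARG_AGGREGATE:
      return 4;
    default:
      return 1;
    }
}

/* A small base cost for the call/return sequence itself, plus one
   move cost per argument, instead of a flat INSNS_PER_CALL of 10.  */
static int estimate_call_cost (const enum arg_kind *args, size_t nargs)
{
  int cost = 3;  /* assumed call + return overhead */
  for (size_t i = 0; i < nargs; i++)
    cost += move_cost_of_arg (args[i]);
  return cost;
}

int main (void)
{
  enum arg_kind args[] = { ARG_SCALAR, ARG_FLOAT, ARG_AGGREGATE };
  printf ("call cost = %d\n", estimate_call_cost (args, 3));
  return 0;
}

With this shape, a zero-argument call gets only the base cost and the
estimate grows with the argument list, which is the proportionality
asked about above.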