On Fri, Apr 22, 2011 at 5:35 AM, Jan Hubicka <hubi...@ucw.cz> wrote: > Hi, > this patch implements infrastructure to summarize function body size&time in a > way that is sensitive on function context. At the moment context means > > 1) if function is inline or offline > 2) if some of parameters are known compile time constants. > > but we should handle more later. > > The analysis is implemented by introducing notion of predicates, that are > simple logical formulas in conjunctive-disjunctive form on conditions. > Conditions are simple tests like "function is not inlined" "op0 is not > constant" "op0 > 6". That is one can express things like "this statement will > execute if op1 >6 or op0 is not constant". > > The patch implements simple infrastructure to represent the predicates. There > are hard limits on everything - i.e. there are no more than 32 different > conditions and no more than 8 clauses. This makes it possible to test clauses > by simple logicaloperations on integers and represent predicates at array of 8 > integers thatis very cheap. The implementation details are quite contained so > we might relax the limits, but I don't really see a need for that. > > The main point of designing this was to allow effective way of evaulating > those > predicates for given context, since this happens many times during inlining, > and to not make WPA memory usage grow too much. At the same time I wanted the > infrastructure to be flexible enough to allow adding more tricks in future. > Like we might consider extra inlining points if callee uses the argument to > drive number of iterations of loop or when caller pass a pointer to a static > variable that might be SRAed after inlining etc. etc. > > At the moment only consumer of predicates is size/time vector that is vector > of simple entries consiting of size, time and predicate. Function size/time > is then > computed as sum of all entries whose predicate might be true in given context > + > size/time of all call edges (this is because call edges can disappear at > different > conditions or be turned into constant). > > I plan to add more uses of predicates in near future - i.e. attaching > predicates to edges so we know what calls will be optimized out at WPA time. > Also I plan to use the analysis to drive function clonning (i.e. partial > specialization): when function is called from several places with the same > context and the context makes a difference at expected runtime, clone the > function. > > The analysis part deciding on predicates is currently very simple, kind of > proof > of concept: > > 1) Every BB gets assigned predicate when it is reachable. At the moment it > happens > only if the all predecestors of BB are conditionals that tests function > parameter. Obviously we will need to propagate this info for sane results. > > 2) Every statement gets assigned predicate when it will become constant. > Again > it is very simple, only statements using only function arguments are > considered. > Simple propagation across SSA graph will do better. > > 3) Finally the statement is accounted at a predicate that is conjunction of > both > above. > All call statements are accounted unconditoinally because we don't predicate > edges, yet. > > While computing function sizes is fast, it is not as speedy as original > "time-benefit". Small function inliner is quite insane about querying the > sizes/times again and again while it keeps up to date its queue. For this > purpose I added cache same way as we already cache function growths. Not that > I would not plan to make inliner&badness more sensible here soon. > So far I did not want to touch the actual heuristics part of inliner and hope > to do > it after getting the infrastructure to the point I want it to be for 4.7. > > The patch bootstraps®tests. I tested that compile time implication on > tramp3d is positive (because of caching, without it inliner grows from 1.1% to > 4% of compile time) I also tested SPECs and there are not great changes, that > is not bad result given stupidity of the analysis ;). > > I will look into Mozilla even though I plan to look into solving scability > problems of the inliner as followup instead of snowballing this. > > I plan to work on the patch little further during weekend, in particular make > dumps more readable since they got bit convoluted by random formatting. But i > am sending the patch for comments and I plan to get it finished till next > week. > > Honza > > * gengtype.c (open_base_files): Add ipa-inline.h include. > * ipa-cp.c (ipcp_get_lattice, ipcp_lattice_from_jfunc): Move to > ipa-prop.c > update all uses. > * ipa-prop.c: (ipa_get_lattice, ipa_lattice_from_jfunc): ... here. > * ipa-inline-transform.c (inline_call): Use inline_merge_summary to > merge > summary of inlined function into former caller. > * ipa-inline.c (max_benefit): Remove. > (edge_badness): Compensate for removal of benefits. > (update_caller_keys): Use > reset_node_growth_cache/reset_edge_growth_cache. > (update_callee_keys): Likewise. > (update_all_callee_keys): Likewise. > (inline_small_functions): Do not collect max_benefit; do not > reset stimated_growth; call free_growth_caches and > initialize_growth_caches. > * ipa-inline.h (struct condition, type clause_t, struct predicate, > struct > size_time_entry): New structures. > (INLINE_SIZE_SCALE, INLINE_TIME_SCALE, MAX_CLAUSES): New constants. > (inline_summary): Remove size_inlining_benefit, time_inlining_benefit > and > estimated_growth. > (edge_growth_cache_entry): New structure. > (node_growth_cache, edge_growth_cache): New global vars. > (estimate_growth): Turn into inline. > (inline_merge_summary, do_estimate_edge_growth, do_estimate_edge_time, > initialize_growth_caches, free_growth_caches): Declare. > (estimate_edge_growth): Rewrite. > (estimate_edge_time): Implement as inline cache lookup. > (reset_node_growth_cache, reset_edge_growth_cache): New inline > functions. > (MAX_TIME): Reduce to allow multiplicatoin by INLINE_SIZE_SCALE. > (NUM_CONDITIONS): New constant. > (predicate_conditions): New enum. > (IS_NOT_CONSTANT): New constant. > (edge_removal_hook_holder): New var. > (node_growth_cache, edge_growth_cache): New global vars. > (true_predicate, single_cond_predicate, false_predicate, > not_inlined_predicate, > add_condition, add_clause, and_predicates, or_predicates, > predicates_equal_p, > evaulate_predicate, dump_condition, dump_clause, dump_predicate, > account_size_time, > evaulate_conditions_for_edge): New functions. > (inline_summary_alloc): Move to heap. > (inline_node_removal_hook): Clear condition and entry vectors. > (inline_edge_removal_hook): New function. > (initialize_growth_caches, free_growth_caches): New function. > (dump_inline_summary): Update. > (edge_execution_predicate): New function. > (will_be_nonconstant_predicate): New function. > (estimate_function_body_sizes): Compute BB and constantness predicates. > (compute_inline_parameters): Do not clear estimated_growth. > (estimate_edge_size_and_time): New function. > (estimate_calls_size_and_time): New function. > (estimate_callee_size_and_time): New function. > (remap_predicate): New function. > (inline_merge_summary): New function. > (do_estimate_edge_time): New function based on... > (estimate_edge_time): ... this one. > (do_estimate_edge_growth): New function. > (do_estimate_growth): New function based on.... > (estimate_growth): ... this one. > (inline_analyze_function): Analyze after deciding on jump functions. > (inline_read_section): New function. > (inline_read_summary): Use it. > (inline_write_summary): Write all the new data. > * ipa-prop.c (ipa_get_param_decl_index): Export. > (ipa_lattice_from_jfunc): Move here from ipa-cp.c > * ipa-prop.h (ipa_get_param_decl_index, ipa_lattice_from_jfunc): > Declare. > (ipa_get_lattice): Move hre from ipa-cp.c > * Makefile.in (GTFILES): Add ipa-inline.h and ipa-inline-analysis.c > * params.def (PARAM_EARLY_INLINING_INSNS): Set to 11. >
This caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49179 H.J.