On Fri, 21 Feb 2014, Richard Biener wrote: > > This fixes the slowness of RTL expansion in PR60291 which is caused > by excessive collisions in mem-attr sharing. The issue here is > that sharing attempts happens across functions and we have a _lot_ > of functions in this testcase referencing the same lexically > equivalent memory, for example MEM[(StgWord *)_5 + -64B]. That > means those get the same hash value. But they don't compare > equal because an SSA name _5 from function A is of course not equal > to one from function B. > > The following fixes that by not doing mem-attr sharing across functions > by clearing the mem-attrs hashtable in rest_of_clean_state. > > Another fix may be to do what the comment in iterative_hash_expr > says for SSA names: > > case SSA_NAME: > /* We can just compare by pointer. */ > return iterative_hash_host_wide_int (SSA_NAME_VERSION (t), val); > > (probably blame me for changing that to hashing the SSA version).
It was lxo. > But I'm not sure that doesn't uncover issues with various hashtables and > walking them, generating code dependent on the order. It's IMHO just not > expected that you compare function-local expressions from different > functions. Same speedup result from Index: gcc/tree.c =================================================================== --- gcc/tree.c (revision 207960) +++ gcc/tree.c (working copy) @@ -7428,7 +7428,7 @@ iterative_hash_expr (const_tree t, hashv } case SSA_NAME: /* We can just compare by pointer. */ - return iterative_hash_host_wide_int (SSA_NAME_VERSION (t), val); + return iterative_hash_hashval_t ((uintptr_t)t>>3, val); case PLACEHOLDER_EXPR: /* The node itself doesn't matter. */ return val; and from Index: gcc/tree.c =================================================================== --- gcc/tree.c (revision 207960) +++ gcc/tree.c (working copy) @@ -7428,7 +7428,9 @@ iterative_hash_expr (const_tree t, hashv } case SSA_NAME: /* We can just compare by pointer. */ - return iterative_hash_host_wide_int (SSA_NAME_VERSION (t), val); + return iterative_hash_host_wide_int + (DECL_UID (cfun->decl), + iterative_hash_host_wide_int (SSA_NAME_VERSION (t), val)); case PLACEHOLDER_EXPR: /* The node itself doesn't matter. */ return val; better than hashing pointers but requring cfun != NULL in this function isn't good either. At this point I'm more comfortable with clearing the hashtable than with changing iterative_hash_expr in any way. It's also along the way to get rid of the hash completely. Oh, btw, the speedup is going from expand : 481.98 (94%) usr 1.15 (17%) sys 481.94 (93%) wall 293891 kB (15%) ggc to expand : 2.66 ( 7%) usr 0.13 ( 2%) sys 2.64 ( 6%) wall 262544 kB (13%) ggc at -O0 (less dramatic slowness for -On). > The other thing would be to discard mem-attr sharing alltogether, > but that doesn't seem appropriate at this stage (but it would > also simplify quite some code). With only one function in RTL > at a time that shouldn't be too bad (see several suggestions > along that line, even with statistics). > > Bootstrap / regtest running on x86_64-unknown-linux-gnu, ok for > trunk and 4.8 branch? > > Thanks, > Richard. > > 2014-02-21 Richard Biener <rguent...@suse.de> > > PR middle-end/60291 > * rtl.h (clear_mem_attrs_htab): Declare. > * emit-rtl.c (clear_mem_attrs_htab): New function. > * final.c (rest_of_clean_state): Call clear_mem_attrs_htab > to avoid sharing mem-attrs between functions. > > Index: gcc/rtl.h > =================================================================== > *** gcc/rtl.h (revision 207960) > --- gcc/rtl.h (working copy) > *************** extern int in_sequence_p (void); > *** 2546,2551 **** > --- 2546,2552 ---- > extern void init_emit (void); > extern void init_emit_regs (void); > extern void init_emit_once (void); > + extern void clear_mem_attrs_htab (void); > extern void push_topmost_sequence (void); > extern void pop_topmost_sequence (void); > extern void set_new_first_and_last_insn (rtx, rtx); > Index: gcc/emit-rtl.c > =================================================================== > *** gcc/emit-rtl.c (revision 207960) > --- gcc/emit-rtl.c (working copy) > *************** init_emit_once (void) > *** 5913,5918 **** > --- 5913,5926 ---- > simple_return_rtx = gen_rtx_fmt_ (SIMPLE_RETURN, VOIDmode); > cc0_rtx = gen_rtx_fmt_ (CC0, VOIDmode); > } > + > + /* Clear the mem-attrs sharing hash table. */ > + > + void > + clear_mem_attrs_htab (void) > + { > + htab_empty (mem_attrs_htab); > + } > > /* Produce exact duplicate of insn INSN after AFTER. > Care updating of libcall regions if present. */ > Index: gcc/final.c > =================================================================== > *** gcc/final.c (revision 207960) > --- gcc/final.c (working copy) > *************** rest_of_clean_state (void) > *** 4678,4683 **** > --- 4678,4686 ---- > > init_recog_no_volatile (); > > + /* Reset mem-attrs sharing. */ > + clear_mem_attrs_htab (); > + > /* We're done with this function. Free up memory if we can. */ > free_after_parsing (cfun); > free_after_compilation (cfun); > -- Richard Biener <rguent...@suse.de> SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer