> Jan Hubicka <hubi...@ucw.cz> writes:
> 
> > Hi,
> > this patch started as experiment moving hash_table_mod1 inline because it
> > shows high in streaming profiles and it represents a branch-less code that
> > is good to schedule to surrounding instructions.
> 
> FWIW with a good modern hash function it shouldn't be needed to have
> prime hash table sizes anymore. Without that just a power of two size 
> can be used, so it would be just a mask.
> 
> I considered this last time I messed with hashes, but I didn't actually
> see this function as being hot.

If I measure the wpa-stream thread only, I get about 18% in the hashtable
lookup:
 18.33%  lto1-wpa-stream  lto1               [.] hash_table<hash_map<tree_node*, unsigned int, default_hashmap_traits>::hash_entry, xcallocator, true>::find_with_hash(tree_node* const&, u
  9.53%  lto1-wpa-stream  lto1               [.] DFS::DFS_write_tree(output_block*, DFS::sccs*, tree_node*, bool, bool, bool)
  8.43%  lto1-wpa-stream  lto1               [.] linemap_lookup(line_maps*, unsigned int)
  5.81%  lto1-wpa-stream  lto1               [.] streamer_write_uhwi_stream(lto_output_stream*, unsigned long)
  4.47%  lto1-wpa-stream  lto1               [.] lto_output_tree(output_block*, tree_node*, bool, bool)
  3.11%  lto1-wpa-stream  libc-2.13.so       [.] _int_malloc
  2.85%  lto1-wpa-stream  lto1               [.] DFS::DFS_write_tree_body(output_block*, tree_node*, DFS::sccs*, bool, bool)

I am not quite sure the hash function is the bottleneck, though; I would
rather expect the lookup to be a cache-miss sink.  It may still be interesting
to replace the pre-computed divide with something more reasonable.
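
For illustration only, here is a minimal sketch of the two indexing schemes
being compared: a prime-sized table indexed by modulus, and a power-of-two
table indexed by mask.  The names index_by_prime, mix and index_by_mask are
hypothetical and are not the functions in GCC's hash table; the mixing step
is an assumed stand-in for "a good modern hash function".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration; not GCC's hash-table.h API.

// Prime-sized table: the slot is hash % prime, which costs a divide
// (or a precomputed multiply-and-shift sequence that emulates one).
inline std::size_t index_by_prime(std::uint32_t hash, std::uint32_t prime)
{
  return hash % prime;
}

// Assumed mixing step: multiply by an odd constant, then fold the high
// half of the product into the low half, so the masked index is not
// determined by the low input bits alone.
inline std::uint32_t mix(std::uint32_t hash)
{
  hash *= 0x9E3779B9u;
  return hash ^ (hash >> 16);
}

// Power-of-two table: the slot is a single AND with (size - 1).
inline std::size_t index_by_mask(std::uint32_t hash, std::size_t size)
{
  assert(size != 0 && (size & (size - 1)) == 0);  // size must be 2^k
  return mix(hash) & (size - 1);
}
```

Either way the probe still touches the table entry itself, so if the cost is
dominated by cache misses, the cheaper index computation may not show up much
in the profile.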

Honza
> 
> -Andi
> 
> -- 
> a...@linux.intel.com -- Speaking for myself only
