03.03.2015 10:25, Jan Hubicka writes:
>>> The hash itself is quite simple (borrowing the incremental hash for
>>> constants, adding a very simple match for other stuff + logic to skip
>>> things that may match even if they are syntactically different). The
>>> hash can be strengthened significantly, but I suppose we may do it
>>> based on actual profiling. It is also loosely based on the varasm.c
>>> constant pool hash implementation, which was OK for years.
>>
>> FWIW, I have some older patches to use a stronger/faster spooky hash
>> for the incremental hash. But I never submitted them because I wasn't
>> able to show clear gains for the type hashing (that was the original
>> motivator) since you fixed it.
>>
>> But if you're interested, I can update them to the latest tree.
>> Spooky is very good at dealing with a lot of input data quickly.
>
> You are my expert on modern hashing functions - my knowledge is very
> 1970's, with some random insights from papers I read. I still think it
> may be interesting to get a bit more up to date.
>
When I read your conversation, I recalled that one of my colleagues has
quite a lot of experience in using and evaluating hash functions, so I
asked him for advice. Yura gave me a link to his repo on GitHub; it
contains an implementation of his own hash function (you are free to use
it, it's in the public domain): https://github.com/funny-falcon/funny_hash

Even if you are not interested in it (I understand that you are probably
looking for something well known), this repo is worth looking at, because
it contains a benchmark of several popular hash functions. Notably, these
include "lookup3" (the function used in libiberty and inchash.c is based
on its predecessor, "lookup2"), "MurmurHash" (which is also used in GCC,
in libstdc++-v3/libsupc++/hash_bytes.cc) and the spooky hash mentioned by
Andi. I also ran the benchmark on GCC trunk and a recent version of Clang
(results are attached).

If you are interested in studying hash functions in more detail, SMHasher
https://code.google.com/p/smhasher/wiki/SMHasher is probably a good
framework for their analysis and benchmarking.

> So I think the main problem is not the actual strength of the
> incremental hash, but the fact that we do not hash enough.
>
> The following WIP patch makes the situation a bit better by properly
> hashing some type properties that are known to be stable and also
> moving some checks from 4) to 2).

I have an idea. Unfortunately, I have not studied the layout of GIMPLE
nodes in detail, so it might be useless... But suppose that we can
extract the information which is relevant to ICF in the same way gengtype
extracts pointers for garbage collection. If the fields are located close
enough to each other, we could simply use bitwise "and" with a constant
mask to extract them. That is, we use some tool to get the fields of tree
nodes which will be used when calculating the hash. At build time this
tool will generate a lookup table (TREE_CODE will be an index into it).
Each element of the table can either contain a pair (bitmask, length) -
in this case we copy the tree node into a temporary buffer and do a
bitwise "and" - or, perhaps, a list of (offset, length) pairs
corresponding to some fields of the node (and possibly another bit mask);
in this case we copy the fields. When we have enough data in the buffer,
we perform the hashing (that would probably be faster than element-wise
hashing, especially if we use SIMD).

Depending on how sparse the information inside tree nodes is, we could
fall back to something we already have:

> +  else if (VECTOR_TYPE_P (type))
> +    {
> +      hstate.add_int (VECTOR_TYPE);
> +      hstate.add_int (TYPE_PRECISION (type));
> +      sem_item::add_type (TREE_TYPE (type), hstate);
> +    }
> +  else if (TREE_CODE (type) == ARRAY_TYPE)
> +    {
> +      hstate.add_int (ARRAY_TYPE);
> +      sem_item::add_type (TREE_TYPE (type), hstate);
> +      sem_item::add_expr (TYPE_SIZE (type), hstate);
> +      hstate.add_int (POINTER_TYPE_P (type));

Note that this code can also be generated automatically, based on
gengtype-like annotations or, better, some separate description like
match.pd.

--
Regards,
Mikhail Maltsev
Exact options: "-O3 -march=native -mtune=native"
Run on: "Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz"

We are probably more interested in the "substrings" mode; strictly
speaking, it is not incremental, but it deals with small chunks of data
rather than 300 MiB at once. SipHash is perhaps not appropriate, because
its design relies heavily on a 64-bit architecture (it involves 64-bit
rotations); besides, it is somewhat slower because it is designed to be
resistant against hash-flooding (i.e. collision attacks).

The result is time in seconds (i.e. lower is better).

x86_64
------

By 1-20 byte substrings twice:

function  | gcc -O3 | clang -O3
----------|---------|----------
funny32   |    0.79 |      0.93
funny64   |    0.84 |      0.94
murmur32  |    0.91 |      0.95
murmur128 |    1.33 |      1.24
sip24     |    1.88 |      1.45
sip13     |    1.66 |      1.22
lookup3   |    1.18 |      0.90
spooky    |    1.24 |      0.90

10 times 300M at once:

function  | gcc -O3 | clang -O3
----------|---------|----------
funny32   |    1.20 |      1.15
funny64   |    0.62 |      0.66
murmur32  |    1.18 |      1.14
murmur128 |    0.60 |      0.54
sip24     |    1.26 |      1.57
sip13     |    0.73 |      0.85
lookup3   |    1.35 |      1.06
spooky    |    0.78 |      0.75

x86 (using -m32)
----------------

By 1-20 byte substrings twice:

function  | gcc -O3 | clang -O3
----------|---------|----------
funny32   |    1.00 |      1.10
funny64   |    1.57 |      1.48
murmur32  |    0.96 |      1.06
murmur128 |    2.29 |      2.12
sip24     |    4.79 |      3.36
sip13     |    3.47 |      2.52
lookup3   |    1.41 |      1.02
spooky    |    3.12 |      2.05

10 times 300M at once:

function  | gcc -O3 | clang -O3
----------|---------|----------
funny32   |    1.16 |      1.23
funny64   |    2.11 |      1.25
murmur32  |    1.14 |      1.14
murmur128 |    2.37 |      2.03
sip24     |    7.18 |      6.75
sip13     |    3.78 |      4.09
lookup3   |    1.13 |      1.08
spooky    |    4.44 |      1.92