On Wed, 14 Feb 2024, Richard Biener wrote: > For the testcase in PR113910 we spend a lot of time in PTA comparing > bitmaps for looking up equivalence class members. This points to > the very weak bitmap_hash function which effectively hashes set > and a subset of not set bits. The following improves it by mixing > that weak result with the population count of the bitmap, reducing > the number of collisions significantly. It's still by no means > a good hash function. > > One major problem with it was that it simply truncated the > BITMAP_WORD sized intermediate hash to hashval_t which is > unsigned int, effectively not hashing half of the bits. That solves > most of the slowness. Mixing in the population count improves > compile-time by another 30% though. > > This reduces the compile-time for the testcase from tens of minutes > to 30 seconds and PTA time from 99% to 25%. bitmap_equal_p is gone > from the profile. > > Bootstrap and regtest running on x86_64-unknown-linux-gnu, will > push to trunk and branches.
Ha, and it breaks bootstrap because I misunderstood bitmap_count_bits_in_word (should be word_s_). Fixing this turns out that hashing the population count doesn't help anything so I'm re-testing the following simpler variant, giving up on the cheap last 25% but solving the regression as well. Richard. >From a76aebfdc4b6247db6a061e6395fd088a5694122 Mon Sep 17 00:00:00 2001 From: Richard Biener <rguent...@suse.de> Date: Wed, 14 Feb 2024 12:33:13 +0100 Subject: [PATCH] tree-optimization/113910 - huge compile time during PTA To: gcc-patches@gcc.gnu.org For the testcase in PR113910 we spend a lot of time in PTA comparing bitmaps for looking up equivalence class members. This points to the very weak bitmap_hash function which effectively hashes set and a subset of not set bits. The major problem with it is that it simply truncates the BITMAP_WORD sized intermediate hash to hashval_t which is unsigned int, effectively not hashing half of the bits. This reduces the compile-time for the testcase from tens of minutes to 42 seconds and PTA time from 99% to 46%. PR tree-optimization/113910 * bitmap.cc (bitmap_hash): Mix the full element "hash" to the hashval_t hash. --- gcc/bitmap.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/bitmap.cc b/gcc/bitmap.cc index 6cf326bca5a..459e32c1ad1 100644 --- a/gcc/bitmap.cc +++ b/gcc/bitmap.cc @@ -2706,7 +2706,7 @@ bitmap_hash (const_bitmap head) for (ix = 0; ix != BITMAP_ELEMENT_WORDS; ix++) hash ^= ptr->bits[ix]; } - return (hashval_t)hash; + return iterative_hash (&hash, sizeof (hash), 0); } -- 2.35.3