On Wed, 14 Feb 2024, Richard Biener wrote:

> For the testcase in PR113910 we spend a lot of time in PTA comparing
> bitmaps for looking up equivalence class members.  This points to
> the very weak bitmap_hash function which effectively hashes set
> and a subset of not set bits.  The following improves it by mixing
> that weak result with the population count of the bitmap, reducing
> the number of collisions significantly.  It's still by no means
> a good hash function.
> 
> One major problem with it was that it simply truncated the
> BITMAP_WORD sized intermediate hash to hashval_t which is
> unsigned int, effectively not hashing half of the bits.  That solves
> most of the slowness.  Mixing in the population count improves
> compile-time by another 30% though.
> 
> This reduces the compile-time for the testcase from tens of minutes
> to 30 seconds and PTA time from 99% to 25%.  bitmap_equal_p is gone
> from the profile.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu, will
> push to trunk and branches.

Ha, and it breaks bootstrap because I misunderstood
bitmap_count_bits_in_word (should be word_s_).  Fixing this turns
out that hashing the population count doesn't help anything
so I'm re-testing the following simpler variant, giving up on the
cheap last 25% but solving the regression as well.

Richard.

>From a76aebfdc4b6247db6a061e6395fd088a5694122 Mon Sep 17 00:00:00 2001
From: Richard Biener <rguent...@suse.de>
Date: Wed, 14 Feb 2024 12:33:13 +0100
Subject: [PATCH] tree-optimization/113910 - huge compile time during PTA
To: gcc-patches@gcc.gnu.org

For the testcase in PR113910 we spend a lot of time in PTA comparing
bitmaps for looking up equivalence class members.  This points to
the very weak bitmap_hash function which effectively hashes set
and a subset of not set bits.

The major problem with it is that it simply truncates the
BITMAP_WORD sized intermediate hash to hashval_t which is
unsigned int, effectively not hashing half of the bits.

This reduces the compile-time for the testcase from tens of minutes
to 42 seconds and PTA time from 99% to 46%.

        PR tree-optimization/113910
        * bitmap.cc (bitmap_hash): Mix the full element "hash" to
        the hashval_t hash.
---
 gcc/bitmap.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/bitmap.cc b/gcc/bitmap.cc
index 6cf326bca5a..459e32c1ad1 100644
--- a/gcc/bitmap.cc
+++ b/gcc/bitmap.cc
@@ -2706,7 +2706,7 @@ bitmap_hash (const_bitmap head)
       for (ix = 0; ix != BITMAP_ELEMENT_WORDS; ix++)
        hash ^= ptr->bits[ix];
     }
-  return (hashval_t)hash;
+  return iterative_hash (&hash, sizeof (hash), 0);
 }
 
 
-- 
2.35.3

Reply via email to