On Sun, Dec 10, 2023 at 2:18 AM Jeff Davis <pg...@j-davis.com> wrote:
>
> On Sat, 2023-12-09 at 18:52 +0700, John Naylor wrote:
> > > I tested using the new hash function APIs for my search path cache,
> > > and
> > > there's a significant speedup for cases not benefiting from
> > > a86c61c9ee.
> > > It's enough that we almost don't need a86c61c9ee. So a definite +1
> > > to
> > > the new APIs.

Interesting, thanks for testing! SearchPathCache is a better starting
point than dynahash for removing strlen calls anyway -- it's more
localized, uses simplehash, and we can test it with at-hand tests.

> > Do you have a new test?
>
> Still using the same basic test here:
>
> https://www.postgresql.org/message-id/04c8592dbd694e4114a3ed87139a7a04e4363030.camel%40j-davis.com
>
> What I did is:
>
>    a. add your v5 patches
>    b. disable optimization in a86c61c9ee
>    c. add attached patch to use new hash APIs

Of course, the CF bot doesn't know this, so it crashed and burned
before I had a chance to check how v6 did. I'm attaching v7 which just
improves commit messages for reviewing, and gets rid of git whitespace
errors.

My local branch of master is still at 457428d9e99b6 from Dec 4. That's
before both a86c61c9ee (Optimize SearchPathCache by saving the last
entry.) and 867dd2dc87 (Cache opaque handle for GUC option to avoid
repeasted lookups.). My plan was to keep testing against Dec. 4, but
like you I'm not sure if there is a better GUC test to do now.

> I got a slowdown between (a) and (b), and then (c) closed the gap about
> halfway. It started to get close to test noise at that point -- I could
> get some better numbers out of it if it's helpful.

We can also try (c) with using the "chunked" interface. Also note your
patch may no longer apply on top of v6 or v7.

> Also, what I'm doing in the attached path is using part of the key as
> the seed. Is that a good idea or should the seed be zero or come from
> somewhere else?

I think whether to use part of the key as a seed is a judgment call.
See this part in resowner.c:

/*
 * Most resource kinds store a pointer in 'value', and pointers are unique
 * all on their own.  But some resources store plain integers (Files and
 * Buffers as of this writing), so we want to incorporate the 'kind' in
 * the hash too, otherwise those resources will collide a lot.  But
 * because there are only a few resource kinds like that - and only a few
 * resource kinds to begin with - we don't need to work too hard to mix
 * 'kind' into the hash.  Just add it with hash_combine(), it perturbs the
 * result enough for our purposes.
 */
#if SIZEOF_DATUM == 8
    return hash_combine64(murmurhash64((uint64) value), (uint64) kind);

Given these comments, I'd feel free to use the "kind" as the seed if I
were writing this with fasthash.

The caller-provided seed can probably be zero unless we have a good
reason to, like the above, but with the incremental interface there is
an issue:

hs->hash = seed ^ (len * UINT64CONST(0x880355f21e6d1965));

Passing length 0 will wipe out the internal seed here, and that can't be good.

1) We could by convention pass "1" as the length for strings. That
could be a macro like

#define FH_UNKNOWN_LENGTH 1

...and maybe Assert(len != 0 || seed != 0)

Or 2) we could detect zero and force it to be one, but it's best if
the compiler can always constant-fold that branch. Future work may
invalidate that assumption.
From b1c63cddb988b6086486cfb739c07d33d872cafe Mon Sep 17 00:00:00 2001
From: John Naylor <john.nay...@postgresql.org>
Date: Sat, 9 Dec 2023 17:58:59 +0700
Subject: [PATCH v7 11/13] Add abiliy to case-fold with chunk (8-byte)
 interface

---
 src/include/common/hashfn_unstable.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/include/common/hashfn_unstable.h b/src/include/common/hashfn_unstable.h
index a798c42ba7..6157124cb4 100644
--- a/src/include/common/hashfn_unstable.h
+++ b/src/include/common/hashfn_unstable.h
@@ -49,6 +49,7 @@ typedef struct fasthash_state
 {
 	uint64 accum;
 #define FH_SIZEOF_ACCUM sizeof(uint64)
+	uint64 overlay; /* used for case-folding */
 	int8	accum_len;
 	uint64 hash;
 } fasthash_state;
@@ -74,7 +75,7 @@ fasthash_combine(fasthash_state* hs)
 }
 
 static inline void
-fasthash_init(fasthash_state *hs, int len, uint64 seed)
+fasthash_init(fasthash_state *hs, int len, uint64 seed, bool case_fold)
 {
 	memset(hs, 0, sizeof(fasthash_state));
 
@@ -82,6 +83,9 @@ fasthash_init(fasthash_state *hs, int len, uint64 seed)
 	// handle some other way -- maybe we can accum the length in
 	// the state and fold it in during the finalizer (cf. xxHash3)
 	hs->hash = seed ^ (len * UINT64CONST(0x880355f21e6d1965));
+
+	if (case_fold)
+		hs->overlay = UINT64CONST(0x2020202020202020);
 }
 
 static inline void
@@ -123,6 +127,8 @@ fasthash_accum(fasthash_state *hs, const unsigned char *k, int len)
 		case 0:
 			return;
 	}
+		/* case-fold, if set */
+		hs->accum |= hs->overlay;
 
 	fasthash_combine(hs);
 }
@@ -154,7 +160,7 @@ fasthash64(const unsigned char * k, int len, uint64 seed)
 {
 	fasthash_state hs;
 
-	fasthash_init(&hs, len, seed);
+	fasthash_init(&hs, len, seed, false);
 
 	while (len >= FH_SIZEOF_ACCUM)
 	{
-- 
2.43.0

From ca3030cc73310c728ef973b31f90ba99f0ae1745 Mon Sep 17 00:00:00 2001
From: John Naylor <john.nay...@postgresql.org>
Date: Sat, 9 Dec 2023 18:05:27 +0700
Subject: [PATCH v7 12/13] Use chunk interface for guc_name_hash

---
 src/backend/utils/misc/guc.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 46591172fd..507d35718f 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1351,16 +1351,19 @@ guc_name_hash(const char *name)
 {
 	fasthash_state hs;
 
-	fasthash_init(&hs, 0, 0);
+	fasthash_init(&hs, 0, 0, true);
 
 	while (*name)
 	{
-		unsigned char		ch = *name++;
+		int chunk_len;
 
-		/* quick and dirty casefolding suitable for hashing */
-		ch |= 0x20;
+		for (chunk_len = 0;
+			chunk_len < FH_SIZEOF_ACCUM && name[chunk_len] != '\0';
+			chunk_len++)
+			;
 
-		fasthash_accum_byte(&hs, ch);
+		fasthash_accum(&hs, (const unsigned char *) name, chunk_len);
+		name += chunk_len;
 	}
 	return fasthash_final32(&hs);
 }
-- 
2.43.0

From b7ee120e05d48ebacb078e00ffefc5f98052d214 Mon Sep 17 00:00:00 2001
From: John Naylor <john.nay...@postgresql.org>
Date: Sat, 9 Dec 2023 18:06:05 +0700
Subject: [PATCH v7 13/13] PoC: Get rid of strlen() calls when using
 HASH_STRINGS

Add cstring_hash, which uses the chunked incremental interface
of fasthash. That way, we don't need know the length of the
key upfront.

Open questions:
- Is performance better?
- Since we have the total length when we reach the end, should
  well try to use it in the finalization stage?
- Do we need to keep string_hash around?
---
 src/backend/utils/hash/dynahash.c | 49 +++++++++++++++++++++++++++----
 1 file changed, 44 insertions(+), 5 deletions(-)

diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index 012d4a0b1f..ba74126e73 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -98,6 +98,7 @@
 
 #include "access/xact.h"
 #include "common/hashfn.h"
+#include "common/hashfn_unstable.h"
 #include "port/pg_bitutils.h"
 #include "storage/shmem.h"
 #include "storage/spin.h"
@@ -307,6 +308,44 @@ string_compare(const char *key1, const char *key2, Size keysize)
 	return strncmp(key1, key2, keysize - 1);
 }
 
+/*
+ * cstring_hash: hash function for keys that are NUL-terminated strings.
+ *
+ * NOTE: this is the default hash function if none is specified.
+ */
+static uint32
+cstring_hash(const void *key, Size keysize)
+{
+	fasthash_state hs;
+	int s_len = 0;
+	const unsigned char *k = (const unsigned char *) key;
+
+	/*
+	 * If the string exceeds keysize-1 bytes, we want to hash only that many,
+	 * because when it is copied into the hash table it will be truncated at
+	 * that length.
+	 */
+
+	fasthash_init(&hs, 0, 0, false);
+
+	while (*k && s_len < keysize)
+	{
+		int chunk_len;
+
+		for (chunk_len = 0;
+			chunk_len < FH_SIZEOF_ACCUM && k[chunk_len] != '\0' && s_len < keysize;
+			chunk_len++)
+		{
+			s_len++;
+		}
+
+		fasthash_accum(&hs, k, chunk_len);
+		k += chunk_len;
+	}
+
+	return fasthash_final32(&hs);
+}
+
 
 /************************** CREATE ROUTINES **********************/
 
@@ -419,7 +458,7 @@ hash_create(const char *tabname, long nelem, const HASHCTL *info, int flags)
 	{
 		/*
 		 * string_hash used to be considered the default hash method, and in a
-		 * non-assert build it effectively still is.  But we now consider it
+		 * non-assert build it effectively still was until version 17.  Since version 14 we consider it
 		 * an assertion error to not say HASH_STRINGS explicitly.  To help
 		 * catch mistaken usage of HASH_STRINGS, we also insist on a
 		 * reasonably long string length: if the keysize is only 4 or 8 bytes,
@@ -428,12 +467,12 @@ hash_create(const char *tabname, long nelem, const HASHCTL *info, int flags)
 		Assert(flags & HASH_STRINGS);
 		Assert(info->keysize > 8);
 
-		hashp->hash = string_hash;
+		hashp->hash = cstring_hash;
 	}
 
 	/*
 	 * If you don't specify a match function, it defaults to string_compare if
-	 * you used string_hash, and to memcmp otherwise.
+	 * you used cstring_hash, and to memcmp otherwise.
 	 *
 	 * Note: explicitly specifying string_hash is deprecated, because this
 	 * might not work for callers in loadable modules on some platforms due to
@@ -442,7 +481,7 @@ hash_create(const char *tabname, long nelem, const HASHCTL *info, int flags)
 	 */
 	if (flags & HASH_COMPARE)
 		hashp->match = info->match;
-	else if (hashp->hash == string_hash)
+	else if (hashp->hash == cstring_hash)
 		hashp->match = (HashCompareFunc) string_compare;
 	else
 		hashp->match = memcmp;
@@ -452,7 +491,7 @@ hash_create(const char *tabname, long nelem, const HASHCTL *info, int flags)
 	 */
 	if (flags & HASH_KEYCOPY)
 		hashp->keycopy = info->keycopy;
-	else if (hashp->hash == string_hash)
+	else if (hashp->hash == cstring_hash)
 	{
 		/*
 		 * The signature of keycopy is meant for memcpy(), which returns
-- 
2.43.0

From f16778d29742d9c13bdc0e22c8c25e3b23e025ab Mon Sep 17 00:00:00 2001
From: Jeff Davis <j...@j-davis.com>
Date: Wed, 2 Aug 2023 23:04:06 -0700
Subject: [PATCH v7 09/13] Convert GUC hashtable to use simplehash.

---
 src/backend/utils/misc/guc.c | 147 ++++++++++++++---------------------
 1 file changed, 59 insertions(+), 88 deletions(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 1484e11a42..1d5d144c41 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -203,9 +203,10 @@ typedef struct
 {
 	const char *gucname;		/* hash key */
 	struct config_generic *gucvar;	/* -> GUC's defining structure */
-} GUCHashEntry;
 
-static HTAB *guc_hashtab;		/* entries are GUCHashEntrys */
+	/* needed by simplehash */
+	char status;
+} GUCHashEntry;
 
 /*
  * In addition to the hash table, variables having certain properties are
@@ -228,8 +229,7 @@ static int	GUCNestLevel = 0;	/* 1 when in main transaction */
 
 
 static int	guc_var_compare(const void *a, const void *b);
-static uint32 guc_name_hash(const void *key, Size keysize);
-static int	guc_name_match(const void *key1, const void *key2, Size keysize);
+static inline uint32 guc_name_hash(const char *name);
 static void InitializeGUCOptionsFromEnvironment(void);
 static void InitializeOneGUCOption(struct config_generic *gconf);
 static void RemoveGUCFromLists(struct config_generic *gconf);
@@ -266,6 +266,18 @@ static bool call_string_check_hook(struct config_string *conf, char **newval,
 static bool call_enum_check_hook(struct config_enum *conf, int *newval,
 								 void **extra, GucSource source, int elevel);
 
+#define SH_PREFIX		GUCHash
+#define SH_ELEMENT_TYPE	GUCHashEntry
+#define SH_KEY_TYPE		const char *
+#define	SH_KEY			gucname
+#define SH_HASH_KEY(tb, key)   	guc_name_hash(key)
+#define SH_EQUAL(tb, a, b)		(guc_name_compare(a, b) == 0)
+#define	SH_SCOPE		static inline
+#define SH_DECLARE
+#define SH_DEFINE
+#include "lib/simplehash.h"
+
+static GUCHash_hash *guc_hashtab = NULL;	/* entries are GUCHashEntrys */
 
 /*
  * This function handles both actual config file (re)loads and execution of
@@ -283,7 +295,7 @@ ProcessConfigFileInternal(GucContext context, bool applySettings, int elevel)
 	ConfigVariable *item,
 			   *head,
 			   *tail;
-	HASH_SEQ_STATUS status;
+	GUCHash_iterator iter;
 	GUCHashEntry *hentry;
 
 	/* Parse the main config file into a list of option names and values */
@@ -359,8 +371,8 @@ ProcessConfigFileInternal(GucContext context, bool applySettings, int elevel)
 	 * need this so that we can tell below which ones have been removed from
 	 * the file since we last processed it.
 	 */
-	hash_seq_init(&status, guc_hashtab);
-	while ((hentry = (GUCHashEntry *) hash_seq_search(&status)) != NULL)
+	GUCHash_start_iterate(guc_hashtab, &iter);
+	while ((hentry = GUCHash_iterate(guc_hashtab, &iter)) != NULL)
 	{
 		struct config_generic *gconf = hentry->gucvar;
 
@@ -446,8 +458,8 @@ ProcessConfigFileInternal(GucContext context, bool applySettings, int elevel)
 	 * boot-time defaults.  If such a variable can't be changed after startup,
 	 * report that and continue.
 	 */
-	hash_seq_init(&status, guc_hashtab);
-	while ((hentry = (GUCHashEntry *) hash_seq_search(&status)) != NULL)
+	GUCHash_start_iterate(guc_hashtab, &iter);
+	while ((hentry = GUCHash_iterate(guc_hashtab, &iter)) != NULL)
 	{
 		struct config_generic *gconf = hentry->gucvar;
 		GucStack   *stack;
@@ -868,17 +880,17 @@ struct config_generic **
 get_guc_variables(int *num_vars)
 {
 	struct config_generic **result;
-	HASH_SEQ_STATUS status;
+	GUCHash_iterator iter;
 	GUCHashEntry *hentry;
 	int			i;
 
-	*num_vars = hash_get_num_entries(guc_hashtab);
+	*num_vars = guc_hashtab->members;
 	result = palloc(sizeof(struct config_generic *) * *num_vars);
 
 	/* Extract pointers from the hash table */
 	i = 0;
-	hash_seq_init(&status, guc_hashtab);
-	while ((hentry = (GUCHashEntry *) hash_seq_search(&status)) != NULL)
+	GUCHash_start_iterate(guc_hashtab, &iter);
+	while ((hentry = GUCHash_iterate(guc_hashtab, &iter)) != NULL)
 		result[i++] = hentry->gucvar;
 	Assert(i == *num_vars);
 
@@ -900,7 +912,6 @@ build_guc_variables(void)
 {
 	int			size_vars;
 	int			num_vars = 0;
-	HASHCTL		hash_ctl;
 	GUCHashEntry *hentry;
 	bool		found;
 	int			i;
@@ -962,24 +973,14 @@ build_guc_variables(void)
 	 */
 	size_vars = num_vars + num_vars / 4;
 
-	hash_ctl.keysize = sizeof(char *);
-	hash_ctl.entrysize = sizeof(GUCHashEntry);
-	hash_ctl.hash = guc_name_hash;
-	hash_ctl.match = guc_name_match;
-	hash_ctl.hcxt = GUCMemoryContext;
-	guc_hashtab = hash_create("GUC hash table",
-							  size_vars,
-							  &hash_ctl,
-							  HASH_ELEM | HASH_FUNCTION | HASH_COMPARE | HASH_CONTEXT);
+	guc_hashtab = GUCHash_create(GUCMemoryContext, size_vars, NULL);
 
 	for (i = 0; ConfigureNamesBool[i].gen.name; i++)
 	{
 		struct config_generic *gucvar = &ConfigureNamesBool[i].gen;
 
-		hentry = (GUCHashEntry *) hash_search(guc_hashtab,
-											  &gucvar->name,
-											  HASH_ENTER,
-											  &found);
+		hentry = GUCHash_insert(guc_hashtab, gucvar->name, &found);
+
 		Assert(!found);
 		hentry->gucvar = gucvar;
 	}
@@ -988,10 +989,8 @@ build_guc_variables(void)
 	{
 		struct config_generic *gucvar = &ConfigureNamesInt[i].gen;
 
-		hentry = (GUCHashEntry *) hash_search(guc_hashtab,
-											  &gucvar->name,
-											  HASH_ENTER,
-											  &found);
+		hentry = GUCHash_insert(guc_hashtab, gucvar->name, &found);
+
 		Assert(!found);
 		hentry->gucvar = gucvar;
 	}
@@ -1000,10 +999,8 @@ build_guc_variables(void)
 	{
 		struct config_generic *gucvar = &ConfigureNamesReal[i].gen;
 
-		hentry = (GUCHashEntry *) hash_search(guc_hashtab,
-											  &gucvar->name,
-											  HASH_ENTER,
-											  &found);
+		hentry = GUCHash_insert(guc_hashtab, gucvar->name, &found);
+
 		Assert(!found);
 		hentry->gucvar = gucvar;
 	}
@@ -1012,10 +1009,8 @@ build_guc_variables(void)
 	{
 		struct config_generic *gucvar = &ConfigureNamesString[i].gen;
 
-		hentry = (GUCHashEntry *) hash_search(guc_hashtab,
-											  &gucvar->name,
-											  HASH_ENTER,
-											  &found);
+		hentry = GUCHash_insert(guc_hashtab, gucvar->name, &found);
+
 		Assert(!found);
 		hentry->gucvar = gucvar;
 	}
@@ -1024,15 +1019,13 @@ build_guc_variables(void)
 	{
 		struct config_generic *gucvar = &ConfigureNamesEnum[i].gen;
 
-		hentry = (GUCHashEntry *) hash_search(guc_hashtab,
-											  &gucvar->name,
-											  HASH_ENTER,
-											  &found);
+		hentry = GUCHash_insert(guc_hashtab, gucvar->name, &found);
+
 		Assert(!found);
 		hentry->gucvar = gucvar;
 	}
 
-	Assert(num_vars == hash_get_num_entries(guc_hashtab));
+	Assert(num_vars == guc_hashtab->members);
 }
 
 /*
@@ -1045,10 +1038,8 @@ add_guc_variable(struct config_generic *var, int elevel)
 	GUCHashEntry *hentry;
 	bool		found;
 
-	hentry = (GUCHashEntry *) hash_search(guc_hashtab,
-										  &var->name,
-										  HASH_ENTER_NULL,
-										  &found);
+	hentry = GUCHash_insert(guc_hashtab, var->name, &found);
+
 	if (unlikely(hentry == NULL))
 	{
 		ereport(elevel,
@@ -1237,10 +1228,8 @@ find_option(const char *name, bool create_placeholders, bool skip_errors,
 	Assert(name);
 
 	/* Look it up using the hash table. */
-	hentry = (GUCHashEntry *) hash_search(guc_hashtab,
-										  &name,
-										  HASH_FIND,
-										  NULL);
+	hentry = GUCHash_lookup(guc_hashtab, name);
+
 	if (hentry)
 		return hentry->gucvar;
 
@@ -1326,39 +1315,25 @@ guc_name_compare(const char *namea, const char *nameb)
 /*
  * Hash function that's compatible with guc_name_compare
  */
-static uint32
-guc_name_hash(const void *key, Size keysize)
+static inline uint32
+guc_name_hash(const char *name)
 {
-	const char *name = *(const char *const *) key;
 	fasthash_state hs;
 
 	fasthash_init(&hs, 0, 0);
 
 	while (*name)
 	{
-		char		ch = *name++;
+		unsigned char		ch = *name++;
 
 		/* quick and dirty casefolding suitable for hashing */
 		ch |= 0x20;
 
-		fasthash_accum_byte(&hs, (unsigned char) ch);
+		fasthash_accum_byte(&hs, ch);
 	}
 	return fasthash_final32(&hs);
 }
 
-/*
- * Dynahash match function to use in guc_hashtab
- */
-static int
-guc_name_match(const void *key1, const void *key2, Size keysize)
-{
-	const char *name1 = *(const char *const *) key1;
-	const char *name2 = *(const char *const *) key2;
-
-	return guc_name_compare(name1, name2);
-}
-
-
 /*
  * Convert a GUC name to the form that should be used in pg_parameter_acl.
  *
@@ -1528,7 +1503,7 @@ check_GUC_init(struct config_generic *gconf)
 void
 InitializeGUCOptions(void)
 {
-	HASH_SEQ_STATUS status;
+	GUCHash_iterator iter;
 	GUCHashEntry *hentry;
 
 	/*
@@ -1546,8 +1521,8 @@ InitializeGUCOptions(void)
 	 * Load all variables with their compiled-in defaults, and initialize
 	 * status fields as needed.
 	 */
-	hash_seq_init(&status, guc_hashtab);
-	while ((hentry = (GUCHashEntry *) hash_seq_search(&status)) != NULL)
+	GUCHash_start_iterate(guc_hashtab, &iter);
+	while ((hentry = GUCHash_iterate(guc_hashtab, &iter)) != NULL)
 	{
 		/* Check mapping between initial and default value */
 		Assert(check_GUC_init(hentry->gucvar));
@@ -2532,7 +2507,7 @@ AtEOXact_GUC(bool isCommit, int nestLevel)
 void
 BeginReportingGUCOptions(void)
 {
-	HASH_SEQ_STATUS status;
+	GUCHash_iterator iter;
 	GUCHashEntry *hentry;
 
 	/*
@@ -2556,8 +2531,8 @@ BeginReportingGUCOptions(void)
 						PGC_INTERNAL, PGC_S_OVERRIDE);
 
 	/* Transmit initial values of interesting variables */
-	hash_seq_init(&status, guc_hashtab);
-	while ((hentry = (GUCHashEntry *) hash_seq_search(&status)) != NULL)
+	GUCHash_start_iterate(guc_hashtab, &iter);
+	while ((hentry = GUCHash_iterate(guc_hashtab, &iter)) != NULL)
 	{
 		struct config_generic *conf = hentry->gucvar;
 
@@ -4813,10 +4788,8 @@ define_custom_variable(struct config_generic *variable)
 	/*
 	 * See if there's a placeholder by the same name.
 	 */
-	hentry = (GUCHashEntry *) hash_search(guc_hashtab,
-										  &name,
-										  HASH_FIND,
-										  NULL);
+	hentry = GUCHash_lookup(guc_hashtab, name);
+
 	if (hentry == NULL)
 	{
 		/*
@@ -5152,7 +5125,7 @@ void
 MarkGUCPrefixReserved(const char *className)
 {
 	int			classLen = strlen(className);
-	HASH_SEQ_STATUS status;
+	GUCHash_iterator iter;
 	GUCHashEntry *hentry;
 	MemoryContext oldcontext;
 
@@ -5162,8 +5135,8 @@ MarkGUCPrefixReserved(const char *className)
 	 * don't bother trying to free associated memory, since this shouldn't
 	 * happen often.)
 	 */
-	hash_seq_init(&status, guc_hashtab);
-	while ((hentry = (GUCHashEntry *) hash_seq_search(&status)) != NULL)
+	GUCHash_start_iterate(guc_hashtab, &iter);
+	while ((hentry = GUCHash_iterate(guc_hashtab, &iter)) != NULL)
 	{
 		struct config_generic *var = hentry->gucvar;
 
@@ -5178,10 +5151,8 @@ MarkGUCPrefixReserved(const char *className)
 					 errdetail("\"%s\" is now a reserved prefix.",
 							   className)));
 			/* Remove it from the hash table */
-			hash_search(guc_hashtab,
-						&var->name,
-						HASH_REMOVE,
-						NULL);
+			GUCHash_delete(guc_hashtab, var->name);
+
 			/* Remove it from any lists it's in, too */
 			RemoveGUCFromLists(var);
 		}
@@ -5212,7 +5183,7 @@ get_explain_guc_options(int *num)
 	 * While only a fraction of all the GUC variables are marked GUC_EXPLAIN,
 	 * it doesn't seem worth dynamically resizing this array.
 	 */
-	result = palloc(sizeof(struct config_generic *) * hash_get_num_entries(guc_hashtab));
+	result = palloc(sizeof(struct config_generic *) * guc_hashtab->members);
 
 	/* We need only consider GUCs with source not PGC_S_DEFAULT */
 	dlist_foreach(iter, &guc_nondef_list)
-- 
2.43.0

From f6a5341b06e5a17f7391b55cb453ca752f077e07 Mon Sep 17 00:00:00 2001
From: John Naylor <john.nay...@postgresql.org>
Date: Sat, 9 Dec 2023 17:54:13 +0700
Subject: [PATCH v7 10/13] Use inline equality function for guc hash

---
 src/backend/utils/misc/guc.c | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 1d5d144c41..46591172fd 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -229,6 +229,7 @@ static int	GUCNestLevel = 0;	/* 1 when in main transaction */
 
 
 static int	guc_var_compare(const void *a, const void *b);
+static inline bool guc_name_eq(const char *namea, const char *nameb);
 static inline uint32 guc_name_hash(const char *name);
 static void InitializeGUCOptionsFromEnvironment(void);
 static void InitializeOneGUCOption(struct config_generic *gconf);
@@ -271,7 +272,7 @@ static bool call_enum_check_hook(struct config_enum *conf, int *newval,
 #define SH_KEY_TYPE		const char *
 #define	SH_KEY			gucname
 #define SH_HASH_KEY(tb, key)   	guc_name_hash(key)
-#define SH_EQUAL(tb, a, b)		(guc_name_compare(a, b) == 0)
+#define SH_EQUAL(tb, a, b)		(guc_name_eq(a, b))
 #define	SH_SCOPE		static inline
 #define SH_DECLARE
 #define SH_DEFINE
@@ -1312,6 +1313,36 @@ guc_name_compare(const char *namea, const char *nameb)
 	return 0;
 }
 
+static inline bool
+guc_name_eq(const char *namea, const char *nameb)
+{
+	char		cha;
+	char		chb;
+
+	while (*namea && *nameb)
+	{
+		cha = *namea++;
+		chb = *nameb++;
+
+		if (cha != chb)
+		{
+			/* Casefold lazily since we expect lower case */
+			if (cha >= 'A' && cha <= 'Z')
+				cha += 'a' - 'A';
+			if (chb >= 'A' && chb <= 'Z')
+				chb += 'a' - 'A';
+
+			if (cha != chb)
+				return false;
+		}
+	}
+
+	if (*namea == *nameb)
+		return true;
+	else
+		return false;
+}
+
 /*
  * Hash function that's compatible with guc_name_compare
  */
-- 
2.43.0

From b413ee15d17f933039ad4eff1bd4aedea0f37d20 Mon Sep 17 00:00:00 2001
From: John Naylor <john.nay...@postgresql.org>
Date: Sat, 9 Dec 2023 16:32:05 +0700
Subject: [PATCH v7 06/13] Add bytewise interface

This is useful for hashing values with unknown length,
like NUL-terminated strings. It should be faster than calling
strlen() first and passing the length, which most hash
functions require.

Note: This method can't give the same answer as
regular fasthash, so it will need to be evaluated. It's possible
we need to mix in the length at the finalization step (at which
time can know the length), in order to safeguard against
collisions.
---
 src/include/common/hashfn_unstable.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/src/include/common/hashfn_unstable.h b/src/include/common/hashfn_unstable.h
index 7ed1e5335a..a798c42ba7 100644
--- a/src/include/common/hashfn_unstable.h
+++ b/src/include/common/hashfn_unstable.h
@@ -49,6 +49,7 @@ typedef struct fasthash_state
 {
 	uint64 accum;
 #define FH_SIZEOF_ACCUM sizeof(uint64)
+	int8	accum_len;
 	uint64 hash;
 } fasthash_state;
 
@@ -69,6 +70,7 @@ fasthash_combine(fasthash_state* hs)
 
 	/* reset hash state for next input */
 	hs->accum = 0;
+	hs->accum_len = 0;
 }
 
 static inline void
@@ -82,6 +84,18 @@ fasthash_init(fasthash_state *hs, int len, uint64 seed)
 	hs->hash = seed ^ (len * UINT64CONST(0x880355f21e6d1965));
 }
 
+static inline void
+fasthash_accum_byte(fasthash_state *hs, const unsigned char ch)
+{
+	hs->accum <<= BITS_PER_BYTE;
+	hs->accum |= ch;
+	hs->accum_len++;
+
+	// wip: is there a better way to get sizeof struct member?
+	if (hs->accum_len == sizeof(((fasthash_state *) 0)->accum))
+		fasthash_combine(hs);
+}
+
 static inline void
 fasthash_accum(fasthash_state *hs, const unsigned char *k, int len)
 {
@@ -117,6 +131,11 @@ fasthash_accum(fasthash_state *hs, const unsigned char *k, int len)
 static inline uint64
 fasthash_final64(fasthash_state *hs)
 {
+	// check for remaining bytes to combine into hash
+	// should only be used by the bytewise interface
+	if (hs->accum_len > 0)
+		fasthash_combine(hs);
+
 	return fasthash_mix(hs->hash);
 }
 
-- 
2.43.0

From 60dee25ccb0c904904bb057aa7c7adc8998a9cf0 Mon Sep 17 00:00:00 2001
From: John Naylor <john.nay...@postgresql.org>
Date: Sat, 9 Dec 2023 16:24:56 +0700
Subject: [PATCH v7 05/13] Demonstrate fasthash32 with pgstat_hash_hash_key

Currently this calls the 32-bit Murmur finalizer on the
three elements, joined with hash_combine().
This is simpler, has better tested behavior, and probably shaves
a few cycles and some binary space.

Note: We may not need the full 32-bit finalizer reducing step.
It would be slightly cheaper to just use fasthash64 and
then take the lower 32 bits.
---
 src/include/utils/pgstat_internal.h | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 60fbf9394b..df310efee1 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -14,7 +14,7 @@
 #define PGSTAT_INTERNAL_H
 
 
-#include "common/hashfn.h"
+#include "common/hashfn_unstable.h"
 #include "lib/dshash.h"
 #include "lib/ilist.h"
 #include "pgstat.h"
@@ -777,15 +777,10 @@ static inline uint32
 pgstat_hash_hash_key(const void *d, size_t size, void *arg)
 {
 	const PgStat_HashKey *key = (PgStat_HashKey *) d;
-	uint32		hash;
 
 	Assert(size == sizeof(PgStat_HashKey) && arg == NULL);
 
-	hash = murmurhash32(key->kind);
-	hash = hash_combine(hash, murmurhash32(key->dboid));
-	hash = hash_combine(hash, murmurhash32(key->objoid));
-
-	return hash;
+	return fasthash32((const unsigned char *) key, size, 0);
 }
 
 /*
-- 
2.43.0

From 9be7bb1212e390b356e3310c3e29e6eda7f56e59 Mon Sep 17 00:00:00 2001
From: John Naylor <john.nay...@postgresql.org>
Date: Sat, 9 Dec 2023 17:30:39 +0700
Subject: [PATCH v7 08/13] Casefold lazily in guc_name_compare

We can assume that almost all GUC names are lower case,
so compare each byte with strict equality first. If that
fails retry the comparison with down-casing.

TODO: This concept only been tested inlined into simplehash,
so need to see if this makes any difference on its own.
---
 src/backend/utils/misc/guc.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 053be81d14..1484e11a42 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1305,12 +1305,16 @@ guc_name_compare(const char *namea, const char *nameb)
 		char		cha = *namea++;
 		char		chb = *nameb++;
 
-		if (cha >= 'A' && cha <= 'Z')
-			cha += 'a' - 'A';
-		if (chb >= 'A' && chb <= 'Z')
-			chb += 'a' - 'A';
 		if (cha != chb)
-			return cha - chb;
+		{
+			/* Casefold lazily since we expect lower case */
+			if (cha >= 'A' && cha <= 'Z')
+				cha += 'a' - 'A';
+			if (chb >= 'A' && chb <= 'Z')
+				chb += 'a' - 'A';
+			if (cha != chb)
+				return cha - chb;
+		}
 	}
 	if (*namea)
 		return 1;				/* a is longer */
-- 
2.43.0

From d3a9fdf6cbc1702e40952136b84be08c0089b69d Mon Sep 17 00:00:00 2001
From: John Naylor <john.nay...@postgresql.org>
Date: Sat, 9 Dec 2023 17:19:10 +0700
Subject: [PATCH v7 07/13] Use bytewise fasthash in guc_name_hash

The previous hash function did not have a final bit mixing
step, so only depended on a few characters of the input.
The intermediate mixing steps could result in collisions
even with a finalizer. Also, it's not necessary to branch
depending on the case of the input -- we can borrow a trick
from the keyword hashes and unconditionally bitwise-OR with
0x20. That will do the correct case folding for letters
while leaving most other characters legal in GUC names
unchanged.
---
 src/backend/utils/misc/guc.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e76c083003..053be81d14 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -33,6 +33,7 @@
 #include "catalog/objectaccess.h"
 #include "catalog/pg_authid.h"
 #include "catalog/pg_parameter_acl.h"
+#include "common/hashfn_unstable.h"
 #include "guc_internal.h"
 #include "libpq/pqformat.h"
 #include "parser/scansup.h"
@@ -1324,22 +1325,21 @@ guc_name_compare(const char *namea, const char *nameb)
 static uint32
 guc_name_hash(const void *key, Size keysize)
 {
-	uint32		result = 0;
 	const char *name = *(const char *const *) key;
+	fasthash_state hs;
+
+	fasthash_init(&hs, 0, 0);
 
 	while (*name)
 	{
 		char		ch = *name++;
 
-		/* Case-fold in the same way as guc_name_compare */
-		if (ch >= 'A' && ch <= 'Z')
-			ch += 'a' - 'A';
+		/* quick and dirty casefolding suitable for hashing */
+		ch |= 0x20;
 
-		/* Merge into hash ... not very bright, but it needn't be */
-		result = pg_rotate_left32(result, 5);
-		result ^= (uint32) ch;
+		fasthash_accum_byte(&hs, (unsigned char) ch);
 	}
-	return result;
+	return fasthash_final32(&hs);
 }
 
 /*
-- 
2.43.0

From d5dfa560eac6ff66b580580383fbcb636b2e15bc Mon Sep 17 00:00:00 2001
From: John Naylor <john.nay...@postgresql.org>
Date: Sat, 9 Dec 2023 16:22:35 +0700
Subject: [PATCH v7 04/13] Assert that the incremental fasthash variants give
 the same answer as the original

XXX: Remove this and also the *_orig functions before commit
---
 src/common/hashfn.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/common/hashfn.c b/src/common/hashfn.c
index 2490607eea..37c8d307c7 100644
--- a/src/common/hashfn.c
+++ b/src/common/hashfn.c
@@ -26,6 +26,7 @@
 #include "common/hashfn.h"
 #include "port/pg_bitutils.h"
 
+#include "common/hashfn_unstable.h"
 
 /*
  * This hash function was written by Bob Jenkins
@@ -150,6 +151,9 @@ hash_bytes(const unsigned char *k, int keylen)
 				c,
 				len;
 
+	// XXX not for commit
+	Assert(fasthash64_orig((void *) k, keylen, 0) == fasthash64(k, keylen, 0));
+
 	/* Set up the internal state */
 	len = keylen;
 	a = b = c = 0x9e3779b9 + len + 3923095;
-- 
2.43.0

From 7416a25418fbc576fc549b9a31cc44d219501c92 Mon Sep 17 00:00:00 2001
From: John Naylor <john.nay...@postgresql.org>
Date: Sat, 9 Dec 2023 16:14:04 +0700
Subject: [PATCH v7 03/13] Add UINT64CONST (not sure when we actually need
 that)

fasthash*_orig left alone for now.
---
 src/include/common/hashfn_unstable.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/include/common/hashfn_unstable.h b/src/include/common/hashfn_unstable.h
index fbae7a5522..7ed1e5335a 100644
--- a/src/include/common/hashfn_unstable.h
+++ b/src/include/common/hashfn_unstable.h
@@ -56,7 +56,7 @@ static inline uint64
 fasthash_mix(uint64 h)
 {
 	h ^= h >> 23;
-	h *= 0x2127599bf4325c37ULL;
+	h *= UINT64CONST(0x2127599bf4325c37);
 	h ^= h >> 47;
 	return h;
 }
@@ -65,7 +65,7 @@ static inline void
 fasthash_combine(fasthash_state* hs)
 {
 	hs->hash ^= fasthash_mix(hs->accum);
-	hs->hash *= 0x880355f21e6d1965ULL;
+	hs->hash *= UINT64CONST(0x880355f21e6d1965);
 
 	/* reset hash state for next input */
 	hs->accum = 0;
@@ -79,7 +79,7 @@ fasthash_init(fasthash_state *hs, int len, uint64 seed)
 	// since we don't know the length for a nul-terminated string
 	// handle some other way -- maybe we can accum the length in
 	// the state and fold it in during the finalizer (cf. xxHash3)
-	hs->hash = seed ^ (len * 0x880355f21e6d1965ULL);
+	hs->hash = seed ^ (len * UINT64CONST(0x880355f21e6d1965));
 }
 
 static inline void
-- 
2.43.0

From 14af760608974246c9c4985e1ec333d2ac4b8820 Mon Sep 17 00:00:00 2001
From: John Naylor <john.nay...@postgresql.org>
Date: Sun, 10 Dec 2023 12:11:37 +0700
Subject: [PATCH v7 02/13] Rewrite fasthash functions using a homegrown
 incremental interface

The incremental interface will be useful for cases we don't know
the length up front, such as NUL-terminated strings. First, we
need to validate that this interface can give the same answer
as the original functions when we do know the length. A future

commit will add a temporary assert for testing in CI.
---
 src/include/common/hashfn_unstable.h | 161 +++++++++++++++++++++++++--
 1 file changed, 153 insertions(+), 8 deletions(-)

diff --git a/src/include/common/hashfn_unstable.h b/src/include/common/hashfn_unstable.h
index a5bf965fa2..fbae7a5522 100644
--- a/src/include/common/hashfn_unstable.h
+++ b/src/include/common/hashfn_unstable.h
@@ -1,3 +1,25 @@
+/*
+Building blocks for creating fast inlineable hash functions. The
+unstable designation is in contrast to hashfn.h, which cannot break
+compatibility because hashes can be writen to disk and so must have
+the same hashes between versions.
+
+ *
+ * Portions Copyright (c) 2018-2023, PostgreSQL Global Development Group
+ *
+ * src/include/common/hashfn_unstable.c
+ */
+
+#ifndef HASHFN_UNSTABLE_H
+#define HASHFN_UNSTABLE_H
+
+/*
+ * fasthash is a modification of code taken from
+ * https://code.google.com/archive/p/fast-hash/source/default/source
+ * under the terms of the MIT licencse. The original copyright
+ * notice follows:
+ */
+
 /* The MIT License
 
    Copyright (C) 2012 Zilong Tan (eric.zl...@gmail.com)
@@ -23,16 +45,130 @@
    SOFTWARE.
 */
 
-#include "fasthash.h"
+typedef struct fasthash_state
+{
+	uint64 accum;
+#define FH_SIZEOF_ACCUM sizeof(uint64)
+	uint64 hash;
+} fasthash_state;
+
+static inline uint64
+fasthash_mix(uint64 h)
+{
+	h ^= h >> 23;
+	h *= 0x2127599bf4325c37ULL;
+	h ^= h >> 47;
+	return h;
+}
+
+static inline void
+fasthash_combine(fasthash_state* hs)
+{
+	hs->hash ^= fasthash_mix(hs->accum);
+	hs->hash *= 0x880355f21e6d1965ULL;
+
+	/* reset hash state for next input */
+	hs->accum = 0;
+}
+
+static inline void
+fasthash_init(fasthash_state *hs, int len, uint64 seed)
+{
+	memset(hs, 0, sizeof(fasthash_state));
+
+	// since we don't know the length for a nul-terminated string
+	// handle some other way -- maybe we can accum the length in
+	// the state and fold it in during the finalizer (cf. xxHash3)
+	hs->hash = seed ^ (len * 0x880355f21e6d1965ULL);
+}
+
+static inline void
+fasthash_accum(fasthash_state *hs, const unsigned char *k, int len)
+{
+	Assert(hs->accum == 0);
+	Assert(len <= FH_SIZEOF_ACCUM);
+
+	switch (len)
+	{
+		case 8: memcpy(&hs->accum, k, 8);
+			break;
+		case 7: hs->accum |= (uint64) k[6] << 48;
+			/* FALLTHROUGH */
+		case 6: hs->accum |= (uint64) k[5] << 40;
+			/* FALLTHROUGH */
+		case 5: hs->accum |= (uint64) k[4] << 32;
+			/* FALLTHROUGH */
+		case 4: hs->accum |= (uint64) k[3] << 24;
+			/* FALLTHROUGH */
+		case 3: hs->accum |= (uint64) k[2] << 16;
+			/* FALLTHROUGH */
+		case 2: hs->accum |= (uint64) k[1] << 8;
+			/* FALLTHROUGH */
+		case 1: hs->accum |= (uint64) k[0];
+			break;
+		case 0:
+			return;
+	}
+
+	fasthash_combine(hs);
+}
+
+
+static inline uint64
+fasthash_final64(fasthash_state *hs)
+{
+	return fasthash_mix(hs->hash);
+}
+
+static inline uint32
+fasthash_final32(fasthash_state *hs)
+{
+	// the following trick converts the 64-bit hashcode to Fermat
+	// residue, which shall retain information from both the higher
+	// and lower parts of hashcode.
+	uint64 h = fasthash_final64(hs);
+	return h - (h >> 32);
+}
+
+static inline uint64
+fasthash64(const unsigned char * k, int len, uint64 seed)
+{
+	fasthash_state hs;
+
+	fasthash_init(&hs, len, seed);
+
+	while (len >= FH_SIZEOF_ACCUM)
+	{
+		fasthash_accum(&hs, k, FH_SIZEOF_ACCUM);
+		k += FH_SIZEOF_ACCUM;
+		len -= FH_SIZEOF_ACCUM;
+	}
+
+	fasthash_accum(&hs, k, len);
+	return fasthash_final64(&hs);
+}
+
+static inline uint64
+fasthash32(const unsigned char * k, int len, uint64 seed)
+{
+	uint64 h = fasthash64(k, len, seed);
+	return h - (h >> 32);
+}
+
+
+// XXX NOT FOR COMMIT
 
 // Compression function for Merkle-Damgard construction.
 // This function is generated using the framework provided.
-#define mix(h) ({					\
-			(h) ^= (h) >> 23;		\
-			(h) *= 0x2127599bf4325c37ULL;	\
-			(h) ^= (h) >> 47; })
+static inline uint64_t mix(uint64_t h) {
+	h ^= h >> 23;
+	h *= 0x2127599bf4325c37ULL;
+	h ^= h >> 47;
+	return h;
+}
 
-uint64_t fasthash64(const void *buf, size_t len, uint64_t seed)
+static inline
+uint64_t fasthash64_orig(const void *buf, size_t len, uint64_t seed)
 {
 	const uint64_t    m = 0x880355f21e6d1965ULL;
 	const uint64_t *pos = (const uint64_t *)buf;
@@ -52,11 +188,17 @@ uint64_t fasthash64(const void *buf, size_t len, uint64_t seed)
 
 	switch (len & 7) {
 	case 7: v ^= (uint64_t)pos2[6] << 48;
+		/* FALLTHROUGH */
 	case 6: v ^= (uint64_t)pos2[5] << 40;
+		/* FALLTHROUGH */
 	case 5: v ^= (uint64_t)pos2[4] << 32;
+		/* FALLTHROUGH */
 	case 4: v ^= (uint64_t)pos2[3] << 24;
+		/* FALLTHROUGH */
 	case 3: v ^= (uint64_t)pos2[2] << 16;
+		/* FALLTHROUGH */
 	case 2: v ^= (uint64_t)pos2[1] << 8;
+		/* FALLTHROUGH */
 	case 1: v ^= (uint64_t)pos2[0];
 		h ^= mix(v);
 		h *= m;
@@ -65,11 +207,14 @@ uint64_t fasthash64(const void *buf, size_t len, uint64_t seed)
 	return mix(h);
 }
 
-uint32_t fasthash32(const void *buf, size_t len, uint32_t seed)
+static inline
+uint32_t fasthash32_orig(const void *buf, size_t len, uint32_t seed)
 {
 	// the following trick converts the 64-bit hashcode to Fermat
 	// residue, which shall retain information from both the higher
 	// and lower parts of hashcode.
-	uint64_t h = fasthash64(buf, len, seed);
+	uint64_t h = fasthash64_orig(buf, len, seed);
 	return h - (h >> 32);
 }
+
+#endif							/* HASHFN_UNSTABLE_H */
-- 
2.43.0

From b5d621087bc96c48fc6bb9f7d86fc000e689059a Mon Sep 17 00:00:00 2001
From: John Naylor <john.nay...@postgresql.org>
Date: Mon, 27 Nov 2023 17:03:38 +0700
Subject: [PATCH v7 01/13] Vendor fasthash

MIT licensed
---
 src/include/common/hashfn_unstable.h | 75 ++++++++++++++++++++++++++++
 1 file changed, 75 insertions(+)
 create mode 100644 src/include/common/hashfn_unstable.h

diff --git a/src/include/common/hashfn_unstable.h b/src/include/common/hashfn_unstable.h
new file mode 100644
index 0000000000..a5bf965fa2
--- /dev/null
+++ b/src/include/common/hashfn_unstable.h
@@ -0,0 +1,75 @@
+/* The MIT License
+
+   Copyright (C) 2012 Zilong Tan (eric.zl...@gmail.com)
+
+   Permission is hereby granted, free of charge, to any person
+   obtaining a copy of this software and associated documentation
+   files (the "Software"), to deal in the Software without
+   restriction, including without limitation the rights to use, copy,
+   modify, merge, publish, distribute, sublicense, and/or sell copies
+   of the Software, and to permit persons to whom the Software is
+   furnished to do so, subject to the following conditions:
+
+   The above copyright notice and this permission notice shall be
+   included in all copies or substantial portions of the Software.
+
+   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+   NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+   BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+   ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+   CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+   SOFTWARE.
+*/
+
+#include "fasthash.h"
+
+// Compression function for Merkle-Damgard construction.
+// This function is generated using the framework provided.
+#define mix(h) ({					\
+			(h) ^= (h) >> 23;		\
+			(h) *= 0x2127599bf4325c37ULL;	\
+			(h) ^= (h) >> 47; })
+
+uint64_t fasthash64(const void *buf, size_t len, uint64_t seed)
+{
+	const uint64_t    m = 0x880355f21e6d1965ULL;
+	const uint64_t *pos = (const uint64_t *)buf;
+	const uint64_t *end = pos + (len / 8);
+	const unsigned char *pos2;
+	uint64_t h = seed ^ (len * m);
+	uint64_t v;
+
+	while (pos != end) {
+		v  = *pos++;
+		h ^= mix(v);
+		h *= m;
+	}
+
+	pos2 = (const unsigned char*)pos;
+	v = 0;
+
+	switch (len & 7) {
+	case 7: v ^= (uint64_t)pos2[6] << 48;
+	case 6: v ^= (uint64_t)pos2[5] << 40;
+	case 5: v ^= (uint64_t)pos2[4] << 32;
+	case 4: v ^= (uint64_t)pos2[3] << 24;
+	case 3: v ^= (uint64_t)pos2[2] << 16;
+	case 2: v ^= (uint64_t)pos2[1] << 8;
+	case 1: v ^= (uint64_t)pos2[0];
+		h ^= mix(v);
+		h *= m;
+	}
+
+	return mix(h);
+}
+
+uint32_t fasthash32(const void *buf, size_t len, uint32_t seed)
+{
+	// the following trick converts the 64-bit hashcode to Fermat
+	// residue, which shall retain information from both the higher
+	// and lower parts of hashcode.
+	uint64_t h = fasthash64(buf, len, seed);
+	return h - (h >> 32);
+}
-- 
2.43.0

Reply via email to