Hi,

On Tue, Nov 05, 2024 at 12:24:48AM +1300, David Rowley wrote:
> On Sat, 2 Nov 2024 at 01:50, Bertrand Drouvot
> <bertranddrouvot...@gmail.com> wrote:
> >
> > On Fri, Nov 01, 2024 at 09:47:05PM +1300, David Rowley wrote:
> > > I've attached what I thought a more optimal version might look like in
> > > case anyone thinks making it better is a good idea.
> > >
> >
> > Thanks for the proposal!
> >
> > I like the idea, I think that's worth to add a few comments, something like:
> 
> I'm happy if you want to pick this up and continue working on it.

Sure, please find attached v1, the changes are:

- switch from "const char" to "const unsigned char" (could have been done in the
current version of pg_memory_is_all_zeros() though)
- added some comments
- adding an Assert for ptr != 0
- re-introduce the function call in PageIsVerifiedExtended()
- propose a commit message

> One thing you might want to consider is if it's worth having a macro
> to help decide if you want to inline the function for smaller sizes
> and not inline for larger sizes. A macro that checks something like:
> if (__builtin_constant_p(len) && len <= 32) could call an inline
> version of the function for smaller sizes and do a function call for
> lager sizes. Compilers seem to have heuristics that result in
> behaviour like this for library functions such as memset and memcpy.
> Maybe some experimentation with godbolt.org would yield the crossover
> point in bytes where compilers switch tactics.

Yeah, I did some experiments in [0], what we can observe:

- with gcc 14.2 (X86-64), the assembly code contains calls to memset and memcpy
as soon as the BLCKSZ > 8192 (means, IIUC, those are inlined if <= 8192)
- with clang 19.1.0 the switch is done once BLCKSZ > 256 

Given that our sizes of interest here are 8192, 88, 32 and 112 (as mentioned
up-thead) and if we follow the same logic as the compilers above do for memset
and memcpy, then we may want to keep the function inlined (gcc case) or do
a switch at 256 (clang). The switch at 256 would concern only the page case.

More use cases could come in the future with different sizes but it looks
hazardous to commit on a len that would trigger the inline/non-inline switch
on our side.

So, I'm tempted to just keep the function inlined, thoughts?

> I just feel that at the rate we receive small code change suggestions
> on the mailing lists, it's just a matter of time before someone will
> come along and suggest we use pg_memory_is_all_zeros() in
> PageIsVerifiedExtended() again. If we don't optimize that function,
> then there's a chance a committer might re-commit what's just been
> reverted.

Agree.

[0]: https://godbolt.org/z/hrYda53Tj

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
>From f23a5d8aca945bd5762f34095eb0de5ecbe23dfa Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <bertranddrouvot...@gmail.com>
Date: Fri, 1 Nov 2024 11:46:29 +0000
Subject: [PATCH v1] Optimize pg_memory_is_all_zeros()

pg_memory_is_all_zeros() is currently doing byte per byte comparison and so
could lead to performance regression or penalties when multi bytes comparison
could be done instead.

Let's provide an optimized version that divides the checks into three phases for
efficiency:

- Initial alignment (byte per byte comparison)
- Multiple bytes comparison at once
- Remaining bytes (byte per byte comparison)

In passing, let's make use of this new version in PageIsVerifiedExtended() now
that multi bytes comparison is handled.
---
 src/backend/storage/page/bufpage.c | 13 +----------
 src/include/utils/memutils.h       | 36 +++++++++++++++++++++++++++---
 2 files changed, 34 insertions(+), 15 deletions(-)
  19.9% src/backend/storage/page/
  80.0% src/include/utils/

diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index be6f1f62d2..aa264f61b9 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -89,10 +89,8 @@ PageIsVerifiedExtended(Page page, BlockNumber blkno, int flags)
 {
 	PageHeader	p = (PageHeader) page;
 	size_t	   *pagebytes;
-	int			i;
 	bool		checksum_failure = false;
 	bool		header_sane = false;
-	bool		all_zeroes = false;
 	uint16		checksum = 0;
 
 	/*
@@ -126,18 +124,9 @@ PageIsVerifiedExtended(Page page, BlockNumber blkno, int flags)
 	}
 
 	/* Check all-zeroes case */
-	all_zeroes = true;
 	pagebytes = (size_t *) page;
-	for (i = 0; i < (BLCKSZ / sizeof(size_t)); i++)
-	{
-		if (pagebytes[i] != 0)
-		{
-			all_zeroes = false;
-			break;
-		}
-	}
 
-	if (all_zeroes)
+	if (pg_memory_is_all_zeros(pagebytes, BLCKSZ))
 		return true;
 
 	/*
diff --git a/src/include/utils/memutils.h b/src/include/utils/memutils.h
index 3590c8bad9..5b992a8bbe 100644
--- a/src/include/utils/memutils.h
+++ b/src/include/utils/memutils.h
@@ -192,17 +192,47 @@ extern MemoryContext BumpContextCreate(MemoryContext parent,
 /*
  * Test if a memory region starting at "ptr" and of size "len" is full of
  * zeroes.
+ *
+ * The test is divided into three phases for efficiency:
+ *  - Initial alignment (byte per byte comparison)
+ *  - Multiple bytes comparison at once
+ *  - Remaining bytes (byte per byte comparison)
+ *
+ * Caller must ensure that "ptr" is non-null.
  */
 static inline bool
 pg_memory_is_all_zeros(const void *ptr, size_t len)
 {
-	const char *p = (const char *) ptr;
+	const unsigned char *p = (const unsigned char *) ptr;
+	const unsigned char *end = &p[len];
+	const unsigned char *aligned_end = (const unsigned char *) ((uintptr_t) end & (~(sizeof(size_t) - 1)));
 
-	for (size_t i = 0; i < len; i++)
+	Assert(ptr != NULL);
+
+	/* Compare bytes, byte by byte, until the pointer "p" is aligned */
+	while (((uintptr_t) p & (sizeof(size_t) - 1)) != 0)
 	{
-		if (p[i] != 0)
+		if (p == end)
+			return true;
+
+		if (*p++ != 0)
 			return false;
 	}
+
+	/* Multiple bytes comparison(s) at once */
+	for (; p < aligned_end; p += sizeof(size_t))
+	{
+		if (*(size_t *) p != 0)
+			return false;
+	}
+
+	/* Compare remaining bytes, byte by byte */
+	while (p < end)
+	{
+		if (*p++ != 0)
+			return false;
+	}
+
 	return true;
 }
 
-- 
2.34.1

Reply via email to