Hi,

The heap AM's pages only ever grow their pd_linp array; the array never
shrinks, even when trailing entries are marked unused. This means that
up to 14% of a page (291 unused line pointers, or 1164 bytes on a
default 8 kB page) could be unusable for data storage, which I think is
a shame. With a patch in the works that allows the line pointer array
to grow up to one third of the size of the page [0], it would be quite
catastrophic for the available data space on old-and-often-used pages
if this space could never be reused for data.
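
For reference, a rough sketch of where the 291 and 14% figures come
from. The constants are assumptions for a default build (8 kB blocks,
8-byte MAXALIGN), and the sketch_* names are made up for illustration;
they are not PostgreSQL symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes for a default build; other builds will differ. */
enum
{
	SKETCH_BLCKSZ = 8192,			/* default block size */
	SKETCH_PAGE_HEADER_SIZE = 24,	/* SizeOfPageHeaderData */
	SKETCH_TUPLE_HEADER_SIZE = 24,	/* MAXALIGN(SizeofHeapTupleHeader) */
	SKETCH_ITEMID_SIZE = 4			/* sizeof(ItemIdData) */
};

/*
 * Upper bound on line pointers per heap page, following the shape of the
 * MaxHeapTuplesPerPage definition: every tuple costs at least one aligned
 * tuple header plus one line pointer.
 */
static int
sketch_max_heap_tuples_per_page(void)
{
	return (SKETCH_BLCKSZ - SKETCH_PAGE_HEADER_SIZE) /
		(SKETCH_TUPLE_HEADER_SIZE + SKETCH_ITEMID_SIZE);
}

/* Bytes taken up by a line pointer array with nitems entries. */
static size_t
sketch_linp_bytes(int nitems)
{
	assert(nitems >= 0);
	return (size_t) nitems * SKETCH_ITEMID_SIZE;
}
```

Under these assumptions, sketch_max_heap_tuples_per_page() gives 291,
and sketch_linp_bytes(291) gives 1164 bytes, a bit over 14% of 8192.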

The shrinking of the line pointer array is already common practice in
indexes (in which all LP_UNUSED items are removed), but this specific
implementation cannot be used for heap pages due to ItemId
invalidation. One available implementation, however, is that we
truncate the end of this array, as mentioned in [1]. There was a
warning at the top of PageRepairFragmentation about not removing
unused line pointers, but I believe that was about not removing
_intermediate_ unused line pointers (which would imply moving in-use
line pointers); as far as I know there is nothing that relies on only
growing page->pd_lower, and nothing keeping us from shrinking it
whilst holding a pin on the page.

Please find attached a fairly trivial patch which detects the last
used entry on a page and truncates the pd_linp array just past that
entry, effectively freeing 4 bytes per line pointer truncated away (up
to 1164 bytes for pages on which all MaxHeapTuplesPerPage line pointers
are LP_UNUSED).
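
To illustrate the idea outside of bufpage.c, here is a minimal sketch on
a toy model of a page, in which the line pointer array is reduced to an
array of "used" flags. The toy_* names and the two size constants are
assumptions for illustration only; the real logic is in the attached
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_HEADER_SIZE	24	/* assumed SizeOfPageHeaderData */
#define TOY_ITEMID_SIZE			4	/* assumed sizeof(ItemIdData) */

/*
 * Return the 1-based offset of the last used line pointer, or 0 if every
 * line pointer on the page is unused.
 */
static int
toy_last_used(const bool *used, int nline)
{
	int			lastUsed = 0;

	assert(nline >= 0);
	for (int i = 1; i <= nline; i++)
	{
		if (used[i - 1])
			lastUsed = i;
	}
	return lastUsed;
}

/*
 * Compute the pd_lower value that keeps every line pointer up to and
 * including the last used one, and drops the unused tail. Entries before
 * the last used one keep their offsets, so no in-use ItemId moves.
 */
static size_t
toy_truncated_pd_lower(const bool *used, int nline)
{
	return TOY_PAGE_HEADER_SIZE +
		(size_t) TOY_ITEMID_SIZE * toy_last_used(used, nline);
}
```

For a page with flags {used, unused, used, unused, unused}, the last
used offset is 3, so the two trailing entries (8 bytes) are given back
while the unused entry at offset 2 stays in place.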

One unexpected benefit from this patch is that the PD_HAS_FREE_LINES
hint bit can now be left unset more often, increasing the chances of
not having to scan the whole array to find an empty slot.
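
A small sketch of that effect, again on a toy array of "used" flags
rather than a real page. The hint_* function names are hypothetical;
the second one mirrors the patch's adjustment of nunused by the number
of truncated trailing entries.

```c
#include <assert.h>
#include <stdbool.h>

/* Without truncation: the hint is set if any line pointer is unused. */
static bool
hint_without_truncation(const bool *used, int nline)
{
	assert(nline >= 0);
	for (int i = 0; i < nline; i++)
	{
		if (!used[i])
			return true;
	}
	return false;
}

/*
 * With truncation: trailing unused entries are cut off and no longer
 * counted, so the hint is set only if an unused entry remains before the
 * last used one.
 */
static bool
hint_with_truncation(const bool *used, int nline)
{
	int			nunused = 0;
	int			lastUsed = 0;

	assert(nline >= 0);
	for (int i = 1; i <= nline; i++)
	{
		if (used[i - 1])
			lastUsed = i;
		else
			nunused++;
	}
	nunused -= nline - lastUsed;	/* drop the truncated tail */
	return nunused > 0;
}
```

With flags {used, used, unused, unused}, the hint would be set without
truncation but stays unset with it, so a later insert can append
directly instead of scanning for a free slot.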

Note: This does _not_ move valid ItemIds; it only removes invalid
(unused) ItemIds from the end of the space reserved for ItemIds on a
page, keeping valid line pointers intact.


Enjoy,

Matthias van de Meent

[0] https://www.postgresql.org/message-id/flat/cad21aod0ske11fmw4jd4renawbmcw1wasvnwpjvw3tvqpoq...@mail.gmail.com
[1] https://www.postgresql.org/message-id/CAEze2Wjf42g8Ho%3DYsC_OvyNE_ziM0ZkXg6wd9u5KVc2nTbbYXw%40mail.gmail.com
From f9be3079cf0ff26b8ef603a9b0c8bc5d27561499 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekew...@gmail.com>
Date: Tue, 9 Mar 2021 14:42:52 +0100
Subject: [PATCH v1] Truncate a page's line pointer array when it has trailing
 unused ItemIds.

This will allow reuse of what is effectively free space for data as well as
new line pointers, instead of keeping it reserved for line pointers only.

An additional benefit is that the HasFreeLinePointers hint-bit optimization
now doesn't hint for free line pointers at the end of the array, slightly
increasing the specificity of where the free lines are; and saving us from
needing to search to the end of the array if all other entries are already
filled.
---
 src/backend/storage/page/bufpage.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 9ac556b4ae..10d8f26ad0 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -672,7 +672,11 @@ compactify_tuples(itemIdCompact itemidbase, int nitems, Page page, bool presorte
  * PageRepairFragmentation
  *
  * Frees fragmented space on a page.
- * It doesn't remove unused line pointers! Please don't change this.
+ * It doesn't remove intermediate unused line pointers (that would mean
+ * moving ItemIds, and that would imply invalidating indexed values), but it
+ * does truncate the page->pd_linp array after the last used line pointer, so
+ * that this space may also be reused for data, instead of only for line
+ * pointers.
  *
  * This routine is usable for heap pages only, but see PageIndexMultiDelete.
  *
@@ -691,6 +695,7 @@ PageRepairFragmentation(Page page)
 	int			nline,
 				nstorage,
 				nunused;
+	OffsetNumber lastUsed = InvalidOffsetNumber;
 	int			i;
 	Size		totallen;
 	bool		presorted = true;	/* For now */
@@ -724,6 +729,7 @@ PageRepairFragmentation(Page page)
 		lp = PageGetItemId(page, i);
 		if (ItemIdIsUsed(lp))
 		{
+			lastUsed = i;
 			if (ItemIdHasStorage(lp))
 			{
 				itemidptr->offsetindex = i - 1;
@@ -771,6 +777,11 @@ PageRepairFragmentation(Page page)
 		compactify_tuples(itemidbase, nstorage, page, presorted);
 	}
 
+	if (lastUsed != nline) {
+		((PageHeader) page)->pd_lower = SizeOfPageHeaderData + (sizeof(ItemIdData) * lastUsed);
+		nunused = nunused - (nline - lastUsed);
+	}
+
 	/* Set hint bit for PageAddItem */
 	if (nunused > 0)
 		PageSetHasFreeLinePointers(page);
-- 
2.20.1
