Re: [PoC] Improve dead tuple storage for lazy vacuum

John Naylor Thu, 06 Oct 2022 22:29:38 -0700

On Fri, Sep 16, 2022 at 1:01 PM Masahiko Sawada <[email protected]>
wrote:
> In addition to two patches, I've attached the third patch. It's not
> part of radix tree implementation but introduces a contrib module
> bench_radix_tree, a tool for radix tree performance benchmarking. It
> measures loading and lookup performance of both the radix tree and a
> flat array.


Hi Masahiko, I've been using these benchmarks, along with my own
variations, to try various things that I've mentioned. I'm long overdue for
an update, but the picture is not yet complete.

For now, I have two questions that I can't figure out on my own:

1. There seems to be some non-obvious limit on the number of keys that are
loaded (or at least what the numbers report). This is independent of the
number of tids per block. Example below:

john=# select * from bench_shuffle_search(0, 8*1000*1000);
NOTICE:  num_keys = 8000000, height = 3, n4 = 0, n16 = 1, n32 = 0, n128 =
250000, n256 = 981
  nkeys  | rt_mem_allocated | array_mem_allocated | rt_load_ms |
array_load_ms | rt_search_ms | array_serach_ms
---------+------------------+---------------------+------------+---------------+--------------+-----------------
 8000000 |        268435456 |            48000000 |        661 |
 29 |          276 |             389

john=# select * from bench_shuffle_search(0, 9*1000*1000);
NOTICE:  num_keys = 8388608, height = 3, n4 = 0, n16 = 1, n32 = 0, n128 =
262144, n256 = 1028
  nkeys  | rt_mem_allocated | array_mem_allocated | rt_load_ms |
array_load_ms | rt_search_ms | array_serach_ms
---------+------------------+---------------------+------------+---------------+--------------+-----------------
 8388608 |        276824064 |            54000000 |        718 |
 33 |          311 |             446

The array is the right size, but nkeys hasn't kept pace. Can you reproduce
this? Attached is the patch I'm using to show the stats when running the
test. (Side note: The numbers look unfavorable for radix tree because I'm
using 1 tid per block here.)

2. I found that bench_shuffle_search() is much *faster* for traditional
binary search on an array than bench_seq_search(). I've found this to be
true in every case. This seems counterintuitive to me -- any idea why this
is? Example:

john=# select * from bench_seq_search(0, 1000000);
NOTICE:  num_keys = 1000000, height = 2, n4 = 0, n16 = 0, n32 = 31251, n128
= 1, n256 = 122
  nkeys  | rt_mem_allocated | array_mem_allocated | rt_load_ms |
array_load_ms | rt_search_ms | array_serach_ms
---------+------------------+---------------------+------------+---------------+--------------+-----------------
 1000000 |         10199040 |           180000000 |        168 |
106 |          827 |            3348

john=# select * from bench_shuffle_search(0, 1000000);
NOTICE:  num_keys = 1000000, height = 2, n4 = 0, n16 = 0, n32 = 31251, n128
= 1, n256 = 122
  nkeys  | rt_mem_allocated | array_mem_allocated | rt_load_ms |
array_load_ms | rt_search_ms | array_serach_ms
---------+------------------+---------------------+------------+---------------+--------------+-----------------
 1000000 |         10199040 |           180000000 |        171 |
107 |          827 |            1400

--
John Naylor
EDB: http://www.enterprisedb.com

From 43a50a385930ee340d0a3b003910c704a0ff342c Mon Sep 17 00:00:00 2001
From: John Naylor <[email protected]>
Date: Thu, 6 Oct 2022 09:07:41 +0700
Subject: [PATCH v65 1/5] Turn on per-node counts in benchmark

Also add gitigore, fix whitespace, and change to NOTICE
---
 contrib/bench_radix_tree/.gitignore         | 3 +++
 contrib/bench_radix_tree/bench_radix_tree.c | 5 +++++
 src/backend/lib/radixtree.c                 | 2 +-
 src/include/lib/radixtree.h                 | 2 +-
 4 files changed, 10 insertions(+), 2 deletions(-)
 create mode 100644 contrib/bench_radix_tree/.gitignore

diff --git a/contrib/bench_radix_tree/.gitignore b/contrib/bench_radix_tree/.gitignore
new file mode 100644
index 0000000000..8830f5460d
--- /dev/null
+++ b/contrib/bench_radix_tree/.gitignore
@@ -0,0 +1,3 @@
+*data
+log/*
+results/*
diff --git a/contrib/bench_radix_tree/bench_radix_tree.c b/contrib/bench_radix_tree/bench_radix_tree.c
index 5806ef7519..36c5218ae7 100644
--- a/contrib/bench_radix_tree/bench_radix_tree.c
+++ b/contrib/bench_radix_tree/bench_radix_tree.c
@@ -13,6 +13,7 @@
 #include "fmgr.h"
 #include "funcapi.h"
 #include "lib/radixtree.h"
+#include <math.h>
 #include "miscadmin.h"
 #include "utils/timestamp.h"
 
@@ -183,6 +184,8 @@ bench_search(FunctionCallInfo fcinfo, bool shuffle)
 	TimestampDifference(start_time, end_time, &secs, &usecs);
 	rt_load_ms = secs * 1000 + usecs / 1000;
 
+	rt_stats(rt);
+
 	/* measure the load time of the array */
 	itemptrs = MemoryContextAllocHuge(CurrentMemoryContext,
 									  sizeof(ItemPointerData) * ntids);
@@ -292,6 +295,8 @@ bench_load_random_int(PG_FUNCTION_ARGS)
 	TimestampDifference(start_time, end_time, &secs, &usecs);
 	load_time_ms = secs * 1000 + usecs / 1000;
 
+	rt_stats(rt);
+
 	MemSet(nulls, false, sizeof(nulls));
 	values[0] = Int64GetDatum(rt_memory_usage(rt));
 	values[1] = Int64GetDatum(load_time_ms);
diff --git a/src/backend/lib/radixtree.c b/src/backend/lib/radixtree.c
index b163eac480..a84c06f0d4 100644
--- a/src/backend/lib/radixtree.c
+++ b/src/backend/lib/radixtree.c
@@ -1980,7 +1980,7 @@ rt_verify_node(rt_node *node)
 void
 rt_stats(radix_tree *tree)
 {
-	fprintf(stderr, "num_keys = %lu, height = %u, n4 = %u, n16 = %u,n32 = %u, n128 = %u, n256 = %u",
+	elog(NOTICE, "num_keys = %lu, height = %u, n4 = %u, n16 = %u, n32 = %u, n128 = %u, n256 = %u",
 			tree->num_keys,
 			tree->root->shift / RT_NODE_SPAN,
 			tree->cnt[0],
diff --git a/src/include/lib/radixtree.h b/src/include/lib/radixtree.h
index 38cc6abf4c..d5d7668617 100644
--- a/src/include/lib/radixtree.h
+++ b/src/include/lib/radixtree.h
@@ -15,7 +15,7 @@
 
 #include "postgres.h"
 
-/* #define RT_DEBUG 1 */
+#define RT_DEBUG 1
 
 typedef struct radix_tree radix_tree;
 typedef struct rt_iter rt_iter;
-- 
2.37.3

Re: [PoC] Improve dead tuple storage for lazy vacuum

Reply via email to