On Fri, Sep 16, 2022 at 1:01 PM Masahiko Sawada <sawada.m...@gmail.com> wrote: > In addition to two patches, I've attached the third patch. It's not > part of radix tree implementation but introduces a contrib module > bench_radix_tree, a tool for radix tree performance benchmarking. It > measures loading and lookup performance of both the radix tree and a > flat array.
Hi Masahiko, I've been using these benchmarks, along with my own variations, to try various things that I've mentioned. I'm long overdue for an update, but the picture is not yet complete. For now, I have two questions that I can't figure out on my own: 1. There seems to be some non-obvious limit on the number of keys that are loaded (or at least what the numbers report). This is independent of the number of tids per block. Example below: john=# select * from bench_shuffle_search(0, 8*1000*1000); NOTICE: num_keys = 8000000, height = 3, n4 = 0, n16 = 1, n32 = 0, n128 = 250000, n256 = 981 nkeys | rt_mem_allocated | array_mem_allocated | rt_load_ms | array_load_ms | rt_search_ms | array_serach_ms ---------+------------------+---------------------+------------+---------------+--------------+----------------- 8000000 | 268435456 | 48000000 | 661 | 29 | 276 | 389 john=# select * from bench_shuffle_search(0, 9*1000*1000); NOTICE: num_keys = 8388608, height = 3, n4 = 0, n16 = 1, n32 = 0, n128 = 262144, n256 = 1028 nkeys | rt_mem_allocated | array_mem_allocated | rt_load_ms | array_load_ms | rt_search_ms | array_serach_ms ---------+------------------+---------------------+------------+---------------+--------------+----------------- 8388608 | 276824064 | 54000000 | 718 | 33 | 311 | 446 The array is the right size, but nkeys hasn't kept pace. Can you reproduce this? Attached is the patch I'm using to show the stats when running the test. (Side note: The numbers look unfavorable for radix tree because I'm using 1 tid per block here.) 2. I found that bench_shuffle_search() is much *faster* for traditional binary search on an array than bench_seq_search(). I've found this to be true in every case. This seems counterintuitive to me -- any idea why this is? Example: john=# select * from bench_seq_search(0, 1000000); NOTICE: num_keys = 1000000, height = 2, n4 = 0, n16 = 0, n32 = 31251, n128 = 1, n256 = 122 nkeys | rt_mem_allocated | array_mem_allocated | rt_load_ms | array_load_ms | rt_search_ms | array_serach_ms ---------+------------------+---------------------+------------+---------------+--------------+----------------- 1000000 | 10199040 | 180000000 | 168 | 106 | 827 | 3348 john=# select * from bench_shuffle_search(0, 1000000); NOTICE: num_keys = 1000000, height = 2, n4 = 0, n16 = 0, n32 = 31251, n128 = 1, n256 = 122 nkeys | rt_mem_allocated | array_mem_allocated | rt_load_ms | array_load_ms | rt_search_ms | array_serach_ms ---------+------------------+---------------------+------------+---------------+--------------+----------------- 1000000 | 10199040 | 180000000 | 171 | 107 | 827 | 1400 -- John Naylor EDB: http://www.enterprisedb.com
From 43a50a385930ee340d0a3b003910c704a0ff342c Mon Sep 17 00:00:00 2001 From: John Naylor <john.nay...@postgresql.org> Date: Thu, 6 Oct 2022 09:07:41 +0700 Subject: [PATCH v65 1/5] Turn on per-node counts in benchmark Also add gitigore, fix whitespace, and change to NOTICE --- contrib/bench_radix_tree/.gitignore | 3 +++ contrib/bench_radix_tree/bench_radix_tree.c | 5 +++++ src/backend/lib/radixtree.c | 2 +- src/include/lib/radixtree.h | 2 +- 4 files changed, 10 insertions(+), 2 deletions(-) create mode 100644 contrib/bench_radix_tree/.gitignore diff --git a/contrib/bench_radix_tree/.gitignore b/contrib/bench_radix_tree/.gitignore new file mode 100644 index 0000000000..8830f5460d --- /dev/null +++ b/contrib/bench_radix_tree/.gitignore @@ -0,0 +1,3 @@ +*data +log/* +results/* diff --git a/contrib/bench_radix_tree/bench_radix_tree.c b/contrib/bench_radix_tree/bench_radix_tree.c index 5806ef7519..36c5218ae7 100644 --- a/contrib/bench_radix_tree/bench_radix_tree.c +++ b/contrib/bench_radix_tree/bench_radix_tree.c @@ -13,6 +13,7 @@ #include "fmgr.h" #include "funcapi.h" #include "lib/radixtree.h" +#include <math.h> #include "miscadmin.h" #include "utils/timestamp.h" @@ -183,6 +184,8 @@ bench_search(FunctionCallInfo fcinfo, bool shuffle) TimestampDifference(start_time, end_time, &secs, &usecs); rt_load_ms = secs * 1000 + usecs / 1000; + rt_stats(rt); + /* measure the load time of the array */ itemptrs = MemoryContextAllocHuge(CurrentMemoryContext, sizeof(ItemPointerData) * ntids); @@ -292,6 +295,8 @@ bench_load_random_int(PG_FUNCTION_ARGS) TimestampDifference(start_time, end_time, &secs, &usecs); load_time_ms = secs * 1000 + usecs / 1000; + rt_stats(rt); + MemSet(nulls, false, sizeof(nulls)); values[0] = Int64GetDatum(rt_memory_usage(rt)); values[1] = Int64GetDatum(load_time_ms); diff --git a/src/backend/lib/radixtree.c b/src/backend/lib/radixtree.c index b163eac480..a84c06f0d4 100644 --- a/src/backend/lib/radixtree.c +++ b/src/backend/lib/radixtree.c @@ -1980,7 +1980,7 @@ rt_verify_node(rt_node *node) void rt_stats(radix_tree *tree) { - fprintf(stderr, "num_keys = %lu, height = %u, n4 = %u, n16 = %u,n32 = %u, n128 = %u, n256 = %u", + elog(NOTICE, "num_keys = %lu, height = %u, n4 = %u, n16 = %u, n32 = %u, n128 = %u, n256 = %u", tree->num_keys, tree->root->shift / RT_NODE_SPAN, tree->cnt[0], diff --git a/src/include/lib/radixtree.h b/src/include/lib/radixtree.h index 38cc6abf4c..d5d7668617 100644 --- a/src/include/lib/radixtree.h +++ b/src/include/lib/radixtree.h @@ -15,7 +15,7 @@ #include "postgres.h" -/* #define RT_DEBUG 1 */ +#define RT_DEBUG 1 typedef struct radix_tree radix_tree; typedef struct rt_iter rt_iter; -- 2.37.3