On Mon, 2 Aug 2021 at 00:31, David Rowley <dgrowle...@gmail.com> wrote: > I had another self review of these and I'm pretty happy with them. I'm > quite glad to see the performance of querying a single partition of a > table with large numbers of partitions no longer tails off as much as > it used to.
I did some profiling and benchmarking on master and with the v4 patch. With a hash partitioned table containing 8192 partitions I see the following when running a query that selects a value from a single partition: 19.39% postgres [.] apply_scanjoin_target_to_paths 5.35% postgres [.] base_yyparse 4.71% postgres [.] AllocSetAlloc 2.86% libc-2.33.so [.] __memset_avx2_unaligned_erms 2.17% postgres [.] SearchCatCacheInternal With the patched version, I see: 5.89% postgres [.] AllocSetAlloc 3.97% postgres [.] base_yyparse 3.87% libc-2.33.so [.] __memset_avx2_unaligned_erms 2.44% postgres [.] SearchCatCacheInternal 1.29% postgres [.] hash_search_with_hash_value I'm getting: master: 16613 tps patched: 22078 tps So there's about 32% performance improvement with this number of partitions. These results are not the same as my original email here as I've only recently discovered that I really need to pin pgbench and the postgres backend to the same CPU core to get good and stable performance from a single threaded pgbench job. FWIW, the next thing there on the profile the following line in expand_partitioned_rtentry() relinfo->part_rels = (RelOptInfo **) palloc0(relinfo->nparts * sizeof(RelOptInfo *)); If anyone has any objections to both the v4 0001 and 0002 patch, can they let me know soon. I'm currently seeing no reason that they can't go in. David