On Thu, Mar 24, 2022 at 1:55 AM Zhihong Yu <z...@yugabyte.com> wrote:
> On Wed, Mar 23, 2022 at 5:52 AM Amit Langote <amitlangot...@gmail.com> wrote:
>> I've attached an updated version of the patch, though I haven't
>> changed the threshold constant.
>
> + * Threshold of the number of tuples to need to have been processed before
> + * maybe_cache_partition_bound_offset() (re-)assesses whether caching must be
>
> The first part of the comment should be:
>
> Threshold of the number of tuples which need to have been processed
Sounds the same to me, so leaving it as it is.

> +       (double) pd->n_tups_inserted / pd->n_offset_changed > 1)
>
> I think division can be avoided - the condition can be written as:
>
>     pd->n_tups_inserted > pd->n_offset_changed
>
> +       /* Check if the value is below the high bound */
>
> high bound -> upper bound

Both done, thanks.

In the attached updated patch, I've also lowered the threshold number of
tuples to wait before re-enabling caching from 1000 down to 10.  AFAICT,
that only makes things better for the cases in which the proposed caching
is supposed to help, while not affecting the cases in which caching might
actually make things worse.

I've repeated the benchmark mentioned in [1]:

-- creates a range-partitioned table with 1000 partitions
create unlogged table foo (a int) partition by range (a);
select 'create unlogged table foo_' || i || ' partition of foo for values from ('
    || (i-1)*100000+1 || ') to (' || i*100000+1 || ');'
from generate_series(1, 1000) i;
\gexec

-- generates a 100 million record file
copy (select generate_series(1, 100000000)) to '/tmp/100m.csv' csv;

HEAD:

postgres=# copy foo from '/tmp/100m.csv' csv; truncate foo;
COPY 100000000
Time: 39445.421 ms (00:39.445)
TRUNCATE TABLE
Time: 381.570 ms
postgres=# copy foo from '/tmp/100m.csv' csv; truncate foo;
COPY 100000000
Time: 38779.235 ms (00:38.779)

Patched:

postgres=# copy foo from '/tmp/100m.csv' csv; truncate foo;
COPY 100000000
Time: 33136.202 ms (00:33.136)
TRUNCATE TABLE
Time: 394.939 ms
postgres=# copy foo from '/tmp/100m.csv' csv; truncate foo;
COPY 100000000
Time: 33914.856 ms (00:33.915)
TRUNCATE TABLE
Time: 407.451 ms

So roughly, 38 seconds with HEAD vs. 33 seconds with the patch applied.

(Curiously, the numbers with both HEAD and patched look worse this time
around: they were 31 seconds with HEAD vs. 26 seconds with patched back
in May 2021.  Unless that's measurement noise, it may be something to
look into.)

--
Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/CA%2BHiwqFbMSLDMinPRsGQVn_gfb-bMy0J2z_rZ0-b9kSfxXF%2BAg%40mail.gmail.com
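For readers following the thread without the patch at hand, below is a
rough standalone sketch of the heuristic being discussed.  It reuses the
counter names from the quoted fragments (n_tups_inserted,
n_offset_changed) and the function name maybe_cache_partition_bound_offset();
the struct, the CACHE_BOUND_OFFSET_THRESHOLD constant, and the exact
bookkeeping are invented for illustration and are not the actual
PartitionDispatch code from the patch.

#include <stdbool.h>

/* Re-assess caching after this many tuples (the patch lowers 1000 to 10). */
#define CACHE_BOUND_OFFSET_THRESHOLD 10

/* Hypothetical stand-in for the patch's per-dispatch state. */
typedef struct PartitionDispatchSketch
{
    int     n_tups_inserted;    /* tuples routed in the current window */
    int     n_offset_changed;   /* times the matched bound offset changed */
    int     last_offset;        /* offset matched by the previous tuple */
    int     cached_offset;      /* offset to try first, or -1 if disabled */
    bool    cache_enabled;
} PartitionDispatchSketch;

/*
 * Called once per routed tuple with the bound offset that the binary
 * search (or the cache) produced.  The caller is assumed to try
 * cached_offset before falling back to a full binary search.
 */
void
maybe_cache_partition_bound_offset(PartitionDispatchSketch *pd, int offset)
{
    if (offset != pd->last_offset)
        pd->n_offset_changed++;
    pd->last_offset = offset;
    pd->n_tups_inserted++;

    /* Wait until enough tuples have been processed to judge stability. */
    if (pd->n_tups_inserted < CACHE_BOUND_OFFSET_THRESHOLD)
        return;

    /*
     * Cache only if the offset changed less than once per tuple.  Per the
     * review upthread, this is a plain integer comparison rather than
     * "(double) n_tups_inserted / n_offset_changed > 1".
     */
    pd->cache_enabled = (pd->n_tups_inserted > pd->n_offset_changed);
    pd->cached_offset = pd->cache_enabled ? offset : -1;

    /* Start a fresh assessment window. */
    pd->n_tups_inserted = 0;
    pd->n_offset_changed = 0;
}

With the threshold at 10, a run of monotonically increasing keys (as in
the COPY benchmark above) presumably re-enables caching almost
immediately after crossing a partition boundary, while uniformly random
keys keep failing the comparison and leave the plain binary search in
place.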
v10-0001-Optimze-get_partition_for_tuple-by-caching-bound.patch