On Wed, Mar 23, 2022 at 5:52 AM Amit Langote <amitlangot...@gmail.com> wrote:
> Hi Greg,
>
> On Wed, Mar 16, 2022 at 6:54 AM Greg Stark <st...@mit.edu> wrote:
> > There are a whole lot of different patches in this thread.
> >
> > However this last one https://commitfest.postgresql.org/37/3270/
> > created by Amit seems like a fairly straightforward optimization that
> > can be evaluated on its own separately from the others and seems quite
> > mature. I'm actually inclined to set it to "Ready for Committer".
>
> Thanks for taking a look at it.
>
> > Incidentally a quick read-through of the patch myself and the only
> > question I have is how the parameters of the adaptive algorithm were
> > chosen. They seem ludicrously conservative to me
>
> Do you think CACHE_BOUND_OFFSET_THRESHOLD_TUPS (1000) is too high? I
> suspect maybe you do.
>
> Basically, the way this works is that once set, cached_bound_offset is
> not reset until encountering a tuple for which cached_bound_offset
> doesn't give the correct partition, so the threshold doesn't matter
> while the caching is active. However, once reset, it is not set again
> until the threshold number of tuples have been processed, and then
> only if the binary searches done during that interval appear to have
> returned the same bound offset in succession a number of times. Maybe
> waiting 1000 tuples to re-assess that is a bit too conservative, yes.
> I guess even as small a number as 10 is fine here?
>
> I've attached an updated version of the patch, though I haven't
> changed the threshold constant.
>
> --
> Amit Langote
> EDB: http://www.enterprisedb.com
>
> On Wed, Mar 16, 2022 at 6:54 AM Greg Stark <st...@mit.edu> wrote:
> >
> > There are a whole lot of different patches in this thread.
> >
> > However this last one https://commitfest.postgresql.org/37/3270/
> > created by Amit seems like a fairly straightforward optimization that
> > can be evaluated on its own separately from the others and seems quite
> > mature. I'm actually inclined to set it to "Ready for Committer".
> >
> > Incidentally a quick read-through of the patch myself and the only
> > question I have is how the parameters of the adaptive algorithm were
> > chosen. They seem ludicrously conservative to me and a bit of simple
> > arguments about how expensive an extra check is versus the time saved
> > in the boolean search should be easy enough to come up with to justify
> > whatever values make sense.

Hi,

+ * Threshold of the number of tuples to need to have been processed before
+ * maybe_cache_partition_bound_offset() (re-)assesses whether caching must be

The first part of the comment should be:

    Threshold of the number of tuples which need to have been processed

+            (double) pd->n_tups_inserted / pd->n_offset_changed > 1)

I think the division can be avoided - the condition can be written as:

    pd->n_tups_inserted > pd->n_offset_changed

+    /* Check if the value is below the high bound */

high bound -> upper bound

Cheers