On Sun, Jul 17, 2022 at 07:07:15PM -0400, Dian M Fay wrote:
> On Sat Jul 16, 2022 at 11:23 PM EDT, David G. Johnston wrote:
> > Thanks for the review. I generally like everything you said but it made me
> > realize that I still didn't really understand the intent behind the
> > formula. I spent way too much time working that out for myself, then
> > turned what I found useful into this v2 patch.
> >
> > It may need some semantic markup still but figured I'd see if the idea
> > makes sense.
> >
> > I basically rewrote, in a bit different style, the same material into the
> > code comments, then proceeded to rework the proof that was already present
> > there.
> >
> > I did do this in somewhat of a vacuum. I'm not inclined to learn this all
> > start-to-end though. If the abrupt style change is unwanted so be it. I'm
> > not really sure how much benefit the proof really provides. The comments
> > in the docs are probably sufficient for the code as well - just define why
> > the three pieces of the formula exist and are packaged into a single
> > multiplier called selectivity as an API choice. I suspect once someone
> > gets to that comment it is fair to assume some prior knowledge.
> > Admittedly, I didn't really come into this that way...
>
> Fair enough, I only know what I can glean from the comments in
> eqjoinsel_inner and friends myself. I do think even this smaller change
> is valuable because the current example talks about using an algorithm
> based on the number of distinct values immediately after showing
> n_distinct == -1, so making it clear that this case uses num_rows
> instead is helpful.
>
> "This value does get scaled in the non-unique case" again could be more
> specific ("since here all values are unique, otherwise the calculation
> uses num_distinct" perhaps?). But past that quibble I'm good.
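
For anyone reading along, here is a rough sketch of the calculation under discussion. It is not the code in eqjoinsel_inner. It assumes the simple case where neither column has a most-common-values list, so only the null fractions and distinct counts matter. The function and parameter names are made up for illustration. The one detail it relies on is that a negative n_distinct in pg_stats is a fraction of the row count, so -1 means "as many distinct values as rows".

```python
def effective_num_distinct(n_distinct: float, num_rows: float) -> float:
    """Turn a pg_stats-style n_distinct into an absolute count.

    Hypothetical helper: a negative value is treated as a multiplier on
    the row count (so -1 yields num_rows); a positive value is used as-is.
    """
    if n_distinct == 0:
        raise ValueError("n_distinct of 0 means 'unknown'; not handled in this sketch")
    if n_distinct < 0:
        return -n_distinct * num_rows
    return n_distinct


def eq_join_selectivity(null_frac1, n_distinct1, num_rows1,
                        null_frac2, n_distinct2, num_rows2):
    """Equality-join selectivity, assuming no MCV lists on either side.

    The three pieces: the non-null fraction of each input, and the chance
    that two non-null values match, taken from the larger distinct count.
    """
    nd1 = effective_num_distinct(n_distinct1, num_rows1)
    nd2 = effective_num_distinct(n_distinct2, num_rows2)
    return (1 - null_frac1) * (1 - null_frac2) * min(1 / nd1, 1 / nd2)


# Two 10000-row inputs, no nulls, every join-column value unique.
sel = eq_join_selectivity(0.0, -1, 10000, 0.0, -1, 10000)
estimated_rows = 10000 * 10000 * sel
```

With n_distinct == -1 on both sides the distinct count collapses to num_rows, which is the case the doc example shows. In the non-unique case the same multiplier is driven by the distinct counts instead.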

Patch applied to master.

-- 
  Bruce Momjian  <br...@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.