On Mon, 9 Mar 2020 at 00:02, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
>
> Speaking of which, would you take a look at [1]? I think supporting SAOP
> is fine, but I wonder if you agree with my conclusion we can't really
> support inclusion @> as explained in [2].
>

Hmm, I'm not sure. However, thinking about your example in [2] reminds
me of a thought I had a while ago, but then forgot about --- there is
a flaw in the formula used for computing probabilities with functional
dependencies:

  P(a,b) = P(a) * [f + (1-f)*P(b)]

because it might return a value that is larger than P(b), which
obviously should not be possible. We should amend that formula to
prevent a result larger than P(b). The obvious way to do that would be
to use:

  P(a,b) = Min(P(a) * [f + (1-f)*P(b)], P(b))

but actually I think it would be better and more principled to use:

  P(a,b) = f*Min(P(a),P(b)) + (1-f)*P(a)*P(b)

I.e., for those rows believed to be functionally dependent, we use the
minimum probability, and for the rows believed to be independent, we
use the product.

I think that would solve the problem with the example you gave at the
end of [2], but I'm not sure if it helps with the general case.

Regards,
Dean

> [1] https://www.postgresql.org/message-id/flat/13902317.Eha0YfKkKy@pierred-pdoc
> [2] https://www.postgresql.org/message-id/20200202184134.swoqkqlqorqolrqv%40development
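
To make the difference between the three formulas above concrete, here
is a small numeric sketch. It is only an illustration, not the planner
code: the function names and the input values (P(a) = 0.5, P(b) = 0.1,
f = 0.9) are made up for the example.

```python
def current(pa, pb, f):
    """Existing formula: P(a) * [f + (1-f)*P(b)]."""
    return pa * (f + (1 - f) * pb)


def clamped(pa, pb, f):
    """Existing formula, capped at P(b)."""
    return min(current(pa, pb, f), pb)


def blended(pa, pb, f):
    """Proposed formula: f*Min(P(a),P(b)) + (1-f)*P(a)*P(b)."""
    return f * min(pa, pb) + (1 - f) * pa * pb


# Hypothetical inputs, chosen so that P(a) > P(b) and f is high.
pa, pb, f = 0.5, 0.1, 0.9

print("current:", current(pa, pb, f))   # exceeds P(b)
print("clamped:", clamped(pa, pb, f))   # capped at P(b)
print("blended:", blended(pa, pb, f))   # stays below P(b)
```

With these inputs the existing formula gives about 0.455, well above
P(b) = 0.1; the clamped variant gives 0.1, and the blended variant
gives about 0.095. Since P(a)*P(b) can never exceed Min(P(a),P(b)),
the blended form is bounded by Min(P(a),P(b)) for any f in [0,1].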