Proposal: Deferred Replica Filtering for PostgreSQL Logical Replication

2025-03-15 Thread Dean
Hi hackers,

I'd like to propose an enhancement to PostgreSQL's logical replication
system: Deferred Replica Filtering (DRF). The goal of this feature is to
provide more granular control over which rows are replicated by applying
publication filters *after* the WAL decoding process, before sending data
to subscribers.

Currently, PostgreSQL's logical replication filters are applied
deterministically during decoding. Deferred filtering, by contrast,
operates after the WAL has been decoded, giving it access to the complete
row data and allowing filtering decisions to be based on mutable values.
Additionally, the filter may omit individual columns from a record.

This opens up several possibilities for granular control. Consider the
following examples:
Alice and Bob subscribe to changes on a table with RLS enabled, allowing
CRUD operations based on user IDs.
1. Alice needs to know the timestamp at which Bob updated the table. With
DRF, we can omit all columns except for the timestamp.
2. Bob wants to track DELETEs on the table. Without DRF, Bob can see all
columns on any deleted row, potentially exposing complete records he
shouldn't be authorized to view. DRF can filter these rows out.

Deferred replica filtering allows for session-specific, per-row, and
per-column filtering - features not supported by the existing replication
filters - enhancing security and data privacy.

I look forward to hearing your thoughts!

Best,
Dean S


Re: Proposal: Deferred Replica Filtering for PostgreSQL Logical Replication

2025-03-18 Thread Dean
Unfortunately, neither column lists nor row filters can provide the level
of control I'm proposing. These revised examples might help illustrate the
use case for DRF:

Alice, Bob, and Eve subscribe to changes on a `friend_requests` table.
Row-level security ensures CRUD access based on user IDs.
1. Per-subscriber column control: Bob makes a change to the table. Alice
should receive the entire record, while Eve should only receive the
timestamp - no other columns. Why DRF is needed: Column lists are static
and apply equally to all subscribers, meaning we can't distinguish Alice's
subscription from Eve's.
2. Bob DELETEs a row from the table. Alice should see the DELETE event,
while Eve should not even be aware of an event. Why DRF is needed: The
deterministic nature of row filters makes them unsuitable for
per-subscriber filtering based on session data.

The goal of DRF is to allow per-subscriber variations in change broadcasts,
enabling granular control over what data is sent to each subscriber based
on their session context.
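
For contrast, the existing filters are static properties of the
publication, so every subscriber of a given publication receives the same
rows and columns. A rough sketch of that, with purely illustrative names
and glossing over the replica-identity restrictions:

CREATE PUBLICATION pub_requests
    FOR TABLE friend_requests (id, created_at)   -- static column list
    WHERE (status <> 'hidden');                  -- static row filter

DRF would instead evaluate a filter per subscriber, after decoding, so the
stream sent for Alice's subscription could differ from the one sent for
Eve's for the same change.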

Best,
Dean S

On Mon, Mar 17, 2025 at 4:32 AM Amit Kapila  wrote:

> On Sun, Mar 16, 2025 at 12:59 AM Dean  wrote:
> >
> > I'd like to propose an enhancement to PostgreSQL's logical replication
> system: Deferred Replica Filtering (DRF). The goal of this feature is to
> provide more granular control over which rows are replicated by applying
> publication filters after the WAL decoding process, before sending data to
> subscribers.
> >
> > Currently, PostgreSQL's logical replication filters apply
> deterministically. Deferred filtering, however, operates after the WAL has
> been decoded, giving it access to the complete row data and making
> filtering decisions based on mutable values. Additionally, record columns
> may be omitted by the filter.
> >
> > This opens up several possibilities for granular control. Consider the
> following examples:
> > Alice and Bob subscribe to changes on a table with RLS enabled, allowing
> CRUD operations based on user's IDs.
> > 1. Alice needs to know the timestamp at which Bob updated the table.
> With DRF, we can omit all columns except for the timestamp.
> > 2. Bob wants to track DELETEs on the table. Without DRF, Bob can see all
> columns on any deleted row, potentially exposing complete records he
> shouldn't be authorized to view. DRF can filter these rows out.
> >
> > Deferred replica filtering allows for session-specific, per-row, and
> per-column filtering - features currently not supported by existing
> replication filters, enhancing security and data privacy.
> >
>
> We provide column lists [1] and row filters [2]. Doesn't that suffice
> the need, if not, kindly let us know what exactly you need with some
> examples.
>
> [1] -
> https://www.postgresql.org/docs/devel/logical-replication-col-lists.html
> [2] -
> https://www.postgresql.org/docs/devel/logical-replication-row-filter.html
>
> --
> With Regards,
> Amit Kapila.
>


Some optimisations for numeric division

2022-02-23 Thread Dean Rasheed
Attached are 3 small patches that improve the performance of numeric
division. Patch 0002 seems to have the biggest impact, but I think
they're all worth including, since they're quite simple changes, with
noticeable performance gains.


Patch 0001 vectorises the inner loop of div_var_fast(). This loop is
almost identical to the inner loop of mul_var(), which was vectorised
by commit 8870917623. The thing preventing the div_var_fast() loop
from being vectorised is the assignment to div[qi + i], and simply
replacing that with div_qi[i] where div_qi = &div[qi], in the same
style as mul_var(), fixes that.

One difference between this and the mul_var() code is that it is also
necessary to add an explicit cast to get the compiler to generate
matching output, and give the best results. This is because in
mul_var() the compiler recognises that var1digit is actually 16-bit,
rather than 32-bit, and so it doesn't need to multiply the high part,
but in div_var_fast() it's not obvious to the compiler that qdigit
also fits in 16 bits, hence the cast.
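
In code terms, the change to the inner loop is essentially this
(paraphrasing the attached patch):

    /* before: the indexed store defeats auto-vectorization */
    for (i = 0; i < istop; i++)
        div[qi + i] -= qdigit * var2digits[i];

    /* after: hoist the base pointer and narrow qdigit, as in mul_var() */
    int        *div_qi = &div[qi];

    for (i = 0; i < istop; i++)
        div_qi[i] -= ((NumericDigit) qdigit) * var2digits[i];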

The actual performance gain is typically quite modest, since
div_var_fast() is always only a part of some more complex numeric
computation, but it can be seen in cases like raising numbers to
negative integer powers, e.g.:

CREATE TEMP TABLE num_test(x numeric);
SELECT setseed(0);
INSERT INTO num_test
  SELECT (SELECT ('1.'||string_agg((random()*9)::int::text, '')||x)::numeric
  FROM generate_series(1,100))
  FROM generate_series(1,10) g(x);

SELECT sum(x^(-2)) FROM num_test;

Time: 112.949 ms  (HEAD)
Time: 98.537 ms   (with patch)


Patch 0002 simplifies the inner loop of div_var() (the guts of the
public-facing numeric division operator) by more closely combining the
multiplication and subtraction operations and folding the separate
"carry" and "borrow" variables into a single "borrow", as suggested by
the old code comment.

IMO, this makes the code easier to understand, as well as giving more
significant performance gains:

CREATE TEMP TABLE div_test(x numeric);
SELECT setseed(0);
INSERT INTO div_test
  SELECT (SELECT ('1.'||string_agg((random()*9)::int::text, ''))::numeric + x
  FROM generate_series(1,50))
  FROM generate_series(1,1000) g(x);

SELECT sum(x/y) FROM div_test t1(x), div_test t2(y);

Time: 1477.340 ms  (HEAD)
Time: 826.748 ms   (with patch)
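
For reference, the folded multiply-and-subtract loop in the attached 0002
patch is:

    borrow = 0;
    for (i = var2ndigits; i >= 0; i--)
    {
        int     tmp_result;

        /*
         * qhat <= NBASE (it may be one too large), so tmp_result is in
         * [-NBASE^2, NBASE - 1] and borrow stays in [0, NBASE].
         */
        tmp_result = dividend_j[i] - borrow - divisor[i] * qhat;
        borrow = (NBASE - 1 - tmp_result) / NBASE;
        dividend_j[i] = tmp_result + borrow * NBASE;
    }

where dividend_j points at &dividend[j].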


Patch 0003 replaces some uses of div_var_fast() with div_var().
Specifically, when the divisor has just one or two base-NBASE digits,
div_var() is faster. This is especially true for 1-digit divisors, due
to the "fast path" code in div_var() to handle that. It's also just
about true for 2-digit divisors, and it occurs to me that that case
could potentially be optimised further with similar fast path code in
div_var(), but I haven't tried that yet.

One-digit divisors are quite common. For example, the divisors in the
Taylor series expansions in exp_var() and ln_var() are all one digit,
since the number of Taylor series terms never exceeds a few hundred in
practice. Also,
exp_var()'s argument reduction step divides by 2^ndiv2, which is
roughly 100 times the input, rounded up to a power of two, and valid
inputs are less than 6000, so this will always be one or two digits.

Testing this with a bunch of random exp() and ln() computations I saw
a speed-up of 15-20%, and it reduced the run time of the numeric-big
regression test by around 10%, which seems worth having.
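
For anyone wanting to reproduce that, this is a rough sketch of the kind
of test I mean (not the exact script used):

SELECT sum(exp(x)) AS sum_exp, sum(ln(x)) AS sum_ln
  FROM (SELECT (random() * 10 + 0.1)::numeric AS x
        FROM generate_series(1, 100000)) t;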

Regards,
Dean
From 41732ad9a44dcd12e52d823fb59cb23cce4fe217 Mon Sep 17 00:00:00 2001
From: Dean Rasheed 
Date: Sun, 20 Feb 2022 12:45:01 +
Subject: [PATCH 1/3] Apply auto-vectorization to the inner loop of
 div_var_fast().

This loop is basically the same as the inner loop of mul_var(), which
was auto-vectorized in commit 8870917623, but the compiler will only
consider auto-vectorizing the div_var_fast() loop if the assignment
target div[qi + i] is replaced by div_qi[i], where div_qi = &div[qi].

Additionally, since the compiler doesn't know that qdigit is
guaranteed to fit in a 16-bit NumericDigit, cast it to NumericDigit
before multiplying to make the resulting auto-vectorized code more
efficient.
---
 src/backend/utils/adt/numeric.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 3208789f75..fc43d2a456 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -8908,13 +8908,22 @@ div_var_fast(const NumericVar *var1, const NumericVar *var2,
 			 * which would make the new value simply div[qi] mod vardigits[0].
 			 * The lower-order terms in qdigit can change this result by not
 			 * more than about twice INT_MAX/NBASE, so overflow is impossible.
+			 *
+			 * This inner loop is the performance bottleneck for division, so
+			 * code it in the same way as the inner loop of mul_var() so that
+	

Re: Some optimisations for numeric division

2022-02-23 Thread Dean Rasheed
On Wed, 23 Feb 2022 at 20:55, Tom Lane  wrote:
>
> I took a quick look through these (just eyeball, didn't try to verify
> your performance statements).

Thanks for looking!

>  I'm +1 on 0001 and 0002, but 0003 feels
> a bit ad-hoc.  It certainly *looks* weird for the allegedly faster
> function to be handing off to the allegedly slower one.  I also wonder
> if we're leaving anything on the table by not exploiting
> div_var_fast's weaker roundoff guarantees in this case.  Should we
> think about a more thoroughgoing redesign of these functions' APIs?

Hmm, I'm not sure what kind of thing you had in mind.

One thought that occurred to me was that it's a bit silly that
exp_var() and ln_var() have to use a NumericVar for what could just be
an int, if we had a div_var_int() function that could divide by an
int. Then both div_var() and div_var_fast() could hand off to it for
one and two digit divisors.

Regards,
Dean




Re: Some optimisations for numeric division

2022-02-25 Thread Dean Rasheed
On Wed, 23 Feb 2022 at 22:55, Tom Lane  wrote:
>
> Dean Rasheed  writes:
>
> > One thought that occurred to me was that it's a bit silly that
> > exp_var() and ln_var() have to use a NumericVar for what could just be
> > an int, if we had a div_var_int() function that could divide by an
> > int. Then both div_var() and div_var_fast() could hand off to it for
> > one and two digit divisors.
>
> Oooh, that seems like a good idea.
>

OK, I've replaced the 0003 patch with one that does that instead. The
div_var_int() API is slightly different in that it also accepts a
divisor weight argument, but the alternative would have been for the
callers to have to adjust both the result weight and rscale, which
would have been uglier.
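
To illustrate the idea, here is a sketch of the short-division loop
involved -- names and details here are illustrative, not the code in the
patch (NumericDigit, NBASE and uint64 are as in numeric.c), and the
divisor-weight handling and rounding are omitted:

    /*
     * Divide an NBASE-digit array (most significant digit first) by a
     * small positive integer, carrying the remainder down in one pass.
     */
    static void
    div_digits_by_int(const NumericDigit *digits, int ndigits,
                      int ival, NumericDigit *res_digits)
    {
        uint64      remainder = 0;

        for (int i = 0; i < ndigits; i++)
        {
            uint64      work = remainder * NBASE + digits[i];

            res_digits[i] = (NumericDigit) (work / ival);
            remainder = work % ival;
        }
    }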

There's a large block of code in div_var() that needs re-indenting,
but I think it would be better to leave that to a later pgindent run.

The performance results are quite pleasing. It's slightly faster than
the old one-digit div_var() code because it manages to avoid some
digit array copying, and for two digit divisors it's much faster:

CREATE TEMP TABLE div_test(x numeric, y numeric);
SELECT setseed(0);
INSERT INTO div_test
  SELECT (SELECT (((x%9)+1)||string_agg((random()*9)::int::text, ''))::numeric
  FROM generate_series(1,50)),
 (SELECT ('1.'||string_agg((random()*9)::int::text,
'')||(x%10)||'e3')::numeric
  FROM generate_series(1,6))
  FROM generate_series(1,5000) g(x);
select * from div_test limit 10;

SELECT sum(t1.x/t2.y) FROM div_test t1, div_test t2;

Time: 11600.034 ms  (HEAD)
Time: 9890.788 ms   (with 0002)
Time: 6202.851 ms   (with 0003)

And obviously it'll be a larger relative gain for div_var_fast(),
since that was slower to begin with in such cases.

This makes me think that it might also be worthwhile to follow this
with a similar div_var_int64() function on platforms with 128-bit
integers, which could then be used to handle 3- and 4-digit divisors,
which are probably quite common in practice.

Attached is the updated patch series (0001 and 0002 unchanged).

Regards,
Dean
From 481b7c0044afbe72adff1961624a1bb515791b96 Mon Sep 17 00:00:00 2001
From: Dean Rasheed 
Date: Sun, 20 Feb 2022 17:15:49 +
Subject: [PATCH 2/3] Simplify the inner loop of numeric division in div_var().

In the standard numeric division algorithm, the inner loop multiplies
the divisor by the next quotient digit and subtracts that from the
working dividend. As suggested by the original code comment, the
separate "carry" and "borrow" variables (from the multiplication and
subtraction steps respectively) can be folded together into a single
variable. Doing so significantly improves performance, as well as
simplifying the code.
---
 src/backend/utils/adt/numeric.c | 36 ++---
 1 file changed, 15 insertions(+), 21 deletions(-)

diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index fc43d2a456..909f4eed74 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -8605,31 +8605,25 @@ div_var(const NumericVar *var1, const NumericVar *var2, NumericVar *result,
 			/* As above, need do nothing more when quotient digit is 0 */
 			if (qhat > 0)
 			{
+NumericDigit *dividend_j = &dividend[j];
+
 /*
  * Multiply the divisor by qhat, and subtract that from the
- * working dividend.  "carry" tracks the multiplication,
- * "borrow" the subtraction (could we fold these together?)
+ * working dividend.  The multiplication and subtraction are
+ * folded together here, noting that qhat <= NBASE (since it
+ * might be one too large), and so the intermediate result
+ * "tmp_result" is in the range [-NBASE^2, NBASE - 1], and
+ * "borrow" is in the range [0, NBASE].
  */
-carry = 0;
 borrow = 0;
 for (i = var2ndigits; i >= 0; i--)
 {
-	carry += divisor[i] * qhat;
-	borrow -= carry % NBASE;
-	carry = carry / NBASE;
-	borrow += dividend[j + i];
-	if (borrow < 0)
-	{
-		dividend[j + i] = borrow + NBASE;
-		borrow = -1;
-	}
-	else
-	{
-		dividend[j + i] = borrow;
-		borrow = 0;
-	}
+	int			tmp_result;
+
+	tmp_result = dividend_j[i] - borrow - divisor[i] * qhat;
+	borrow = (NBASE - 1 - tmp_result) / NBASE;
+	dividend_j[i] = tmp_result + borrow * NBASE;
 }
-Assert(carry == 0);
 
 /*
  * If we got a borrow out of the top dividend digit, then
@@ -8645,15 +8639,15 @@ div_var(const NumericVar *var1, const NumericVar *var2, NumericVar *result,
 	carry = 0;
 	for (i = var2ndigits; i >= 0; i--)
 	{
-		carry += dividend[j + i] + divisor[i];
+		carry += dividend_j[i] + divisor[i];
 		if (carry >= NBASE)
 		{
-			dividend[j + i] = carry - NBASE;
+			

Re: Some optimisations for numeric division

2022-02-25 Thread Dean Rasheed
On Fri, 25 Feb 2022 at 10:45, Dean Rasheed  wrote:
>
> Attached is the updated patch series (0001 and 0002 unchanged).
>

And another update following feedback from the cfbot.

Regards,
Dean
From 41732ad9a44dcd12e52d823fb59cb23cce4fe217 Mon Sep 17 00:00:00 2001
From: Dean Rasheed 
Date: Sun, 20 Feb 2022 12:45:01 +
Subject: [PATCH 1/3] Apply auto-vectorization to the inner loop of
 div_var_fast().

This loop is basically the same as the inner loop of mul_var(), which
was auto-vectorized in commit 8870917623, but the compiler will only
consider auto-vectorizing the div_var_fast() loop if the assignment
target div[qi + i] is replaced by div_qi[i], where div_qi = &div[qi].

Additionally, since the compiler doesn't know that qdigit is
guaranteed to fit in a 16-bit NumericDigit, cast it to NumericDigit
before multiplying to make the resulting auto-vectorized code more
efficient.
---
 src/backend/utils/adt/numeric.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index 3208789f75..fc43d2a456 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -8908,13 +8908,22 @@ div_var_fast(const NumericVar *var1, const NumericVar *var2,
 			 * which would make the new value simply div[qi] mod vardigits[0].
 			 * The lower-order terms in qdigit can change this result by not
 			 * more than about twice INT_MAX/NBASE, so overflow is impossible.
+			 *
+			 * This inner loop is the performance bottleneck for division, so
+			 * code it in the same way as the inner loop of mul_var() so that
+			 * it can be auto-vectorized.  We cast qdigit to NumericDigit
+			 * before multiplying to allow the compiler to generate more
+			 * efficient code (using 16-bit multiplication), which is safe
+			 * since we know that the quotient digit is off by at most one, so
+			 * there is no overflow risk.
 			 */
 			if (qdigit != 0)
 			{
 int			istop = Min(var2ndigits, div_ndigits - qi + 1);
+int		   *div_qi = &div[qi];
 
 for (i = 0; i < istop; i++)
-	div[qi + i] -= qdigit * var2digits[i];
+	div_qi[i] -= ((NumericDigit) qdigit) * var2digits[i];
 			}
 		}
 
-- 
2.26.2

From 481b7c0044afbe72adff1961624a1bb515791b96 Mon Sep 17 00:00:00 2001
From: Dean Rasheed 
Date: Sun, 20 Feb 2022 17:15:49 +
Subject: [PATCH 2/3] Simplify the inner loop of numeric division in div_var().

In the standard numeric division algorithm, the inner loop multiplies
the divisor by the next quotient digit and subtracts that from the
working dividend. As suggested by the original code comment, the
separate "carry" and "borrow" variables (from the multiplication and
subtraction steps respectively) can be folded together into a single
variable. Doing so significantly improves performance, as well as
simplifying the code.
---
 src/backend/utils/adt/numeric.c | 36 ++---
 1 file changed, 15 insertions(+), 21 deletions(-)

diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index fc43d2a456..909f4eed74 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -8605,31 +8605,25 @@ div_var(const NumericVar *var1, const NumericVar *var2, NumericVar *result,
 			/* As above, need do nothing more when quotient digit is 0 */
 			if (qhat > 0)
 			{
+NumericDigit *dividend_j = &dividend[j];
+
 /*
  * Multiply the divisor by qhat, and subtract that from the
- * working dividend.  "carry" tracks the multiplication,
- * "borrow" the subtraction (could we fold these together?)
+ * working dividend.  The multiplication and subtraction are
+ * folded together here, noting that qhat <= NBASE (since it
+ * might be one too large), and so the intermediate result
+ * "tmp_result" is in the range [-NBASE^2, NBASE - 1], and
+ * "borrow" is in the range [0, NBASE].
  */
-carry = 0;
 borrow = 0;
 for (i = var2ndigits; i >= 0; i--)
 {
-	carry += divisor[i] * qhat;
-	borrow -= carry % NBASE;
-	carry = carry / NBASE;
-	borrow += dividend[j + i];
-	if (borrow < 0)
-	{
-		dividend[j + i] = borrow + NBASE;
-		borrow = -1;
-	}
-	else
-	{
-		dividend[j + i] = borrow;
-		borrow = 0;
-	}
+	int			tmp_result;
+
+	tmp_result = dividend_j[i] - borrow - divisor[i] * qhat;
+	borrow = (NBASE - 1 - tmp_result) / NBASE;
+	dividend_j[i] = tmp_result + borrow * NBASE;
 }
-Assert(carry == 0);
 
 /*
  * If we got a borrow out of the top dividend digit, then
@@ -8645,15 +8639,15 @@ div_var(const NumericVar *var1, const NumericVar *var2, NumericVar *result,
 	carry = 0;
 	for (i = var2ndigits; i >= 0; i--)
 	{
-		carry += dividend[j + i] + divisor[i];
+		carry += dividend_j[i] + divisor[i];
 		if (carry >= NBASE)
 		

Re: [PATCH] Add reloption for views to enable RLS

2022-02-25 Thread Dean Rasheed
On Fri, 18 Feb 2022 at 14:57, Laurenz Albe  wrote:
>
> Here is a new version, with improved documentation and the option renamed
> to "check_permissions_owner".  I just prefer the shorter form.
>

Re-reading this thread, I think I preferred the name
"security_invoker". The main objection seemed to come from the
potential confusion with SECURITY INVOKER/DEFINER functions, but I
think that's really a different thing. As long as the documentation
for the default behaviour is clear (which I think it was), then it
should be easy to explain how a security invoker view behaves
differently. Also, there's value in using the same terminology as
other databases, because many users will already be familiar with the
feature from those databases.

Some other review comments:

1). This new comment:

+   
+Be aware that USAGE privileges on schemas containing
+the underlying base relations are not checked.
+   

is not entirely accurate. It's more accurate to say that a user
creating or replacing a view must have CREATE privileges on the schema
containing the view and USAGE privileges on any schemas referred to in
the view query, whereas a user using the view only needs USAGE
privileges on the schema containing the view.

(Note that, for the view creator, USAGE is required on any schema
referred to in the query -- e.g., schemas containing functions as well
as base relations.)
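
A concrete illustration of that distinction, with made-up names:

CREATE SCHEMA data;
CREATE TABLE data.t (a int);

-- The creator needs CREATE on schema "public" and USAGE on schema "data"
CREATE VIEW public.v AS SELECT a FROM data.t;

-- A user querying the view needs SELECT on it and USAGE on "public" only;
-- no USAGE on "data" is required
GRANT SELECT ON public.v TO some_user;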

2). The patch is adding a new field to RangeTblEntry which seems to be
unnecessary -- it's set, and copied around, but never read, so it
should just be removed.

3). Looking at this change:

-setRuleCheckAsUser((Node *) rule->actions, relation->rd_rel->relowner);
-setRuleCheckAsUser(rule->qual, relation->rd_rel->relowner);
+if (!(relation->rd_rel->relkind == RELKIND_VIEW
+  && !RelationSubqueryCheckPermsOwner(relation)))
+{
+setRuleCheckAsUser((Node *) rule->actions, relation->rd_rel->relowner);
+setRuleCheckAsUser(rule->qual, relation->rd_rel->relowner);
+}

I think it should call setRuleCheckAsUser() in all cases. It might be
true that the rule fetched has checkAsUser set to InvalidOid
throughout its action and quals, but it seems unwise to rely on that
-- better to code defensively and explicitly set it in all cases.

4). In the same code block, I think the new behaviour should be
applied to SELECT rules only. The view may have other non-SELECT rules
(just as a table may have non-SELECT rules), created using CREATE
RULE, but their actions are independent of the view definition.
Currently their permissions are checked as the view/table owner, and
if anyone wanted to change that, it should be an option on the rule,
not the view (just as triggers can be made security definer or
invoker, depending on how the trigger function is defined).

(Note: I'm not suggesting that anyone actually spend any time adding
such an option to rules. Given all the pitfalls associated with rules,
I think their use should be discouraged, and no development effort
should be expended enhancing them.)

5). In the same function, the block of code that fetches rules and
triggers has been moved. I think it would be worth adding a comment to
explain why it's now important to extract the reloptions *before*
fetching the relation's rules and triggers.

6). The second set of tests added to rowsecurity.sql seem to have
nothing to do with RLS, and probably belong in updatable_views.sql,
and I think it would be worth adding a few more tests for things like
views on top of views.

Regards,
Dean




Re: Some optimisations for numeric division

2022-02-27 Thread Dean Rasheed
On Fri, 25 Feb 2022 at 18:30, Tom Lane  wrote:
>
> Dean Rasheed  writes:
> > And another update following feedback from the cfbot.
>
> This patchset LGTM.  On my machine there doesn't seem to be any
> measurable performance change for the numeric regression test,
> but numeric_big gets about 15% faster.
>

Yes, that matches my observations. Thanks for reviewing and testing.

> The only additional thought I have, based on your comments about
> 0001, is that maybe in mul_var we should declare var1digit as
> being NumericDigit not int.  I tried that here and didn't see
> any material change in the generated assembly code (using gcc
> 8.5.0), so you're right that modern gcc doesn't need any help
> there; but I wonder if it could help on other compiler versions.
>

Yes, that makes sense. Done that way.

Regards,
Dean




Re: [PATCH] Add reloption for views to enable RLS

2022-03-02 Thread Dean Rasheed
On Tue, 1 Mar 2022 at 16:40, Christoph Heiss
 wrote:
>
> That is also the main reason I preferred naming it "security_invoker" -
> it is consistent with other databases and eases transition from such
> systems.
>
> I kept "check_permissions_owner" for now. Constantly changing it around
> with each iteration doesn't really bring any value IMHO, I'd rather have
> a final consensus on how to name the option and *then* change it for good.
>

Yes indeed, it's annoying to keep changing the name between patch
versions, so let's try to get a consensus now.

For my part, I find myself more and more convinced that
"security_invoker" is the right name, because it matches the
terminology used for functions, and in other database systems. I think
the parallels between security invoker functions and security invoker
views are quite strong.

There are a couple of additional considerations that lend weight to
that choice of name, though not uniquely to it:

1). There is a slight advantage to having an option that defaults to
false/off, like the existing "security_barrier" option -- it allows a
shorthand to turn the option on, because the system automatically
turns "WITH (security_barrier)" into "WITH (security_barrier=true)".

2). Grammatically, a name like this works better, because it serves
both as the name of the boolean option, and as an adjective that can
be used to describe and name the feature -- as in "security barrier
views are cool" -- making it easier to talk about the feature.

"check_permissions_owner=false" doesn't work as well in either regard,
and just feels much more clumsy.
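
For example, with that name the option reads naturally in DDL, and the
shorthand from (1) works as expected:

CREATE TABLE t (a int);
CREATE VIEW v WITH (security_invoker) AS    -- same as security_invoker=true
  SELECT a FROM t;
ALTER VIEW v SET (security_invoker = false);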

When we come to write the release notes for this feature, saying that
this version of PG now supports security invoker views is going to
mean a lot more to people who already use that feature in other
databases.

What are other people's opinions?

Regards,
Dean




Re: [PATCH] Add reloption for views to enable RLS

2022-03-18 Thread Dean Rasheed
On Mon, 14 Mar 2022 at 16:16, Laurenz Albe  wrote:
>
> The patch is fine from my point of view.
>
> It passes "make check-world".
>
> I'll mark it as "ready for committer".
>

Cool, thanks. I think this will make a useful addition to PG15.

I have been hacking on it a bit, and attached is an updated version.
Aside from some general copy editing, the most notable changes are:

In the updatable_views tests, I have moved the new tests to
immediately after the existing permission checking tests, which seems
like a more logical place to put them, and modified them to use the
same style as those existing tests. IMO, this test style makes the
task of writing tests simpler, since the expected output is a little
more obvious.

Similarly in the rowsecurity tests, I have moved the new tests to
immediately after the existing tests for RLS policies on tables
accessed via views, and added a few new tests in the same style,
including verifying permission checks on relations in subqueries in
RLS policies, when the table is accessed via a view.

I wasn't happy with the overall level of test coverage for this new
feature, so I have expanded the tests quite a bit. This includes tests
for a bug in rewriteTargetView() -- it wasn't consistently handling
the case of an update involving an ordinary view on top of a security
invoker view.

I have added explicit documentation for the fact that a security
invoker view always does permission checks as the current user, even
if it is accessed from a non-security invoker view, since that was the
cause of some discussion on this thread.

I've also added some more detailed documentation describing how all
this affects RLS, since that's likely to be a common use case.

I've done a fairly extensive doc search, and I *think* I've identified
all the other places that needed updating.

One additional thing that had been missed was that the LOCK command
can be used to lock views, which includes locking all underlying base
relations, after checking permissions as the view owner. The
logical/consistent thing to do for security invoker views is to do the
permission checks as the invoking user, so I've done that.

Barring any other comments or objections, I'll push this in a couple
of days or so, after a bit more proof-reading.

Regards,
Dean
diff --git a/doc/src/sgml/ref/alter_view.sgml b/doc/src/sgml/ref/alter_view.sgml
new file mode 100644
index 98c312c..8bdc90a
--- a/doc/src/sgml/ref/alter_view.sgml
+++ b/doc/src/sgml/ref/alter_view.sgml
@@ -156,7 +156,17 @@ ALTER VIEW [ IF EXISTS ] 
  
   Changes the security-barrier property of the view.  The value must
-  be Boolean value, such as true
+  be a Boolean value, such as true
+  or false.
+ 
+
+   
+   
+security_invoker (boolean)
+
+ 
+  Changes the security-invoker property of the view.  The value must
+  be a Boolean value, such as true
   or false.
  
 
diff --git a/doc/src/sgml/ref/create_policy.sgml b/doc/src/sgml/ref/create_policy.sgml
new file mode 100644
index 9f53206..f898b7a
--- a/doc/src/sgml/ref/create_policy.sgml
+++ b/doc/src/sgml/ref/create_policy.sgml
@@ -608,7 +608,9 @@ AND
This does not change how views
work, however.  As with normal queries and views, permission checks and
policies for the tables which are referenced by a view will use the view
-   owner's rights and any policies which apply to the view owner.
+   owner's rights and any policies which apply to the view owner, except if
+   the view is defined using the security_invoker option
+   (see CREATE VIEW).
   
 
   
diff --git a/doc/src/sgml/ref/create_view.sgml b/doc/src/sgml/ref/create_view.sgml
new file mode 100644
index bf03287..72a92bb
--- a/doc/src/sgml/ref/create_view.sgml
+++ b/doc/src/sgml/ref/create_view.sgml
@@ -137,8 +137,6 @@ CREATE VIEW [ schemalocal or
   cascaded, and is equivalent to specifying
   WITH [ CASCADED | LOCAL ] CHECK OPTION (see below).
-  This option can be changed on existing views using ALTER VIEW.
  
 

@@ -152,7 +150,22 @@ CREATE VIEW [ schema
 

-  
+
+   
+security_invoker (boolean)
+
+ 
+  This option causes the underlying base relations to be checked
+  against the privileges of the user of the view rather than the view
+  owner.  See the notes below for full details.
+ 
+
+   
+  
+
+  All of the above options can be changed on existing views using ALTER VIEW.
+ 
 

 
@@ -265,18 +278,74 @@ CREATE VIEW vista AS SELECT text 'Hello

 

-Access to tables referenced in the view is determined by permissions of
-the view owner.  In some cases, this can be used to provide secure but
-restricted access to the underlying

Re: [PATCH] Add reloption for views to enable RLS

2022-03-22 Thread Dean Rasheed
On Mon, 21 Mar 2022 at 09:47, Laurenz Albe  wrote:
>
> Thanks for your diligent work on this, and the patch looks good to me.

Thanks for looking again. Pushed.

Regards,
Dean




Re: Additional improvements to extended statistics

2020-12-01 Thread Dean Rasheed
On Sun, 29 Nov 2020 at 21:02, Tomas Vondra
 wrote:
>
> Those are fairly minor issues. I don't have any deeper objections, and
> it seems committable. Do you plan to do that sometime soon?
>

OK, I've updated the patch status in the CF app, and I should be able
to push it in the next day or so.

Regards,
Dean




Re: Additional improvements to extended statistics

2020-12-02 Thread Dean Rasheed
 makes that code more future-proof. I
also think it's neater because now the signature of
mcv_clause_selectivity_or() is more natural --- it's primary return
value is now the clause's MCV selectivity, as suggested by the
function's name, rather than the overlap selectivity that the previous
version was returning. Also, after your "Var Op Var" patch is applied,
I think it would be possible to construct queries that would benefit
from this, so it would be good to get that committed too.

Barring any further comments, I'll push this sometime soon.

Regards,
Dean
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
new file mode 100644
index 37a735b..b88b29e
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -44,6 +44,12 @@ static void addRangeClause(RangeQueryCla
 		   bool varonleft, bool isLTsel, Selectivity s2);
 static RelOptInfo *find_single_rel_for_clauses(PlannerInfo *root,
 			   List *clauses);
+static Selectivity clauselist_selectivity_or(PlannerInfo *root,
+			 List *clauses,
+			 int varRelid,
+			 JoinType jointype,
+			 SpecialJoinInfo *sjinfo,
+			 bool use_extended_stats);
 
 /
  *		ROUTINES TO COMPUTE SELECTIVITIES
@@ -61,64 +67,10 @@ static RelOptInfo *find_single_rel_for_c
  *
  * The basic approach is to apply extended statistics first, on as many
  * clauses as possible, in order to capture cross-column dependencies etc.
- * The remaining clauses are then estimated using regular statistics tracked
- * for individual columns.  This is done by simply passing the clauses to
- * clauselist_selectivity_simple.
- */
-Selectivity
-clauselist_selectivity(PlannerInfo *root,
-	   List *clauses,
-	   int varRelid,
-	   JoinType jointype,
-	   SpecialJoinInfo *sjinfo)
-{
-	Selectivity s1 = 1.0;
-	RelOptInfo *rel;
-	Bitmapset  *estimatedclauses = NULL;
-
-	/*
-	 * Determine if these clauses reference a single relation.  If so, and if
-	 * it has extended statistics, try to apply those.
-	 */
-	rel = find_single_rel_for_clauses(root, clauses);
-	if (rel && rel->rtekind == RTE_RELATION && rel->statlist != NIL)
-	{
-		/*
-		 * Estimate as many clauses as possible using extended statistics.
-		 *
-		 * 'estimatedclauses' tracks the 0-based list position index of
-		 * clauses that we've estimated using extended statistics, and that
-		 * should be ignored.
-		 */
-		s1 *= statext_clauselist_selectivity(root, clauses, varRelid,
-			 jointype, sjinfo, rel,
-			 &estimatedclauses);
-	}
-
-	/*
-	 * Apply normal selectivity estimates for the remaining clauses, passing
-	 * 'estimatedclauses' so that it skips already estimated ones.
-	 */
-	return s1 * clauselist_selectivity_simple(root, clauses, varRelid,
-			  jointype, sjinfo,
-			  estimatedclauses);
-}
-
-/*
- * clauselist_selectivity_simple -
- *	  Compute the selectivity of an implicitly-ANDed list of boolean
- *	  expression clauses.  The list can be empty, in which case 1.0
- *	  must be returned.  List elements may be either RestrictInfos
- *	  or bare expression clauses --- the former is preferred since
- *	  it allows caching of results.  The estimatedclauses bitmap tracks
- *	  clauses that have already been estimated by other means.
- *
- * See clause_selectivity() for the meaning of the additional parameters.
- *
- * Our basic approach is to take the product of the selectivities of the
- * subclauses.  However, that's only right if the subclauses have independent
- * probabilities, and in reality they are often NOT independent.  So,
- * we want to be smarter where we can.
+ * The remaining clauses are then estimated by taking the product of their
+ * selectivities, but that's only right if they have independent
+ * probabilities, and in reality they are often NOT independent even if they
+ * only refer to a single column.  So, we want to be smarter where we can.
  *
  * We also recognize "range queries", such as "x > 34 AND x < 42".  Clauses
  * are recognized as possible range query components if they are restriction
@@ -147,28 +99,68 @@ clauselist_selectivity(PlannerInfo *root
  * selectivity functions; perhaps some day we can generalize the approach.
  */
 Selectivity
-clauselist_selectivity_simple(PlannerInfo *root,
-			  List *clauses,
-			  int varRelid,
-			  JoinType jointype,
-			  SpecialJoinInfo *sjinfo,
-			  Bitmapset *estimatedclauses)
+clauselist_selectivity(PlannerInfo *root,
+	   List *clauses,
+	   int varRelid,
+	   JoinType jointype,
+	   SpecialJoinInfo *sjinfo)
+{
+	return clauselist_selectivity_ext(root, clauses, varRelid,
+	  jointype, sjinfo, true);
+}
+
+/*
+ * clauselist_selectivity_ext -
+ *	  Extended ver

Re: Additional improvements to extended statistics

2020-12-03 Thread Dean Rasheed
On Wed, 2 Dec 2020 at 16:34, Tomas Vondra  wrote:
>
> On 12/2/20 4:51 PM, Dean Rasheed wrote:
> >
> > Barring any further comments, I'll push this sometime soon.
>
> +1
>

Pushed.

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2020-12-07 Thread Dean Rasheed
On Thu, 3 Dec 2020 at 15:23, Tomas Vondra  wrote:
>
> Attached is a patch series rebased on top of 25a9e54d2d.

After reading this thread and [1], I think I prefer the name
"standard" rather than "expressions", because it is meant to describe
the kind of statistics being built rather than what they apply to, but
maybe that name doesn't actually need to be exposed to the end user:

Looking at the current behaviour, there are a couple of things that
seem a little odd, even though they are understandable. For example,
the fact that

  CREATE STATISTICS s (expressions) ON (expr), col FROM tbl;

fails, but

  CREATE STATISTICS s (expressions, mcv) ON (expr), col FROM tbl;

succeeds and creates both "expressions" and "mcv" statistics. Also, the syntax

  CREATE STATISTICS s (expressions) ON (expr1), (expr2) FROM tbl;

tends to suggest that it's going to create statistics on the pair of
expressions, describing their correlation, when actually it builds 2
independent statistics. Also, this error text isn't entirely accurate:

  CREATE STATISTICS s ON col FROM tbl;
  ERROR:  extended statistics require at least 2 columns

because extended statistics don't always require 2 columns, they can
also just have an expression, or multiple expressions and 0 or 1
columns.

I think a lot of this stems from treating "expressions" in the same
way as the other (multi-column) stats kinds, and it might actually be
neater to have separate documented syntaxes for single- and
multi-column statistics:

  CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
ON (expression)
FROM table_name

  CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
[ ( statistics_kind [, ... ] ) ]
ON { column_name | (expression) } , { column_name | (expression) } [, ...]
FROM table_name

The first syntax would create single-column stats, and wouldn't accept
a statistics_kind argument, because there is only one kind of
single-column statistic. Maybe that might change in the future, but if
so, it's likely that the kinds of single-column stats will be
different from the kinds of multi-column stats.

In the second syntax, the only accepted kinds would be the current
multi-column stats kinds (ndistinct, dependencies, and mcv), and it
would always build stats describing the correlations between the
columns listed. It would continue to build standard/expression stats
on any expressions in the list, but that's more of an implementation
detail.

It would no longer be possible to do "CREATE STATISTICS s
(expressions) ON (expr1), (expr2) FROM tbl". Instead, you'd have to
issue 2 separate "CREATE STATISTICS" commands, but that seems more
logical, because they're independent stats.
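
Under that syntax, and using a hypothetical table, it would look something
like this:

  CREATE TABLE tbl (a int, b int);

  -- instead of: CREATE STATISTICS s (expressions) ON (a + b), (a * b) FROM tbl;
  CREATE STATISTICS s1 ON (a + b) FROM tbl;
  CREATE STATISTICS s2 ON (a * b) FROM tbl;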

The parsing code might not change much, but some of the errors would
be different. For example, the errors "building only extended
expression statistics on simple columns not allowed" and "extended
expression statistics require at least one expression" would go away,
and the error "extended statistics require at least 2 columns" might
become more specific, depending on the stats kind.

Regards,
Dean

[1] 
https://www.postgresql.org/message-id/flat/1009.1579038764%40sss.pgh.pa.us#8624792a20ae595683b574f5933dae53




Re: PoC/WIP: Extended statistics on expressions

2020-12-07 Thread Dean Rasheed
On Mon, 7 Dec 2020 at 14:15, Tomas Vondra  wrote:
>
> On 12/7/20 10:56 AM, Dean Rasheed wrote:
> > it might actually be
> > neater to have separate documented syntaxes for single- and
> > multi-column statistics:
> >
> >   CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
> > ON (expression)
> > FROM table_name
> >
> >   CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
> > [ ( statistics_kind [, ... ] ) ]
> > ON { column_name | (expression) } , { column_name | (expression) } [, 
> > ...]
> > FROM table_name
>
> I think it makes sense in general. I see two issues with this approach,
> though:
>
> * By adding expression/standard stats for individual statistics, it
> makes the list of statistics longer - I wonder if this might have
> measurable impact on lookups in this list.
>
> * I'm not sure it's a good idea that the second syntax would always
> build the per-expression stats. Firstly, it seems a bit strange that it
> behaves differently than the other kinds. Secondly, I wonder if there
> are cases where it'd be desirable to explicitly disable building these
> per-expression stats. For example, what if we have multiple extended
> statistics objects, overlapping on a couple expressions. It seems
> pointless to build the stats for all of them.
>

Hmm, I'm not sure it would really be a good idea to build MCV stats on
expressions without also building the standard stats for those
expressions, otherwise the assumptions that
mcv_combine_selectivities() makes about simple_sel and mcv_basesel
wouldn't really hold. But then, if multiple MCV stats shared the same
expression, it would be quite wasteful to build standard stats on the
expression more than once.

It feels like it should build a single extended stats object for each
unique expression, with appropriate dependencies for any MCV stats
that used those expressions, but I'm not sure how complex that would
be. Dropping the last MCV stat object using a standard expression stat
object might reasonably drop the expression stats ... except if they
were explicitly created by the user, independently of any MCV stats.
That could get quite messy.

Regards,
Dean




Re: Additional improvements to extended statistics

2020-12-07 Thread Dean Rasheed
On Wed, 2 Dec 2020 at 15:51, Dean Rasheed  wrote:
>
> The sort of queries I had in mind were things like this:
>
>   WHERE (a = 1 AND b = 1) OR (a = 2 AND b = 2)
>
> However, the new code doesn't apply the extended stats directly using
> clauselist_selectivity_or() for this kind of query because there are
> no RestrictInfos for the nested AND clauses, so
> find_single_rel_for_clauses() (and similarly
> statext_is_compatible_clause()) regards those clauses as not
> compatible with extended stats. So what ends up happening is that
> extended stats are used only when we descend down to the two AND
> clauses, and their results are combined using the original "s1 + s2 -
> s1 * s2" formula. That actually works OK in this case, because there
> is no overlap between the two AND clauses, but it wouldn't work so
> well if there was.
>
> I'm pretty sure that can be fixed by teaching
> find_single_rel_for_clauses() and statext_is_compatible_clause() to
> handle BoolExpr clauses, looking for RestrictInfos underneath them,
> but I think that should be left for a follow-in patch.

Attached is a patch doing that, which improves a couple of the
estimates for queries with AND clauses underneath OR clauses, as
expected.

This also revealed a minor bug in the way that the estimates for
multiple statistics objects were combined while processing an OR
clause -- the estimates for the overlaps between clauses only apply
for the current statistics object, so we really have to combine the
estimates for each set of clauses for each statistics object as if
they were independent of one another.

0001 fixes the multiple-extended-stats issue for OR clauses, and 0002
improves the estimates for sub-AND clauses underneath OR clauses.
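
For anyone wanting to see the effect, a self-contained example of the sort
of query shape involved (names are illustrative):

CREATE TABLE t (a int, b int);
INSERT INTO t SELECT i % 10, i % 10 FROM generate_series(1, 10000) i;
CREATE STATISTICS t_ab (mcv) ON a, b FROM t;
ANALYZE t;

EXPLAIN SELECT * FROM t WHERE (a = 1 AND b = 1) OR (a = 2 AND b = 2);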

These are both quite small patches, that hopefully won't interfere
with any of the other extended stats patches.

Regards,
Dean


0001-Improve-estimation-of-OR-clauses-using-multiple-exte.patch
Description: Binary data


0002-Improve-estimation-of-ANDs-under-ORs-using-extended-.patch
Description: Binary data


Re: PoC/WIP: Extended statistics on expressions

2020-12-11 Thread Dean Rasheed
On Tue, 8 Dec 2020 at 12:44, Tomas Vondra  wrote:
>
> Possibly. But I don't think it's worth the extra complexity. I don't
> expect people to have a lot of overlapping stats, so the amount of
> wasted space and CPU time is expected to be fairly limited.
>
> So I don't think it's worth spending too much time on this now. Let's
> just do what you proposed, and revisit this later if needed.
>

Yes, I think that's a reasonable approach to take. As long as the
documentation makes it clear that building MCV stats also causes
standard expression stats to be built on any expressions included in
the list, then the user will know and can avoid duplication most of
the time. I don't think there's any need for code to try to prevent
that -- just as we don't bother with code to prevent a user building
multiple indexes on the same column.

The only case where duplication won't be avoidable is where there are
multiple MCV stats sharing the same expression, but that's probably
quite unlikely in practice, and it seems acceptable to leave improving
that as a possible future optimisation.

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2021-01-04 Thread Dean Rasheed
On Fri, 11 Dec 2020 at 20:17, Tomas Vondra
 wrote:
>
> OK. Attached is an updated version, reworking it this way.

Cool. I think this is an exciting development, so I hope it makes it
into the next release.

I have started looking at it. So far I have only looked at the
catalog, parser and client changes, but I thought it's worth posting
my comments so far.

> I tried tweaking the grammar to differentiate these two syntax variants,
> but that led to shift/reduce conflicts with the existing ones. I tried
> fixing that, but I ended up doing that in CreateStatistics().

Yeah, that makes sense. I wasn't expecting the grammar to change.

> The other thing is that we probably can't tie this to just MCV, because
> functional dependencies need the per-expression stats too. So I simply
> build expression stats whenever there's at least one expression.

Makes sense.

> I also decided to keep the "expressions" statistics kind - it's not
> allowed to specify it in CREATE STATISTICS, but it's useful internally
> as it allows deciding whether to build the stats in a single place.
> Otherwise we'd need to do that every time we build the statistics, etc.

Yes, I thought that would be the easiest way to do it. Essentially the
"expressions" stats kind is an internal implementation detail, hidden
from the user, because it's built automatically when required, so you
don't need to (and can't) explicitly ask for it. This new behaviour
seems much more logical to me.

> I added a brief explanation to the sgml docs, not sure if that's good
> enough - maybe it needs more details.

Yes, I think that could use a little tidying up, but I haven't looked
too closely yet.


Some other comments:

* I'm not sure I understand the need for 0001. Wasn't there an earlier
version of this patch that just did it by re-populating the type
array, but which still had it as an array rather than turning it into
a list? Making it a list falsifies some of the comments and
function/variable name choices in that file.

* There's a comment typo in catalog/Makefile -- "are are reputedly
other...", should be "there are reputedly other...".

* Looking at the pg_stats_ext view, I think perhaps expressions stats
should be omitted entirely from that view, since it doesn't show any
useful information about them. So it could remove "e" from the "kinds"
array, and exclude rows whose only kind is "e", since such rows have
no interesting data in them. Essentially, the new view
pg_stats_ext_exprs makes having any expression stats in pg_stats_ext
redundant. Hiding this data in pg_stats_ext would also be consistent
with making the "expressions" stats kind hidden from the user.

* In gram.y, it wasn't quite obvious why you converted the column list
for CREATE STATISTICS from an expr_list to a stats_params list. I
figured it out, and it makes sense, but I think it could use a
comment, perhaps something along the lines of the one for index_elem,
e.g.:

/*
 * Statistics attributes can be either simple column references, or arbitrary
 * expressions in parens.  For compatibility with index attributes permitted
 * in CREATE INDEX, we allow an expression that's just a function call to be
 * written without parens.
 */

* In parse_func.c and parse_agg.c, there are a few new error strings
that use the abbreviation "stats expressions", whereas most other
errors refer to "statistics expressions". For consistency, I think
they should all be the latter.

* In generateClonedExtStatsStmt(), I think the "expressions" stats
kind needs to be explicitly excluded, otherwise CREATE TABLE (LIKE
...) fails if the source table has expression stats.

* CreateStatistics() uses ShareUpdateExclusiveLock, but in
tcop/utility.c the relation is opened with a ShareLock. ISTM that the
latter lock mode should be made to match CreateStatistics().

* Why does the new code in tcop/utility.c not use
RangeVarGetRelidExtended together with RangeVarCallbackOwnsRelation?
That seems preferable to doing the ACL check in CreateStatistics().
For one thing, as it stands, it allows the lock to be taken even if
the user doesn't own the table. Is it intentional that the current
code allows extended stats to be created on system catalogs? That
would be one thing that using RangeVarCallbackOwnsRelation would
change, but I can't see a reason to allow it.

* In src/bin/psql/describe.c, I think the \d output should also
exclude the "expressions" stats kind and just list the other kinds (or
have no kinds list at all, if there are no other kinds), to make it
consistent with the CREATE STATISTICS syntax.

* The pg_dump output for a stats object whose only kind is
"expressions" is broken -- it includes a spurious "()" for the kinds
list.

That's it for now. I'll look at the optimiser changes next, and try to
post more comments later this week.

Regards,
Dean




Bug in numeric_power() if exponent is INT_MIN

2021-01-04 Thread Dean Rasheed
(Amusingly I only found this after discovering that Windows Calculator
has a similar bug which causes it to crash if you try to raise a
number to the power INT_MIN.)

On my machine, numeric_power() loses all precision if the exponent is
INT_MIN, though the actual failure mode might well be
platform-dependent:

SELECT n, 1.0123^n AS pow
  FROM (VALUES (-2147483647), (-2147483648), (-2147483649)) AS v(n);
  n  |pow
-+
 -2147483647 | 0.7678656557347558
 -2147483648 | 1.
 -2147483649 | 0.7678656555458609
(3 rows)

The issue is in this line from power_var_int():

sig_digits += (int) log(Abs(exp)) + 8;

because "exp" is a signed int, so Abs(exp) leaves INT_MIN unchanged.
The most straightforward fix is to use fabs() instead, so that "exp"
is cast to double *before* the absolute value is taken, as in the
attached patch.
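
The underlying C issue, in standalone form (Abs() here is spelled out
essentially as c.h defines it):

    #include <limits.h>
    #include <math.h>
    #include <stdio.h>

    #define Abs(x) ((x) >= 0 ? (x) : -(x))

    int
    main(void)
    {
        int     exp = INT_MIN;

        /*
         * -INT_MIN overflows int; on typical two's-complement machines it
         * wraps back to INT_MIN, so log() gets a negative argument and
         * returns NaN, wrecking the sig_digits calculation.
         */
        printf("%d\n", Abs(exp));             /* typically -2147483648 */
        printf("%.0f\n", fabs((double) exp)); /* 2147483648 */
        return 0;
    }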

This code was added in 7d9a4737c2, which first appeared in PG 9.6, so
barring objections, I'll push and back-patch this fix that far.

Regards,
Dean
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
new file mode 100644
index 68d8791..7cf5656
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -10290,7 +10290,7 @@ power_var_int(const NumericVar *base, in
 	 * to around log10(abs(exp)) digits, so work with this many extra digits
 	 * of precision (plus a few more for good measure).
 	 */
-	sig_digits += (int) log(Abs(exp)) + 8;
+	sig_digits += (int) log(fabs(exp)) + 8;
 
 	/*
 	 * Now we can proceed with the multiplications.
diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out
new file mode 100644
index 56e7799..30a5642
--- a/src/test/regress/expected/numeric.out
+++ b/src/test/regress/expected/numeric.out
@@ -2321,6 +2321,12 @@ select 0.12 ^ (-20);
  2608405330458882702.5529619561355838
 (1 row)
 
+select 1.0123 ^ (-2147483648);
+  ?column?  
+
+ 0.7678656556403084
+(1 row)
+
 -- cases that used to error out
 select 0.12 ^ (-25);
  ?column?  
diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql
new file mode 100644
index f19793a..db812c8
--- a/src/test/regress/sql/numeric.sql
+++ b/src/test/regress/sql/numeric.sql
@@ -1089,6 +1089,7 @@ select 3.789 ^ 21;
 select 3.789 ^ 35;
 select 1.2 ^ 345;
 select 0.12 ^ (-20);
+select 1.0123 ^ (-2147483648);
 
 -- cases that used to error out
 select 0.12 ^ (-25);


Re: PoC/WIP: Extended statistics on expressions

2021-01-05 Thread Dean Rasheed
On Tue, 5 Jan 2021 at 00:45, Tomas Vondra  wrote:
>
> On 1/4/21 4:34 PM, Dean Rasheed wrote:
> >
> > * In src/bin/psql/describe.c, I think the \d output should also
> > exclude the "expressions" stats kind and just list the other kinds (or
> > have no kinds list at all, if there are no other kinds), to make it
> > consistent with the CREATE STATISTICS syntax.
>
> Not sure I understand. Why would this make it consistent with CREATE
> STATISTICS? Can you elaborate?
>

This isn't absolutely essential, but I think it would be neater. For
example, if I have a table with stats like this:

CREATE TABLE foo(a int, b int);
CREATE STATISTICS foo_s_ab (mcv) ON a,b FROM foo;

then the \d output is as follows:

\d foo
Table "public.foo"
 Column |  Type   | Collation | Nullable | Default
+-+---+--+-
 a  | integer |   |  |
 b  | integer |   |  |
Statistics objects:
"public"."foo_s_ab" (mcv) ON a, b FROM foo

and the stats line matches the DDL used to create the stats. It could,
for example, be copy-pasted and tweaked to create similar stats on
another table, but even if that's not very likely, it's neat that it
reflects how the stats were created.

OTOH, if there are expressions in the list, it produces something like this:

Table "public.foo"
 Column |  Type   | Collation | Nullable | Default
+-+---+--+-
 a  | integer |   |  |
 b  | integer |   |  |
Statistics objects:
"public"."foo_s_ab" (mcv, expressions) ON a, b, ((a * b)) FROM foo

which no longer matches the DDL used, and isn't part of an accepted
syntax, so seems a bit inconsistent.

In general, if we're making the "expressions" kind an internal
implementation detail that just gets built automatically when needed,
then I think we should hide it from this sort of output, so the list
of kinds matches the list that the user used when the stats were
created.

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2021-01-06 Thread Dean Rasheed
Looking over the statscmds.c changes, there are a few XXX's and
FIXME's that need resolving, and I had a couple of other minor
comments:

+   /*
+* An expression using mutable functions is probably wrong,
+* since if you aren't going to get the same result for the
+* same data every time, it's not clear what the index entries
+* mean at all.
+*/
+   if (CheckMutability((Expr *) expr))
+   ereport(ERROR,

That comment is presumably copied from the index code, so needs updating.


+   /*
+* Disallow data types without a less-than operator
+*
+* XXX Maybe allow this, but only for EXPRESSIONS stats and
+* prevent building e.g. MCV etc.
+*/
+   atttype = exprType(expr);
+   type = lookup_type_cache(atttype, TYPECACHE_LT_OPR);
+   if (type->lt_opr == InvalidOid)
+   ereport(ERROR,
+   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+errmsg("expression cannot be used in
statistics because its type %s has no default btree operator class",
+   format_type_be(atttype;

As the comment suggests, it's probably worth skipping this check if
numcols is 1 so that single-column stats can be built for more types
of expressions. (I'm assuming that it's basically no more effort to
make that work, so I think it falls into the might-as-well-do-it
category.)


+   /*
+* Parse the statistics kinds.  Firstly, check that this is not the
+* variant building statistics for a single expression, in which case
+* we don't allow specifying any statistis kinds.  The simple variant
+* only has one expression, and does not allow statistics kinds.
+*/
+   if ((list_length(stmt->exprs) == 1) && (list_length(stxexprs) == 1))
+   {

Typo: "statistis"
Nit-picking, this test could just be:

+   if ((numcols == 1) && (list_length(stxexprs) == 1))

which IMO is a little more readable, and matches a similar test a
little further down.


+   /*
+* If there are no simply-referenced columns, give the statistics an
+* auto dependency on the whole table.  In most cases, this will
+* be redundant, but it might not be if the statistics expressions
+* contain no Vars (which might seem strange but possible).
+*
+* XXX This is copied from index_create, not sure if it's applicable
+* to extended statistics too.
+*/

Seems right to me.


+   /*
+* FIXME use 'expr' for expressions, which have empty column names.
+* For indexes this is handled in ChooseIndexColumnNames, but we
+* have no such function for stats.
+*/
+   if (!name)
+   name = "expr";

In theory, this function could be made to duplicate the logic used for
indexes, creating names like "expr1", "expr2", etc. To be honest
though, I don't think it's worth the effort. The code for indexes
isn't really bulletproof anyway -- for example there might be a column
called "expr" that is or isn't included in the index, which would make
the generated name ambiguous. And in any case, a name like
"tbl_cola_expr_colb_expr1_colc_stat" isn't really any more useful than
"tbl_cola_expr_colb_expr_colc_stat". So I'd be tempted to leave that
code as it is.


+
+/*
+ * CheckMutability
+ * Test whether given expression is mutable
+ *
+ * FIXME copied from indexcmds.c, maybe use some shared function?
+ */
+static bool
+CheckMutability(Expr *expr)
+{

As the comment says, it's quite messy duplicating this code, but I'm
wondering whether it would be OK to just skip this check entirely. I
think someone else suggested that elsewhere, and I think it might not
be a bad idea.

For indexes, it could easily lead to wrong query results, but for
stats the most likely problem is that the stats would get out of date
(which they tend to do all by themselves anyway) and need rebuilding.

If you ignore intentionally crazy examples (which are still possible
even with this check), then there are probably many legitimate cases
where someone might want to use non-immutable functions in stats, and
this check just forces them to create an immutable wrapper function.
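
For example (hypothetical table and columns, purely for illustration),
something like this seems a reasonable thing to want, but date_trunc() on a
timestamptz is only stable, not immutable, so the check rejects it:

CREATE STATISTICS s ON (date_trunc('day', ts)), status FROM events;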

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2021-01-07 Thread Dean Rasheed
Starting to look at the planner code, I found an oversight in the way
expression stats are read at the start of planning -- it is necessary
to call ChangeVarNodes() on any expressions if the relid isn't 1,
otherwise the stats expressions may contain Var nodes referring to the
wrong relation. Possibly the easiest place to do that would be in
get_relation_statistics(), if rel->relid != 1.

Here's a simple test case:

CREATE TABLE foo AS SELECT x FROM generate_series(1,10) g(x);
CREATE STATISTICS foo_s ON (x%10) FROM foo;
ANALYSE foo;

EXPLAIN SELECT * FROM foo WHERE x%10 = 0;
EXPLAIN SELECT * FROM (SELECT 1) t, foo WHERE x%10 = 0;

(in the second query, the stats don't get applied).

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2021-03-05 Thread Dean Rasheed
On Thu, 4 Mar 2021 at 22:16, Tomas Vondra  wrote:
>
> Attached is a slightly improved version of the patch series, addressing
> most of the issues raised in the previous message.

Cool. Sorry for the delay replying.

> 0003-Extended-statistics-on-expressions-20210304.patch
>
> Mostly unchanged. The one improvement is removing some duplicate code
> in mcv.c.
>
> 0004-WIP-rework-tracking-of-expressions-20210304.patch
>
> This is mostly unchanged of the patch reworking how we assign artificial
> attnums to expressions (negative instead of (MaxHeapAttributeNumber+i)).

Looks good.

I see you undid the change to get_relation_statistics() in plancat.c,
which offset the attnums of plain attributes in the StatisticExtInfo
struct. I was going to suggest that as a simplification to the
previous 0004 patch. Related to that, is this comment in
dependencies_clauselist_selectivity():

/*
 * Count matching attributes - we have to undo two attnum offsets.
 * First, the dependency is offset using the number of expressions
 * for that statistics, and then (if it's a plain attribute) we
 * need to apply the same offset as above, by unique_exprs_cnt.
 */

which needs updating, since there is now just one attnum offset, not
two. Only the unique_exprs_cnt offset is relevant now.

Also, related to that change, I don't think that
stat_covers_attributes() is needed anymore. I think that the code that
calls it can now just be reverted back to using bms_is_subset(), since
that bitmapset holds plain attributes that aren't offset.

> 0005-WIP-unify-handling-of-attributes-and-expres-20210304.patch
>
> This reworks how we build statistics on attributes and expressions.
> Instead of treating attributes and expressions separately, this  allows
> handling them uniformly.
>
> Until now, the various "build" functions (for different statistics
> kinds) extracted attribute values from sampled tuples, but expressions
> were pre-calculated in a separate array. Firstly to save CPU time (not
> having to evaluate expensive expressions repeatedly) and to keep the
> different stats consistent (there might be volatile functions etc.).
>
> So the build functions had to look at the attnum, determine if it's
> attribute or expression, and in some cases it was tricky / easy to get
> wrong.
>
> This patch replaces this "split" view with a simple "consistent"
> representation merging values from attributes and expressions, and just
> passes that to the build functions. There's no need to check the attnum,
> and handle expressions in some special way, so the build functions are
> much simpler / easier to understand (at least I think so).
>
> The build data is represented by "StatsBuildData" struct - not sure if
> there's a better name.
>
> I'm mostly happy with how this turned out. I'm sure there's a bit more
> cleanup needed (e.g. the merging/remapping of dependencies needs some
> refactoring, I think) but overall this seems reasonable.

Agreed. That's a nice improvement.

I wonder if dependency_is_compatible_expression() can be merged with
dependency_is_compatible_clause() to reduce code duplication. It
probably also ought to be possible to support "Expr IN Array" there,
in a similar way to the other code in statext_is_compatible_clause().
Also, should this check rinfo->clause_relids against the passed-in
relid to rule out clauses referencing other relations, in the same way
that statext_is_compatible_clause() does?
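
To illustrate the "Expr IN Array" point, that would mean accepting clauses like

SELECT * FROM foo WHERE (a + b) IN (1, 2, 3);

for functional-dependency estimation, in the same way that
statext_is_compatible_clause() already accepts ScalarArrayOpExpr clauses on
plain columns.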

> I did some performance testing, I don't think there's any measurable
> performance degradation. I'm actually wondering if we need to transform
> the AttrNumber arrays into bitmaps in various places - maybe we should
> just do a plain linear search. We don't really expect many elements, as
> each statistics has 8 attnums at most. So maybe building the bitmapsets
> is a net loss? The one exception might be functional dependencies, where
> we can "merge" multiple statistics together. But even then it'd require
> many statistics objects to make a difference.

Possibly. There's a danger in trying to change too much at once
though. As it stands, I think it's fairly close to being committable,
with just a little more tidying up.

Regards,
Dean




Re: pgbench - add pseudo-random permutation function

2021-03-11 Thread Dean Rasheed
On Thu, 11 Mar 2021 at 00:58, Bruce Momjian  wrote:
>
> Maybe Dean Rasheed can help because of his math background --- CC'ing him.
>

Reading the thread I can see how such a function might be useful to
scatter non-uniformly random values.

The implementation looks plausible too, though it adds quite a large
amount of new code. The main thing that concerns me is justifying the
code. With this kind of thing, it's all too easy to overlook corner
cases and end up with trivial sequences in certain special cases. I'd
feel better about that if we were implementing a standard algorithm
with known pedigree.

Thinking about the use case for this, it seems that it's basically
designed to turn a set of non-uniform random numbers (produced by
random_exponential() et al.) into another set of non-uniform random
numbers, where the non-uniformity is scattered so that the more/less
common values aren't all clumped together.

I'm wondering if that's something that can't be done more simply by
passing the non-uniform random numbers through the uniform random
number generator to scatter them uniformly across some range -- e.g.,
given an integer n, return the n'th value from the sequence produced
by random(), starting from some initial seed -- i.e., implement
nth_random(lb, ub, seed, n). That would actually be pretty
straightforward to implement, executing in O(log(n)) time (see the
attached python example), though it wouldn't generate a permutation,
so it'd need a bit of thought to see if it met the requirements.

Regards,
Dean
import math

a = 0x5deece66d
c = 0xb
m = 0xffffffffffff

def erand48(seed):
    x = (seed[2] << 32) | (seed[1] << 16) | seed[0]

    x = (x * a + c) & m
    seed[0] = x & 0xffff
    seed[1] = (x >> 16) & 0xffff
    seed[2] = (x >> 32) & 0xffff

    return math.ldexp(x, -48)

def nth_erand48(seed, n):
    x = (seed[2] << 32) | (seed[1] << 16) | seed[0]

    # Binary exponentiation to compute a^n and the sum of the geometric
    # series gs = 1+a+...+a^(n-1)
    a_pow_n = 1
    gs = 0
    t = 1
    e = a
    while n > 0:
        if n & 1 == 1:
            a_pow_n = (a_pow_n * e) & m
            gs = (gs * e + t) & m
        t = ((e+1)*t) & m
        e = (e*e) & m
        n = n/2

    # n'th value in erand48 sequence
    x = (x * a_pow_n + gs * c) & m

    return math.ldexp(x, -48)

iseed = [1234, 5678, 9012]
seed = list(iseed)

for n in range(1,21):
    r = erand48(seed)
    print n, seed, r, nth_erand48(iseed, n)


Re: pgbench - add pseudo-random permutation function

2021-03-11 Thread Dean Rasheed
On Thu, 11 Mar 2021 at 19:06, David Bowen  wrote:
>
> The algorithm for generating a random permutation with a uniform distribution 
> across all permutations is easy:
> for (i=0; i < n-1; i++) {
>     swap a[n-i] with a[rand(n-i+1)]
> }
>
> where 0 <= rand(x) < x and a[i] is initially i (see Knuth, Section 3.4.2 
> Algorithm P)
>

True, but n can be very large, so that might use a lot of memory and
involve a lot of pre-processing.

Regards,
Dean




Re: Additional improvements to extended statistics

2020-11-12 Thread Dean Rasheed
On Thu, 12 Nov 2020 at 14:18, Tomas Vondra  wrote:
>
> Here is an improved WIP version of the patch series, modified to address
> the issue with repeatedly applying the extended statistics, as discussed
> with Dean in this thread. It's a bit rough and not committable, but I
> need some feedback so I'm posting it in this state.
>

Cool. I haven't forgotten that I promised to look at this.

> Dean, does this address the issue you had in mind? Can you come up with
> an example of that issue in the form of a regression test or something?
>

I'm quite busy with my day job at the moment, but I expect to have
time to review this next week.

Regards,
Dean




Re: Additional improvements to extended statistics

2020-11-17 Thread Dean Rasheed
On Thu, 12 Nov 2020 at 14:18, Tomas Vondra  wrote:
>
> Here is an improved WIP version of the patch series, modified to address
> the issue with repeatedly applying the extended statistics, as discussed
> with Dean in this thread. It's a bit rough and not committable, but I
> need some feedback so I'm posting it in this state.

As it stands, it doesn't compile if 0003 is applied, because it missed
one of the callers of clauselist_selectivity_simple(), but that's
easily fixed.

> 0001 is the original patch improving estimates of OR clauses
>
> 0002 adds thin wrappers for clause[list]_selectivity, with "internal"
> functions allowing to specify whether to keep considering extended stats
>
> 0003 does the same for the "simple" functions
>
>
> I've kept it like this to demonstrate that 0002 is not sufficient. In my
> response from March 24 I wrote this:
>
> > Isn't it the case that clauselist_selectivity_simple (and the OR
> > variant) should ignore extended stats entirely? That is, we'd need
> > to add a flag (or _simple variant) to clause_selectivity, so that it
> > calls causelist_selectivity_simple_or.
> But that's actually wrong, as 0002 shows (as it breaks a couple of
> regression tests), because of the way we handle OR clauses. At the top
> level, an OR-clause is actually just a single clause and it may get
> passed to clauselist_selectivity_simple. So entirely disabling extended
> stats for the "simple" functions would also mean disabling extended
> stats for a large number of OR clauses. Which is clearly wrong.
>
> So 0003 addresses that, by adding a flag to the two "simple" functions.
> Ultimately, this should probably do the same thing as 0002 and add thin
> wrappers, because the existing functions are part of the public API.

I agree that, taken together, these patches fix the
multiple-extended-stats-evaluation issue. However:

I think this has ended up with too many variants of these functions,
since we now have "_internal" and "_simple" variants, and you're
proposing adding more. The original purpose of the "_simple" variants
was to compute selectivities without looking at extended stats, and
now the "_internal" variants compute selectivities with an additional
"use_extended_stats" flag to control whether or not to look at
extended stats. Thus they're basically the same, and could be rolled
together.

Additionally, it's worth noting that the "_simple" variants expose the
"estimatedclauses" bitmap as an argument, which IMO is a bit messy as
an API. All callers of the "_simple" functions outside of clausesel.c
actually pass in estimatedclauses=NULL, so it's possible to refactor
and get rid of that, turning estimatedclauses into a purely internal
variable.

Also, it's quite messy that clauselist_selectivity_simple_or() needs
to be passed a Selectivity input (the final argument) that is the
selectivity of any already-estimated clauses, or the value to return
if no not-already-estimated clauses are found, and must be 0.0 when
called from the extended stats code.

Attached is the kind of thing I had in mind (as a single patch, since
I don't think it's worth splitting up). This replaces the "_simple"
and "_internal" variants of these functions with "_opt_ext_stats"
variants whose signatures match the originals except for having the
single extra "use_extended_stats" boolean parameter. Additionally, the
"_simple" functions are merged into the originals (making them more
like they were in PG11) so that the "estimatedclauses" bitmap and
partial-OR-list Selectivity become internal details, no longer exposed
in the API.

Regards,
Dean
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
new file mode 100644
index 37a735b..ab16a4c
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -44,6 +44,12 @@ static void addRangeClause(RangeQueryCla
 		   bool varonleft, bool isLTsel, Selectivity s2);
 static RelOptInfo *find_single_rel_for_clauses(PlannerInfo *root,
 			   List *clauses);
+static Selectivity clause_selectivity_opt_ext_stats(PlannerInfo *root,
+	Node *clause,
+	int varRelid,
+	JoinType jointype,
+	SpecialJoinInfo *sjinfo,
+	bool use_extended_stats);
 
 /****************************************************************************
  *		ROUTINES TO COMPUTE SELECTIVITIES
@@ -61,64 +67,8 @@ static RelOptInfo *find_single_rel_for_c
  *
  * The basic approach is to apply extended statistics first, on as many
  * clauses as possible, in order to capture cross-column dependencies etc.
- * The remaining clauses 

Re: Additional improvements to extended statistics

2020-11-19 Thread Dean Rasheed
On Wed, 18 Nov 2020 at 22:37, Tomas Vondra
 wrote:
>
> Seems fine to me, although the "_opt_ext_stats" is rather cryptic.
> AFAICS we use "_internal" for similar functions.
>

There's precedent for using "_opt_xxx" for function variants that add
an option to existing functions, but I agree that in this case it's a
bit of a mouthful. I don't think "_internal" is appropriate though,
since the clauselist function isn't internal. Perhaps using just
"_ext" would be OK.

Regards,
Dean




Re: [bug+patch] Inserting DEFAULT into generated columns from VALUES RTE

2020-11-20 Thread Dean Rasheed
On Sun, 6 Sept 2020 at 22:42, Tom Lane  wrote:
>
> I think you'd be better off to make transformInsertStmt(), specifically
> its multi-VALUES-rows code path, check for all-DEFAULT columns and adjust
> the tlist itself.  Doing it there might be a good bit less inefficient
> for very long VALUES lists, too, which is a case that we do worry about.
> Since that's already iterating through the sub-lists, you could track
> whether all rows seen so far contain SetToDefault in each column position,
> and avoid extra scans of the sublists.  (A BitmapSet might be a convenient
> representation of that, though you could also use a bool array I suppose.)
>
> I do not care for what you did in rewriteValuesRTE() either: just removing
> a sanity check isn't OK, unless you've done something to make the sanity
> check unnecessary which you surely have not.  Perhaps you could extend
> the initial scan of the tlist (lines 1294-1310) to notice SetToDefault
> nodes as well as Var nodes and keep track of which columns have those.
> Then you could cross-check that one or the other case applies whenever
> you see a SetToDefault in the VALUES lists.

That's not quite right because by the time rewriteValuesRTE() sees the
tlist, it will contain already-rewritten generated column expressions,
not SetToDefault nodes. If we're going to keep that sanity check (and
I think that we should), I think that the way to do it is to have
rewriteTargetListIU() record which columns it has expanded defaults
for, and pass that information to rewriteValuesRTE(). Those columns of
the VALUES RTE are no longer used in the query, so the sanity check
can be amended to ignore them while continuing to check the other
columns.

Incidentally, there is another way of causing that sanity check to
fail -- an INSERT ... OVERRIDING USER VALUE query with some DEFAULTS
in the VALUES RTE (but not necessarily all DEFAULTs) will trigger it.
So even if the parser completely removed any all-DEFAULT columns from
the VALUES RTE, some work in the rewriter would still be necessary.


> BTW, another thing that needs checking is whether a rule containing
> an INSERT like this will reverse-list sanely.  The whole idea of
> replacing some of the Vars might not work so far as ruleutils.c is
> concerned.  In that case I think we might have to implement this
> by having transformInsertStmt restructure the VALUES lists to
> physically remove the all-DEFAULT column, and adjust the target column
> list accordingly --- that is, make a parse-time transformation of
> INSERT INTO gtest0 VALUES (1, DEFAULT), (2, DEFAULT);
> into
> INSERT INTO gtest0(a) VALUES (1), (2);
> That'd have the advantage that you'd not have to hack up the
> rewriter at all.

I think it's actually easier to just do it all in the rewriter -- at
the point where we see that we're about to insert potentially illegal
values from a VALUES RTE into a generated column, scan it to see if
all the values in that column are DEFAULTs, and if so trigger the
existing code to replace the Var in the tlist with the generated
column expression. That way we only do extra work in the case for
which we're currently throwing an error, rather than for every query.
Also, I think that scanning the VALUES lists in this way is likely to
be cheaper than rebuilding them to remove a column.
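
For reference, the kind of statement this is about is something like the
following (table definition assumed here, loosely based on gtest0 from the
generated-columns regression tests):

CREATE TABLE gtest0 (a int PRIMARY KEY, b int GENERATED ALWAYS AS (a * 2) STORED);
-- every value for "b" is DEFAULT, so the rewriter can substitute the
-- generated expression for the column:
INSERT INTO gtest0 VALUES (1, DEFAULT), (2, DEFAULT);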

Attached is a patch doing it that way, along with additional
regression tests that trigger both the original error and the
sanity-check error triggered by INSERT ... OVERRIDING USER VALUES. I
also added a few additional comments where I found the existing code a
little non-obvious.

I haven't touched the existing error messages. I think that's a
subject for a separate patch.

Regards,
Dean
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
new file mode 100644
index 41dd670..966cab5
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -69,13 +69,18 @@ static List *rewriteTargetListIU(List *t
  CmdType commandType,
  OverridingKind override,
  Relation target_relation,
- int result_rti);
+ int result_rti,
+ RangeTblEntry *values_rte,
+ int values_rte_index,
+ Bitmapset **unused_values_attrnos);
 static TargetEntry *process_matched_tle(TargetEntry *src_tle,
 		TargetEntry *prior_tle,
 		const char *attrName);
 static Node *get_assignment_input(Node *node);
+static Bitmapset *findDefaultOnlyColumns(RangeTblEntry *rte);
 static bool rewriteValuesRTE(Query *parsetree, RangeTblEntry *rte, int rti,
-			 Relation target_relation, bool force_nulls);
+			 Relation target_relation, bool force_nulls,
+			 Bitmapset *unused_cols);
 static void markQueryForLocking(Query *qry, Node *jtnode,
 LockClauseStrength strength, LockWait

Re: [bug+patch] Inserting DEFAULT into generated columns from VALUES RTE

2020-11-23 Thread Dean Rasheed
On Sun, 22 Nov 2020 at 20:58, Tom Lane  wrote:
>
> I found only one nitpicky bug: in
> findDefaultOnlyColumns, the test must be bms_is_empty(default_only_cols)
> not just default_only_cols == NULL, or it will fail to fall out early
> as intended when the first row contains some DEFAULTs but later rows
> don't.

Ah, good point. Thanks for fixing that.

> > I haven't touched the existing error messages. I think that's a
> > subject for a separate patch.
>
> Fair.  After looking around a bit, I think that getting an error
> cursor as I'd speculated about is more trouble than it's worth.
> For starters, we'd have to pass down the query string into this
> code, and there might be some ticklish issues about whether a given
> chunk of parsetree came from that string or from some rule or view.
> However, I think that just adjusting the error string would be
> helpful, as attached.

+1

> (I'm also wondering why the second case is generic ERRCODE_SYNTAX_ERROR
> and not ERRCODE_GENERATED_ALWAYS.  Didn't change it here, though.)

I can't see any reason for it to be different, and
ERRCODE_GENERATED_ALWAYS seems like the right code to use for both
cases.

Regards,
Dean




Re: proposal: possibility to read dumped table's name from file

2020-11-25 Thread Dean Rasheed
On Thu, 19 Nov 2020 at 19:57, Pavel Stehule  wrote:
>
> minor update - fixed handling of processing names with double quotes inside
>

I see this is marked RFC, but reading the thread it doesn't feel like
we have reached consensus on the design for this feature.

I agree that being able to configure pg_dump via a config file would
be very useful, but the syntax proposed here feels much more like a
hacked-up syntax designed to meet this one use case, rather than a
good general-purpose design that can be easily extended.

IMO, a pg_dump config file should be able to specify all options
currently supported through the command line, and vice versa (subject
to command line length limits), with a single common code path for
handling options. That way, any new options we add will work on the
command line and in config files. Likewise, the user should only need
to learn one set of options, and have the choice of specifying them on
the command line or in a config file (or a mix of both).

I can imagine eventually supporting multiple different file formats,
each just being a different representation of the same data, so
perhaps this could work with 2 new options:

  --option-file-format=plain|yaml|json|...
  --option-file=filename

with "plain" being the default initial implementation, which might be
something like our current postgresql.conf file format.

Also, I think we should allow multiple "--option-file" arguments
(e.g., to list different object types in different files), and for a
config file to contain its own "--option-file" arguments, to allow
config files to include other config files.

The current design feels far too limited to me, and requires new code
and new syntax to be added each time we extend it, so I'm -1 on this
patch as it stands.

Regards,
Dean




Re: proposal: possibility to read dumped table's name from file

2020-11-26 Thread Dean Rasheed
On Thu, 26 Nov 2020 at 06:43, Pavel Stehule  wrote:
>
> st 25. 11. 2020 v 21:00 odesílatel Tom Lane  napsal:
>>
>> (One thing to consider is
>> how painful will it be for people to quote table names containing
>> funny characters, for instance.  On the command line, we largely
>> depend on the shell's quoting behavior to solve that, but we'd not
>> have that infrastructure when reading from a file.)
>
> This is not a problem with the current patch - and the last version of this 
> patch supports well obscure names.
>

Actually, that raises a different possible benefit of passing options
in an options file -- if the user wants to pass in a table name
pattern, it can be a nuisance if the shell's argument processing does
additional unwanted things like globbing and environment variable
substitutions. Using an options file could provide a handy way to
ensure that any option values are interpreted exactly as written,
without any additional mangling.

> There was a requirement for supporting all and future pg_dump options - ok it 
> can make sense. I have not a problem to use instead a line format
>
> "option argument" or "long-option=argument"
>
> This format - can it be a solution? I'll try to rewrite the parser for this 
> format.
>

Yes, that's the sort of thing I was thinking of, to make the feature
more general-purpose.

> It is implementable, but this is in collision with Stephen's requirement for 
> human well readable format designed for handy writing. There are requests 
> that have no intersection. Well readable format needs a more complex parser. 
> And machine generating in this format needs more fork - generating flat file 
> is more simple and more robust than generating JSON or YAML.
>

To be clear, I wasn't suggesting that this patch implement multiple
formats. Just the "plain" format above. Other formats like YAML might
well be more human-readable, and be able to take advantage of values
that are lists to avoid repeating option names, and they would have
the advantage of being readable and writable by other standard tools,
which might be useful. But I think such things would be for the
future. Maybe no one will ever add support for other formats, or maybe
someone will just write a separate external tool to convert YAML or
JSON to our plain format. I don't know. But supporting all pg_dump
options makes more things possible.

>> I wasn't very excited about multiple switch files either, though
>> depending on how the implementation is done, that could be simple
>> enough to be in the might-as-well category.
>>

That's what I was hoping.

Regards,
Dean




Re: Additional improvements to extended statistics

2020-11-29 Thread Dean Rasheed
> On Wed, 18 Nov 2020 at 22:37, Tomas Vondra
>  wrote:
> >
> > Seems fine to me, although the "_opt_ext_stats" is rather cryptic.
> > AFAICS we use "_internal" for similar functions.
> >

I have been thinking about this some more. The one part of this that I
still wasn't happy with was the way that base frequencies were used to
compute the selectivity correction to apply. As noted in [1], using
base frequencies in this way isn't really appropriate for clauses
combined using "OR". The reason is that an item's base frequency is
computed as the product of the per-column selectivities, so that (freq
- base_freq) is the right correction to apply for a set of clauses
combined with "AND", but it doesn't really work properly for clauses
combined with "OR". This is why a number of the estimates in the
regression tests end up being significant over-estimates.

I speculated in [1] that we might fix that by tracking which columns
of the match bitmap actually matched the clauses being estimated, and
then only use those base frequencies. Unfortunately that would also
mean changing the format of the stats that we store, and so would be a
rather invasive change.

It occurred to me though, that there is another, much more
straightforward way to do it. We can rewrite the "OR" clauses, and
turn them into "AND" clauses using the fact that

  P(A OR B) = P(A) + P(B) - P(A AND B)

and then use the multivariate stats to estimate the P(A AND B) part in
the usual way.
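
As a concrete (purely illustrative) example, given MCV stats on a made-up
table t:

CREATE STATISTICS s (mcv) ON a, b FROM t;
ANALYSE t;
EXPLAIN SELECT * FROM t WHERE a = 1 OR b = 1;

the selectivity of the OR clause is now computed as
P(a=1) + P(b=1) - P(a=1 AND b=1), with the multivariate MCV stats used to
estimate the final AND term.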

Attached is the resulting patch doing it that way. The main change is
in the way that statext_mcv_clauselist_selectivity() works, combined
with a new function mcv_clause_selectivity_or() that does the
necessary MCV bitmap manipulations.

Doing it this way also means that clausesel.c doesn't need to export
clauselist_selectivity_or(), and the new set of exported functions
seem a bit neater now.

A handful of regression test results change, and in all cases except
one the new estimates are much better. One estimate is made worse, but
in that case we only have 2 sets of partial stats:

  SELECT * FROM mcv_lists_multi WHERE a = 0 OR b = 0 OR c = 0 OR d = 0

with stats on (a,b) and (c,d) so it's not surprising that combining (a
= 0 OR b = 0) with (c = 0 OR d = 0) mis-estimates a bit. I suspect the
old MV stats estimate was more down to chance in this case.

Regards,
Dean

[1] 
https://www.postgresql.org/message-id/CAEZATCX8u9bZzcWEzqA_t7f_OQHu2oxeTUGnFHNEOXnJo35AQg%40mail.gmail.com
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
new file mode 100644
index 37a735b..7d6e678
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -44,6 +44,12 @@ static void addRangeClause(RangeQueryCla
 		   bool varonleft, bool isLTsel, Selectivity s2);
 static RelOptInfo *find_single_rel_for_clauses(PlannerInfo *root,
 			   List *clauses);
+static Selectivity clauselist_selectivity_or(PlannerInfo *root,
+			 List *clauses,
+			 int varRelid,
+			 JoinType jointype,
+			 SpecialJoinInfo *sjinfo,
+			 bool use_extended_stats);
 
 /****************************************************************************
  *		ROUTINES TO COMPUTE SELECTIVITIES
@@ -61,64 +67,8 @@ static RelOptInfo *find_single_rel_for_c
  *
  * The basic approach is to apply extended statistics first, on as many
  * clauses as possible, in order to capture cross-column dependencies etc.
- * The remaining clauses are then estimated using regular statistics tracked
- * for individual columns.  This is done by simply passing the clauses to
- * clauselist_selectivity_simple.
- */
-Selectivity
-clauselist_selectivity(PlannerInfo *root,
-	   List *clauses,
-	   int varRelid,
-	   JoinType jointype,
-	   SpecialJoinInfo *sjinfo)
-{
-	Selectivity s1 = 1.0;
-	RelOptInfo *rel;
-	Bitmapset  *estimatedclauses = NULL;
-
-	/*
-	 * Determine if these clauses reference a single relation.  If so, and if
-	 * it has extended statistics, try to apply those.
-	 */
-	rel = find_single_rel_for_clauses(root, clauses);
-	if (rel && rel->rtekind == RTE_RELATION && rel->statlist != NIL)
-	{
-		/*
-		 * Estimate as many clauses as possible using extended statistics.
-		 *
-		 * 'estimatedclauses' tracks the 0-based list position index of
-		 * clauses that we've estimated using extended statistics, and that
-		 * should be ignored.
-		 */
-		s1 *= statext_clauselist_selectivity(root, clauses, varRelid,
-			 jointype, sjinfo, rel,
-			 &estimatedclauses);
-	}
-
-	/*
-	 * Apply normal selectivity estimates for the remaining clauses, passing
-	 * 'estimatedclauses' so that it skips already estimated ones.
-	 */
-	return s1 * clauselist_selectivity_simple(root, clauses, varRelid,
-			  jointype, sjin

Re: PoC/WIP: Extended statistics on expressions

2021-01-18 Thread Dean Rasheed
Looking through extended_stats.c, I found a corner case that can lead
to a seg-fault:

CREATE TABLE foo();
CREATE STATISTICS s ON (1) FROM foo;
ANALYSE foo;

This crashes in lookup_var_attr_stats(), because it isn't expecting
nvacatts to be 0. I can't think of any case where building stats on a
table with no analysable columns is useful, so it should probably just
exit early in that case.


In BuildRelationExtStatistics(), it looks like min_attrs should be
declared assert-only.


In evaluate_expressions():

+   /* set the pointers */
+   result = (ExprInfo *) ptr;
+   ptr += sizeof(ExprInfo);

I think that should probably have a MAXALIGN().


A slightly bigger issue that I don't like is the way it assigns
attribute numbers for expressions starting from
MaxHeapAttributeNumber+1, so the first expression has an attnum of
1601. That leads to pretty inefficient use of Bitmapsets, since most
tables only contain a handful of columns, and there's a large block of
unused space in the middle of the Bitmapset.

An alternative approach might be to use regular attnums for columns
and use negative indexes -1, -2, -3, ... for expressions in the stored
stats. Then when storing and retrieving attnums from Bitmapsets, it
could just offset by STATS_MAX_DIMENSIONS (8) to avoid negative values
in the Bitmapsets, since there can't be more than that many
expressions (just like other code stores system attributes using
FirstLowInvalidHeapAttributeNumber).

That would be a somewhat bigger change, but hopefully fairly
mechanical, and then some code like add_expressions_to_attributes()
would go away.


Looking at the new view pg_stats_ext_exprs, I noticed that it fails to
show expressions until the statistics have been built. For example:

CREATE TABLE foo(a int, b int);
CREATE STATISTICS s ON (a+b), (a*b) FROM foo;
SELECT statistics_name, tablename, expr, n_distinct FROM pg_stats_ext_exprs;

 statistics_name | tablename | expr | n_distinct
-----------------+-----------+------+------------
 s               | foo       |      |
(1 row)

but after populating and analysing the table, this becomes:

 statistics_name | tablename |  expr   | n_distinct
-----------------+-----------+---------+------------
 s               | foo       | (a + b) |         11
 s               | foo       | (a * b) |         11
(2 rows)

I think it should show the expressions even before the stats have been built.

Another issue is that it returns rows for non-expression stats as
well. For example:

CREATE TABLE foo(a int, b int);
CREATE STATISTICS s ON a, b FROM foo;
SELECT statistics_name, tablename, expr, n_distinct FROM pg_stats_ext_exprs;

 statistics_name | tablename | expr | n_distinct
-----------------+-----------+------+------------
 s               | foo       |      |
(1 row)

and those values will never be populated, since they're not
expressions, so I would expect them to not be shown in the view.

So basically, instead of

+ LEFT JOIN LATERAL (
+ SELECT
+ *
+ FROM (
+ SELECT
+
unnest(pg_get_statisticsobjdef_expressions(s.oid)) AS expr,
+ unnest(sd.stxdexpr)::pg_statistic AS a
+ ) x
+ ) stat ON sd.stxdexpr IS NOT NULL;

perhaps just

+ JOIN LATERAL (
+ SELECT
+ *
+ FROM (
+ SELECT
+
unnest(pg_get_statisticsobjdef_expressions(s.oid)) AS expr,
+ unnest(sd.stxdexpr)::pg_statistic AS a
+ ) x
+ ) stat ON true;

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2021-01-21 Thread Dean Rasheed
On Tue, 19 Jan 2021 at 01:57, Tomas Vondra
 wrote:
>
> > A slightly bigger issue that I don't like is the way it assigns
> > attribute numbers for expressions starting from
> > MaxHeapAttributeNumber+1, so the first expression has an attnum of
> > 1601. That leads to pretty inefficient use of Bitmapsets, since most
> > tables only contain a handful of columns, and there's a large block of
> > unused space in the middle the Bitmapset.
> >
> > An alternative approach might be to use regular attnums for columns
> > and use negative indexes -1, -2, -3, ... for expressions in the stored
> > stats. Then when storing and retrieving attnums from Bitmapsets, it
> > could just offset by STATS_MAX_DIMENSIONS (8) to avoid negative values
> > in the Bitmapsets, since there can't be more than that many
> > expressions (just like other code stores system attributes using
> > FirstLowInvalidHeapAttributeNumber).
>
> Well, I tried this but unfortunately it's not that simple. We still need
> to build the bitmaps, so I don't think add_expression_to_attributes can
> be just removed. I mean, we need to do the offsetting somewhere, even if
> we change how we do it.

Hmm, I was imagining that the offsetting would happen in each place
that adds or retrieves an attnum from a Bitmapset, much like a lot of
other code does for system attributes, and then you'd know you had an
expression if the resulting attnum was negative.

I was also thinking that it would be these negative attnums that would
be stored in the stats data, so instead of something like "1, 2 =>
1601", it would be "1, 2 => -1", so in some sense "-1" would be the
"real" attnum associated with the expression.

> But the main issue is that in some cases the number of expressions is
> not really limited by STATS_MAX_DIMENSIONS - for example when applying
> functional dependencies, we "merge" multiple statistics, so we may end
> up with more expressions. So we can't just use STATS_MAX_DIMENSIONS.

Ah, I see. I hadn't really fully understood what that code was doing.

ISTM though that this is really an internal problem to the
dependencies code, in that these "merged" Bitmapsets containing attrs
from multiple different stats objects do not (and must not) ever go
outside that local code that uses them. So that code would be free to
use a different offset for its own purposes -- e..g., collect all the
distinct expressions across all the stats objects and then offset by
the number of distinct expressions.

> Also, if we offset regular attnums by STATS_MAX_DIMENSIONS, that inverts
> the order of processing (so far we've assumed expressions are after
> regular attnums). So the changes are more extensive - I tried doing that
> anyway, and I'm still struggling with crashes and regression failures.
> Of course, that doesn't mean we shouldn't do it, but it's far from
> mechanical. (Some of that is probably a sign this code needs a bit more
> work to polish.)

Interesting. What code assumes expressions come after attributes?
Ideally, I think it would be cleaner if no code assumed any particular
order, but I can believe that it might be convenient in some
circumstances.

> But I wonder if it'd be easier to just calculate the actual max attnum
> and then use it instead of MaxHeapAttributeNumber ...

Hmm, I'm not sure how that would work. There still needs to be an
attnum that gets stored in the database, and it has to continue to
work if the user adds columns to the table. That's why I was
advocating storing negative values, though I haven't actually tried it
to see what might go wrong.

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2021-01-22 Thread Dean Rasheed
On Fri, 22 Jan 2021 at 04:46, Justin Pryzby  wrote:
>
> I think you'll maybe have to do something better - this seems a bit too weird:
>
> | postgres=# CREATE STATISTICS s2 ON (i+1) ,i FROM t;
> | postgres=# \d t
> | ...
> | "public"."s2" (ndistinct, dependencies, mcv) ON i FROM t
>

I guess that's not surprising, given that old psql knows nothing about
expressions in stats.

In general, I think connecting old versions of psql to newer servers
is not supported. You're lucky if \d works at all. So it shouldn't be
this patch's responsibility to make that output nicer.

Regards,
Dean




Re: Schema variables - new implementation for Postgres 15

2022-01-13 Thread Dean Rasheed
On Wed, 3 Nov 2021 at 13:05, Tomas Vondra  wrote:
>
> 2) I find this a bit confusing:
>
> SELECT non_existent_variable;
> test=# select s;
> ERROR:  column "non_existent_variable" does not exist
> LINE 1: select non_existent_variable;
>
> I wonder if this means using SELECT to read variables is a bad idea, and
> we should have a separate command, just like we have LET (instead of
> just using UPDATE in some way).
>

Hmm. This way of reading variables worries me for a different reason
-- I think it makes it all too easy to break existing applications by
inadvertently (or deliberately) defining variables that conflict with
column names referred to in existing queries.

For example, if I define a variable called "relkind", then psql's \sv
meta-command is broken because the query it performs can't distinguish
between the column and the variable.

Similarly, there's ambiguity between alias.colname and
schema.variablename. So, for example, if I do the following:

CREATE SCHEMA n;
CREATE VARIABLE n.nspname AS int;

then lots of things are broken, including pg_dump and a number of psql
meta-commands. I don't think it's acceptable to make it so easy for a
user to break the system in this way.
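
For example, with that variable defined, a query of the kind pg_dump and psql
run internally becomes ambiguous (a sketch, not the exact query they use):

SELECT n.nspname FROM pg_namespace n;

since "n.nspname" could now refer either to the column of the aliased table or
to the variable.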

Those are examples that a malicious user might use, but even without
such examples, I think it would be far too easy to inadvertently break
a large application by defining a variable that conflicted with a
column name you didn't know about.

Regards,
Dean




Re: Schema variables - new implementation for Postgres 15

2022-01-13 Thread Dean Rasheed
On Thu, 13 Jan 2022 at 17:42, Pavel Stehule  wrote:
>
> I like the idea of prioritizing tables over variables with warnings when 
> collision is detected. It cannot break anything. And it allows to using short 
> identifiers when there is not collision.

Yeah, that seems OK, as long as it's clearly documented. I don't think
a warning is necessary.

(FWIW, testing with dbfiddle, that appears to match Db2's behaviour).

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2021-03-17 Thread Dean Rasheed
On Sun, 7 Mar 2021 at 21:10, Tomas Vondra  wrote:
>
> 2) ndistinct
>
> There's one thing that's bugging me, in how we handle "partial" matches.
> For each expression we track both the original expression and the Vars
> we extract from it. If we can't find a statistics matching the whole
> expression, we try to match those individual Vars, and we remove the
> matching ones from the list. And in the end we multiply the estimates
> for the remaining Vars.
>
> This works fine with one matching ndistinct statistics. Consider for example
>
>  GROUP BY (a+b), (c+d)
>
> with statistics on [(a+b),c] - that is, expression and one column.

I've just been going over this example, and I think it actually works
slightly differently from how you described, though I haven't worked
out the full general implications of that.

> We parse the expressions into two GroupExprInfo
>
>  {expr: (a+b), vars: [a, b]}
>  {expr: (c+d), vars: [c, d]}
>

Here, I think what you actually get, in the presence of stats on
[(a+b),c] is actually the following two GroupExprInfos:

  {expr: (a+b), vars: []}
  {expr: (c+d), vars: [c, d]}

because of the code immediately after this comment in estimate_num_groups():

/*
 * If examine_variable is able to deduce anything about the GROUP BY
 * expression, treat it as a single variable even if it's really more
 * complicated.
 */

As it happens, that makes no difference in this case, where there is
just this one stats object, but it does change things when there are
two stats objects.

> and the statistics matches the first item exactly (the expression). The
> second expression is not in the statistics, but we match "c". So we end
> up with an estimate for "(a+b), c" and have one remaining GroupExprInfo:
>
>  {expr: (c+d), vars: [d]}

Right.

> Without any other statistics we estimate that as ndistinct for "d", so
> we end up with
>
>  ndistinct((a+b), c) * ndistinct(d)
>
> which mostly makes sense. It assumes ndistinct(c+d) is product of the
> ndistinct estimates, but that's kinda what we've been always doing.

Yes, that appears to be what happens, and it's probably the best that
can be done with the available stats.

> But now consider we have another statistics on just (c+d). In the second
> loop we end up matching this expression exactly, so we end up with
>
>  ndistinct((a+b), c) * ndistinct((c+d))

In this case, with stats on (c+d) as well, the two GroupExprInfos
built at the start change to:

  {expr: (a+b), vars: []}
  {expr: (c+d), vars: []}

so it actually ends up not using any multivariate stats, but instead
uses the two univariate expression stats, giving

 ndistinct((a+b)) * ndistinct((c+d))

which actually seems pretty good, and gave very good estimates in the
simple test case I tried.

> i.e. we kinda use the "c" twice. Which is a bit unfortunate. I think
> what we should do after the first loop is just discarding the whole
> expression and "expand" into per-variable GroupExprInfo, so in the
> second step we would not match the (c+d) statistics.

Not using the (c+d) stats would give either

 ndistinct((a+b)) * ndistinct(c) * ndistinct(d)

or

 ndistinct((a+b), c) * ndistinct(d)

depending on exactly how the algorithm was changed. In my test, these
both gave worse estimates, but there are probably other datasets for
which they might do better. It all depends on how much correlation
there is between (a+b) and c.

I suspect that there is no optimal strategy for handling overlapping
stats that works best for all datasets, but the current algorithm
seems to do a pretty decent job.

> Of course, maybe there's a better way to pick the statistics, but I
> think our conclusion so far was that people should just create
> statistics covering all the columns in the query, to not have to match
> multiple statistics like this.

Yes, I think that's always likely to work better, especially for
ndistinct stats, where all possible permutations of subsets of the
columns are included, so a single ndistinct stat can work well for a
range of different queries.
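
For example (sketch), a single statistics object such as

CREATE STATISTICS s (ndistinct) ON a, b, (a+b) FROM foo;

stores ndistinct estimates for every combination of two or more of those
columns/expressions, so the same object can be used for GROUP BY a, b,
GROUP BY a, (a+b), GROUP BY b, (a+b) and GROUP BY a, b, (a+b).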

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2021-03-17 Thread Dean Rasheed
On Wed, 17 Mar 2021 at 17:26, Tomas Vondra
 wrote:
>
> My concern is that the current behavior (where we prefer expression
> stats over multi-column stats to some extent) works fine as long as the
> parts are independent, but once there's dependency it's probably more
> likely to produce underestimates. I think underestimates for grouping
> estimates were a risk in the past, so let's not make that worse.
>

I'm not sure the current behaviour really is preferring expression
stats over multi-column stats. In this example, where we're grouping
by (a+b), (c+d) and have stats on [(a+b),c] and (c+d), neither of
those multi-column stats actually match more than one
column/expression. If anything, I'd go the other way and say that it
was wrong to use the [(a+b),c] stats in the first case, where they
were the only stats available, since those stats aren't really
applicable to (c+d), which probably ought to be treated as
independent. IOW, it might have been better to estimate the first case
as

 ndistinct((a+b)) * ndistinct(c) * ndistinct(d)

and the second case as

 ndistinct((a+b)) * ndistinct((c+d))

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2021-03-17 Thread Dean Rasheed
On Wed, 17 Mar 2021 at 19:07, Tomas Vondra
 wrote:
>
> On 3/17/21 7:54 PM, Dean Rasheed wrote:
> >
> > it might have been better to estimate the first case as
> >
> >  ndistinct((a+b)) * ndistinct(c) * ndistinct(d)
> >
> > and the second case as
> >
> >  ndistinct((a+b)) * ndistinct((c+d))
>
> OK. I might be confused, but isn't that what the algorithm currently
> does? Or am I just confused about what the first/second case refers to?
>

No, it currently estimates the first case as ndistinct((a+b),c) *
ndistinct(d). Having said that, maybe that's OK after all. It at least
makes an effort to account for any correlation between (a+b) and
(c+d), using the known correlation between (a+b) and c. For reference,
here is the test case I was using (which isn't really very good for
catching dependence between columns):

DROP TABLE IF EXISTS foo;
CREATE TABLE foo (a int, b int, c int, d int);
INSERT INTO foo SELECT x%10, x%11, x%12, x%13 FROM generate_series(1,100000) x;
SELECT COUNT(DISTINCT a) FROM foo; -- 10
SELECT COUNT(DISTINCT b) FROM foo; -- 11
SELECT COUNT(DISTINCT c) FROM foo; -- 12
SELECT COUNT(DISTINCT d) FROM foo; -- 13
SELECT COUNT(DISTINCT (a+b)) FROM foo; -- 20
SELECT COUNT(DISTINCT (c+d)) FROM foo; -- 24
SELECT COUNT(DISTINCT ((a+b),c)) FROM foo; -- 228
SELECT COUNT(DISTINCT ((a+b),(c+d))) FROM foo; -- 478

-- First case: stats on [(a+b),c]
CREATE STATISTICS s1(ndistinct) ON (a+b),c FROM foo;
ANALYSE foo;
EXPLAIN ANALYSE
SELECT (a+b), (c+d) FROM foo GROUP BY (a+b), (c+d);
  -- Estimate = 2964, Actual = 478
  -- This estimate is ndistinct((a+b),c) * ndistinct(d) = 228*13

-- Second case: stats on (c+d) as well
CREATE STATISTICS s2 ON (c+d) FROM foo;
ANALYSE foo;
EXPLAIN ANALYSE
SELECT (a+b), (c+d) FROM foo GROUP BY (a+b), (c+d);
  -- Estimate = 480, Actual = 478
  -- This estimate is ndistinct((a+b)) * ndistinct((c+d)) = 20*24

I think that's probably pretty reasonable behaviour, given incomplete
stats (the estimate with no extended stats is capped at 1).

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2021-03-17 Thread Dean Rasheed
On Wed, 17 Mar 2021 at 20:48, Dean Rasheed  wrote:
>
> For reference, here is the test case I was using (which isn't really very 
> good for
> catching dependence between columns):
>

And here's a test case with much more dependence between the columns:

DROP TABLE IF EXISTS foo;
CREATE TABLE foo (a int, b int, c int, d int);
INSERT INTO foo SELECT x%2, x%5, x%10, x%15 FROM generate_series(1,100000) x;
SELECT COUNT(DISTINCT a) FROM foo; -- 2
SELECT COUNT(DISTINCT b) FROM foo; -- 5
SELECT COUNT(DISTINCT c) FROM foo; -- 10
SELECT COUNT(DISTINCT d) FROM foo; -- 15
SELECT COUNT(DISTINCT (a+b)) FROM foo; -- 6
SELECT COUNT(DISTINCT (c+d)) FROM foo; -- 20
SELECT COUNT(DISTINCT ((a+b),c)) FROM foo; -- 10
SELECT COUNT(DISTINCT ((a+b),(c+d))) FROM foo; -- 30

-- First case: stats on [(a+b),c]
CREATE STATISTICS s1(ndistinct) ON (a+b),c FROM foo;
ANALYSE foo;
EXPLAIN ANALYSE
SELECT (a+b), (c+d) FROM foo GROUP BY (a+b), (c+d);
  -- Estimate = 150, Actual = 30
  -- This estimate is ndistinct((a+b),c) * ndistinct(d) = 10*15,
  -- which is much better than ndistinct((a+b)) * ndistinct(c) * ndistinct(d) = 6*10*15 = 900
  -- Estimate with no stats = 1500

-- Second case: stats on (c+d) as well
CREATE STATISTICS s2 ON (c+d) FROM foo;
ANALYSE foo;
EXPLAIN ANALYSE
SELECT (a+b), (c+d) FROM foo GROUP BY (a+b), (c+d);
  -- Estimate = 120, Actual = 30
  -- This estimate is ndistinct((a+b)) * ndistinct((c+d)) = 6*20

Again, I'd say the current behaviour is pretty good.

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2021-03-18 Thread Dean Rasheed
On Wed, 17 Mar 2021 at 21:31, Tomas Vondra
 wrote:
>
> I agree applying at least the [(a+b),c] stats is probably the right
> approach, as it means we're considering at least the available
> information about dependence between the columns.
>
> I think to improve this, we'll need to teach the code to use overlapping
> statistics, a bit like conditional probability. In this case we might do
> something like this:
>
> ndistinct((a+b),c) * (ndistinct((c+d)) / ndistinct(c))

Yes, I was thinking the same thing. That would be equivalent to
applying a multiplicative "correction" factor of

  ndistinct(a,b,c,...) / ( ndistinct(a) * ndistinct(b) * ndistinct(c) * ... )

for each multivariate stat applicable to more than one
column/expression, regardless of whether those columns were already
covered by other multivariate stats. That might well simplify the
implementation, as well as probably produce better estimates.

> But that's clearly a matter for a future patch, and I'm sure there are
> cases where this will produce worse estimates.

Agreed.

> Anyway, I plan to go over the patches one more time, and start pushing
> them sometime early next week. I don't want to leave it until the very
> last moment in the CF.

+1. I think they're in good enough shape for that process to start.

Regards,
Dean




Re: pgbench - add pseudo-random permutation function

2021-03-22 Thread Dean Rasheed
On Sun, 14 Mar 2021 at 16:08, Fabien COELHO  wrote:
>
> > My main question on this now is, do you have a scholar reference for
> > this algorithm?
>
> Nope, otherwise I would have put a reference. I'm a scholar though, if
> it helps:-)
>
> I could not find any algorithm that fitted the bill. The usual approach
> (eg benchmarking designs) is to use some hash function and assume that it
> is not a permutation, too bad.
>
> Basically the algorithm mimics a few rounds of cryptographic encryption
> adapted to any size and simple operators, whereas encryption function are
> restricted to power of two blocks (eg the Feistel network). The structure
> is the same AES with its AddRoundKey the xor-ing stage (adapted to non
> power of two in whiten above), MixColumns which does the scattering, and
> for key expansion, I used Donald Knuth generator. Basically one could say
> that permute is an inexpensive and insecure AES:-)
>

I spent a little time playing around with this, trying to come up with
a reasonable way to test it. It's apparent from the code and
associated comments that the transformation is bijective and so will
produce a permutation, but it's less obvious how random that
permutation will be. Obviously the Fisher-Yates algorithm would give
the most random permutation, but that's not really practical in this
context. Any other algorithm will, in some sense, be less random than
that, so I think it's really just a question of testing that it's
random enough to satisfy the intended use case.

The approach to testing I tried was to use the Kolmogorov-Smirnov test
to test how uniformly random a couple of quantities are:

1). For a given size and fixed input value, and a large number of
randomly chosen seed values, is the output uniformly random?

2). For a given size and a pair of nearby input values, are the two
output values uniformly randomly spaced apart?

This second test seems desirable to ensure sufficient shuffling so
that inputs coming from some non-uniform random distribution are
scattered sufficiently to distribute the maxima/minima (x -> x + rand
mod size would pass test 1, so that by itself is insufficient).

I tested this using the attached (somewhat ugly) stand-alone test C
program, and for the most part these 2 tests seemed to pass. The
program also tests that the output really is a permutation, just to be
sure, and this was confirmed in all cases.

However, I noticed that the results are less good when size is a power
of 2. In this case, the results seem significantly less uniform,
suggesting that for such sizes, there is an increased chance that the
permuted output might still be skewed.

Based on the above, I also experimented with a different permutation
algorithm (permute2() in the test), which attempts to inject more
randomness into the bijection, and between pairs of inputs, to make
the output less predictable and less likely to be accidentally
non-uniform. It's possible that it suffers from other problems, but it
did do better in these 2 tests.

Maybe some other tests might be useful, but I really don't know. As
noted above, any algorithm is likely to lead to some pattern in the
output (e.g., it looks like both these algorithms cause adjacent
inputs to always end up separated by an odd number), so a judgement
call will be required to decide what is random enough.

Regards,
Dean
#include 
#include 
#include 

typedef unsigned char uint8;
typedef long int64;
typedef unsigned long uint64;
typedef unsigned __int128 uint128;
typedef int64 (*permute_fn)(const int64, const int64, const int64);

#define PRP_PRIMES 16

static uint64 primes[PRP_PRIMES] = {
	8388617,
	8912921,
	9437189,
	9961487,
	10485767,
	11010059,
	11534351,
	12058679,
	12582917,
	13107229,
	13631489,
	14155777,
	14680067,
	15204391,
	15728681,
	16252967
};

#define PRP_ROUNDS 4

static uint64
compute_mask(uint64 n)
{
	n |= n >> 1;
	n |= n >> 2;
	n |= n >> 4;
	n |= n >> 8;
	n |= n >> 16;
	n |= n >> 32;
	return n;
}

static uint64
modular_multiply(uint64 x, uint64 y, const uint64 m)
{
	return (uint128) x * (uint128) y % (uint128) m;
}

#define DK_LCG_MUL 6364136223846793005L
#define DK_LCG_INC 1442695040888963407L

#define LCG_SHIFT 13

static int64
permute(const int64 data, const int64 isize, const int64 seed)
{
	uint64		size = (uint64) isize;
	uint64		v = (uint64) data % size;
	uint64		key = (uint64) seed;
	uint64		mask = compute_mask(size - 1) >> 1;

	if (isize == 1)
		return 0;

	for (unsigned int i = 0, p = key % PRP_PRIMES;
		 i < PRP_ROUNDS; i++, p = (p + 1) % PRP_PRIMES)
	{
		uint64		t;

		key = key * DK_LCG_MUL + DK_LCG_INC;
		if (v <= mask)
			v ^= (key >> LCG_SHIFT) & mask;

		key 

Re: PoC/WIP: Extended statistics on expressions

2021-03-24 Thread Dean Rasheed
On Wed, 24 Mar 2021 at 10:22, Tomas Vondra
 wrote:
>
> Thanks, it seems to be some thinko in handling in PlaceHolderVars, which
> seem to break the code's assumptions about varnos. This fixes it for me,
> but I need to look at it more closely.
>

I think that makes sense.

Reviewing the docs, I noticed a couple of omissions, and had a few
other suggestions (attached).

Regards,
Dean
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
new file mode 100644
index dadca67..382cbd7
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7377,6 +7377,15 @@ SCRAM-SHA-256$<iteration
m for most common values (MCV) list statistics
   
  
+
+ 
+  
+   stxexprs pg_node_tree
+  
+  
+   A list of any expressions covered by this statistics object.
+  
+ 
 

   
@@ -7474,6 +7483,16 @@ SCRAM-SHA-256$<iteration
pg_mcv_list type
   
  
+
+ 
+  
+   stxdexpr pg_statistic[]
+  
+  
+   Per-expression statistics, serialized as an array of
+   pg_statistic type
+  
+ 
 

   
@@ -12843,7 +12862,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_p
 
   
The view pg_stats_ext provides access to
-   the information stored in the pg_statistic_ext
and pg_statistic_ext_data
catalogs.  This view allows access only to rows of
@@ -12930,7 +12950,16 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_p
(references pg_attribute.attname)
   
   
-   Names of the columns the extended statistics is defined on
+   Names of the columns included in the extended statistics
+  
+ 
+
+ 
+  
+   exprs text[]
+  
+  
+   Expressions included in the extended statistics
   
  
 
@@ -13033,7 +13062,8 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_p
 
   
The view pg_stats_ext_exprs provides access to
-   the information stored in the pg_statistic_ext
and pg_statistic_ext_data
catalogs.  This view allows access only to rows of
@@ -13119,7 +13149,7 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_p
expr text
   
   
-   Expression the extended statistics is defined on
+   Expression included in the extended statistics
   
  
 
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
new file mode 100644
index 5f3aefd..f561599
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -27,7 +27,7 @@ CREATE STATISTICS [ IF NOT EXISTS ] statistics_name
 [ ( statistics_kind [, ... ] ) ]
-ON { column_name | ( expression ) } [, ...]
+ON { column_name | ( expression ) }, { column_name | ( expression ) } [, ...]
 FROM table_name
 
 
@@ -45,12 +45,15 @@ CREATE STATISTICS [ IF NOT EXISTS ] 
The CREATE STATISTICS command has two basic forms. The
-   simple variant allows building statistics for a single expression, does
-   not allow specifying any statistics kinds and provides benefits similar
-   to an expression index. The full variant allows defining statistics objects
-   on multiple columns and expressions, and selecting which statistics kinds will
-   be built. The per-expression statistics are built automatically when there
-   is at least one expression.
+   first form allows univariate statistics for a single expression to be
+   collected, providing benefits similar to an expression index without the
+   overhead of index maintenance.  This form does not allow the statistics
+   kind to be specified, since the various statistics kinds refer only to
+   multivariate statistics.  The second form of the command allows
+   multivariate statistics on multiple columns and/or expressions to be
+   collected, optionally specifying which statistics kinds to include.  This
+   form will also automatically cause univariate statistics to be collected on
+   any expressions included in the list.
   
 
   
@@ -93,16 +96,16 @@ CREATE STATISTICS [ IF NOT EXISTS ] statistics_kind
 
  
-  A statistics kind to be computed in this statistics object.
+  A multivariate statistics kind to be computed in this statistics object.
   Currently supported kinds are
   ndistinct, which enables n-distinct statistics,
   dependencies, which enables functional
   dependency statistics, and mcv which enables
   most-common values lists.
   If this clause is omitted, all supported statistics kinds are
-  included in the statistics object. Expression statistics are built
-  automatically when the statistics definition includes complex
-  expressions and not just simple column references.
+  included in the statistics object. Univariate expression statistics are
+  built automatically if the statistics definition includes any complex
+  expressions rather than just simple column references.
   For more information, see 
   and .
  
@@ -114,8 +117,9 @@ CREATE 

Re: PoC/WIP: Extended statistics on expressions

2021-03-24 Thread Dean Rasheed
On Wed, 24 Mar 2021 at 14:48, Tomas Vondra
 wrote:
>
> AFAIK the primary issue here is that the two places disagree. While
> estimate_num_groups does this
>
> varnos = pull_varnos(root, (Node *) varshere);
> if (bms_membership(varnos) == BMS_SINGLETON)
> { ... }
>
> the add_unique_group_expr does this
>
> varnos = pull_varnos(root, (Node *) groupexpr);
>
> That is, one looks at the group expression, while the other looks at vars
> extracted from it by pull_var_clause(). Apparently for PlaceHolderVar
> this can differ, causing the crash.
>
> So we need to change one of those places - my fix tweaked the second
> place to also look at the vars, but maybe we should change the other
> place? Or maybe it's not the right fix for PlaceHolderVars ...
>

I think that it doesn't make any difference which place is changed.

This is a case of an expression with no stats. With your change,
you'll get a single GroupExprInfo containing a list of
VariableStatData's for each of it's Var's, whereas if you changed it
the other way, you'd get a separate GroupExprInfo for each Var. But I
think they'd both end up being treated the same by
estimate_multivariate_ndistinct(), since there wouldn't be any stats
matching the expression, only the individual Var's. Maybe changing the
first place would be the more bulletproof fix though.

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2021-03-24 Thread Dean Rasheed
On Wed, 24 Mar 2021 at 16:48, Tomas Vondra
 wrote:
>
> As for the changes proposed in the create_statistics, do we really want
> to use univariate / multivariate there? Yes, the terms are correct, but
> I'm not sure how many people looking at CREATE STATISTICS will
> understand them.
>

Hmm, I think "univariate" and "multivariate" are pretty ubiquitous,
when used to describe statistics. You could use "single-column" and
"multi-column", but then "column" isn't really right anymore, since it
might be a column or an expression. I can't think of any other terms
that fit.

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2021-03-25 Thread Dean Rasheed
On Thu, 25 Mar 2021 at 00:05, Tomas Vondra
 wrote:
>
> Actually, I think we don't need that block at all - there's no point in
> keeping the exact expression, because if there was a statistics matching
> it, it'd be matched by examine_variable. So if we get here, we have
> to just split it into the vars anyway. So the second block is entirely
> useless.

Good point.

> That however means we don't need the processing with GroupExprInfo and
> GroupVarInfo lists, i.e. we can revert back to the original simpler
> processing, with a bit of extra logic to match expressions, that's all.
>
> The patch 0003 does this (it's a bit crude, but hopefully enough to
> demonstrate).

Cool. I did wonder about that, but I didn't fully think it through.
I'll take a look.

> 0002 is an attempt to fix an issue I noticed today - we need to handle
> type changes.
>
> I think we have two options:
>
> a) Make UpdateStatisticsForTypeChange smarter to also transform and
> update the expression string, and reset pg_statistics[] data.
>
> b) Just recreate the statistics, just like we do for indexes. Currently
> this does not force analyze, so it just resets all the stats. Maybe it
> should do analyze, though.

I'd vote for (b) without an analyse, and I agree with getting rid of
UpdateStatisticsForTypeChange(). I've always been a bit skeptical
about trying to preserve extended statistics after a type change, when
we don't preserve regular per-column stats.

> BTW I wonder how useful the updated statistics actually is. Consider
> this example:
> ...
> the expression now looks like this:
>
> 
> "public"."s" (ndistinct) ON ((a + b)), b)::numeric)::double
> precision + c)) FROM t
> 
>
> But we're matching it to (((b)::double precision + c)), so that fails.
>
> This is not specific to extended statistics - indexes have exactly the
> same issue. Not sure how common this is in practice.

Hmm, that's unfortunate. Maybe it's not that common in practice
though. I'm not sure if there is any practical way to fix it, but if
there is, I guess we'd want to apply the same fix to both stats and
indexes, and that certainly seems out of scope for this patch.

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2021-03-25 Thread Dean Rasheed
On Thu, 25 Mar 2021 at 00:05, Tomas Vondra
 wrote:
>
> here's an updated patch. 0001

The change to the way that CreateStatistics() records dependencies
isn't quite right -- recordDependencyOnSingleRelExpr() will not create
any dependencies if the expression uses only a whole-row Var. However,
pull_varattnos() will include whole-row Vars, and so nattnums_exprs
will be non-zero, and CreateStatistics() will not create a whole-table
dependency when it should.

I suppose that could be fixed up by inspecting the bitmapset returned
by pull_varattnos() in more detail, but I think it's probably safer to
revert to the previous code, which matched what index_create() did.

Regards,
Dean




Re: PoC/WIP: Extended statistics on expressions

2021-03-26 Thread Dean Rasheed
On Thu, 25 Mar 2021 at 19:59, Tomas Vondra
 wrote:
>
> Attached is an updated patch series, with all the changes discussed
> here. I've cleaned up the ndistinct stuff a bit more (essentially
> reverting back from GroupExprInfo to GroupVarInfo name), and got rid of
> the UpdateStatisticsForTypeChange.
>

I've looked over all that and I think it's in pretty good shape. I
particularly like how much simpler the ndistinct code has now become.

Some (hopefully final) review comments:

1). I don't think index.c is the right place for
StatisticsGetRelation(). I appreciate that it is very similar to the
adjacent IndexGetRelation() function, but index.c is really only for
index-related code, so I think StatisticsGetRelation() should go in
statscmds.c

2). Perhaps the error message at statscmds.c:293 should read

   "expression cannot be used in multivariate statistics because its
type %s has no default btree operator class"

(i.e., add the word "multivariate", since the same expression *can* be
used in univariate statistics even though it has no less-than
operator).

3). The comment for ATExecAddStatistics() should probably mention that
"ALTER TABLE ADD STATISTICS" isn't a command in the grammar, in a
similar way to other similar functions, e.g.:

/*
 * ALTER TABLE ADD STATISTICS
 *
 * This is no such command in the grammar, but we use this internally to add
 * AT_ReAddStatistics subcommands to rebuild extended statistics after a table
 * column type change.
 */

4). The comment at the start of ATPostAlterTypeParse() needs updating
to mention CREATE STATISTICS statements.

5). I think ATPostAlterTypeParse() should also attempt to preserve any
COMMENTs attached to statistics objects, i.e., something like:

--- src/backend/commands/tablecmds.c.orig2021-03-26 10:39:38.328631864 +
+++ src/backend/commands/tablecmds.c2021-03-26 10:47:21.042279580 +
@@ -12619,6 +12619,9 @@
 CreateStatsStmt  *stmt = (CreateStatsStmt *) stm;
 AlterTableCmd *newcmd;

+/* keep the statistics object's comment */
+stmt->stxcomment = GetComment(oldId, StatisticExtRelationId, 0);
+
 newcmd = makeNode(AlterTableCmd);
 newcmd->subtype = AT_ReAddStatistics;
 newcmd->def = (Node *) stmt;

6). Comment typo at extended_stats.c:2532 - s/statitics/statistics/

7). I don't think that the big XXX comment near the start of
estimate_multivariate_ndistinct() is really relevant anymore, now that
the code has been simplified and we no longer extract Vars from
expressions, so perhaps it can just be deleted.

Regards,
Dean




Re: pgbench - add pseudo-random permutation function

2021-03-30 Thread Dean Rasheed
On Mon, 22 Mar 2021 at 13:43, Dean Rasheed  wrote:
>
> On Sun, 14 Mar 2021 at 16:08, Fabien COELHO  wrote:
> >
> > > My main question on this now is, do you have a scholar reference for
> > > this algorithm?
> >
> > Nope, otherwise I would have put a reference. I'm a scholar though, if
> > it helps:-)
> >
> > I could not find any algorithm that fitted the bill. The usual approach
> > (eg benchmarking designs) is too use some hash function and assume that it
> > is not a permutation, too bad.
> >
> > Basically the algorithm mimics a few rounds of cryptographic encryption
> > adapted to any size and simple operators, whereas encryption function are
> > restricted to power of two blocks (eg the Feistel network). The structure
> > is the same AES with its AddRoundKey the xor-ing stage (adapted to non
> > power of two in whiten above), MixColumns which does the scattering, and
> > for key expansion, I used Donald Knuth generator. Basically one could say
> > that permute is an inexpensive and insecure AES:-)
> >
>
> I spent a little time playing around with this, trying to come up with
> a reasonable way to test it.

I spent more time testing this -- the previous testing was mostly for
large values of size, so I decided to also test it for small sizes.
The theoretical number of possible permutations grows rapidly with
size (size!), and the actual number of permutations observed grew
almost as quickly:

 size      size!   distinct perms
    2          2                2
    3          6                6
    4         24               24
    5        120              120
    6        720              720
    7       5040             5040
    8      40320            24382
    9     362880           181440
   10    3628800          3627355

My test script stopped once the first permutation had been seen 100
times, so it might have seen more permutations had it kept going for
longer.
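
For anyone wanting to reproduce this kind of test, the counting harness is
conceptually something like the following (a rough sketch only: permute() is
just a stand-in for the function under test, and the sequential seeds are
purely for illustration -- the real script used many random seeds):

#include <stdio.h>
#include <stdlib.h>

/* stand-in for the permutation function under test */
extern long permute(long data, long size, long seed);

/* rank of a permutation of 0..size-1 in [0, size!), via its Lehmer code */
static long
perm_rank(const long *perm, int size)
{
    long    rank = 0;

    for (int i = 0; i < size; i++)
    {
        long    smaller = 0;

        for (int j = i + 1; j < size; j++)
            if (perm[j] < perm[i])
                smaller++;
        rank = rank * (size - i) + smaller;
    }
    return rank;
}

int main(void)
{
    int     size = 8;
    long    nperms = 1;
    long    perm[16];
    long    distinct = 0;
    long   *counts;

    for (int i = 2; i <= size; i++)
        nperms *= i;
    counts = calloc(nperms, sizeof(long));

    for (long seed = 0; seed < 1000000; seed++)
    {
        for (long i = 0; i < size; i++)
            perm[i] = permute(i, size, seed);
        if (counts[perm_rank(perm, size)]++ == 0)
            distinct++;
    }
    printf("size %d: %ld distinct permutations seen out of %ld\n",
           size, distinct, nperms);
    free(counts);
    return 0;
}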

However, looking at the actual output, there is a very significant
non-uniformity in the probability of particular permutations being
chosen, especially at certain sizes like 8 and 10. For example, at
size = 8, the identity permutation is significantly more likely than
almost any other permutation (roughly 13 times more likely than it
would be by random chance). Additionally, the signs are that this
non-uniformity tends to increase with size. At size = 10, the average
number of occurrences of each permutation in the test was 148, but
there were some that didn't appear at all, many that only appeared 2
or 3 times, and then some that appeared nearly 5000 times.

Also, there appears to be an unfortunate dependence on the seed -- if
the seed is even and the size is a power of 2, it looks like the
number of distinct permutations produced is just size/2, which is a
tiny fraction of size!.

This, together with the results from the previous K-S uniformity tests
at larger sizes, suggests that the current algorithm may not be random
enough to scatter values and remove correlations from a non-uniform
distribution.

In an attempt to do better, I tweaked the algorithm in the attached
patch, which makes use of erand48() to inject more randomness into the
permutation choice. Repeating the tests with the updated algorithm
shows that it has a number of advantages:

1). For small sizes (2-10), each of the size! possible permutations is
now produced with roughly equal probability. No single permutation was
much more likely than expected.

2). The loss of randomness for even seeds is gone.

3). For any given input, the output is more uniformly randomly
distributed for different seeds.

4). For any pair of nearby inputs, the corresponding outputs are more
uniformly randomly separated from one another.

5). The updated algorithm no longer uses modular_multiply(), so it
works the same on all platforms.

I still feel slightly uneasy about inventing our own algorithm here,
but I wasn't able to find anything else suitable, and I've now tested
this quite extensively. All the indications are that this updated
algorithm works well at all sizes and produces sufficiently random
results, though if anyone else can think of other approaches to
testing the core algorithm, that would be useful. For reference, I
attach 2 standalone test C programs I used for testing, which include
copies of the old and new algorithms.

I also did a quick copy editing pass over the docs, and I think the
patch is in pretty good shape. Barring objections, I'll see about
committing it later this week.

Regards,
Dean
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
new file mode 100644
index 50cf22b..84d9566
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1057,7 +1057,7 @@ pgbench  options<
 
   
 default_seed 
-   seed used in hash functions by default
+   seed used in hash and pseudorandom permutation functions by default
   
 
   
@@ -1866,6 +1866,24 @@ SELECT 

Re: pgbench - add pseudo-random permutation function

2021-03-30 Thread Dean Rasheed
On Tue, 30 Mar 2021 at 19:26, Fabien COELHO  wrote:
>
> First, I have a thing against erand48.

Yeah, that's probably a fair point. However, all the existing pgbench
random functions are using it, so I think it's fair enough for
permute() to do the same (and actually 2^48 is pretty huge). Switching
to a 64-bit PRNG might not be a bad idea, but I think that's something
we'd want to do across the board, and so I think it should be out of
scope for this patch.

> Second, I have a significant reservation about the very structure of the
> transformation in this version:
>
>loop 4 times :
>
>  // FIRST HALF STEER
>  m/r = pseudo randoms
>  if v in first "half"
> v = ((v * m) ^ r) & mask;
> rotate1(v)
>
>  // FULL SHIFT 1
>  r = pseudo random
>  v = (v + r) % size
>
>  // SECOND HALF STEER
>  m/r = pseudo randoms
>  if v in second "half"
> same as previous on second half
>
>  // FULL SHIFT 2
>  r = pseudo random
>  v = (v + r) % size
>
> I'm really at odds with FULL SHIFT 1, because it means that up to 1/256 of
> values are kept out of STEERING. Whole chunks of values could be kept
> unshuffled because they would only have SHIFTS apply to them and each time
> fall in the not steered half. It should be an essential part of the design
> that at least one steer is applied on a value at each round, and if two
> are applied then fine, but certainly not zero. So basically I think that
> the design would be significantly improved by removing "FULL SHIFT 1".

Ah, that's a good point. Something else that also concerned me there
was that it might lead to 2 consecutive full shifts with nothing in
between, which would lead to less uniform randomness (like the
Irwin-Hall distribution).

I just did a quick test without the first full shift, and the results
do appear to be better, so removing that looks like a good idea.

> Third, I think that the rotate code can be simplified, in particular the
> ?: should be avoided because it may induce branches quite damaging to
> processor performance.

Yeah, I wondered about that. Perhaps there's a "trick" that can be
used to simplify it. Pre-computing the number of bits in the mask
would probably help. I'll give it some thought.

Regards,
Dean




Re: pgbench - add pseudo-random permutation function

2021-03-30 Thread Dean Rasheed
On Tue, 30 Mar 2021 at 20:31, Dean Rasheed  wrote:
>
> Yeah, that's probably a fair point. However, all the existing pgbench
> random functions are using it, so I think it's fair enough for
> permute() to do the same (and actually 2^48 is pretty huge). Switching
> to a 64-bit PRNG might not be a bad idea, but I think that's something
> we'd want to do across the board, and so I think it should be out of
> scope for this patch.
>

Of course the immediate counter-argument to changing the existing
random functions would be that doing so would break lots of people's
tests, and no one would thank us for that. Still, I think that, since
the existing random functions use a 48-bit PRNG, it's not unreasonable
for permute() to do the same.

Regards,
Dean




Re: pgbench - add pseudo-random permutation function

2021-03-31 Thread Dean Rasheed
On Wed, 31 Mar 2021 at 09:02, Fabien COELHO  wrote:
>
> >> First, I have a thing against erand48.
> >
> Also, there is a 64 bits seed provided to the function which instantly
> ignores 16 of them, which looks pretty silly to me.
>

Yeah, that was copied from set_random_seed().

> At least, I suggest that two 48-bits prng could be initialized with parts
> of the seed and used in different places, eg for r & m.
>

That could work. I'd certainly feel better about that than
implementing a whole new PRNG.

> Also, the seed could be used to adjust the rotation, maybe.
>

Perhaps. I'm not sure it's really necessary though.

> >> I'm really at odds with FULL SHIFT 1, because it means that up to 1/256 of
> >> values are kept out of STEERING. [...]
> >
> > Ah, that's a good point. Something else that also concerned me there was
> > that it might lead to 2 consecutive full shifts with nothing in between,
> > which would lead to less uniform randomness (like the Irwin-Hall
> > distribution). I just did a quick test without the first full shift, and
> > the results do appear to be better,
>
> Indeed, it makes sense to me.
>

OK, attached is an update making this change and simplifying the
rotate code, which hopefully just leaves the question of what (if
anything) to do with pg_erand48().

> >> Third, I think that the rotate code can be simplified, in particular
> >> the ?: should be avoided because it may induce branches quite damaging
> >> to processor performance.
> >
> > Yeah, I wondered about that. Perhaps there's a "trick" that can be
> > used to simplify it. Pre-computing the number of bits in the mask
> > would probably help.
>
> See pg_popcount64().
>

Actually, I used pg_leftmost_one_pos64() to calculate the mask length,
allowing the mask to be computed from that, so there is no longer a
need for compute_mask(), which seems like a neat little
simplification.
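
For reference, the essence of that simplification is something like this (a
sketch only -- pg_leftmost_one_pos64() here stands in for the helper of the
same name in PostgreSQL's pg_bitutils.h, and the rotate is just one possible
branch-free formulation, not necessarily the exact code in the patch):

#include <stdint.h>

/* stand-in for PostgreSQL's pg_leftmost_one_pos64(): 0-based position of
 * the most significant set bit (word must be non-zero) */
extern int pg_leftmost_one_pos64(uint64_t word);

/* Compute the mask covering size-1 directly from the mask length
 * (size > 1 assumed; size == 1 is handled separately in permute()). */
static void
make_mask(uint64_t size, int *masklen, uint64_t *mask)
{
    *masklen = pg_leftmost_one_pos64(size - 1) + 1;
    *mask = (((uint64_t) 1) << *masklen) - 1;
}

/* Branch-free left-rotate of the low masklen bits of v (assumes v <= mask) */
static uint64_t
rotl_masked(uint64_t v, int masklen, uint64_t mask)
{
    return ((v << 1) | (v >> (masklen - 1))) & mask;
}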

Regards,
Dean
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
new file mode 100644
index 50cf22b..84d9566
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1057,7 +1057,7 @@ pgbench  options<
 
   
 default_seed 
-   seed used in hash functions by default
+   seed used in hash and pseudorandom permutation functions by default
   
 
   
@@ -1866,6 +1866,24 @@ SELECT 4 AS four \; SELECT 5 AS five \as
 
   

+permute ( i, size [, seed ] )
+integer
+   
+   
+Permuted value of i, in the range
+[0, size).  This is the new position of
+i (modulo size) in a
+pseudorandom permutation of the integers 0...size-1,
+parameterized by seed.
+   
+   
+permute(0, 4)
+an integer between 0 and 3
+   
+  
+
+  
+   
 pi ()
 double

@@ -2071,29 +2089,70 @@ f(x) = PHI(2.0 * parameter * (x - mu) /
 

 
+   
+
+  When designing a benchmark which selects rows non-uniformly, be aware
+  that the rows chosen may be correlated with other data such as IDs from
+  a sequence or the physical row ordering, which may skew performance
+  measurements.
+
+
+  To avoid this, you may wish to use the permute
+  function, or some other additional step with similar effect, to shuffle
+  the selected rows and remove such correlations.
+
+   
+
   
 Hash functions hash, hash_murmur2 and
 hash_fnv1a accept an input value and an optional seed parameter.
 In case the seed isn't provided the value of :default_seed
 is used, which is initialized randomly unless set by the command-line
--D option. Hash functions can be used to scatter the
-distribution of random functions such as random_zipfian or
-random_exponential. For instance, the following pgbench
-script simulates possible real world workload typical for social media and
-blogging platforms where few accounts generate excessive load:
+-D option.
+  
+
+  
+permute accepts an input value, a size, and an optional
+seed parameter.  It generates a pseudorandom permutation of integers in
+the range [0, size), and returns the index of the input
+value in the permuted values.  The permutation chosen is parameterized by
+the seed, which defaults to :default_seed, if not
+specified.  Unlike the hash functions, permute ensures
+that there are no collisions or holes in the output values.  Input values
+outside the interval are interpreted modulo the size.  The function raises
+an error if the size is not positive.  permute can be
+used to scatter the distribution of non-uniform random functions such as
+random_zipfian or random_exponential
+so that values drawn more often are n

Re: pgbench - add pseudo-random permutation function

2021-04-01 Thread Dean Rasheed
On Wed, 31 Mar 2021 at 18:53, Fabien COELHO  wrote:
>
> While looking at it, I have some doubts on this part:
>
>   m = (uint64) (pg_erand48(random_state.xseed) * (mask + 1)) | 1;
>   r = (uint64) (pg_erand48(random_state.xseed) * (mask + 1));
>   r = (uint64) (pg_erand48(random_state.xseed) * size);
>
> I do not understand why the random values are multiplied by anything in
> the first place…
>

These are just random integers in the range [0,mask] and [0,size-1],
formed in exactly the same way as getrand().

> This one looks like a no-op :
>
> r = (uint64) (pg_erand48(random_state.xseed) * size);
> v = (v + r) % size;
>
> v = (v + r) % size
>   = (v + rand * size) % size
>   =? (v % size + rand * size % size) % size
>   =? (v % size + 0) % size
>   = v % size
>   = v
>

rand * size % size is not zero because rand is a floating point number
in the range [0,1), so actually rand * size % size = rand * size.
Similarly in the other case, you're forgetting that rand is not an
integer.
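
To make that concrete with actual numbers (a toy example, using a fixed
value in place of the erand48() result):

#include <stdio.h>

int main(void)
{
    unsigned long size = 10;
    unsigned long v = 7;
    double  d = 0.73;       /* stands in for erand48(), always in [0,1) */

    /* r is an integer in [0, size-1], not a multiple of size, so the
     * subsequent (v + r) % size is not a no-op */
    unsigned long r = (unsigned long) (d * size);   /* 7 */
    unsigned long vnew = (v + r) % size;            /* (7 + 7) % 10 = 4 */

    printf("r = %lu, v: %lu -> %lu\n", r, v, vnew);
    return 0;
}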

Thinking more about our use of erand48(), the only real impact it has
is to limit the number of possible permutations produced, and actually
2^48 is so huge (roughly 281 million million) that I can't ever see
that being an issue in practice. (In a quick dummy test, replacing
erand48() with a silly "erand8()" function that only returned one of
256 distinct values, permute() still worked fine at any size, except
for the fact that only up to 256 distinct permutations were produced.
In other words, limitations on the source of randomness don't prevent
it from producing permutations of any size, they just limit the number
of distinct permutations possible. And since 2^48 is so big, that
shouldn't be an issue.)

Also, I think the source of the input seed is most likely to be either
manually hand-picked integers or pgbench's own random() function, so
the only real issue I can see is that by ignoring the upper 16-bits,
there's a very small chance of always using the same random sequence
if some hand-picked numbers only vary in the top 16 bits, though I
think that's highly unlikely in practice.

Nonetheless, it's not much more effort to make another random state
and use those remaining bits of the seed and get more internal random
states, so here's an update doing that. I intentionally chose to reuse
the lower 16 bits of the seed in the second random function (in a
different slot of the random state), since those are probably the ones
most likely to vary in practice.

This doesn't actually make any measurable difference to any of the
tests, but it closes that potential loophole of ignoring part of the
seed. In all my tests, the biggest improvement was between v23 and v24
of the patch. By comparison, the later versions have been relatively
small improvements, and it's probably now "random enough" for the
intended purposes.

Regards,
Dean
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
new file mode 100644
index 50cf22b..84d9566
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1057,7 +1057,7 @@ pgbench  options<
 
   
 default_seed 
-   seed used in hash functions by default
+   seed used in hash and pseudorandom permutation functions by default
   
 
   
@@ -1866,6 +1866,24 @@ SELECT 4 AS four \; SELECT 5 AS five \as
 
   

+permute ( i, size [, seed ] )
+integer
+   
+   
+Permuted value of i, in the range
+[0, size).  This is the new position of
+i (modulo size) in a
+pseudorandom permutation of the integers 0...size-1,
+parameterized by seed.
+   
+   
+permute(0, 4)
+an integer between 0 and 3
+   
+  
+
+  
+   
 pi ()
 double

@@ -2071,29 +2089,70 @@ f(x) = PHI(2.0 * parameter * (x - mu) /
 

 
+   
+
+  When designing a benchmark which selects rows non-uniformly, be aware
+  that the rows chosen may be correlated with other data such as IDs from
+  a sequence or the physical row ordering, which may skew performance
+  measurements.
+
+
+  To avoid this, you may wish to use the permute
+  function, or some other additional step with similar effect, to shuffle
+  the selected rows and remove such correlations.
+
+   
+
   
 Hash functions hash, hash_murmur2 and
 hash_fnv1a accept an input value and an optional seed parameter.
 In case the seed isn't provided the value of :default_seed
 is used, which is initialized randomly unless set by the command-line
--D option. Hash functions can be used to scatter the
-distribution of random functions such as random_zipfian or
-random_exponential. For instance, the following pgbench
-script simulates possible real w

Re: pgbench - add pseudo-random permutation function

2021-04-04 Thread Dean Rasheed
I did notice some slight non-uniformity in the
way adjacent inputs were separated from one another when the size was just
under a power of 2. I think that's the hardest case for this
algorithm, because there's very little overlap between the 2 halves.
Increasing the number of rounds from 4 to 6 ironed out that
non-uniformity (and as mentioned above, is still cheaper than using
randu64() with 4 rounds), so I think we should go with that.

> I still think that *rand48 is a poor (relatively small state) and
> inefficient (the implementation includes packing and unpacking 16 bits
> ints to build a 64 bits int) choice.
>

I can somewhat understand your dislike of *rand48(), but in the
context of pgbench I think that it is perfectly adequate. Since we now
use 2 RandomStates, I don't think the state size is an issue anymore,
if it ever really was. Using erand48() gives better performance than
jrand48() because it returns 48 bits rather than 32, so fewer calls
are needed, which allows more rounds for the same cost. Additionally,
following the same pattern as existing code reduces the risk of bugs,
and builds on proven, tested code.

You may wish to submit a separate patch to replace pgbench's use of
*rand48() with something else, and that would be discussed on its own
merits, but I don't see why that should hold up adding permute().

Regards,
Dean




Re: pgbench - add pseudo-random permutation function

2021-04-06 Thread Dean Rasheed
On Mon, 5 Apr 2021 at 13:07, Fabien COELHO  wrote:
>
> Attached a v28 which I hope fixes the many above issues and stays with
> ints. The randu64 is still some kind of a joke, I artificially reduced the
> cost by calling jrand48 once and extending it to 64 bits, so it could give
> an idea of the cost endured if a 64-bit prng was used.
>
> Now you are the committer, you can do as you please, I'm just stating my
> (mathematical) opinions about using floating point computations for that.
> I think that apart from this point of principle/philosophy the permute
> performance and implementation are reasonable, and better than my initial
> version because it avoids int128 computations and the large prime number
> business.
>

Pushed.

I decided not to go with the "joke" randu64() function, but instead
used getrand() directly. This at least removes any *direct* use of
doubles in permute() (though of course they're still used indirectly),
and means that if getrand() is improved in the future, permute() will
benefit too.

Regards,
Dean




Re: Wrong results from in_range() tests with infinite offset

2020-07-21 Thread Dean Rasheed
On Tue, 21 Jul 2020 at 03:06, Tom Lane  wrote:
>
> Pushed, but I chickened out of back-patching.  The improvement in what
> happens for finite comparison values seems somewhat counterbalanced by
> the possibility that someone might not like the definition we arrived
> at for infinities.  So, it's not quite an open-and-shut bug fix, so
> I just put it in HEAD (for now anyway).
>

Yeah, that seems fair enough, and it's quite an obscure corner-case
that has gone unnoticed for quite some time.

Regards,
Dean




Re: Infinities in type numeric

2020-07-22 Thread Dean Rasheed
On Tue, 21 Jul 2020 at 23:18, Tom Lane  wrote:
>
> Here's a v4 that syncs numeric in_range() with the new behavior of
> float in_range(), and addresses your other comments too.
>

LGTM.

Regards,
Dean




Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-26 Thread Dean Rasheed
On 18 March 2018 at 23:57, Tomas Vondra  wrote:
> Attached is an updated version of the patch series, addressing issues
> pointed out by Alvaro.

I'm just starting to look at this now, and I think I'll post
individual comments/questions as I get to them rather than trying to
review the whole thing, because it's quite a large patch. Apologies if
some of this has already been discussed.

Looking at the changes to UpdateStatisticsForTypeChange():

+   memset(nulls, 1, Natts_pg_statistic_ext * sizeof(bool));

why the "1" there -- is it just a typo?

A wider concern I have is that I think this function is trying to be
too clever by only resetting selected stats. IMO it should just reset
all stats unconditionally when the column type changes, which would be
consistent with what we do for regular stats.

Consider, for example, what would happen if a column was changed from
real to int -- all the data values will be coerced to integers, losing
precision, and any ndistinct and dependency stats would likely be
completely wrong afterwards. IMO that's a bug, and should be
back-patched independently of these new types of extended stats.

Thoughts?

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-26 Thread Dean Rasheed
On 26 March 2018 at 14:08, Tomas Vondra  wrote:
> On 03/26/2018 12:31 PM, Dean Rasheed wrote:
>> A wider concern I have is that I think this function is trying to be
>> too clever by only resetting selected stats. IMO it should just reset
>> all stats unconditionally when the column type changes, which would
>> be consistent with what we do for regular stats.
>>
> The argument a year ago was that it's more plausible that the semantics
> remains the same. I think the question is how the type change affects
> precision - had the type change in the opposite direction (int to real)
> there would be no problem, because both ndistinct and dependencies would
> produce the same statistics.
>
> In my experience people are far more likely to change data types in a
> way that preserves precision, so I think the current behavior is OK.

Hmm, I don't really buy that argument. Altering a column's type allows
the data in it to be rewritten in arbitrary ways, and I don't think we
should presume that the statistics will still be valid just because
the user *probably* won't do something that changes the data much.

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-26 Thread Dean Rasheed
On 18 March 2018 at 23:57, Tomas Vondra  wrote:
> Attached is an updated version of the patch series, addressing issues
> pointed out by Alvaro.

I've just been reading the new code in
statext_clauselist_selectivity() and mcv_clauselist_selectivity(), and
I'm having a hard time convincing myself that it's correct.

This code in statext_clauselist_selectivity() looks a bit odd:

/*
 * Evaluate the MCV selectivity. See if we got a full match and the
 * minimal selectivity.
 */
if (stat->kind == STATS_EXT_MCV)
s1 = mcv_clauselist_selectivity(root, stat, clauses, varRelid,
jointype, sjinfo, rel,
&fullmatch, &mcv_lowsel);

/*
 * If we got a full equality match on the MCV list, we're done (and the
 * estimate is likely pretty good).
 */
if (fullmatch && (s1 > 0.0))
return s1;

/*
 * If it's a full match (equalities on all columns) but we haven't found
 * it in the MCV, then we limit the selectivity by frequency of the last
 * MCV item. Otherwise reset it to 1.0.
 */
if (!fullmatch)
mcv_lowsel = 1.0;

return Min(s1, mcv_lowsel);

So if fullmatch is true and s1 is greater than 0, it will return s1.
If fullmatch is true and s1 equals 0, it will return Min(s1,
mcv_lowsel) which will also be s1. If fullmatch is false, mcv_lowsel
will be set to 1 and it will return Min(s1, mcv_lowsel) which will
also be s1. So it always just returns s1, no? Maybe there's no point
in computing fullmatch.

Also, wouldn't mcv_lowsel potentially be a significant overestimate
anyway? Perhaps 1 minus the sum of the MCV frequencies might be
closer, but even that ought to take into account the number of
distinct values remaining, although that information may not always be
available.

Also, just above that, in statext_clauselist_selectivity(), it
computes the list stat_clauses, then doesn't appear to use it
anywhere. I think that would have been the appropriate thing to pass
to mcv_clauselist_selectivity(). Otherwise, passing unrelated clauses
into mcv_clauselist_selectivity() will cause it to fail to find any
matches and then underestimate.

I've also come across a few incorrect/out-of-date comments:

/*
 * mcv_clauselist_selectivity
 *  Return the estimated selectivity of the given clauses using MCV list
 *  statistics, or 1.0 if no useful MCV list statistic exists.
 */

-- I can't see any code path that returns 1.0 if there are no MCV
stats. The last part of that comment is probably more appropriate to
statext_clauselist_selectivity()


/*
 * mcv_update_match_bitmap
 * [snip]
 * The function returns the number of items currently marked as 'match', and
 * ...

-- it doesn't seem to return the number of items marked as 'match'.

Then inside that function, this comment is wrong (copied from the
preceding comment):

/* AND clauses assume nothing matches, initially */
memset(bool_matches, STATS_MATCH_FULL, sizeof(char) *
mcvlist->nitems);

Still reading...

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-26 Thread Dean Rasheed
On 26 March 2018 at 20:17, Tomas Vondra  wrote:
> On 03/26/2018 09:01 PM, Dean Rasheed wrote:
>> Also, just above that, in statext_clauselist_selectivity(), it
>> computes the list stat_clauses, then doesn't appear to use it
>> anywhere. I think that would have been the appropriate thing to pass
>> to mcv_clauselist_selectivity(). Otherwise, passing unrelated clauses
>> into mcv_clauselist_selectivity() will cause it to fail to find any
>> matches and then underestimate.
>
> Will check.
>

Here's a test case demonstrating this bug:

drop table if exists foo;
create table foo(a int, b int, c int);
insert into foo select 0,0,0 from generate_series(1,10);
insert into foo select 1,1,1 from generate_series(1,1);
insert into foo select 2,2,2 from generate_series(1,1000);
insert into foo select 3,3,3 from generate_series(1,100);
insert into foo select x,x,x from generate_series(4,1000) g(x);
insert into foo select x,x,x from generate_series(4,1000) g(x);
insert into foo select x,x,x from generate_series(4,1000) g(x);
insert into foo select x,x,x from generate_series(4,1000) g(x);
insert into foo select x,x,x from generate_series(4,1000) g(x);
analyse foo;
explain analyse select * from foo where a=1 and b=1 and c=1;
create statistics foo_mcv_ab (mcv) on a,b from foo;
analyse foo;
explain analyse select * from foo where a=1 and b=1 and c=1;

With the multivariate MCV statistics, the estimate gets worse because
it passes the c=1 clause to mcv_clauselist_selectivity(), and nothing
matches.

There's also another bug, arising from the fact that
statext_is_compatible_clause() says that NOT clauses are supported,
but mcv_clauselist_selectivity() doesn't support them. So with the
above table:

select * from foo where (a=0 or b=0) and not (b in (1,2));
ERROR:  unknown clause type: 111

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-27 Thread Dean Rasheed
On 27 March 2018 at 01:36, Tomas Vondra  wrote:
> 4) handling of NOT clauses in MCV lists (and in histograms)
>
> The query you posted does not fail anymore...
>

Ah, it turns out the previous query wasn't actually failing for the
reason I thought it was -- it was failing because it had a
ScalarArrayOpExpr that was being passed to
mcv_clauselist_selectivity() because of the wrong list being passed to
it. I could see from the code that a NOT clause would have tripped it
up, but most NOT clauses actually get rewritten by negate_clause() so
they end up not being NOT clauses.

One way to get a NOT clause is with a boolean column, and this
reveals another couple of problems:

drop table if exists foo;
create table foo(a int, b boolean);
insert into foo values(1,true);
insert into foo values(1,true);
insert into foo values(1,false);
create statistics foo_mcv_ab (mcv) on a,b from foo;
analyse foo;

select * from foo where a=1 and b;
ERROR:  unknown clause type: 99

This fails because the clause is now a Var, which
statext_is_compatible_clause() lets through, but
mcv_clauselist_selectivity() doesn't support. So it's important to
keep those 2 functions in sync, and it might be worth having comments
in each to emphasise that.

And, if a NOT clause is used:

select * from foo where a=1 and not b;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

This is an Assert failure in mcv_update_match_bitmap()'s BoolExpr
handling block:

Assert(bool_clauses != NIL);
Assert(list_length(bool_clauses) >= 2);

The first of those Asserts is actually redundant with the second, but
the second fails because a NOT clause always only has one argument.

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-27 Thread Dean Rasheed
On 27 March 2018 at 01:36, Tomas Vondra  wrote:
> BTW I think there's a bug in handling the fullmatch flag - it should not
> be passed to AND/OR subclauses the way it is, because then
>
> WHERE a=1 OR (a=2 AND b=2)
>
> will probably set it to 'true' because of (a=2 AND b=2). Which will
> short-circuit the statext_clauselist_selectivity, forcing it to ignore
> the non-MCV part.
>

I'm not sure that's true. Won't the outer call to
mcv_update_match_bitmap() overwrite the value of fullmatch returned by
the nested call, and set fullmatch to false because it has only seen 1
attribute equality match? I think that's the correct result, but I
think that's just luck.

The dubious part is the way fullmatch is calculated for OR clauses --
I think for an OR clause we want to know the attributes matched in
*every* subclause, rather than in *any* subclause, as we do for AND.
So I think the only way an OR clause at the top-level should return a
full match is if every sub-clause was a full match, for example:

  WHERE (a=1 AND b=2) OR (a=2 AND b=1)

But then consider this:

  WHERE a=1 AND (b=1 OR b=2)

That should also potentially be a full match, but that can only work
if mcv_update_match_bitmap() returned the set of matching attributes
(eqmatches), rather than fullmatch, so that it can be merged
appropriately in the caller. So for an OR clause, it needs to return
eqmatches containing the list of attributes for which every sub-clause
matched with equality against the MCV list, and in an outer AND clause
that can be added to the outer eqmatches list, which is the list of
attributes for which any sub-clause matched with equality.
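
To spell out the merging rule I have in mind (just a sketch, representing
the matched attributes as a plain bitmask rather than whatever structure the
patch ends up using):

/* bitmask of attributes matched with equality against the MCV list */
typedef unsigned int attr_set;

/* AND: an attribute counts if ANY subclause matched it with equality */
static attr_set
merge_and_subclause(attr_set acc, attr_set sub)
{
    return acc | sub;
}

/* OR: an attribute counts only if EVERY subclause matched it with equality */
static attr_set
merge_or_subclause(attr_set acc, attr_set sub)
{
    return acc & sub;
}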

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-27 Thread Dean Rasheed
On 27 March 2018 at 14:58, Dean Rasheed  wrote:
> On 27 March 2018 at 01:36, Tomas Vondra  wrote:
>> 4) handling of NOT clauses in MCV lists (and in histograms)
>>
>> The query you posted does not fail anymore...
>>
> Ah, it turns out the previous query wasn't actually failing for the
> reason I thought it was -- it was failing because it had a
> ScalarArrayOpExpr that was being passed to
> mcv_clauselist_selectivity() because of the wrong list being passed to
> it. I could see from the code that a NOT clause would have tripped it
> up, but most NOT clauses actually get rewritten by negate_clause() so
> they end up not being NOT clauses.
>

Thinking about that some, I think that the only NOT clauses this needs
to actually worry about are NOTs of boolean Vars. Anything else that
this code supports will have been transformed into something other
than a NOT before reaching this point. Thus it might be much simpler
to handle that as a special case in statext_is_compatible_clause() and
mcv_update_match_bitmap(), rather than trying to support general NOT
clauses, and going through a recursive call to
mcv_update_match_bitmap(), and then having to merge bitmaps. NOT of a
boolean Var could then be treated just like var=false, setting the
appropriate attribute match entry if it's found in the MCV list. This
would allow clauses like (a=1 and NOT b) to be supported, which I
don't think currently works, because fullmatch won't get set.

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-28 Thread Dean Rasheed
On 28 March 2018 at 01:34, Tomas Vondra  wrote:
> Attached is a patch fixing this. In the end I've decided to keep both
> branches - one handling boolean Vars and one for NOT clauses. I think
> you're right we can only see (NOT var) cases, but I'm not sure about that.
>
> For example, what if an operator does not have a negator? Then we can't
> transform NOT (a AND b) => (NOT a OR NOT b), I guess. So I kept this for
> now, and we can remove this later.
>

OK, but it's going to have to work harder to set "fullmatch"
correctly. If you have a boolean Var clause, which is identical to
"bool_var = true", it ought to add to "eqmatches" if true is found in
the MCV list. Likewise a boolean Var under a NOT clause is identical
to "bool_var = false", so it ought to add to "eqmatches" if false is
found in the MCV list. Both those cases would be easy to handle, if
general NOT support wasn't required, and you just special-cased "NOT
bool_var".

If you're going to handle the general case of an arbitrary clause
under a NOT, then the recursive call to mcv_update_match_bitmap()
would seem to need to know that it's under a NOT (a new "is_not"
parameter?), to invert the logic around adding to "eqmatches". That
applies to other general OpExpr's too -- for example, "NOT (box_var =
?)" won't be rewritten because there is no box_ne operator, but when
mcv_update_match_bitmap() is called recursively with the "box_var =
?", it shouldn't add to "eqmatches", despite this being an EQSEL
operator.

As mentioned before, I think this whole thing only works if
mcv_update_match_bitmap() returns the "eqmatches" list, so that if it
is called recursively, it can be merged with the caller's list. What
isn't immediately obvious to me is what happens to a NOT clause under
another NOT clause, possibly with an AND or OR in-between. Would the
"is_not" parameter just flip back to false again?

There's also an interesting question around the NullTest clause. Since
NULLs are being recorded in the MCV list, shouldn't "IS NULL" be
treated as semantically like an equality clause, and cause that
attribute to be added to "eqmatches" if NULL is found in the MCV list?


> I've also realized that the "fullmatch" flag is somewhat confused,
> because some places interpreted it as "there is equality on each
> attribute" but in fact it also required an actual MCV match.

Yes, I was having similar thoughts. I think "eqmatches" / "fullmatch"
probably just wants to track whether there was an exact comparison on
all the attributes, not whether or not the value was in the MCV list,
because the latter is already available in the "matches" bitmap.
Knowing that complete, exact comparisons happened, and it wasn't in
the MCV list, makes the "(1 - mcv_totalsel)) / otherdistinct" estimate
reasonable.

However, I don't think that tracking "eqmatches" or "fullmatch" is
sufficient for the general case. For example, for other operators like
"!=", "<", "<=", all (or maybe half) the "1 - mcv_totalsel" ought to
count towards the selectivity, plus possibly part of the MCV list
(e.g., for "<=", using the sum of the matching MCV frequencies plus
half the sum of the non-MCV frequencies might be reasonable -- c.f.
scalarineqsel()). For an OR clause, you might want to count the number
of non-MCV matches, because logically each one adds another "(1 -
mcv_totalsel)) / otherdistinct" to the total selectivity. It's not
immediately obvious how that can be made to fit into the current code
structure. Perhaps it could be made to work by tracking the overall
selectivity as it goes along. Or perhaps it could track the
count/proportion of non-MCV matches.

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-28 Thread Dean Rasheed
On 28 March 2018 at 15:50, Tomas Vondra  wrote:
> After thinking about this a bit more, I'm not sure if updating the info
> based on recursive calls makes sense. The fullmatch flag was supposed to
> answer a simple question - can there be just a single matching item?
>
> If there are equality conditions on all columns, there can be just a
> single matching item - if we have found it in the MCV (i.e. s1 > 0.0),
> then we don't need to inspect the non-MCV part.
>
> But handling this in recursive manner breaks this entirely, because with
> something like
>
>(a=1) AND (b=1 OR b=2)
>
> you suddenly can have multiple matching items. Which makes the fullmatch
> flag somewhat useless.
>
> So I think we should be looking at top-level equality clauses only, just
> like number_of_groups() does.
>

I'm not quite sure what you mean by that, but it sounds a bit limiting
in terms of the kinds of user queries that would be supported.


> I think we can remove the fullmatch flag from mcv_update_bitmap
> entirely. All we need to know is the presence of equality clauses and
> whether there was a match in MCV (which we know from s1 > 0.0).
>

I agree with removing the fullmatch flag, but I don't think we
actually need to know about the presence of equality clauses:

The way that mcv_update_bitmap() recursively computes the set of
matching MCVs seems to be correct. That gives us a value (call it
mcv_matchsel) for the proportion of the table's rows that are in the
MCV list and satisfy the clauses in stat_clauses.

We can also estimate that there are (1-mcv_totalsel)*N rows that are
not in the MCV list, for which the MCV stats therefore tell us
nothing. The best way to estimate those rows would seem to be to use
the logic from the guts of clauselist_selectivity(), without
consulting any extended MCV stats (but still consulting other extended
stats, I think). Doing that would return a selectivity value (call it
nonmcv_sel) for those remaining rows. Then a reasonable estimate for
the overall selectivity would seem to be

  mcv_matchsel + (1-mcv_totalsel) * nonmcv_sel

and there would be no need for mcv_update_bitmap() to track eqmatches
or return fullmatch, and it wouldn't actually matter whether or not we
had equality clauses or if all the MCV columns were used.
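
Expressed as code, the combination I'm suggesting is simply this (the names
are mine, not from the patch):

/*
 * mcv_matchsel  - frequency of MCV items matching the clauses
 * mcv_totalsel  - total frequency of all items in the MCV list
 * nonmcv_sel    - selectivity estimated for the non-MCV part of the table
 */
static double
combined_selectivity(double mcv_matchsel, double mcv_totalsel, double nonmcv_sel)
{
    double  s = mcv_matchsel + (1.0 - mcv_totalsel) * nonmcv_sel;

    /* clamp, to guard against rounding */
    if (s < 0.0)
        s = 0.0;
    if (s > 1.0)
        s = 1.0;
    return s;
}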

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-04-07 Thread Dean Rasheed
On 7 April 2018 at 15:12, Bruce Momjian  wrote:
> Uh, where are we on this patch?  It isn't going to make it into PG 11?
> Feature development for this has been going on for years.

Unfortunately, I think there's no way that this will be ready for
PG11. So far, I have only read parts of the patch, focusing mainly on
the planner changes, and how it will make use of the new statistics. I
think there are still issues to work out in that area, although I
haven't read the latest version yet, I have some doubts about the new
approach described.

Looking back at the history of this patch, it appears to have moved
through all 4 PG11 commitfests with fairly minimal review, never
becoming Ready for Committer. That's a real shame because I agree that
it's an important feature, but IMO it's not yet in a committable
state. I feel bad saying that, because the patch hasn't really had a
fair chance to-date, despite Tomas' efforts and quick responses to all
review comments.

At this stage though, all I can say is that I'll make every effort to
keep reviewing it, and I hope Tomas will persist, so that it has a
better chance in PG12.

Regards,
Dean



Re: BF failure: could not open relation with OID XXXX while querying pg_views

2019-09-15 Thread Dean Rasheed
On Sat, 14 Sep 2019 at 05:25, Tom Lane  wrote:
>
> Tomas Vondra  writes:
> > On Wed, Aug 14, 2019 at 05:24:26PM +1200, Thomas Munro wrote:
> >> On Wed, Aug 14, 2019 at 5:06 PM Tom Lane  wrote:
> >>> Oh, hmm --- yeah, that should mean it's safe.  Maybe somebody incautiously
> >>> changed one of the other tests that run concurrently with "rules"?
>
> >> Looks like stats_ext.sql could be the problem.  It creates and drops
> >> priv_test_view, not in a schema.  Adding Dean, author of commit
> >> d7f8d26d.
>
> > Yeah, that seems like it might be the cause. I'll take a look at fixing
> > this, probably by creating the view in a different schema.
>
> Ping?  We're still getting intermittent failures of this ilk, eg
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2019-09-14%2003%3A37%3A03
>
> With v12 release approaching, I'd like to not have failures
> like this in a released branch.
>

Ah sorry, I missed this thread before. As author of that commit, it's
really on me to fix it, and the cause seems pretty clear-cut, so I'll
aim to get that done today.

Regards,
Dean




Re: BF failure: could not open relation with OID XXXX while querying pg_views

2019-09-15 Thread Dean Rasheed
On Sun, 15 Sep 2019 at 11:11, Tomas Vondra  wrote:
>
> On Sun, Sep 15, 2019 at 10:16:30AM +0100, Dean Rasheed wrote:
> >
> >Ah sorry, I missed this thread before. As author of that commit, it's
> >really on me to fix it, and the cause seems pretty clear-cut, so I'll
> >aim to get that done today.
>
> FWIW here is a draft patch that I was going to propose - it simply moves
> the table+view into a "tststats" schema. I suppose that's rougly what we
> discussed earlier in this thread.
>

Yes, that matches what I just drafted.
Do you want to push it, or shall I?

Regards,
Dean




Re: BF failure: could not open relation with OID XXXX while querying pg_views

2019-09-15 Thread Dean Rasheed
On Sun, 15 Sep 2019 at 12:20, Tomas Vondra  wrote:
>
> On Sun, Sep 15, 2019 at 11:27:19AM +0100, Dean Rasheed wrote:
> >On Sun, 15 Sep 2019 at 11:11, Tomas Vondra  
> >wrote:
> >>
> >> On Sun, Sep 15, 2019 at 10:16:30AM +0100, Dean Rasheed wrote:
> >> >
> >> >Ah sorry, I missed this thread before. As author of that commit, it's
> >> >really on me to fix it, and the cause seems pretty clear-cut, so I'll
> >> >aim to get that done today.
> >>
> >> FWIW here is a draft patch that I was going to propose - it simply moves
> >> the table+view into a "tststats" schema. I suppose that's rougly what we
> >> discussed earlier in this thread.
> >>
> >
> >Yes, that matches what I just drafted.
> >Do you want to push it, or shall I?
>
> Please go ahead and push. I'm temporarily without commit access.
>

OK, pushed.

Regards,
Dean




Re: BUG #15293: Stored Procedure Triggered by Logical Replication is Unable to use Notification Events

2019-02-27 Thread Marc Dean
If you are trying to get around the issue for now, what my team did was
cron an insert statement on the database server. We have a queue table that
has some of these triggers setup so it was easy to write a no-op row to the
queue. This had the side effect of flushing the notification queue.

We haven't had any issues with this approach.

I can also volunteer to help test out patches.

Thanks,

-mdj

On Fri, Feb 22, 2019 at 11:37 AM Robert Welin  wrote:

> I have reproduced this bug on PostgreSQL version 10.7 and 11.2 using the
> steps described in Michael Powers' original report. The issue also still
> seems to be present even with the patch provided by Sergei Kornilov.
>
> Are there plans to address this issue any time soon or is there some way
> I can assist in fixing it? It would be great to have notifications from
> logical replication.
>
> Regards,
> Robert Welin
>
>


Re: BUG #15623: Inconsistent use of default for updatable view

2019-02-28 Thread Dean Rasheed
On Thu, 28 Feb 2019 at 07:47, Amit Langote
 wrote:
>
> +if (attrno == 0)
> +elog(ERROR, "Cannot set value in column %d to
> DEFAULT", i);
>
> Maybe: s/Cannot/cannot/g
>

Ah yes, you're right. That is the convention.


> +Assert(list_length(sublist) == numattrs);
>
> Isn't this Assert useless, because we're setting numattrs to
> list_length() and transformInsertStmt ensures that all
> sublists are same length?
>

Well possibly I'm being over-paranoid, but given that it may have come
via a previous invocation of rewriteValuesRTE() that may have
completely rebuilt the lists, it seemed best to be sure that it hadn't
done something unexpected. It's about to use that to read from the
attrnos array, so it might read beyond the array bounds if any of the
prior logic was faulty.

Thanks for looking.

Regards,
Dean



Re: Row Level Security − leakproof-ness and performance implications

2019-02-28 Thread Dean Rasheed
On Thu, 28 Feb 2019 at 14:13, Robert Haas  wrote:
> A wild idea might be to let
> proleakproof take on three values: yes, no, and maybe.  When 'maybe'
> functions are involved, we tell them whether or not the current query
> involves any security barriers, and if so they self-censor.
>

Does self-censoring mean that they might still throw an error for some
inputs, but that error won't reveal any information about the input
values? That's not entirely consistent with my understanding of the
definition of leakproof, but maybe there are situations where that
amount of information leakage would be OK. So maybe we could have
"strictly leakproof" functions that never throw errors and "weakly
leakproof" functions (needs a better name) that can throw errors, as
long as those errors don't include data values. Then we could allow
strict and weak security barriers on a per-table basis, depending on
how sensitive the data is in each table (I'm not a fan of using GUCs
to control this).

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-09 Thread Dean Rasheed
On Thu, 28 Feb 2019 at 19:56, Tomas Vondra  wrote:
> Attached is an updated version of this patch series.

Here are some random review comments. I'll add more later, but I'm out
of energy for today.

1). src/test/regress/expected/type_sanity.out has bit-rotted.

2). Duplicate OIDs (3425).

3). It looks a bit odd that clauselist_selectivity() calls
statext_clauselist_selectivity(), which does MCV stats and will do
histograms, but it doesn't do dependencies, so
clauselist_selectivity() has to then separately call
dependencies_clauselist_selectivity(). It would seem neater if
statext_clauselist_selectivity() took care of calling
dependencies_clauselist_selectivity(), since dependencies are just
another kind of extended stats.

4). There are no tests for pg_mcv_list_items(). Given a table with a
small enough amount of data, so that it's all sampled, it ought to be
possible to get predictable MCV stats.
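
For example (just a sketch of the sort of thing I mean -- the exact output
columns of pg_mcv_list_items() are whatever the patch ends up defining; I'm
assuming here that it exposes the item values and their frequencies):

create table mcv_small (a int, b int);
insert into mcv_small select mod(i, 5), mod(i, 5) from generate_series(1, 100) i;
create statistics mcv_small_s (mcv) on a, b from mcv_small;
analyse mcv_small;
select m.values, m.frequency
  from pg_statistic_ext, pg_mcv_list_items(stxmcv) m
 where stxrelid = 'mcv_small'::regclass
 order by m.frequency desc;

With only 100 rows the whole table is sampled, so each of the 5 (a,b)
groups should come out with a frequency of exactly 0.2, making the expected
output stable.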

5). It's not obvious what some of the new test cases in the
"stats_ext" tests are intended to show. For example, the first test
creates a table with 5000 rows and a couple of indexes, does a couple
of queries, builds some MCV stats, and then repeats the queries, but
the results seem to be the same with and without the stats.

I wonder if it's possible to write smaller, more targeted tests.
Currently "stats_ext" is by far the slowest test in its group, and I'm
not sure that some of those tests add much. It ought to be possible to
write a function that calls EXPLAIN and returns a query's row
estimate, and then you could write tests to confirm the effect of the
new stats by verifying the row estimates change as expected.
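
Something like this, perhaps (a rough sketch only -- the helper name and
the JSON-parsing details are just for illustration, not a proposal for the
final test code):

create or replace function estimated_rows(q text) returns int as $$
declare
    plan json;
begin
    execute 'explain (format json) ' || q into plan;
    return (plan -> 0 -> 'Plan' ->> 'Plan Rows')::int;
end;
$$ language plpgsql;

-- e.g.
-- select estimated_rows('select * from some_table where a = 1 and b = 1');

Then the tests only need to compare the row estimates before and after
building the extended stats, rather than relying on a plan change.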

6). This enum isn't needed for MCVs:

/*
 * Degree of how much MCV item matches a clause.
 * This is then considered when computing the selectivity.
 */
#define STATS_MATCH_NONE     0   /* no match at all */
#define STATS_MATCH_PARTIAL  1   /* partial match */
#define STATS_MATCH_FULL     2   /* full match */

STATS_MATCH_PARTIAL is never used for MCVs, so you may as well just
use booleans instead of this enum. If those are needed for histograms,
they can probably be made local to histogram.c.

7). estimate_num_groups_simple() isn't needed in this patch.

8). In README.mcv,
s/clauselist_mv_selectivity_mcvlist/mcv_clauselist_selectivity/.

9). In the list of supported clause types that follows, (e) is the same
as (c), but with a more general description.

10). It looks like most of the subsequent description of the algorithm
is out of date and needs rewriting. All the stuff about full matches
and the use of ndistinct is now obsolete.

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-10 Thread Dean Rasheed
On Sat, 9 Mar 2019 at 18:33, Dean Rasheed  wrote:
>
> On Thu, 28 Feb 2019 at 19:56, Tomas Vondra  
> wrote:
> > Attached is an updated version of this patch series.
>
> Here are some random review comments. I'll add more later, but I'm out
> of energy for today.
>

Here are some more comments:

11). In dependency_degree():

-/* sort the items so that we can detect the groups */
-qsort_arg((void *) items, numrows, sizeof(SortItem),
-  multi_sort_compare, mss);
+/*
+ * build an array of SortItem(s) sorted using the multi-sort support
+ *
+ * XXX This relies on all stats entries pointing to the same tuple
+ * descriptor. Not sure if that might not be the case.
+ */
+items = build_sorted_items(numrows, rows, stats[0]->tupDesc,
+   mss, k, attnums_dep);

That XXX comment puzzled me for a while. Actually it's OK though,
unless/until we try to support stats across multiple relations, which
will require a much larger refactoring of this code. For now though,
the stats entries all point to the same tuple descriptor from the
onerel passed to BuildRelationExtStatistics(), so it's OK to just use
the first tuple descriptor in this way. The comment should be updated
to explain that.

12). bms_member_index() should surely be in bitmapset.c. It could be
more efficient by just traversing the bitmap words and making use of
bmw_popcount(). Also, its second argument should be of type 'int' for
consistency with other bms_* functions.

13). estimate_ndistinct() has been moved from mvdistinct.c to
extended_stats.c and changed from static to extern, but it is only
called from mvdistinct.c, so that change is unnecessary (at least as
far as this patch is concerned).

14). The attnums Bitmapset passed to
statext_is_compatible_clause_internal() is an input/output argument
that it updates. That should probably be documented. When it calls
itself recursively for AND/OR/NOT clauses, it could just pass the
original Bitmapset through to be updated (rather than creating a new
one and merging it), as it does for other types of clause.

On the other hand, the outer function statext_is_compatible_clause()
does need to return a new bitmap, which may or may not be used by its
caller, so it would be cleaner to make that a strictly "out" parameter
and initialise it to NULL in that function, rather than in its caller.

15). As I said yesterday, I don't think that there is a clean
separation of concerns between the functions clauselist_selectivity(),
statext_clauselist_selectivity(),
dependencies_clauselist_selectivity() and
mcv_clauselist_selectivity(). I think things could be rearranged as
follows:

statext_clauselist_selectivity() - as the name suggests - should take
care of *all* extended stats estimation, not just MCVs and histograms.
So make it a fairly small function, calling
mcv_clauselist_selectivity() and
dependencies_clauselist_selectivity(), and histograms when that gets
added.

Most of the current code in statext_clauselist_selectivity() is really
MCV-specific, so move that to mcv_clauselist_selectivity(). Amongst
other things, that would move the call to choose_best_statistics() to
mcv_clauselist_selectivity() (just as
dependencies_clauselist_selectivity() calls choose_best_statistics()
to get the best dependencies statistics). Then, when histograms are
added later, you won't have the problem pointed out before where it
can't apply both MCV and histogram stats if they're on different
STATISTICS objects.

Most of the comments for statext_clauselist_selectivity() are also
MCV-related. Those too would move to mcv_clauselist_selectivity().

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-10 Thread Dean Rasheed
On Sun, 10 Mar 2019 at 13:09, Dean Rasheed  wrote:
> Here are some more comments:
>

One more thing --- the comment for statext_clauselist_selectivity() says:

 * So (simple_selectivity - base_selectivity) may be seen as a correction for
 * the part not covered by the MCV list.

That's not quite right. It should really say that (simple_selectivity
- base_selectivity) is an estimate for the part not covered by the MCV
list, or that (mcv_selectivity - base_selectivity) is a correction for
the part covered by the MCV list. Those 2 statements are actually
equivalent, and different from what you wrote.

Perhaps the easiest way to see it is to work through a simple example:

Suppose you had the following clauses:

  a = 1 AND b >= 0 AND b <= 10

The per-column stats might be expected to give reasonable independent
estimates for the following 2 things:

  P(a = 1)

  P(b >= 0 AND b <= 10)  -- in general, greater than P(b >= 0) * P(b <= 10)

but the overall estimate produced by clauselist_selectivity_simple()
would then just be the product of those 2 things:

  simple_sel = P(a = 1) * P(b >= 0 AND b <= 10)

which might not be so good if the columns were correlated.

Now suppose you had MCV stats, which included MCV items for the
following specific values:

  (a=1,b=1), (a=1,b=2), (a=1,b=3)

but no other relevant MCV entries. (There might be lots of other MCV
items that don't match the original clauses, but they're irrelevant
for this discussion.) That would mean that we could get reasonable
estimates for the following 2 quantities:

  mcv_sel = P(a = 1 AND b IN (1,2,3))
  = P(a = 1 AND b = 1) + P(a = 1 AND b = 2) + P(a = 1 AND b = 3)

  mcv_basesel = base_freq(a = 1 AND b IN (1,2,3))
  = P(a = 1) * (P(b = 1) + P(b = 2) + P(b = 3))

So how is that useful? Well, returning to the quantity that we're
actually trying to compute, it can be split into MCV and non-MCV
parts, and since they're mutually exclusive possibilities, their
probabilities just add up. Thus we can write:

  P(a = 1 AND b >= 0 AND b <= 10)

= P(a = 1 AND b IN (1,2,3)) -- MCV part
+ P(a = 1 AND b >= 0 AND b <= 10 AND b NOT IN (1,2,3))  -- non-MCV part

= mcv_sel + other_sel

So the first term is easy -- it's just mcv_sel, from above. The second
term is trickier though, since we have no information about the
correlation between a and b in the non-MCV region. Just about the best
we can do is assume that they're independent, which gives:

  other_sel = P(a = 1 AND b >= 0 AND b <= 10 AND b NOT IN (1,2,3))
   ~= P(a = 1) * P(b >= 0 AND b <= 10 AND b NOT IN (1,2,3))

and that can now be written in terms of things that we know

  other_sel ~= P(a = 1) * P(b >= 0 AND b <= 10 AND b NOT IN (1,2,3))

 = P(a = 1) * P(b >= 0 AND b <= 10)
 - P(a = 1) * P(b IN (1,2,3))  -- mutually exclusive possibilities

 = simple_sel - mcv_basesel

So, as I said above, (simple_selectivity - base_selectivity) is an
estimate for the part not covered by the MCV list.


Another way to look at it is to split the original per-column estimate
up into MCV and non-MCV parts, and correct the MCV part using the MCV
stats:

  simple_sel = P(a = 1) * P(b >= 0 AND b <= 10)

 = P(a = 1) * P(b IN (1,2,3))
 + P(a = 1) * P(b >= 0 AND b <= 10 AND b NOT IN (1,2,3))

The first term is just mcv_basesel, so we can define other_sel to be
the other term, giving

  simple_sel = mcv_basesel  -- MCV part
 + other_sel-- non-MCV part

Clearly mcv_basesel isn't the best estimate for the MCV part, and it
should really be mcv_sel, so we can improve upon simple_sel by
applying a correction of (mcv_sel - mcv_basesel) to it:

  better estimate = simple_sel + (mcv_sel - mcv_basesel)
  = mcv_sel + other_sel

(where other_sel = simple_sel - mcv_basesel)

Of course, that's totally equivalent, but looking at it this way
(mcv_selectivity - base_selectivity) can be seen as a correction for
the part covered by the MCV list.


All of that generalises to arbitrary clauses, because the matching
items in the MCV list are independent possibilities that sum up, and
the MCV and non-MCV parts are mutually exclusive. That's also why the
basesel calculation in mcv_clauselist_selectivity() must only include
matching MCV items, and the following XXX comment is wrong:

+   for (i = 0; i < mcv->nitems; i++)
+   {
+   *totalsel += mcv->items[i]->frequency;
+
+   if (matches[i] != STATS_MATCH_NONE)
+   {
+   /* XXX Shouldn't the basesel be outside the if condition? */
+   *basesel += mcv->items[i]->base_frequency;
+   s += mcv->items[i]->frequency;
+   }
+   }

So I believe that that code is correct, as written.

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-11 Thread Dean Rasheed
On Sun, 10 Mar 2019 at 17:36, Tomas Vondra  wrote:
> On 3/10/19 2:09 PM, Dean Rasheed wrote:
> > 14). The attnums Bitmapset passed to
> > statext_is_compatible_clause_internal() is an input/output argument
> > that it updates. That should probably be documented. When it calls
> > itself recursively for AND/OR/NOT clauses, it could just pass the
> > original Bitmapset through to be updated (rather than creating a new
> > one and merging it), as it does for other types of clause.
>
> I don't think it's really possible, because the AND/OR/NOT clause is
> considered compatible only when all the pieces are compatible. So we
> can't tweak the original bitmapset directly in case the incompatible
> clause is not the very first one.
>

In the case where the overall clause is incompatible, you don't
actually care about the attnums returned. Right now it will return an
empty set (NULL). With this change it would return all the attnums
encountered before the incompatible piece, but that wouldn't matter.
In fact, you could easily preserve the current behaviour just by
having the outer statext_is_compatible_clause() function set attnums
back to NULL if the result is false.

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-11 Thread Dean Rasheed
On Sun, 10 Mar 2019 at 22:28, David Rowley  wrote:
>
> On Mon, 11 Mar 2019 at 06:36, Tomas Vondra  
> wrote:
> >
> > On 3/9/19 7:33 PM, Dean Rasheed wrote:
> > > I wonder if it's possible to write smaller, more targeted tests.
> > > Currently "stats_ext" is by far the slowest test in its group, and I'm
> > > not sure that some of those tests add much. It ought to be possible to
> > > write a function that calls EXPLAIN and returns a query's row
> > > estimate, and then you could write tests to confirm the effect of the
> > > new stats by verifying the row estimates change as expected.
> >
> > Sure, if we can write more targeted tests, that would be good. But it's
> > not quite clear to me how wrapping EXPLAIN in a function makes those
> > tests any faster?
>
> I've not looked at the tests in question, but if they're executing an
> inferior plan when no extended stats exist, then maybe that's
> why they're slow.
>
> I think Dean might mean to create a function similar to
> explain_parallel_append() in partition_prune.sql then write tests that
> check the row estimate with EXPLAIN (COSTS ON) but strip out the other
> costing stuff instead of validating that the poor plan was chosen.
>

Yeah that's the sort of thing I was thinking of. I think it might be
possible to write simpler and faster tests by inserting far fewer rows
and relying on ANALYSE having sampled everything, so the row estimates
should be predictable. It may be the case that, with just a handful of
rows, the extended stats don't affect the plan, but you'd still see a
difference in the row estimates, and that could be a sufficient test I
think.

> > On 3/10/19 2:09 PM, Dean Rasheed wrote:
> > > 12). bms_member_index() should surely be in bitmapset.c. It could be
> > > more efficient by just traversing the bitmap words and making use of
> > > bmw_popcount(). Also, its second argument should be of type 'int' for
> > > consistency with other bms_* functions.
> >
> > Yes, moving to bitmapset.c definitely makes sense. I don't see how it
> > could use bms_popcount() though.
>
> I think it could be done by first checking if the parameter is a
> member of the set, and then if so, count all the bits that come on and
> before that member. You can use bmw_popcount() for whole words before
> the specific member's word then just bitwise-and a bit mask of a
> bitmapword that has all bits set for all bits on and before your
> parameter's BITNUM(), and add the bmw_popcount of the final word
> bitwise-anding the mask. bms_add_range() has some masking code you
> could copy.

Yep, that's what I was imagining. Except I think that to get a 0-based
index result you'd want the mask to have all bits set for bits
*before* the parameter's BITNUM(), rather than on and before. So I
think the mask would simply be

  ((bitmapword) 1 << bitnum) - 1

Regards,
Dean



Re: [Patch] Log10 and hyperbolic functions for SQL:2016 compliance

2019-03-11 Thread Dean Rasheed
On Sun, 3 Feb 2019 at 15:12, Tom Lane  wrote:
>
> Andrew Gierth  writes:
> > The spec doesn't require the inverse functions (asinh, acosh, atanh),
> > but surely there is no principled reason to omit them?
>
> +1 --- AFAICS, the C library has offered all six since C89.
>

+1 for including the inverse functions. However, it looks to me like
the inverse functions are C99-specific, so they might not be available
on all supported platforms. If they're not, we may need to provide our
own implementations.

I did a bit of research and had a play. It looks like it might not be
too hard to provide our own implementations, but it does take a bit of
care to avoid rounding and overflow errors. Attached are some
standalone C programs where I tested various algorithms. A decent
approach seems to be to either use log1p() (which is itself
C99-specific, hence that's the first thing I played with) or to use a
single round of Newton-Raphson to correct rounding errors, in a
similar way to how we implement cbrt() on platforms that don't have
that.

Of course that may all be moot -- those functions may in fact be
available everywhere we care about, but it was interesting to play
around with them anyway.

Regards,
Dean
#include <stdio.h>
#include <math.h>
#include <float.h>

/*
 * A commonly used algorithm, due to Kahan (citation needed).
 *
 * Max error: 1 ulp.
 * Avg error: 0.158 ulp.
 *
 * It's not obvious, but this does appear to be monotonic at the cutover point
 * (at least on my machine). Can that be relied upon on all platforms?
 *
 * -5.5511151231257851673e-17 -> -5.5511151231257851673e-17 (u != 1 branch)
 * -5.5511151231257839347e-17 -> -5.5511151231257839347e-17 (u != 1 branch)
 * -5.5511151231257827021e-17 -> -5.5511151231257827021e-17 (u == 1 branch)
 * -5.5511151231257820858e-17 -> -5.5511151231257820858e-17 (u == 1 branch)
 *
 * 1.1102230246251564172e-16 -> 1.1102230246251564172e-16 (u == 1 branch)
 * 1.1102230246251565404e-16 -> 1.1102230246251565404e-16 (u == 1 branch)
 * 1.1102230246251567869e-16 -> 1.1102230246251565404e-16 (u != 1 branch)
 * 1.1102230246251570335e-16 -> 1.1102230246251567869e-16 (u != 1 branch)
 *
 * However, it doesn't appear to be monotonic at various other points.
 */
double kahan_log1p(double x)
{
	volatile double u = 1+x;
	return u == 1 ? x : x * (log(u) / (u-1));
}

/*
 * Algorithm used in the GNU Scientific Library.
 *
 * Max error: 1 ulp.
 * Avg error: 0.086 ulp.
 *
 * Where does this formula come from? Licensing?
 * Doesn't return -0 when x is -0, but that could be fixed up.
 * Looks to be monotonic everywhere.
 */
double gsl_log1p(double x)
{
	volatile double y, z;
	y = 1 + x;
	z = y - 1;
	return log(y) - (z-x)/y;
}

/*
 * Alternative algorithm using one round of Newton-Raphson.
 *
 * Max error: 1 ulp.
 * Avg error: 0.094 ulp.
 *
 * Works well for all inputs.
 * Looks to be monotonic everywhere.
 */
double nr_log1p(double x)
{
	double y, e;

	if (x == 0)
		return x;

	y = log(1+x);
	e = exp(y);

	return y - (e-1-x)/e;
}

/* === TESTS === */

int num_tests = 0;
double max_kahan_err = 0.0;
double max_gsl_err = 0.0;
double max_nr_err = 0.0;
double total_kahan_err = 0.0;
double total_gsl_err = 0.0;
double total_nr_err = 0.0;

double err(double x, double y)
{
	if (x < y)
	{
		double diff = y - x;
		double ulp = nextafter(x, y) - x;
		return diff / ulp;
	}
	if (x > y)
	{
		double diff = x - y;
		double ulp = nextafter(y, x) - y;
		return diff / ulp;
	}
	return 0.0;
}

void test_log1p(double x)
{
	double y = log1p(x);
	double kahan_y = kahan_log1p(x);
	double gsl_y = gsl_log1p(x);
	double nr_y = nr_log1p(x);
	double kahan_err = err(y, kahan_y);
	double gsl_err = err(y, gsl_y);
	double nr_err = err(y, nr_y);

	double prev_x = nextafter(x, -DBL_MAX);
	double next_x = nextafter(x, DBL_MAX);
#define monotonic(fn) \
	( prev_x == -1 || (fn)(prev_x) <= (fn)(x) ) && \
	( next_x == DBL_MAX || (fn)(next_x) >= (fn)(x) ) ? \
	"Monotonic" : "Not monotonic"

	printf("x = %.20g\n", x);
	printf("Standard result: %.20g %s\n", y, monotonic(log1p));
	printf("Kahan log1p():   %.20g,  err=%f %s\n", kahan_y, kahan_err,
		   monotonic(kahan_log1p));
	printf("GSL log1p(): %.20g,  err=%f %s\n", gsl_y, gsl_err,
		   monotonic(gsl_log1p));
	printf("NR log1p():  %.20g,  err=%f %s\n", nr_y, nr_err,
		   monotonic(nr_log1p));

	if (kahan_err > max_kahan_err) max_kahan_err = kahan_err;
	if (gsl_err > max_gsl_err) max_gsl_err = gsl_err;
	if (nr_err > max_nr_err) max_nr_err = nr_err;

	total_kahan_err += kahan_err;
	total_gsl_err += gsl_err;
	total_nr_err += nr_err;
	num_tests++;
}

int main()
{
	test_log1p(nextafter(-1, 0));
	test_log1p(3.141e-16-1);
	test_log1p(3.141e-15-1);
	test_log1p(3.141e-14-1);
	test_log1p(3.141e-13-1);
	test_log1p(3.141e-12-1);
	test_log1p(3.141e-11-1);
	test_log1p(3.141e-10-1);
	test_log1p(3.141e-9-1);
	test_log1p(3.141e-8-1);

Re: pgsql: Add support for hyperbolic functions, as well as log10().

2019-03-13 Thread Dean Rasheed
On Wed, 13 Mar 2019, 21:56 Tom Lane,  wrote:

>
> Of these, probably the least bad is #3, even though it might require
> a few rounds of experimentation to find the best extra_float_digits
> setting to use.  I'll go try it without any roundoff, just to see
> what the raw results look like ...
>


Yeah, that seems like a reasonable thing to try.

I'm amazed that jacana's asinh() returned -0 for an input of +0. I'm not
aware of any implementation that does that. I'd be quite interested to know
what it returned for an input like 1e-20. If that returned any variety of
zero, I'd say that it's worse than useless. Another interesting test case
would be whether or not it satisfies asinh(-x) = -asinh(x) for a variety of
different values of x, because that's something that commonly breaks down
badly with naive implementations.

It's not unreasonable to expect these functions to be accurate to within
the last 1 or 2 digits, so testing with extra_float_digits or whatever
seems reasonable, but I think a wider variety of test inputs is required.

I also wonder if we should be doing what we do for the regular trig
functions and explicitly handle special cases like Inf and NaN to ensure
POSIX compatibility on all platforms.

Regards,
Dean




Re: pgsql: Add support for hyperbolic functions, as well as log10().

2019-03-14 Thread Dean Rasheed
On Thu, 14 Mar 2019 at 04:41, Tom Lane  wrote:
>
> Dean Rasheed  writes:
> > I'm amazed that jacana's asinh() returned -0 for an input of +0.
>
> Even more amusingly, it returns NaN for acosh('infinity'), cf
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2019-03-14%2003%3A00%3A34
>
> Presumably that means they calculated "infinity - infinity" at some
> point, but why?
>

Given the -0 result, I don't find that particularly surprising. I
suspect lots of formulae would end up doing that without proper
special-case handling upfront.

It looks like that's the only platform that isn't POSIX compliant
though, so maybe it's not worth worrying about.

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-16 Thread Dean Rasheed
On Fri, 15 Mar 2019 at 00:06, Tomas Vondra  wrote:
> I've noticed an annoying thing when modifying type of column not
> included in a statistics...
>
> That is, we don't remove the statistics, but the estimate still changes.
> But that's because the ALTER TABLE also resets reltuples/relpages:
>
> That's a bit unfortunate, and it kinda makes the whole effort to not
> drop the statistics unnecessarily kinda pointless :-(
>

Well not entirely. Repeating that test with 100,000 rows, I get an
initial estimate of 9850 (actual 10,000), which then drops to 2451
after altering the column. But if you drop the dependency statistics,
the estimate drops to 241, so clearly there is some benefit in keeping
them in that case.

Besides, I thought there was no extra effort in keeping the extended
statistics in this case -- isn't it just using the column
dependencies, so in this case UpdateStatisticsForTypeChange() never
gets called anyway?

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-16 Thread Dean Rasheed
it run out of memory before getting to that test? Even if not,
it would likely take an excessive amount of time.

I think this part of the patch needs a bit of a rethink. My first
thought is to do something similar to what happens for per-column
MCVs, and set an upper limit on the size of each value that is ever
considered for inclusion in the stats (c.f. WIDTH_THRESHOLD and
toowide_cnt in analyze.c). Over-wide values should be excluded early
on, and it will need to track whether or not any such values were
excluded, because then it wouldn't be appropriate to treat the stats
as complete and keep the entire list, without calling
get_mincount_for_mcv_list().

That's it for now.

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-17 Thread Dean Rasheed
On Sat, 16 Mar 2019 at 23:44, Tomas Vondra  wrote:
>
> > 21). For consistency with other bms_ functions, I think the name of
> > the Bitmapset argument for bms_member_index() should just be called
> > "a". Nitpicking, I'd also put bms_member_index() immediately after
> > bms_is_member() in the source, to match the header.
>
> I think I've already done the renames in the last patch I submitted (are
> you looking at an older version of the code, perhaps?). I've moved it
> right after bms_is_member - good idea.
>

Ah OK, I was on the 20190315 patch yesterday. I've just updated to the
20190317 patch.
It looks like you forgot to update the argument name in the header file though.

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-17 Thread Dean Rasheed
On Sat, 16 Mar 2019 at 23:44, Tomas Vondra  wrote:
>
> > 28). I just spotted the 1MB limit on the serialised MCV list size. I
> > think this is going to be too limiting. For example, if the stats
> > target is at its maximum of 10000, that only leaves around 100 bytes
> > for each item's values, which is easily exceeded. In fact, I think
> > this approach for limiting the MCV list size isn't a good one --
> > consider what would happen if there were lots of very large values.
> > Would it run out of memory before getting to that test? Even if not,
> > it would likely take an excessive amount of time.
> >
>
> True. I don't have a very good argument for a specific value, or even
> having an explicit limit at all. I've initially added it mostly as a
> safety for development purposes, but I think you're right we can just
> get rid of it. I don't think it'd run out of memory before hitting the
> limit, but I haven't tried very hard (but I recall running into the 1MB
> limit in the past).
>

I've just been playing around a little with this and found that it
isn't safely dealing with toasted values. For example, consider the
following test:

create or replace function random_string(x int) returns text
as $$
  select substr(string_agg(md5(random()::text), ''), 1, x)
from generate_series(1,(x+31)/32);
$$ language sql;

drop table if exists t;
create table t(a int, b text);
insert into t values (1, random_string(10000000));
create statistics s (mcv) on a,b from t;
analyse t;

select length(b), left(b,5), right(b,5) from t;
select length(stxmcv), length((m.values::text[])[2]),
   left((m.values::text[])[2], 5), right((m.values::text[])[2],5)
  from pg_statistic_ext, pg_mcv_list_items(stxmcv) m
 where stxrelid = 't'::regclass;

The final query returns the following:

 length |  length  | left  | right
--------+----------+-------+-------
    250 | 10000000 | c2667 | 71492
(1 row)

suggesting that there's something odd about the stxmcv value. Note,
also, that it doesn't hit the 1MB limit, even though the value is much
bigger than that.

If I then delete the value from the table, without changing the stats,
and repeat the final query, it falls over:

delete from t where a=1;
select length(stxmcv), length((m.values::text[])[2]),
   left((m.values::text[])[2], 5), right((m.values::text[])[2],5)
  from pg_statistic_ext, pg_mcv_list_items(stxmcv) m
 where stxrelid = 't'::regclass;

ERROR:  unexpected chunk number 5008 (expected 0) for toast value
16486 in pg_toast_16480

So I suspect it was using the toast data from the table t, although
I've not tried to investigate further.


> > I think this part of the patch needs a bit of a rethink. My first
> > thought is to do something similar to what happens for per-column
> > MCVs, and set an upper limit on the size of each value that is ever
> > considered for inclusion in the stats (c.f. WIDTH_THRESHOLD and
> > toowide_cnt in analyse.c). Over-wide values should be excluded early
> > on, and it will need to track whether or not any such values were
> > excluded, because then it wouldn't be appropriate to treat the stats
> > as complete and keep the entire list, without calling
> > get_mincount_for_mcv_list().
> >
> Which part? Serialization / deserialization? Or how we handle long
> values when building the MCV list?
>

I was thinking (roughly) of something like the following:

* When building the values array for the MCV list, strip out rows with
values wider than some threshold (probably something like the
WIDTH_THRESHOLD = 1024 from analyze.c would be reasonable).

* When building the MCV list, if some over-wide values were previously
stripped out, always go into the get_mincount_for_mcv_list() block,
even if nitems == ngroups (for the same reason a similar thing happens
for per-column stats -- if some items were stripped out, we're already
saying that not all items will go in the MCV list, and it's not safe
to assume that the remaining items are common enough to give accurate
estimates).

* In the serialisation code, remove the size limit entirely. We know
that each value is now at most 1024 bytes, and there are at most 10000
items, and at most 8 columns, so the total size is already reasonably
well bounded. In the worst case, it might be around 80MB, but in
practice, it's always likely to be much much smaller than that.

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-17 Thread Dean Rasheed
On Sat, 16 Mar 2019 at 23:44, Tomas Vondra  wrote:
> >
> > 16). This regression test fails for me:
> >
> > @@ -654,11 +654,11 @@
> >  -- check change of unrelated column type does not reset the MCV statistics
> >  ALTER TABLE mcv_lists ALTER COLUMN d TYPE VARCHAR(64);
> >  SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a =
> > 1 AND b = ''1''');
> >   estimated | actual
> >  ---+
> > -50 | 50
> > +11 | 50
> >  (1 row)
> >
> > Maybe that's platform-dependent, given what you said about
> > reltuples/relpages being reset. An easy workaround for this would be
> > to modify this test (and perhaps the one that follows) to just query
> > pg_statistic_ext to see if the MCV statistics have been reset.
> >
>
> Ah, sorry for not explaining this bit - the failure is expected, due to
> the reset of relpages/reltuples I mentioned. We do keep the extended
> stats, but the relsize estimate changes a bit. It surprised me a bit,
> and this test made the behavior apparent. The last patchset included a
> piece that changes that - if we decide not to change this, I think we
> can simply accept the actual output.
>

I don't think changing the way reltuples is reset ought to be within
the scope of this patch. There might be good reasons for it being the
way it is. Perhaps open a discussion on a separate thread?

As far as this test goes, how about just doing this:

-- check change of unrelated column type does not reset the MCV statistics
ALTER TABLE mcv_lists ALTER COLUMN d TYPE VARCHAR(64);
SELECT stxmcv IS NOT NULL AS has_mcv
  FROM pg_statistic_ext WHERE stxrelid = 'mcv_lists'::regclass;

-- check change of column type resets the MCV statistics
ALTER TABLE mcv_lists ALTER COLUMN c TYPE numeric;
SELECT stxmcv IS NOT NULL AS has_mcv
  FROM pg_statistic_ext WHERE stxrelid = 'mcv_lists'::regclass;

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-24 Thread Dean Rasheed
On Sun, 24 Mar 2019 at 00:17, David Rowley  wrote:
>
> On Sun, 24 Mar 2019 at 12:41, Tomas Vondra  
> wrote:
> >
> > On 3/21/19 4:05 PM, David Rowley wrote:
>
> > > 29. Looking at the tests I see you're testing that you get bad
> > > estimates without extended stats.  That does not really seem like
> > > something that should be done in tests that are meant for extended
> > > statistics.
> > >
> >
> > True, it might be a bit unnecessary. Initially the tests were meant to
> > show old/new estimates for development purposes, but it might not be
> > appropriate for regression tests. I don't think it's a big issue, it's
> > not like it'd slow down the tests significantly. Opinions?
>
> My thoughts were that if someone did something to improve non-MV
> stats, then is it right for these tests to fail? What should the
> developer do in that case? Update the expected result? Remove the test?
> It's not so clear.
>

I think the tests are fine as they are. Don't think of these as "good"
and "bad" estimates. They should both be "good" estimates, but under
different assumptions -- one assuming no correlation between columns,
and one taking into account the relationship between the columns. If
someone does do something to "improve" the non-MV stats, then the
former tests ought to tell us whether it really was an improvement. If
so, then the test result can be updated and perhaps whatever was done
ought to be factored into the MV-stats' calculation of base
frequencies. If not, the test is providing valuable feedback that
perhaps it wasn't such a good improvement after all.

Regards,
Dean



Re: BUG #15708: RLS 'using' running as wrong user when called from a view

2019-03-24 Thread Dean Rasheed
On Thu, 21 Mar 2019 at 00:39, PG Bug reporting form
 wrote:
>
> This fails, seemingly because the RLS on 'bar' is being checked by alice,
> instead of the view owner bob:
>

Yes I agree, that appears to be a bug. The subquery in the RLS policy
should be checked as the view owner -- i.e., we need to propagate the
checkAsUser for the RTE with RLS to any subqueries in its RLS
policies.

It looks like the best place to fix it is in
get_policies_for_relation(), since that's where all the policies to be
applied for a given RTE are pulled together. Patch attached.

Regards,
Dean


rls-perm-check-fix.patch
Description: Binary data


Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-26 Thread Dean Rasheed
On Mon, 25 Mar 2019 at 23:36, Tomas Vondra  wrote:
>
> Attached is an updated patch, fixing all the issues pointed out so far.
> Unless there are some objections, I plan to commit the 0001 part by the
> end of this CF. Part 0002 is a matter for PG13, as previously agreed.
>

Yes, I think that's reasonable. It looks to be in pretty good shape. I
have reviewed most of the actual code, but note that I haven't
reviewed the docs changes and I didn't spend much time reading code
comments. It might benefit from a quick once-over comment tidy up.

I just looked through the latest set of changes and I have a couple of
additional review comments:

In the comment about WIDTH_THRESHOLD, s/pg_statistic/pg_statistic_ext/.

In statext_mcv_build(), I'm not convinced by the logic around keeping
the whole MCV list if it fits. Suppose there were a small number of
very common values, and then a bunch of uniformly distributed less
common values. The sample might consist of all the common values, plus
one or two instances of some of the uncommon ones, leading to a list
that would fit, but it would not be appropriate to keep the uncommon
values on the basis of having seen them only one or two times. The
fact that the list of items seen fits doesn't by itself mean that
they're all common enough to justify being kept. In the per-column
stats case, there are a bunch of other checks that have to pass, which
are intended to test not just that the list fits, but that it believes
that those are all the items in the table. For MV stats, you don't
have that, and so I think it would be best to just remove that test
(the "if (ngroups > nitems)" test) and *always* call
get_mincount_for_mcv_list() to determine how many MCV items to keep.
Otherwise there is a risk of keeping too many MCV items, with the ones
at the tail end of the list producing poor estimates.

Regards,
Dean



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-26 Thread Dean Rasheed
On Tue, 26 Mar 2019 at 11:59, Dean Rasheed  wrote:
>
> On Mon, 25 Mar 2019 at 23:36, Tomas Vondra  
> wrote:
> >
> > Attached is an updated patch...
>
> I just looked through the latest set of changes and I have a couple of
> additional review comments:
>

I just spotted another issue while reading the code:

It's possible to build an MCV list with more than
STATS_MCVLIST_MAX_ITEMS = 8192 items, which then causes an error when
the code tries to read it back in:

create temp table foo(a int, b int);
insert into foo select x,x from generate_series(1,10000) g(x);
insert into foo select x,x from generate_series(1,10000) g(x);
alter table foo alter column a set statistics 10000;
alter table foo alter column b set statistics 10000;
create statistics s (mcv) on a,b from foo;
analyse foo;
select * from foo where a=1 and b=1;

ERROR:  invalid length (10000) item array in MCVList

So this needs to be checked when building the MCV list.

In fact, the stats targets for table columns can be as large as 10000
(a hard-coded constant in tablecmds.c, which is pretty ugly, but
that's a different matter), so I think STATS_MCVLIST_MAX_ITEMS
probably ought to match that.

There are also a couple of comments that refer to the 8k limit, which
would need updating, if you change it.

Regards,
Dean



Re: BUG #15708: RLS 'using' running as wrong user when called from a view

2019-03-27 Thread Dean Rasheed
On Mon, 25 Mar 2019 at 20:27, Stephen Frost  wrote:
>
> * Dean Rasheed (dean.a.rash...@gmail.com) wrote:
>
> > It looks like the best place to fix it is in
> > get_policies_for_relation(), since that's where all the policies to be
> > applied for a given RTE are pulled together. Patch attached.
>
> Yes, on a quick review, that looks like a good solution to me as well.
>

On second thoughts, it actually needs to be in
get_row_security_policies(), after making copies of the quals from the
policies, otherwise it would be scribbling on the copies from the
relcache. Actually that makes the code change a bit simpler too.

Regards,
Dean
diff --git a/src/backend/rewrite/rowsecurity.c b/src/backend/rewrite/rowsecurity.c
new file mode 100644
index 835540c..e9f532c
--- a/src/backend/rewrite/rowsecurity.c
+++ b/src/backend/rewrite/rowsecurity.c
@@ -47,6 +47,7 @@
 #include "nodes/pg_list.h"
 #include "nodes/plannodes.h"
 #include "parser/parsetree.h"
+#include "rewrite/rewriteDefine.h"
 #include "rewrite/rewriteHandler.h"
 #include "rewrite/rewriteManip.h"
 #include "rewrite/rowsecurity.h"
@@ -382,6 +383,13 @@ get_row_security_policies(Query *root, R
 	table_close(rel, NoLock);
 
 	/*
+	 * Copy checkAsUser to the row security quals and WithCheckOption checks,
+	 * in case they contain any subqueries referring to other relations.
+	 */
+	setRuleCheckAsUser((Node *) *securityQuals, rte->checkAsUser);
+	setRuleCheckAsUser((Node *) *withCheckOptions, rte->checkAsUser);
+
+	/*
 	 * Mark this query as having row security, so plancache can invalidate it
 	 * when necessary (eg: role changes)
 	 */
diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out
new file mode 100644
index 4b53401..d764991
--- a/src/test/regress/expected/rowsecurity.out
+++ b/src/test/regress/expected/rowsecurity.out
@@ -3910,6 +3910,33 @@ DROP OWNED BY regress_rls_dob_role1;
 DROP POLICY p1 ON dob_t2; -- should succeed
 DROP USER regress_rls_dob_role1;
 DROP USER regress_rls_dob_role2;
+-- Bug #15708: view + table with RLS should check policies as view owner
+CREATE TABLE ref_tbl (a int);
+INSERT INTO ref_tbl VALUES (1);
+CREATE TABLE rls_tbl (a int);
+INSERT INTO rls_tbl VALUES (10);
+ALTER TABLE rls_tbl ENABLE ROW LEVEL SECURITY;
+CREATE POLICY p1 ON rls_tbl USING (EXISTS (SELECT 1 FROM ref_tbl));
+GRANT SELECT ON ref_tbl TO regress_rls_bob;
+GRANT SELECT ON rls_tbl TO regress_rls_bob;
+CREATE VIEW rls_view AS SELECT * FROM rls_tbl;
+ALTER VIEW rls_view OWNER TO regress_rls_bob;
+GRANT SELECT ON rls_view TO regress_rls_alice;
+SET SESSION AUTHORIZATION regress_rls_alice;
+SELECT * FROM ref_tbl; -- Permission denied
+ERROR:  permission denied for table ref_tbl
+SELECT * FROM rls_tbl; -- Permission denied
+ERROR:  permission denied for table rls_tbl
+SELECT * FROM rls_view; -- OK
+ a  
+
+ 10
+(1 row)
+
+RESET SESSION AUTHORIZATION;
+DROP VIEW rls_view;
+DROP TABLE rls_tbl;
+DROP TABLE ref_tbl;
 --
 -- Clean up objects
 --
diff --git a/src/test/regress/sql/rowsecurity.sql b/src/test/regress/sql/rowsecurity.sql
new file mode 100644
index ea83153..df8fe11
--- a/src/test/regress/sql/rowsecurity.sql
+++ b/src/test/regress/sql/rowsecurity.sql
@@ -1764,6 +1764,32 @@ DROP POLICY p1 ON dob_t2; -- should succ
 DROP USER regress_rls_dob_role1;
 DROP USER regress_rls_dob_role2;
 
+-- Bug #15708: view + table with RLS should check policies as view owner
+CREATE TABLE ref_tbl (a int);
+INSERT INTO ref_tbl VALUES (1);
+
+CREATE TABLE rls_tbl (a int);
+INSERT INTO rls_tbl VALUES (10);
+ALTER TABLE rls_tbl ENABLE ROW LEVEL SECURITY;
+CREATE POLICY p1 ON rls_tbl USING (EXISTS (SELECT 1 FROM ref_tbl));
+
+GRANT SELECT ON ref_tbl TO regress_rls_bob;
+GRANT SELECT ON rls_tbl TO regress_rls_bob;
+
+CREATE VIEW rls_view AS SELECT * FROM rls_tbl;
+ALTER VIEW rls_view OWNER TO regress_rls_bob;
+GRANT SELECT ON rls_view TO regress_rls_alice;
+
+SET SESSION AUTHORIZATION regress_rls_alice;
+SELECT * FROM ref_tbl; -- Permission denied
+SELECT * FROM rls_tbl; -- Permission denied
+SELECT * FROM rls_view; -- OK
+RESET SESSION AUTHORIZATION;
+
+DROP VIEW rls_view;
+DROP TABLE rls_tbl;
+DROP TABLE ref_tbl;
+
 --
 -- Clean up objects
 --


Re: ANALYZE: ERROR: tuple already updated by self

2019-07-29 Thread Dean Rasheed
On Sun, 28 Jul 2019 at 11:15, Tomas Vondra  wrote:
>
> Attached is a patch fixing the error by not building extended stats for
> the inh=true case (as already proposed in this thread). That's about the
> simplest way to resolve this issue for v12. It should add a simple
> regression test too, I guess.
>

Seems like a reasonable thing to do for 10, 11 and possibly also 12
(actually, as you noted, I think it's the only thing that can be done
for 10 and 11).

> To fix this properly we need to add a flag similar to stainherit
> somewhere. And I've started working on that, thinking that maybe we
> could do that even for v12 - it'd be yet another catversion bump, but
> there's already been one since beta2 so maybe it would be OK.
>

Yeah, I think that makes sense, if it's not too hard. Since v12 is
where the stats definition is split out from the stats data, this
might work out quite neatly, since the inh flag would apply only to
the stats data.

> But it's actually a bit more complicated than just adding a column to
> the catalog, for two reasons:
>
> 1) The optimizer part has to be tweaked to pick the right object, with
> the flag set to either true/false. Not trivial, but doable.
>

Isn't it just a matter of passing the inh flag to
get_relation_statistics() from get_relation_info(), so that the
optimiser would get the right kind of stats data, depending on whether
or not inheritance was requested in the query?

Regards,
Dean




Re: Multivariate MCV list vs. statistics target

2019-08-01 Thread Dean Rasheed
On Thu, 1 Aug 2019 at 11:30, Tomas Vondra  wrote:
>
> I'll move it to the next CF. Aside from the issues pointed out by Kyotaro-san
> in his review, I still haven't made up my mind about whether to base the
> default on the statistics targets set for the attributes. That's what we're
> doing now, but I'm not sure it's a good idea after adding a separate
> statistics target.
> I wonder what Dean's opinion on this is, as he added the current logic.
>

If this were being released in the same version as MCV stats first
appeared, I'd say that there's not much point basing the default
multivariate stats target on the per-column targets, when it has its
own knob to control it. However, since this won't be released for a
year, those per-column-based defaults will be in the field for that
long, and so I'd say that we shouldn't change the default when adding
this, otherwise users who don't use this new feature might be
surprised by the change in behaviour.

Regards,
Dean




Re: MCV lists for highly skewed distributions

2018-01-21 Thread Dean Rasheed
On 21 January 2018 at 07:26, John Naylor  wrote:
> I spent a few hours hacking on this, and it turns out calculating the
> right number of MCVs taking into account both uniform and highly
> non-uniform distributions is too delicate a problem for me to solve
> right now. The logic suggested by Dean Rasheed in [1] always produces
> no MCVs for a perfectly uniform distribution (which is good), but very
> often also for other distributions, which is not good. My efforts to
> tweak that didn't work, so I didn't get as far as adapting it for the
> problem Jeff is trying to solve.

Hmm, Tom suggested that the test based on the average frequency over
all values might be too strict because the estimated number of
distinct values is often too low, so that might explain what you're
seeing.

It occurs to me that maybe a better test to exclude a value from the
MCV list would be to demand that its relative standard error not be
too high. Such a test, in addition to the existing tests, might be
sufficient to solve the opposite problem of too many values in the MCV
list, because the real problem there is including a value after having
seen relatively few occurrences of it in the sample, and thus having a
wildly inaccurate estimate for it. Setting a bound on the relative
standard error would mean that we could have a reasonable degree of
confidence in estimates produced from the sample.

Regards,
Dean



Re: stricter MCV tests for uniform distributions (was Re: MCV lists for highly skewed distributions)

2018-01-22 Thread Dean Rasheed
On 22 January 2018 at 08:07, John Naylor  wrote:
> On 1/21/18, Dean Rasheed  wrote:
>> It occurs to me that maybe a better test to exclude a value from the
>> MCV list would be to demand that its relative standard error not be
>> too high. Such a test, in addition to the existing tests, might be
>> sufficient to solve the opposite problem of too many values in the MCV
>> list, because the real problem there is including a value after having
>> seen relatively few occurrences of it in the sample, and thus having a
>> wildly inaccurate estimate for it. Setting a bound on the relative
>> standard error would mean that we could have a reasonable degree of
>> confidence in estimates produced from the sample.
>
> If you don't mind, what would the math look like for that?

Using the same syntax as before:

N = Total rows in table (population size)
n = Number of rows sampled
x = Number of times a particular value appears in the sample
p = x/n = Frequency of the value in the sample

So that the standard error of the proportion is

SE = sqrt(p*(1-p)/n) * sqrt((N-n)/(N-1))

Then the relative standard error (which is usually expressed as a percentage) is

RSE = 100 * SE / p

So if we were to demand that the relative standard error was less
than, say, 10%, then the constraint would just be

SE < 0.1 * p

Note:

* This formula is not valid if x is very small (see what I wrote about
being able to approximate the distribution of p with a normal
distribution). So we should also enforce the "rule of thumb" x >= 10.

* The frequency p that we're storing is the count divided by the
overall sample size. So the values for N and n above should not be the
counts that exclude NULLs. As far as this logic is concerned, NULLs
(and too-wide values) are just values not equal to the value under
consideration. Thus it appears, from just a quick glance at your
patch, that you're using the wrong values for N and n.

* The RSE is a monotonically decreasing function of p in the range
[0,1], so an upper bound on the RSE equates to a lower bound on the
number of occurrences of the value.
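
To illustrate that last point: with the numbers used below (a sample of
30,000 rows from a 10,000,000-row table) and a 10% RSE cutoff, the implied
minimum count works out at around x = 100. A quick back-of-the-envelope
query of my own, not anything from the patch:

select min(x)
  from generate_series(1, 1000) as x
 where sqrt((x/30000.0)*(1 - x/30000.0)/30000) *
       sqrt((10000000 - 30000)::float8/(10000000 - 1))
     < 0.1 * (x/30000.0);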


This last point gives me a different idea. Instead of applying this
test *in addition to* the existing tests, how about applying it
*instead of* the existing tests. That is, we keep all MCVs that appear
sufficiently often that we can be reasonably confident in the
estimates produced from their sample frequencies, regardless of
whether or not they are more common than average (the average being
fairly meaningless for a highly skewed distribution anyway).

This is still keeping the most common values, but now we'd be saying
that we keep any value that appears sufficiently often in the sample
that we can extrapolate its sample frequency to produce a reasonably
accurate estimate of the population frequency, and discard values for
which the estimate is likely to be inaccurate.

I have not tested this idea at all, but it seems like it might be
worth considering. It has the nice property that the test depends
entirely on how often the value appears, rather than on other
previously computed statistics, such as Ndistinct.

Doing a quick test in pure SQL, using the highly skewed distribution
Jeff Janes gave in the other thread, with the default sample size of
30,000:

with samples as (
  select floor(-log(random())/log(2))::int  as who
  from generate_series(1,30000)
), freqs as (
  select who, count(*) as x, count(*)/30000::float8 as p
  from samples group by who
), stats as (
  select *, sqrt(p*(1-p)/30000) *
sqrt((10000000-30000)::float8/(10000000-1)) as se
  from freqs
)
select *, (10000000*p)::int||'+/-'||(2*se*10000000)::int as "95% interval",
   case when x >=10 and se < 0.1*p then 'KEEP' else 'DISCARD' end
from stats order by p desc limit 100;

it pretty consistently keeps the 8 most common values:

 who |   x   |  p   |  se  |  95%
interval   |  case
-+---+--+--+-+-
   0 | 15017 |0.5005667 |  0.00288241625942075 |
5005667+/-57648 | KEEP
   1 |  7607 |0.2535667 |  0.00250800597590887 |
2535667+/-50160 | KEEP
   2 |  3713 |0.1237667 | 0.0018984483 |
1237667+/-37969 | KEEP
   3 |  1855 |   0.0618 |   0.0013884757600711 |
618333+/-27770  | KEEP
   4 |   914 |   0.03046667 | 0.000990788179299791 |
304667+/-19816  | KEEP
   5 |   448 |   0.0149 | 0.000699194759916533 |
149333+/-13984  | KEEP
   6 |   229 |  0.00763 | 0.000501741670388358 |
76333+/-10035   | KEEP
   7 |   108 |   0.0036 | 0.000345267009604061 |
36000+/-6905| KEEP
   8 |46 |  0.00153 | 0.000225565173744715 |
15333+/-4511| DISCARD
   9 |34 |  0.00113 | 0.000193963300230354 |
11333+/-3879| DISCARD
  10 |17 |

Re: WINDOW RANGE patch versus leakproofness

2018-01-31 Thread Dean Rasheed
On 30 January 2018 at 16:42, Tom Lane  wrote:
> So I'm thinking that (a) we do not need to check for leaky functions used
> in window support, and (b) therefore there's no need to avoid leaky
> behavior in in_range support functions.  Objections?
>

Yes, I concur. Since window functions can only appear in the SELECT
target list and ORDER BY clauses, they should never appear in a qual
that gets considered for push down, and thus contain_leaked_vars()
should never see a window function.

Moreover, contain_leaked_vars() is intentionally coded defensively, so
if it ever does somehow see a window function (or any other unexpected
node type) it will return true and the resulting qual/restrictinfo
will be marked leaky, and not pushed through security barriers.

Regards,
Dean



Re: WINDOW RANGE patch versus leakproofness

2018-01-31 Thread Dean Rasheed
On 31 January 2018 at 21:51, Robert Haas  wrote:
> On Wed, Jan 31, 2018 at 5:52 AM, Dean Rasheed  
> wrote:
>> On 30 January 2018 at 16:42, Tom Lane  wrote:
>>> So I'm thinking that (a) we do not need to check for leaky functions used
>>> in window support, and (b) therefore there's no need to avoid leaky
>>> behavior in in_range support functions.  Objections?
>>
>> Yes, I concur. Since window functions can only appear in the SELECT
>> target list and ORDER BY clauses, they should never appear in a qual
>> that gets considered for push down, and thus contain_leaked_vars()
>> should never see a window function.
>
> What about a query that uses window functions within a subquery?
>

A qual containing a subquery is treated as not push down safe, so it
wouldn't even be considered for pushing down into a security barrier
view. On a table with RLS, contain_leaked_vars() would see a subplan
on the restrictinfo's clause and return true, causing the restrictinfo
to be treated as leaky. So in each case, the qual with the subquery
would be executed after any security quals, regardless of what it
contained.

The contents of the subquery aren't currently examined, it just
defaults to leaky. If they were to be examined, the window function
would be found and it would still default to being leaky.

Regards,
Dean


