Some optimisations for numeric division

2022-02-23 Thread Dean Rasheed
always be one or two digits. Testing this with a bunch of random exp() and ln() computations I saw a speed-up of 15-20%, and it reduced the run time of the numeric-big regression test by around 10%, which seems worth having. Regards, Dean From 41732ad9a44dcd12e52d823fb59cb23cce4fe217 Mon Sep 17 0

Re: Some optimisations for numeric division

2022-02-23 Thread Dean Rasheed
On Wed, 23 Feb 2022 at 20:55, Tom Lane wrote: > > I took a quick look through these (just eyeball, didn't try to verify > your performance statements). Thanks for looking! > I'm +1 on 0001 and 0002, but 0003 feels > a bit ad-hoc. It certainly *looks* weird for the allegedly faster > function t

Re: Some optimisations for numeric division

2022-02-25 Thread Dean Rasheed
On Wed, 23 Feb 2022 at 22:55, Tom Lane wrote: > > Dean Rasheed writes: > > > One thought that occurred to me was that it's a bit silly that > > exp_var() and ln_var() have to use a NumericVar for what could just be > > an int, if we had a div_var_int() function

Re: Some optimisations for numeric division

2022-02-25 Thread Dean Rasheed
On Fri, 25 Feb 2022 at 10:45, Dean Rasheed wrote: > > Attached is the updated patch series (0001 and 0002 unchanged). > And another update following feedback from the cfbot. Regards, Dean From 41732ad9a44dcd12e52d823fb59cb23cce4fe217 Mon Sep 17 00:00:00 2001 From: Dean Rasheed Date

Re: [PATCH] Add reloption for views to enable RLS

2022-02-25 Thread Dean Rasheed
On Fri, 18 Feb 2022 at 14:57, Laurenz Albe wrote: > > Here is a new version, with improved documentation and the option renamed > to "check_permissions_owner". I just prefer the shorter form. > Re-reading this thread, I think I preferred the name "security_invoker". The main objection seemed to

Re: Some optimisations for numeric division

2022-02-27 Thread Dean Rasheed
On Fri, 25 Feb 2022 at 18:30, Tom Lane wrote: > > Dean Rasheed writes: > > And another update following feedback from the cfbot. > > This patchset LGTM. On my machine there doesn't seem to be any > measurable performance change for the numeric regression test, >

Re: [PATCH] Add reloption for views to enable RLS

2022-03-02 Thread Dean Rasheed
On Tue, 1 Mar 2022 at 16:40, Christoph Heiss wrote: > > That is also the main reason I preferred naming it "security_invoker" - > it is consistent with other databases and eases transition from such > systems. > > I kept "check_permissions_owner" for now. Constantly changing it around > with each

Re: [PATCH] Add reloption for views to enable RLS

2022-03-18 Thread Dean Rasheed
On Mon, 14 Mar 2022 at 16:16, Laurenz Albe wrote: > > The patch is fine from my point of view. > > It passes "make check-world". > > I'll mark it as "ready for committer". > Cool, thanks. I think this will make a useful addition to PG15. I have been hacking on it a bit, and attached is an update

Re: [PATCH] Add reloption for views to enable RLS

2022-03-22 Thread Dean Rasheed
On Mon, 21 Mar 2022 at 09:47, Laurenz Albe wrote: > > Thanks for your diligent work on this, and the patch looks good to me. Thanks for looking again. Pushed. Regards, Dean

Re: Additional improvements to extended statistics

2020-12-01 Thread Dean Rasheed
On Sun, 29 Nov 2020 at 21:02, Tomas Vondra wrote: > > Those are fairly minor issues. I don't have any deeper objections, and > it seems committable. Do you plan to do that sometime soon? > OK, I've updated the patch status in the CF app, and I should be able to push it in the next day or so. Reg

Re: Additional improvements to extended statistics

2020-12-02 Thread Dean Rasheed
On Sun, 29 Nov 2020 at 21:02, Tomas Vondra wrote: > > I wonder how much of the comment before clauselist_selectivity should > move to clauselist_selectivity_ext - it does talk about range clauses > and so on, but clauselist_selectivity does not really deal with that. > But maybe that's just an imp

Re: Additional improvements to extended statistics

2020-12-03 Thread Dean Rasheed
On Wed, 2 Dec 2020 at 16:34, Tomas Vondra wrote: > > On 12/2/20 4:51 PM, Dean Rasheed wrote: > > > > Barring any further comments, I'll push this sometime soon. > > +1 > Pushed. Regards, Dean

Re: PoC/WIP: Extended statistics on expressions

2020-12-07 Thread Dean Rasheed
On Thu, 3 Dec 2020 at 15:23, Tomas Vondra wrote: > > Attached is a patch series rebased on top of 25a9e54d2d. After reading this thread and [1], I think I prefer the name "standard" rather than "expressions", because it is meant to describe the kind of statistics being built rather than what they

Re: PoC/WIP: Extended statistics on expressions

2020-12-07 Thread Dean Rasheed
On Mon, 7 Dec 2020 at 14:15, Tomas Vondra wrote: > > On 12/7/20 10:56 AM, Dean Rasheed wrote: > > it might actually be > > neater to have separate documented syntaxes for single- and > > multi-column statistics: > > > > CREATE STATISTICS [ IF NOT EXISTS ] st

Re: Additional improvements to extended statistics

2020-12-07 Thread Dean Rasheed
On Wed, 2 Dec 2020 at 15:51, Dean Rasheed wrote: > > The sort of queries I had in mind were things like this: > > WHERE (a = 1 AND b = 1) OR (a = 2 AND b = 2) > > However, the new code doesn't apply the extended stats directly using > clauselist_selectivity_or() for

Re: PoC/WIP: Extended statistics on expressions

2020-12-11 Thread Dean Rasheed
On Tue, 8 Dec 2020 at 12:44, Tomas Vondra wrote: > > Possibly. But I don't think it's worth the extra complexity. I don't > expect people to have a lot of overlapping stats, so the amount of > wasted space and CPU time is expected to be fairly limited. > > So I don't think it's worth spending too

Re: PoC/WIP: Extended statistics on expressions

2021-01-04 Thread Dean Rasheed
On Fri, 11 Dec 2020 at 20:17, Tomas Vondra wrote: > > OK. Attached is an updated version, reworking it this way. Cool. I think this is an exciting development, so I hope it makes it into the next release. I have started looking at it. So far I have only looked at the catalog, parser and client c

Bug in numeric_power() if exponent is INT_MIN

2021-01-04 Thread Dean Rasheed
(Amusingly I only found this after discovering that Windows Calculator has a similar bug which causes it to crash if you try to raise a number to the power INT_MIN.) On my machine, numeric_power() loses all precision if the exponent is INT_MIN, though the actual failure mode might well be platform

Re: PoC/WIP: Extended statistics on expressions

2021-01-05 Thread Dean Rasheed
On Tue, 5 Jan 2021 at 00:45, Tomas Vondra wrote: > > On 1/4/21 4:34 PM, Dean Rasheed wrote: > > > > * In src/bin/psql/describe.c, I think the \d output should also > > exclude the "expressions" stats kind and just list the other kinds (or > > have no kinds

Re: PoC/WIP: Extended statistics on expressions

2021-01-06 Thread Dean Rasheed
Looking over the statscmds.c changes, there are a few XXX's and FIXME's that need resolving, and I had a couple of other minor comments: + /* +* An expression using mutable functions is probably wrong, +* since if you aren't going to get the same result for the +

Re: PoC/WIP: Extended statistics on expressions

2021-01-07 Thread Dean Rasheed
Starting to look at the planner code, I found an oversight in the way expression stats are read at the start of planning -- it is necessary to call ChangeVarNodes() on any expressions if the relid isn't 1, otherwise the stats expressions may contain Var nodes referring to the wrong relation. Possib

Re: PoC/WIP: Extended statistics on expressions

2021-03-05 Thread Dean Rasheed
On Thu, 4 Mar 2021 at 22:16, Tomas Vondra wrote: > > Attached is a slightly improved version of the patch series, addressing > most of the issues raised in the previous message. Cool. Sorry for the delay replying. > 0003-Extended-statistics-on-expressions-20210304.patch > > Mostly unchanged, The

Re: pgbench - add pseudo-random permutation function

2021-03-11 Thread Dean Rasheed
On Thu, 11 Mar 2021 at 00:58, Bruce Momjian wrote: > > Maybe Dean Rasheed can help because of his math background --- CC'ing him. > Reading the thread I can see how such a function might be useful to scatter non-uniformly random values. The implementation looks plausible too,

Re: pgbench - add pseudo-random permutation function

2021-03-11 Thread Dean Rasheed
On Thu, 11 Mar 2021 at 19:06, David Bowen wrote: > > The algorithm for generating a random permutation with a uniform distribution > across all permutations is easy: > for (i=0; iswap a[n-i] with a[rand(n-i+1)] > } > > where 0 <= rand(x) < x and a[i] is initially i (see Knuth, Section 3.4.2

Re: Additional improvements to extended statistics

2020-11-12 Thread Dean Rasheed
On Thu, 12 Nov 2020 at 14:18, Tomas Vondra wrote: > > Here is an improved WIP version of the patch series, modified to address > the issue with repeatedly applying the extended statistics, as discussed > with Dean in this thread. It's a bit rough and not committable, but I > need some feedback so

Re: Additional improvements to extended statistics

2020-11-17 Thread Dean Rasheed
On Thu, 12 Nov 2020 at 14:18, Tomas Vondra wrote: > > Here is an improved WIP version of the patch series, modified to address > the issue with repeatedly applying the extended statistics, as discussed > with Dean in this thread. It's a bit rough and not committable, but I > need some feedback so

Re: Additional improvements to extended statistics

2020-11-19 Thread Dean Rasheed
On Wed, 18 Nov 2020 at 22:37, Tomas Vondra wrote: > > Seems fine to me, although the "_opt_ext_stats" is rather cryptic. > AFAICS we use "_internal" for similar functions. > There's precedent for using "_opt_xxx" for function variants that add an option to existing functions, but I agree that in

Re: [bug+patch] Inserting DEFAULT into generated columns from VALUES RTE

2020-11-20 Thread Dean Rasheed
On Sun, 6 Sept 2020 at 22:42, Tom Lane wrote: > > I think you'd be better off to make transformInsertStmt(), specifically > its multi-VALUES-rows code path, check for all-DEFAULT columns and adjust > the tlist itself. Doing it there might be a good bit less inefficient > for very long VALUES list

Re: [bug+patch] Inserting DEFAULT into generated columns from VALUES RTE

2020-11-23 Thread Dean Rasheed
On Sun, 22 Nov 2020 at 20:58, Tom Lane wrote: > > I found only one nitpicky bug: in > findDefaultOnlyColumns, the test must be bms_is_empty(default_only_cols) > not just default_only_cols == NULL, or it will fail to fall out early > as intended when the first row contains some DEFAULTs but later r

Re: proposal: possibility to read dumped table's name from file

2020-11-25 Thread Dean Rasheed
On Thu, 19 Nov 2020 at 19:57, Pavel Stehule wrote: > > minor update - fixed handling of processing names with double quotes inside > I see this is marked RFC, but reading the thread it doesn't feel like we have reached consensus on the design for this feature. I agree that being able to configur

Re: proposal: possibility to read dumped table's name from file

2020-11-26 Thread Dean Rasheed
On Thu, 26 Nov 2020 at 06:43, Pavel Stehule wrote: > > st 25. 11. 2020 v 21:00 odesílatel Tom Lane napsal: >> >> (One thing to consider is >> how painful will it be for people to quote table names containing >> funny characters, for instance. On the command line, we largely >> depend on the shel

Re: Additional improvements to extended statistics

2020-11-29 Thread Dean Rasheed
> On Wed, 18 Nov 2020 at 22:37, Tomas Vondra > wrote: > > > > Seems fine to me, although the "_opt_ext_stats" is rather cryptic. > > AFAICS we use "_internal" for similar functions. > > I have been thinking about this some more. The one part of this that I still wasn't happy with was the way that

Re: PoC/WIP: Extended statistics on expressions

2021-01-18 Thread Dean Rasheed
Looking through extended_stats.c, I found a corner case that can lead to a seg-fault: CREATE TABLE foo(); CREATE STATISTICS s ON (1) FROM foo; ANALYSE foo; This crashes in lookup_var_attr_stats(), because it isn't expecting nvacatts to be 0. I can't think of any case where building stats on a tab

Re: PoC/WIP: Extended statistics on expressions

2021-01-21 Thread Dean Rasheed
On Tue, 19 Jan 2021 at 01:57, Tomas Vondra wrote: > > > A slightly bigger issue that I don't like is the way it assigns > > attribute numbers for expressions starting from > > MaxHeapAttributeNumber+1, so the first expression has an attnum of > > 1601. That leads to pretty inefficient use of Bitma

Re: PoC/WIP: Extended statistics on expressions

2021-01-22 Thread Dean Rasheed
On Fri, 22 Jan 2021 at 04:46, Justin Pryzby wrote: > > I think you'll maybe have to do something better - this seems a bit too weird: > > | postgres=# CREATE STATISTICS s2 ON (i+1) ,i FROM t; > | postgres=# \d t > | ... > | "public"."s2" (ndistinct, dependencies, mcv) ON i FROM t > I guess th

Re: Schema variables - new implementation for Postgres 15

2022-01-13 Thread Dean Rasheed
On Wed, 3 Nov 2021 at 13:05, Tomas Vondra wrote: > > 2) I find this a bit confusing: > > SELECT non_existent_variable; > test=# select s; > ERROR: column "non_existent_variable" does not exist > LINE 1: select non_existent_variable; > > I wonder if this means using SELECT to read variables is a b

Re: Schema variables - new implementation for Postgres 15

2022-01-13 Thread Dean Rasheed
On Thu, 13 Jan 2022 at 17:42, Pavel Stehule wrote: > > I like the idea of prioritizing tables over variables with warnings when > collision is detected. It cannot break anything. And it allows to using short > identifiers when there is not collision. Yeah, that seems OK, as long as it's clearly

Re: PoC/WIP: Extended statistics on expressions

2021-03-17 Thread Dean Rasheed
On Sun, 7 Mar 2021 at 21:10, Tomas Vondra wrote: > > 2) ndistinct > > There's one thing that's bugging me, in how we handle "partial" matches. > For each expression we track both the original expression and the Vars > we extract from it. If we can't find a statistics matching the whole > expressio

Re: PoC/WIP: Extended statistics on expressions

2021-03-17 Thread Dean Rasheed
On Wed, 17 Mar 2021 at 17:26, Tomas Vondra wrote: > > My concern is that the current behavior (where we prefer expression > stats over multi-column stats to some extent) works fine as long as the > parts are independent, but once there's dependency it's probably more > likely to produce underestim

Re: PoC/WIP: Extended statistics on expressions

2021-03-17 Thread Dean Rasheed
On Wed, 17 Mar 2021 at 19:07, Tomas Vondra wrote: > > On 3/17/21 7:54 PM, Dean Rasheed wrote: > > > > it might have been better to estimate the first case as > > > > ndistinct((a+b)) * ndistinct(c) * ndistinct(d) > > > > and the second case as >

Re: PoC/WIP: Extended statistics on expressions

2021-03-17 Thread Dean Rasheed
On Wed, 17 Mar 2021 at 20:48, Dean Rasheed wrote: > > For reference, here is the test case I was using (which isn't really very > good for > catching dependence between columns): > And here's a test case with much more dependence between the columns: DROP TABLE IF EXI

Re: PoC/WIP: Extended statistics on expressions

2021-03-18 Thread Dean Rasheed
On Wed, 17 Mar 2021 at 21:31, Tomas Vondra wrote: > > I agree applying at least the [(a+b),c] stats is probably the right > approach, as it means we're considering at least the available > information about dependence between the columns. > > I think to improve this, we'll need to teach the code t

Re: pgbench - add pseudo-random permutation function

2021-03-22 Thread Dean Rasheed
On Sun, 14 Mar 2021 at 16:08, Fabien COELHO wrote: > > > My main question on this now is, do you have a scholar reference for > > this algorithm? > > Nope, otherwise I would have put a reference. I'm a scholar though, if > it helps:-) > > I could not find any algorithm that fitted the bill. The us

Re: PoC/WIP: Extended statistics on expressions

2021-03-24 Thread Dean Rasheed
On Wed, 24 Mar 2021 at 10:22, Tomas Vondra wrote: > > Thanks, it seems to be some thinko in handling in PlaceHolderVars, which > seem to break the code's assumptions about varnos. This fixes it for me, > but I need to look at it more closely. > I think that makes sense. Reviewing the docs, I not

Re: PoC/WIP: Extended statistics on expressions

2021-03-24 Thread Dean Rasheed
On Wed, 24 Mar 2021 at 14:48, Tomas Vondra wrote: > > AFAIK the primary issue here is that the two places disagree. While > estimate_num_groups does this > > varnos = pull_varnos(root, (Node *) varshere); > if (bms_membership(varnos) == BMS_SINGLETON) > { ... } > > the add_unique_group

Re: PoC/WIP: Extended statistics on expressions

2021-03-24 Thread Dean Rasheed
On Wed, 24 Mar 2021 at 16:48, Tomas Vondra wrote: > > As for the changes proposed in the create_statistics, do we really want > to use univariate / multivariate there? Yes, the terms are correct, but > I'm not sure how many people looking at CREATE STATISTICS will > understand them. > Hmm, I thin

Re: PoC/WIP: Extended statistics on expressions

2021-03-25 Thread Dean Rasheed
On Thu, 25 Mar 2021 at 00:05, Tomas Vondra wrote: > > Actually, I think we need that block at all - there's no point in > keeping the exact expression, because if there was a statistics matching > it it'd be matched by the examine_variable. So if we get here, we have > to just split it into the va

Re: PoC/WIP: Extended statistics on expressions

2021-03-25 Thread Dean Rasheed
On Thu, 25 Mar 2021 at 00:05, Tomas Vondra wrote: > > here's an updated patch. 0001 The change to the way that CreateStatistics() records dependencies isn't quite right -- recordDependencyOnSingleRelExpr() will not create any dependencies if the expression uses only a whole-row Var. However, pull

Re: PoC/WIP: Extended statistics on expressions

2021-03-26 Thread Dean Rasheed
On Thu, 25 Mar 2021 at 19:59, Tomas Vondra wrote: > > Attached is an updated patch series, with all the changes discussed > here. I've cleaned up the ndistinct stuff a bit more (essentially > reverting back from GroupExprInfo to GroupVarInfo name), and got rid of > the UpdateStatisticsForTypeChang

Re: pgbench - add pseudo-random permutation function

2021-03-30 Thread Dean Rasheed
On Mon, 22 Mar 2021 at 13:43, Dean Rasheed wrote: > > On Sun, 14 Mar 2021 at 16:08, Fabien COELHO wrote: > > > > > My main question on this now is, do you have a scholar reference for > > > this algorithm? > > > > Nope, otherwise I would have put a

Re: pgbench - add pseudo-random permutation function

2021-03-30 Thread Dean Rasheed
On Tue, 30 Mar 2021 at 19:26, Fabien COELHO wrote: > > First, I have a thing against erand48. Yeah, that's probably a fair point. However, all the existing pgbench random functions are using it, so I think it's fair enough for permute() to do the same (and actually 2^48 is pretty huge). Switching

Re: pgbench - add pseudo-random permutation function

2021-03-30 Thread Dean Rasheed
On Tue, 30 Mar 2021 at 20:31, Dean Rasheed wrote: > > Yeah, that's probably a fair point. However, all the existing pgbench > random functions are using it, so I think it's fair enough for > permute() to do the same (and actually 2^48 is pretty huge). Switching > to a 6

Re: pgbench - add pseudo-random permutation function

2021-03-31 Thread Dean Rasheed
On Wed, 31 Mar 2021 at 09:02, Fabien COELHO wrote: > > >> First, I have a thing against erand48. > > > Also, there is a 64 bits seed provided to the function which instantly > ignores 16 of them, which looks pretty silly to me. > Yeah, that was copied from set_random_seed(). > At least, I sugges

Re: pgbench - add pseudo-random permutation function

2021-04-01 Thread Dean Rasheed
On Wed, 31 Mar 2021 at 18:53, Fabien COELHO wrote: > > While looking at it, I have some doubts on this part: > > m = (uint64) (pg_erand48(random_state.xseed) * (mask + 1)) | 1; > r = (uint64) (pg_erand48(random_state.xseed) * (mask + 1)); > r = (uint64) (pg_erand48(random_state.xseed) * size

Re: pgbench - add pseudo-random permutation function

2021-04-04 Thread Dean Rasheed
On Fri, 2 Apr 2021 at 06:38, Fabien COELHO wrote: > > >> r = (uint64) (pg_erand48(random_state.xseed) * size); > >> > >> I do not understand why the random values are multiplied by anything in > >> the first place… > > > > These are just random integers in the range [0,mask] and [0,size-1], > >

Re: pgbench - add pseudo-random permutation function

2021-04-06 Thread Dean Rasheed
On Mon, 5 Apr 2021 at 13:07, Fabien COELHO wrote: > > Attached a v28 which I hope fixes the many above issues and stays with > ints. The randu64 is still some kind of a joke, I artificially reduced the > cost by calling jrand48 once and extending it to 64 bits, so it could give > an idea of the co

Re: Wrong results from in_range() tests with infinite offset

2020-07-21 Thread Dean Rasheed
On Tue, 21 Jul 2020 at 03:06, Tom Lane wrote: > > Pushed, but I chickened out of back-patching. The improvement in what > happens for finite comparison values seems somewhat counterbalanced by > the possibility that someone might not like the definition we arrived > at for infinities. So, it's n

Re: Infinities in type numeric

2020-07-22 Thread Dean Rasheed
On Tue, 21 Jul 2020 at 23:18, Tom Lane wrote: > > Here's a v4 that syncs numeric in_range() with the new behavior of > float in_range(), and addresses your other comments too. > LGTM. Regards, Dean

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-26 Thread Dean Rasheed
On 18 March 2018 at 23:57, Tomas Vondra wrote: > Attached is an updated version of the patch series, addressing issues > pointed out by Alvaro. I'm just starting to look at this now, and I think I'll post individual comments/questions as I get to them rather than trying to review the whole thing,

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-26 Thread Dean Rasheed
On 26 March 2018 at 14:08, Tomas Vondra wrote: > On 03/26/2018 12:31 PM, Dean Rasheed wrote: >> A wider concern I have is that I think this function is trying to be >> too clever by only resetting selected stats. IMO it should just reset >> all stats unconditionally when th

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-26 Thread Dean Rasheed
On 18 March 2018 at 23:57, Tomas Vondra wrote: > Attached is an updated version of the patch series, addressing issues > pointed out by Alvaro. I've just been reading the new code in statext_clauselist_selectivity() and mcv_clauselist_selectivity(), and I'm having a hard time convincing myself th

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-26 Thread Dean Rasheed
On 26 March 2018 at 20:17, Tomas Vondra wrote: > On 03/26/2018 09:01 PM, Dean Rasheed wrote: >> Also, just above that, in statext_clauselist_selectivity(), it >> computes the list stat_clauses, then doesn't appear to use it >> anywhere. I think that would have been the

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-27 Thread Dean Rasheed
On 27 March 2018 at 01:36, Tomas Vondra wrote: > 4) handling of NOT clauses in MCV lists (and in histograms) > > The query you posted does not fail anymore... > Ah, it turns out the previous query wasn't actually failing for the reason I thought it was -- it was failing because it had a ScalarArr

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-27 Thread Dean Rasheed
On 27 March 2018 at 01:36, Tomas Vondra wrote: > BTW I think there's a bug in handling the fullmatch flag - it should not > be passed to AND/OR subclauses the way it is, because then > > WHERE a=1 OR (a=2 AND b=2) > > will probably set it to 'true' because of (a=2 AND b=2). Which will > short-

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-27 Thread Dean Rasheed
On 27 March 2018 at 14:58, Dean Rasheed wrote: > On 27 March 2018 at 01:36, Tomas Vondra wrote: >> 4) handling of NOT clauses in MCV lists (and in histograms) >> >> The query you posted does not fail anymore... >> > Ah, it turns out the previous query wasn't

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-28 Thread Dean Rasheed
On 28 March 2018 at 01:34, Tomas Vondra wrote: > Attached is a patch fixing this. In the end I've decided to keep both > branches - one handling boolean Vars and one for NOT clauses. I think > you're right we can only see (NOT var) cases, but I'm not sure about that. > > For example, what if an op

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-03-28 Thread Dean Rasheed
On 28 March 2018 at 15:50, Tomas Vondra wrote: > After thinking about this a bit more, I'm not sure if updating the info > based on recursive calls makes sense. The fullmatch flag was supposed to > answer a simple question - can there be just a single matching item? > > If there are equality condi

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-04-07 Thread Dean Rasheed
On 7 April 2018 at 15:12, Bruce Momjian wrote: > Uh, where are we on this patch? It isn't going to make it into PG 11? > Feature development for this has been going on for years. Unfortunately, I think there's no way that this will be ready for PG11. So far, I have only read parts of the patch,

Re: BF failure: could not open relation with OID XXXX while querying pg_views

2019-09-15 Thread Dean Rasheed
On Sat, 14 Sep 2019 at 05:25, Tom Lane wrote: > > Tomas Vondra writes: > > On Wed, Aug 14, 2019 at 05:24:26PM +1200, Thomas Munro wrote: > >> On Wed, Aug 14, 2019 at 5:06 PM Tom Lane wrote: > >>> Oh, hmm --- yeah, that should mean it's safe. Maybe somebody incautiously > >>> changed one of the

Re: BF failure: could not open relation with OID XXXX while querying pg_views

2019-09-15 Thread Dean Rasheed
On Sun, 15 Sep 2019 at 11:11, Tomas Vondra wrote: > > On Sun, Sep 15, 2019 at 10:16:30AM +0100, Dean Rasheed wrote: > > > >Ah sorry, I missed this thread before. As author of that commit, it's > >really on me to fix it, and the cause seems pretty clear-cut, so I

Re: BF failure: could not open relation with OID XXXX while querying pg_views

2019-09-15 Thread Dean Rasheed
On Sun, 15 Sep 2019 at 12:20, Tomas Vondra wrote: > > On Sun, Sep 15, 2019 at 11:27:19AM +0100, Dean Rasheed wrote: > >On Sun, 15 Sep 2019 at 11:11, Tomas Vondra > >wrote: > >> > >> On Sun, Sep 15, 2019 at 10:16:30AM +0100, Dean Rasheed wrote: > >>

Re: BUG #15623: Inconsistent use of default for updatable view

2019-02-28 Thread Dean Rasheed
On Thu, 28 Feb 2019 at 07:47, Amit Langote wrote: > > +if (attrno == 0) > +elog(ERROR, "Cannot set value in column %d to > DEFAULT", i); > > Maybe: s/Cannot/cannot/g > Ah yes, you're right. That is the convention. > +Assert(list_length(sublist) == num

Re: Row Level Security − leakproof-ness and performance implications

2019-02-28 Thread Dean Rasheed
On Thu, 28 Feb 2019 at 14:13, Robert Haas wrote: > A wild idea might be to let > proleakproof take on three values: yes, no, and maybe. When 'maybe' > functions are involved, we tell them whether or not the current query > involves any security barriers, and if so they self-censor. > Does self-c

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-09 Thread Dean Rasheed
On Thu, 28 Feb 2019 at 19:56, Tomas Vondra wrote: > Attached is an updated version of this patch series. Here are some random review comments. I'll add more later, but I'm out of energy for today. 1). src/test/regress/expected/type_sanity.out has bit-rotted. 2). Duplicate OIDs (3425). 3). It l

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-10 Thread Dean Rasheed
On Sat, 9 Mar 2019 at 18:33, Dean Rasheed wrote: > > On Thu, 28 Feb 2019 at 19:56, Tomas Vondra > wrote: > > Attached is an updated version of this patch series. > > Here are some random review comments. I'll add more later, but I'm out > of energy for today. &

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-10 Thread Dean Rasheed
On Sun, 10 Mar 2019 at 13:09, Dean Rasheed wrote: > Here are some more comments: > One more thing --- the comment for statext_clauselist_selectivity() says: * So (simple_selectivity - base_selectivity) may be seen as a correction for * the part not covered by the MCV list. That'

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-11 Thread Dean Rasheed
On Sun, 10 Mar 2019 at 17:36, Tomas Vondra wrote: > On 3/10/19 2:09 PM, Dean Rasheed wrote: > > 14). The attnums Bitmapset passed to > > statext_is_compatible_clause_internal() is an input/output argument > > that it updates. That should probably be documented. Whe

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-11 Thread Dean Rasheed
On Sun, 10 Mar 2019 at 22:28, David Rowley wrote: > > On Mon, 11 Mar 2019 at 06:36, Tomas Vondra > wrote: > > > > On 3/9/19 7:33 PM, Dean Rasheed wrote: > > > I wonder if it's possible to write smaller, more targeted tests. > > > Currently "s

Re: [Patch] Log10 and hyperbolic functions for SQL:2016 compliance

2019-03-11 Thread Dean Rasheed
On Sun, 3 Feb 2019 at 15:12, Tom Lane wrote: > > Andrew Gierth writes: > > The spec doesn't require the inverse functions (asinh, acosh, atanh), > > but surely there is no principled reason to omit them? > > +1 --- AFAICS, the C library has offered all six since C89. > +1 for including the inver

Re: pgsql: Add support for hyperbolic functions, as well as log10().

2019-03-13 Thread Dean Rasheed
On Wed, 13 Mar 2019, 21:56 Tom Lane, wrote: > > Of these, probably the least bad is #3, even though it might require > a few rounds of experimentation to find the best extra_float_digits > setting to use. I'll go try it without any roundoff, just to see > what the raw results look like ... > Y

Re: pgsql: Add support for hyperbolic functions, as well as log10().

2019-03-14 Thread Dean Rasheed
On Thu, 14 Mar 2019 at 04:41, Tom Lane wrote: > > Dean Rasheed writes: > > I'm amazed that jacana's asinh() returned -0 for an input of +0. > > Even more amusingly, it returns NaN for acosh('infinity'), cf > https://buildfarm.postgresql.org/cgi-bin/show

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-16 Thread Dean Rasheed
On Fri, 15 Mar 2019 at 00:06, Tomas Vondra wrote: > I've noticed an annoying thing when modifying type of column not > included in a statistics... > > That is, we don't remove the statistics, but the estimate still changes. > But that's because the ALTER TABLE also resets reltuples/relpages: > > T

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-16 Thread Dean Rasheed
On Fri, 15 Mar 2019 at 00:06, Tomas Vondra wrote: > ... attached patch ... Some more review comments, carrying on from where I left off: 16). This regression test fails for me: @@ -654,11 +654,11 @@ -- check change of unrelated column type does not reset the MCV statistics ALTER TABLE mcv_lis

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-17 Thread Dean Rasheed
On Sat, 16 Mar 2019 at 23:44, Tomas Vondra wrote: > > > 21). For consistency with other bms_ functions, I think the name of > > the Bitmapset argument for bms_member_index() should just be called > > "a". Nitpicking, I'd also put bms_member_index() immediately after > > bms_is_member() in the sour

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-17 Thread Dean Rasheed
On Sat, 16 Mar 2019 at 23:44, Tomas Vondra wrote: > > > 28). I just spotted the 1MB limit on the serialised MCV list size. I > > think this is going to be too limiting. For example, if the stats > > target is at its maximum of 1, that only leaves around 100 bytes > > for each item's values, wh

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-17 Thread Dean Rasheed
On Sat, 16 Mar 2019 at 23:44, Tomas Vondra wrote: > > > > 16). This regression test fails for me: > > > > @@ -654,11 +654,11 @@ > > -- check change of unrelated column type does not reset the MCV statistics > > ALTER TABLE mcv_lists ALTER COLUMN d TYPE VARCHAR(64); > > SELECT * FROM check_estim

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-24 Thread Dean Rasheed
On Sun, 24 Mar 2019 at 00:17, David Rowley wrote: > > On Sun, 24 Mar 2019 at 12:41, Tomas Vondra > wrote: > > > > On 3/21/19 4:05 PM, David Rowley wrote: > > > > 29. Looking at the tests I see you're testing that you get bad > > > estimates without extended stats. That does not really seem like

Re: BUG #15708: RLS 'using' running as wrong user when called from a view

2019-03-24 Thread Dean Rasheed
On Thu, 21 Mar 2019 at 00:39, PG Bug reporting form wrote: > > This fails, seemingly because the RLS on 'bar' is being checked by alice, > instead of the view owner bob: > Yes I agree, that appears to be a bug. The subquery in the RLS policy should be checked as the view owner -- i.e., we need to

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-26 Thread Dean Rasheed
On Mon, 25 Mar 2019 at 23:36, Tomas Vondra wrote: > > Attached is an updated patch, fixing all the issues pointed out so far. > Unless there are some objections, I plan to commit the 0001 part by the > end of this CF. Part 0002 is a matter for PG13, as previously agreed. > Yes, I think that's rea

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-03-26 Thread Dean Rasheed
On Tue, 26 Mar 2019 at 11:59, Dean Rasheed wrote: > > On Mon, 25 Mar 2019 at 23:36, Tomas Vondra > wrote: > > > > Attached is an updated patch... > > I just looked through the latest set of changes and I have a couple of > additional review comments: > I

Re: BUG #15708: RLS 'using' running as wrong user when called from a view

2019-03-27 Thread Dean Rasheed
On Mon, 25 Mar 2019 at 20:27, Stephen Frost wrote: > > * Dean Rasheed (dean.a.rash...@gmail.com) wrote: > > > It looks like the best place to fix it is in > > get_policies_for_relation(), since that's where all the policies to be > > applied for a given RTE are

Re: ANALYZE: ERROR: tuple already updated by self

2019-07-29 Thread Dean Rasheed
On Sun, 28 Jul 2019 at 11:15, Tomas Vondra wrote: > > Attached is a patch fixing the error by not building extended stats for > the inh=true case (as already proposed in this thread). That's about the > simplest way to resolve this issue for v12. It should add a simple > regression test too, I gue

Re: Multivariate MCV list vs. statistics target

2019-08-01 Thread Dean Rasheed
On Thu, 1 Aug 2019 at 11:30, Tomas Vondra wrote: > > I'll move it to the next CF. Aside from the issues pointed by Kyotaro-san > in his review, I still haven't made my mind about whether to base the use > statistics targets set for the attributes. That's what we're doing now, > but I'm not sure it

Re: MCV lists for highly skewed distributions

2018-01-21 Thread Dean Rasheed
On 21 January 2018 at 07:26, John Naylor wrote: > I spent a few hours hacking on this, and it turns out calculating the > right number of MCVs taking into account both uniform and highly > non-uniform distributions is too delicate a problem for me to solve > right now. The logic sugge

Re: stricter MCV tests for uniform distributions (was Re: MCV lists for highly skewed distributions)

2018-01-22 Thread Dean Rasheed
On 22 January 2018 at 08:07, John Naylor wrote: > On 1/21/18, Dean Rasheed wrote: >> It occurs to me that maybe a better test to exclude a value from the >> MCV list would be to demand that its relative standard error not be >> too high. Such a test, in addition to the exi

Re: WINDOW RANGE patch versus leakproofness

2018-01-31 Thread Dean Rasheed
On 30 January 2018 at 16:42, Tom Lane wrote: > So I'm thinking that (a) we do not need to check for leaky functions used > in window support, and (b) therefore there's no need to avoid leaky > behavior in in_range support functions. Objections? > Yes, I concur. Since window functions can only ap

Re: WINDOW RANGE patch versus leakproofness

2018-01-31 Thread Dean Rasheed
On 31 January 2018 at 21:51, Robert Haas wrote: > On Wed, Jan 31, 2018 at 5:52 AM, Dean Rasheed > wrote: >> On 30 January 2018 at 16:42, Tom Lane wrote: >>> So I'm thinking that (a) we do not need to check for leaky functions used >>> in window support, a

Re: MCV lists for highly skewed distributions

2018-02-01 Thread Dean Rasheed
On 1 February 2018 at 13:16, Simon Riggs wrote: > On 25 January 2018 at 22:19, Tom Lane wrote: >> In any case, since it looks like the next step is for someone to come >> up with a new proposal, I'm going to set this to Waiting on Author. > > Dean and John's results show that different algorithms

Re: MCV lists for highly skewed distributions

2018-02-07 Thread Dean Rasheed
On 1 February 2018 at 17:49, Robert Haas wrote: > One point which I want to emphasize is that the length of the MCV list > bounds the estimated frequency of non-MCVs in two ways: no non-MCV is > ever thought to be more frequent than the least-common MCVs, and > however many non-MCVs we think we ha

Re: MCV lists for highly skewed distributions

2018-02-07 Thread Dean Rasheed
On 4 February 2018 at 12:18, John Naylor wrote: > I did the same basic eyeball testing I did on earlier patches, and > this is the best implementation so far. I've attached some typical > pg_stats contents for HEAD and this patch. More rigorous testing, > including of planner estimates on real dat

  1   2   3   4   5   6   7   >