On 11/24/20 5:23 PM, Justin Pryzby wrote:
> On Mon, Nov 23, 2020 at 04:30:26AM +0100, Tomas Vondra wrote:
>> 0004 - Seems fine. IMHO not really "silly errors" but OK.
> 
> This is one of the same issues you pointed out - shadowing a variable.
> Could be backpatched.
> 
> On Mon, Nov 23, 2020 at 04:30:26AM +0100, Tomas Vondra wrote:
>>> +                                errmsg("statistics expressions and 
>>> predicates can refer only to the table being indexed")));
>>> +        * partial-index predicates.  Create it in the per-index context to 
>>> be
>>>
>>> I think these are copied and shouldn't mention "indexes" or "predicates".  
>>> Or
>>> should statistics support predicates, too ?
>>>
>>
>> Right. Stupid copy-pasto.
> 
> Right, but then I was wondering if CREATE STATS should actually support
> predicates, since one use case is to do what indexes do without their 
> overhead.
> I haven't thought about it enough yet.
> 

Well, it's not supported now, so the message is bogus. I'm not against
supporting "partial statistics" with predicates in the future, but it's
going to be non-trivial project on it's own. It's not something I can
bolt onto the current patch easily.

>> 0006 - Not sure. I think CreateStatistics can be fixed with less code,
>> keeping it more like PG13 (good for backpatching). Not sure why rename
>> extended statistics to multi-variate statistics - we use "extended"
>> everywhere.
> 
> -       if (build_expressions && (list_length(stxexprs) == 0))
> +       if (!build_expressions_only && (list_length(stmt->exprs) < 2))
>                 ereport(ERROR,  
>                                 (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> -                                errmsg("extended expression statistics 
> require at least one expression")));
> +                                errmsg("multi-variate statistics require at 
> least two columns")));
> 
> I think all of "CREATE STATISTICS" has been known as "extended stats", so I
> think it may be confusing to say that it requires two columns for the general
> facility.
> 
>> Not sure what's the point of serialize_expr_stats changes,
>> that's code is mostly copy-paste from update_attstats.
> 
> Right.  I think "i" is poor variable name when it isn't a loop variable and 
> not
> of limited scope.
> 

OK, I understand. I'll consider tweaking that.

>> 0007 - I suspect this makes the pg_stats_ext too complex to work with,
>> IMHO we should move this to a separate view.
> 
> Right - then unnest() the whole thing and return one row per expression rather
> than array, as you've done.  Maybe the docs should say that this returns one
> row per expression.
> 
> Looking quickly at your new patch: I guess you know there's a bunch of
> lingering references to "indexes" and "predicates":
> 
> I don't know if you want to go to the effort to prohibit this.
> postgres=# CREATE STATISTICS asf ON (tableoid::int+1) FROM t;
> CREATE STATISTICS
> 

Hmm, we're already rejecting system attributes, I suppose we should do
the same thing for expressions on system attributes.

> I think a lot of people will find this confusing:
> 
> postgres=# CREATE STATISTICS asf ON i FROM t;
> ERROR:  extended statistics require at least 2 columns
> postgres=# CREATE STATISTICS asf ON (i) FROM t;
> CREATE STATISTICS
> 
> postgres=# CREATE STATISTICS asf (expressions) ON i FROM t;
> ERROR:  extended expression statistics require at least one expression
> postgres=# CREATE STATISTICS asf (expressions) ON (i) FROM t;
> CREATE STATISTICS
> 
> I haven't looked, but is it possible to make it work without parens ?
> 

Hmm, you're right that may be surprising. I suppose we could walk the
expressions while creating the statistics, and replace such trivial
expressions with the nested variable, but I haven't tried. I wonder what
the CREATE INDEX behavior would be in these cases.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Reply via email to