description of Aggregate Expressions

2019-12-05 Thread John Lumby
In the PostgreSQL 12.1 documentation, section 4.2.7, "Aggregate Expressions", it says:


The syntax of an aggregate expression is one of the following:
  ... 
aggregate_name (DISTINCT expression [ , ... ] [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]
...

I believe this is incorrect in the case where the DISTINCT is on a 
comma-separated list of expressions.
It would imply that this is legal

select count(DISTINCT parent_id , name) from  mytable

but that is rejected with 
ERROR:  function count(bigint, text) does not exist

whereas 

select count(DISTINCT ( parent_id , name) ) from mytable

is accepted.

So I think to handle all cases the line in the doc should read

aggregate_name (DISTINCT ( expression [ , ... ] ) [ order_by_clause ] ) [ FILTER ( WHERE filter_clause ) ]

I don't know how to indicate that those extra parentheses can be omitted if the 
list has only one expression.

Cheers,  John Lumby



Re: description of Aggregate Expressions

2019-12-05 Thread David G. Johnston
On Thu, Dec 5, 2019 at 3:18 PM John Lumby  wrote:

> In PostgreSQL 12.1 Documentation chapter 4.2.7. Aggregate Expressions  it
> says
>
>
> The syntax of an aggregate expression is one of the following:
>   ...
> aggregate_name (DISTINCT expression [ , ... ] [ order_by_clause ] ) [
> FILTER ( WHERE filter_clause ) ]
> ...
>
> I believe this is incorrect in the case where the DISTINCT is on a
> comma-separated list of expressions.
> It would imply that this is legal
>

It is legal: you didn't get a syntax error.

>
> select count(DISTINCT parent_id , name) from  mytable
>
> but that is rejected with
> ERROR:  function count(bigint, text) does not exist
>

The query is syntactically correct.  The error arises because executing it
as written would require a function count(bigint, text), and no such
function exists.  As far as a general syntax diagram goes, it has correctly
communicated what is legal.
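
To illustrate the difference between a parse error and a failed function lookup, here is a minimal sketch. The sample rows are invented for illustration, and string_agg is used only because it is a built-in aggregate that takes two arguments; it is not mentioned in the original report.

```sql
-- Hypothetical data: column names follow the thread, the rows are made up.
WITH mytable(parent_id, name) AS (
    VALUES (1::bigint, 'a'::text), (1, 'a'), (1, 'b'), (2, 'a')
)
-- Two expressions after DISTINCT: this parses and also resolves,
-- because an aggregate string_agg(text, text) exists.
SELECT string_agg(DISTINCT name, ', ' ORDER BY name) AS names
FROM mytable;
-- names = 'a, b'

-- The same syntactic shape with count() also parses, but no two-argument
-- count() exists, so function lookup fails:
--   ERROR:  function count(bigint, text) does not exist
WITH mytable(parent_id, name) AS (
    VALUES (1::bigint, 'a'::text), (1, 'a'), (1, 'b'), (2, 'a')
)
SELECT count(DISTINCT parent_id, name) FROM mytable;
```

Both statements match the documented diagram; only the second fails, and it fails when the function is looked up rather than in the parser.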



> whereas
>
> select count(DISTINCT ( parent_id , name) ) from mytable
>
> is accepted.
>

Correct, converting the two individual columns into a "tuple" allows the
default tuple distinct-making infrastructure to be used to execute the
query.


> So I think to handle all cases the line in the doc should read
>
> aggregate_name (DISTINCT ( expression [ , ... ] ) [ order_by_clause ] ) [
> FILTER ( WHERE filter_clause ) ]
>
> I don't know how to indicate that those extra parentheses can be omitted
> if the list has only one expression.
>

Then I would have to say the proposed solution to this edge case is worse
than the problem.  I also don't expect there to be a clean way to capture
the complexities of expressions at the syntax diagram level.

David J.


Re: description of Aggregate Expressions

2019-12-05 Thread Tom Lane
"David G. Johnston"  writes:
> On Thu, Dec 5, 2019 at 3:18 PM John Lumby  wrote:
>> whereas
>> select count(DISTINCT ( parent_id , name) ) from mytable
>> is accepted.

> Correct, converting the two individual columns into a "tuple" allows the
> default tuple distinct-making infrastructure to be used to execute the
> query.

Yeah.  This might be more intelligible if it were written

select count(DISTINCT ROW(parent_id, name) ) from mytable

However, the SQL committee in their finite wisdom have decreed that
the ROW keyword is optional.  (As long as there's more than one
column expression; the need for that special case is another reason
why omitting ROW isn't really a nice thing to do.)
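
A small sketch of the three spellings side by side, using invented sample rows (the column names follow the thread, and the output column aliases are made up here):

```sql
-- Hypothetical data: three distinct (parent_id, name) pairs, two distinct parent_ids.
WITH mytable(parent_id, name) AS (
    VALUES (1::bigint, 'a'::text), (1, 'a'), (1, 'b'), (2, 'a')
)
SELECT
    -- Explicit row constructor: one argument of composite type.
    count(DISTINCT ROW(parent_id, name)) AS with_row_keyword,
    -- ROW omitted: still a row constructor, since there are two expressions.
    count(DISTINCT (parent_id, name))    AS without_row_keyword,
    -- One expression in parentheses is NOT a row constructor; it is simply
    -- the scalar parent_id, so this counts distinct parent_id values.
    count(DISTINCT (parent_id))          AS parenthesized_scalar
FROM mytable;
-- with_row_keyword = 3, without_row_keyword = 3, parenthesized_scalar = 2
```

The last column shows the special case: with a single expression the parentheses are ordinary grouping, and a one-column row would have to be written as ROW(parent_id).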

regards, tom lane




Re: monitoring-stats.html is too impenetrable

2019-12-05 Thread Michael Paquier
On Wed, Dec 04, 2019 at 03:29:55AM -0800, James Salsman wrote:
> Thank you for your thoughtful reply. This might be much easier:
> 
> How about adding another example to
> https://www.postgresql.org/docs/11/planner-stats.html ?

Not sure I see the parallel here.  This page talks about planner
statistics, whereas your query is about finding missing indexes because
of incorrect stats.

> SELECT relname, seq_scan-idx_scan AS too_much_seq,
>        case when seq_scan-idx_scan>0 THEN 'Missing Index?' ELSE 'OK' END,
>        pg_relation_size(relid::regclass) AS rel_size, seq_scan, idx_scan
> FROM pg_stat_all_tables
> WHERE schemaname='public' AND pg_relation_size(relid::regclass)>8
> ORDER BY too_much_seq DESC;

Again, this is a bit more complex than that.
--
Michael

