Re: Why are nested aggregations illegal? Best alternatives?

Julian Hyde Thu, 10 Feb 2022 12:27:59 -0800

Yes, if you want to do multiple layers of aggregation, use CTEs (WITH) or 
nested sub-queries. For example, the following is I believe valid standard SQL, 
and actually computes something useful:


  WITH q1 AS
   (SELECT deptno, job, AVG(sal) AS avg_sal
    FROM emp
    GROUP BY deptno, job)
  WITH q2 AS
   (SELECT deptno, AVG(avg_sal) AS avg_avg_sal
    FROM q1
    GROUP BY deptno)
  SELECT AVG(avg_avg_sal)
  FROM q2
  GROUP BY ()

(You can omit the “GROUP BY ()” line, but I think it makes things clearer.)

Julian



> On Feb 10, 2022, at 12:17 PM, Justin Swanhart <greenl...@gmail.com> wrote:
> 
> I wish you could unsend emails :)  Answering my own question, no, because
> that would return three rows with the average :D
> 
> On Thu, Feb 10, 2022 at 3:16 PM Justin Swanhart <greenl...@gmail.com> wrote:
> 
>> Just out of curiosity, is the second level aggregation using AVG in a
>> window context?  It the frame is the whole table and it aggregates over it?
>> 
>> On Thu, Feb 10, 2022 at 3:12 PM Justin Swanhart <greenl...@gmail.com>
>> wrote:
>> 
>>> That is really neat about Oracle.
>>> 
>>> The alternative in general is to use a subquery:
>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno;
>>> becomes
>>> select avg(the_avg)
>>> from (select avg(sal) from emp group b deptno) an_alias;
>>> 
>>> or
>>> 
>>> with the_cte as (select avg(sal) x from emp group by deptno)
>>> select avg(x) from the_cte;
>>> 
>>> On Thu, Feb 10, 2022 at 3:03 PM Julian Hyde <jhyde.apa...@gmail.com>
>>> wrote:
>>> 
>>>> Some databases, e.g. Oracle, allow TWO levels of nesting:
>>>> 
>>>> SELECT avg(sal) FROM emp GROUP BY deptno;
>>>> 
>>>> AVG(SAL)
>>>> ========
>>>> 1,566.67
>>>> 2,175.00
>>>> 2,916.65
>>>> 
>>>> SELECT avg(avg(sal)) FROM emp GROUP BY deptno;
>>>> 
>>>> AVG(SUM(SAL))
>>>> =============
>>>>        9,675
>>>> 
>>>> The first level aggregates by department (returning 3 records), and the
>>>> second level computes the grand total (returning 1 record). But that is an
>>>> exceptional case.
>>>> 
>>>> Generally, any expression in the SELECT or HAVING clause of an aggregate
>>>> query is either ‘before’ or ‘after’ aggregation. Consider
>>>> 
>>>>  SELECT t.x + 1 AS a, 2 + SUM(t.y + 3) AS b
>>>>  FROM t
>>>>  GROUP BY t.x
>>>> 
>>>> The expressions “t.y” and “t.y + 3” occur before aggregation; “t.x”,
>>>> “t.x + 1”, “SUM(t.y + 3)” and “2 + SUM(t.y + 3)” occur after aggregation.
>>>> SQL semantics rely heavily on this stratification. Allowing an extra level
>>>> of aggregation would mess it all up.
>>>> 
>>>> Julian
>>>> 
>>>> 
>>>> 
>>>>> On Feb 10, 2022, at 9:45 AM, Justin Swanhart <greenl...@gmail.com>
>>>> wrote:
>>>>> 
>>>>> This is a SQL limitation.
>>>>> 
>>>>> mysql> select sum(1);
>>>>> +--------+
>>>>> | sum(1) |
>>>>> +--------+
>>>>> |      1 |
>>>>> +--------+
>>>>> 1 row in set (0.00 sec)
>>>>> 
>>>>> mysql> select sum(sum(1));
>>>>> ERROR 1111 (HY000): Invalid use of group function
>>>>> 
>>>>> On Thu, Feb 10, 2022 at 12:39 PM Gavin Ray <ray.gavi...@gmail.com>
>>>> wrote:
>>>>> 
>>>>>> Went to test this query out and found that it can't be performed:
>>>>>> 
>>>>>> SELECT
>>>>>>   JSON_OBJECT(
>>>>>>       KEY 'users'
>>>>>>       VALUE JSON_ARRAYAGG(
>>>>>>           JSON_OBJECT(
>>>>>>               KEY 'name' VALUE "users"."name",
>>>>>>               KEY 'todos' VALUE JSON_ARRAYAGG(
>>>>>>                   JSON_OBJECT(
>>>>>>                       KEY 'description' VALUE "todos"."description"
>>>>>>                   )
>>>>>>               )
>>>>>>           )
>>>>>>       )
>>>>>>   )
>>>>>> FROM
>>>>>>   "users"
>>>>>> LEFT OUTER JOIN
>>>>>>   "todos" ON "users"."id" = "todos"."user_id";
>>>>>> 
>>>>>> Checking the source, seems this is a blanket policy, not a
>>>>>> datasource-specific thing.
>>>>>> From a functional perspective, it doesn't feel like it's much
>>>> different
>>>>>> from JOINs
>>>>>> But I don't understand relational theory or DB functionality in the
>>>> least,
>>>>>> so I'm not fit to judge.
>>>>>> 
>>>>>> Just curious why Calcite doesn't allow this
>>>>>> 
>>>> 
>>>>

Re: Why are nested aggregations illegal? Best alternatives?

Reply via email to