Hi all,

I'd like to propose support for GROUP BY ALL in Calcite - a shorthand that
groups every expression in the SELECT clause that isn't an aggregate.

Here is the Jira ticket: https://issues.apache.org/jira/browse/CALCITE-7594

Context:
- Today, if I want to group by every non-aggregated expression in the
SELECT list, I would have to repeat them in the group by.
- A GROUP BY ALL shorthand has become convergence across warehouse
dialects, and supporting it keeps Calcite-based dialects updated and
aligned.

Proposal:
- Add GROUP BY ALL: when ALL appears with no grouping items, group by every
SELECT expression that is not an aggregate or window function. It's
implemented as a parser marker that the validator rewrites into the
concrete grouping expressions before normal group validation runs, so the
converter / optimizer never see a marker.
- Alternatives considered: I kept ALL (matches DuckDB, Spark, and others).
The existing ALL / DISTINCT set quantifier on grouping items (CALCITE-5089,
e.g. GROUP BY ALL CUBE(a, b)) is fully preserved. The new meaning only
applies when no grouping items follow ALL.
- Expected impact / tradeoff: purely additive. Bare GROUP BY ALL does not
parse today, so enabling it cannot change the meaning of any existing
query. SELECT * with GROUP BY ALL is rejected with a clear error for now
(the star isn't expanded at validation time)

For example:

SELECT deptno, job, SUM(sal)
FROM emp
GROUP BY ALL;

resolves to

SELECT deptno, job, SUM(sal)
FROM emp
GROUP BY deptno, job;

Open questions:
- Since this feature is additive, I don't think a conformance flag is
needed, but would we prefer one?
- Should SELECT * be supported in this proposal, or is the explicit error
enough for an initial cut?

If this direction sounds reasonable, I have a prototype ready (parser +
validator + tests + reference docs) and will put up the PR soon. A
companion ORDER BY ALL proposal (CALCITE-7597) is in the same vein.

Best,
Tisya Bhatia

Reply via email to