Hi all, I'd like to propose support for GROUP BY ALL in Calcite - a shorthand that groups every expression in the SELECT clause that isn't an aggregate.
Here is the Jira ticket: https://issues.apache.org/jira/browse/CALCITE-7594 Context: - Today, if I want to group by every non-aggregated expression in the SELECT list, I would have to repeat them in the group by. - A GROUP BY ALL shorthand has become convergence across warehouse dialects, and supporting it keeps Calcite-based dialects updated and aligned. Proposal: - Add GROUP BY ALL: when ALL appears with no grouping items, group by every SELECT expression that is not an aggregate or window function. It's implemented as a parser marker that the validator rewrites into the concrete grouping expressions before normal group validation runs, so the converter / optimizer never see a marker. - Alternatives considered: I kept ALL (matches DuckDB, Spark, and others). The existing ALL / DISTINCT set quantifier on grouping items (CALCITE-5089, e.g. GROUP BY ALL CUBE(a, b)) is fully preserved. The new meaning only applies when no grouping items follow ALL. - Expected impact / tradeoff: purely additive. Bare GROUP BY ALL does not parse today, so enabling it cannot change the meaning of any existing query. SELECT * with GROUP BY ALL is rejected with a clear error for now (the star isn't expanded at validation time) For example: SELECT deptno, job, SUM(sal) FROM emp GROUP BY ALL; resolves to SELECT deptno, job, SUM(sal) FROM emp GROUP BY deptno, job; Open questions: - Since this feature is additive, I don't think a conformance flag is needed, but would we prefer one? - Should SELECT * be supported in this proposal, or is the explicit error enough for an initial cut? If this direction sounds reasonable, I have a prototype ready (parser + validator + tests + reference docs) and will put up the PR soon. A companion ORDER BY ALL proposal (CALCITE-7597) is in the same vein. Best, Tisya Bhatia
