On the case where we're grouping only by constants over empty input: I checked the SQL standard and major dialects; they're pretty unanimous that the output should be 0 rows. Calcite already matches this because of the AggregateProjectPullUpConstantsRule (it won't reduce a non-empty GROUP BY to the empty one, it keeps at least one key), so GROUP BY ALL (which expands to the explicit GROUP BY <constants> returns 0 over empty input. I added an execution test to confirm this.
Best, Tisya Bhatia On Tue, Jun 9, 2026 at 1:58 PM Julian Hyde <[email protected]> wrote: > > As for the 0-vs-1 > > row question, happy to leave that to the existing behavior (and the > > language lawyers). > > There’s no existing behavior, so you need to decide what is the right > behavior, and test it. See what the other implementations do. > > > On Jun 9, 2026, at 11:14 AM, Tisya Bhatia via dev < > [email protected]> wrote: > > > > Hey Julian - > > > > Glad you're on board. Good point about the "contains an aggregate" vs. > "is > > an aggregate." My implementation (PR-pending) already does this: it > expands > > GROUP BY ALL to every SELECT expression that does not contain an > aggregate, > > using a recursive aggregate finder. So your example expands exactly as > you > > wrote. > > > > I'll also take note of measures - thanks, I'd missed that. And I'll add > > your example as a test case. > > > > Regarding the degenerate case (grouping only by constants over an empty > > input): GROUP BY ALL there expands to the exact same query as explicitly > > writing GROUP BY 'x', so it introduces no new semantics - it inherits > > whatever Calcite already does for constant-only grouping. As for the > 0-vs-1 > > row question, happy to leave that to the existing behavior (and the > > language lawyers). > > > > Thanks, > > Tisya Bhatia > > > > On Tue, Jun 9, 2026 at 12:57 PM Julian Hyde <[email protected] > <mailto:[email protected]>> wrote: > > > >> I agree; we should support GROUP BY ALL. > >> > >> “Isn’t an aggregate” should be broadened to something like “doesn’t > >> contain an aggregate and isn’t a measure”. > >> > >> SELECT deptno, SUBSTR(job, 1), count(*) + 1 as c, ‘x' as x > >> FROM emps > >> GROUP BY ALL > >> > >> should be equivalent to > >> > >> SELECT deptno, SUBSTR(job, 1), count(*) + 1 as c, ‘x' as x > >> FROM emps > >> GROUP BY deptno, SUBSTR(job, 1), ‘x' > >> > >> I’m guessing that’s what the major implementations do. Measures are a > >> Calcite-specific thing. > >> > >> Note that the constant ‘x’ gets swept into the GROUP BY clause but has > no > >> effect because it’s a constant. > >> When there are no rows and we’re only grouping by constants we have a > >> degenerate case and the language lawyers can argue about whether it > should > >> return 0 or 1 rows: > >> > >> SELECT ‘x’ as x, COUNT(*) > >> FROM emps > >> WHERE FALSE > >> GROUP BY ALL > >> > >> Julian > >> > >> > >> > >>> On Jun 9, 2026, at 10:25 AM, Tisya Bhatia via dev < > >> [email protected]> wrote: > >>> > >>> Hi all, > >>> > >>> I'd like to propose support for GROUP BY ALL in Calcite - a shorthand > >> that > >>> groups every expression in the SELECT clause that isn't an aggregate. > >>> > >>> Here is the Jira ticket: > >> > https://urldefense.com/v3/__https://issues.apache.org/jira/browse/CALCITE-7594__;!!Ayb5sqE7!pEba7l9-97xJOhAdqD6pEBkcgCqK3bHZKZBQDYL8-qHcbXxmWfNc0DmlEL7m8Qm9RF2bQ-92FfXAOcqWtWv7jdJL$ > >>> > >>> Context: > >>> - Today, if I want to group by every non-aggregated expression in the > >>> SELECT list, I would have to repeat them in the group by. > >>> - A GROUP BY ALL shorthand has become convergence across warehouse > >>> dialects, and supporting it keeps Calcite-based dialects updated and > >>> aligned. > >>> > >>> Proposal: > >>> - Add GROUP BY ALL: when ALL appears with no grouping items, group by > >> every > >>> SELECT expression that is not an aggregate or window function. It's > >>> implemented as a parser marker that the validator rewrites into the > >>> concrete grouping expressions before normal group validation runs, so > the > >>> converter / optimizer never see a marker. > >>> - Alternatives considered: I kept ALL (matches DuckDB, Spark, and > >> others). > >>> The existing ALL / DISTINCT set quantifier on grouping items > >> (CALCITE-5089, > >>> e.g. GROUP BY ALL CUBE(a, b)) is fully preserved. The new meaning only > >>> applies when no grouping items follow ALL. > >>> - Expected impact / tradeoff: purely additive. Bare GROUP BY ALL does > not > >>> parse today, so enabling it cannot change the meaning of any existing > >>> query. SELECT * with GROUP BY ALL is rejected with a clear error for > now > >>> (the star isn't expanded at validation time) > >>> > >>> For example: > >>> > >>> SELECT deptno, job, SUM(sal) > >>> FROM emp > >>> GROUP BY ALL; > >>> > >>> resolves to > >>> > >>> SELECT deptno, job, SUM(sal) > >>> FROM emp > >>> GROUP BY deptno, job; > >>> > >>> Open questions: > >>> - Since this feature is additive, I don't think a conformance flag is > >>> needed, but would we prefer one? > >>> - Should SELECT * be supported in this proposal, or is the explicit > error > >>> enough for an initial cut? > >>> > >>> If this direction sounds reasonable, I have a prototype ready (parser + > >>> validator + tests + reference docs) and will put up the PR soon. A > >>> companion ORDER BY ALL proposal (CALCITE-7597) is in the same vein. > >>> > >>> Best, > >>> Tisya Bhatia > >
