GitHub user pepijnve added a comment to the discussion: Multiple 'group by's,
one scan
Just FYI, in the particular case I'm working on, the problem is that I want to
compute a whole bunch of aggregates over a table with a cardinality on the
order of billions. For `n`
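To make the goal concrete, here is a minimal sketch of computing several group-by aggregates while reading the input only once. It is plain Python for illustration, not DataFusion code; the function `multi_group_by`, the `groupings` layout, and the sample columns are all hypothetical, and only `SUM` is modelled.

```python
from collections import defaultdict


def multi_group_by(rows, groupings):
    """Compute one SUM per grouping in a single pass over `rows`.

    rows:      iterable of dicts (hypothetical stand-in for a table scan)
    groupings: name -> (tuple of key columns, value column to sum)
    """
    # One accumulator table per grouping, all fed from the same scan.
    results = {name: defaultdict(int) for name in groupings}
    for row in rows:  # the input is iterated exactly once
        for name, (key_cols, value_col) in groupings.items():
            key = tuple(row[c] for c in key_cols)
            results[name][key] += row[value_col]
    return {name: dict(acc) for name, acc in results.items()}


# Illustrative data only.
rows = [
    {"region": "eu", "product": "a", "amount": 10},
    {"region": "eu", "product": "b", "amount": 5},
    {"region": "us", "product": "a", "amount": 7},
]
totals = multi_group_by(
    rows,
    {
        "by_region": (("region",), "amount"),
        "by_product": (("product",), "amount"),
    },
)
```

Running `n` separate `GROUP BY` queries would instead scan the input `n` times, which is the cost that matters when the table has billions of rows.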
GitHub user alamb added a comment to the discussion: Multiple 'group by's, one
scan
There is some additional discussion on a similar sounding feature here:
- https://github.com/apache/datafusion/issues/8777
Another potential approach is to fully materialize the input (`INSERT INTO
temp_file.p
GitHub user alamb added a comment to the discussion: Multiple 'group by's, one
scan
The other type of plan where I have seen this cause problems is one whose
operators rely on sort order, such as `SortPreservingMerge` or a group by where
the data is partially sorted on some of the group keys.
GitHub user pepijnve added a comment to the discussion: Multiple 'group by's,
one scan
Perhaps satisfying the non-general case would already be of value? For the
diamond self-join example in the linked issue it doesn't make much sense
indeed. Can you think of other examples besides joins wher
GitHub user pepijnve added a comment to the discussion: Multiple 'group by's,
one scan
I read through the linked issues in the meantime. I think what we're trying to
do is closest to the Splitter idea described in the linked document at
https://github.com/apache/datafusion/pull/8558#issuecomm