GitHub user mgrenonville closed a discussion: Expose intermediary states in 
aggregation functions

Hello, 

While looking at DataFusion (what an awesome project!), I wondered whether it 
would be possible to expose intermediate states (i.e., before merge_batch) to 
support what ClickHouse calls the ["-Merge", "-State", 
"-MergeState"](https://clickhouse.com/docs/sql-reference/aggregate-functions/combinators#-state) 
combinators.
This allows ClickHouse to persist pre-aggregated data keyed by the grouping 
key, which compacts the data without losing the ability to filter it. 

For example, uniqState returns a statistical structure (a kind of count-min 
sketch) that can be merged later, at query time. With this, it is easy to 
store one uniqState per minute and query uniqMerge per hour.
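
To make the idea concrete, here is a minimal sketch of the state/merge pattern 
in plain Python. It is not ClickHouse or DataFusion code: uniq_state and 
uniq_merge are hypothetical stand-ins for uniqState and uniqMerge, the event 
data is made up, and an exact set is used where a real engine would keep an 
approximate structure.

```python
from collections import defaultdict


def uniq_state(values):
    """Hypothetical stand-in for uniqState: build a mergeable partial state.

    An exact set is assumed here for simplicity; a real engine would keep a
    compact approximate structure instead.
    """
    return frozenset(values)


def uniq_merge(states):
    """Hypothetical stand-in for uniqMerge: union the states, then finalize."""
    merged = set()
    for state in states:
        merged |= state
    return len(merged)


# Made-up (minute, user_id) events; minutes are counted from the start of the day.
events = [(0, "a"), (0, "b"), (1, "a"), (59, "c"), (60, "a"), (61, "d")]

# Ingestion: persist one partial state per minute (the grouping key).
users_by_minute = defaultdict(list)
for minute, user in events:
    users_by_minute[minute].append(user)
minute_states = {m: uniq_state(users) for m, users in users_by_minute.items()}

# Query time: regroup the stored states by hour and merge them.
states_by_hour = defaultdict(list)
for minute, state in minute_states.items():
    states_by_hour[minute // 60].append(state)
uniq_by_hour = {hour: uniq_merge(states) for hour, states in states_by_hour.items()}
```

The point is that the per-minute states can still be filtered by key and 
merged at any coarser granularity, without going back to the raw rows.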

Thanks

GitHub link: https://github.com/apache/datafusion/discussions/16239

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: 
[email protected]