[I] [SIP] Proposal for aggregate dimensions [superset]

via GitHub Mon, 20 May 2024 10:50:33 -0700


michaelritsema opened a new issue, #28610:
URL: https://github.com/apache/superset/issues/28610


   *Please make sure you are familiar with the SIP process documented*
   [here](https://github.com/apache/superset/issues/5602). The SIP will be 
numbered by a committer upon acceptance.
   
   ## [SIP] Proposal for aggregate dimensions
   
   ### Motivation
   
   Adding columns to a report that are functionally dependent on an already 
added dimension should come at a low cost.
   
   For example: 
   
   'GROUP BY person_id'
   
   If we then want to add the columns "first_name, last_name, age" to the 
report, we have the choice of adding them as dimensions or as metrics.
   
   Simply adding them as dimensions incurs the penalty of putting them in the 
GROUP BY. This is an expensive operation as well as a potential source of 
errors (what if one of the rows had the wrong age and we ended up creating a 
new person?).
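
   The error case can be reproduced with a small sketch. The table name 
`visits`, its columns, and the sample rows are made up for illustration. SQLite 
has no ANY aggregate, so MIN and MAX stand in for it here.

```python
import sqlite3

# Hypothetical data: person 1 has one row with a mistyped age (63 instead of 36).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (person_id INTEGER, first_name TEXT, age INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?, ?)",
    [
        (1, "Ada", 36, 10.0),
        (1, "Ada", 36, 5.0),
        (1, "Ada", 63, 2.5),  # data-entry error
        (2, "Bob", 41, 7.0),
    ],
)

# Dependent columns added as dimensions: they all land in the GROUP BY,
# so the bad row splits person 1 into two result rows.
as_dimensions = conn.execute(
    "SELECT person_id, first_name, age, SUM(amount) FROM visits "
    "GROUP BY person_id, first_name, age ORDER BY person_id, age"
).fetchall()

# Dependent columns wrapped in an aggregate: only person_id is grouped,
# so each person stays on a single row.
as_aggregates = conn.execute(
    "SELECT person_id, MAX(first_name) AS first_name, MIN(age) AS age, SUM(amount) "
    "FROM visits GROUP BY person_id ORDER BY person_id"
).fetchall()

print(len(as_dimensions), len(as_aggregates))
```

   With the dependent columns in the GROUP BY, person 1 appears twice; with 
the aggregate wrapper, each person appears once.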
   
   Adding them as metrics is a bit awkward but is the best fit for my use 
case now. I have dozens of these columns and prefer to just wrap them in an ANY 
aggregate function.
   
   
   
   ### Proposed Change
   
   Describe how the feature will be implemented, or the problem will be solved. 
If possible, include mocks, screenshots, or screencasts (even if from different 
tools).
   
   I propose we add a toggle to mark a dimension as an "aggregate dimension". 
Instead of adding this column to the GROUP BY, the user can just pick the 
aggregate they would like (min, max, any, first, etc.). By default we would not 
have to duplicate this as a new metric column but could keep the same name. I 
think this would also work better with the newer Drill features than metrics 
do.
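
   A minimal sketch of how a query builder might treat aggregate dimensions, 
assuming a hypothetical `build_select` helper (this is not Superset's actual 
query-generation code, and it does no identifier quoting or validation). Each 
aggregate dimension is wrapped in the chosen aggregate, keeps its original 
column name as the alias, and stays out of the GROUP BY.

```python
def build_select(table, group_by, aggregate_dimensions, metrics):
    """Render a SELECT where aggregate dimensions are aggregated, not grouped.

    Hypothetical helper for illustration only.
    - group_by: list of regular dimension column names
    - aggregate_dimensions: dict of column name -> aggregate function name
    - metrics: list of already-rendered metric expressions
    """
    select_parts = list(group_by)
    for column, aggregate in aggregate_dimensions.items():
        # Keep the original column name as the alias, so no duplicate naming
        # is exposed to the end user.
        select_parts.append(f"{aggregate.upper()}({column}) AS {column}")
    select_parts.extend(metrics)
    return (
        f"SELECT {', '.join(select_parts)} "
        f"FROM {table} "
        f"GROUP BY {', '.join(group_by)}"
    )


sql = build_select(
    table="visits",
    group_by=["person_id"],
    aggregate_dimensions={"first_name": "max", "age": "min"},
    metrics=["SUM(amount) AS total"],
)
print(sql)
```

   Only `person_id` reaches the GROUP BY; `first_name` and `age` keep their 
names in the result set without being defined as separate metrics.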
   
   
   ### New or Changed Public Interfaces
   
   Describe any new additions to the model, views or `REST` endpoints. Describe 
any changes to existing visualizations, dashboards and React components. 
Describe changes that affect the Superset CLI and how Superset is deployed.
   
   ### Rejected Alternatives
   
   Currently, adding these strings as metrics in an ANY aggregate works best 
for me. This is awkward and confusing for the end user, who sees dimension-type 
data among the metrics as well as a lot of duplicated column naming such as 
"any(first_name) as first_name".
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

