[ https://issues.apache.org/jira/browse/HIVE-28489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17894117#comment-17894117 ]
Stamatis Zampetakis commented on HIVE-28489:
--------------------------------------------

Hey [~seonggon], according to the [discussion in the user list|https://lists.apache.org/thread/o2rm4xjmlfv7co2ort7o5tpg37bo5zhj], there might be a new patch for this optimization. Is PR#5424 ready for review, or should we wait for another PR?

nit: I was checking the attached slides, but since I don't have PowerPoint, they appear somewhat broken in LibreOffice. Consider sharing such info in a more widely supported format such as PDF.

> Partitioning the input data of Grouping Set GroupBy operator
> ------------------------------------------------------------
>
>                 Key: HIVE-28489
>                 URL: https://issues.apache.org/jira/browse/HIVE-28489
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Seonggon Namgung
>            Assignee: Seonggon Namgung
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: 2.PartitionDataBeforeGroupingSet.pptx
>
>
> A GroupBy operator with grouping sets often emits too many rows, which becomes
> the bottleneck of query execution. To reduce the number of output rows, this
> JIRA proposes partitioning the input data of such a GroupBy operator.
> Please check out the attached slides for a detailed explanation.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)