[ https://issues.apache.org/jira/browse/HIVE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16767363#comment-16767363 ]
Hive QA commented on HIVE-21217: -------------------------------- Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12958544/HIVE-21217.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:green}SUCCESS:{color} +1 due to 15796 tests passed Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16055/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16055/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16055/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12958544 - PreCommit-HIVE-Build > Optimize range calculation for PTF > ---------------------------------- > > Key: HIVE-21217 > URL: https://issues.apache.org/jira/browse/HIVE-21217 > Project: Hive > Issue Type: Improvement > Reporter: Adam Szita > Assignee: Adam Szita > Priority: Major > Attachments: HIVE-21217.0.patch, HIVE-21217.1.patch > > > During window function execution Hive has to iterate on neighbouring rows of > the current row to find the beginning and end of the proper range (on which > the aggregation will be executed). > When we're using range based windows and have many rows with a certain key > value this can take a lot of time. (e.g. partition size of 80M, in which we > have 2 ranges of 40M rows according to the orderby column: within these 40M > rowsets we're doing 40M x 40M/2 steps.. which is of n^2 time complexity) > I propose to introduce a cache that keeps track of already calculated range > ends so it can be reused in future scans. -- This message was sent by Atlassian JIRA (v7.6.3#76005)