[ https://issues.apache.org/jira/browse/HIVE-24207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

László Bodor resolved HIVE-24207.
---------------------------------
    Resolution: Fixed

> LimitOperator can leverage ObjectCache to bail out quickly
> ----------------------------------------------------------
>
>                 Key: HIVE-24207
>                 URL: https://issues.apache.org/jira/browse/HIVE-24207
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Assignee: László Bodor
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> {noformat}
> select ss_sold_date_sk from store_sales, date_dim where date_dim.d_year in
> (1998,1998+1,1998+2) and store_sales.ss_sold_date_sk = date_dim.d_date_sk
> limit 100;
>
> select distinct ss_sold_date_sk from store_sales, date_dim where
> date_dim.d_year in (1998,1998+1,1998+2) and store_sales.ss_sold_date_sk =
> date_dim.d_date_sk limit 100;
> {noformat}
> Queries like the above generate a large number of map tasks. Currently these
> tasks don't bail out once enough rows have been generated to satisfy the limit.
> It would be good to use ObjectCache to retain the number of records
> generated so far. LimitOperator/VectorLimitOperator could then bail out for
> later tasks in the operator's init phase itself.
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorLimitOperator.java#L57
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/LimitOperator.java#L58
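
For readers unfamiliar with the idea, the following is a minimal sketch of the bail-out described in the issue, not the code that was committed for HIVE-24207. It assumes that tasks running in the same JVM can share a counter through some cache. The `CACHE` map, the `LimitTask` class, and the cache key strings are hypothetical stand-ins and do not correspond to Hive's actual ObjectCache, LimitOperator, or VectorLimitOperator APIs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class LimitBailOutSketch {

    // Hypothetical stand-in for a cache shared by tasks in the same JVM.
    // This is NOT Hive's ObjectCache interface.
    static final ConcurrentHashMap<String, AtomicInteger> CACHE =
            new ConcurrentHashMap<>();

    // Hypothetical, simplified limit operator for one task.
    static final class LimitTask {
        private final int limit;
        private final AtomicInteger emitted; // shared across tasks via CACHE
        private boolean done;

        LimitTask(String cacheKey, int limit) {
            this.limit = limit;
            // Look up (or create) the shared row counter for this limit.
            this.emitted = CACHE.computeIfAbsent(cacheKey, k -> new AtomicInteger());
            // "Init phase": if earlier tasks already produced enough rows,
            // mark this task as done before it processes anything.
            this.done = emitted.get() >= limit;
        }

        boolean isDone() {
            return done;
        }

        // Returns true if the row is forwarded, false if it is dropped.
        boolean process(Object row) {
            if (done) {
                return false;
            }
            if (emitted.incrementAndGet() > limit) {
                done = true;
                return false;
            }
            return true;
        }
    }
}
```

In this sketch a task constructed after the shared counter has reached the limit reports `isDone()` immediately, which mirrors the "bail out in the operator's init phase" suggestion. The counter is only visible to tasks that share the same JVM, so tasks in other processes would still run normally.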