[ https://issues.apache.org/jira/browse/HIVE-15474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15798665#comment-15798665 ]
Jesus Camacho Rodriguez commented on HIVE-15474:
------------------------------------------------

[~lirui], the combination with _ReduceDeduplication_ does the trick in the case you describe. Consider the following query:
{code:sql}
select key, value, value2, count(key + 1) as agg1 from src
group by key, value, value2
order by key, value limit 20;
{code}
In this case, the OBy columns are a prefix of the GBy columns. RS[4] will end up with columns _key, value, value2_, and the limit will then be pushed by the new extension to RS[2].

As I stated above, I took a conservative approach because we need to be sure that we remain correct; the condition could possibly be relaxed even further for some corner cases. However, I did not want to do that without checking the theoretical background further.

> Extend limit propagation for chain of RS-GB-RS operators
> --------------------------------------------------------
>
>                 Key: HIVE-15474
>                 URL: https://issues.apache.org/jira/browse/HIVE-15474
>             Project: Hive
>          Issue Type: Bug
>          Components: Physical Optimizer
>    Affects Versions: 2.2.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-15474.01.patch, HIVE-15474.patch
>
>
> The goal is to extend the work started in HIVE-14002.
> For instance, given the following query:
> {code:sql}
> explain
> select key, value, count(key + 1) as agg1 from src
> group by key, value
> order by key, value, agg1 limit 20;
> {code}
> We generate the following physical plan:
> {{TS1 - GBY2 - RS3 - GBY4 - RS5 - SEL6 - LIM7 - FS8}}
> We can push the limit to the RS3 operator, as we will generate records for the
> _top N_ keys, and thus GBY4 will produce the _top N_ results. However,
> currently we do not do it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
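
For illustration, the condition named in the comment (the OBy columns being a prefix of the GBy columns) can be sketched as below. This is a minimal Java sketch and not the code from the attached patches: the class and method names are hypothetical, columns are modeled as plain strings, and sort direction and null ordering are ignored. It covers only the prefix case from the comment, not every case the patch may handle.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hive's optimizer.
public class LimitPushdownSketch {

    // Returns true when the ORDER BY columns are a positional prefix of the
    // GROUP BY columns, i.e. orderBy[i] equals groupBy[i] for every i.
    public static boolean orderByIsPrefixOfGroupBy(List<String> orderBy,
                                                   List<String> groupBy) {
        if (orderBy.size() > groupBy.size()) {
            return false;
        }
        for (int i = 0; i < orderBy.size(); i++) {
            if (!orderBy.get(i).equals(groupBy.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Columns from the query in the comment.
        List<String> groupBy = Arrays.asList("key", "value", "value2");
        List<String> orderBy = Arrays.asList("key", "value");
        System.out.println(orderByIsPrefixOfGroupBy(orderBy, groupBy));
    }
}
```

With the columns from the query in the comment, the check holds; reordering the OBy columns to `value, key` would make it fail, since the comparison is positional.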