[ https://issues.apache.org/jira/browse/HIVE-13727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278165#comment-15278165 ]
Rajat Khandelwal commented on HIVE-13727: ----------------------------------------- The rejection of the first query doesn't seem to be justified. It's a typical top-n query, but the data is available in two tables. So an optimized way of getting top-n is to get the top-n values from both tables and getting top-n from 2n elements, instead of from the entire union of two tables. > Getting error Failed rule: 'orderByClause clusterByClause distributeByClause > sortByClause limitClause can only be applied to the whole union.' in subquery > ----------------------------------------------------------------------------------------------------------------------------------------------------------- > > Key: HIVE-13727 > URL: https://issues.apache.org/jira/browse/HIVE-13727 > Project: Hive > Issue Type: Bug > Reporter: Rajat Khandelwal > > The error comes in the following query: > {noformat} > SELECT * > FROM > (SELECT * > FROM srcpart a > WHERE a.ds = '2008-04-08' > AND a.hr = '11' > ORDER BY a.key LIMIT 5 > UNION ALL > SELECT * > FROM srcpart b > WHERE b.ds = '2008-04-08' > AND b.hr = '14' > ORDER BY b.key LIMIT 5) subq > ORDER BY KEY LIMIT 5 > {noformat} > But the following query works: > {noformat} > SELECT * > FROM > (SELECT * > FROM > (SELECT * > FROM srcpart a > WHERE a.ds = '2008-04-08' > AND a.hr = '11' > ORDER BY a.key LIMIT 5) pa > UNION ALL SELECT * > FROM > (SELECT * > FROM srcpart b > WHERE b.ds = '2008-04-08' > AND b.hr = '14' > ORDER BY b.key LIMIT 5) pb) subq > ORDER BY KEY LIMIT 5 > {noformat} > The queries are logically identical, the query that's rejected has dummy > select * clauses around the sub-queries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)