[ 
https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15005265#comment-15005265
 ] 

Jesus Camacho Rodriguez commented on HIVE-11531:
------------------------------------------------

[~huizane], thanks for working on this!

Agreed with the comments left by [~sershe]. In addition:

- This block in CalcitePlanner:
{code}
      Integer offset = 
qbp.getDestToLimit().get(qbp.getClauseNames().iterator().next())==null?
          
0:qbp.getDestToLimit().get(qbp.getClauseNames().iterator().next()).getKey();
      Integer limit = 
qbp.getDestToLimit().get(qbp.getClauseNames().iterator().next())==null?
          
null:qbp.getDestToLimit().get(qbp.getClauseNames().iterator().next()).getValue();
{code}
No need to create the iterator twice for offset (and for limit) and call next. 
Could you cache the value to avoid unnecessary overhead and improve readability?
- A couple of additional style notes. There is typo in {{getOffetExpr}} method 
in HiveSortLimit. We could rename old {{limit}} variable to {{fetch}} (for 
instance, in line 2275 of CalcitePlanner) to avoid confusion. Please, pay 
attention with code spacing (e.g. {{}else{}}).
- On a side note, maybe we should support the more verbose syntax too (as part 
of this JIRA or a follow-up). In addition to the abbreviated syntax:
{noformat}
LIMIT (skip,)? n
{noformat}
that is mostly used by MySQL, we could support:
{noformat}
LIMIT n OFFSET skip
{noformat}
which is supported e.g. by MySQL and PostgreSQL.

[~sershe], Calcite should handle offset properly: support for offset was 
already there. The only code that needs to be extended is in HIVE-11684; I will 
update the patch accordingly so we do not ignore offset.

> Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-11531
>                 URL: https://issues.apache.org/jira/browse/HIVE-11531
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Assignee: Hui Zheng
>         Attachments: HIVE-11531.WIP.1.patch, HIVE-11531.WIP.2.patch, 
> HIVE-11531.patch
>
>
> For any UIs that involve pagination, it is useful to issue queries in the 
> form SELECT ... LIMIT X,Y where X,Y are coordinates inside the result to be 
> paginated (which can be extremely large by itself). At present, ROW_NUMBER 
> can be used to achieve this effect, but optimizations for LIMIT such as TopN 
> in ReduceSink do not apply to ROW_NUMBER. We can add first class support for 
> "skip" to existing limit, or improve ROW_NUMBER for better performance



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to