[ https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15005265#comment-15005265 ]
Jesus Camacho Rodriguez commented on HIVE-11531:
------------------------------------------------

[~huizane], thanks for working on this! I agree with the comments left by [~sershe]. In addition:

- This block in CalcitePlanner:
{code}
Integer offset = qbp.getDestToLimit().get(qbp.getClauseNames().iterator().next())==null? 0:qbp.getDestToLimit().get(qbp.getClauseNames().iterator().next()).getKey();
Integer limit = qbp.getDestToLimit().get(qbp.getClauseNames().iterator().next())==null? null:qbp.getDestToLimit().get(qbp.getClauseNames().iterator().next()).getValue();
{code}
There is no need to create the iterator and call next twice for offset, and twice again for limit. Could you cache the value to avoid unnecessary overhead and improve readability?
- A couple of additional style notes. There is a typo in the {{getOffetExpr}} method in HiveSortLimit. We could rename the old {{limit}} variable to {{fetch}} (for instance, in line 2275 of CalcitePlanner) to avoid confusion. Please also pay attention to code spacing (e.g. {{}else{}}).
- On a side note, maybe we should support the more verbose syntax too (as part of this JIRA or a follow-up). In addition to the abbreviated syntax:
{noformat}
LIMIT (skip,)? n
{noformat}
which is mostly used by MySQL, we could support:
{noformat}
LIMIT n OFFSET skip
{noformat}
which is supported e.g. by MySQL and PostgreSQL.

[~sershe], Calcite should handle offset properly: support for offset was already there. The only code that needs to be extended is in HIVE-11684; I will update the patch accordingly so we do not ignore offset.
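To illustrate the caching suggestion from the first bullet of the comment, here is a minimal standalone sketch. It does not use the real Hive classes: the class {{LimitLookupSketch}}, the holder {{OffsetAndFetch}}, the method {{resolve}} and the clause name used in {{main}} are all hypothetical, and the map of clause name to (offset, limit) pairs only stands in for what {{qbp.getDestToLimit()}} appears to return in the patch.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the lookup in CalcitePlanner; not the actual Hive code.
class LimitLookupSketch {

    /** Hypothetical result holder: fetch == null means the query has no LIMIT. */
    static final class OffsetAndFetch {
        final int offset;
        final Integer fetch;

        OffsetAndFetch(int offset, Integer fetch) {
            this.offset = offset;
            this.fetch = fetch;
        }
    }

    static OffsetAndFetch resolve(Set<String> clauseNames,
                                  Map<String, Map.Entry<Integer, Integer>> destToLimit) {
        // Create the iterator and call next() exactly once.
        String firstClause = clauseNames.iterator().next();
        // Look the entry up exactly once; assumed layout: key = offset, value = limit.
        Map.Entry<Integer, Integer> offsetAndLimit = destToLimit.get(firstClause);

        // Same defaults as the block under review: offset 0 and no limit.
        int offset = (offsetAndLimit == null) ? 0 : offsetAndLimit.getKey();
        Integer fetch = (offsetAndLimit == null) ? null : offsetAndLimit.getValue();
        return new OffsetAndFetch(offset, fetch);
    }

    public static void main(String[] args) {
        // Illustrative clause name and values only.
        Set<String> clauses = new LinkedHashSet<>(Arrays.asList("insclause-0"));
        Map<String, Map.Entry<Integer, Integer>> destToLimit = new HashMap<>();
        destToLimit.put("insclause-0", new SimpleImmutableEntry<>(20, 10));

        OffsetAndFetch r = resolve(clauses, destToLimit);
        System.out.println("offset=" + r.offset + ", fetch=" + r.fetch);
    }
}
```

The sketch also uses the name {{fetch}} instead of {{limit}}, in line with the renaming proposed in the style notes.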
> Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-11531
>                 URL: https://issues.apache.org/jira/browse/HIVE-11531
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Assignee: Hui Zheng
>         Attachments: HIVE-11531.WIP.1.patch, HIVE-11531.WIP.2.patch, HIVE-11531.patch
>
>
> For any UIs that involve pagination, it is useful to issue queries in the
> form SELECT ... LIMIT X,Y where X,Y are coordinates inside the result to be
> paginated (which can be extremely large by itself). At present, ROW_NUMBER
> can be used to achieve this effect, but optimizations for LIMIT such as TopN
> in ReduceSink do not apply to ROW_NUMBER. We can add first class support for
> "skip" to existing limit, or improve ROW_NUMBER for better performance

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)