[ https://issues.apache.org/jira/browse/HIVE-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166317#comment-15166317 ]
Yongzhi Chen commented on HIVE-13072:
-------------------------------------

I cannot reproduce the issue on the master branch with the following query:

insert overwrite table rowninfo select row_number() over( order by num) as rowid, num from disrow;

disrow has 329210 rows with distinct values. After the insert statement, rowninfo has the same number of rows, all with distinct values. There are no duplicates.

[~Zyrix], could you share your steps to reproduce? Thanks.

> ROW_NUMBER() function creates wrong results
> -------------------------------------------
>
>                 Key: HIVE-13072
>                 URL: https://issues.apache.org/jira/browse/HIVE-13072
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Philipp Brandl
>            Assignee: Yongzhi Chen
>
> When using ROW_NUMBER() on tables with more than 25000 rows, the function
> ROW_NUMBER() duplicates rows with separate row numbers.
> Reproduce by using a large table with more than 25000 rows with distinct
> values and then using a query involving ROW_NUMBER(). It will then result in
> getting the same distinct values twice with separate row numbers apart by
> 25000.