[ 
https://issues.apache.org/jira/browse/HIVE-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166317#comment-15166317
 ] 

Yongzhi Chen commented on HIVE-13072:
-------------------------------------

I cannot reproduce the issue on the master branch with the following query:
insert overwrite table rowninfo select row_number() over (order by num) as 
rowid, num from disrow;
disrow has 329210 rows, all with distinct values. 
After the insert statement, rowninfo has the same number of rows, all with 
distinct row values. There are no duplicates.
[~Zyrix], could you share your steps to reproduce? Thanks.
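
For reference, a check along the following lines is one way to confirm whether 
duplicates are present after the insert. It is only a sketch: it assumes 
rowninfo has exactly the two columns produced by the query above (rowid and 
num), and it is not necessarily the exact check that was run here.

```sql
-- Assumption: rowninfo(rowid, num) was populated by the insert overwrite above.
-- If ROW_NUMBER() duplicated rows, total_rows would exceed distinct_nums.
SELECT COUNT(*)              AS total_rows,
       COUNT(DISTINCT rowid) AS distinct_rowids,
       COUNT(DISTINCT num)   AS distinct_nums
FROM rowninfo;

-- List every num value that appears more than once, together with the
-- smallest and largest row number it was assigned.
SELECT num,
       COUNT(*)   AS copies,
       MIN(rowid) AS first_rowid,
       MAX(rowid) AS last_rowid
FROM rowninfo
GROUP BY num
HAVING COUNT(*) > 1;
```

If the behaviour in the description occurs, the second query should return 
rows whose first_rowid and last_rowid differ by 25000. An empty result means 
no value was duplicated.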

> ROW_NUMBER() function creates wrong results
> -------------------------------------------
>
>                 Key: HIVE-13072
>                 URL: https://issues.apache.org/jira/browse/HIVE-13072
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Philipp Brandl
>            Assignee: Yongzhi Chen
>
> When ROW_NUMBER() is used on tables with more than 25000 rows, the function 
> duplicates rows and assigns the copies separate row numbers.
> To reproduce, take a large table with more than 25000 rows of distinct 
> values and run a query involving ROW_NUMBER(). The result contains the same 
> distinct values twice, with row numbers that are 25000 apart.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
