guangdong created HIVE-20289:
--------------------------------

             Summary: row_number() returns different results across Hive versions
                 Key: HIVE-20289
                 URL: https://issues.apache.org/jira/browse/HIVE-20289
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 2.3.0
         Environment: hive 2.3.3
hadoop 2.7.6
            Reporter: guangdong
             Fix For: 2.3.0


1. Create a table:

create table src(
  name string,
  buy_time string,
  consumption int
);

2. Insert data:

insert into src values('zzz','2018-08-01',20),('zzz','2018-08-01',10);

3. Execute the query on Hive 2.3.3. The result is:

hive> select consumption, row_number() over(distribute by name sort by buy_time desc) from src;
Query ID = dwetl_20180801210808_692d5d70-a136-4525-9cdb-b6269e6c3069
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1531984581474_944267, Tracking URL = http://hadoop-jr-nn02.pekdc1.jdfin.local:8088/proxy/application_1531984581474_944267/
Kill Command = /soft/hadoop/bin/hadoop job -kill job_1531984581474_944267
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2018-08-01 21:09:08,855 Stage-1 map = 0%, reduce = 0%
2018-08-01 21:09:16,026 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.12 sec
2018-08-01 21:09:22,210 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 4.09 sec
MapReduce Total cumulative CPU time: 4 seconds 90 msec
Ended Job = job_1531984581474_944267
MapReduce Jobs Launched:
Stage-Stage-1: Map: 2  Reduce: 1   Cumulative CPU: 4.09 sec   HDFS Read: 437 HDFS Write: 10 SUCCESS
Total MapReduce CPU Time Spent: 4 seconds 90 msec
OK
20	1
10	2
Time taken: 80.135 seconds, Fetched: 2 row(s)

4. Execute the same query on Hive 0.14. The result is:

hive> select consumption, row_number() over(distribute by name sort by buy_time desc) from src;
Query ID = dwetl_20180801212222_7812d9f0-328d-4125-ba99-0f577f4cca9a
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1531984581474_944597, Tracking URL = http://hadoop-jr-nn02.pekdc1.jdfin.local:8088/proxy/application_1531984581474_944597/
Kill Command = /soft/hadoop/bin/hadoop job -kill job_1531984581474_944597
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2018-08-01 21:22:26,467 Stage-1 map = 0%, reduce = 0%
2018-08-01 21:22:34,839 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
2018-08-01 21:22:40,984 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3.28 sec
MapReduce Total cumulative CPU time: 3 seconds 280 msec
Ended Job = job_1531984581474_944597
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 3.28 sec   HDFS Read: 233 HDFS Write: 10 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 280 msec
OK

I expect both versions to return the same result. What can I do to get consistent output?
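One possible explanation (an assumption based on the data above, not confirmed behavior of either version): both inserted rows carry the same buy_time ('2018-08-01'), so "sort by buy_time desc" leaves them tied, and row_number() may then break the tie in whatever order the rows happen to arrive. That order can change with the execution plan (2 mappers in the 2.3.3 run vs. 1 mapper in the 0.14 run). A minimal sketch of a workaround, assuming consumption is an acceptable tie-breaker:

-- Add a deterministic tie-breaker so row_number() assigns the same
-- number to the same row regardless of Hive version or mapper count.
-- "consumption desc" is an illustrative choice; any column (or set of
-- columns) that uniquely orders rows within each name group would do.
select consumption,
       row_number() over(distribute by name sort by buy_time desc, consumption desc)
from src;

With this ordering, the row with consumption = 20 should always get row_number 1 for name 'zzz', independent of the plan.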